{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "89145993",
   "metadata": {},
   "source": [
    "# Maximum Likelihood Estimation - Spatial Error Model\n",
    "\n",
    "### Luc Anselin\n",
    "\n",
    "### (revised 09/21/2024)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a85427f9",
   "metadata": {},
   "source": [
    "## Preliminaries\n",
    "\n",
    "Similar to the treatment of the spatial lag model, the estimation of the spatial error model is covered in two notebooks. This first one covers Maximum Likelihood estimation. General Method of Moments (GMM) estimation is considered in a separate notebook.\n",
    "\n",
    "As mentioned in the spatial lag notebook, it should be kept in mind that the maximum likelihood estimation in `spreg` is primarily included for pedagogical purposes. Generally, the GMM approach is preferred. In addition, an optimal maximum likelihood estimation implementation, based on the Smirnov-Anselin (2001) approximation, is not currently implemented in `spreg`. It is implemented in C++ in `GeoDa`. This is the preferred approach for ML estimation in large(r) data sets.\n",
    "\n",
    "The `spreg` module implements ML estimation of the spatial error model in the `ML_Error` class. The estimation of the SLX-Error model is implemented through the inclusion of the `slx_lags` argument. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9a2c9eb",
   "metadata": {},
   "source": [
    "### Modules Needed\n",
    "\n",
    "As before, the main module is *spreg* for spatial regression analysis. From this, `OLS` and `ML_Error` are imported. In addition, the utilities in *libpysal* (to open spatial weights and access the sample data set), *pandas* and *geopandas* are needed, as well as *time* (for some timing results), *matplotlib.pyplot* and *seaborn* for visualization. All of these rely on *numpy* as a dependency. Finally, in order to carry out the Likelihood Ratio tests, `likratiotest` is imported from `spreg.diagnostics`.\n",
    "\n",
    "The usual *numpy* `set_printoptions` is included as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "5ac490b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "import os\n",
    "os.environ['USE_PYGEOS'] = '0'\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import geopandas as gpd\n",
    "import time\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from libpysal.io import open\n",
    "from libpysal.examples import get_path\n",
    "from libpysal.weights import lag_spatial\n",
    "\n",
    "from spreg import OLS, ML_Error\n",
    "from spreg.diagnostics import likratiotest\n",
    "np.set_printoptions(legacy=\"1.25\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ee18820",
   "metadata": {},
   "source": [
    "### Functions Used\n",
    "\n",
    "- from pandas/geopandas:\n",
    "  - read_file\n",
    "  - DataFrame\n",
    "  - head\n",
    "  - describe\n",
    "  \n",
    "- from libpysal:\n",
    "  - io.open\n",
    "  - examples.get_path\n",
    "  - weights.lag_spatial\n",
    " \n",
    "- from numpy:\n",
    "  - hstack\n",
    "\n",
    "- from matplotlib/seaborn:\n",
    "  - regplot\n",
    "  - show\n",
    "\n",
    "- from spreg:\n",
    "  - spreg.OLS\n",
    "  - spreg.ML_Error\n",
    "  - spreg.diagnostics.likratiotest"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47934d1b-7587-4ddf-be38-83904eede8e8",
   "metadata": {},
   "source": [
    "### Variable definition and data input"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bfb04af4-4521-4ac0-8e55-d3b92f54a403",
   "metadata": {},
   "source": [
    "The data set and spatial weights are from the **chicagoSDOH** sample data set:\n",
    "\n",
    "- **Chi-SDOH.shp,shx,dbf,prj**: socio-economic indicators of health for 2014 in 791 Chicago tracts\n",
    "- **Chi-SDOH_q.gal**: queen contiguity weights\n",
    "\n",
    "To illustrate the methods, the same descriptive model is used that relates the rate of uninsured households in a tract(for health insurance, **EP_UNINSUR**) to the lack of high school education (**EP_NOHSDP**), the economic deprivation index (**HIS_ct**), limited command of English (**EP_LIMENG**) and the lack of access to a vehicle (**EP_NOVEH**). This is purely illustrative of a spatial error specification and does not have a particular theoretical or policy motivation.\n",
    "\n",
    "The file names and variable names are set in the usual manner. Any customization for different data sets/weights and different variables should be specified in this top cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "14a1e98e",
   "metadata": {},
   "outputs": [],
   "source": [
    "infileshp = get_path(\"Chi-SDOH.shp\")     # input shape file with data\n",
    "infileq = get_path(\"Chi-SDOH_q.gal\")     # queen contiguity weights created with GeoDa\n",
    "\n",
    "y_name = 'EP_UNINSUR'\n",
    "x_names = ['EP_NOHSDP','HIS_ct','EP_LIMENG','EP_NOVEH']\n",
    "ds_name = 'Chi-SDOH'\n",
    "w_name = 'Chi-SDOH_q'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9514984-a22e-4d5b-a993-f40771a290a0",
   "metadata": {},
   "source": [
    "The `read_file` and `open` functions are used to access the sample data set and contiguity weights. The weights are row-standardized and the data frames for the dependent and explanatory variables are constructed. As before, this functionality is agnostic to the actual data sets and variables used, since it relies on the specification given in the initial block above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3248d23d-2247-40bf-a3a5-3ba21b477aa8",
   "metadata": {},
   "outputs": [],
   "source": [
    "dfs = gpd.read_file(infileshp)\n",
    "wq =  open(infileq).read()    \n",
    "wq.transform = 'r'    # row-transform the weights\n",
    "y = dfs[y_name]\n",
    "x = dfs[x_names]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2aea09bf",
   "metadata": {},
   "source": [
    "## OLS and SLX with Spatial Diagnostics"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb5eb774-814b-467e-ad51-11d21d3cee50",
   "metadata": {},
   "source": [
    "For ease of reference, standard OLS and SLX regressions with spatial diagnostics included in this notebook as well. These results are identical to the ones provided in the spatial lag notebooks."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fdb2b8ce-6ae8-4a15-a6c7-7ec23700dfc7",
   "metadata": {},
   "source": [
    "### OLS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "379724ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "ols1 = OLS(y,x,w=wq,spat_diag=True,moran=True,\n",
    "                 name_w=w_name,name_ds=ds_name)\n",
    "print(ols1.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d0e0f74-5877-4497-8e7b-79394b6701fd",
   "metadata": {},
   "source": [
    "The specification achieves an acceptable $R^2$ of about 0.63 and all coefficients are positive and highly significant.\n",
    "\n",
    "The non-spatial diagnostics suggest non-normality as well as a hight degree of heteroskedasticity. There is no problem with multicollinearity.\n",
    "\n",
    "The spatial diagnostics against the SARERR alternatives show very significant LM-Lag and LM-Error, but of the two robust tests, only RLM-Lag is highly significant (RLM-Error only at p < 0.03). Hence, there is a strong indication that a Lag rather than an Error alternative may be appropriate. While the joint LM test is also highly significant, this is likely due to a strong one-sided (Lag) alternative.\n",
    "\n",
    "Interestingly, the diagnostics against a spatial Durbin alternative strongly support the latter as well. Both LM tests and their robust forms are highly significant, and so is the joint test. Moreover, the value for the robust forms of the test is smaller than the original, which is the expected behavior (although not always reflected in empirical practice).\n",
    "\n",
    "In sum, in addition to a spatial Lag model as an alternative, the spatial Durbin specification deserves consideration as well."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0cfc5d4e-a1b3-4b07-8120-2def922d8ef6",
   "metadata": {},
   "source": [
    "### SLX"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53f9b90d-7f6d-4b3e-9e99-83430957fe3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "slx1 = OLS(y,x,w=wq,slx_lags=1,spat_diag=True,moran=True,\n",
    "                 name_w=w_name,name_ds=ds_name)\n",
    "print(slx1.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ee2be15-2067-4caf-b02b-aca6b46118c5",
   "metadata": {},
   "source": [
    "Relative to the classic regression model, the fit improves slightly, but the constant, **EP_NOHSDP** and **HIS_CT** become non-significant at p = 0.01 (they are marginally signifcant at p=0.05). All but one coefficient of the SLX terms are significant (**W_EP_NOVEH** is not). The signs and magnitudes of the SLX coefficients relative to their unlagged counterparts remain a bit confusing. Only for **EP_LIMENG** and **W_EP_LIMENG** are they the same, with the lag coefficient smaller than the unlagged one, in accordance with Tobler's law. The coefficient for **W_HIS_ct** is significant and larger than that of **HIS_ct**, while the latter is not significant at p = 0.01. In other words, the interpretation of these results in terms of distance decay and Tobler's law may be a bit problematic.\n",
    "\n",
    "In terms of diagnostics, there is a slight problem with multicollinearity (often the case in SLX specifications), strong non-normality and evidence of heteroskedasticity. Moran's I is significant, as are both LM-tests, but neither of the robust forms is significant. Based on the relative magnitudes of the test statistics, there is a slight indication of a possible Lag alternative, i.e., a spatial Durbin specification. However, this indication is not as strong as that provided by the LM-SDM test statistics in the classic specification."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8a0c186-dd30-4f64-9a5b-d95bb562ebde",
   "metadata": {},
   "source": [
    "## ML Estimation of the Error Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cc74112-a2eb-4efa-9116-d1e5197d20ed",
   "metadata": {},
   "source": [
    "### Principle"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0385d805-bfb4-4e79-b0ed-0fc59c811926",
   "metadata": {},
   "source": [
    "The spatial Error model is a regular linear regression model with spatially autoregressive error terms:\n",
    "\n",
    "$$y = X\\beta + u, u = \\lambda W + e,$$\n",
    "\n",
    "where $\\lambda$ is the spatial autoregressive (error) parameter\n",
    "\n",
    "The point of departure of the Maximum Likelihood estimation of this model is again the assumption of joint normality of the error terms. However, the error terms are no longer\n",
    "independent, but they have a covariance matrix that follows from the \n",
    "spatial autoregressive specification. Specifically:\n",
    "\n",
    "$$E[uu'] = \\Sigma = \\sigma^2 [(I - \\lambda W)'(I - \\lambda W)]^{-1},$$\n",
    "\n",
    "where $\\sigma^2$ is the variance of the remaining error terms. \n",
    "\n",
    "This leads to the log-likelihood function:\n",
    "\n",
    "$$\\ln L = -(n/2)(\\ln 2\\pi) - (n/2) \\ln \\sigma^2 + \\ln | I - \\lambda W | \\\\\n",
    "           - \\frac{1}{2\\sigma^2} (y - X \\beta)'(I - \\lambda W)'(I - \\lambda W)(y - X \\beta).$$\n",
    "\n",
    "The last term in this expression is a sum of squared residuals in a \n",
    "regression of $(I - \\lambda W)y$ on $(I - \\lambda W)X$, i.e., a standard OLS estimation, but\n",
    "based on the\n",
    "spatially filtered dependent and explanatory variables (of course, this assumes a value for $\\lambda$).\n",
    "The regression of the spatially filtered variables is referred to as *spatially weighted least squares* or *spatial Cochran-Orcutt*, the latter due to its similarity to a familiar time series transformation.\n",
    "\n",
    "As in ths spatial lag case, maximization of the log-likelihood is simplified since a *concentrated* likelihood can be derived that is only a function of the single parameter $\\lambda$. Once an estimate for $\\lambda$ is obtained, the corresponding estimates for $\\beta$ and $\\sigma^2$ are easily computed. For technical details, see Chapter 10 of Anselin and Rey (2014).\n",
    "\n",
    "Inference is again based on an asymptotic variance matrix."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1bfd7648-bdd1-4ebb-bcf5-4dc74782693d",
   "metadata": {},
   "source": [
    "### Implementation methods"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ada2a0d-81fc-4b07-8ba9-d7e1bc101b1a",
   "metadata": {},
   "source": [
    "ML estimation of the spatial error model works in the same way as for the lag model. It is implemented by means of `spreg.ML_Error`, with all the standard regression arguments (i.e., at a minimum, **y**, **x** and **w**). Again, three different methods are implemented: `full`, `ord` an `LU`. These differ only in the way the Jacobian term $\\ln | I - \\lambda W |$ is computed. \n",
    "\n",
    "As in the lag case, the default optimization method is *brute force*, or `method=\"full\"`. The other options are the Ord eigenvalue method, `method = \"ord\"`, and the LU matrix decomposition, `method = \"LU\"`. Again, the latter is the only reliable method for larger data sets."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75672c3c-9170-4135-bdaf-2a335eec8090",
   "metadata": {},
   "source": [
    "The ML estimation is illustrated for the same specification as before, first using `method=\"full\"`. Since this is also the default, it is not necessary to explicitly include this argument, but it is listed here for clarity. To compare the relative speed of the different methods, `time.time()` is used.\n",
    "\n",
    "Unlike what holds for the spatial lag model, there are no impacts for the spatial error model, since any spillover effects are limited to the error terms. Since the model impacts are based on averages (conditional expectation of y given X), the error terms are immaterial (on average, they are zero)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71307ef9-cc53-46a6-866e-5bd521e4503e",
   "metadata": {},
   "source": [
    "#### Method `full`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1975d6b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "t0 = time.time()\n",
    "err1 = ML_Error(y,x,w=wq,method=\"full\",\n",
    "                     name_w=w_name,name_ds=ds_name)\n",
    "t1 = time.time()\n",
    "print(\"Time in seconds: \",t1-t0)\n",
    "print(err1.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec92847a-2982-4d67-a514-da927b0fd7f6",
   "metadata": {},
   "source": [
    "Even though the LM tests provided only weak evidence of an error alternative, the estimate of $\\lambda$, 0.451, is highly significant. The other regression coefficients change slightly relative to the OLS estimates, but not nearly as much as was the case for the spatial lag model. In fact, the estimates should *not* change much, since OLS remains unbiased (but becomes inefficient) in the presence of spatial error autocorrelation.\n",
    "\n",
    "The measures of fit include a pseudo $R^2$, 0.633 (squared correlation between observed and predicted y), the log-likelihood and the AIC and SC information criteria. Since the lag and error models have the same number of parameters, their log-likelihoods are directly comparable. In the lag model, the result was -2418.99, here it is -2428.85. In other words, the log-likelihood in the lag model is somewhat *larger* than for the error model, confirming the indication given by the LM test statistics (in favor of the lag model).\n",
    "\n",
    "As before, important attributes of the results are stored in the regression object. These include the regression coefficients, in **betas**, with the spatial autoregressive coefficient as the last element. The latter is also included separately as **lam**. The standard errors are in **std_err**, z-statistics and p-values in **z_stat**, and the complete variance-covariance matrix is **vm**.\n",
    "\n",
    "Since there is no reduced form for the error model, there is only one type of predicted value, contained in **predy**. However, there are two types of residuals, the classic residual, $u = y - X \\hat{\\beta}$, stored in **u**, and the spatially filtered residuals, $u - \\lambda W u$, stored in **e_filtered**. The estimate for the error variance, $\\sigma^2$, is based on the latter."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9866e7c6-4fc9-4cf4-af2f-7587b6c2a490",
   "metadata": {},
   "source": [
    "The contents of the **betas** and **lam** attributes show how the estimate for $\\lambda$ is also the last element in **betas**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "386e6b44-9f51-4572-9ea1-7fa98f7beb4f",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "print(\"betas \",err1.betas)\n",
    "print(f\"lambda: {err1.lam:0.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d3b5229-f169-4d15-aafe-416f325183a5",
   "metadata": {},
   "source": [
    "#### Method `ord`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ba06a23-111a-430e-bd60-ce54ee60a660",
   "metadata": {},
   "source": [
    "The Ord eigenvalue method is invoked by means of `method=\"ord\"`. All other attributes are the same as before."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1ec4ab7-5eed-4c14-b046-2f9efff13642",
   "metadata": {},
   "outputs": [],
   "source": [
    "t0 = time.time()\n",
    "err2 = ML_Error(y,x,w=wq,method=\"ord\",\n",
    "                     name_w=w_name,name_ds=ds_name)\n",
    "t1 = time.time()\n",
    "print(\"Time in seconds: \",t1-t0)\n",
    "print(err2.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff1ea981-669d-4ba0-a200-6f41c17717f0",
   "metadata": {},
   "source": [
    "The coefficient estimates are identical to those obtained with the `full` method. There are some slight differences in the computed standard errors (and thus also in the z-values and p-values), but the overall effect is minimal. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0439c7c-771a-446f-8677-51570d7f8dea",
   "metadata": {},
   "source": [
    "#### Method `LU`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "846b5ac1-e4b0-4b99-8a56-7663403935d1",
   "metadata": {},
   "source": [
    "Again, all arguments are the same, except for `method = \"LU\"`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1e485647-951e-4b62-80be-b1bb505b5027",
   "metadata": {},
   "outputs": [],
   "source": [
    "t0 = time.time()\n",
    "err3 = ML_Error(y,x,w=wq,method=\"LU\",\n",
    "                     name_w=w_name,name_ds=ds_name)\n",
    "t1 = time.time()\n",
    "print(\"Time in seconds: \",t1-t0)\n",
    "print(err3.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34682841-1194-4f32-a14d-1aea4ce6b9e3",
   "metadata": {},
   "source": [
    "In this case, the estimation results are identical to those for the `full` method."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e768c5f-9216-48f7-ac1f-f25c2d7e0a43",
   "metadata": {},
   "source": [
    "## Predicted Values and Residuals"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c4af1a9-aba6-4b50-b4f1-f45694e634c4",
   "metadata": {},
   "source": [
    "The two types of residuals can be readily turned into a data frame by means of `pd.DataFrame` applied to an array constructed with `np.hstack`, in the same way as was done for the OLS predicted values and residuals. In the example, the associated variable names are **resid** and **filtered**, passed as the `columns` argument.\n",
    "\n",
    "Descriptive statistics are obtained with `describe()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "343c44c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "preds = pd.DataFrame(np.hstack((err1.u,err1.e_filtered)),columns=['resid','filtered'])\n",
    "preds.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9bd86573-c074-4aec-9d51-13314ca79451",
   "metadata": {},
   "source": [
    "Note some important differences between the two concepts. The filtered residuals are the proper estimates for the regression error term e (not u). Clearly, it has a mean of zero, which is not quite the case for the unfiltered residuals. Also, the variance for the filtered residual is slightly smaller, but the range is very much smaller."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18400a28-a42c-46ad-9657-f17932d67a68",
   "metadata": {},
   "source": [
    "The correlation between the two concepts is high, but not perfect, at 0.968."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44974590-467f-49ed-beec-9725184c233e",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"Correlation between residuals:        {preds['resid'].corr(preds['filtered']):0.3f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "946ad797-705e-4782-9a46-f0865292e917",
   "metadata": {},
   "source": [
    "#### Spatial pattern of residuals"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20a5a87f-e2b2-4149-902c-7a2560b6da56",
   "metadata": {},
   "source": [
    "A final interesting comparison is between the spatial pattern of the two types of residuals. To assess this, a simple Moran scatterplot is constructed, where the spatial lag is computed by means of `libpysal.lag_spatial`. The plot itself is constructed with `sns.regplot`, which superimposes a regression line on the scatter plot of the spatial lag on the original variable. No customization of the graph is carried out.\n",
    "\n",
    "For the *naive* residuals, this yields the following plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0ecebb8e-d396-4e1c-8092-b6db612aebee",
   "metadata": {},
   "outputs": [],
   "source": [
    "werr = lag_spatial(wq,preds['resid']).reshape(-1,1)\n",
    "sns.regplot(x=preds['resid'],y=werr)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd782036-bd82-45d5-90c7-2d527d46fd61",
   "metadata": {},
   "source": [
    "The regression line shows a strong positive slope, suggesting remaining spatial clustering."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "245082da-2703-4282-bf5b-24c58f83edac",
   "metadata": {},
   "outputs": [],
   "source": [
    "wfor = lag_spatial(wq,preds['filtered']).reshape(-1,1)\n",
    "sns.regplot(x=preds['filtered'],y=wfor)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12ae004f-e7cc-47a1-85cc-6426152ff8d3",
   "metadata": {},
   "source": [
    "In contrast, the slope for the filtered residuals is essentially flat, suggesting that the spatial autocorrelation has been removed."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed8bd5bc-4921-40f3-96c3-79dfd68ed92d",
   "metadata": {},
   "source": [
    "#### Mapping predicted values and residuals"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54225ac9-bdc4-4560-a62f-c2761b8ed963",
   "metadata": {},
   "source": [
    "Optionally, the predicted values and residuals can be added to the spatial data frame in order to construct associated maps. However, since these maps create only visual impressions of spatial patterning, this is not further pursued here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "708ffa42-763f-4ee2-acf0-2ab079795cfd",
   "metadata": {},
   "source": [
    "## ML Estimation of the SLX-Error Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a27b8d3-13cb-4e6d-aef2-549f19c6dc6f",
   "metadata": {},
   "source": [
    "ML estimation of the SLX-Error model is a special case of `spreg.ML_Error`, with the additional argument of `slx_lags=1` (or a larger value). Everything else remains the same. More specifically, the three methods of `full`, `ord` and `LU` are again available. Only the default `full` is considered here. The results are essentially the same for the other methods. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d5bc8e4-3683-4bfd-a904-f944cc69a71b",
   "metadata": {},
   "outputs": [],
   "source": [
    "t0 = time.time()\n",
    "slxerr = ML_Error(y,x,w=wq,slx_lags=1,\n",
    "                    name_w=w_name,name_ds=ds_name)\n",
    "t1 = time.time()\n",
    "print(\"Time in seconds: \",t1-t0)\n",
    "print(slxerr.summary)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73b650f7-2272-4f1a-8d2b-e6c007f45d3a",
   "metadata": {},
   "source": [
    "Compared to the OLS SLX estimates, there are some minor changes, but much less so than for the spatial Durbin model. For example, **W_EP_NOHSDP** becomes marginally non-significant (p=0.02), whereas it was significant in OLS. This is the only lag coefficient where the sign differs from that of the original coefficient, which was also the case in OLS. In terms of fit, there is a slight improvement, from an AIC of 4903.5 in OLS to 4841.6 here (smaller is better).\n",
    "\n",
    "The spatial autoregressive coefficient, 0.400, is highly significant.\n",
    "\n",
    "As in the lag case, further refinements of the model specification can be carried out by eliminating some lag terms by means of `slx_vars`, as in the standard SLX model. This is not further pursued here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f5f5846-51fa-4ad4-985f-c11a68d2f380",
   "metadata": {},
   "source": [
    "### Likelihood-Ratio Tests"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab69fd57-726f-4f4d-8bd2-2273235feaa9",
   "metadata": {},
   "source": [
    "A likelihood ratio test is $LR = 2.0 \\times (LogL_1 - LogL_0)$, where $LogL_1$ is the log-likelihood for the *unrestricted* model (i.e., with more non-zero parameters), and $LogL_0$ is the log-likelihood for the *restricted* model (i.e., where some parameters, like $\\rho$, are set to zero). For example, a likelihood ratio test on the coefficient $\\rho$ in the spatial lag model would use the log likelihood in the spatial lag model as $LogL_1$, and the log-likelihood from the classic regression as $LogL_0$. \n",
    "\n",
    "The $LR$ statistic is distributed as a Chi-square random variable with degrees of freedom equal to the number of restrictions, i.e., 1 for the spatial autoregressive coefficient, but more for the SLX and spatial Durbin models, depending on how many explanatory variables are included. The LR tests are an alternative to the Wald tests (asymptotic t-values) on the spatial coefficient and the LM tests for spatial effects considered earlier.\n",
    "\n",
    "The same likelihood ratio test as in the lag model can be implemented with `spreg.diagnostics.likratiotest`. Its two arguments are the regression object for the constrained model and the regression object for the unconstrained model. The result is a dictionary with the statistic (`likr`), the degrees of freedom (`df`) and the p-value (`p-value`)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc1e3f8e-4901-4c8e-8cf9-5e41460d40fa",
   "metadata": {},
   "source": [
    "Four different LR test consider the following constraints:\n",
    "- Error vs OLS, i.e., $\\lambda = 0$ in the Error model: arguments are **ols1** and **err1**\n",
    "- SLX-Error vs OLS, i.e., both $\\lambda = 0$ and $\\gamma = 0$ in the SLX-Error model: argumentes are **ols1** and **slxerr**\n",
    "- SLX-Error model vs Error model, i.e., $\\gamma = 0$ in the SLX-Error model: arguments are **err1** and **slxerr**\n",
    "- SLX-Error model vs SLX, i.e., $\\lambda = 0$ in the SLX-Error model: arguments are **slx1** and **slxerr**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c3b9c76",
   "metadata": {},
   "outputs": [],
   "source": [
    "LR_Error = likratiotest(ols1,err1)\n",
    "LR_SLXO = likratiotest(ols1,slxerr)\n",
    "LR_SLXE = likratiotest(err1,slxerr)\n",
    "LR_SLXS = likratiotest(slx1,slxerr)\n",
    "\n",
    "print(f\"LR statistic Error-OLS: {LR_Error[\"likr\"]:0.3f}, d.f. {LR_Error[\"df\"]:2d}, p-value {LR_Error[\"p-value\"]:0.4f}\")\n",
    "print(f\"LR statistic SLX-Err-OLS: {LR_SLXO[\"likr\"]:0.3f}, d.f. {LR_SLXO[\"df\"]:2d}, p-value {LR_SLXO[\"p-value\"]:0.4f}\")\n",
    "print(f\"LR statistic SLX-Err-Error: {LR_SLXE[\"likr\"]:0.3f}, d.f. {LR_SLXE[\"df\"]:2d}, p-value {LR_SLXE[\"p-value\"]:0.4f}\")\n",
    "print(f\"LR statistic SLX-Err-SLX: {LR_SLXS[\"likr\"]:0.3f}, d.f. {LR_SLXS[\"df\"]:2d}, p-value {LR_SLXS[\"p-value\"]:0.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca6ed963-0f88-4354-8236-a94beba50e4a",
   "metadata": {},
   "source": [
    "In the current example, all null hypotheses are strongly rejected. \n",
    "\n",
    "For the error model in this example, the LM-Error test statistic was 88.33, the Wald test was 9.560^2 or 91.38, and the LR test (above) 72.72. Whereas the LR and Wald test follow the prescribed order (LM < LR < W), the LM-Lag test does not, which may point to potential remaining specification problems.\n",
    "\n",
    "As mentioned, the model can be refined by selectively setting `slx_vars`, but this is not pursued here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b086a22",
   "metadata": {},
   "source": [
    "## Practice\n",
    "\n",
    "As practice, different model specifications could be considered, including adding additional explanatory variables, selectively removing some lag terms, and using different spatial weights. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}