{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "---------------------------------------\n", "\n", "# Spatial 2SLS\n", "\n", "* **This notebook contains the PySAL/spreg code for Chapter 7 - Spatial 2SLS**\n", " * *in: Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL.*\n", " * *by: Luc Anselin and Sergio J. Rey.*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:48.619651Z", "start_time": "2021-01-03T16:26:48.586688Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last updated: 2021-01-03T11:26:48.607797-05:00\n", "\n", "Python implementation: CPython\n", "Python version : 3.8.6\n", "IPython version : 7.19.0\n", "\n", "Compiler : Clang 11.0.0 \n", "OS : Darwin\n", "Release : 20.2.0\n", "Machine : x86_64\n", "Processor : i386\n", "CPU cores : 8\n", "Architecture: 64bit\n", "\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:49.970828Z", "start_time": "2021-01-03T16:26:48.621823Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Watermark: 2.1.0\n", "\n", "spreg : 1.2.0.post1\n", "libpysal: 4.3.0\n", "numpy : 1.19.4\n", "\n" ] } ], "source": [ "import numpy\n", "import libpysal\n", "import spreg\n", "\n", "%watermark -w\n", "%watermark -iv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:49.976860Z", "start_time": "2021-01-03T16:26:49.973373Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "baltim\n", "======\n", "\n", "Baltimore house sales prices and hedonics, 1978. \n", "----------------------------------------------------------------\n", "\n", "* baltim.dbf: attribute data. (k=17)\n", "* baltim.shp: Point shapefile. (n=211)\n", "* baltim.shx: spatial index.\n", "* baltim.tri.k12.kwt: kernel weights using a triangular kernel with 12 nearest neighbors in KWT format.\n", "* baltim_k4.gwt: nearest neighbor weights (4nn) in GWT format.\n", "* baltim_q.gal: queen contiguity weights in GAL format.\n", "* baltimore.geojson: spatial weights in geojson format.\n", "\n", "Source: Dubin, Robin A. (1992). Spatial autocorrelation and neighborhood quality. Regional Science and Urban Economics 22(3), 433-452.\n" ] } ], "source": [ "libpysal.examples.explain(\"baltim\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:49.991200Z", "start_time": "2021-01-03T16:26:49.979155Z" } }, "outputs": [], "source": [ "# Read Baltimore data\n", "db = libpysal.io.open(libpysal.examples.get_path(\"baltim.dbf\"), \"r\")\n", "ds_name = \"baltim.dbf\"\n", "\n", "# Read dependent variable\n", "y_name = \"PRICE\"\n", "y = numpy.array(db.by_col(y_name)).T\n", "y = y[:, numpy.newaxis]\n", "\n", "# Read exogenous variables\n", "x_names = [\"NROOM\", \"NBATH\", \"PATIO\", \"FIREPL\", \"AC\", \"GAR\", \"AGE\", \"LOTSZ\", \"SQFT\"]\n", "x = numpy.array([db.by_col(var) for var in x_names]).T" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:50.062604Z", "start_time": "2021-01-03T16:26:49.993233Z" } }, "outputs": [], "source": [ "# Read spatial data\n", "ww = libpysal.io.open(libpysal.examples.get_path(\"baltim_q.gal\"))\n", "w = ww.read()\n", "ww.close()\n", "w_name = \"baltim_q.gal\"\n", "w.transform = \"r\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic Spatial 2SLS\n", "\n", "The model to estimate is:\n", "\n", "$$ y = \\rho Wy + X \\beta + \\epsilon $$\n", "\n", "where you use $WX$ as instruments of $Wy$." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:50.074860Z", "start_time": "2021-01-03T16:26:50.065491Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES\n", "--------------------------------------------------\n", "Data set : baltim\n", "Weights matrix : baltim_q\n", "Dependent Variable : PRICE Number of Observations: 211\n", "Mean dependent var : 44.3072 Number of Variables : 11\n", "S.D. dependent var : 23.6061 Degrees of Freedom : 200\n", "Pseudo R-squared : 0.7278\n", "Spatial Pseudo R-squared: 0.6928\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " CONSTANT -2.5762742 5.5210355 -0.4666288 0.6407655\n", " NROOM 0.9440746 1.0609697 0.8898224 0.3735612\n", " NBATH 5.5981348 1.7376725 3.2216283 0.0012746\n", " PATIO 5.8424768 2.7435166 2.1295577 0.0332081\n", " FIREPL 6.4579185 2.4238370 2.6643369 0.0077140\n", " AC 5.4871926 2.3450930 2.3398614 0.0192909\n", " GAR 4.3565951 1.6955478 2.5694321 0.0101865\n", " AGE -0.0730060 0.0523546 -1.3944510 0.1631814\n", " LOTSZ 0.0579765 0.0149534 3.8771359 0.0001057\n", " SQFT 0.0395330 0.1638055 0.2413409 0.8092909\n", " W_PRICE 0.5823313 0.0721387 8.0723813 0.0000000\n", "------------------------------------------------------------------------------------\n", "Instrumented: W_PRICE\n", "Instruments: W_AC, W_AGE, W_FIREPL, W_GAR, W_LOTSZ, W_NBATH, W_NROOM,\n", " W_PATIO, W_SQFT\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "model = spreg.GM_Lag(\n", " y,\n", " x,\n", " w=w,\n", " name_y=y_name,\n", " name_x=x_names,\n", " name_w=\"baltim_q\",\n", " name_ds=\"baltim\"\n", ")\n", "print(model.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Second order spatial lags\n", "\n", "You can also use $[WX, W^2X]$ as instruments of $Wy$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:50.085089Z", "start_time": "2021-01-03T16:26:50.076795Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES\n", "--------------------------------------------------\n", "Data set : baltim\n", "Weights matrix : baltim_q\n", "Dependent Variable : PRICE Number of Observations: 211\n", "Mean dependent var : 44.3072 Number of Variables : 11\n", "S.D. dependent var : 23.6061 Degrees of Freedom : 200\n", "Pseudo R-squared : 0.7276\n", "Spatial Pseudo R-squared: 0.6915\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " CONSTANT -2.8867580 5.4563344 -0.5290654 0.5967601\n", " NROOM 0.9527428 1.0613055 0.8977083 0.3693411\n", " NBATH 5.5975309 1.7386698 3.2194330 0.0012844\n", " PATIO 5.7884989 2.7409866 2.1118304 0.0347010\n", " FIREPL 6.4012808 2.4201110 2.6450360 0.0081682\n", " AC 5.4587589 2.3451078 2.3277220 0.0199269\n", " GAR 4.3440361 1.6961624 2.5610969 0.0104342\n", " AGE -0.0713188 0.0521742 -1.3669353 0.1716456\n", " LOTSZ 0.0575329 0.0149111 3.8583943 0.0001141\n", " SQFT 0.0377524 0.1638248 0.2304438 0.8177470\n", " W_PRICE 0.5893267 0.0695101 8.4782886 0.0000000\n", "------------------------------------------------------------------------------------\n", "Instrumented: W_PRICE\n", "Instruments: W2_AC, W2_AGE, W2_FIREPL, W2_GAR, W2_LOTSZ, W2_NBATH, W2_NROOM,\n", " W2_PATIO, W2_SQFT, W_AC, W_AGE, W_FIREPL, W_GAR, W_LOTSZ,\n", " W_NBATH, W_NROOM, W_PATIO, W_SQFT\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "# using second order spatial lags for the instruments, set w_lags = 2\n", "model2 = spreg.GM_Lag(\n", " y,\n", " x,\n", " w=w,\n", " w_lags=2,\n", " name_y=y_name,\n", " name_x=x_names,\n", " name_w=\"baltim_q\",\n", " name_ds=\"baltim\"\n", ")\n", "print(model2.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Spatial Diagnostics" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:50.099577Z", "start_time": "2021-01-03T16:26:50.088864Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES\n", "--------------------------------------------------\n", "Data set : baltim\n", "Weights matrix : baltim_q\n", "Dependent Variable : PRICE Number of Observations: 211\n", "Mean dependent var : 44.3072 Number of Variables : 11\n", "S.D. dependent var : 23.6061 Degrees of Freedom : 200\n", "Pseudo R-squared : 0.7278\n", "Spatial Pseudo R-squared: 0.6928\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " CONSTANT -2.5762742 5.5210355 -0.4666288 0.6407655\n", " NROOM 0.9440746 1.0609697 0.8898224 0.3735612\n", " NBATH 5.5981348 1.7376725 3.2216283 0.0012746\n", " PATIO 5.8424768 2.7435166 2.1295577 0.0332081\n", " FIREPL 6.4579185 2.4238370 2.6643369 0.0077140\n", " AC 5.4871926 2.3450930 2.3398614 0.0192909\n", " GAR 4.3565951 1.6955478 2.5694321 0.0101865\n", " AGE -0.0730060 0.0523546 -1.3944510 0.1631814\n", " LOTSZ 0.0579765 0.0149534 3.8771359 0.0001057\n", " SQFT 0.0395330 0.1638055 0.2413409 0.8092909\n", " W_PRICE 0.5823313 0.0721387 8.0723813 0.0000000\n", "------------------------------------------------------------------------------------\n", "Instrumented: W_PRICE\n", "Instruments: W_AC, W_AGE, W_FIREPL, W_GAR, W_LOTSZ, W_NBATH, W_NROOM,\n", " W_PATIO, W_SQFT\n", "\n", "DIAGNOSTICS FOR SPATIAL DEPENDENCE\n", "TEST MI/DF VALUE PROB\n", "Anselin-Kelejian Test 1 5.234 0.0221\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "model = spreg.GM_Lag(\n", " y,\n", " x,\n", " w=w,\n", " spat_diag=True,\n", " name_y=y_name,\n", " name_x=x_names,\n", " name_w=\"baltim_q\",\n", " name_ds=\"baltim\"\n", ")\n", "print(model.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## White Standard Errors" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2021-01-03T16:26:50.112672Z", "start_time": "2021-01-03T16:26:50.101959Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: SPATIAL TWO STAGE LEAST SQUARES\n", "--------------------------------------------------\n", "Data set : baltim\n", "Weights matrix : baltim_q\n", "Dependent Variable : PRICE Number of Observations: 211\n", "Mean dependent var : 44.3072 Number of Variables : 11\n", "S.D. dependent var : 23.6061 Degrees of Freedom : 200\n", "Pseudo R-squared : 0.7278\n", "Spatial Pseudo R-squared: 0.6928\n", "\n", "White Standard Errors\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " CONSTANT -2.5762742 7.0147591 -0.3672648 0.7134215\n", " NROOM 0.9440746 1.4002856 0.6742015 0.5001832\n", " NBATH 5.5981348 2.1605285 2.5910951 0.0095671\n", " PATIO 5.8424768 2.9445656 1.9841558 0.0472385\n", " FIREPL 6.4579185 2.4500195 2.6358641 0.0083923\n", " AC 5.4871926 2.6021469 2.1087175 0.0349690\n", " GAR 4.3565951 2.2070747 1.9739228 0.0483905\n", " AGE -0.0730060 0.0976079 -0.7479516 0.4544894\n", " LOTSZ 0.0579765 0.0237454 2.4415887 0.0146228\n", " SQFT 0.0395330 0.2355809 0.1678105 0.8667323\n", " W_PRICE 0.5823313 0.1325884 4.3920220 0.0000112\n", "------------------------------------------------------------------------------------\n", "Instrumented: W_PRICE\n", "Instruments: W_AC, W_AGE, W_FIREPL, W_GAR, W_LOTSZ, W_NBATH, W_NROOM,\n", " W_PATIO, W_SQFT\n", "\n", "DIAGNOSTICS FOR SPATIAL DEPENDENCE\n", "TEST MI/DF VALUE PROB\n", "Anselin-Kelejian Test 1 5.234 0.0221\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "model = spreg.GM_Lag(\n", " y,\n", " x,\n", " w=w,\n", " robust=\"white\",\n", " spat_diag=True,\n", " name_y=y_name,\n", " name_x=x_names,\n", " name_w=\"baltim_q\",\n", " name_ds=\"baltim\"\n", ")\n", "print(model.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "-----------------------------" ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:py38_spreg]", "language": "python", "name": "conda-env-py38_spreg-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "103px", "width": "212px" }, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }