{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "------------\n", "\n", "# Spatial Panel Models with Fixed Effects\n", "\n", "* **This notebook uses the [Panel_FE_Lag](https://pysal.org/spreg/generated/spreg.Panel_FE_Lag.html#spreg.Panel_FE_Lag) and [Panel_FE_Error](https://pysal.org/spreg/generated/spreg.Panel_FE_Error.html#spreg.Panel_FE_Error) classes.**\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:36:53.158014Z", "start_time": "2021-01-04T16:36:50.182287Z" } }, "outputs": [], "source": [ "import numpy\n", "import libpysal\n", "import spreg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Open data on NCOVR US County Homicides (3085 areas).**\n", "\n", "* First, extract the HR (homicide rate) data for the 1970s, 1980s, and 1990s as the dependent variable.\n", "* Then, extract RD and PS as independent variables in the regression.\n", "* The data is passed here in wide format: an $n \\times t$ matrix for the dependent variable and an $n \\times (t \\ast k)$ matrix for the independent variables.\n", "* Data can also be passed in long format instead of wide format, i.e.\n", "    * a vector with $n \\times t$ rows and a single column for the dependent variable, and\n", "    * a matrix of dimension $(n \\times t) \\times k$ for the independent variables." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:36:53.489678Z", "start_time": "2021-01-04T16:36:53.160457Z" } }, "outputs": [], "source": [ "# Open data on NCOVR US County Homicides (3085 areas).\n", "nat = libpysal.examples.load_example(\"NCOVR\")\n", "db = libpysal.io.open(nat.get_path(\"NAT.dbf\"), \"r\")\n", "\n", "# Create spatial weight matrix\n", "nat_shp = libpysal.examples.get_path(\"NAT.shp\")\n", "w = libpysal.weights.Queen.from_shapefile(nat_shp)\n", "w.transform = 'r'\n", "\n", "# Define dependent variable\n", "name_y = [\"HR70\", \"HR80\", \"HR90\"]\n", "y = numpy.array([db.by_col(name) for name in name_y]).T\n", "\n", "# Define independent variables\n", "name_x = [\"RD70\", \"RD80\", \"RD90\", \"PS70\", \"PS80\", \"PS90\"]\n", "x = numpy.array([db.by_col(name) for name in name_x]).T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "--------------------\n", "\n", "## Spatial Lag model\n", "\n", "Let's estimate a spatial lag panel model with fixed effects:\n", "\n", "$$\n", "y = \\rho Wy + X\\beta + \\mu_i + e\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:36:59.736302Z", "start_time": "2021-01-04T16:36:53.492370Z" } }, "outputs": [], "source": [ "fe_lag = spreg.Panel_FE_Lag(y, x, w, name_y=name_y, name_x=name_x, name_ds=\"NAT\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:36:59.741882Z", "start_time": "2021-01-04T16:36:59.737965Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL LAG PANEL - FIXED EFFECTS\n", "-----------------------------------------------------------------------\n", "Data set : NAT\n", "Weights matrix : unknown\n", "Dependent Variable : HR Number of Observations: 9255\n", "Mean dependent var : 0.0000 Number of Variables : 
3\n", "S.D. dependent var : 3.9228 Degrees of Freedom : 9252\n", "Pseudo R-squared : 0.0319\n", "Spatial Pseudo R-squared: 0.0079\n", "Sigma-square ML : 14.935 Log likelihood : -67936.533\n", "S.E of regression : 3.865 Akaike info criterion : 135879.066\n", " Schwarz criterion : 135900.465\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " RD 0.8005886 0.1614474 4.9588189 0.0000007\n", " PS -2.6003523 0.4935486 -5.2686851 0.0000001\n", " W_HR 0.1903043 0.0159991 11.8947008 0.0000000\n", "------------------------------------------------------------------------------------\n", "Warning: Assuming panel is in wide format.\n", "y[:, 0] refers to T0, y[:, 1] refers to T1, etc.\n", "x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "print(fe_lag.summary)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:36:59.753663Z", "start_time": "2021-01-04T16:36:59.743818Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0.8006],\n", " [-2.6004],\n", " [ 0.1903]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numpy.around(fe_lag.betas, decimals=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data can also be in 'long' format:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL LAG PANEL - FIXED EFFECTS\n", "-----------------------------------------------------------------------\n", "Data set : NAT\n", "Weights matrix : unknown\n", "Dependent Variable : HR Number of 
Observations: 9255\n", "Mean dependent var : 0.0000 Number of Variables : 3\n", "S.D. dependent var : 3.9228 Degrees of Freedom : 9252\n", "Pseudo R-squared : 0.0319\n", "Spatial Pseudo R-squared: 0.0079\n", "Sigma-square ML : 14.935 Log likelihood : -67936.533\n", "S.E of regression : 3.865 Akaike info criterion : 135879.066\n", " Schwarz criterion : 135900.465\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " RD 0.8005886 0.1614474 4.9588189 0.0000007\n", " PS -2.6003523 0.4935486 -5.2686851 0.0000001\n", " W_HR 0.1903043 0.0159991 11.8947008 0.0000000\n", "------------------------------------------------------------------------------------\n", "Warning: Assuming panel is in long format.\n", "y[0:N] refers to T0, y[N+1:2N] refers to T1, etc.\n", "x[0:N] refers to T0, x[N+1:2N] refers to T1, etc.\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "# Stack the wide arrays column by column (Fortran order) into long format:\n", "# y becomes (n*t) x 1 and x becomes (n*t) x k.\n", "n, t = y.shape\n", "k = x.shape[1] // t\n", "y_long = y.reshape((n * t, 1), order='F')\n", "x_long = x.reshape((n * t, k), order='F')\n", "\n", "fe_lag_long = spreg.Panel_FE_Lag(y_long, x_long, w, name_y=name_y, name_x=name_x, name_ds=\"NAT\")\n", "print(fe_lag_long.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------------------------\n", "\n", "## Spatial Error model\n", "\n", "Now, let's estimate a spatial error panel model with fixed effects:\n", "\n", "$$\n", "y = X\\beta + \\mu_i + v\n", "$$\n", "\n", "where\n", "\n", "$$\n", "v = \\lambda W v + e\n", "$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:37:03.722913Z", "start_time": "2021-01-04T16:36:59.755557Z" } }, "outputs": [], "source": [ "fe_error = spreg.Panel_FE_Error(\n", " y, x, w, name_y=name_y, name_x=name_x, name_ds=\"NAT\"\n", ")" ] 
}, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:37:03.729503Z", "start_time": "2021-01-04T16:37:03.726165Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "REGRESSION\n", "----------\n", "SUMMARY OF OUTPUT: MAXIMUM LIKELIHOOD SPATIAL ERROR PANEL - FIXED EFFECTS\n", "-------------------------------------------------------------------------\n", "Data set : NAT\n", "Weights matrix : unknown\n", "Dependent Variable : HR Number of Observations: 9255\n", "Mean dependent var : 0.0000 Number of Variables : 2\n", "S.D. dependent var : 3.9228 Degrees of Freedom : 9253\n", "Pseudo R-squared : 0.0084\n", "Sigma-square ML : 14.923 Log likelihood : -67934.005\n", "S.E of regression : 3.863 Akaike info criterion : 135872.010\n", " Schwarz criterion : 135886.276\n", "\n", "------------------------------------------------------------------------------------\n", " Variable Coefficient Std.Error z-Statistic Probability\n", "------------------------------------------------------------------------------------\n", " RD 0.8697923 0.1718029 5.0627323 0.0000004\n", " PS -2.9660674 0.5444783 -5.4475397 0.0000001\n", " lambda 0.1943460 0.0160253 12.1274197 0.0000000\n", "------------------------------------------------------------------------------------\n", "Warning: Assuming panel is in wide format.\n", "y[:, 0] refers to T0, y[:, 1] refers to T1, etc.\n", "x[:, 0:T] refers to T periods of k1, x[:, T+1:2T] refers to k2, etc.\n", "================================ END OF REPORT =====================================\n" ] } ], "source": [ "print(fe_error.summary)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2021-01-04T16:37:03.735854Z", "start_time": "2021-01-04T16:37:03.731739Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0.8698],\n", " [-2.9661],\n", " [ 0.1943]])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "numpy.around(fe_error.betas, decimals=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "------------------" ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:myenv] *", "language": "python", "name": "conda-env-myenv-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.10" } }, "nbformat": 4, "nbformat_minor": 4 }