spreg.GM_KKP

class spreg.GM_KKP(y, x, w, full_weights=False, regimes=None, vm=False, name_y=None, name_x=None, name_w=None, name_ds=None, name_regimes=None)[source]

GMM method for a spatial random effects panel model based on Kapoor, Kelejian and Prucha (2007) [KKP07].

Parameters:
y : array or pandas DataFrame

n*tx1 or nxt array for dependent variable

x : array or pandas DataFrame

Two dimensional array or DF with n*t rows and k columns, or with n rows and k*t columns, for the independent (exogenous) variables (note: must not include a constant term). A shape sketch follows this parameter list.

w : spatial weights object

Spatial weights matrix, nxn

full_weights : boolean

If True, considers different weights for each of the 6 moment conditions; if False (default), uses only 2 sets of weights, one for the first 3 and one for the last 3 moment conditions

regimes : list

List of n values with the mapping of each observation to a regime. Assumed to be aligned with 'y'.

vm : bool

If True, include variance-covariance matrix in summary results

name_y : str or list of strings

Name of dependent variable for use in output

name_x : list of strings

Names of independent variables for use in output

name_w : str

Name of weights matrix for use in output

name_ds : str

Name of dataset for use in output

name_regimes : str

Name of regime variable for use in the output
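
As an illustration of the expected shapes, here is a minimal, hypothetical sketch; the dimensions, variable names and random data are made up and only meant to show the layout of wide- and long-format inputs and of the regimes list:

    import numpy as np

    n, t, k = 4, 3, 2                      # hypothetical: n units, t periods, k regressors

    # Wide format: one column per time period.
    y_wide = np.random.random((n, t))      # n x t dependent variable
    x_wide = np.random.random((n, k * t))  # n x k*t, all periods of a variable side by side

    # Long format: cross-sections stacked over time.
    y_long = np.random.random((n * t, 1))  # n*t x 1 dependent variable
    x_long = np.random.random((n * t, k))  # n*t x k independent variables

    # Optional regimes argument: one label per spatial unit, aligned with y.
    regimes = ['north', 'north', 'south', 'south']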

Attributes:
betas : array

kx1 array of estimated coefficients

u : array

nx1 array of residuals

e_filtered : array

nx1 array of spatially filtered residuals

predy : array

nx1 array of predicted y values

n : integer

Number of observations

t : integer

Number of time periods

k : integer

Number of variables for which coefficients are estimated (including the constant)

y : array

nx1 array for dependent variable

x : array

Two dimensional array with n rows and one column for each independent (exogenous) variable, including the constant

vm : array

Variance covariance matrix (kxk)

chow : tuple

Contains 2 elements. 1: Pair of Wald statistic and p-value for the setup of global regime stability. 2: array with Wald statistic (col 0) and its p-value (col 1) for each beta that varies across regimes. Exists only if regimes is not None.

name_y : str

Name of dependent variable for use in output

name_x : list of strings

Names of independent variables for use in output

name_w : str

Name of weights matrix for use in output

name_ds : str

Name of dataset for use in output

name_regimes : str

Name of regime variable for use in the output

title : str

Name of the regression method used

Examples
--------
We first need to import the needed modules, namely numpy to convert the
data we read into arrays that ``spreg`` understands, and ``libpysal`` to
handle the example data and the spatial weights.

>>> from spreg import GM_KKP
>>> import numpy as np
>>> import libpysal
Open data on NCOVR US County Homicides (3085 areas) using libpysal.io.open().
This is the DBF associated with the NAT shapefile. Note that
libpysal.io.open() also reads data in CSV format; since the GM_KKP function only
requires the data to be passed in as numpy arrays, the user can read their
data in using any method.

>>> nat = libpysal.examples.load_example('NCOVR')
>>> db = libpysal.io.open(nat.get_path("NAT.dbf"),'r')
Extract the HR (homicide rates) data for the 1970s, 1980s and 1990s from the DBF file
and make it the dependent variable for the regression. Note that the data can also
be passed in the long format instead of the wide format used here (i.e. a vector with n*t rows
and a single column for the dependent variable and a matrix of dimension n*txk
for the independent variables); a sketch of that conversion follows the extraction below.

>>> name_y = ['HR70','HR80','HR90']
>>> y = np.array([db.by_col(name) for name in name_y]).T
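
If you prefer the long format instead, one way to build it from the wide array is sketched below. The stacking order assumed here (all observations for the first period, followed by the second, and so on) is an assumption on our part, so make sure it matches the layout your data actually uses:

    # Hypothetical wide-to-long conversion: stack the n x t array column by column,
    # so the block of T0 observations comes first, then T1, then T2.
    y_long = y.reshape(-1, 1, order='F')   # shape (n*t, 1)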
Extract RD and PS in the same time periods from the DBF to be used as
independent variables in the regression. Note that PySAL requires this to
be an nxk*t numpy array, where k is the number of independent variables (not
including a constant) and t is the number of time periods. The data must be
organized so that all time periods of a given variable are side by side
and in the correct time order.
By default, a vector of ones will be added to the independent variables passed in.

>>> name_x = ['RD70','RD80','RD90','PS70','PS80','PS90']
>>> x = np.array([db.by_col(name) for name in name_x]).T
Since we want to run a spatial error panel model, we need to specify the spatial
weights matrix that introduces the spatial configuration of the observations
into the error component of the model. To do that, we can open an already
existing gal file or create a new one. In this case, we will create one
from ``NAT.shp``.

>>> w = libpysal.weights.Queen.from_shapefile(libpysal.examples.get_path("NAT.shp"))
Unless there is a good reason not to, the weights should be
row-standardized so that every row of the matrix sums to one. Among other
things, this allows us to interpret the spatial lag of a variable as the
average value of the neighboring observations. In PySAL, this can be
easily performed in the following way:
>>> w.transform = 'r'
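
As a quick sanity check (not required for the model), you can confirm the row-standardization by summing each row of the sparse representation of the weights:

    # After row-standardization every row should sum to one
    # (a unit with no neighbors, if any, would sum to zero instead).
    row_sums = np.asarray(w.sparse.sum(axis=1)).flatten()
    print(row_sums.min(), row_sums.max())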
With the preliminaries in place, we are ready to run the model. In this
case, we need the variables and the weights matrix. If we want to
have the names of the variables printed in the output summary, we have
to pass them in as well, although this is optional. In this example
we set full_weights to False (the default), indicating that we will use
only 2 sets of moments weights, one for the first 3 and one for the last 3 moment conditions.
>>> reg = GM_KKP(y,x,w,full_weights=False,name_y=name_y, name_x=name_x)
Warning: Assuming time data is in wide format, i.e. y[0] refers to T0, y[1], refers to T1, etc.

Similarly, assuming x[0:k] refers to independent variables for T0, x[k+1:2k] refers to T1, etc.

Once we have run the model, we can explore the output a little. We can
either request a printout of the results with print(reg.summary) or
check out the individual attributes of GM_KKP:
>>> print(reg.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: GM SPATIAL ERROR PANEL MODEL - RANDOM EFFECTS (KKP)
----------------------------------------------------------------------
Data set            :     unknown
Weights matrix      :     unknown
Dependent Variable  :          HR                Number of Observations:        3085
Mean dependent var  :      6.4983                Number of Variables   :           3
S.D. dependent var  :      6.9529                Degrees of Freedom    :        3082
Pseudo R-squared    :      0.3248
<BLANKLINE>
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
            CONSTANT       6.4922156       0.1126713      57.6208690       0.0000000
                  RD       3.6244575       0.0877475      41.3055536       0.0000000
                  PS       1.3118778       0.0852516      15.3883058       0.0000000
              lambda       0.4177759
            sigma2_v      22.8190822
            sigma2_1      39.9099323
------------------------------------------------------------------------------------
================================ END OF REPORT =====================================
>>> print(reg.name_x)
['CONSTANT', 'RD', 'PS', 'lambda', ' sigma2_v', 'sigma2_1']
The attribute reg.betas contains all the coefficients: the regression betas, the spatial error
coefficient lambda, sig2_v and sig2_1; a small sketch of how to unpack them follows the printout:
>>> print(np.around(reg.betas,4))
[[ 6.4922]
 [ 3.6245]
 [ 1.3119]
 [ 0.4178]
 [22.8191]
 [39.9099]]
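
Based on the ordering shown above (regression coefficients first, followed by lambda, sigma2_v and sigma2_1), a simple way to pull out the individual components is sketched below; the variable names are only illustrative:

    coeffs = reg.betas[:-3]       # constant, RD and PS coefficients
    lamb = reg.betas[-3, 0]       # spatial error coefficient lambda
    sigma2_v = reg.betas[-2, 0]
    sigma2_1 = reg.betas[-1, 0]
    print(lamb, sigma2_v, sigma2_1)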

Finally, we can check the standard errors of the betas:
>>> print(np.around(np.sqrt(reg.vm.diagonal().reshape(3,1)),4))
[[0.1127]
 [0.0877]
 [0.0853]]
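
If the coefficients should be allowed to vary across regimes, the regimes argument can be passed in as well. The following is only a sketch, assuming the 'SOUTH' column of NAT.dbf (a 0/1 indicator for southern counties) is used as the regime variable:

    # Hypothetical regimes run; 'SOUTH' maps each county to one of two regimes.
    regimes = db.by_col('SOUTH')
    reg_regimes = GM_KKP(y, x, w, regimes=regimes,
                         name_y=name_y, name_x=name_x, name_regimes='SOUTH')
    print(reg_regimes.chow)   # global and coefficient-wise tests of regime stability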

__init__(y, x, w, full_weights=False, regimes=None, vm=False, name_y=None, name_x=None, name_w=None, name_ds=None, name_regimes=None)[source]

Methods

__init__(y, x, w[, full_weights, regimes, ...])

Attributes

mean_y

std_y

property mean_y
property std_y
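
These properties hold the mean and standard deviation of the dependent variable, the quantities reported as "Mean dependent var" and "S.D. dependent var" in the summary. For the example regression above, a quick check could look like this (the values in the comment are taken from the summary printout):

    print(round(reg.mean_y, 4), round(reg.std_y, 4))   # approximately 6.4983 6.9529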