spreg.SURlagIV

class spreg.SURlagIV(bigy, bigX, bigyend=None, bigq=None, w=None, df=None, regimes=None, vm=False, regime_lag_sep=False, w_lags=1, lag_q=True, nonspat_diag=True, spat_diag=False, name_bigy=None, name_bigX=None, name_bigyend=None, name_bigq=None, name_ds=None, name_w=None, name_regimes=None)

User class for spatial lag estimation using IV

Parameters:

bigy (list or dictionary): list with the names of the dependent variable for each equation, or dictionary with vectors for the dependent variable by equation
bigX (list or dictionary): list of lists with the names of the explanatory variables for each equation, or dictionary with matrix of explanatory variables by equation (note: the matrix already includes the constant term)
bigyend (list or dictionary): list of lists with the names of the endogenous variables for each equation, or dictionary with matrix of endogenous variables by equation
bigq (list or dictionary): list of lists with the names of the instrumental variables for each equation, or dictionary with matrix of instruments by equation
w: spatial weights object, required
df (Pandas DataFrame): optional; required when bigy and bigX are lists with names of variables
vm (bool): listing of full variance-covariance matrix, default = False
w_lags (integer): order of spatial lags for WX instruments, default = 1
lag_q (bool): flag to apply spatial lag to other instruments, default = True
nonspat_diag (bool): flag for non-spatial diagnostics, default = True
spat_diag (bool): flag for spatial diagnostics, default = False
name_bigy (dictionary): name of the dependent variable for each equation. Default = None, but should be specified; this is done when sur_stackxy is used.
name_bigX (dictionary): names of the explanatory variables for each equation. Default = None, but should be specified; this is done when sur_stackxy is used.
name_bigyend (dictionary): names of the endogenous variables for each equation. Default = None, but should be specified; this is done when sur_stackZ is used.
name_bigq (dictionary): names of the instrumental variables for each equation. Default = None, but should be specified; this is done when sur_stackZ is used.
name_ds (str): name for the data set
name_w (str): name for the spatial weights
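
To make the role of w_lags concrete, the following is a minimal NumPy sketch of how spatially lagged regressors (WX, W²X, ...) can be stacked as instruments. The helper spatial_lag_instruments, the 4-observation weights matrix and the data are all hypothetical and for illustration only; this is not spreg's internal code.

```python
import numpy as np


def spatial_lag_instruments(W, X, w_lags=1):
    """Stack WX, W^2 X, ..., W^k X column-wise.

    Illustrative helper (an assumption, not part of spreg).
    """
    lags = []
    current = X
    for _ in range(w_lags):
        current = W @ current  # one more application of the spatial lag
        lags.append(current)
    return np.hstack(lags)


# Hypothetical row-standardized weights: 4 observations along a line
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
# One exogenous variable (constant term left out of the lags)
X = np.array([[1.0], [2.0], [3.0], [4.0]])

Q1 = spatial_lag_instruments(W, X, w_lags=1)  # columns: WX
Q2 = spatial_lag_instruments(W, X, w_lags=2)  # columns: WX, W^2 X
```

With w_lags=2 the instrument matrix in this sketch has twice as many columns as with w_lags=1, one block per lag order.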

Attributes:

w: spatial weights object
bigy (dictionary): with y values
bigZ (dictionary): with matrix of exogenous and endogenous variables for each equation
bigyend (dictionary): with matrix of endogenous variables for each equation; contains Wy only if no other endogenous specified
bigq (dictionary): with matrix of instrumental variables for each equation; contains WX only if no other endogenous specified
bigZHZH (dictionary): with matrix of cross products Zhat_r’Zhat_s
bigZHy (dictionary): with matrix of cross products Zhat_r’y_end_s
n_eq (int): number of equations
n (int): number of observations in each cross-section
bigK (array): vector with number of explanatory variables (including constant, exogenous and endogenous) for each equation
b2SLS (dictionary): with 2SLS regression coefficients for each equation
tslsE (array): N x n_eq array with 2SLS residuals for each equation
b3SLS (dictionary): with 3SLS regression coefficients for each equation
varb (array): variance-covariance matrix
sig (array): Sigma matrix of inter-equation error covariances
resids (array): n by n_eq array of residuals
corr (array): inter-equation 3SLS error correlation matrix
tsls_inf (dictionary): with standard error, asymptotic t and p-value, one for each equation
joinrho (tuple): test on joint significance of the spatial autoregressive coefficient; tuple with test statistic, degrees of freedom, p-value
surchow (array): list with tuples for the Chow test on regression coefficients; each tuple contains test value, degrees of freedom, p-value
name_w (str): name for the spatial weights
name_ds (str): name for the data set
name_bigy (dictionary): with name of dependent variable for each equation
name_bigX (dictionary): with names of explanatory variables for each equation
name_bigyend (dictionary): with names of endogenous variables for each equation
name_bigq (dictionary): with names of instrumental variables for each equation
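
The sig and corr attributes are related in the usual way that a covariance matrix relates to a correlation matrix. As a hedged illustration, the sketch below converts a made-up 2 x 2 Sigma into correlations; the helper cov_to_corr and the numbers are assumptions for this example and are not taken from spreg or from the estimation output.

```python
import numpy as np


def cov_to_corr(sig):
    """Convert a covariance matrix to a correlation matrix.

    Illustrative helper (an assumption, not part of spreg).
    """
    sd = np.sqrt(np.diag(sig))      # per-equation error standard deviations
    return sig / np.outer(sd, sd)   # divide each element by sd_r * sd_s


# Made-up inter-equation error covariance matrix for two equations
sig = np.array([[4.0, 1.2],
                [1.2, 9.0]])
corr = cov_to_corr(sig)
```

The diagonal of the result is 1 by construction, and each off-diagonal element is the covariance scaled by the two equations' error standard deviations.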

Examples

>>> import libpysal
>>> import geopandas as gpd
>>> from spreg import SURlagIV
>>> import numpy as np
>>> np.set_printoptions(suppress=True) #prevent scientific format

Open data on NCOVR US County Homicides (3085 areas) from libpysal examples using geopandas.

>>> nat = libpysal.examples.load_example('Natregimes')
>>> df = gpd.read_file(nat.get_path("natregimes.shp"))

The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as the dependent variable, PS80 and UE80 as exogenous regressors, RD80 as the endogenous regressor and FP79 as an additional instrument. For equation 2, HR90 is the dependent variable, PS90 and UE90 are the exogenous regressors, RD90 is the endogenous regressor and FP89 is an additional instrument.

>>> y_var = ['HR80','HR90']
>>> x_var = [['PS80','UE80'],['PS90','UE90']]
>>> yend_var = [['RD80'],['RD90']]
>>> q_var = [['FP79'],['FP89']]

To run a spatial lag model, we need to specify the spatial weights matrix. To do that, we can open an already existing gal file or create a new one. In this example, we will create new queen contiguity weights from the natregimes.shp data loaded above and row-standardize them.

>>> w = libpysal.weights.Queen.from_dataframe(df)
>>> w.transform='r'
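
For readers unfamiliar with row-standardization, here is a small NumPy sketch of the idea on a dense matrix: each row of a binary contiguity matrix is divided by its row sum, so every row sums to one. The 3 x 3 matrix is hypothetical, and the sketch assumes no observation is an isolate (a zero row sum would cause a division by zero); libpysal itself stores weights in its own sparse structures.

```python
import numpy as np

# Hypothetical binary contiguity matrix for 3 areas arranged in a line
W_binary = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])

# Divide each row by its number of neighbors (assumes no isolates)
row_sums = W_binary.sum(axis=1, keepdims=True)
W_row = W_binary / row_sums
```

After this step, multiplying W_row by a variable yields the average of that variable over each area's neighbors.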

We can now run the regression. Once reg has been created by the call below, print(reg.summary) displays a summary of the output. Alternatively, we can inspect just the coefficients (reg.b3SLS) and their standard errors, asymptotic t statistics and p-values (reg.tsls_inf):

>>> reg = SURlagIV(y_var,x_var,yend_var,q_var,w=w,df=df,name_ds="NAT",name_w="nat_queen")
>>> reg.b3SLS
{0: array([[ 6.95472387],
       [ 1.44044301],
       [-0.00771893],
       [ 3.65051153],
       [ 0.00362663]]), 1: array([[ 5.61101925],
       [ 1.38716801],
       [-0.15512029],
       [ 3.1884457 ],
       [ 0.25832185]])}

>>> reg.tsls_inf
{0: array([[ 0.49128435, 14.15620899,  0.        ],
       [ 0.11516292, 12.50787151,  0.        ],
       [ 0.03204088, -0.2409087 ,  0.80962588],
       [ 0.1876025 , 19.45875745,  0.        ],
       [ 0.05450628,  0.06653605,  0.94695106]]), 1: array([[ 0.44969956, 12.47726211,  0.        ],
       [ 0.10440241, 13.28674277,  0.        ],
       [ 0.04150243, -3.73761961,  0.00018577],
       [ 0.19133145, 16.66451427,  0.        ],
       [ 0.04394024,  5.87893596,  0.        ]])}
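
Each row of tsls_inf holds a standard error, an asymptotic t statistic and a p-value. As a rough cross-check, the sketch below recomputes the t statistic and p-value for the third coefficient of equation 1 from the printed coefficient and standard error. It assumes a two-sided test against the standard normal distribution, which matches the printed values but is not confirmed here as spreg's internal computation; the helper asymptotic_inference is hypothetical.

```python
import math


def asymptotic_inference(coef, std_err):
    """Return (t statistic, two-sided standard-normal p-value).

    Illustrative helper (an assumption, not part of spreg).
    """
    t_stat = coef / std_err
    # 2 * (1 - Phi(|t|)) expressed through the complementary error function
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, p_value


# Third coefficient of equation 1: value from reg.b3SLS[0][2],
# standard error from reg.tsls_inf[0][2, 0] as printed above
t_stat, p_value = asymptotic_inference(-0.00771893, 0.03204088)
```

Up to rounding of the printed inputs, the result agrees with the second and third columns of the corresponding tsls_inf row.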

Methods

__init__(bigy, bigX[, bigyend, bigq, w, df, ...])