spreg.ThreeSLS

class spreg.ThreeSLS(bigy, bigX, bigyend, bigq, df=None, regimes=None, nonspat_diag=True, name_bigy=None, name_bigX=None, name_bigyend=None, name_bigq=None, name_ds=None, name_regimes=None)

User class for 3SLS estimation

Parameters:
bigy: list or dictionary

list with the names of the dependent variable for each equation or dictionary with vectors for dependent variable by equation

bigX: list or dictionary

list of lists with the names of the explanatory variables for each equation, or dictionary with matrix of explanatory variables by equation (note: the matrix already includes the constant term)

bigyend: list or dictionary

list of lists with the names of the endogenous variables for each equation, or dictionary with matrix of endogenous variables by equation

bigq: list or dictionary

list of lists with the names of the instrument variables for each equation, or dictionary with matrix of instruments by equation

df: Pandas DataFrame

Optional. Required in case bigy and bigX are lists with names of variables

regimes: list

List of n values with the mapping of each observation to a regime. Assumed to be aligned with ‘x’.

nonspat_diag: boolean

flag for non-spatial diagnostics, default = True.

name_bigy: dictionary

with name of dependent variable for each equation. Default = None, but should be specified; this is done when sur_stackxy is used.

name_bigX: dictionary

with names of explanatory variables for each equation. Default = None, but should be specified; this is done when sur_stackxy is used.

name_bigyend: dictionary

with names of endogenous variables for each equation. Default = None, but should be specified; this is done when sur_stackZ is used.

name_bigq: dictionary

with names of instrumental variables for each equation. Default = None, but should be specified; this is done when sur_stackZ is used.

name_ds: str

name for the data set.

name_regimes: str

name of regime variable for use in the output.
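As an illustration of the dictionary form of the inputs, the following is a minimal sketch built from synthetic NumPy data. It assumes that each dictionary is keyed by the equation index (0, 1, ...), which matches the keys shown in the b3SLS output in the Examples section but is not stated explicitly above. The helper with_constant and all data values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # hypothetical number of observations


def with_constant(x):
    """Prepend a column of ones, since bigX should already include the constant."""
    return np.hstack([np.ones((x.shape[0], 1)), x])


# One entry per equation; integer keys 0, 1 are an assumed convention.
bigy = {0: rng.normal(size=(n, 1)), 1: rng.normal(size=(n, 1))}
bigX = {0: with_constant(rng.normal(size=(n, 2))),
        1: with_constant(rng.normal(size=(n, 2)))}
bigyend = {0: rng.normal(size=(n, 1)), 1: rng.normal(size=(n, 1))}
bigq = {0: rng.normal(size=(n, 1)), 1: rng.normal(size=(n, 1))}

# Report the shape of every block, equation by equation.
for eq in bigy:
    print(eq, bigy[eq].shape, bigX[eq].shape, bigyend[eq].shape, bigq[eq].shape)
```

Dictionaries of this shape would be passed positionally as ThreeSLS(bigy, bigX, bigyend, bigq); since the data are supplied directly, df should not be needed in that case.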

Attributes:
bigy: dictionary

with y values

bigZ: dictionary

with matrix of exogenous and endogenous variables for each equation

bigZHZH: dictionary

with matrix of cross products Zhat_r’Zhat_s

bigZHy: dictionary

with matrix of cross products Zhat_r’y_end_s

n_eq: int

number of equations

n: int

number of observations in each cross-section

bigK: array

vector with number of explanatory variables (including constant, exogenous and endogenous) for each equation

b2SLS: dictionary

with 2SLS regression coefficients for each equation

tslsE: array

n by n_eq array with 2SLS residuals for each equation

b3SLS: dictionary

with 3SLS regression coefficients for each equation

varb: array

variance-covariance matrix

sig: array

Sigma matrix of inter-equation error covariances

bigE: array

n by n_eq array of residuals

corr: array

inter-equation 3SLS error correlation matrix

tsls_inf: dictionary

with standard error, asymptotic t and p-value, one for each equation

surchow: array

list with tuples for Chow test on regression coefficients; each tuple contains test value, degrees of freedom, p-value

name_ds: str

name for the data set

name_bigy: dictionary

with name of dependent variable for each equation

name_bigX: dictionary

with names of explanatory variables for each equation

name_bigyend: dictionary

with names of endogenous variables for each equation

name_bigq: dictionary

with names of instrumental variables for each equation

name_regimes: str

name of regime variable for use in the output
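To make the relationship between bigE, sig and corr concrete, here is a minimal sketch that derives a covariance and a correlation matrix from an n by n_eq residual array. The residuals are random stand-ins for reg.bigE, and dividing the cross products by n is an assumption; the exact scaling used internally by spreg is not stated here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_eq = 200, 2

# Hypothetical residuals standing in for reg.bigE (n by n_eq).
bigE = rng.normal(size=(n, n_eq))

# Inter-equation error covariance: residual cross products over n (assumed scaling).
sig = bigE.T @ bigE / n

# Correlation: scale each covariance by the product of the two standard deviations.
sd = np.sqrt(np.diag(sig))
corr = sig / np.outer(sd, sd)

print(corr)
```

By construction the diagonal of corr is one and the matrix is symmetric, so only the off-diagonal entries carry information about how strongly the equation errors move together.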

Examples

>>> import libpysal
>>> import geopandas as gpd
>>> from spreg import ThreeSLS
>>> import numpy as np
>>> np.set_printoptions(suppress=True) #prevent scientific format

Open data on NCOVR US County Homicides (3085 areas) from libpysal examples using geopandas.

>>> nat = libpysal.examples.load_example('Natregimes')
>>> df = gpd.read_file(nat.get_path("natregimes.shp"))

The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as dependent variable, PS80 and UE80 as exogenous regressors, RD80 as endogenous regressor and FP79 as additional instrument. For equation 2, HR90 is the dependent variable, PS90 and UE90 are the exogenous regressors, RD90 is the endogenous regressor and FP89 is the additional instrument.

>>> y_var = ['HR80','HR90']
>>> x_var = [['PS80','UE80'],['PS90','UE90']]
>>> yend_var = [['RD80'],['RD90']]
>>> q_var = [['FP79'],['FP89']]

We can now run the regression and then have a summary of the output by typing: print(reg.summary)

Alternatively, we can just check the betas and standard errors, asymptotic t and p-value of the parameters:

>>> reg = ThreeSLS(y_var,x_var,yend_var,q_var,df=df,name_ds="NAT")
>>> reg.b3SLS
{0: array([[6.92426353],
       [1.42921826],
       [0.00049435],
       [3.5829275 ]]), 1: array([[ 7.62385875],
       [ 1.65031181],
       [-0.21682974],
       [ 3.91250428]])}
>>> reg.tsls_inf
{0: array([[ 0.23220853, 29.81916157,  0.        ],
       [ 0.10373417, 13.77770036,  0.        ],
       [ 0.03086193,  0.01601807,  0.98721998],
       [ 0.11131999, 32.18584124,  0.        ]]), 1: array([[ 0.28739415, 26.52753638,  0.        ],
       [ 0.09597031, 17.19606554,  0.        ],
       [ 0.04089547, -5.30204786,  0.00000011],
       [ 0.13586789, 28.79638723,  0.        ]])}
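The columns of tsls_inf can be cross-checked against b3SLS. The sketch below copies the equation 0 values printed above and recomputes the asymptotic t as coefficient over standard error, and the p-value as a two-sided tail probability under a standard normal. The normal reference distribution is an assumption that the printed values are consistent with, and the variable labels assume the order constant, exogenous, endogenous.

```python
import math
import numpy as np

# Equation 0 values copied from the output above.
b = np.array([6.92426353, 1.42921826, 0.00049435, 3.5829275])
inf = np.array([[0.23220853, 29.81916157, 0.0],
                [0.10373417, 13.77770036, 0.0],
                [0.03086193, 0.01601807, 0.98721998],
                [0.11131999, 32.18584124, 0.0]])
se, t_printed, p_printed = inf[:, 0], inf[:, 1], inf[:, 2]

# Asymptotic t: coefficient divided by its standard error.
t = b / se

# Two-sided p-value under a standard normal (assumed reference distribution).
p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])

# Assumed coefficient order: constant, exogenous regressors, endogenous regressor.
labels = ["CONSTANT", "PS80", "UE80", "RD80"]
for name, coef, s, tv, pv in zip(labels, b, se, t, p):
    print(f"{name:>8} {coef:12.6f} {s:10.6f} {tv:10.4f} {pv:10.6f}")
```

Small differences from the printed t and p columns are expected, because the inputs here are rounded to eight decimals.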

Methods

__init__(bigy, bigX, bigyend, bigq[, df, ...])