spreg.SUR
- class spreg.SUR(bigy, bigX, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)
User class for seemingly unrelated regressions (SUR) estimation, either two-step or iterated.
- Parameters:
- bigy
dictionary
with vector for dependent variable by equation
- bigX
dictionary
with matrix of explanatory variables by equation (note, already includes constant term)
- w
spatial weights object
default = None
- regimes
list
default = None. List of n values mapping each observation to a regime. Assumed to be aligned with the rows of bigX.
- nonspat_diag
bool
flag for non-spatial diagnostics, default = True
- spat_diag
bool
flag for spatial diagnostics, default = False
- iter
bool
whether or not to use iterated estimation. default = False
- maxiter
int
maximum iterations; default = 5
- epsilon
float
precision criterion to end iterations. default = 0.00001
- verbose
bool
flag to print out the iteration number and the value of log det(sig) at the beginning and the end of the iteration. default = False
- name_bigy
dictionary
with name of dependent variable for each equation. default = None, but should be specified, as is done when sur_stackxy is used
- name_bigX
dictionary
with names of explanatory variables for each equation. default = None, but should be specified, as is done when sur_stackxy is used
- name_ds
str
name for the data set
- name_w
str
name for the weights file
- name_regimes
str
name of regime variable for use in the output
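How iter, maxiter and epsilon interact can be pictured with a small, library-independent sketch. It assumes that iterated estimation tracks a scalar criterion (log det(sig), as the verbose description suggests) and stops once the change in that criterion falls below epsilon or maxiter passes have been run. The helper iterate_until and the toy update are hypothetical and not part of spreg, whose actual loop may differ in detail.

```python
import math


def iterate_until(update, start, maxiter=5, epsilon=1e-5):
    """Repeat `update` until the criterion changes by no more than `epsilon`
    or `maxiter` passes have been run (hypothetical helper, not spreg)."""
    value = start
    n_iter = 0
    change = math.inf
    while n_iter < maxiter and change > epsilon:
        new_value = update(value)
        change = abs(new_value - value)
        value = new_value
        n_iter += 1
    return value, n_iter, change


def toy_update(ldet):
    """Toy stand-in for one re-estimation pass: halves the distance
    between the criterion and its limit of 2.0."""
    return 2.0 + 0.5 * (ldet - 2.0)


# With the default maxiter=5 the loop stops on the iteration cap.
value, n_iter, change = iterate_until(toy_update, start=3.0)

# With a generous cap the loop stops on the epsilon criterion instead.
tight_value, tight_iter, tight_change = iterate_until(
    toy_update, start=3.0, maxiter=100, epsilon=1e-5
)
```

In this toy run the default settings end on the iteration cap, not on convergence, which is a reason to check niter against maxiter after an iterated fit.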
Examples
First import libpysal to load the spatial analysis tools.
>>> import libpysal
>>> from libpysal.examples import load_example
>>> from libpysal.weights import Queen
>>> from spreg import SUR, sur_dictxy
Open data on NCOVR US County Homicides (3085 areas) using libpysal.io.open(). This is the DBF associated with the NAT shapefile. Note that libpysal.io.open() also reads data in CSV format.
>>> nat = load_example('Natregimes')
>>> db = libpysal.io.open(nat.get_path("natregimes.dbf"), 'r')
The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as dependent variable and PS80 and UE80 as exogenous regressors. For equation 2, HR90 is the dependent variable, and PS90 and UE90 the exogenous regressors.
>>> y_var = ['HR80', 'HR90']
>>> x_var = [['PS80', 'UE80'], ['PS90', 'UE90']]
Although not required for this method, we can load a weights matrix file to allow for spatial diagnostics.
>>> w = Queen.from_shapefile(nat.get_path("natregimes.shp"))
>>> w.transform = 'r'
The SUR method requires data to be provided as dictionaries. PySAL provides the tool sur_dictxy to create these dictionaries from the lists of variables. The line below creates four dictionaries containing, respectively, the dependent variables (bigy), the regressors (bigX), the dependent variables’ names (bigyvars) and the regressors’ names (bigXvars). All of these are built from the database (db) and the lists of variables (y_var and x_var) created above.
>>> bigy, bigX, bigyvars, bigXvars = sur_dictxy(db, y_var, x_var)
We can now run the regression and then have a summary of the output by typing: ‘print(reg.summary)’
>>> reg = SUR(bigy, bigX, w=w, name_bigy=bigyvars, name_bigX=bigXvars, spat_diag=True, name_ds="nat")
>>> print(reg.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: SEEMINGLY UNRELATED REGRESSIONS (SUR)
--------------------------------------------------------
Data set            :         nat
Weights matrix      :     unknown
Number of Equations :           2                Number of Observations:        3085
Log likelihood (SUR):  -19902.966                Number of Iterations  :           1
----------
SUMMARY OF EQUATION 1
---------------------
Dependent Variable  :        HR80                Number of Variables   :           3
Mean dependent var  :      6.9276                Degrees of Freedom    :        3082
S.D. dependent var  :      6.8251
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_1       5.1390718       0.2624673      19.5798587       0.0000000
                PS80       0.6776481       0.1219578       5.5564132       0.0000000
                UE80       0.2637240       0.0343184       7.6846277       0.0000000
------------------------------------------------------------------------------------
SUMMARY OF EQUATION 2
---------------------
Dependent Variable  :        HR90                Number of Variables   :           3
Mean dependent var  :      6.1829                Degrees of Freedom    :        3082
S.D. dependent var  :      6.6403
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_2       3.6139403       0.2534996      14.2561949       0.0000000
                PS90       1.0260715       0.1121662       9.1477755       0.0000000
                UE90       0.3865499       0.0341996      11.3027760       0.0000000
------------------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
                                     TEST         DF       VALUE           PROB
                         LM test on Sigma          1     680.168         0.0000
                         LR test on Sigma          1     768.385         0.0000
OTHER DIAGNOSTICS - CHOW TEST BETWEEN EQUATIONS
                                VARIABLES         DF       VALUE           PROB
                   Constant_1, Constant_2          1      26.729         0.0000
                               PS80, PS90          1       8.241         0.0041
                               UE80, UE90          1       9.384         0.0022
DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                                      DF       VALUE           PROB
Lagrange Multiplier (error)                2    1333.586         0.0000
Lagrange Multiplier (lag)                  2    1275.821         0.0000
ERROR CORRELATION MATRIX
  EQUATION 1  EQUATION 2
    1.000000    0.469548
    0.469548    1.000000
================================ END OF REPORT =====================================
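The LM test on Sigma in the report above can be roughly cross-checked by hand from the printed error correlation matrix. The sketch below assumes the statistic has the usual Breusch-Pagan form, n times the sum of squared below-diagonal correlations with n_eq(n_eq-1)/2 degrees of freedom, and that it is built from the same correlations the report prints. The functions lm_test_sigma and chi2_sf_df1 are illustrative names, not part of spreg.

```python
import math


def lm_test_sigma(corr, n):
    """LM statistic for a diagonal Sigma: n times the sum of squared
    below-diagonal correlations, with n_eq*(n_eq-1)/2 degrees of freedom.
    (Illustrative helper; assumed form, not taken from spreg's source.)"""
    n_eq = len(corr)
    ssq = sum(corr[i][j] ** 2 for i in range(n_eq) for j in range(i))
    df = n_eq * (n_eq - 1) // 2
    return n * ssq, df


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variate with exactly
    one degree of freedom (valid for df = 1 only)."""
    return math.erfc(math.sqrt(x / 2.0))


# Values copied from the report: 3085 observations, correlation 0.469548.
corr = [[1.000000, 0.469548],
        [0.469548, 1.000000]]
lm, df = lm_test_sigma(corr, n=3085)
p_value = chi2_sf_df1(lm)
print(round(lm, 2), df, p_value)
```

Because the printed correlation is rounded to six decimals, the hand-computed statistic agrees with the reported 680.168 only approximately.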
- Attributes:
- bigy
dictionary
with y values
- bigX
dictionary
with X values
- bigXX
dictionary
with \(X_t'X_r\) cross-products
- bigXy
dictionary
with \(X_t'y_r\) cross-products
- n_eq
int
number of equations
- n
int
number of observations in each cross-section
- bigK
array
vector with number of explanatory variables (including constant) for each equation
- bOLS
dictionary
with OLS regression coefficients for each equation
- olsE
array
n x n_eq array with OLS residuals for each equation
- bSUR
dictionary
with SUR regression coefficients for each equation
- varb
array
variance-covariance matrix
- bigE
array
n by n_eq array of residuals
- sig_ols
array
Sigma matrix for OLS residuals (diagonal)
- ldetS0
float
log det(Sigma) for null model (OLS by equation)
- niter
int
number of iterations (=0 for iter=False)
- corr
array
inter-equation error correlation matrix
- llik
float
log-likelihood (including the constant pi)
- sur_inf
dictionary
with standard error, asymptotic t and p-value, one for each equation
- lrtest
tuple
Likelihood Ratio test on off-diagonal elements of sigma (tuple with test, df, p-value)
- lmtest
tuple
Lagrange Multiplier test on off-diagonal elements of sigma (tuple with test, df, p-value)
- lmEtest
tuple
Lagrange Multiplier test on error spatial autocorrelation in SUR (tuple with test, df, p-value)
- lmlagtest
tuple
Lagrange Multiplier test on spatial lag autocorrelation in SUR (tuple with test, df, p-value)
- surchow
array
list with tuples for Chow test on regression coefficients. Each tuple contains test value, degrees of freedom, p-value
- name_bigy
dictionary
with name of dependent variable for each equation
- name_bigX
dictionary
with names of explanatory variables for each equation
- name_ds
str
name for the data set
- name_w
str
name for the weights file
- name_regimes
str
name of regime variable for use in the output
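The corr attribute is described as the inter-equation error correlation matrix. As a minimal sketch of how such a matrix relates to an error covariance matrix, the snippet below rescales a covariance matrix by the standard deviations on its diagonal. The function corr_from_sigma and the numbers in sigma are made up for illustration and are not taken from spreg.

```python
import math


def corr_from_sigma(sigma):
    """Scale a covariance matrix to a correlation matrix:
    corr[i][j] = sigma[i][j] / sqrt(sigma[i][i] * sigma[j][j]).
    (Illustrative helper, not part of spreg.)"""
    size = len(sigma)
    sd = [math.sqrt(sigma[i][i]) for i in range(size)]
    return [[sigma[i][j] / (sd[i] * sd[j]) for j in range(size)]
            for i in range(size)]


# Made-up 2 x 2 error covariance matrix for two equations.
sigma = [[4.0, 1.2],
         [1.2, 9.0]]
corr = corr_from_sigma(sigma)
for row in corr:
    print(["%.6f" % value for value in row])
```

The diagonal of the result is 1 by construction, and the off-diagonal entries are the quantities tested by the LM and LR tests on sigma.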
- __init__(bigy, bigX, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)
Methods
__init__(bigy, bigX[, w, regimes, ...])