spreg.SUR
- class spreg.SUR(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)
User class for SUR estimation, both two-step and iterated.
- Parameters:
- bigy
list or dictionary
list with the name of the dependent variable for each equation, or dictionary with vectors for the dependent variable by equation
- bigX
list or dictionary
list of lists with the names of the explanatory variables for each equation, or dictionary with the matrix of explanatory variables by equation (note: the dictionary form already includes the constant term)
- df
Pandas DataFrame
Optional. Required when bigy and bigX are lists with names of variables
- w
spatial weights object
default = None
- regimes
list
default = None. List of n values with the mapping of each observation to a regime. Assumed to be aligned with ‘x’.
- nonspat_diag
bool
flag for non-spatial diagnostics, default = True
- spat_diag
bool
flag for spatial diagnostics, default = False
- iter
bool
whether or not to use iterated estimation. default = False
- maxiter
int
maximum iterations; default = 5
- epsilon
float
precision criterion to end iterations. default = 0.00001
- verbose
bool
flag to print out iteration number and value of log det(sig) at the beginning and the end of the iteration
- name_bigy
dictionary
with name of dependent variable for each equation. default = None, but should be specified (as is done when sur_stackxy is used)
- name_bigX
dictionary
with names of explanatory variables for each equation. default = None, but should be specified (as is done when sur_stackxy is used)
- name_ds
str
name for the data set
- name_w
str
name for the weights file
- name_regimes
str
name of regime variable for use in the output
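The dictionary form of bigy and bigX can be sketched as follows. This is a minimal illustration on synthetic data, not taken from the library's own examples: the integer keys starting at 0, the variable names, and the sample size are all assumptions made for the sketch. The only point drawn from the parameter descriptions is that each bigX matrix must already carry its constant column.

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, for illustration only
n = 50                          # observations per equation (hypothetical)

# One n x 1 vector of the dependent variable per equation.
# Keying by integer equation index is an assumption of this sketch.
bigy = {0: rng.normal(size=(n, 1)), 1: rng.normal(size=(n, 1))}

# One n x k matrix per equation. The constant column is added by hand because
# the dictionary form is documented as already including the constant term.
const = np.ones((n, 1))
bigX = {
    0: np.hstack([const, rng.normal(size=(n, 2))]),
    1: np.hstack([const, rng.normal(size=(n, 2))]),
}

# Arrays carry no variable names, so names are supplied separately
# (hypothetical labels; whether the constant needs a label is an assumption).
name_bigy = {0: "y_eq1", 1: "y_eq2"}
name_bigX = {0: ["const", "x1a", "x2a"], 1: ["const", "x1b", "x2b"]}

# With spreg installed, the call would then look like:
# reg = SUR(bigy, bigX, name_bigy=name_bigy, name_bigX=name_bigX)
```

When bigy and bigX are given as lists of column names instead, the df argument supplies the data and none of this manual stacking is needed.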
- Attributes:
- bigy
dictionary
with y values
- bigX
dictionary
with X values
- bigXX
dictionary
with \(X_t'X_r\) cross-products
- bigXy
dictionary
with \(X_t'y_r\) cross-products
- n_eq
int
number of equations
- n
int
number of observations in each cross-section
- bigK
array
vector with number of explanatory variables (including constant) for each equation
- bOLS
dictionary
with OLS regression coefficients for each equation
- olsE
array
n by n_eq array with OLS residuals for each equation
- bSUR
dictionary
with SUR regression coefficients for each equation
- varb
array
variance-covariance matrix
- bigE
array
n by n_eq array of residuals
- sig_ols
array
Sigma matrix for OLS residuals (diagonal)
- ldetS0
float
log det(Sigma) for null model (OLS by equation)
- niter
int
number of iterations (=0 for iter=False)
- corr
array
inter-equation error correlation matrix
- llik
float
log-likelihood (including the constant pi)
- sur_inf
dictionary
with standard error, asymptotic t and p-value, one for each equation
- lrtest
tuple
Likelihood Ratio test on off-diagonal elements of sigma (tuple with test,df,p-value)
- lmtest
tuple
Lagrange Multiplier test on off-diagonal elements of sigma (tuple with test,df,p-value)
- lmEtest
tuple
Lagrange Multiplier test on error spatial autocorrelation in SUR (tuple with test, df, p-value)
- lmlagtest
tuple
Lagrange Multiplier test on spatial lag autocorrelation in SUR (tuple with test, df, p-value)
- surchow
array
list with tuples for Chow test on regression coefficients. Each tuple contains test value, degrees of freedom, p-value
- name_bigy
dictionary
with name of dependent variable for each equation
- name_bigX
dictionary
with names of explanatory variables for each equation
- name_ds
str
name for the data set
- name_w
str
name for the weights file
- name_regimes
str
name of regime variable for use in the output
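To show how attributes such as bOLS, olsE, bigXX, bigXy, bSUR and corr relate to one another, here is a sketch of the textbook two-step feasible GLS estimator for a SUR system. It is written from the standard formulas and is not the library's implementation: the function name sur_two_step, the synthetic data, and the integer dictionary keys are assumptions of the sketch.

```python
import numpy as np

def sur_two_step(bigy, bigX):
    """Textbook two-step feasible GLS for a SUR system (illustrative only)."""
    n_eq = len(bigy)
    n = bigy[0].shape[0]

    # Step 1: OLS equation by equation; residuals stacked as an n by n_eq array.
    b_ols = {r: np.linalg.lstsq(bigX[r], bigy[r], rcond=None)[0]
             for r in range(n_eq)}
    ols_e = np.hstack([bigy[r] - bigX[r] @ b_ols[r] for r in range(n_eq)])

    # Cross-equation error covariance estimated from the OLS residuals.
    sig = ols_e.T @ ols_e / n
    sig_inv = np.linalg.inv(sig)

    # Step 2: GLS on the stacked system, built from the X_t'X_r and X_t'y_r
    # cross-products weighted by the elements of the inverse of sig.
    lhs = np.block([
        [sig_inv[t, r] * (bigX[t].T @ bigX[r]) for r in range(n_eq)]
        for t in range(n_eq)
    ])
    rhs = np.vstack([
        sum(sig_inv[t, r] * (bigX[t].T @ bigy[r]) for r in range(n_eq))
        for t in range(n_eq)
    ])
    b_stacked = np.linalg.solve(lhs, rhs)

    # Split the stacked coefficient vector back into one block per equation.
    splits = np.cumsum([bigX[r].shape[1] for r in range(n_eq)])[:-1]
    b_sur = dict(enumerate(np.split(b_stacked, splits)))

    # Inter-equation error correlation from the second-step residuals.
    big_e = np.hstack([bigy[r] - bigX[r] @ b_sur[r] for r in range(n_eq)])
    s = big_e.T @ big_e / n
    d = np.sqrt(np.diag(s))
    corr = s / np.outer(d, d)
    return b_ols, b_sur, sig, corr

# Synthetic two-equation system with correlated errors (hypothetical values).
rng = np.random.default_rng(42)
n = 200
X0 = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 2))])
X1 = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 1))])
errs = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
bigX = {0: X0, 1: X1}
bigy = {
    0: X0 @ np.array([[1.0], [2.0], [-1.0]]) + errs[:, [0]],
    1: X1 @ np.array([[0.5], [3.0]]) + errs[:, [1]],
}
b_ols, b_sur, sig, corr = sur_two_step(bigy, bigX)
```

Iterated estimation would repeat the second step, re-estimating the covariance from the latest residuals, until a stopping rule such as maxiter or epsilon is met. A useful sanity check on any implementation is that SUR and OLS coefficients coincide when every equation has the same regressors.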
Examples
>>> import libpysal
>>> import geopandas as gpd
>>> from spreg import SUR
Open data on NCOVR US County Homicides (3085 areas) from libpysal examples using geopandas.
>>> nat = libpysal.examples.load_example('Natregimes')
>>> df = gpd.read_file(nat.get_path("natregimes.shp"))
The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as dependent variable and PS80 and UE80 as exogenous regressors. For equation 2, HR90 is the dependent variable, and PS90 and UE90 the exogenous regressors.
>>> y_var = ['HR80','HR90']
>>> x_var = [['PS80','UE80'],['PS90','UE90']]
Although not required for this method, we can create a weights matrix to allow for spatial diagnostics.
>>> w = libpysal.weights.Queen.from_dataframe(df)
>>> w.transform='r'
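Setting transform to 'r' requests row-standardized weights. As a rough, library-free illustration of what that means for binary contiguity weights, the sketch below gives each area's neighbours equal weights that sum to one. The function name row_standardize and the four-area neighbour structure are hypothetical.

```python
def row_standardize(neighbors):
    """Give each observation's neighbours equal weights that sum to one."""
    return {
        i: {j: 1.0 / len(js) for j in js} if js else {}
        for i, js in neighbors.items()
    }

# Hypothetical contiguity structure for four areas.
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
weights = row_standardize(neighbors)

for area, row in weights.items():
    print(area, row)
```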
We can now run the regression and then have a summary of the output by typing: ‘print(reg.summary)’
>>> reg = SUR(y_var,x_var,df=df,w=w,spat_diag=True,name_ds="nat")
>>> print(reg.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: SEEMINGLY UNRELATED REGRESSIONS (SUR)
--------------------------------------------------------
Data set            :         nat
Weights matrix      :     unknown
Number of Equations :           2                Number of Observations:        3085
Log likelihood (SUR):  -19902.966                Number of Iterations  :           1
----------
SUMMARY OF EQUATION 1
---------------------
Dependent Variable  :        HR80                Number of Variables   :           3
Mean dependent var  :      6.9276                Degrees of Freedom    :        3082
S.D. dependent var  :      6.8251
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_1       5.1390718       0.2624673      19.5798587       0.0000000
                PS80       0.6776481       0.1219578       5.5564132       0.0000000
                UE80       0.2637240       0.0343184       7.6846277       0.0000000
------------------------------------------------------------------------------------
SUMMARY OF EQUATION 2
---------------------
Dependent Variable  :        HR90                Number of Variables   :           3
Mean dependent var  :      6.1829                Degrees of Freedom    :        3082
S.D. dependent var  :      6.6403
------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_2       3.6139403       0.2534996      14.2561949       0.0000000
                PS90       1.0260715       0.1121662       9.1477755       0.0000000
                UE90       0.3865499       0.0341996      11.3027760       0.0000000
------------------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
                                     TEST         DF       VALUE           PROB
                         LM test on Sigma          1     680.168         0.0000
                         LR test on Sigma          1     768.385         0.0000
OTHER DIAGNOSTICS - CHOW TEST BETWEEN EQUATIONS
                                VARIABLES         DF       VALUE           PROB
                   Constant_1, Constant_2          1      26.729         0.0000
                               PS80, PS90          1       8.241         0.0041
                               UE80, UE90          1       9.384         0.0022
DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                                      DF       VALUE           PROB
Lagrange Multiplier (error)                2    1333.586         0.0000
Lagrange Multiplier (lag)                  2    1275.821         0.0000
ERROR CORRELATION MATRIX
  EQUATION 1  EQUATION 2
    1.000000    0.469548
    0.469548    1.000000
================================ END OF REPORT =====================================
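The LM test on Sigma in the report can be approximately reproduced by hand from the error correlation matrix. The sketch below assumes the statistic takes the standard Breusch-Pagan form, n times the sum of squared below-diagonal correlations, with n_eq(n_eq - 1)/2 degrees of freedom. The page itself does not state the formula, so treat this as a plausibility check. The function name lm_test_sigma is hypothetical.

```python
from itertools import combinations

def lm_test_sigma(corr, n):
    """Breusch-Pagan style LM statistic for a diagonal error covariance."""
    n_eq = len(corr)
    # Sum the squared correlations over each distinct pair of equations.
    stat = n * sum(corr[i][j] ** 2 for i, j in combinations(range(n_eq), 2))
    df = n_eq * (n_eq - 1) // 2
    return stat, df

# Values copied from the report: correlation matrix and number of observations.
corr = [[1.000000, 0.469548], [0.469548, 1.000000]]
lm, df = lm_test_sigma(corr, n=3085)
print(round(lm, 2), df)  # prints 680.17 1
```

The small gap to the reported 680.168 is consistent with the correlation being printed to only six decimals.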
- __init__(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)
Methods
__init__(bigy, bigX[, df, w, regimes, ...])