spreg.SUR

class spreg.SUR(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)

User class for SUR (seemingly unrelated regressions) estimation, either two-step or iterated.

Parameters:

bigy : list or dictionary
    List with the name of the dependent variable for each equation, or dictionary with the vector of the dependent variable for each equation.
bigX : list or dictionary
    List of lists with the names of the explanatory variables for each equation, or dictionary with the matrix of explanatory variables for each equation (note: the matrices already include the constant term).
df : pandas DataFrame
    Optional. Required when bigy and bigX are lists of variable names.
w : spatial weights object
    Default = None.
regimes : list
    Default = None. List of n values mapping each observation to a regime. Assumed to be aligned with the rows of the explanatory variables.
nonspat_diag : bool
    Flag for non-spatial diagnostics. Default = True.
spat_diag : bool
    Flag for spatial diagnostics. Default = False.
iter : bool
    Whether or not to use iterated estimation. Default = False.
maxiter : int
    Maximum number of iterations. Default = 5.
epsilon : float
    Precision criterion to end iterations. Default = 0.00001.
verbose : bool
    Flag to print out the iteration number and the value of log det(sig) at the beginning and the end of the iteration.
name_bigy : dictionary
    Name of the dependent variable for each equation. Default = None, but should be specified, as is done when sur_stackxy is used.
name_bigX : dictionary
    Names of the explanatory variables for each equation. Default = None, but should be specified, as is done when sur_stackxy is used.
name_ds : str
    Name for the data set.
name_w : str
    Name for the weights file.
name_regimes : str
    Name of the regime variable for use in the output.
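
When bigy and bigX are passed as dictionaries rather than lists of names, the matrices must already contain the constant term. The following is a minimal sketch of that layout using NumPy only. The integer keys 0, 1, ..., the n by 1 column shape for each dependent variable, and the placement of the constant in the first column are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # number of observations in each cross-section (synthetic)

# Hypothetical explanatory variables: two regressors per equation.
x_eq0 = rng.normal(size=(n, 2))
x_eq1 = rng.normal(size=(n, 2))

# One n x 1 vector of the dependent variable per equation (assumed layout).
bigy = {
    0: rng.normal(size=(n, 1)),
    1: rng.normal(size=(n, 1)),
}

# One n x k matrix per equation, with a column of ones prepended
# because the dictionary form must already include the constant term.
const = np.ones((n, 1))
bigX = {
    0: np.hstack([const, x_eq0]),
    1: np.hstack([const, x_eq1]),
}

for eq in bigX:
    print(eq, bigy[eq].shape, bigX[eq].shape)
```

With the list form, the column of ones does not need to be built by hand, since only variable names are supplied and df provides the data.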

Attributes:

bigy : dictionary
    y values for each equation.
bigX : dictionary
    X values for each equation.
bigXX : dictionary
    \(X_t'X_r\) cross-products.
bigXy : dictionary
    \(X_t'y_r\) cross-products.
n_eq : int
    Number of equations.
n : int
    Number of observations in each cross-section.
bigK : array
    Vector with the number of explanatory variables (including the constant) for each equation.
bOLS : dictionary
    OLS regression coefficients for each equation.
olsE : array
    n by n_eq array with OLS residuals for each equation.
bSUR : dictionary
    SUR regression coefficients for each equation.
varb : array
    Variance-covariance matrix.
bigE : array
    n by n_eq array of residuals.
sig_ols : array
    Sigma matrix for OLS residuals (diagonal).
ldetS0 : float
    log det(Sigma) for the null model (OLS by equation).
niter : int
    Number of iterations (= 0 for iter=False).
corr : array
    Inter-equation error correlation matrix.
llik : float
    Log-likelihood (including the constant term in pi).
sur_inf : dictionary
    Standard error, asymptotic t and p-value, one entry for each equation.
lrtest : tuple
    Likelihood Ratio test on the off-diagonal elements of Sigma (tuple with test, df, p-value).
lmtest : tuple
    Lagrange Multiplier test on the off-diagonal elements of Sigma (tuple with test, df, p-value).
lmEtest : tuple
    Lagrange Multiplier test on error spatial autocorrelation in SUR (tuple with test, df, p-value).
lmlagtest : tuple
    Lagrange Multiplier test on spatial lag autocorrelation in SUR (tuple with test, df, p-value).
surchow : array
    List with tuples for the Chow test on regression coefficients. Each tuple contains the test value, degrees of freedom and p-value.
name_bigy : dictionary
    Name of the dependent variable for each equation.
name_bigX : dictionary
    Names of the explanatory variables for each equation.
name_ds : str
    Name for the data set.
name_w : str
    Name for the weights file.
name_regimes : str
    Name of the regime variable for use in the output.
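
Several of the attributes above (OLS coefficients, OLS residuals, Sigma, SUR coefficients, the error correlation matrix, the iteration count) correspond to the stages of a feasible GLS estimate. The sketch below shows the textbook two-step estimator, with optional iteration, in plain NumPy. It is not the code used by this class: the function name sur_fgls, its return values, the division of the residual cross-product by n, and the stopping rule on the change in log det(Sigma) are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def sur_fgls(bigy, bigX, iterate=False, maxiter=5, epsilon=1e-5):
    """Two-step (optionally iterated) feasible GLS for a SUR system (sketch)."""
    n_eq = len(bigy)
    n = bigy[0].shape[0]
    # Offsets of each equation's coefficients in the stacked coefficient vector.
    offs = np.concatenate(([0], np.cumsum([bigX[r].shape[1] for r in range(n_eq)])))

    def residuals(b):
        # n by n_eq array, one column of residuals per equation.
        return np.hstack([bigy[r] - bigX[r] @ b[r] for r in range(n_eq)])

    def gls(sig):
        # Build X'(Sigma^-1 kron I)X and X'(Sigma^-1 kron I)y block by block
        # from the cross-products X_t'X_r and X_t'y_r.
        sig_inv = np.linalg.inv(sig)
        K = offs[-1]
        A = np.zeros((K, K))
        c = np.zeros((K, 1))
        for t in range(n_eq):
            for r in range(n_eq):
                A[offs[t]:offs[t + 1], offs[r]:offs[r + 1]] = (
                    sig_inv[t, r] * (bigX[t].T @ bigX[r])
                )
                c[offs[t]:offs[t + 1]] += sig_inv[t, r] * (bigX[t].T @ bigy[r])
        beta = np.linalg.solve(A, c)
        return {r: beta[offs[r]:offs[r + 1]] for r in range(n_eq)}

    # Step 1: OLS equation by equation, then Sigma from the OLS residuals.
    b_ols = {r: np.linalg.lstsq(bigX[r], bigy[r], rcond=None)[0] for r in range(n_eq)}
    E = residuals(b_ols)
    sig = E.T @ E / n
    # Step 2: GLS using the estimated Sigma.
    b_sur = gls(sig)

    niter = 0
    if iterate:
        ldet = np.linalg.slogdet(sig)[1]
        while niter < maxiter:
            E = residuals(b_sur)
            sig = E.T @ E / n
            b_sur = gls(sig)
            niter += 1
            new_ldet = np.linalg.slogdet(sig)[1]
            if abs(new_ldet - ldet) < epsilon:  # assumed stopping rule
                break
            ldet = new_ldet

    d = np.sqrt(np.diag(sig))
    corr = sig / np.outer(d, d)  # inter-equation error correlation
    return b_ols, b_sur, sig, corr, niter


# Synthetic two-equation system with correlated errors.
rng = np.random.default_rng(0)
n = 500
errs = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
X0 = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 2))])
X1 = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 2))])
bigX = {0: X0, 1: X1}
bigy = {
    0: X0 @ np.array([[1.0], [2.0], [-1.0]]) + errs[:, [0]],
    1: X1 @ np.array([[0.5], [1.0], [3.0]]) + errs[:, [1]],
}
b_ols, b_sur, sig, corr, niter = sur_fgls(bigy, bigX)
print(b_sur[0].ravel(), b_sur[1].ravel(), niter)
```

A useful sanity check on any SUR implementation is that the GLS step reproduces OLS exactly when every equation uses the same regressor matrix, so efficiency gains only appear when the regressors differ across equations and the errors are correlated.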

Examples

>>> import libpysal
>>> import geopandas as gpd
>>> from spreg import SUR

Open data on NCOVR US County Homicides (3085 areas) from libpysal examples using geopandas.

>>> nat = libpysal.examples.load_example('Natregimes')
>>> df = gpd.read_file(nat.get_path("natregimes.shp"))

The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as dependent variable and PS80 and UE80 as exogenous regressors. For equation 2, HR90 is the dependent variable, and PS90 and UE90 the exogenous regressors.

>>> y_var = ['HR80','HR90']
>>> x_var = [['PS80','UE80'],['PS90','UE90']]

Although not required for this method, we can create a weights matrix to allow for spatial diagnostics.

>>> w = libpysal.weights.Queen.from_dataframe(df)
>>> w.transform='r'
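
Setting the transform to 'r' row-standardizes the weights, so that the weights of each observation's neighbors sum to one. The following NumPy sketch shows the effect on a small dense matrix. The 3 by 3 contiguity matrix and the y values are hypothetical, and the sketch assumes every observation has at least one neighbor (no zero rows).

```python
import numpy as np

# Hypothetical binary contiguity matrix: unit 0 borders units 1 and 2.
W = np.array([
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
])

# Row-standardize: divide each row by its sum (assumes no zero rows).
W_row = W / W.sum(axis=1, keepdims=True)

# With row-standardized weights, the spatial lag W y is the average
# of the neighbors' values for each observation.
y = np.array([[1.0], [2.0], [3.0]])
lag = W_row @ y
print(W_row)
print(lag.ravel())  # [2.5 1.  1. ]
```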

We can now run the regression and print a summary of the output with print(reg.summary).

>>> reg = SUR(y_var,x_var,df=df,w=w,spat_diag=True,name_ds="nat")
>>> print(reg.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: SEEMINGLY UNRELATED REGRESSIONS (SUR)
--------------------------------------------------------
Data set            :         nat
Weights matrix      :     unknown
Number of Equations :           2                Number of Observations:        3085
Log likelihood (SUR):  -19902.966                Number of Iterations  :           1
----------

SUMMARY OF EQUATION 1
---------------------
Dependent Variable  :        HR80                Number of Variables   :           3
Mean dependent var  :      6.9276                Degrees of Freedom    :        3082
S.D. dependent var  :      6.8251

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_1       5.1390718       0.2624673      19.5798587       0.0000000
                PS80       0.6776481       0.1219578       5.5564132       0.0000000
                UE80       0.2637240       0.0343184       7.6846277       0.0000000
------------------------------------------------------------------------------------

SUMMARY OF EQUATION 2
---------------------
Dependent Variable  :        HR90                Number of Variables   :           3
Mean dependent var  :      6.1829                Degrees of Freedom    :        3082
S.D. dependent var  :      6.6403

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_2       3.6139403       0.2534996      14.2561949       0.0000000
                PS90       1.0260715       0.1121662       9.1477755       0.0000000
                UE90       0.3865499       0.0341996      11.3027760       0.0000000
------------------------------------------------------------------------------------


REGRESSION DIAGNOSTICS
                                     TEST         DF       VALUE           PROB
                         LM test on Sigma         1      680.168           0.0000
                         LR test on Sigma         1      768.385           0.0000

OTHER DIAGNOSTICS - CHOW TEST BETWEEN EQUATIONS
                                VARIABLES         DF       VALUE           PROB
                   Constant_1, Constant_2         1       26.729           0.0000
                               PS80, PS90         1        8.241           0.0041
                               UE80, UE90         1        9.384           0.0022

DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                              DF       VALUE           PROB
Lagrange Multiplier (error)       2        1333.586        0.0000
Lagrange Multiplier (lag)         2        1275.821        0.0000

ERROR CORRELATION MATRIX
  EQUATION 1  EQUATION 2
    1.000000    0.469548
    0.469548    1.000000
================================ END OF REPORT =====================================
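
As a rough cross-check of the report above, the LM test on Sigma can be related to the error correlation. The sketch below assumes the Breusch-Pagan form of the statistic, n times the sum of squared off-diagonal correlations, with n_eq(n_eq - 1)/2 degrees of freedom. That form is an assumption here rather than something stated on this page, although plugging in the reported correlation of 0.469548 reproduces the reported LM value up to rounding.

```python
# Values copied from the report above.
n = 3085        # number of observations
n_eq = 2        # number of equations
r12 = 0.469548  # off-diagonal element of the error correlation matrix

# Assumed Breusch-Pagan form: n * sum of squared correlations below the diagonal.
lm = n * r12 ** 2
dof = n_eq * (n_eq - 1) // 2

print(round(lm, 2), dof)  # 680.17 1
```

The small gap between 680.17 and the reported 680.168 is what one would expect from using a correlation rounded to six decimals.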

__init__(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)

Methods

__init__(bigy, bigX[, df, w, regimes, ...])