spreg.SUR

class spreg.SUR(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)

User class for SUR estimation, both two-step and iterated.

Parameters:
bigy: list or dictionary

list with the name of the dependent variable for each equation, or dictionary with the vector of the dependent variable for each equation

bigX: list or dictionary

list of lists with the names of the explanatory variables for each equation, or dictionary with the matrix of explanatory variables for each equation (note: each matrix already includes the constant term)

df: Pandas DataFrame

Optional in general, but required when bigy and bigX are lists of variable names

w: spatial weights object

default = None

regimes: list

default = None. List of n values with the mapping of each observation to a regime. Assumed to be aligned with ‘x’.

nonspat_diag: boolean

flag for non-spatial diagnostics, default = True

spat_diag: bool

flag for spatial diagnostics, default = False

iter: bool

whether or not to use iterated estimation. default = False

maxiter: int

maximum iterations; default = 5

epsilon: float

precision criterion to end iterations. default = 0.00001

verbose: bool

flag to print out iteration number and value of log det(sig) at the beginning and the end of the iteration

name_bigy: dictionary

with name of dependent variable for each equation. default = None, but should be specified, as is done when sur_stackxy is used

name_bigX: dictionary

with names of explanatory variables for each equation. default = None, but should be specified, as is done when sur_stackxy is used

name_ds: str

name for the data set

name_w: str

name for the weights file

name_regimes: str

name of regime variable for use in the output
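How iter, maxiter and epsilon interact can be illustrated with a generic stopping rule. The sketch below is not taken from the spreg source: it assumes that convergence is judged by the change in log det(Sigma) between passes, and the `step` callable and the toy update are hypothetical stand-ins for one re-estimation pass.

```python
def iterate_until_converged(step, ldet_start, maxiter=5, epsilon=1e-5):
    """Repeat `step` until log det(Sigma) changes by less than epsilon
    or maxiter passes have been made (assumed stopping rule)."""
    ldet_old = ldet_start
    niter = 0
    while niter < maxiter:
        ldet_new = step(ldet_old)  # hypothetical: one re-estimation pass
        niter += 1
        converged = abs(ldet_new - ldet_old) < epsilon
        ldet_old = ldet_new
        if converged:
            break
    return ldet_old, niter


def toy_step(ldet):
    """Toy update that halves the distance to a limit of 2.0 on every pass."""
    return 2.0 + 0.5 * (ldet - 2.0)


# With the default maxiter the loop stops before the criterion is met.
ldet_short, niter_short = iterate_until_converged(toy_step, 3.0, maxiter=5)

# With a larger maxiter the epsilon criterion ends the loop instead.
ldet_long, niter_long = iterate_until_converged(toy_step, 3.0, maxiter=50)
```

In this toy setting a small maxiter caps the number of passes before epsilon is reached, which is why both values are worth checking when iter=True.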

Attributes:
bigy: dictionary

with y values

bigX: dictionary

with X values

bigXX: dictionary

with \(X_t'X_r\) cross-products

bigXy: dictionary

with \(X_t'y_r\) cross-products

n_eq: int

number of equations

n: int

number of observations in each cross-section

bigK: array

vector with number of explanatory variables (including constant) for each equation

bOLS: dictionary

with OLS regression coefficients for each equation

olsE: array

n by n_eq array with OLS residuals for each equation

bSUR: dictionary

with SUR regression coefficients for each equation

varb: array

variance-covariance matrix

bigE: array

n by n_eq array of residuals

sig_ols: array

Sigma matrix for OLS residuals (diagonal)

ldetS0: float

log det(Sigma) for null model (OLS by equation)

niter: int

number of iterations (=0 for iter=False)

corr: array

inter-equation error correlation matrix

llik: float

log-likelihood (including the constant pi)

sur_inf: dictionary

with standard error, asymptotic t and p-value, one for each equation

lrtest: tuple

Likelihood Ratio test on off-diagonal elements of sigma (tuple with test, df, p-value)

lmtest: tuple

Lagrange Multiplier test on off-diagonal elements of sigma (tuple with test, df, p-value)

lmEtest: tuple

Lagrange Multiplier test on error spatial autocorrelation in SUR (tuple with test, df, p-value)

lmlagtest: tuple

Lagrange Multiplier test on spatial lag autocorrelation in SUR (tuple with test, df, p-value)

surchow: array

list with tuples for Chow test on regression coefficients. Each tuple contains test value, degrees of freedom, p-value

name_bigy: dictionary

with name of dependent variable for each equation

name_bigX: dictionary

with names of explanatory variables for each equation

name_ds: str

name for the data set

name_w: str

name for the weights file

name_regimes: str

name of regime variable for use in the output
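As an illustration of how corr and lmtest relate, the sketch below computes the standard Breusch-Pagan form of the LM statistic, n times the sum of squared off-diagonal correlations, with n_eq(n_eq - 1)/2 degrees of freedom. Whether spreg uses exactly this form, and whether it uses OLS rather than SUR residual correlations, is an assumption here; the helper name `lm_sigma_statistic` is hypothetical. The inputs are the correlation and number of observations reported in the example output on this page.

```python
from itertools import combinations


def lm_sigma_statistic(corr, n):
    """LM statistic for a diagonal Sigma (Breusch-Pagan form, assumed):
    n * sum of squared correlations above the diagonal."""
    n_eq = len(corr)
    stat = n * sum(corr[i][j] ** 2 for i, j in combinations(range(n_eq), 2))
    df = n_eq * (n_eq - 1) // 2  # number of distinct off-diagonal elements
    return stat, df


# Values taken from the ERROR CORRELATION MATRIX in the example output.
corr = [[1.0, 0.469548], [0.469548, 1.0]]
stat, df = lm_sigma_statistic(corr, n=3085)
print(round(stat, 2), df)  # prints 680.17 1
```

The result is close to, but not identical with, the 680.168 shown for "LM test on Sigma" in the example output; the small gap is consistent with rounding of the printed correlation or with the test being based on OLS residuals.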

Examples

>>> import libpysal
>>> import geopandas as gpd
>>> from spreg import SUR

Open data on NCOVR US County Homicides (3085 areas) from libpysal examples using geopandas.

>>> nat = libpysal.examples.load_example('Natregimes')
>>> df = gpd.read_file(nat.get_path("natregimes.shp"))

The specification of the model to be estimated can be provided as lists. Each equation should be listed separately. In this example, equation 1 has HR80 as dependent variable and PS80 and UE80 as exogenous regressors. For equation 2, HR90 is the dependent variable, and PS90 and UE90 the exogenous regressors.

>>> y_var = ['HR80','HR90']
>>> x_var = [['PS80','UE80'],['PS90','UE90']]
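The same kind of specification could also be supplied as dictionaries, in which case the constant term has to be added to each X matrix by hand. The sketch below uses random toy data rather than the Natregimes file, and it assumes that equations are keyed by the integers 0 and 1 and that the name dictionaries follow the Constant_1, Constant_2 labelling seen in the summary output; neither assumption is confirmed by this page.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # toy number of observations, not the 3085 counties

# Hypothetical stand-ins for HR80/HR90 and their two regressors each.
y80 = rng.normal(size=(n, 1))
y90 = rng.normal(size=(n, 1))
x80 = rng.normal(size=(n, 2))
x90 = rng.normal(size=(n, 2))

# In dictionary form each X matrix must already contain the constant.
const = np.ones((n, 1))
bigy = {0: y80, 1: y90}  # assumed keys: equation index
bigX = {0: np.hstack([const, x80]), 1: np.hstack([const, x90])}

# Labels for the output (assumed format).
name_bigy = {0: "HR80", 1: "HR90"}
name_bigX = {
    0: ["Constant_1", "PS80", "UE80"],
    1: ["Constant_2", "PS90", "UE90"],
}
```

Under these assumptions the dictionaries would be passed as SUR(bigy, bigX, name_bigy=name_bigy, name_bigX=name_bigX), with no df argument needed.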

Although not required for this method, we can create a weights matrix to allow for spatial diagnostics.

>>> w = libpysal.weights.Queen.from_dataframe(df)
>>> w.transform='r'

We can now run the regression and print a summary of the output with print(reg.summary):

>>> reg = SUR(y_var,x_var,df=df,w=w,spat_diag=True,name_ds="nat")
>>> print(reg.summary)
REGRESSION
----------
SUMMARY OF OUTPUT: SEEMINGLY UNRELATED REGRESSIONS (SUR)
--------------------------------------------------------
Data set            :         nat
Weights matrix      :     unknown
Number of Equations :           2                Number of Observations:        3085
Log likelihood (SUR):  -19902.966                Number of Iterations  :           1
----------

SUMMARY OF EQUATION 1
---------------------
Dependent Variable  :        HR80                Number of Variables   :           3
Mean dependent var  :      6.9276                Degrees of Freedom    :        3082
S.D. dependent var  :      6.8251

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_1       5.1390718       0.2624673      19.5798587       0.0000000
                PS80       0.6776481       0.1219578       5.5564132       0.0000000
                UE80       0.2637240       0.0343184       7.6846277       0.0000000
------------------------------------------------------------------------------------

SUMMARY OF EQUATION 2
---------------------
Dependent Variable  :        HR90                Number of Variables   :           3
Mean dependent var  :      6.1829                Degrees of Freedom    :        3082
S.D. dependent var  :      6.6403

------------------------------------------------------------------------------------
            Variable     Coefficient       Std.Error     z-Statistic     Probability
------------------------------------------------------------------------------------
          Constant_2       3.6139403       0.2534996      14.2561949       0.0000000
                PS90       1.0260715       0.1121662       9.1477755       0.0000000
                UE90       0.3865499       0.0341996      11.3027760       0.0000000
------------------------------------------------------------------------------------


REGRESSION DIAGNOSTICS
                                     TEST         DF       VALUE           PROB
                         LM test on Sigma         1      680.168           0.0000
                         LR test on Sigma         1      768.385           0.0000

OTHER DIAGNOSTICS - CHOW TEST BETWEEN EQUATIONS
                                VARIABLES         DF       VALUE           PROB
                   Constant_1, Constant_2         1       26.729           0.0000
                               PS80, PS90         1        8.241           0.0041
                               UE80, UE90         1        9.384           0.0022

DIAGNOSTICS FOR SPATIAL DEPENDENCE
TEST                              DF       VALUE           PROB
Lagrange Multiplier (error)       2        1333.586        0.0000
Lagrange Multiplier (lag)         2        1275.821        0.0000

ERROR CORRELATION MATRIX
  EQUATION 1  EQUATION 2
    1.000000    0.469548
    0.469548    1.000000
================================ END OF REPORT =====================================
__init__(bigy, bigX, df=None, w=None, regimes=None, nonspat_diag=True, spat_diag=False, vm=False, iter=False, maxiter=5, epsilon=1e-05, verbose=False, name_bigy=None, name_bigX=None, name_ds=None, name_w=None, name_regimes=None)

Methods

__init__(bigy, bigX[, df, w, regimes, ...])