segregation.inference.TwoValueTest

class segregation.inference.TwoValueTest(seg_class_1, seg_class_2, iterations_under_null=500, null_approach='random_label', n_jobs=-1, backend='loky', index_kwargs_1=None, index_kwargs_2=None, **kwargs)[source]

Perform comparative inference for two segregation measures.

Parameters:
seg_class_1: segregation.singlegroup or segregation.multigroup class

a fitted segregation class to be compared to seg_class_2

seg_class_2: segregation.singlegroup or segregation.multigroup class

a fitted segregation class to be compared to seg_class_1

iterations_under_null: int

number of iterations to simulate observations in a null distribution

null_approach: str

The type of null hypothesis to simulate. One of the following:

  • random_label:

Randomly assign each spatial unit to a region, then recalculate the segregation indices and take their difference. Repeat this process iterations_under_null times to generate a reference distribution, then test the observed difference against this distribution.

  • bootstrap:

Use bootstrap resampling to generate distributions of each segregation index in the comparison, then use a two-sample t-test to compare the distribution means.

  • composition:

Generate counterfactual estimates for each region using the sim_composition approach. On each iteration, generate a synthetic dataset for each region where each unit has a 50% chance of belonging to the original data or the counterfactual data. Recalculate segregation indices on the synthetic datasets.

  • share:

Generate counterfactual estimates for each region using the sim_share approach. On each iteration, generate a synthetic dataset for each region where each unit has a 50% chance of belonging to the original data or the counterfactual data. Recalculate segregation indices on the synthetic datasets, then follow the random labeling method on these synthetic data.

  • dual_composition:

Generate counterfactual estimates for each region using the sim_dual_composition approach. On each iteration, generate a synthetic dataset for each region where each unit has a 50% chance of belonging to the original data or the counterfactual data. Then follow the random labeling method on these synthetic data.

  • person_permutation:

Use the simulate_person_permutation approach to randomly reallocate the combined population across both regions, then recalculate segregation indices.
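
The random_label approach described above can be illustrated with a small standalone sketch. It does not use the segregation package or reproduce its implementation: the dissimilarity function, the random_label_differences helper, and the toy data are all hypothetical, and the sketch assumes each region keeps its original number of units when labels are shuffled.

```python
import random


def dissimilarity(units):
    """Dissimilarity index for a list of (group_count, total_count) units."""
    group_total = sum(g for g, _ in units)
    other_total = sum(t - g for g, t in units)
    return 0.5 * sum(
        abs(g / group_total - (t - g) / other_total) for g, t in units
    )


def random_label_differences(region_1, region_2, iterations, seed=0):
    """Simulate index differences after randomly relabeling pooled units.

    Assumption: region sizes are preserved on every shuffle.
    """
    rng = random.Random(seed)
    pooled = list(region_1) + list(region_2)
    n1 = len(region_1)
    diffs = []
    for _ in range(iterations):
        rng.shuffle(pooled)  # randomly reassign units to the two regions
        diffs.append(dissimilarity(pooled[:n1]) - dissimilarity(pooled[n1:]))
    return diffs


# Hypothetical toy data: (minority count, total count) per spatial unit.
region_1 = [(90, 100), (10, 100), (80, 100), (20, 100)]
region_2 = [(50, 100), (45, 100), (55, 100), (50, 100)]

observed = dissimilarity(region_1) - dissimilarity(region_2)
simulated = random_label_differences(region_1, region_2, iterations=500)
```

Here observed plays the role of the observed difference and simulated the role of the reference distribution; the other null approaches change only how the synthetic datasets are generated before this comparison step.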

n_jobs: int, optional

number of cores to use for estimation. If -1, all available CPUs will be used

backend: str, optional

which backend to use with joblib. Options include “loky”, “multiprocessing”, or “threading”

index_kwargs_1: dict, optional

extra parameters to pass to segregation index 1.

index_kwargs_2: dict, optional

extra parameters to pass to segregation index 2.

Notes

This class performs inference to compare two segregation measures. The comparison can be between the same location at two different points in time, or between two different locations at the same point in time. The null hypothesis is H0: Segregation_1 is not different from Segregation_2. Based on Rey, Sergio J., and Myrna L. Sastré-Gutiérrez. “Interregional inequality dynamics in Mexico.” Spatial Economic Analysis 5.3 (2010): 277-298.

Examples

Several examples can be found at https://github.com/pysal/segregation/blob/master/notebooks/inference_wrappers_example.ipynb.

Attributes:
p_value: float

Two-tailed p-value

est_sim: numpy array

Estimates of the segregation measure differences under the null hypothesis

est_point_diff: float

Observed difference between the segregation measures
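
As a rough illustration of how these attributes relate, the sketch below computes one common form of a two-tailed pseudo p-value from a set of simulated differences and an observed difference. The function two_tailed_pseudo_p and the sample values are hypothetical; the library's exact formula may differ (for instance in how ties are counted, or for the bootstrap approach, which uses a t-test).

```python
def two_tailed_pseudo_p(simulated, observed):
    """Share of simulated differences at least as extreme as the observed one."""
    extreme = sum(1 for d in simulated if abs(d) >= abs(observed))
    return extreme / len(simulated)


# Hypothetical values standing in for est_sim and est_point_diff.
simulated = [-0.30, -0.10, 0.00, 0.05, 0.20, 0.40]
p = two_tailed_pseudo_p(simulated, 0.25)
```

A small value indicates that few simulated differences under the null are as large in magnitude as the observed difference.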

__init__(seg_class_1, seg_class_2, iterations_under_null=500, null_approach='random_label', n_jobs=-1, backend='loky', index_kwargs_1=None, index_kwargs_2=None, **kwargs)[source]

Methods

__init__(seg_class_1, seg_class_2[, ...])

plot([color, color2, kde, ax])

Plot the distribution of simulated values and the index value being tested.

plot(color='darkblue', color2='darkred', kde=True, ax=None, **kwargs)[source]

Plot the distribution of simulated values and the index value being tested.

Parameters:
color: str, optional

histogram color, by default ‘darkblue’

color2: str, optional

Color for the second histogram, by default ‘darkred’. Only relevant for the bootstrap test

kde: bool, optional

Whether to plot the kernel density estimate along with the histogram, by default True

ax: matplotlib.axes, optional

axes object to plot onto, by default None

kwargs: seaborn.histplot argument, optional

additional keyword arguments passed to seaborn’s histplot function

Returns:
matplotlib.axes

pyplot axes object