segregation.inference.simulate_systematic_randomization

segregation.inference.simulate_systematic_randomization(df, group=None, total=None, groups=None)

Simulate systematic redistribution of population groups across spatial units.

Parameters:
df : geopandas.GeoDataFrame

geodataframe with population data to be randomized

group : str, optional

name of column on geodataframe that holds the group total (for use with singlegroup indices).

total : str, optional

name of column on geodataframe that holds the total population for each unit. For singlegroup indices, this parameter is required. For multigroup indices, it is optional and only needs to be passed when groups are not exhaustive.

groups : list, optional

list of columns on input dataframe that hold total population counts for each group of interest. If no total argument is passed, groups are assumed to be exhaustive. If total is not set and groups are not exhaustive, the function will estimate incorrect probabilities of choosing each geographic unit.
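To see why non-exhaustive groups matter, the sketch below compares unit selection probabilities computed two ways: from the sum of the listed groups, and from a true total column. It is a plain-Python illustration with made-up numbers, not the library's code. It assumes that an omitted total is replaced by the row-wise sum of the group columns, which the description above implies but does not show.

```python
# Hypothetical data: three units and two listed groups. A third, unlisted
# group is counted only in the true total.
group_a = [10, 30, 60]
group_b = [20, 20, 10]
true_total = [100, 60, 90]  # includes residents outside group_a and group_b


def unit_shares(unit_totals):
    """Return each unit's share of the regional total."""
    region = sum(unit_totals)
    return [t / region for t in unit_totals]


# Assumption: with no `total`, unit totals are taken as the sum of listed groups.
implied_total = [a + b for a, b in zip(group_a, group_b)]

shares_without_total = unit_shares(implied_total)
shares_with_total = unit_shares(true_total)

print(shares_without_total)
print(shares_with_total)
```

The two lists of shares differ, so a simulation that relies on the implied totals would place people in units with the wrong probabilities.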

Returns:
geopandas.GeoDataFrame

geodataframe with systematically randomized population groups

Notes

Simulates the random allocation of each group across geographic units, holding each group's regional total fixed (that is, it randomizes the unit-level counts for each group). For each group, counts are drawn from a multinomial distribution in which the probability of choosing a geographic unit equals that unit's current share of the total regional population. Results are guaranteed to preserve regional group totals, but the total population of each geographic unit will vary.
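The allocation described above can be sketched with the standard library alone. This is a minimal illustration, not the function's actual implementation: the helper `redistribute`, the unit names, and the counts are all hypothetical. A multinomial draw of size n is emulated by making n independent weighted choices of a unit and counting the results.

```python
import random
from collections import Counter


def redistribute(unit_totals, group_totals, seed=None):
    """Hypothetical helper: reallocate each group's regional total across units.

    unit_totals: dict mapping unit id -> current total population
    group_totals: dict mapping group name -> regional total for that group
    """
    rng = random.Random(seed)
    units = list(unit_totals)
    # Weights proportional to each unit's share of the regional population.
    weights = [unit_totals[u] for u in units]
    result = {u: {} for u in units}
    for group, n in group_totals.items():
        # n independent weighted choices of a unit, counted per unit,
        # are equivalent to one multinomial draw of size n.
        draws = Counter(rng.choices(units, weights=weights, k=n))
        for u in units:
            result[u][group] = draws[u]
    return result


# Made-up example data.
unit_totals = {"tract_1": 100, "tract_2": 60, "tract_3": 90}
group_totals = {"group_a": 100, "group_b": 50}

simulated = redistribute(unit_totals, group_totals, seed=0)
for unit, counts in simulated.items():
    print(unit, counts)
```

Summing any group's simulated counts over all units returns that group's regional total, while the simulated population of an individual unit generally differs from its original total.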

For more, see Allen, R., Burgess, S., Davidson, R., & Windmeijer, F. (2015). More reliable inference for the dissimilarity index of segregation. The Econometrics Journal, 18(1), 40–66. https://doi.org/10.1111/ectj.12039

Reference: [Allen et al., 2015]