segregation.inference.simulate_systematic_randomization
- segregation.inference.simulate_systematic_randomization(df, group=None, total=None, groups=None)
Simulate systematic redistribution of population groups across spatial units.
- Parameters:
- df
geopandas.GeoDataFrame
geodataframe with population data to be randomized
- group
str, optional
name of the column on the geodataframe that holds the group total for each unit (for use with singlegroup indices).
- total
str, optional
name of the column on the geodataframe that holds the total population for each unit. For singlegroup indices, this parameter is required. For multigroup indices, it is optional, but it should be passed when groups are not exhaustive (see groups).
- groups
list, optional
list of columns on the input dataframe that hold the population counts for each group of interest. If no total argument is passed, groups are assumed to be exhaustive. If total is not set and groups are not exhaustive, the function will estimate incorrect probabilities of choosing each geographic unit.
- Returns:
geopandas.GeoDataFrame
geodataframe with systematically randomized population groups
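The warning on the total and groups parameters can be illustrated with a small NumPy sketch. It does not call the library. The counts are hypothetical, and it assumes that when total is omitted the unit shares are derived from the sum of the listed group columns; that is a reading of the parameter descriptions above, not confirmed behavior.

```python
import numpy as np

# Hypothetical counts for three units. Some residents belong to neither
# group_a nor group_b, so the groups are NOT exhaustive.
group_a = np.array([80, 10, 50])
group_b = np.array([20, 40, 50])
total = np.array([100, 150, 100])  # true unit populations

# Unit probabilities based on the true total population of each unit.
share_from_total = total / total.sum()

# Unit probabilities if the groups are (wrongly) assumed to be exhaustive
# and the unit population is taken as the sum of the group columns.
groups_sum = group_a + group_b
share_from_groups = groups_sum / groups_sum.sum()

print(share_from_total)
print(share_from_groups)
```

The two sets of shares differ for every unit, which is why passing total matters when the listed groups do not cover the whole population.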
Notes
Simulates the random allocation of each group across geographic units, given the total population of each group (that is, it randomizes the location totals for each group). Given the total population of each group in the region, take draws from a multinomial distribution in which the probability of choosing each geographic unit equals the share of the region's total population currently residing in that unit. Results are guaranteed to respect regional group totals, but the total population of each geographic unit will vary from the observed data.
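The procedure described in this note can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name simulate_group_totals, its arguments, and the example counts are all hypothetical, and the inputs are plain arrays rather than a GeoDataFrame.

```python
import numpy as np

def simulate_group_totals(counts, unit_totals=None, seed=None):
    """Hypothetical helper: redistribute each group's regional total
    across units with one multinomial draw per group.

    counts      : array of shape (n_units, n_groups) with observed counts
    unit_totals : optional array of shape (n_units,); if omitted, the row
                  sums of `counts` are used (groups assumed exhaustive)
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    if unit_totals is None:
        unit_totals = counts.sum(axis=1)
    unit_totals = np.asarray(unit_totals, dtype=float)

    # Probability of choosing each unit = its share of the regional population.
    probs = unit_totals / unit_totals.sum()

    # Regional total of each group, which every draw preserves exactly.
    group_totals = counts.sum(axis=0)

    # One multinomial draw per group; columns are groups, rows are units.
    return np.column_stack(
        [rng.multinomial(int(n), probs) for n in group_totals]
    )

observed = np.array([[80, 20], [10, 90], [50, 50]])
simulated = simulate_group_totals(observed, seed=0)

print(simulated.sum(axis=0))  # regional group totals, same as observed
print(simulated.sum(axis=1))  # unit totals, generally different from observed
```

Each column of the result sums to the observed regional total for that group, while the row sums (unit populations) are free to vary from one draw to the next.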
For more, see Allen, R., Burgess, S., Davidson, R., & Windmeijer, F. (2015). More reliable inference for the dissimilarity index of segregation. The Econometrics Journal, 18(1), 40–66. https://doi.org/10.1111/ectj.12039
Reference: [Allen et al., 2015]