segregation.dynamics.compute_divergence_profiles

segregation.dynamics.compute_divergence_profiles(gdf, groups, metric='euclidean', network=None, distance_matrix=None)

A segregation metric that uses Kullback-Leibler (KL) divergence to quantify the difference in population characteristics between (1) an area and (2) the total population.
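As an illustration of the quantity described here, the following is a minimal sketch of the KL divergence between one area's group composition and the composition of the total population. The function name `kl_divergence` and the sample counts are hypothetical, and the library's own implementation may differ in details such as the logarithm base or the handling of zero counts.

```python
import numpy as np


def kl_divergence(local_counts, total_counts):
    """KL divergence of a local group composition from the overall composition.

    Hypothetical helper for illustration; not part of the segregation package.
    """
    local = np.asarray(local_counts, dtype=float)
    total = np.asarray(total_counts, dtype=float)
    p = local / local.sum()   # group shares within the area
    q = total / total.sum()   # group shares in the total population
    mask = p > 0              # terms with p == 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


# Illustrative counts for two groups (made-up numbers).
total = [500, 500]
same_mix = kl_divergence([50, 50], total)   # area mirrors the total population
skewed = kl_divergence([90, 10], total)     # area dominated by one group
```

An area whose composition matches the total population yields a divergence of zero, while a skewed composition yields a positive value.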

Parameters:
data: pandas.DataFrame or geopandas.GeoDataFrame, required

dataframe, or geodataframe if a spatial index is needed, holding data for the locations of interest

groups: list, required

list of columns in the dataframe holding the population totals for each group

metric: str (optional; ‘euclidean’ by default)

Distance metric for calculating pairwise distances. Accepts any metric supported by scipy.spatial.distance.pdist. Ignored if a network or distance matrix is passed.

network: pandana.Network object (optional, None by default)

A pandana Network object used to compute distance between observations

distance_matrix: numpy.array (optional; None by default)

numpy array of distances between observations in the dataset
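A precomputed matrix for distance_matrix is an n-by-n array whose entry (i, j) is the distance between observations i and j. The following is a minimal sketch in plain NumPy, assuming hypothetical point coordinates and Euclidean distance; any other distance definition could be substituted. It also assumes that the rows of the matrix follow the row order of the input data.

```python
import numpy as np

# Hypothetical planar coordinates, one row per observation.
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Pairwise coordinate differences via broadcasting: shape (n, n, 2).
diff = coords[:, None, :] - coords[None, :, :]

# Euclidean distance between every pair of observations: shape (n, n).
distance_matrix = np.sqrt((diff ** 2).sum(axis=-1))
```

The result is symmetric with zeros on the diagonal, as expected of a distance matrix between a set of observations and itself.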

Returns:
aux: geopandas.GeoDataFrame

geodataframe of the KL divergence measure between each aggregated population and the total population. The divergence converges to zero in the final row for each observation, reflecting that the aggregate then covers the total population. population_covered is the population count within the aggregated population.

The returned object is a concatenation of pandas dataframes. Each dataframe contains a set of divergence levels between an area and the total population. The areas become consecutively larger, starting from a single location and aggregating outward from it until the area represents the total population. Together, the divergence levels within one dataframe form a divergence profile for that location. The concatenated object is the collection of these divergence profiles for every area within the total population.
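The outward aggregation described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's implementation: the helper `divergence_profile` and the sample arrays are hypothetical, locations are aggregated strictly in order of increasing distance from the origin, every location and every group has a nonzero population, and the natural logarithm is used.

```python
import numpy as np


def divergence_profile(counts, distances, origin):
    """Divergence profile for one origin location (illustrative sketch).

    counts:    (n, k) array of group counts per location
    distances: (n, n) array of pairwise distances
    origin:    row index of the starting location
    """
    counts = np.asarray(counts, dtype=float)
    distances = np.asarray(distances, dtype=float)
    order = np.argsort(distances[origin], kind="stable")  # nearest first
    cumulative = np.cumsum(counts[order], axis=0)          # growing aggregate
    covered = cumulative.sum(axis=1)                       # population covered
    p = cumulative / covered[:, None]                      # aggregate shares
    q = counts.sum(axis=0) / counts.sum()                  # overall shares
    # Treat 0 * log(0) as 0 by using a ratio of 1 wherever p == 0.
    ratio = np.where(p > 0, p / q, 1.0)
    divergence = np.sum(p * np.log(ratio), axis=1)
    return divergence, covered


# Made-up data: three locations, two groups.
counts = np.array([[90, 10], [10, 90], [50, 50]])
distances = np.array([[0.0, 1.0, 2.0],
                      [1.0, 0.0, 1.0],
                      [2.0, 1.0, 0.0]])

# One profile per location, mirroring the "collection of profiles" idea.
profiles = [divergence_profile(counts, distances, i) for i in range(len(counts))]
```

In each profile the last entry is zero, because the final aggregate is the total population itself, and the last entry of the covered population equals the overall population count.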