segregation.dynamics.compute_divergence_profiles

segregation.dynamics.compute_divergence_profiles(gdf, groups, metric='euclidean', network=None, distance_matrix=None)

A segregation metric that uses Kullback-Leibler (KL) divergence to quantify the difference in population characteristics between (1) an area and (2) the total population.
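
For reference, the standard definition of KL divergence between two discrete distributions is given below. Reading p as an area's group shares and q as the total population's group shares is an assumption based on the description above, not a statement of the library's exact implementation.

```latex
D_{\mathrm{KL}}(p \,\|\, q) = \sum_{g} p_g \log \frac{p_g}{q_g}
```

Terms with p_g = 0 contribute zero, and the divergence is zero when p = q.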

Parameters:

gdf: pandas.DataFrame or geopandas.GeoDataFrame, required: dataframe (or geodataframe, if spatial) holding the data for the locations of interest
groups: list, required: list of columns on the dataframe holding population totals for each group
metric: str (optional; 'euclidean' by default): Distance metric for calculating pairwise distances. Accepts any metric supported by scipy.spatial.distance.pdist. Ignored if a network or distance matrix is passed
network: pandana.Network object (optional; None by default): A pandana Network object used to compute distances between observations
distance_matrix: numpy.array (optional; None by default): numpy array of distances between observations in the dataset

Returns:

aux: geopandas.GeoDataFrame: geodataframe of the KL divergence between the aggregated population and the total population. The divergence converges to zero in the final row for each observation, indicating that the total population is covered.
population_covered: the population count within the aggregated population.

The returned object is a concatenation of pandas dataframes. Each dataframe contains a set of divergence levels between an area and the total population. These areas become consecutively larger, starting from a single location and aggregating outward from it until the area represents the total population. Together, the divergence levels within a dataframe form a profile of divergence from an area. The concatenated object is the collection of these divergence profiles for every area within the total population.
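
The outward aggregation described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the function name `divergence_profile`, the toy coordinates, and the group counts are all hypothetical, and plain Euclidean distance is assumed in place of a network or precomputed distance matrix.

```python
import math


def divergence_profile(coords, counts, origin):
    """Illustrative KL divergence profile for one origin location.

    coords: list of (x, y) tuples, one per location (hypothetical input).
    counts: list of per-location group counts, e.g. [[10, 0], [5, 5]].
    origin: index of the location the profile starts from.
    Assumes every location has a non-zero total population.
    """
    n_groups = len(counts[0])

    # Group shares in the total population (the reference distribution q).
    totals = [sum(row[g] for row in counts) for g in range(n_groups)]
    grand_total = sum(totals)
    q = [t / grand_total for t in totals]

    # Visit locations from nearest to farthest; the origin comes first.
    order = sorted(
        range(len(coords)),
        key=lambda i: math.dist(coords[origin], coords[i]),
    )

    running = [0.0] * n_groups
    profile = []
    for i in order:
        # Grow the aggregated area by one location.
        running = [r + c for r, c in zip(running, counts[i])]
        covered = sum(running)
        p = [r / covered for r in running]
        # KL(p || q); terms with p_g == 0 contribute nothing.
        kl = sum(pg * math.log(pg / qg) for pg, qg in zip(p, q) if pg > 0)
        profile.append((covered, kl))
    return profile


# Toy data: three locations on a line, two population groups.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
counts = [[10, 0], [5, 5], [0, 10]]

profile = divergence_profile(coords, counts, origin=0)
for covered, kl in profile:
    print(f"population_covered={covered:.0f}  divergence={kl:.4f}")
```

In this toy case the last row covers the whole population, so its divergence is zero, matching the convergence behaviour described in the return value. Repeating the computation with each location as the origin and concatenating the results would give one profile per location.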