tobler.model.glm

tobler.model.glm(source_df=None, target_df=None, raster='nlcd_2011', raster_codes=None, variable=None, formula=None, likelihood='poisson', force_crs_match=True, return_model=False)

Train a generalized linear model to predict polygon attributes based on the collection of pixel values they contain.

Parameters:
source_df : geopandas.GeoDataFrame, required

geodataframe containing the original source data to be represented by another geometry

target_df : geopandas.GeoDataFrame, required

geodataframe containing target boundaries that will be used to represent the source data

raster : str, required (default="nlcd_2011")

path to the raster file that supplies input data to the regression model, i.e. the model coefficients describe the relationship between pixel counts and population counts. Defaults to the 2011 NLCD

raster_codes : list, required (default =[21, 22, 23, 24, 41, 42, 52])

list of integers that represent different types of raster cells. If no formula is given, the model will be fit from a linear combination of the logged count of each cell type listed here. Defaults to [21, 22, 23, 24, 41, 42, 52], which are informative land-type cells from the NLCD

variable : str, required

name of the variable (column) to be modeled from the source_df

formula : str, optional

patsy-style model formula that specifies the model. Raster codes should be prefixed with “Type_”, e.g. “n_total_pop ~ -1 + np.log1p(Type_21) + np.log1p(Type_22)”

likelihood : str, {'poisson', 'gaussian', 'neg_binomial'} (default = "poisson")

the likelihood function used in the model

force_crs_match : bool (default = True)

whether to coerce the geodataframe and the raster to the same CRS

return_model : bool (default = False)

whether to return the fitted model in addition to the interpolated geodataframe. If True, the function returns (geodataframe, model)
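
The raster_codes and formula parameters above can be illustrated with a small standalone sketch. It is not tobler's internal code: count_cell_types and default_formula are hypothetical helpers, the pixel values are made up, and the generated formula is only assumed to follow the same pattern as the example formula given for the formula parameter.

```python
import math
from collections import Counter

# Default codes quoted in the raster_codes description.
DEFAULT_CODES = (21, 22, 23, 24, 41, 42, 52)


def count_cell_types(pixels, raster_codes=None):
    """Count the pixels of each selected code inside one polygon.

    Hypothetical helper: `pixels` is a flat iterable of integer cell values.
    Codes that are not listed in `raster_codes` are ignored.
    """
    codes = DEFAULT_CODES if raster_codes is None else raster_codes
    counts = Counter(pixels)
    return {f"Type_{code}": counts.get(code, 0) for code in codes}


def default_formula(variable, raster_codes=None):
    """Build a patsy-style formula with one log1p term per raster code.

    Assumption: the pattern mirrors the documented example formula.
    """
    codes = DEFAULT_CODES if raster_codes is None else raster_codes
    terms = " + ".join(f"np.log1p(Type_{code})" for code in codes)
    return f"{variable} ~ -1 + {terms}"


# Made-up pixel values clipped to a single polygon.
pixels = [21, 21, 22, 11, 41, 21, 90]
counts = count_cell_types(pixels, raster_codes=[21, 22, 41])

# "Logged count" of each cell type, using log(1 + n) so zero counts are safe.
logged = {name: math.log1p(n) for name, n in counts.items()}

formula = default_formula("n_total_pop", raster_codes=[21, 22])
```

Using log1p rather than a plain logarithm keeps polygons with zero pixels of a given type in the model, since log(1 + 0) is 0 rather than undefined.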

Returns:

interpolated – a new geopandas dataframe with boundaries from target_df and modeled attribute data from the source_df. If return_model is true, the function will also return the fitted regression model for further diagnostics

Return type:

geopandas.GeoDataFrame
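
Examples

A minimal usage sketch based only on the signature documented above. The wrapper interpolate_with_glm, the column name n_total_pop, and the raster path argument are illustrative assumptions; the import path is inferred from the function's documented location and may differ between tobler versions.

```python
def interpolate_with_glm(source_df, target_df, raster_path, variable="n_total_pop"):
    """Fit the GLM on source_df and predict `variable` for target_df.

    Hypothetical wrapper: source_df and target_df are GeoDataFrames,
    and raster_path points to a land-cover raster such as the 2011 NLCD.
    """
    # Imported lazily so this sketch can be loaded without tobler installed.
    from tobler.model import glm

    # With return_model=True the documented return value is
    # (geodataframe, model), so it can be unpacked into two names.
    interpolated, model = glm(
        source_df=source_df,
        target_df=target_df,
        raster=raster_path,
        raster_codes=[21, 22, 23, 24],  # subset of the default codes
        variable=variable,
        likelihood="poisson",
        return_model=True,
    )
    return interpolated, model
```

With return_model left at its default of False, the call returns only the interpolated geodataframe, so the tuple unpacking above would need to be dropped.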