gwlearn.base.BaseRegressor

class gwlearn.base.BaseRegressor(model, *, bandwidth=None, fixed=False, kernel='bisquare', include_focal=False, geometry=None, graph=None, n_jobs=-1, fit_global_model=True, measure_performance=True, strict=False, keep_models=False, temp_folder=None, batch_size=None, verbose=False, **kwargs)

Generic geographically weighted regression meta-class

TODO:

tvalues & adj_alpha & critical_t val
predict
performance measurements

Parameters:

model (model class): Scikit-learn model class
bandwidth (int | float): Bandwidth, given either as a distance or as a number of nearest neighbors
fixed (bool, optional): True for a distance-based bandwidth and False for an adaptive (nearest neighbor) bandwidth, by default False
kernel (str | Callable, optional): Type of kernel function used to weight observations, by default "bisquare"
include_focal (bool, optional): Include the focal observation when training its local model. Excluding it allows geographically weighted metrics to be assessed on unseen data without a train/test split, so a value is available for every sample. This is needed for further spatial analysis of model performance (and generalises to models that do not support OOB scoring). However, it leaves out the most representative sample. By default False
geometry (gpd.GeoSeries, optional): Geographic location of the observations in the sample. Used to determine the spatial interaction weights as specified by the bandwidth, fixed, kernel, and include_focal keywords. Either geometry or graph must be specified. Prediction requires geometry.
graph (Graph, optional): Custom libpysal.graph.Graph object encoding the spatial interaction between observations in the sample. If given, it is used directly and the bandwidth, fixed, kernel, and include_focal keywords are ignored. Either geometry or graph must be specified. Prediction requires geometry. Both can be specified, in which case graph encodes the spatial interaction between the observations in geometry.
n_jobs (int, optional): The number of jobs to run in parallel; -1 uses all processors. By default -1
fit_global_model (bool, optional): Whether to fit a global baseline model alongside the geographically weighted one, by default True
measure_performance (bool, optional): Calculate performance metrics for the model, by default True. If True, measures R2 and adjusted R2.
strict (bool | None, optional): Do not fit any models if at least one neighborhood has invariant y, by default False. None is treated as False but emits a warning if there are invariant models.
keep_models (bool | str | Path, optional): Keep all local models (required for prediction), by default False. Note that for some models, like random forests, the objects can be large. If a string or Path is provided, the local models are not held in memory but serialized to disk, from which they are loaded for prediction.
temp_folder (str | None, optional): Folder to be used by the pool for memmapping large arrays for sharing memory with worker processes, e.g., /tmp. Passed to joblib.Parallel, by default None
batch_size (int | None, optional): Number of models to process in each batch. Specify batch_size if your models do not fit into memory. By default None
random_state (int | None, optional): Random seed for reproducibility, by default None
verbose (bool, optional): Whether to print progress information, by default False
**kwargs: Additional keyword arguments passed to model initialisation
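
The interplay of bandwidth, fixed, kernel, and include_focal can be illustrated with a minimal pure-Python sketch. It is not gwlearn's implementation. The helper names bisquare and local_weights are hypothetical, and two conventions are assumed here rather than taken from this page: the bisquare form (1 - (d/b)^2)^2, and, for an adaptive bandwidth of k, using the distance to the k-th nearest neighbor as the kernel bandwidth.

```python
def bisquare(distance, bandwidth):
    """Bisquare kernel (assumed form): (1 - (d/b)^2)^2 inside b, 0 outside."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def local_weights(distances, focal_index, bandwidth, fixed=False,
                  include_focal=False):
    """Weights of every observation relative to one focal observation.

    distances[i] is the distance from the focal observation to observation i
    (so distances[focal_index] is 0).
    """
    if fixed:
        # Fixed bandwidth: the value is a distance, used as-is.
        width = float(bandwidth)
    else:
        # Adaptive bandwidth (assumed convention): the value is a neighbor
        # count k, and the kernel width is the distance to the k-th nearest
        # neighbor, not counting the focal observation itself.
        others = [d for i, d in enumerate(distances) if i != focal_index]
        width = sorted(others)[int(bandwidth) - 1]

    weights = [bisquare(d, width) for d in distances]
    if not include_focal:
        # Leave the focal observation out of its own local model.
        weights[focal_index] = 0.0
    return weights


distances = [0.0, 1.0, 2.0, 4.0]  # observation 0 is the focal one
print(local_weights(distances, 0, 2.0, fixed=True))  # distance bandwidth
print(local_weights(distances, 0, 3))                # 3 nearest neighbors
```

With include_focal=False the focal observation receives zero weight, so the focal prediction for that location comes from a model that never saw it.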

Attributes:

pred_ (pd.Series): Focal predictions for each location.
resid_ (pd.Series): Residuals for each location (y - pred_).
RSS_ (pd.Series): Residual sum of squares for each location.
TSS_ (pd.Series): Total sum of squares for each location.
y_bar_ (pd.Series): Weighted mean of y for each location.
local_r2_ (pd.Series): Local R2 for each location.
focal_r2_ (float): Global R2 for focal predictions.
score_ (float): Alias for focal_r2_ (global R2 for focal predictions).
focal_adj_r2_ (float): Adjusted R2 for focal predictions.
hat_values_ (pd.Series): Hat values for each location (diagonal elements of hat matrix).
effective_df_ (float): Effective degrees of freedom (sum of hat values).
log_likelihood_ (float): Global log likelihood of the model.
aic_ (float): Akaike information criterion of the model.
aicc_ (float): Corrected Akaike information criterion to account for model complexity (smaller bandwidths).
bic_ (float): Bayesian information criterion.
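
The per-location attributes y_bar_, TSS_, RSS_, and local_r2_ can be read through the way local R2 is conventionally defined in geographically weighted regression. The sketch below follows that textbook definition for a single focal location whose kernel weights are already known. The function name local_summary is hypothetical, and whether gwlearn computes these quantities in exactly this way is an assumption, not something this page states.

```python
def local_summary(y, pred, weights):
    """Weighted mean, TSS, RSS and local R2 around one focal location.

    y       : observed values of all observations
    pred    : predicted values for the same observations
    weights : kernel weights of those observations relative to the focal one
    """
    total_weight = sum(weights)
    # Weighted mean of y in the neighborhood.
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_weight
    # Weighted total sum of squares around the weighted mean.
    tss = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, y))
    # Weighted residual sum of squares.
    rss = sum(w * (yi - pi) ** 2 for w, yi, pi in zip(weights, y, pred))
    # Local R2: share of the weighted variation that the predictions explain.
    return y_bar, tss, rss, 1.0 - rss / tss


y = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]
weights = [1.0, 1.0, 1.0, 1.0]
print(local_summary(y, pred, weights))
```

A neighborhood with invariant y has a total sum of squares of zero, which leaves the ratio undefined; this is one reason a check such as strict can be useful.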

__init__(model, *, bandwidth=None, fixed=False, kernel='bisquare', include_focal=False, geometry=None, graph=None, n_jobs=-1, fit_global_model=True, measure_performance=True, strict=False, keep_models=False, temp_folder=None, batch_size=None, verbose=False, **kwargs)

Methods

`__init__`(model, *[, bandwidth, fixed, ...])
`fit`(X, y)	Fit the geographically weighted model
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`score`(X, y[, sample_weight])	Return the coefficient of determination of the prediction.
`set_fit_request`(*[, geometry])	Request metadata passed to the `fit` method.
`set_params`(**params)	Set the parameters of this estimator.
`set_score_request`(*[, sample_weight])	Request metadata passed to the `score` method.

fit(X, y)

Fit the geographically weighted model

Parameters:

X (pd.DataFrame): Independent variables
y (pd.Series): Dependent variable
geometry (gpd.GeoSeries): Geographic location

set_score_request(*, sample_weight='$UNCHANGED$')

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide for how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for the sample_weight parameter in score.

Returns:

self (object): The updated object.
