gwlearn.base.BaseRegressor
class gwlearn.base.BaseRegressor(model, *, bandwidth=None, fixed=False, kernel='bisquare', include_focal=False, geometry=None, graph=None, n_jobs=-1, fit_global_model=True, strict=False, keep_models=False, temp_folder=None, batch_size=None, verbose=False, **kwargs)
Generic geographically weighted regression meta-estimator.
This class wraps a scikit-learn-compatible regressor class and fits one local model per focal observation using spatially varying sample weights.
The fitted object exposes focal predictions (pred_, in-sample if include_focal=True) and local goodness-of-fit summaries.
Prediction for new (out-of-sample) observations is not currently implemented for regressors.
Notes
Only point geometries are supported.
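The idea of fitting one local model per focal observation with spatially varying sample weights can be sketched in plain Python. This is a minimal illustration, not gwlearn's implementation: the helper names (bisquare, weighted_line_fit, local_predictions) are hypothetical, the model is a single-predictor weighted least-squares line, and the bandwidth is assumed to be a fixed distance threshold.

```python
import math


def bisquare(distance, bandwidth):
    """Bisquare kernel: weight falls from 1 at distance 0 to 0 at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b * x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def local_predictions(coords, x, y, bandwidth, include_focal=False):
    """Fit one weighted model per focal point and predict that focal point."""
    preds = []
    for i, (fx, fy) in enumerate(coords):
        weights = []
        for j, (ox, oy) in enumerate(coords):
            if i == j and not include_focal:
                # Leave the focal observation out of its own local model.
                weights.append(0.0)
            else:
                weights.append(bisquare(math.hypot(fx - ox, fy - oy), bandwidth))
        intercept, slope = weighted_line_fit(x, y, weights)
        preds.append(intercept + slope * x[i])
    return preds


# Hypothetical toy data: five points on a line, with y = 2 * x + 1 exactly.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
x = [1.0, 2.0, 4.0, 3.0, 5.0]
y = [3.0, 5.0, 9.0, 7.0, 11.0]
preds = local_predictions(coords, x, y, bandwidth=2.5)
```

With include_focal=False, each prediction is made by a model that never saw the focal observation, which is what allows the focal predictions to be treated as out-of-sample.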
- Parameters:
- model : RegressorMixin
Class implementing the scikit-learn regressor API (e.g. sklearn.linear_model.LinearRegression). The class (not an instance) is instantiated internally for each local model.
- bandwidth : float | int | None
Bandwidth for defining neighborhoods.
If fixed=True, this is a distance threshold.
If fixed=False, this is the number of nearest neighbors used to form the local neighborhood.
If graph is provided, bandwidth is ignored.
- fixed : bool, optional
True for a distance-based bandwidth and False for an adaptive (nearest neighbor) bandwidth, by default False.
- kernel : str | Callable, optional
Type of kernel function used to weight observations, by default “bisquare”.
- include_focal : bool, optional
Include the focal observation when training its local model. Excluding it allows geographically weighted metrics to be assessed on unseen data without a train/test split, hence providing a value for every sample. This is needed for further spatial analysis of model performance (and generalises to models that do not support OOB scoring). However, it leaves out the most representative sample. By default False.
- geometry : gpd.GeoSeries, optional
Geographic location of the observations in the sample. Used to determine the spatial interaction weights as specified by the bandwidth, fixed, kernel, and include_focal keywords. Either geometry or graph needs to be specified. To allow prediction, geometry must be specified.
- graph : Graph, optional
Custom libpysal.graph.Graph object encoding the spatial interaction between observations in the sample. If given, it is used directly and the bandwidth, fixed, kernel, and include_focal keywords are ignored. Either geometry or graph needs to be specified. To allow prediction, geometry must be specified. Both can be specified, in which case graph encodes the spatial interaction between observations in geometry.
- n_jobs : int, optional
The number of jobs to run in parallel. -1 means using all processors, by default -1.
- fit_global_model : bool, optional
Determines whether a global baseline model is fitted alongside the geographically weighted models, by default True.
- strict : bool | None, optional
Do not fit any models if at least one neighborhood has invariant y, by default False. None is treated as False but issues a warning if there are invariant models.
- keep_models : bool | str | Path, optional
Keep all local models (required for prediction), by default False. Note that for some models, such as random forests, the objects can be large. If a string or Path is provided, the local models are not held in memory but serialized to disk, from which they are loaded for prediction.
- temp_folder : str | None, optional
Folder to be used by the pool for memmapping large arrays for sharing memory with worker processes, e.g., /tmp. Passed to joblib.Parallel, by default None.
- batch_size : int | None, optional
Number of models to process in each batch. Specify batch_size if your models do not fit into memory. By default None.
- verbose : bool, optional
Whether to print progress information, by default False.
- **kwargs
Additional keyword arguments passed to model initialisation.
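The difference between a fixed and an adaptive bandwidth, as described for the bandwidth and fixed parameters, can be illustrated with a small stand-alone helper. This is a sketch under stated assumptions, not gwlearn's code: the function name neighborhood is hypothetical, and treating observations exactly at the distance threshold as included is an assumption.

```python
def neighborhood(distances, bandwidth, fixed=False):
    """Return sorted indices of observations in one focal point's neighborhood.

    distances: distances from the focal point to every other observation.
    """
    if fixed:
        # Fixed bandwidth: keep everything within the distance threshold
        # (inclusive edge is an assumption of this sketch).
        return [i for i, d in enumerate(distances) if d <= bandwidth]
    # Adaptive bandwidth: keep the k nearest neighbors, whatever their distance.
    k = int(bandwidth)
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return sorted(order[:k])


# Hypothetical distances from one focal point to five other observations.
distances = [0.5, 3.0, 1.2, 2.0, 0.9]
within_threshold = neighborhood(distances, bandwidth=1.5, fixed=True)
two_nearest = neighborhood(distances, bandwidth=2, fixed=False)
```

A fixed bandwidth yields neighborhoods of varying size in unevenly sampled areas, while an adaptive bandwidth keeps the number of neighbors constant and lets the spatial extent vary.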
- aicc_
Corrected Akaike information criterion, which accounts for model complexity (smaller bandwidths produce more complex models).
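As background, one form of the corrected AIC commonly used in the geographically weighted regression literature for linear local models is shown below. It is given for orientation only; that gwlearn computes aicc_ with exactly this expression is an assumption. Here n is the number of observations, \hat{\sigma} is the estimated standard deviation of the error term, and \mathbf{S} is the hat matrix mapping the observed y to the fitted values.

```latex
\mathrm{AICc} = 2n \ln(\hat{\sigma}) + n \ln(2\pi)
  + n \, \frac{n + \operatorname{tr}(\mathbf{S})}{n - 2 - \operatorname{tr}(\mathbf{S})}
```

The trace of \mathbf{S} acts as the effective number of parameters. Smaller bandwidths increase it, which raises the penalty term.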
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from sklearn.linear_model import LinearRegression
>>> from gwlearn.base import BaseRegressor

>>> gdf = gpd.read_file(get_path('geoda.guerry'))
>>> X = gdf[['Crm_prp', 'Litercy', 'Donatns', 'Lottery']]
>>> y = gdf["Suicids"]

>>> gwr = BaseRegressor(
...     LinearRegression,
...     bandwidth=30,
...     fixed=False,
...     include_focal=True,
...     geometry=gdf.representative_point(),
... ).fit(X, y)
>>> gwr.local_r2_.head()
0    0.614715
1    0.488495
2    0.599862
3    0.662435
4    0.662276
dtype: float64
Methods
__init__(model, *[, bandwidth, fixed, ...])
fit(X, y)    Fit geographically weighted local regression models.
get_metadata_routing()    Get metadata routing of this object.
get_params([deep])    Get parameters for this estimator.
score(X, y[, sample_weight])    Return coefficient of determination on test data.
set_params(**params)    Set the parameters of this estimator.
set_score_request(*[, sample_weight])    Configure whether metadata should be requested to be passed to the score method.
Attributes
- fit(X, y)
Fit geographically weighted local regression models.
Fits one local model per focal observation and stores focal (in-sample if include_focal=True) predictions in pred_.
Notes
The neighborhood definition comes from either self.graph or from self.geometry + (bandwidth, fixed, kernel, include_focal).
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest