Prediction¶
Both classification and regression models implement tools for prediction. Classifiers provide predict_proba (probabilities) and predict (labels), while regressors provide predict (continuous values) as any scikit-learn estimator would. However, they are retrieved differently.
The prediction can be either done using a single nearest local model or using the ensemble of local models available within the bandwidth. At the same time, the prediction from local model(s) can be fused with the global model to combine the best from both worlds, following Georganos et al (2021).
Classification prediction¶
See that in action with a classifier:
import geopandas as gpd
from geodatasets import get_path
from sklearn import metrics
from sklearn.model_selection import train_test_split
from gwlearn.ensemble import GWRandomForestClassifier
from gwlearn.linear_model import GWLogisticRegression
Get sample data
gdf = gpd.read_file(get_path("geoda.ncovr")).to_crs(5070)
gdf["point"] = gdf.representative_point()
gdf = gdf.set_geometry("point").set_index("FIPS")
y = gdf["FH90"] > gdf["FH90"].median()
X = gdf.iloc[:, 9:15]
Leave out some locations for prediction later.
X_train, X_test, y_train, y_test, geom_train, geom_test = train_test_split(
X, y, gdf.geometry, test_size=0.1
)
Fit the model using the training subset. If you plan to do the prediction, you need to store the local models, which is False by default. When set to True, all the models are kept in memory, so be careful with large datasets. If given a path, all the models will be stored on disk instead, freeing the memory load.
gwrf = GWLogisticRegression(
bandwidth=250,
fixed=False,
keep_models=True,
)
gwrf.fit(
X_train,
y_train,
geometry=geom_train,
)
GWLogisticRegression(bandwidth=250, keep_models=True)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
Parameters
| bandwidth | 250 | |
| fixed | False | |
| kernel | 'bisquare' | |
| include_focal | True | |
| graph | None | |
| n_jobs | -1 | |
| fit_global_model | True | |
| strict | False | |
| keep_models | True | |
| temp_folder | None | |
| batch_size | None | |
| min_proportion | 0.2 | |
| undersample | False | |
| leave_out | None | |
| random_state | None | |
| verbose | False |
Using the nearest local model¶
Now, you can use the test subset to get the prediction. The default behaviour finds the nearest local model and uses it to get the prediction.
proba = gwrf.predict_proba(X_test, geometry=geom_test)
proba
| False | True | |
|---|---|---|
| FIPS | ||
| 40029 | 0.856428 | 0.143572 |
| 01095 | 0.200643 | 0.799357 |
| 13005 | NaN | NaN |
| 55135 | 0.941676 | 0.058324 |
| 26101 | 0.797047 | 0.202953 |
| ... | ... | ... |
| 18061 | 0.874506 | 0.125494 |
| 51127 | 0.174828 | 0.825172 |
| 06073 | 0.000000 | 1.000000 |
| 37163 | 0.016462 | 0.983538 |
| 21105 | 0.381693 | 0.618307 |
309 rows × 2 columns
You can then check the accuracy of this prediction. Note that similarly to fitting, there might be locations that return NA, if the local model was not fitted due to invariance.
gpd.GeoDataFrame(proba, geometry=geom_test).plot(
True, missing_kwds=dict(color="red")
).set_axis_off()
The missing values have been linked to models that were not fitted. You can check all of those visually by plotting the focal prediction.
geom_train.to_frame().plot(gwrf.pred_, missing_kwds=dict(color="red")).set_axis_off()
Filter it out and measure the performance on the left-out sample.
na_mask = proba.isna().any(axis=1)
pred = proba[~na_mask].idxmax(axis=1).astype(bool)
metrics.accuracy_score(y_test[~na_mask], pred)
0.7434782608695653
Using the ensemble of local models¶
In case of the ensemble-based prediction, the process works as follows:
For a new location on which you want a prediction, identify local models within the set bandwidth. If
bandwidth=None, uses the bandwidth set at the fit time.Apply the kernel function used to train the model to derive weights of each of the local models.
Make prediction using each of the local models in the bandwidth.
Make weighted average of predictions based on the kernel weights.
For classifiers, normalize the result to ensure sum of probabilities is 1.
Note that given the prediction is pulled from an ensemble of local models, it is not particularly performant. However, it shall be spatially robust.
proba_ensemble = gwrf.predict_proba(X_test, geometry=geom_test, bandwidth=250)
proba_ensemble
| False | True | |
|---|---|---|
| FIPS | ||
| 40029 | 0.856335 | 0.143665 |
| 01095 | 0.166166 | 0.833834 |
| 13005 | 0.194031 | 0.805969 |
| 55135 | 0.915510 | 0.084490 |
| 26101 | 0.841802 | 0.158198 |
| ... | ... | ... |
| 18061 | 0.833033 | 0.166967 |
| 51127 | 0.199651 | 0.800349 |
| 06073 | 0.000000 | 1.000000 |
| 37163 | 0.021212 | 0.978788 |
| 21105 | 0.435497 | 0.564503 |
309 rows × 2 columns
Given it relies on an ensemble, not a single nearest model, it returns NaN only when all the local models within the bandwidth were not fitted.
gpd.GeoDataFrame(proba_ensemble, geometry=geom_test).plot(
True, missing_kwds=dict(color="red")
).set_axis_off()
This is one of the benefits of the ensemble prediction. Another is theoretical spatial robustness. Nevertheless, the actual metrics are typically very similar between the models.
na_mask = proba_ensemble.isna().any(axis=1)
pred = proba_ensemble[~na_mask].idxmax(axis=1).astype(bool)
metrics.accuracy_score(y_test[~na_mask], pred)
0.7908496732026143
Fusing with the global model¶
Predictions from local models can be fused with the predictions from the global model as a weighted average, by assigning non-zero weight to global model.
proba_fused = gwrf.predict_proba(X_test, geometry=geom_test, global_model_weight=0.25)
proba_fused
| False | True | |
|---|---|---|
| FIPS | ||
| 40029 | 8.312543e-01 | 0.168746 |
| 01095 | 1.715358e-01 | 0.828464 |
| 13005 | 3.604888e-01 | 0.639511 |
| 55135 | 9.231339e-01 | 0.076866 |
| 26101 | 8.015589e-01 | 0.198441 |
| ... | ... | ... |
| 18061 | 8.699378e-01 | 0.130062 |
| 51127 | 2.135534e-01 | 0.786447 |
| 06073 | 1.085322e-10 | 1.000000 |
| 37163 | 1.423262e-02 | 0.985767 |
| 21105 | 3.967436e-01 | 0.603256 |
309 rows × 2 columns
When prediction from local models yields NaN, it is ignored and the global prediction is used instead.
pred_fused = gwrf.predict(X_test, geometry=geom_test, global_model_weight=0.25)
metrics.accuracy_score(y_test, pred_fused)
0.7896440129449838
You can see that the accuracy is slightly better but keep in mind that this is only a toy example and specific strategy for fusing local and global predictions needs to be tested on your use case.
Prediction with regressors¶
Regression models also implement predict method following the same logic as classifiers. Let’s see that in action with a GWLinearRegression.
from gwlearn.linear_model import GWLinearRegression
Prepare the data with a continuous target variable.
y_reg = gdf["FH90"] # Use the continuous variable directly
X_train_reg, X_test_reg, y_train_reg, y_test_reg, geom_train_reg, geom_test_reg = (
train_test_split(X, y_reg, gdf.geometry, test_size=0.1, random_state=42)
)
Fit the regression model with keep_models=True to enable prediction.
gwrf_reg = GWLinearRegression(
bandwidth=250,
fixed=False,
keep_models=True,
)
gwrf_reg.fit(
X_train_reg,
y_train_reg,
geometry=geom_train_reg,
)
GWLinearRegression(bandwidth=250, keep_models=True)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
Parameters
| bandwidth | 250 | |
| fixed | False | |
| kernel | 'bisquare' | |
| include_focal | True | |
| graph | None | |
| n_jobs | -1 | |
| fit_global_model | True | |
| strict | False | |
| keep_models | True | |
| temp_folder | None | |
| batch_size | None | |
| verbose | False |
Make predictions on the test set using the predict method.
pred_reg = gwrf_reg.predict(X_test_reg, geometry=geom_test_reg)
pred_reg
FIPS
51101 11.457256
13231 15.005252
47019 13.176498
16053 8.860581
21233 12.020863
...
17099 11.625854
38081 6.883169
13317 17.669495
40005 13.573489
39149 10.530572
Length: 309, dtype: float64
Visualize the predicted values spatially.
gpd.GeoDataFrame({"prediction": pred_reg}, geometry=geom_test_reg).plot(
"prediction", legend=True
).set_axis_off()
Evaluate the prediction performance using common regression metrics.
print(f"R2 score: {metrics.r2_score(y_test_reg, pred_reg):.3f}")
print(f"RMSE: {metrics.root_mean_squared_error(y_test_reg, pred_reg):.3f}")
R2 score: 0.652
RMSE: 3.316
The ensemble prediction is available analogously to classifiers.
pred_reg_ensemble = gwrf_reg.predict(X_test_reg, geometry=geom_test_reg, bandwidth=None)
pred_reg_ensemble
FIPS
51101 11.489416
13231 14.672679
47019 13.041364
16053 9.080974
21233 12.289937
...
17099 11.493070
38081 7.248033
13317 17.023484
40005 13.408702
39149 10.781169
Length: 309, dtype: float64
print(f"R2 score: {metrics.r2_score(y_test_reg, pred_reg_ensemble):.3f}")
print(f"RMSE: {metrics.root_mean_squared_error(y_test_reg, pred_reg_ensemble):.3f}")
R2 score: 0.650
RMSE: 3.327
Similarly, you can fuse predictions from local and global models.
pred_reg_fused = gwrf_reg.predict(
X_test_reg, geometry=geom_test_reg, global_model_weight=0.25
)
pred_reg_fused
FIPS
51101 11.232861
13231 14.827121
47019 13.076454
16053 9.264119
21233 11.984785
...
17099 11.410702
38081 7.235080
13317 17.418047
40005 13.868180
39149 10.347441
Length: 309, dtype: float64
print(f"R2 score: {metrics.r2_score(y_test_reg, pred_reg_fused):.3f}")
print(f"RMSE: {metrics.root_mean_squared_error(y_test_reg, pred_reg_fused):.3f}")
R2 score: 0.647
RMSE: 3.338
As you can see, contrary to the classification task, the fusion of global prediction to the local one did not improve the model performance in this case. However, this is very much case dependent.