This page was generated from notebooks/azp.ipynb.
Automatic Zoning Procedure (AZP) algorithm
Authors: Xin Feng, James Gaboardi
AZP aggregates data from a large number of zones into a pre-designated smaller number of regions. It can work with different types of objective functions, which are very sensitive to how the zones are aggregated. AZP was originally formulated in Openshaw (1977) and then extended in Openshaw and Rao (1995).
[1]:
%config InlineBackend.figure_format = "retina"
%load_ext watermark
%watermark
Last updated: 2023-12-10T13:26:50.972099-05:00
Python implementation: CPython
Python version : 3.12.0
IPython version : 8.18.0
Compiler : Clang 15.0.7
OS : Darwin
Release : 23.1.0
Machine : x86_64
Processor : i386
CPU cores : 8
Architecture: 64bit
[2]:
import warnings
import geopandas
import libpysal
import spopt
from spopt.region import AZP
%matplotlib inline
%watermark -w
%watermark -iv
Watermark: 2.4.3
spopt : 0.5.1.dev53+g5cadae7
libpysal : 4.9.2
geopandas: 0.14.1
Mexican State Regional Income Clustering
To illustrate azp, we utilize data on regional incomes for Mexican states over the period 1940-2000, originally used in Rey and Sastré-Gutiérrez (2010).
We can first explore the data by plotting the per capita gross regional domestic product (in constant 2000 US dollars) for each year in the sample, using a quintile classification:
[3]:
pth = libpysal.examples.get_path("mexicojoin.shp")
mexico = geopandas.read_file(pth)
[4]:
# One quantile choropleth map per decade, 1940 through 2000.
for year in range(1940, 2010, 10):
    base = mexico.plot(
        figsize=(8, 5),
        column=f"PCGDP{year}",
        scheme="Quantiles",
        cmap="GnBu",
        edgecolor="b",
        legend=True,
    )
    base.axis("off")
    base.set_title(str(year))
[Figures: quantile choropleth maps of per capita GDP by state, one for each decade from 1940 to 2000]
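The quintile classification used in the maps above can be sketched with the standard library alone. This is a minimal illustration on hypothetical values, not the Mexican data; the helper `quintile_class` is a made-up name, and the interpolation rule used here may differ from the one the plotting scheme applies.

```python
import statistics

# Hypothetical per capita GDP values for ten zones (not the Mexican data).
values = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Four cut points split the values into five classes (quintiles).
# method="inclusive" treats the values as the full population.
breaks = statistics.quantiles(values, n=5, method="inclusive")


def quintile_class(value, breaks):
    """Return the class index (0-4) of value given four ascending breaks."""
    return sum(value > b for b in breaks)


# Each zone gets the index of the quintile it falls into.
classes = [quintile_class(v, breaks) for v in values]
print(breaks)
print(classes)
```

With ten evenly spaced values, each of the five classes receives two zones, which is the equal-count behavior a quintile map relies on.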
Regionalization
First, we specify a number of parameters that will serve as input to the azp model.
The variables in the dataframe that will be used to measure regional dissimilarity:
[5]:
attrs_name = [f"PCGDP{year}" for year in range(1950, 2010, 10)]
attrs_name
[5]:
['PCGDP1950', 'PCGDP1960', 'PCGDP1970', 'PCGDP1980', 'PCGDP1990', 'PCGDP2000']
A spatial weights object expresses the spatial connectivity of the zones:
[6]:
# Build Queen contiguity weights; any warnings raised here are captured, not shown.
with warnings.catch_warnings(record=True):
    w = libpysal.weights.Queen.from_dataframe(mexico)
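To make the idea of Queen contiguity concrete, here is a small sketch on a regular grid rather than on the state polygons. It assumes the usual definition (two zones are neighbors when they share an edge or a corner); `queen_neighbors` is a hypothetical helper written for illustration and is not part of libpysal.

```python
def queen_neighbors(n_rows, n_cols):
    """Map each cell id of a regular grid to the ids of its queen neighbors.

    Cells are numbered row by row. Two cells are queen neighbors when they
    share an edge or a corner.
    """
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            neighbors[cell] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)
            ]
    return neighbors


grid = queen_neighbors(3, 3)
print(grid[4])  # the center cell touches all eight other cells
print(grid[0])  # a corner cell touches three cells
```

A weights object plays the same role for irregular polygons: it records, for every zone, which other zones it may be grouped with while keeping regions contiguous.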
The number of regions that we would like to aggregate these zones into:
[7]:
n_clusters = 5
There are four optional parameters. This example uses only the default settings, but you can set them as needed:
allow_move_strategy (AllowMoveStrategy or None, default: None): an AllowMoveStrategy instance can be passed to change how moves are allowed.
random_state (None, int, str, bytes, or bytearray, default: None): random seed.
initial_labels (numpy.ndarray or None, default: None): one-dimensional array of labels at the beginning of the algorithm. If None, a random initial clustering will be generated.
objective_func (spopt.region.objective_function.ObjectiveFunction, default: ObjectiveFunctionPairwise()): the objective function to use.
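As a rough intuition for what a pairwise objective function measures, the sketch below sums the dissimilarity between every pair of zones that share a region label. It is an illustration only, not spopt's implementation: the function names are hypothetical, and the Manhattan metric is an assumption made for the example.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan distance between two attribute vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pairwise_objective(labels, attrs, metric=manhattan):
    """Sum, over all regions, the distances between every pair of members."""
    regions = {}
    for label, row in zip(labels, attrs):
        regions.setdefault(label, []).append(row)
    return sum(
        metric(a, b)
        for members in regions.values()
        for a, b in combinations(members, 2)
    )


# Four hypothetical zones with two attributes each.
attrs = [(1.0, 2.0), (2.0, 2.0), (10.0, 9.0), (11.0, 9.0)]

good = pairwise_objective([0, 0, 1, 1], attrs)  # similar zones grouped
bad = pairwise_objective([0, 1, 0, 1], attrs)   # dissimilar zones grouped
print(good, bad)
```

Grouping similar zones gives a much lower value than grouping dissimilar ones, so a search that lowers this kind of objective favors internally homogeneous regions.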
The model can then be solved:
[8]:
model = AZP(mexico, w, attrs_name, n_clusters)
model.solve()
[9]:
mexico["azp_new"] = model.labels_
[10]:
mexico["number"] = 1
mexico[["azp_new", "number"]].groupby(by="azp_new").count()
[10]:
azp_new | number |
---|---|
0.0 | 8 |
1.0 | 10 |
2.0 | 5 |
3.0 | 4 |
4.0 | 5 |
[11]:
mexico.plot(figsize=(8, 5), column="azp_new", categorical=True, ec="w").axis("off");
[Figure: map of the Mexican states colored by their assigned AZP region]
The model solution results in five regions, two of which have five states, one with four, one with eight, and one with ten states.
Year-by-Year Regionalization (n_clusters = 5 regions)
[12]:
# Solve a separate five-region AZP model for each year's attribute and map it.
for year in attrs_name:
    model = AZP(mexico, w, year, 5)
    model.solve()
    lab = year + "labels_"
    mexico[lab] = model.labels_
    base = mexico.plot(figsize=(8, 5), column=lab, categorical=True, edgecolor="w")
    base.axis("off")
    base.set_title(year)
[Figures: maps of the five AZP regions solved separately for each attribute, PCGDP1950 through PCGDP2000]