{ "cells": [ { "cell_type": "markdown", "id": "27d7d8f2-a943-481c-be10-365391d39458", "metadata": {}, "source": [ "# W to Graph Migration Guide\n", "\n", "Author: [Serge Rey](http://github.com/sjsrey)" ] }, { "cell_type": "markdown", "id": "57537223-f1c3-4771-b8f0-f7b32962e995", "metadata": {}, "source": [ "## Introduction\n", "\n", "Beginning in the fall of 2023, the PySAL project released a new `graph` module that offers a modern implementation of spatial weights. This module's [Graph](../../generated/libpysal.graph.Graph.html) class is set to eventually replace the [W](../../generated/libpysal.weights.W.html) class, which has been the cornerstone for spatial weights in PySAL for the past 15 years. The `W` class has significantly contributed to the library's success, but as the scientific landscape evolves, new opportunities necessitate updated interfaces and designs for spatial weights.\n", "\n", "While the application programming interfaces (APIs) of the `W` and `Graph` classes are similar, there are important [differences](../../migration.rst)\n", "to consider when transitioning from weights-based resources to graph-based implementations.\n", "\n", "This guide is designed to provide users with an overview of migrating from the `W` class to the `Graph` class.\n", "\n", "Beyond the specifics that we outline below, it is important to note that two utility methods are available to convert between the two classes:\n", "\n", "- `Graph.to_W()` will generate a `W` instance from a `Graph` object\n", "- `Graph.from_W()` will generate a `Graph` instance from a `W` object\n", "\n" ] }, { "cell_type": "markdown", "id": "2d4e2096-0b7c-4acc-9875-32321256693b", "metadata": {}, "source": [ "## Imports\n", "To access the `W` and `Graph` classes, use the following imports:" ] }, { "cell_type": "code", "execution_count": 1, "id": "f46293f2-1362-4e44-840f-eef77a67285d", "metadata": {}, "outputs": [], "source": [ "from libpysal import graph, weights" ] }, { "cell_type": "markdown", "id": 
"c5d73325-f533-466d-8108-ef034705fa58", "metadata": {}, "source": [ "## Example Data Set\n", "\n", "To illustrate the migration from `W` to `Graph` we will utilize a built-in data set from `libpysal`. In addition to the relevant `libpysal` modules we will also import the other packages needed:" ] }, { "cell_type": "code", "execution_count": 2, "id": "a2254e95-de33-4856-bc71-2817e52af346", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last updated: 2024-07-18\n", "\n", "Python implementation: CPython\n", "Python version : 3.12.2\n", "IPython version : 8.21.0\n", "\n", "libpysal: 4.2.3.dev1352+gcfa4e0ce\n", "\n" ] } ], "source": [ "%matplotlib inline\n", "\n", "import geopandas as gpd\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "from libpysal import examples\n", "\n", "%load_ext watermark\n", "%watermark -v -d -u -p libpysal" ] }, { "cell_type": "code", "execution_count": 3, "id": "8c3bbda5-833a-4e73-82cc-eb6cfc906856", "metadata": {}, "outputs": [], "source": [ "dbs = examples.available()" ] }, { "cell_type": "code", "execution_count": 4, "id": "acd84693-21e0-49fc-b878-840ba66840e9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sids2\n", "=====\n", "\n", "North Carolina county SIDS death counts and rates\n", "-------------------------------------------------\n", "\n", "* sids2.dbf: attribute data. (k=18)\n", "* sids2.html: metadata.\n", "* sids2.shp: Polygon shapefile. (n=100)\n", "* sids2.shx: spatial index.\n", "* sids2.gal: spatial weights in GAL format.\n", "\n", "Source: Cressie, Noel (1993). Statistics for Spatial Data. New York, Wiley, pp. 386-389. 
Rates computed.\n", "Updated URL: https://geodacenter.github.io/data-and-lab/sids2/\n", "\n" ] } ], "source": [ "examples.explain(\"sids2\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "8422e23d-a7f4-48b9-a31c-725a06dde2f6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 100 entries, 0 to 99\n", "Data columns (total 19 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 AREA 100 non-null float64 \n", " 1 PERIMETER 100 non-null float64 \n", " 2 CNTY_ 100 non-null int64 \n", " 3 CNTY_ID 100 non-null int64 \n", " 4 NAME 100 non-null object \n", " 5 FIPS 100 non-null object \n", " 6 FIPSNO 100 non-null int64 \n", " 7 CRESS_ID 100 non-null int64 \n", " 8 BIR74 100 non-null float64 \n", " 9 SID74 100 non-null float64 \n", " 10 NWBIR74 100 non-null float64 \n", " 11 BIR79 100 non-null float64 \n", " 12 SID79 100 non-null float64 \n", " 13 NWBIR79 100 non-null float64 \n", " 14 SIDR74 100 non-null float64 \n", " 15 SIDR79 100 non-null float64 \n", " 16 NWR74 100 non-null float64 \n", " 17 NWR79 100 non-null float64 \n", " 18 geometry 100 non-null geometry\n", "dtypes: float64(12), geometry(1), int64(4), object(2)\n", "memory usage: 15.0+ KB\n" ] } ], "source": [ "# Read the file in\n", "gdf = gpd.read_file(examples.get_path(\"sids2.shp\"))\n", "\n", "gdf.info()" ] }, { "cell_type": "code", "execution_count": 6, "id": "c116377a-e1c6-4a0c-90ed-928db6f08220", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 POLYGON ((-81.47276 36.23436, -81.54084 36.272...\n", "1 POLYGON ((-81.23989 36.36536, -81.24069 36.379...\n", "2 POLYGON ((-80.45634 36.24256, -80.47639 36.254...\n", "3 MULTIPOLYGON (((-76.00897 36.31960, -76.01735 ...\n", "4 POLYGON ((-77.21767 36.24098, -77.23461 36.214...\n", " ... 
\n", "95 POLYGON ((-78.26150 34.39479, -78.32898 34.364...\n", "96 POLYGON ((-78.02592 34.32877, -78.13024 34.364...\n", "97 POLYGON ((-78.65572 33.94867, -79.07450 34.304...\n", "98 POLYGON ((-77.96073 34.18924, -77.96587 34.242...\n", "99 POLYGON ((-78.65572 33.94867, -78.63472 33.977...\n", "Name: geometry, Length: 100, dtype: geometry" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf.geometry" ] }, { "cell_type": "code", "execution_count": 7, "id": "248f154c-d874-496b-88ad-b8ad2f367bbc", "metadata": {}, "outputs": [], "source": [ "gdf = gdf.set_crs(\"epsg:4326\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "592bad5b-95fc-4f62-9dfe-8139188c6eef", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf.explore()" ] }, { "cell_type": "markdown", "id": "ed1f23f4-968f-4ab8-a876-2d1b07fdd794", "metadata": {}, "source": [ "## Building Spatial Weights\n", "With a GeoDataFrame in hand, we can build spatial weights using the `W` class as:" ] }, { "cell_type": "code", "execution_count": 9, "id": "4e4ea913-d462-4943-ae66-41a3878c60b0", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_616347/4235704840.py:1: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warning\n", " w_queen = weights.Queen.from_dataframe(gdf)\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen = weights.Queen.from_dataframe(gdf)\n", "w_queen" ] }, { "cell_type": "markdown", "id": "d6f48b63-6a66-4bc5-97b6-00a276dcdb1f", "metadata": {}, "source": [ "For the `Graph`, weights are constructed from the dataframe as:" ] }, { "cell_type": "code", "execution_count": 10, "id": "3e77b0cb-313c-4960-9cc6-f5deb602748a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen = graph.Graph.build_contiguity(gdf, rook=False)\n", "g_queen" ] }, { "cell_type": "markdown", "id": "c601e671-09b3-41cf-860d-92db22326e1b", "metadata": {}, "source": [ "Two things to be aware of here are:\n", "\n", "- the methods have different names for the two classes\n", "- the `W` relies on different methods to generate `Rook` or `Queen` contiguity weights, while the `Graph` relies on the `rook` keyword argument to do so." 
] }, { "cell_type": "markdown", "id": "4192da75-3d27-4a61-99aa-8b9f11e11bd1", "metadata": {}, "source": [ "## Neighbors\n", "From the output in the previous cells, we see different information reported in the two cases.\n", "\n", "The neighbors of a spatial unit are those units that satisfy the specific contiguity relationship specified by the user. In our case of `Queen` contiguity, any pair of polygons that share at least one vertex are considered neighbors.\n", "\n", "This information is encoded differently in the two classes. For the `W` class, information on the neighbors is stored in the `neighbors` attribute, which is a `dict` using the unit's id as the key, while the `list` of neighbor ids is the value:" ] }, { "cell_type": "code", "execution_count": 11, "id": "302e07a0-6bad-4ba2-9866-1194caa08c8f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(w_queen.neighbors)" ] }, { "cell_type": "code", "execution_count": 12, "id": "4e035e8a-6ac7-46aa-96d8-866cd3f01b4e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[96, 97, 98]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.neighbors[99]" ] }, { "cell_type": "markdown", "id": "5f75edf4-9cab-4e29-9246-fa71365c057a", "metadata": {}, "source": [ "For the `Graph` class the neighbor information is stored in the `adjacency` attribute:" ] }, { "cell_type": "code", "execution_count": 13, "id": "f4e3cae4-854f-4a56-8afc-f0fb2524b67a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "focal neighbor\n", "0 1 1\n", " 17 1\n", " 18 1\n", "1 0 1\n", " 2 1\n", " ..\n", "98 96 1\n", " 99 1\n", "99 96 1\n", " 97 1\n", " 98 1\n", "Name: weight, Length: 490, dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.adjacency" ] }, { "cell_type": "code", "execution_count": 14, "id": 
"3be036b5-97ef-4eb2-8429-93d867d017da", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(g_queen.adjacency)" ] }, { "cell_type": "markdown", "id": "b686532a-fb2e-4b45-a362-c7f7bdb13609", "metadata": {}, "source": [ "This is encoded as a pandas Series, with a multi-index. The first index is for the focal unit, and the second is for the neighboring unit.\n", "So we see here that the observation with the identifier of 99 has three neighbors: 96, 97, 98.\n", "This agrees with what we had for `W` so the question is why the need for the change?\n", "\n", "Part of the answer is in facilitating easier access to this information:" ] }, { "cell_type": "code", "execution_count": 15, "id": "075208ba-584e-403e-bc01-87834241fcc6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 1\n", "97 1\n", "98 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99]" ] }, { "cell_type": "markdown", "id": "23bfd089-9b67-493f-81f0-5900ed4368f6", "metadata": {}, "source": [ "here we can query the graph with an id to get the neighbor information, along with the weights attached to each neighbor in the form of a pandas series:" ] }, { "cell_type": "code", "execution_count": 16, "id": "589290ec-aec6-4f93-b70e-1f3a80d63643", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(g_queen[99])" ] }, { "cell_type": "markdown", "id": "81380f43-5245-4502-a2bf-567d47ec7a6d", "metadata": {}, "source": [ "While we could also query the `W` object with an id, we get back a `dict`." 
] }, { "cell_type": "code", "execution_count": 17, "id": "5e1d8999-5d5d-4136-904a-05df8889919c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{96: 1.0, 97: 1.0, 98: 1.0}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen[99]" ] }, { "cell_type": "markdown", "id": "ec3c3a30-982a-4568-b942-e3f5204d29b1", "metadata": {}, "source": [ "As we will see below, the pandas series will offer substantial gains in efficiency and scope over encoding the weights as `dicts`." ] }, { "cell_type": "markdown", "id": "3591c67e-d29b-4630-a401-0b2892e6ab6d", "metadata": {}, "source": [ "At the same time, if the neighbors are needed in the form of a `dict`, the graph has such an attribute:" ] }, { "cell_type": "code", "execution_count": 18, "id": "afc97b96-a76a-4001-a59e-a1d027c7c653", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{0: (1, 17, 18),\n", " 1: (0, 2, 17),\n", " 2: (1, 9, 17, 22, 24),\n", " 3: (6, 55),\n", " 4: (5, 8, 15, 27),\n", " 5: (4, 7, 27),\n", " 6: (3, 7, 16),\n", " 7: (5, 6, 16, 19, 20),\n", " 8: (4, 14, 15, 23, 30),\n", " 9: (2, 11, 24, 25),\n", " 10: (11, 13, 26, 28),\n", " 11: (9, 10, 24, 25, 26),\n", " 12: (13, 14, 23, 29, 36),\n", " 13: (10, 12, 28, 29),\n", " 14: (8, 12, 23),\n", " 15: (4, 8, 23, 27, 30, 32, 35),\n", " 16: (6, 7, 19),\n", " 17: (0, 1, 2, 18, 22, 33, 38, 40),\n", " 18: (0, 17, 21, 33),\n", " 19: (7, 16, 20),\n", " 20: (7, 19),\n", " 21: (18, 31, 33, 42, 45),\n", " 22: (2, 17, 24, 38, 39),\n", " 23: (8, 12, 14, 15, 30, 36, 53),\n", " 24: (2, 9, 11, 22, 25, 39, 41),\n", " 25: (9, 11, 24, 26, 41, 46),\n", " 26: (10, 11, 25, 28, 46, 47),\n", " 27: (4, 5, 15, 35, 43),\n", " 28: (10, 13, 26, 29, 47),\n", " 29: (12, 13, 28, 36, 47),\n", " 30: (8, 15, 23, 32, 36, 48, 53),\n", " 31: (21, 34, 45),\n", " 32: (15, 30, 35, 48, 50),\n", " 33: (17, 18, 21, 40, 42, 51),\n", " 34: (31, 37, 45, 52),\n", " 35: (15, 27, 32, 43, 50, 56),\n", " 36: (12, 23, 29, 30, 47, 53, 62),\n", " 37: 
(34, 52, 54),\n", " 38: (17, 22, 39, 40, 49, 51, 64, 67, 68),\n", " 39: (22, 24, 38, 41, 49),\n", " 40: (17, 33, 38, 51),\n", " 41: (24, 25, 39, 46, 49, 69, 70),\n", " 42: (21, 33, 45, 51, 60, 63, 64),\n", " 43: (27, 35, 44, 56, 86),\n", " 44: (43, 86),\n", " 45: (21, 31, 34, 42, 52, 60),\n", " 46: (25, 26, 41, 47, 66, 69),\n", " 47: (26, 28, 29, 36, 46, 59, 62, 66),\n", " 48: (30, 32, 50, 53, 58, 61),\n", " 49: (38, 39, 41, 68, 69, 70),\n", " 50: (32, 35, 48, 56, 58, 73, 90),\n", " 51: (33, 38, 40, 42, 63, 64),\n", " 52: (34, 37, 45, 54, 60, 71, 74),\n", " 53: (23, 30, 36, 48, 61, 62, 78),\n", " 54: (37, 52, 57, 65, 71, 74),\n", " 55: (3, 86),\n", " 56: (35, 43, 50, 79, 86, 90),\n", " 57: (54, 65, 72, 77),\n", " 58: (48, 50, 61, 73),\n", " 59: (47, 62, 66),\n", " 60: (42, 45, 52, 63, 71, 76),\n", " 61: (48, 53, 58, 73, 78, 87),\n", " 62: (36, 47, 53, 59, 66, 78, 81),\n", " 63: (42, 51, 60, 64, 75),\n", " 64: (38, 42, 51, 63, 67, 75),\n", " 65: (54, 57, 74, 77),\n", " 66: (46, 47, 59, 62, 69, 81, 85, 88, 91),\n", " 67: (38, 64, 68, 75, 83),\n", " 68: (38, 49, 67, 70, 83),\n", " 69: (41, 46, 49, 66, 70, 84, 88),\n", " 70: (41, 49, 68, 69, 83, 84),\n", " 71: (52, 54, 60, 74, 76),\n", " 72: (57, 77, 80),\n", " 73: (50, 58, 61, 82, 87, 90),\n", " 74: (52, 54, 65, 71),\n", " 75: (63, 64, 67),\n", " 76: (60, 71),\n", " 77: (57, 65, 72, 80, 89),\n", " 78: (53, 61, 62, 81, 87, 95, 96),\n", " 79: (56, 90),\n", " 80: (72, 77, 89),\n", " 81: (62, 66, 78, 85, 93, 95),\n", " 82: (73, 87, 90, 92, 94),\n", " 83: (67, 68, 70, 84),\n", " 84: (69, 70, 83, 88),\n", " 85: (66, 81, 88, 91, 93),\n", " 86: (43, 44, 55, 56),\n", " 87: (61, 73, 78, 82, 92, 96),\n", " 88: (66, 69, 84, 85, 91),\n", " 89: (77, 80),\n", " 90: (50, 56, 73, 79, 82, 94),\n", " 91: (66, 85, 88, 93),\n", " 92: (82, 87, 94, 96),\n", " 93: (81, 85, 91, 95, 97),\n", " 94: (82, 90, 92),\n", " 95: (78, 81, 93, 96, 97),\n", " 96: (78, 87, 92, 95, 97, 98, 99),\n", " 97: (93, 95, 96, 99),\n", " 98: (96, 99),\n", " 99: (96, 
97, 98)}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.neighbors" ] }, { "cell_type": "markdown", "id": "5d9c48c6-9b9e-45da-bcab-ebf1302ff4ce", "metadata": {}, "source": [ "## Weights\n", "\n", "The value of a weight specifies the \"strength\" of the neighbor relationship between two geographical units.\n", "For our contiguity weights, these will be binary valued.\n", "\n", "In the `W` case, these values are stored in the `weights` attribute:" ] }, { "cell_type": "code", "execution_count": 19, "id": "81111346-a12e-4ada-a9b4-de7318450d8b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.0, 1.0, 1.0]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.weights[99]" ] }, { "cell_type": "markdown", "id": "e0bfecb4-2b20-47ff-87d2-c92226132977", "metadata": {}, "source": [ "For the `Graph`, the values of the weights are stored in the `adjacency` attribute:" ] }, { "cell_type": "code", "execution_count": 20, "id": "75126118-bd99-4467-967a-6e91a745acdc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 1\n", "97 1\n", "98 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99]" ] }, { "cell_type": "markdown", "id": "ebee5f1c-88b3-4d23-a9cd-aee2eb42ce1b", "metadata": {}, "source": [ "Again, the underlying types of these attributes need to be kept in mind. 
`weights` is a `dict` for `W`, while for the `Graph` the weights are part of the `adjacency` attribute, and a query by id returns an object of type:\n" ] }, { "cell_type": "code", "execution_count": 21, "id": "1814d24f-3283-45a0-8ff5-f14a16ee22b5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(g_queen[0])" ] }, { "cell_type": "markdown", "id": "d2a6f2fb-7dce-4668-9529-b080207ad373", "metadata": {}, "source": [ "In addition, the helper `weights` attribute on the `Graph` mimics that on the `W`." ] }, { "cell_type": "code", "execution_count": 22, "id": "ce459560-5b7b-433a-bede-5bfaac273bb5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, 1, 1)" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.weights[99]" ] }, { "cell_type": "markdown", "id": "ef3bccc1-a36c-4b47-afd3-c02cd1015b4f", "metadata": {}, "source": [ "Individual weight values will be identical between the two implementations:" ] }, { "cell_type": "code", "execution_count": 23, "id": "b6f6d449-1d55-4d22-b889-76088cb106e6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99][97] == w_queen[99][97]" ] }, { "cell_type": "markdown", "id": "30c912f4-f367-405d-bd9e-f9d3589e4cc0", "metadata": {}, "source": [ "The same holds for the weights across the full neighbor set of a given unit:" ] }, { "cell_type": "code", "execution_count": 24, "id": "6fa4e245-68e2-4dc2-bbdf-b05ed7527d76", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 True\n", "97 True\n", "98 True\n", "Name: weight, dtype: bool" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99] == w_queen.weights[99]" ] }, { "cell_type": "code", "execution_count": 25, "id": "b3240892-ab52-4f99-a106-c9ebee1b3131", "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "[1.0, 1.0, 1.0]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.weights[99]" ] }, { "cell_type": "markdown", "id": "bb9e8401-d6ab-47c2-92cf-5617dbee8b69", "metadata": {}, "source": [ "We are implicitly assuming that the order of the neighbor ids in the `W` matches that of the `Graph`.\n", "Here we see one advantage of the `Graph` in that the information about the neighbor ids comes along for the ride in the adjacency attribute, or any pandas like queries on that attribute.\n", "\n", "To be safe, we would have to double check the ordering of the weights in `W`:" ] }, { "cell_type": "code", "execution_count": 26, "id": "57ef256a-8130-4124-9916-5fbe288a76db", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{96: 1.0, 97: 1.0, 98: 1.0}" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen[99]" ] }, { "cell_type": "markdown", "id": "ea43bd1a-9525-4b5a-8d24-22fc37db423d", "metadata": {}, "source": [ "So in this case our equality check above happened to be comparing the values for the same $i,j$ observations, but this is not guaranteed to always be the case. Handling the proper alignment of the ids and the weights was a key motivation for developing the `Graph`." ] }, { "cell_type": "markdown", "id": "a11ed77e-1f95-406e-8ad1-72ffaf7f37a3", "metadata": {}, "source": [ "The key take-away here is that the `Graph` combines the information about who are the neighbors *and* the values of the associated weights in the *same* data structure, the `adjacency` attribute, while in `W` there are *two different* `dicts` that handle the neighbor information and the weights information (`neighbor` and `weights`, respectively)." ] }, { "cell_type": "markdown", "id": "b60d44ce-d9a8-4a3c-a4a2-6d2f9b1336c5", "metadata": {}, "source": [ "## Cardinalities\n", "The `cardinalities` attribute contains information on the number of neighbors for each unit." 
] }, { "cell_type": "code", "execution_count": 27, "id": "f420b789-0ede-4ade-8b17-ff7f88e3c9e0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{0: 3,\n", " 1: 3,\n", " 2: 5,\n", " 3: 2,\n", " 4: 4,\n", " 5: 3,\n", " 6: 3,\n", " 7: 5,\n", " 8: 5,\n", " 9: 4,\n", " 10: 4,\n", " 11: 5,\n", " 12: 5,\n", " 13: 4,\n", " 14: 3,\n", " 15: 7,\n", " 16: 3,\n", " 17: 8,\n", " 18: 4,\n", " 19: 3,\n", " 20: 2,\n", " 21: 5,\n", " 22: 5,\n", " 23: 7,\n", " 24: 7,\n", " 25: 6,\n", " 26: 6,\n", " 27: 5,\n", " 28: 5,\n", " 29: 5,\n", " 30: 7,\n", " 31: 3,\n", " 32: 5,\n", " 33: 6,\n", " 34: 4,\n", " 35: 6,\n", " 36: 7,\n", " 37: 3,\n", " 38: 9,\n", " 39: 5,\n", " 40: 4,\n", " 41: 7,\n", " 42: 7,\n", " 43: 5,\n", " 44: 2,\n", " 45: 6,\n", " 46: 6,\n", " 47: 8,\n", " 48: 6,\n", " 49: 6,\n", " 50: 7,\n", " 51: 6,\n", " 52: 7,\n", " 53: 7,\n", " 54: 6,\n", " 55: 2,\n", " 56: 6,\n", " 57: 4,\n", " 58: 4,\n", " 59: 3,\n", " 60: 6,\n", " 61: 6,\n", " 62: 7,\n", " 63: 5,\n", " 64: 6,\n", " 65: 4,\n", " 66: 9,\n", " 67: 5,\n", " 68: 5,\n", " 69: 7,\n", " 70: 6,\n", " 71: 5,\n", " 72: 3,\n", " 73: 6,\n", " 74: 4,\n", " 75: 3,\n", " 76: 2,\n", " 77: 5,\n", " 78: 7,\n", " 79: 2,\n", " 80: 3,\n", " 81: 6,\n", " 82: 5,\n", " 83: 4,\n", " 84: 4,\n", " 85: 5,\n", " 86: 4,\n", " 87: 6,\n", " 88: 5,\n", " 89: 2,\n", " 90: 6,\n", " 91: 4,\n", " 92: 4,\n", " 93: 5,\n", " 94: 3,\n", " 95: 5,\n", " 96: 7,\n", " 97: 4,\n", " 98: 2,\n", " 99: 3}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.cardinalities" ] }, { "cell_type": "code", "execution_count": 28, "id": "358a43aa-7859-4574-9150-780ef3b4c2c6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "focal\n", "0 3\n", "1 3\n", "2 5\n", "3 2\n", "4 4\n", " ..\n", "95 5\n", "96 7\n", "97 4\n", "98 2\n", "99 3\n", "Name: cardinalities, Length: 100, dtype: int64" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.cardinalities" ] }, { 
"cell_type": "markdown", "id": "5bbcf980-4658-4e43-80f3-3955151b39a9", "metadata": {}, "source": [ "Here we see that, although the attribute name is common to both classes, the data types are different. (Note: other cases of common names but different types that we won't cover here are listed [here](../../migration.rst)).\n" ] }, { "cell_type": "code", "execution_count": 29, "id": "0ff66bd7-c78f-4ac3-8eae-27fbd09646d6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(dict, pandas.core.series.Series)" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(w_queen.cardinalities), type(g_queen.cardinalities)" ] }, { "cell_type": "markdown", "id": "88aaf817-8b72-4da4-93e3-41976bda415f", "metadata": {}, "source": [ "Summaries of the cardinality distribution can be obtained for the `W` as:" ] }, { "cell_type": "code", "execution_count": 30, "id": "18a20a7e-9377-4f66-bd77-48e89866a874", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(2, 8), (3, 15), (4, 17), (5, 23), (6, 19), (7, 14), (8, 2), (9, 2)]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.histogram" ] }, { "cell_type": "markdown", "id": "8f2f0877-4c46-4502-9391-820719b77fcc", "metadata": {}, "source": [ "which indicates that 8 units have 2 neighbors, 15 have 3 neighbors and so on.\n", "\n", "For the `Graph` we can more easily visualize this distribution with:" ] }, { "cell_type": "code", "execution_count": 31, "id": "2364e921-4e95-4441-a8a2-37a8b2ea351c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "g_queen.cardinalities.hist(bins=range(2, 10))" ] }, { "cell_type": "markdown", "id": "072e73ce-f9ba-4f5e-9d7a-904ce50f12af", "metadata": {}, "source": [ "While to get a similar visualization for `W` requires other packages:" ] }, { "cell_type": "code", "execution_count": 32, "id": "519487e9-8a23-4a15-8731-4cc1796ddda3", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sns.displot(pd.Series(w_queen.cardinalities), bins=range(2, 10));" ] }, { "cell_type": "markdown", "id": "b8fe49b2-8bb8-4e93-bfb3-dc9e93410df2", "metadata": {}, "source": [ "## Geovisualization of the Weights\n", "\n", "Both `W` and `Graph` afford the ability to visualize the connectivity structure as a graph embedded in the geographic space.\n", "For the `W`, the `plot` method can be used" ] }, { "cell_type": "code", "execution_count": 33, "id": "0cb29997-4555-4883-b118-3802d2adda9e", "metadata": {}, "outputs": [], "source": [ "gdf = gdf.to_crs(gdf.estimate_utm_crs())" ] }, { "cell_type": "code", "execution_count": 34, "id": "72a287f3-c85e-411b-939f-a1ba5fdf5ff1", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "w_queen.plot(gdf);" ] }, { "cell_type": "markdown", "id": "068e0437-0e1e-47cd-a0e5-8f477bc9d2d9", "metadata": {}, "source": [ "and the same can be done with the `Graph`:" ] }, { "cell_type": "code", "execution_count": 35, "id": "b560cc2e-9438-4e65-a3a2-d70bec954f8b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "g_queen.plot(gdf)" ] }, { "cell_type": "markdown", "id": "ff87e52b-5035-44c3-9e25-adff9d1ffcfe", "metadata": {}, "source": [ "However, the `Graph` adds an `explore` function allowing for a richer visualization:" ] }, { "cell_type": "code", "execution_count": 36, "id": "e372a3c5-cc3c-45b1-b1be-b73756ff1279", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = gdf.explore(tiles=\"CartoDB Positron\")\n", "g_queen.explore(gdf, m=m)" ] }, { "cell_type": "markdown", "id": "b44a171d-d458-4773-859f-00eef92aea6e", "metadata": {}, "source": [ "This can be leveraged to look at the spatial distribution of the cardinalities, for example:" ] }, { "cell_type": "code", "execution_count": 37, "id": "8cfb4407-c8a3-4e87-a634-e5b993436a74", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "focal\n", "0 3\n", "1 3\n", "2 5\n", "3 2\n", "4 4\n", " ..\n", "95 5\n", "96 7\n", "97 4\n", "98 2\n", "99 3\n", "Name: cardinalities, Length: 100, dtype: int64" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.cardinalities" ] }, { "cell_type": "code", "execution_count": 38, "id": "5ebf7c0d-f74d-4345-b6ab-d71a2cab197b", "metadata": {}, "outputs": [], "source": [ "gdf[\"cardinalities\"] = g_queen.cardinalities" ] }, { "cell_type": "code", "execution_count": 39, "id": "2359c40d-bb0d-4823-aca5-5a46fba40192", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = gdf.explore(tiles=\"CartoDB Positron\", column=\"cardinalities\", legend=True)\n", "g_queen.explore(gdf, m=m)" ] }, { "cell_type": "markdown", "id": "ca807531-72df-4027-a946-0f65f847c0b2", "metadata": {}, "source": [ "## Transformations\n", "\n", "Transformation of the weight values is often required in various spatial statistics and operations. How these are handled is a major change between the `W` and `Graph` classes.\n", "\n", "PySAL currently supports the following transformations:\n", "\n", "- O: original, returning the object to the initial state.\n", "- B: binary, with every neighbor having assigned a weight of one.\n", "- R: row, with all the neighbors of a given observation adding up to one.\n", "- V: variance stabilizing, with the sum of all the weights being constrained to the number of observations.\n", "- D: double, with all the weights across all observations adding up to one.\n", "\n", "\n", "For the `W`, the `transform` property stores the type of transformation that is associated with the weight values:" ] }, { "cell_type": "code", "execution_count": 40, "id": "c4a0cb0a-76f2-41e5-b439-d99a24e7c280", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'O'" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen.transform" ] }, { "cell_type": "markdown", "id": "394d8a97-1496-449b-bc46-3b3c815c1ff8", "metadata": {}, "source": [ "An 'O' here means the weight values are set to the original transformation upon construction. 
In this case they are binary:" ] }, { "cell_type": "code", "execution_count": 41, "id": "7fdba8fd-8520-40e2-bc3c-b839f0e14bad", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{96: 1.0, 99: 1.0}" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen[98]" ] }, { "cell_type": "markdown", "id": "66ae843c-9563-45c3-8239-40d802e1ddd9", "metadata": {}, "source": [ "In some cases, rather than using the binary weights, we may need to row-standardize the weights, so that in the full $n \\times n$ weights matrix, the row sums would all be equal to unity. The relevant transform in this case is 'r':" ] }, { "cell_type": "code", "execution_count": 42, "id": "8827dd89-b78f-46bc-b3bf-cc7762af0f4b", "metadata": {}, "outputs": [], "source": [ "w_queen.transform = \"r\"" ] }, { "cell_type": "code", "execution_count": 43, "id": "bb868961-6809-4812-99f1-e5c225e9cc7c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{96: 0.5, 99: 0.5}" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w_queen[98]" ] }, { "cell_type": "markdown", "id": "126d3420-97f9-4682-babf-ff0e61268821", "metadata": {}, "source": [ "Since `transform` is a property of `W`, setting the `transform` will update the values of the weights.\n", "\n", "For the `Graph` class, things have changed in terms of these standardizations." 
] }, { "cell_type": "code", "execution_count": 44, "id": "c6bb10a2-e4d2-49f2-a67e-6d57e923d058", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 1\n", "99 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[98]" ] }, { "cell_type": "code", "execution_count": 45, "id": "9c161b95-9cdc-479c-a449-e33d9a584ffc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ ">" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.transform" ] }, { "cell_type": "markdown", "id": "35ad4391-6829-48d2-a7c0-403b1b1688cc", "metadata": {}, "source": [ "This tells us that the `transform` member is now a method of the `Graph`.\n", "\n", "The related change is the addition of a `transformation` property:" ] }, { "cell_type": "code", "execution_count": 46, "id": "ed78f5f3-d306-4b32-b159-3b2fb3aad0b8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'O'" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.transformation" ] }, { "cell_type": "markdown", "id": "c94993a4-8c24-497d-a803-25b9a7343e2e", "metadata": {}, "source": [ "which plays a similar role to the `W.transform` property, in the sense it informs us as to what standardization is associated with the value of the spatial weights.\n", "\n", "However, `Graph.transformation` is not a setter in the sense that if the user changes its value, the weight values will not be affected. 
It is an information-only property.\n", "\n", "This is because, by design, the weights for the `Graph` are **immutable**.\n", "In other words, once the `Graph` instance is created, its state cannot be changed.\n", "\n", "If transformed spatial weights are required, the `Graph` method `transform` can be called with the type of transformation required:" ] }, { "cell_type": "code", "execution_count": 47, "id": "a18cb028-c9a3-4bd3-86c7-dd28f8bf52eb", "metadata": {}, "outputs": [], "source": [ "g1 = g_queen.transform(\"r\")" ] }, { "cell_type": "code", "execution_count": 48, "id": "86f2ed73-49c0-493e-a41a-55915ac402cf", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 0.5\n", "99 0.5\n", "Name: weight, dtype: float64" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g1[98]" ] }, { "cell_type": "markdown", "id": "3f2f5116-4ba1-4619-aaea-0d262d3d11cd", "metadata": {}, "source": [ "The call returns a new `Graph` instance with the weight values suitably transformed, and its `transformation` property records the standardization:\n" ] }, { "cell_type": "code", "execution_count": 49, "id": "30a3a8eb-c917-4093-9e8d-d88477fe55b5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'R'" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g1.transformation" ] }, { "cell_type": "markdown", "id": "a8e3b89a-5e0b-4ffc-a0bd-68b0fd9b9358", "metadata": {}, "source": [ "## Spatial Lag" ] }, { "cell_type": "markdown", "id": "00295e4f-283d-4b34-be43-087749b7ba5e", "metadata": {}, "source": [ "The spatial lag of a variable is a weighted sum of the neighboring values for that variable, which becomes a weighted average when the weights are row-standardized:\n", "\n", "$$[Wy]_i = \\sum_j w_{i,j} y_j$$" ] }, { "cell_type": "markdown", "id": "5766e374-e416-4595-a71d-456dcb468076", "metadata": {}, "source": [ "The implementation of the spatial lag has changed between the `W` and `Graph` classes.\n", "\n", "To illustrate, we will pull out the `SID79` column, the sudden infant death counts for the period beginning in 1979, for the counties 
into the variable `y`:" ] }, { "cell_type": "code", "execution_count": 50, "id": "34f5d317-00bb-4d79-9f49-1ec6b02b45a6", "metadata": {}, "outputs": [], "source": [ "y = gdf.SID79" ] }, { "cell_type": "markdown", "id": "ec106df5-4899-4113-b8aa-7f6013c6e8b1", "metadata": {}, "source": [ "To calculate the spatial lag as a weighted average, we need to use the weights that have been row-standardized. For the `W` class, the calculation of the spatial lag is done with a function that takes the `W` as an argument together with the variable of interest:" ] }, { "cell_type": "code", "execution_count": 51, "id": "7b499f38-3e54-4aa6-b70a-ac237b4d3cf9", "metadata": {}, "outputs": [], "source": [ "from libpysal.weights import lag_spatial" ] }, { "cell_type": "code", "execution_count": 52, "id": "47159a15-996d-455d-bf65-74cb2a6234fe", "metadata": {}, "outputs": [], "source": [ "w_queen.transform = \"r\"\n", "wlag = lag_spatial(w_queen, y)" ] }, { "cell_type": "code", "execution_count": 53, "id": "3f20b6ee-702c-46e0-9d6e-dacba84b0831", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3.66666667, 4.33333333, 6.8 , 1.5 , 7.25 ,\n", " 3.33333333, 2.66666667, 2.4 , 6.6 , 16.75 ,\n", " 6.5 , 14.8 , 12.6 , 8.5 , 2. ,\n", " 3.85714286, 1.33333333, 3.375 , 4. , 2.33333333,\n", " 1. , 6.4 , 7.8 , 11.42857143, 9.42857143,\n", " 9.83333333, 11. , 5.2 , 8.4 , 9.6 ,\n", " 12.14285714, 2. , 9.8 , 7.66666667, 6.75 ,\n", " 7.66666667, 8.42857143, 9. , 11.55555556, 8. ,\n", " 10.5 , 13.42857143, 10.14285714, 2. , 0. ,\n", " 7.33333333, 12.16666667, 12.875 , 11.16666667, 8.5 ,\n", " 9. , 9.83333333, 5.14285714, 12.57142857, 6.5 ,\n", " 1. , 5.16666667, 4.25 , 15.25 , 6. ,\n", " 11.16666667, 9.16666667, 17. , 15.4 , 20.5 ,\n", " 4.25 , 13.88888889, 13.4 , 12.8 , 7.28571429,\n", " 9.5 , 7.6 , 2. , 10.83333333, 9.75 ,\n", " 21. , 8. , 1.8 , 16.85714286, 11. ,\n", " 1.33333333, 9.33333333, 13.2 , 16.5 , 7.75 ,\n", " 22.2 , 1.25 , 11.5 , 7.8 , 2. ,\n", " 6. , 11. , 4. 
, 20.2 , 14.33333333,\n", " 21.4 , 10.14285714, 10. , 4.5 , 9.66666667])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wlag" ] }, { "cell_type": "markdown", "id": "4874f241-ae96-4296-a941-e21c311a807b", "metadata": {}, "source": [ "For the `Graph`, the `lag` is now a method:" ] }, { "cell_type": "code", "execution_count": 54, "id": "25c264fd-55fd-440c-b4f4-bb3c6b1aa1d2", "metadata": {}, "outputs": [], "source": [ "glag = g1.lag(y)" ] }, { "cell_type": "code", "execution_count": 55, "id": "d4aec77e-b453-444f-ae8e-ba5c18aa2c63", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3.66666667, 4.33333333, 6.8 , 1.5 , 7.25 ,\n", " 3.33333333, 2.66666667, 2.4 , 6.6 , 16.75 ,\n", " 6.5 , 14.8 , 12.6 , 8.5 , 2. ,\n", " 3.85714286, 1.33333333, 3.375 , 4. , 2.33333333,\n", " 1. , 6.4 , 7.8 , 11.42857143, 9.42857143,\n", " 9.83333333, 11. , 5.2 , 8.4 , 9.6 ,\n", " 12.14285714, 2. , 9.8 , 7.66666667, 6.75 ,\n", " 7.66666667, 8.42857143, 9. , 11.55555556, 8. ,\n", " 10.5 , 13.42857143, 10.14285714, 2. , 0. ,\n", " 7.33333333, 12.16666667, 12.875 , 11.16666667, 8.5 ,\n", " 9. , 9.83333333, 5.14285714, 12.57142857, 6.5 ,\n", " 1. , 5.16666667, 4.25 , 15.25 , 6. ,\n", " 11.16666667, 9.16666667, 17. , 15.4 , 20.5 ,\n", " 4.25 , 13.88888889, 13.4 , 12.8 , 7.28571429,\n", " 9.5 , 7.6 , 2. , 10.83333333, 9.75 ,\n", " 21. , 8. , 1.8 , 16.85714286, 11. ,\n", " 1.33333333, 9.33333333, 13.2 , 16.5 , 7.75 ,\n", " 22.2 , 1.25 , 11.5 , 7.8 , 2. ,\n", " 6. , 11. , 4. , 20.2 , 14.33333333,\n", " 21.4 , 10.14285714, 10. , 4.5 , 9.66666667])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "glag" ] }, { "cell_type": "markdown", "id": "891ae30f-34ff-4240-8205-aaf8288aaf81", "metadata": {}, "source": [ "## Pandas operations\n", "\n", "As mentioned above, gains in efficiency and scope have been the main motivating forces behind the development of the new `Graph` class. 
Here we highlight a few additional gains." ] }, { "cell_type": "markdown", "id": "4e649b10-e9e7-490f-b42f-e30418a7bca6", "metadata": {}, "source": [ "The `adjacency` attribute for the `Graph` lets users leverage the power of pandas series. For example, the operation:" ] }, { "cell_type": "code", "execution_count": 56, "id": "9921c8c7-e2f8-4bd8-8c94-1b1cdd11b49d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "96 1\n", "97 1\n", "98 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99]" ] }, { "cell_type": "markdown", "id": "4798679f-ad54-4488-9b45-afd2fd978b4a", "metadata": {}, "source": [ "is actually a [slice](https://pandas.pydata.org/docs/user_guide/indexing.html) of the `Graph` that returns another pandas series where the index is the id of the neighboring unit and the value is the weight of that neighbor relationship.\n" ] }, { "cell_type": "code", "execution_count": 57, "id": "ea8dca41-4b87-40aa-a6eb-caa425c18a96", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([( 0, 1),\n", " ( 0, 17),\n", " ( 0, 18),\n", " ( 1, 0),\n", " ( 1, 2),\n", " ( 1, 17),\n", " ( 2, 1),\n", " ( 2, 9),\n", " ( 2, 17),\n", " ( 2, 22),\n", " ...\n", " (96, 99),\n", " (97, 93),\n", " (97, 95),\n", " (97, 96),\n", " (97, 99),\n", " (98, 96),\n", " (98, 99),\n", " (99, 96),\n", " (99, 97),\n", " (99, 98)],\n", " names=['focal', 'neighbor'], length=490)" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.adjacency.index" ] }, { "cell_type": "code", "execution_count": 58, "id": "c2128226-0976-49ea-8adf-133b8dabc9ae", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index([96, 97, 98], dtype='int64', name='neighbor')" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen[99].index" ] }, { "cell_type": "code", "execution_count": 59, "id": 
"406c3323-2d58-42ca-9ebc-a362de2046cb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "focal neighbor\n", "96 78 1\n", " 87 1\n", " 92 1\n", " 95 1\n", " 97 1\n", " 98 1\n", " 99 1\n", "97 93 1\n", " 95 1\n", " 96 1\n", " 99 1\n", "98 96 1\n", " 99 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g_queen.adjacency.loc[[96, 97, 98]]" ] }, { "cell_type": "markdown", "id": "4cfe70cb-18f2-432d-95e0-cbe2c7fe7b4f", "metadata": {}, "source": [ "Another way the `Graph` makes use of the powerful indexing afforded by pandas is seen it the method `subgraph`:" ] }, { "cell_type": "code", "execution_count": 60, "id": "760dc3be-6c4d-4e58-bbff-a2907f99f502", "metadata": {}, "outputs": [], "source": [ "g1 = g_queen.subgraph([96, 97, 98])" ] }, { "cell_type": "code", "execution_count": 61, "id": "fc216581-a49a-4205-b5c0-688bd200eb3d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "focal neighbor\n", "96 97 1\n", " 98 1\n", "97 96 1\n", "98 96 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g1.adjacency" ] }, { "cell_type": "code", "execution_count": 62, "id": "1ecf948a-3b0c-4e03-8044-fde8621dae6e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g1.n" ] }, { "cell_type": "markdown", "id": "abf50253-917a-48ce-ba84-979f9fd04126", "metadata": {}, "source": [ "This allows more efficient extraction of subgraphs from the weights graph, relative to the way this is done with the `W` class:" ] }, { "cell_type": "code", "execution_count": 63, "id": "d145dc1b-aa21-4b67-b3d3-fe324e2bec2f", "metadata": {}, "outputs": [], "source": [ "from libpysal.weights import w_subset" ] }, { "cell_type": "code", "execution_count": 64, "id": "15d60171-3229-497e-9ed3-55812b0abdb3", "metadata": {}, "outputs": [], "source": [ 
"w1 = w_subset(w_queen, [96, 97, 98])" ] }, { "cell_type": "code", "execution_count": 65, "id": "9b7dabec-f7ce-49d5-9f41-91ea51245302", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{96: [1.0, 1.0], 97: [1.0], 98: [1.0]}" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w1.weights" ] }, { "cell_type": "code", "execution_count": 66, "id": "50fda366-2a8e-494b-a1c7-9926fef979c3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w1.n" ] }, { "cell_type": "markdown", "id": "9a723115-c894-4bc2-8e12-d2668cfbec18", "metadata": {}, "source": [ "The index of the dataframe can also be set to a more informative attribute than simple integers:" ] }, { "cell_type": "code", "execution_count": 67, "id": "0a88d13f-0573-4680-ba61-e33735fa4d97", "metadata": {}, "outputs": [], "source": [ "ngdf = gdf.set_index(\"NAME\")" ] }, { "cell_type": "markdown", "id": "7fb4873d-ed96-46f7-8d89-26c6a6e9bd98", "metadata": {}, "source": [ "Once this is done, the new `NAME` based index will propagate to any `Graph` built on this dataframe:" ] }, { "cell_type": "code", "execution_count": 68, "id": "80c5aec6-9640-4793-80e4-6171d32e9c73", "metadata": {}, "outputs": [], "source": [ "g = graph.Graph.build_contiguity(ngdf, rook=False)" ] }, { "cell_type": "markdown", "id": "823c2b51-1dd0-4dee-90cf-7c3d7a1c8736", "metadata": {}, "source": [ "This facilities more user-friendly queries. 
For example, we can ask which counties neighbor Ashe County:" ] }, { "cell_type": "code", "execution_count": 69, "id": "8466eb37-cb73-4bd4-aab0-86d83230dc18", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "neighbor\n", "Alleghany 1\n", "Wilkes 1\n", "Watauga 1\n", "Name: weight, dtype: int64" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g[\"Ashe\"]" ] }, { "cell_type": "markdown", "id": "c28a39d5-50d2-415b-98be-ec265663daa7", "metadata": {}, "source": [ "We encountered the `explore` method of the `Graph` above. It can also make handy use of the name-based indexing:" ] }, { "cell_type": "code", "execution_count": 70, "id": "726d4da2-52e0-4817-b91d-8782d7b4646e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = ngdf.loc[g[\"Ashe\"].index].explore(color=\"#25b497\")\n", "ngdf.loc[[\"Ashe\"]].explore(m=m, color=\"#fa94a5\")\n", "g.explore(ngdf, m=m, focal=\"Ashe\")" ] }, { "cell_type": "markdown", "id": "4bd40414-eee6-4efa-8e8c-31648fea1eaf", "metadata": {}, "source": [ "## Conclusion\n", "\n", "This notebook highlights the main areas that users need to be aware of when considering porting their code from the `W` class to the new `Graph`.\n", "More details on the specifications and differences of the graphs can be found in the [W to Graph Member Comparisions](../../migration.rst).\n", "\n", "Downstream packages in the pysal family are in the process of supporting the new `Graph` while preserving backwards compatibility with the `W` class." ] }, { "cell_type": "markdown", "id": "10c4e285-977c-408a-9c9e-ffa80e8ffb9e", "metadata": {}, "source": [ "## Further Reading\n", "\n", "- Anselin, L. and S.J. Rey (2014) *[Modern Spatial Econometrics in Practice, Chs 3,4](https://www.amazon.com/Modern-Spatial-Econometrics-Practice-GeoDaSpace/dp/0986342106)*. GeoDa Press.\n", "- Arribas-Bel, D. (2019). *[Geographic Data Science, Lab 4](https://darribas.org/gds_course/content/bE/lab_E.html)*\n", "- Fleischmann, M. (2024). *[Spatial Data Science for Social Geography, Ch 4](https://martinfleischmann.net/sds/chapter_04/hands_on.html)*\n", "- Knaap, E. (2024). *[Urban Analysis & Spatial Science, Ch 9](https://knaaptime.com/urban_analysis/03_esda/spatial_graphs.html)*\n", "- Rey, S.J., D. Arribas-Bel, & L.J. Wolf. (2023) *[Geographic Data Science with Python](https://www.routledge.com/Geographic-Data-Science-with-Python/Rey-Arribas-Bel-Wolf/p/book/9781032445953?gad_source=1&gclid=CjwKCAjw1920BhA3EiwAJT3lScLq-TnzytfU0JRyROH9eOG97U4r7YX3G0ZCs3p5mVGUymqptvNe6hoCkqcQAvD_BwE)*. CRC/Taylor Francis. 
[Chapter 4](https://geographicdata.science/book/notebooks/04_spatial_weights.html).\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 5 }