{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "0c88261c-c8d4-4638-a727-60d833b7d111", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:08.827468Z", "iopub.status.busy": "2023-08-05T22:38:08.827205Z", "iopub.status.idle": "2023-08-05T22:38:09.918747Z", "shell.execute_reply": "2023-08-05T22:38:09.918338Z", "shell.execute_reply.started": "2023-08-05T22:38:08.827374Z" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Author: eli knaap\n", "\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'eli knaap'\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "from libpysal import examples\n", "from esda import correlogram\n", "\n", "import geopandas as gpd" ] }, { "cell_type": "code", "execution_count": 2, "id": "68d86eb6-8595-48f0-a244-8889c59e770d", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:09.920052Z", "iopub.status.busy": "2023-08-05T22:38:09.919906Z", "iopub.status.idle": "2023-08-05T22:38:10.131010Z", "shell.execute_reply": "2023-08-05T22:38:10.130752Z", "shell.execute_reply.started": "2023-08-05T22:38:09.920043Z" }, "tags": [] }, "outputs": [], "source": [ "sac = gpd.read_file(examples.load_example(\"Sacramento1\").get_path(\"sacramentot2.shp\"))" ] }, { "cell_type": "code", "execution_count": 3, "id": "2650732f-b704-4af8-bfee-ccd8c75cf133", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:10.131596Z", "iopub.status.busy": "2023-08-05T22:38:10.131507Z", "iopub.status.idle": "2023-08-05T22:38:10.206997Z", "shell.execute_reply": "2023-08-05T22:38:10.206722Z", "shell.execute_reply.started": "2023-08-05T22:38:10.131587Z" }, "tags": [] }, "outputs": [], "source": [ "sac = sac.to_crs(sac.estimate_utm_crs())  # reproject to the estimated UTM zone; units are now meters" ] }, { "cell_type": "code", "execution_count": 4, "id": "d3b1de58-3fb4-4874-ae5b-63f5e03f869d", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:10.207610Z", "iopub.status.busy": 
"2023-08-05T22:38:10.207522Z", "iopub.status.idle": "2023-08-05T22:38:10.252685Z", "shell.execute_reply": "2023-08-05T22:38:10.252390Z", "shell.execute_reply.started": "2023-08-05T22:38:10.207601Z" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31mSignature:\u001b[39m\n", "correlogram(\n", " geometry: geopandas.geoseries.GeoSeries,\n", " variable: str | list | pandas.core.series.Series | \u001b[38;5;28;01mNone\u001b[39;00m,\n", " support: list | \u001b[38;5;28;01mNone\u001b[39;00m = \u001b[38;5;28;01mNone\u001b[39;00m,\n", " statistic: collections.abc.Callable | str = <\u001b[38;5;28;01mclass\u001b[39;00m \u001b[33m'esda.moran.Moran'\u001b[39m>,\n", " distance_type: str = \u001b[33m'band'\u001b[39m,\n", " weights_kwargs: dict = \u001b[38;5;28;01mNone\u001b[39;00m,\n", " stat_kwargs: dict = \u001b[38;5;28;01mNone\u001b[39;00m,\n", " select_numeric: bool = \u001b[38;5;28;01mFalse\u001b[39;00m,\n", " n_jobs: int = -\u001b[32m1\u001b[39m,\n", " n_bins: int | \u001b[38;5;28;01mNone\u001b[39;00m = \u001b[32m50\u001b[39m,\n", ") -> pandas.core.frame.DataFrame\n", "\u001b[31mDocstring:\u001b[39m\n", "Generate a spatial correlogram\n", "\n", "A spatial profile is a set of spatial autocorrelation statistics calculated for\n", "a set of increasing distances. 
It is a useful exploratory tool for examining\n", "how the relationship between spatial units changes over different notions of scale.\n", "\n", "Parameters\n", "----------\n", "geometry : gpd.GeoSeries\n", " geodataframe holding spatial and attribute data\n", "variable: pd.Series or list\n", " pandas series matching input geometries\n", "support : list or None\n", " list of values at which to compute the autocorrelation statistic\n", "statistic : callable or str\n", " statistic to be computed for a range of libpysal.Graph specifications.\n", " This should be a class with a signature like `Statistic(y,w, **kwargs)`\n", " where y is a array and w is a libpysal.weights.W object\n", " Generally, this is a class from pysal's `esda` package\n", " defaults to esda.Moran, which computes the Moran's I statistic. If\n", " 'lowess' is provided, a non-parametric correlogram is computed using\n", " lowess regression on the spatial-covariation model, see Notes.\n", "distance_type : str, optional\n", " which concept of distance to increment. Options are {`band`, `knn`}.\n", " by default 'band' (for `libpysal.weights.DistanceBand` weights)\n", "weights_kwargs : dict\n", " additional keyword arguments passed to the libpysal.weights.W class\n", "stat_kwargs : dict\n", " additional keyword arguments passed to the `esda` autocorrelation statistic class.\n", " For example for faster results with no statistical inference, set the number\n", " of permutations to zero with stat_kwargs={permutations: 0}\n", "select_numeric : bool\n", " if True, only return numeric attributes from the original class. This is useful\n", " e.g. to prevent lists inside a \"cell\" of a dataframe\n", "n_jobs : int\n", " number of jobs to pass to joblib. If -1 (default), all cores will be used\n", "n_bins : int\n", " number of distance bands or k-nearest neighbor values to use if\n", " `support` is not provided. Ignored if `support` is provided.\n", " by default 10. 
If `distance_type` is 'knn', the number of neighbors\n", " will be capped at n-1, where n is the number of observations. Further,\n", " if n-1 is not divisible by `n_bins`, the actual number of bins will be\n", " may be off by one bin.\n", "\n", "Returns\n", "-------\n", "outputs : pandas.DataFrame\n", " table of autocorrelation statistics at increasing distance bandwidths\n", "\n", "Notes\n", "-----\n", "The nonparametric correlogram uses a lowess regression\n", "to estimate the spatial-covariation model:\n", "\n", " zi*zj = f(d_{ij}) + e_ij\n", "\n", "where f is a smooth function of distance d_{ij} between points i and j.\n", "This function requires the statsmodels package to be installed.\n", "\n", "For the nonparametric correlogram, a precomputed distance matrix can\n", "be used. To do this, set\n", "stat_kwargs={'metric':'precomputed', 'coordinates':distance_matrix}\n", "where `distance_matrix` is a square matrix of pairwise distances that\n", "aligns with the `geometry` rows.\n", "\u001b[31mFile:\u001b[39m ~/Dropbox/work/dev/esda/esda/correlogram.py\n", "\u001b[31mType:\u001b[39m function" ] } ], "source": [ "correlogram?" 
] }, { "cell_type": "code", "execution_count": 5, "id": "90fc27f4-1f71-4a39-bcd1-339fa27fa3b4", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:10.253267Z", "iopub.status.busy": "2023-08-05T22:38:10.253180Z", "iopub.status.idle": "2023-08-05T22:38:10.283353Z", "shell.execute_reply": "2023-08-05T22:38:10.282850Z", "shell.execute_reply.started": "2023-08-05T22:38:10.253257Z" }, "tags": [] }, "outputs": [], "source": [ "from esda import Moran, Geary, G" ] }, { "cell_type": "markdown", "id": "d77ee874-e889-4d60-a039-434906696451", "metadata": {}, "source": [ "## Distance Bands" ] }, { "cell_type": "code", "execution_count": 6, "id": "ab615631-b083-4700-b402-8dd8343cc790", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:10.283933Z", "iopub.status.busy": "2023-08-05T22:38:10.283848Z", "iopub.status.idle": "2023-08-05T22:38:10.313104Z", "shell.execute_reply": "2023-08-05T22:38:10.312763Z", "shell.execute_reply.started": "2023-08-05T22:38:10.283925Z" }, "tags": [] }, "outputs": [], "source": [ "# Create a list of distances from 500 to 5000 (meters, here) in increments of 500\n", "\n", "distances = list(range(500, 5001, 500))" ] }, { "cell_type": "code", "execution_count": 7, "id": "049e4002-52a1-4205-9238-bfdb175f04ff", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:10.315016Z", "iopub.status.busy": "2023-08-05T22:38:10.314723Z", "iopub.status.idle": "2023-08-05T22:38:10.343152Z", "shell.execute_reply": "2023-08-05T22:38:10.342713Z", "shell.execute_reply.started": "2023-08-05T22:38:10.315005Z" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "distances" ] }, { "cell_type": "markdown", "id": "d5642b21-6e3c-46ed-8c6f-5ada5baec96d", "metadata": {}, "source": [ "The correlogram will compute an autocorrelation statistic (Moran's $I$ by 
default) at each distance threshold. Plotting this statistic against distance reveals how spatial similarity changes over distance (similar in concept to a variogram)." ] }, { "cell_type": "code", "execution_count": null, "id": "9d546402-f9c8-407c-945e-b72979fc9df8", "metadata": { "tags": [] }, "outputs": [], "source": [ "prof = correlogram(\n", " sac.centroid,\n", " sac.HH_INC,\n", " distances,\n", " Moran\n", ")" ] }, { "cell_type": "markdown", "id": "5187da59-aba0-465d-9fa7-e22794c8eb1a", "metadata": {}, "source": [ "`prof` is a dataframe of autocorrelation statistics with one row per distance threshold; the row index is the distance at which each statistic was computed. It includes all attributes created by the esda autocorrelation statistic class (e.g. [Moran](https://pysal.org/esda/generated/esda.Moran.html#esda.Moran), [Geary](https://pysal.org/esda/generated/esda.Geary.html#esda.Geary), or [Getis-Ord G](https://pysal.org/esda/generated/esda.G.html#esda.G))." ] }, { "cell_type": "code", "execution_count": 9, "id": "af5bad37-874a-483b-a692-a3ae205416f8", "metadata": { "execution": { "iopub.execute_input": "2023-08-05T22:38:11.719943Z", "iopub.status.busy": "2023-08-05T22:38:11.719810Z", "iopub.status.idle": "2023-08-05T22:38:11.764412Z", "shell.execute_reply": "2023-08-05T22:38:11.764069Z", "shell.execute_reply.started": "2023-08-05T22:38:11.719928Z" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
|  | y | w | permutations | n | z | z2ss | EI | VI_norm | seI_norm | VI_rand | ... | z_rand | p_norm | p_rand | sim | p_sim | EI_sim | seI_sim | VI_sim | z_sim | p_z_sim |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 500 | [52941, 51958, 32992, 54556, 50815, 60167, 490... | <libpysal.weights.distance.DistanceBand object... | 999 | 403 | [0.27506115837390277, 0.21948138443207246, -0.... | 1.260604e+11 | -0.002488 | 0.497534 | 0.705361 | 0.496239 | ... | 0.086188 | 9.314058e-01 | 9.313166e-01 | [-0.041149125017795225, 0.8047828919464444, 0.... | 0.465 | -0.017506 | 0.693436 | 0.480854 | 0.109214 | 4.565164e-01 |\n",
"| 1000 | [52941, 51958, 32992, 54556, 50815, 60167, 490... | <libpysal.weights.distance.DistanceBand object... | 999 | 403 | [0.27506115837390277, 0.21948138443207246, -0.... | 1.260604e+11 | -0.002488 | 0.014259 | 0.119409 | 0.014221 | ... | 4.147088 | 3.447621e-05 | 3.367309e-05 | [0.01407910066509541, -0.09399265148441757, -0... | 0.001 | -0.007273 | 0.120349 | 0.014484 | 4.149132 | 1.668694e-05 |\n",
"| 1500 | [52941, 51958, 32992, 54556, 50815, 60167, 490... | <libpysal.weights.distance.DistanceBand object... | 999 | 403 | [0.27506115837390277, 0.21948138443207246, -0.... | 1.260604e+11 | -0.002488 | 0.004586 | 0.067719 | 0.004574 | ... | 6.763620 | 1.430242e-11 | 1.345857e-11 | [-0.028776580865062164, 0.07946937157388524, 0... | 0.001 | -0.003352 | 0.067581 | 0.004567 | 6.781418 | 5.950106e-12 |\n",
"| 2000 | [52941, 51958, 32992, 54556, 50815, 60167, 490... | <libpysal.weights.distance.DistanceBand object... | 999 | 403 | [0.27506115837390277, 0.21948138443207246, -0.... | 1.260604e+11 | -0.002488 | 0.002164 | 0.046515 | 0.002158 | ... | 12.164298 | 5.846476e-34 | 4.815656e-34 | [-0.0566559868640034, 0.06456200403436937, -0.... | 0.001 | -0.000508 | 0.047756 | 0.002281 | 11.791263 | 2.164994e-32 |\n",
"| 2500 | [52941, 51958, 32992, 54556, 50815, 60167, 490... | <libpysal.weights.distance.DistanceBand object... | 999 | 403 | [0.27506115837390277, 0.21948138443207246, -0.... | 1.260604e+11 | -0.002488 | 0.001481 | 0.038483 | 0.001477 | ... | 13.102771 | 3.974924e-39 | 3.174519e-39 | [-0.03272028047068946, -0.025327634962963246, ... | 0.001 | -0.002785 | 0.039315 | 0.001546 | 12.816543 | 6.623787e-38 |\n",
"\n",
"5 rows × 23 columns\n",
"