Quadrat-based statistics in pointpats

This notebook demonstrates quadrat-based methods for analyzing planar point patterns using the current pointpats.quadrat_statistics implementation.

The workflow is:

  1. Represent the point pattern as a NumPy array of coordinates with shape (n, 2).

  2. Construct a rectangular or hexagonal quadrat tessellation over the minimum bounding box (MBB) of the points.

  3. Compute a Pearson chi-square statistic from per-quadrat counts.

  4. Optionally, obtain a Monte Carlo p-value by simulating CSR realizations within the same study window, with reproducibility controlled by rng.

Key implementation facts (relevant for interpretation):

  • The default chi-square statistic uses equal expected counts across included cells.

  • A window may be supplied; if omitted, the MBB rectangle is used.

  • Monte Carlo inference makes the reference distribution window-aware, but it does not change how expected counts are defined.

1. Quadrat statistics and CSR

Quadrat analysis partitions space into a set of non-overlapping cells and compares the observed counts per cell to what is expected under complete spatial randomness (CSR).

Let \(O_i\) be the count in cell (i), and let (k) be the number of cells included in the analysis. The implemented test statistic is the Pearson chi-square statistic with equal expected counts across included cells:

\[X^2 = \sum_{i=1}^k \frac{(O_i - \bar{O})^2}{\bar{O}}, \qquad \bar{O} = \frac{1}{k} \sum_{i=1}^k O_i.\]

Analytical inference compares \(X^2\) to a chi-square distribution with \((k-1)\) degrees of freedom. Simulation-based inference (Monte Carlo) instead compares the observed \(X^2\) to the sampling distribution obtained from CSR realizations generated in the same window.

2. Example: the juvenile delinquency point pattern

We will illustrate the quadrat workflow using the juvenile.shp example dataset. The dataset is read as a GeoDataFrame and then converted to a NumPy coordinate array xy with shape (n, 2).

These coordinates are passed directly to QStatistic.

[1]:
import libpysal as ps
import numpy as np
import geopandas as gpd
%matplotlib inline
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

Import pointpats.quadrat_statistics and compute quadrat statistics.

The main entry point is:

QStatistic(points, shape="rectangle"|"hexagon", realizations=0, window=None,
           drop_nonintersecting=False, rng=None)
  • points is always a NumPy array of shape (n, 2).

  • shape selects the tessellation type.

  • realizations > 0 enables Monte Carlo inference.

  • rng controls reproducibility of simulations.

[2]:
from pointpats.quadrat_statistics import QStatistic

Open the point shapefile and extract coordinates.

Here we use GeoDataFrame.get_coordinates() to obtain an (n, 2) NumPy array, which is the required input type for QStatistic.

[3]:
juv = gpd.read_file((ps.examples.get_path("juvenile.shp")))
[4]:
juv.shape
[4]:
(168, 4)
[5]:
xy = juv.get_coordinates()
[6]:
xy
[6]:
x y
0 94.0 93.0
1 80.0 95.0
2 79.0 90.0
3 78.0 92.0
4 76.0 92.0
... ... ...
163 45.0 14.0
164 33.0 8.0
165 31.0 7.0
166 32.0 6.0
167 31.0 8.0

168 rows × 2 columns

[7]:
resRect = QStatistic(xy)
[8]:
resRect.chi2_pvalue
[8]:
np.float64(5.890978545159614e-05)
[9]:
resRect.plot()
[9]:
<Axes: title={'center': 'Quadrat Count'}>
../_images/user-guide_Quadrat_statistics_13_1.svg
[10]:
resHex = QStatistic(xy, shape='hexagon')
[11]:
resHex.plot()
[11]:
<Axes: title={'center': 'Quadrat Count'}>
../_images/user-guide_Quadrat_statistics_15_1.svg
[12]:
resHex.chi2_pvalue
[12]:
np.float64(6.375950695202749e-22)

3. Simulation-based inference (Monte Carlo)

When realizations > 0, CSR realizations are generated within the same window and the chi-square statistic is recomputed for each realization. The Monte Carlo p-value uses the standard +1 correction:

\[p = \frac{\#\{X^2_{sim} \ge X^2_{obs}\} + 1}{R + 1},\]

where \(R\) is the number of realizations.

Reproducibility

All simulation-based results can be made reproducible by supplying rng:

  • None (default): create a new numpy.random.default_rng()

  • integer seed

  • numpy.random.Generator

  • legacy numpy.random.RandomState (wrapped)

The same generator is passed through the CSR simulation code, ensuring repeatable Monte Carlo p-values.

[13]:
from numpy.random import default_rng
[14]:
rng = default_rng(42)
[15]:
resHex_sim = QStatistic(xy, shape='hexagon', realizations=99, rng=rng)
[16]:
resHex_sim.chi2_r_pvalue

[16]:
np.float64(0.01)
[17]:
resHex_sim.chi2_realizations
[17]:
array([ 97.5       ,  59.        ,  69.        ,  67.5       ,
        70.        ,  58.5       ,  60.5       ,  78.        ,
        60.        ,  54.5       ,  65.5       ,  72.5       ,
        83.        ,  92.5       ,  98.5       ,  90.5       ,
        62.        ,  83.        ,  74.        ,  68.        ,
        79.5       ,  64.5       ,  52.5       ,  59.        ,
        63.5       ,  59.        ,  70.        , 101.5       ,
        66.5       ,  76.        ,  49.5       ,  39.07142857,
        70.5       ,  89.5       ,  88.5       ,  62.5       ,
        72.        ,  76.5       ,  55.        ,  62.        ,
        63.21428571,  88.        ,  52.5       ,  65.        ,
        91.        ,  54.85714286,  61.        ,  45.5       ,
        61.        ,  67.5       ,  64.        ,  59.5       ,
        72.5       ,  58.        ,  68.5       ,  62.5       ,
        84.5       ,  74.5       ,  69.5       ,  51.        ,
        65.        ,  39.5       ,  80.        ,  70.        ,
        83.5       ,  73.5       ,  98.        ,  95.5       ,
        83.5       ,  50.5       ,  51.        ,  65.        ,
        69.5       ,  66.5       ,  59.        ,  62.        ,
        70.5       , 109.5       ,  68.        ,  68.5       ,
        67.5       ,  61.5       ,  63.5       ,  91.        ,
        82.5       ,  61.5       ,  64.        ,  78.        ,
        67.        ,  65.5       ,  63.5       ,  49.5       ,
        57.5       ,  79.5       ,  58.        ,  90.        ,
        72.        ,  91.5       ,  76.        ])
[18]:
resHex_sim.chi2
[18]:
np.float64(195.0)

4. Irregular windows

Quadrat grids are constructed over the MBB of the points. If the study window is irregular (e.g., a polygon), some grid cells may lie entirely outside the window.

  • If such outside cells are included, they contribute guaranteed zeros and can distort the test.

How Monte Carlo interacts with this

Comparing the observed chi-square statistic to simulated chi-square values is a valid Monte Carlo test for the statistic being used, because the same grid, window, and inclusion rule are applied to both observed and simulated patterns.

Monte Carlo inference makes the reference distribution window-aware, but it does not convert the test into an area-proportional quadrat test when the window is irregular.

[19]:
from shapely.geometry import Polygon
[20]:
window = Polygon([ [0, 0], [1, 0], [0, 1] ])
[21]:
from pointpats.random import poisson
[22]:
pp = poisson(window, 100/1, rng=42)
[23]:
pp
[23]:
array([[0.12811363, 0.45038594],
       [0.4434142 , 0.22723872],
       [0.55458479, 0.06381726],
       [0.7783835 , 0.19463871],
       [0.466721  , 0.04380377],
       [0.15428949, 0.68304895],
       [0.32582536, 0.37045971],
       [0.46955581, 0.18947136],
       [0.12992151, 0.47570493],
       [0.22690935, 0.66981399],
       [0.38747838, 0.2883281 ],
       [0.6824955 , 0.13975248],
       [0.1999082 , 0.00736227],
       [0.139797  , 0.11453007],
       [0.55920716, 0.3039501 ],
       [0.03081783, 0.43671739],
       [0.21458467, 0.40852864],
       [0.05830274, 0.28138389],
       [0.29359376, 0.66191651],
       [0.81402038, 0.16697292],
       [0.02271207, 0.09004786],
       [0.16127178, 0.50104478],
       [0.1523121 , 0.69632038],
       [0.44615628, 0.38102123],
       [0.30151209, 0.63028259],
       [0.36181261, 0.08764992],
       [0.4493615 , 0.27224156],
       [0.09639096, 0.9026024 ],
       [0.45577629, 0.20236336],
       [0.30595662, 0.57921957],
       [0.08444432, 0.4158074 ],
       [0.04161417, 0.49399082],
       [0.32986121, 0.14452419],
       [0.10340297, 0.58764457],
       [0.58106114, 0.3468698 ],
       [0.59091549, 0.02280387],
       [0.78273523, 0.08273   ],
       [0.48665833, 0.49070699],
       [0.4734894 , 0.26697566],
       [0.331569  , 0.5206724 ],
       [0.43891146, 0.02161208],
       [0.14024909, 0.55403614],
       [0.10857574, 0.67224009],
       [0.28123378, 0.65942263],
       [0.23021399, 0.03741256],
       [0.55485247, 0.37092228],
       [0.29091784, 0.51505713],
       [0.16460782, 0.04491062],
       [0.51885836, 0.31592905],
       [0.37365773, 0.09446667]])
[24]:
qs = QStatistic(pp, nx=5, ny=5)
[25]:
qs.plot()
[25]:
<Axes: title={'center': 'Quadrat Count'}>
../_images/user-guide_Quadrat_statistics_31_1.svg
[26]:
qs.chi2_pvalue
[26]:
np.float64(0.0010544670440019426)
[27]:
qsw = QStatistic(pp, nx=5, ny=5, window=window, realizations=99, rng=42)
[28]:
qsw.chi2_r_pvalue
[28]:
np.float64(0.37)
[29]:
qsw.plot()
[29]:
<Axes: title={'center': 'Quadrat Count'}>
../_images/user-guide_Quadrat_statistics_35_1.svg

5. Recap

In this notebook, we:

  • represented a point pattern as an (n, 2) NumPy coordinate array,

  • computed quadrat counts using rectangular and hexagonal tessellations over the MBB,

  • used a Pearson chi-square statistic with equal expected counts across included cells,

  • obtained either an analytical p-value or a Monte Carlo p-value from CSR realizations,

  • clarified the issues associated with irregular windows, and

  • ensured reproducibility of simulations via an explicit rng.