libpysal.graph.Graph¶

class libpysal.graph.Graph(adjacency, transformation='O', is_sorted=False)[source]¶

Graph class encoding spatial weights matrices

The Graph is currently experimental and its API is incomplete and unstable.

Weights base class based on adjacency list

It is recommenced to use one of the from_* or build_* constructors rather than invoking __init__ directly.

Each observation needs to be present in the focal, at least as a self-loop with a weight 0.

Parameters:¶

adjacency : pandas.Series¶

A MultiIndexed pandas.Series with "focal" and "neigbor" levels encoding adjacency, and values encoding weights. By convention, isolates are encoded as self-loops with a weight 0.

transformation : str, default "O"¶

weights transformation used to produce the table.

O – Original
B – Binary
R – Row-standardization (global sum \(=n\))
D – Double-standardization (global sum \(=1\))
V – Variance stabilizing
C – Custom

is_sorted : bool, default False¶

adjacency capturing the graph needs to be canonically sorted to initialize the class. The MultiIndex needs to be ordered i–>j on both focal and neighbor levels according to the order of ids in the original data from which the Graph is created. Sorting is performed by default based on the order of unique values in the focal level. Sorting needs to be reflected in both the values of the MultiIndex and also the underlying MultiIndex.codes. Set is_sorted=True to skip this step if the adjacency is already canonically sorted and you are certain about it.

Methods

`aggregate`(func)	Aggregate weights within a neighbor set
`apply`(y, func, **kwargs)	Apply a reduction across the neighbor sets
`assign_self_weight`([weight])	Assign values to edges representing self-weight.
`asymmetry`([intrinsic])	Asymmetry check.
`build_block_contiguity`(regimes)	Generate Graph from block contiguity (regime neighbors)
`build_contiguity`(geometry[, rook, ...])	Generate Graph from geometry based on contiguity
`build_distance_band`(data, threshold[, ...])	Generate Graph from geometry based on a distance band
`build_fuzzy_contiguity`(geometry[, ...])	Generate Graph from fuzzy contiguity
`build_h3`(ids[, order, weight])	Generate Graph from indices of H3 hexagons.
`build_kernel`(data[, kernel, k, bandwidth, ...])	Generate Graph from geometry data based on a kernel function
`build_knn`(data, k[, metric, p, coplanar, ...])	Generate Graph from geometry data based on k-nearest neighbors search
`build_raster_contiguity`(da[, rook, z_value, ...])	Generate Graph from `xarray.DataArray` raster object
`build_spatial_matches`(data, k[, metric, ...])	Match locations in one dataset to at least `k` locations in another (possibly identical) dataset by minimizing the total distance between matched locations.
`build_travel_cost`(df, network, threshold[, ...])	Generate a Graph based on shortest travel costs from a pandarm.Network
`build_triangulation`(data[, method, ...])	Generate Graph from geometry based on triangulation
`copy`([deep])	Make a copy of this Graph's adjacency table and transformation
`describe`(y[, q, statistics])	Describe the distribution of `y` values within the neighbors of each node.
`difference`(right)	Provide the set difference between the graph on the left and the graph on the right.
`eliminate_zeros`()	Remove graph edges with zero weight
`equals`(right)	Check that two graphs are identical.
`explore`(gdf[, focal, nodes, color, ...])	Plot graph as an interactive Folium Map
`from_W`(w)	Create an experimental Graph from libpysal.weights.W object
`from_adjacency`(adjacency[, focal_col, ...])	Create a Graph from a pandas DataFrame formatted as an adjacency list
`from_arrays`(focal_ids, neighbor_ids, weight, ...)	Generate Graph from arrays of indices and weights of the same length
`from_dense`(dense[, ids])	Convert a `numpy.ndarray` of a shape (N, N) to a PySAL `Graph` object.
`from_dicts`(neighbors[, weights])	Generate Graph from dictionaries of neighbors and weights
`from_lattice`([nrows, ncols, rook, index_type])	Create a Graph object for a regular lattice.
`from_networkx`(graph[, weight])	Generate a Graph from a NetworkX graph.
`from_sparse`(sparse[, ids])	Convert a `scipy.sparse` array to a PySAL `Graph` object.
`from_weights_dict`(weights_dict)	Generate Graph from a dict of dicts
`generate_da`(y)	Creates xarray.DataArray object from passed data aligned with the Graph.
`higher_order`([k, shortest_path, diagonal, ...])	Contiguity weights object of order \(k\).
`intersection`(right)	Returns a binary Graph, that includes only those neighbor pairs that exist in both left and right.
`intersects`(right)	Returns True if left and right share at least one link, irrespective of weights value.
`isomorphic`(right)	Check that two graphs are isomorphic.
`issubgraph`(right)	Return True if every link in the left Graph also occurs in the right Graph.
`lag`(y[, categorical, ties])	Spatial lag operator
`make_symmetric`([intersection, reduction])	Create a symmetric version of this graph
`plot`(gdf[, focal, nodes, color, edge_kws, ...])	Plot edges and nodes of the Graph
`subgraph`(ids)	Returns a subset of Graph containing only nodes specified in ids
`summary`([asymmetries])	Summary of the Graph properties
`symmetric_difference`(right)	Filter out links that are in both left and right Graph objects.
`to_W`()	Convert Graph to a libpysal.weights.W object
`to_gal`(path)	Save Graph to a GAL file
`to_gwt`(path)	Save Graph to a GWT file
`to_networkx`()	Convert Graph to a `networkx` graph.
`to_parquet`(path, **kwargs)	Save Graph to a Apache Parquet
`transform`(transformation)	Transformation of weights
`union`(right)	Provide the union of two Graph objects, collecing all links that are in either graph.

Attributes

`adjacency`	Return a copy of the adjacency list
`cardinalities`	Number of neighbors for each observation
`component_labels`	Get component labels per observation
`index_pairs`	Return focal-neighbor index pairs
`isolates`	Index of observations with no neighbors
`n`	Number of observations.
`n_components`	Get a number of connected components
`n_edges`	Number of edges.
`n_nodes`	Number of nodes.
`neighbors`	Get neighbors dictionary
`nonzero`	Number of nonzero weights.
`pct_nonzero`	Percentage of nonzero weights.
`sparse`	Return a scipy.sparse array (CSR)
`unique_ids`	Unique IDs used in the Graph
`weights`	Get weights dictionary

property adjacency[source]¶

Return a copy of the adjacency list

Returns:¶: Underlying adjacency list
Return type:¶: pandas.Series

aggregate(func)[source]¶

Aggregate weights within a neighbor set

Apply a custom aggregation function to a group of weights of the same focal geometry.

Parameters:¶

func : callable¶: A callable accepted by pandas groupby.agg method

Returns:¶

Aggregated weights

Return type:¶

pd.Series

apply(y, func, **kwargs)[source]¶

Apply a reduction across the neighbor sets

Applies func over groups of y defined by neighbors for each focal.

Parameters:¶

y : array_like¶: array of values to be grouped. Can be 1-D or 2-D and will be coerced to a pandas object
func : function, str, list, dict or None¶: Function to use for aggregating the data passed to pandas GroupBy.apply.

Returns:¶

pandas object indexed by unique_ids

Return type:¶

Series | DataFrame

assign_self_weight(weight=1)[source]¶

Assign values to edges representing self-weight.

The value for each focal == neighbor location in the graph is set to weight.

Parameters:¶

weight : float | array-like¶: Defines the value(s) to which the weight representing the relationship with itself should be set. If a constant is passed then each self-weight will get this value (default is 1). An array of length Graph.n can be passed to set explicit values to each self-weight (assumed to be in the same order as original data).

Returns:¶

A new Graph with added self-weights.

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity_weights = contiguity.assign_self_weight(0.5)
>>> contiguity_weights.adjacency
focal          neighbor
Staten Island  Staten Island    0.5
Queens         Queens           0.5
               Brooklyn         1.0
               Manhattan        1.0
               Bronx            1.0
Brooklyn       Queens           1.0
               Brooklyn         0.5
               Manhattan        1.0
Manhattan      Queens           1.0
               Brooklyn         1.0
               Manhattan        0.5
               Bronx            1.0
Bronx          Queens           1.0
               Manhattan        1.0
               Bronx            0.5
Name: weight, dtype: float64

asymmetry(intrinsic=True)[source]¶

Asymmetry check.

Parameters:¶

intrinsic : bool, optional¶

Default is True. Intrinsic symmetry is defined as:

\[w_{i,j} == w_{j,i}\]

If intrinsic is False symmetry is defined as:

\[i \in N_j \ \& \ j \in N_i\]

where \(N_j\) is the set of neighbors for \(j\), e.g., True requires equality of the weight to consider two links equal, False requires only a presence of a link with a non-zero weight.

Returns:¶

A Series of (i,j) pairs of asymmetries sorted ascending by the focal observation (index value), where i is the focal and j is the neighbor. An empty Series is returned if no asymmetries are found.

Return type:¶

pandas.Series

classmethod build_block_contiguity(regimes)[source]¶

Generate Graph from block contiguity (regime neighbors)

Block contiguity structures are relevant when defining neighbor relations based on membership in a regime. For example, all counties belonging to the same state could be defined as neighbors, in an analysis of all counties in the US.

Parameters:¶

regimes : list-like¶: list-like of regimes. If pandas.Series, its index is used to encode Graph. Otherwise a default RangeIndex is used.

Returns:¶

libpysal.graph.Graph encoding block contiguity

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> france = gpd.read_file(get_path('geoda guerry')).set_index('Dprmnt')

In the GeoDa Guerry dataset, the Region column reflects the region (North, East, West, South or Central) to which each department belongs.

>>> france[['Region', 'geometry']].head()
             Region                                           geometry
Dprtmnt
Ain               E  POLYGON ((801150.000 2092615.000, 800669.000 2...
Aisne             N  POLYGON ((729326.000 2521619.000, 729320.000 2...
Allier            C  POLYGON ((710830.000 2137350.000, 711746.000 2...
Basses-Alpes      E  POLYGON ((882701.000 1920024.000, 882408.000 1...
Hautes-Alpes      E  POLYGON ((886504.000 1922890.000, 885733.000 1...

Using the "Region" labels as regimes then identifies all departments within the region as neighbors.

>>> block_contiguity = graph.Graph.build_block_contiguity(france['Region'])
>>> block_contiguity.adjacency
focal   neighbor
Ain     Basses-Alpes       1
        Hautes-Alpes       1
        Aube               1
        Cote-d'Or          1
        Doubs              1
                          ..
Vienne  Mayenne            1
        Morbihan           1
        Basses-Pyrenees    1
        Deux-Sevres        1
        Vendee             1
Name: weight, Length: 1360, dtype: int32

classmethod build_contiguity(geometry, rook=True, by_perimeter=False, strict=False)[source]¶

Generate Graph from geometry based on contiguity

Contiguity builder assumes that all geometries are forming a coverage, i.e. a non-overlapping mesh and neighbouring geometries share only points or segments of their exterior boundaries. In practice, build_contiguity is capable of creating a Graph of partially overlapping geometries when strict=False, by_perimeter=False, but that would not strictly follow the definition of queen or rook contiguity.

Parameters:¶

geometry : array-like of shapely.Geometry objects¶: Could be geopandas.GeoSeries or geopandas.GeoDataFrame, in which case the resulting Graph is indexed by the original index. If an array of shapely.Geometry objects is passed, Graph will assume a RangeIndex.
rook : bool, optional¶: Contiguity method. If True, two geometries are considered neighbours if they share at least one edge. If False, two geometries are considered neighbours if they share at least one vertex. By default True
by_perimeter : bool, optional¶: If True, weight represents the length of the shared boundary between adjacent units, by default False. For row-standardized version of perimeter weights, use Graph.build_contiguity(gdf, by_perimeter=True).transform("r").
strict : bool, optional¶: Use the strict topological method. If False, the contiguity is determined based on shared coordinates or coordinate sequences representing edges. This assumes geometry coverage that is topologically correct. This method is faster but can miss some relations. If True, the contiguity is determined based on geometric relations that do not require precise topology. This method is slower but will result in correct contiguity even if the topology of geometries is not optimal. By default False.

Returns:¶

libpysal.graph.Graph encoding contiguity weights

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Queens           1
               Manhattan        1
Manhattan      Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64

Weight by perimeter instead of binary weights:

>>> contiguity_perimeter = graph.Graph.build_contiguity(nybb, by_perimeter=True)
>>> contiguity_perimeter.adjacency
focal          neighbor
Staten Island  Staten Island        0.000000
Queens         Brooklyn         50867.502055
               Manhattan          103.745207
               Bronx                5.777002
Brooklyn       Queens           50867.502055
               Manhattan         5736.546898
Manhattan      Queens             103.745207
               Brooklyn          5736.546898
               Bronx             5258.300879
Bronx          Queens               5.777002
               Manhattan         5258.300879
Name: weight, dtype: float64

classmethod build_distance_band(data, threshold, binary=True, alpha=-1.0, kernel=None, bandwidth=None, taper=True, decay=False, tree=None)[source]¶

Generate Graph from geometry based on a distance band

Parameters:¶

data : numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame¶: geometries containing locations to compute the delaunay triangulation. If a geopandas object with Point geometry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geometry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.
threshold : float¶: distance band
binary : bool, optional¶: If True \(w_{ij}=1\) if \(d_{i,j}<=threshold\), otherwise \(w_{i,j}=0\). If False \(wij=dij^{alpha}\), by default True.
alpha : float, optional¶: distance decay parameter for weight (default -1.0) if alpha is positive the weights will not decline with distance. Ignored if binary=True or kernel is not None.
kernel : str, optional¶: kernel function to use in order to weight the output graph. See Graph.build_kernel() for details. Ignored if binary=True.
bandwidth : float (default: None)¶: distance to use in the kernel computation. Should be on the same scale as the input coordinates. Ignored if binary=True or kernel=None.
taper : bool (default: True)¶: remove links with a weight equal to zero
decay : bool (default: False)¶: whether to calculate the kernel using the decay formulation. In the decay form, a kernel measures the distance decay in similarity between observations. It varies from from maximal similarity (1) at a distance of zero to minimal similarity (0 or negative) at some very large (possibly infinite) distance. Otherwise, kernel functions are treated as proper volume-preserving probability distributions.
tree : scipy.spatial.KDTree, optional¶: A pre-built scipy KDTree for distance computation. If provided, the tree’s data will be used as coordinates. This avoids rebuilding the tree when it has already been constructed. Note that only scipy.spatial.KDTree is supported for distance band computation.

Returns:¶

libpysal.graph.Graph encoding distance band weights

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

Note that the method requires point geometry (or an array of coordinates representing points) as an input.

The threshold distance is in the units of the geometry projection. You can check it using the nybb.crs property.

>>> distance_band = graph.Graph.build_distance_band(nybb.centroid, 45000)
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
Brooklyn       Queens           1
Manhattan      Bronx            1
Bronx          Manhattan        1
Name: weight, dtype: int64

The larger threshold yields more neighbors.

>>> distance_band = graph.Graph.build_distance_band(nybb.centroid, 110000)
>>> distance_band.adjacency
focal          neighbor
Staten Island  Queens           1
               Brooklyn         1
               Manhattan        1
Queens         Staten Island    1
               Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Staten Island    1
               Queens           1
               Manhattan        1
               Bronx            1
Manhattan      Staten Island    1
               Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Brooklyn         1
               Manhattan        1
Name: weight, dtype: int64

Instead of binary weights you can use inverse distance.

>>> distance_band = graph.Graph.build_distance_band(
...     nybb.centroid,
...     45000,
...     binary=False,
... )
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0.000000
Queens         Brooklyn         0.000024
Brooklyn       Queens           0.000024
Manhattan      Bronx            0.000026
Bronx          Manhattan        0.000026
Name: weight, dtype: float64

Or specify the kernel function to derive weight from the distance.

>>> distance_band = graph.Graph.build_distance_band(
...     nybb.centroid,
...     45000,
...     binary=False,
...     kernel='bisquare',
...     bandwidth=60000,
... )
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0.000000
Queens         Brooklyn         0.232079
Brooklyn       Queens           0.232079
Manhattan      Bronx            0.309825
Bronx          Manhattan        0.309825
Name: weight, dtype: float64

classmethod build_fuzzy_contiguity(geometry, tolerance=None, buffer=None, predicate='intersects', **kwargs)[source]¶

Generate Graph from fuzzy contiguity

Fuzzy contiguity relaxes the notion of contiguity neighbors for the case of geometry collections that violate the condition of planar enforcement. It handles three types of conditions present in such collections that would result in missing links when using the regular contiguity methods.

The first are edges for nearby polygons that should be shared, but are digitized separately for the individual polygons and the resulting edges do not coincide, but instead the edges intersect. This case can also be covered by build_contiguty with the strict=False parameter.

The second case is similar to the first, only the resultant edges do not intersect but are “close”. The optional buffering of geometry then closes the gaps between the polygons and a resulting intersection is encoded as a link.

The final case arises when one polygon is “inside” a second polygon but is not encoded to represent a hole in the containing polygon.

It is also possible to create a contiguity based on a custom spatial predicate.

Parameters:¶

geometry : array-like of shapely.Geometry objects¶: Could be geopandas.GeoSeries or geopandas.GeoDataFrame, in which case the resulting Graph is indexed by the original index. If an array of shapely.Geometry objects is passed, Graph will assume a RangeIndex.
tolerance : float, optional¶: The percentage of the length of the minimum side of the bounding rectangle for the geometry to use in determining the buffering distance. Either tolerance or buffer may be specified but not both. By default None.
buffer : float, optional¶: Exact buffering distance in the units of geometry.crs. Either tolerance or buffer may be specified but not both. By default None.
predicate : str, optional¶: The predicate to use for determination of neighbors. Default is ‘intersects’. If None is passed, neighbours are determined based on the intersection of bounding boxes. See the documentation of geopandas.GeoSeries.sindex.query for allowed predicates.
**kwargs¶: Keyword arguments passed to geopandas.GeoSeries.buffer.

Returns:¶

libpysal.graph.Graph encoding fuzzy contiguity

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

Example using the default parameters:

>>> fuzzy_contiguity = graph.Graph.build_fuzzy_contiguity(nybb)
>>> fuzzy_contiguity
<Graph of 5 nodes and 10 nonzero edges indexed by
 ['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

Example using the tolerance of 0.05:

>>> fuzzy_contiguity = graph.Graph.build_fuzzy_contiguity(nybb, tolerance=0.05)
>>> fuzzy_contiguity
<Graph of 5 nodes and 12 nonzero edges indexed by
 ['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

Example using a buffer of 10000 feet (CRS of nybb is in feet):

>>> fuzzy_contiguity = graph.Graph.build_fuzzy_contiguity(nybb, buffer=10000)
>>> fuzzy_contiguity
<Graph of 5 nodes and 14 nonzero edges indexed by
 ['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

classmethod build_h3(ids, order=1, weight='distance')[source]¶

Generate Graph from indices of H3 hexagons.

Encode a graph from a set of H3 hexagons. The graph is generated by considering the H3 hexagons as nodes and connecting them based on their contiguity. The contiguity is defined by the order parameter, which specifies the number of steps to consider as neighbors. The weight parameter defines the type of weight to assign to the edges.

Requires the h3 library.

Parameters:¶

ids : array-like¶

Array of H3 IDs encoding focal geometries

order : int, optional¶

Order of contiguity, by default 1

weight : str, optional¶

Type of weight. Options are:

distance: raw topological distance between cells
binary: 1 for neighbors, 0 for non-neighbors
inverse: 1 / distance between cells

By default “distance”.

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from tobler.util import h3fy
>>> gdf = gpd.read_file(get_path("geoda guerry"))
>>> h3 = h3fy(gdf, resolution=4)
>>> h3.head()
                                                          geometry
hex_id
841f94dffffffff  POLYGON ((609346.657 2195981.397, 604556.817 2...
841fa67ffffffff  POLYGON ((722074.162 2561038.244, 717442.706 2...
84186a3ffffffff  POLYGON ((353695.287 2121176.341, 329999.974 2...
8418609ffffffff  POLYGON ((387747.482 2509794.492, 364375.032 2...
8418491ffffffff  POLYGON ((320872.289 1846157.662, 296923.464 1...

>>> h3_contiguity = graph.Graph.build_h3(h3.index)
>>> h3_contiguity
<Graph of 320 nodes and 1740 nonzero edges indexed by
 ['841f94dffffffff', '841fa67ffffffff', '84186a3ffffffff', ...]>

classmethod build_kernel(data, kernel='gaussian', k=None, bandwidth=None, metric='euclidean', p=2, coplanar='raise', taper=True, decay=False, tree=None)[source]¶

Generate Graph from geometry data based on a kernel function

Parameters:¶

data : numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame¶

geometries over which to compute a kernel. If a geopandas object with Point geoemtry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geoemtry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates. If metric=”precomputed”, data is assumed to contain a precomputed distance metric.

kernel : string or callable (default: 'gaussian')¶

kernel function to apply over the distance matrix computed by metric. The following kernels are supported:

"triangular":
"parabolic":
"gaussian":
"bisquare":
"cosine":
'boxcar'/discrete: all distances less than bandwidth are 1, and all other distances are 0
"identity"/None : do nothing, weight similarity based on raw distance
callable : a user-defined function that takes the distance vector and the bandwidth and returns the kernel: kernel(distances, bandwidth)

k : int (default: None)¶

number of nearest neighbors used to truncate the kernel. This is assumed to be constant across samples. If None, no truncation is conduted.

bandwidth : float or "auto" or "adaptive" (default: None)¶

distance to use in the kernel computation. Should be on the same scale as the input coordinates. If “adaptive”, a per-observation bandwidth is used equal to each observation’s distance to its k-th nearest neighbor. This requires k to be set. If “auto”, the bandwidth is optimized as a function of entropy for a given kernel function. This ensures that the entropy of the kernel is maximized for a given distance matrix. This will result in the smoothing that provide the most uniform distribution of kernel values, which is a good proxy for a “moderate” level of smoothing.

metric : string or callable (default: 'euclidean')¶

distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.

p : int (default: 2)¶

parameter for minkowski metric, ignored if metric != “minkowski”.

coplanar : str, optional (default "raise")¶

Method for handling coplanar points when k is not None. Options are 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), & 'clique' (induce fully-connected sub cliques for coplanar points).

taper : bool (default: True)¶

remove links with a weight equal to zero

decay : bool (default: False)¶

whether to calculate the kernel using the decay formulation. In the decay form, a kernel measures the distance decay in similarity between observations. It varies from from maximal similarity (1) at a distance of zero to minimal similarity (0 or negative) at some very large (possibly infinite) distance. Otherwise, kernel functions are treated as proper volume-preserving probability distributions.

tree : scipy.spatial.KDTree, sklearn.neighbors.KDTree, sklearn.neighbors.BallTree, optional¶

A pre-built tree for distance computation. If provided, the tree’s data will be used as coordinates. This avoids rebuilding the tree when it has already been constructed.

Returns:¶

libpysal.graph.Graph encoding kernel weights

Return type:¶

Graph

classmethod build_knn(data, k, metric='euclidean', p=2, coplanar='raise', taper=True, decay=False, tree=None)[source]¶

Generate Graph from geometry data based on k-nearest neighbors search

Parameters:¶

data : numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame¶: geometries over which to compute a kernel. If a geopandas object with Point geoemtry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geoemtry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.
k : int¶: number of nearest neighbors.
metric : string or callable (default: 'euclidean')¶: distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.
p : int (default: 2)¶: parameter for minkowski metric, ignored if metric != “minkowski”.
coplanar : str, optional (default "raise")¶: Method for handling coplanar points. Options include 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), & 'clique' (induce fully-connected sub cliques for coplanar points).
taper : bool (default: True)¶: remove links with a weight equal to zero
decay : bool (default: False)¶: whether to calculate the kernel using the decay formulation. In the decay form, a kernel measures the distance decay in similarity between observations. It varies from from maximal similarity (1) at a distance of zero to minimal similarity (0 or negative) at some very large (possibly infinite) distance. Otherwise, kernel functions are treated as proper volume-preserving probability distributions.
tree : scipy.spatial.KDTree, sklearn.neighbors.KDTree, sklearn.neighbors.BallTree, optional¶: A pre-built tree for distance computation. If provided, the tree’s data will be used as coordinates. This avoids rebuilding the tree when it has already been constructed.

Returns:¶

libpysal.graph.Graph encoding KNN weights

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index('BoroName')
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...

>>> knn3 = graph.Graph.build_knn(nybb.centroid, k=3)
>>> knn3.adjacency
focal           neighbor
Staten Island   Queens           1
                Brooklyn         1
                Manhattan        1
Queens          Brooklyn         1
                Manhattan        1
                Bronx            1
Brooklyn        Staten Island    1
                Queens           1
                Manhattan        1
Manhattan       Queens           1
                Brooklyn         1
                Bronx            1
Bronx           Queens           1
                Brooklyn         1
                Manhattan        1
Name: weight, dtype: int32

Specifying k=1 identifies the nearest neighbor (note that this can be asymmetrical):

>>> knn1 = graph.Graph.build_knn(nybb.centroid, k=1)
>>> knn1.adjacency
focal          neighbor
Staten Island  Brooklyn     1
Queens         Brooklyn     1
Brooklyn       Queens       1
Manhattan      Bronx        1
Bronx          Manhattan    1
Name: weight, dtype: int32

classmethod build_raster_contiguity(da, rook=False, z_value=None, coords_labels=None, k=1, include_nodata=False, n_jobs=1)[source]¶

Generate Graph from xarray.DataArray raster object

Create Graph object encoding contiguity of raster cells from xarray.DataArray object. The coordinates are flatten to tuples representing the location of each cell within the raster.

Parameters:¶

da : xarray.DataArray¶: Input 2D or 3D DataArray with shape=(z, y, x)
rook : bool, optional¶: Contiguity method. If True, two cells are considered neighbours if they share at least one edge. If False, two geometries are considered neighbours if they share at least one vertex. By default True
z_value : {int, str, float}, optional¶: Select the z_value of 3D DataArray with multiple layers. By default None
coords_labels : dict, optional¶: Pass dimension labels for coordinates and layers if they do not belong to default dimensions, which are (band/time, y/lat, x/lon) e.g. coords_labels = {"y_label": "latitude", "x_label": "longitude", "z_label": "year"} When None, defaults to empty dictionary.
k : int, optional¶: Order of contiguity, this will select all neighbors up to k-th order. Default is 1.
include_nodata : bool, optional¶: If True, missing values will be assumed as non-missing when selecting higher_order neighbors, Default is False
n_jobs : int, optional¶: Number of cores to be used in the sparse weight construction. If -1, all available cores are used. Default is 1. Requires joblib.

Returns:¶

libpysal.graph.Graph encoding raster contiguity

Return type:¶

Graph

classmethod build_spatial_matches(data, k, metric='euclidean', solver=None, allow_partial_match=False, **metric_kwargs)[source]¶

Match locations in one dataset to at least k locations in another (possibly identical) dataset by minimizing the total distance between matched locations.

Letting \(d_{ij}\) be

\[ \begin{align}\begin{aligned}\text{minimize} \sum_i^n \sum_j^n d_{ij}m_{ij}\\\text{subject to} \sum_j^n m_{ij} >= k \forall i\\ m_{ij} \in {0,1} \forall ij\end{aligned}\end{align} \]

Parameters:¶

data : numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame¶: Geometries that need matches. If a geopandas object is provided, the .geometry attribute is used. If a numpy.ndarray with a geometry dtype is used, then the coordinates are extracted and used.
k : int¶: Number of matches for each observation.
metric : string or callable (default: 'euclidean')¶: distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.
solver : solver from pulp (default: None)¶: a solver defined by the pulp optimization library. If no solver is provided, pulp’s default solver will be used. This is generally pulp.COIN(), but this may vary depending on your configuration.
allow_partial_match : bool (default: False)¶: whether to allow for partial matching. A partial match may have a weight between zero and one, while a “full” match (by default) must have a weight of either zero or one. A partial matching may have a shorter total distance, but will result in a weighted graph.

classmethod build_travel_cost(df, network, threshold, kernel=None, mapping_distance=None, taper=True, decay=False)[source]¶

Generate a Graph based on shortest travel costs from a pandarm.Network

Parameters:¶

df : geopandas.GeoDataFrame¶: geodataframe representing observations which are snapped to the nearest node in the pandarm.Network. CRS should be the same as the locations of node_x and node_y in the pandarm.Network (usually 4326 if network comes from OSM, but sometimes projected to improve snapping quality).
network : pandarm.Network¶: pandarm Network object describing travel costs between nodes in the study area. See <https://oturns.github.io/pandarm/> for more
threshold : int¶: threshold representing maximum cost distances. This is measured in the same units as the pandarm.Network (not influenced by the df.crs in any way). For travel modes with relatively constant speeds like walking or biking, this is usually distance (e.g. meters if the Network is constructed from OSM). For a a multimodal or auto network with variable travel speeds, this is usually some measure of travel time
kernel : str or callable, optional¶: kernel transformation applied to the weights. See libpysal.graph.Graph.build_kernel for more information on kernel transformation options. Default is None, in which case the Graph weight is pure distance between focal and neighbor
mapping_distance : int¶: snapping tolerance passed to pandarm.Network.get_node_ids that defines the maximum range at which observations are snapped to nearest nodes in the network. Default is None
taper : bool (default: True)¶: remove links with a weight equal to zero
decay : bool (default: False)¶: whether to calculate the kernel using the decay formulation. In the decay form, a kernel measures the distance decay in similarity between observations. It varies from from maximal similarity (1) at a distance of zero to minimal similarity (0 or negative) at some very large (possibly infinite) distance. Otherwise, kernel functions are treated as proper volume-preserving probability distributions.

Return type:¶

Graph

Examples

>>> import geodatasets
>>> import geopandas as gpd
>>> import osmnx as ox
>>> import pandarm

Read an example geodataframe:

>>> df = gpd.read_file(geodatasets.get_path("geoda Cincinnati")).to_crs(4326)

Download a walk network using osmnx

>>> osm_graph = ox.graph_from_polygon(df.union_all(), network_type="walk")
>>> nodes, edges = ox.utils_graph.graph_to_gdfs(osm_graph)
>>> edges = edges.reset_index()

Generate a routable pandarm network from the OSM nodes and edges

>>> network = pandarm.Network(
>>>     edge_from=edges["u"],
>>>     edge_to=edges["v"],
>>>     edge_weights=edges[["length"]],
>>>     node_x=nodes["x"],
>>>     node_y=nodes["y"],)

Use the pandarm network to compute shortest paths between gdf centroids and generate a Graph

>>> G = Graph.build_travel_cost(df.set_geometry(df.centroid), network, 500)
>>> G.adjacency.head()
focal  neighbor
0       62          385.609009
        65          309.471985
        115         346.858002
        116           0.000000
        117         333.639008
Name: weight, dtype: float64

classmethod build_triangulation(data, method='delaunay', bandwidth=inf, kernel='boxcar', clip='bounding_box', rook=True, coplanar='raise', taper=True, decay=False)[source]¶

Generate Graph from geometry based on triangulation

Parameters:¶

data : numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame¶

geometries containing locations to compute the delaunay triangulation. If a geopandas object with Point geoemtry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geoemtry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.

method : str, (default "delaunay")¶

method of extracting the weights from triangulation. Supports:

"delaunay"
"gabriel"
"relative_neighborhood"
"voronoi"

bandwidth : float, optional¶

distance to use in the kernel computation. Should be on the same scale as the input coordinates, by default numpy.inf

kernel : str, optional¶

kernel function to use in order to weight the output graph. See Graph.build_kernel() for details. By default “boxcar”

clip : str (default: 'bbox')¶

Clipping method when method="voronoi". Ignored otherwise. Default is 'bounding_box'. Options are as follows.

None: No clip is applied. Voronoi cells may be arbitrarily larger that the source map. Note that this may lead to cells that are many orders of magnitude larger in extent than the original map. Not recommended.
'bounding_box': Clip the voronoi cells to the bounding box of the input points.
'convex_hull': Clip the voronoi cells to the convex hull of the input points.
'alpha_shape': Clip the voronoi cells to the tightest hull that contains all points (e.g. the smallest alpha shape, using libpysal.cg.alpha_shape_auto()).
shapely.Polygon: Clip to an arbitrary Polygon.

rook : bool, optional¶

Contiguity method when method="voronoi". Ignored otherwise. If True, two geometries are considered neighbours if they share at least one edge. If False, two geometries are considered neighbours if they share at least one vertex. By default True

coplanar : str, optional (default "raise")¶

Method for handling coplanar points. Options include 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), & 'clique' (induce fully-connected sub cliques for coplanar points).

taper : bool (default: True)¶

remove links with a weight equal to zero

decay : bool (default: False)¶

Returns:¶

libpysal.graph.Graph encoding triangulation weights

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

Note that the method requires point geometry (or an array of coordinates representing points) as an input.

>>> triangulation = graph.Graph.build_triangulation(nybb.centroid)
>>> triangulation.adjacency
focal          neighbor
Staten Island  Brooklyn         1
               Manhattan        1
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Staten Island    1
               Queens           1
               Manhattan        1
Manhattan      Staten Island    1
               Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64

property cardinalities[source]¶

Number of neighbors for each observation

Returns:¶: Series with a number of neighbors per each observation
Return type:¶: pandas.Series

property component_labels[source]¶

Get component labels per observation

Returns:¶: Series of component labels
Return type:¶: pandas.Series

copy(deep=True)[source]¶

Make a copy of this Graph’s adjacency table and transformation

Parameters:¶

deep : bool, optional¶: Make a deep copy of the adjacency table, by default True

Returns:¶

libpysal.graph.Graph as a copy of the original

Return type:¶

Graph

describe(y, q=None, statistics=None)[source]¶

Describe the distribution of y values within the neighbors of each node.

Given the graph, computes the descriptive statistics of values within the neighbourhood of each node. Optionally, the values can be limited to a certain quantile range before computing the statistics.

Notes

The index of values must match the index of the graph.

Weight values do not affect the calculations, only adjacency does.

Returns numpy.nan for all isolates.

The numba package is used extensively in this function to accelerate the computation of statistics. Without numba, these computations may become slow on large data.

Parameters:¶

y : NDArray[np.float64] | Series¶: An 1D array of numeric values to be described.
q : tuple[float, float] | None, optional¶: Tuple of percentages for the percentiles to compute. Values must be between 0 and 100 inclusive. When set, values below and above the percentiles will be discarded before computation of the statistics. The percentiles are computed for each neighborhood. By default None.
statistics : list[str] | None¶: A list of stats functions to compute. If None, compute all available functions - “count”, “mean”, “median”, “std”, “min”, “max”, “sum”, “nunique”, “mode”. By default None.

Returns:¶

A DataFrame with descriptive statistics.

Return type:¶

DataFrame

difference(right)[source]¶: Provide the set difference between the graph on the left and the graph on the right. This returns all links in the left graph that are not in the right graph.

eliminate_zeros()[source]¶

Remove graph edges with zero weight

Eliminates edges with weight == 0 that do not encode an isolate. This is useful to clean-up edges that will make no effect in operations like lag().

Returns:¶: subset of Graph with zero-weight edges eliminated
Return type:¶: Graph

equals(right)[source]¶

Check that two graphs are identical. This reqiures them to have 1. the same edge labels and node labels 2. in the same order 3. with the same weights

This is implemented by comparing the underlying adjacency series.

This is equivalent to checking whether the sorted list of edge tuples (focal, neighbor, weight) for the two graphs are the same.

explore(gdf, focal=None, nodes=True, color='black', edge_kws=None, node_kws=None, focal_kws=None, m=None, **kwargs)[source]¶

Plot graph as an interactive Folium Map

Parameters:¶

gdf : geopandas.GeoDataFrame¶: geodataframe used to instantiate to Graph
focal : list, optional¶: subset of focal observations to plot in the map, by default None. If none, all relationships are plotted
nodes : bool, optional¶: whether to display observations as nodes in the map, by default True
color : str, optional¶: color applied to nodes and edges, by default “black”
edge_kws : dict, optional¶: additional keyword arguments passed to geopandas explore function when plotting edges, by default None
node_kws : dict, optional¶: additional keyword arguments passed to geopandas explore function when plotting nodes, by default None
focal_kws : dict, optional¶: additional keyword arguments passed to geopandas explore function when plotting focal observations, by default None. Only applicable when passing a subset of nodes with the focal argument
m : Folilum.Map, optional¶: folium map objecto to plot on top of, by default None
**kwargs : dict, optional¶: additional keyword arguments are passed directly to geopandas.explore, when m=None by default None

Returns:¶

folium map

Return type:¶

folium.Map

classmethod from_W(w)[source]¶

Create an experimental Graph from libpysal.weights.W object

Parameters:¶

w : libpysal.weights.W¶

Returns:¶

libpysal.graph.Graph from W

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> queen_w = weights.Queen.from_dataframe(nybb, use_index=True)
>>> queen_graph = graph.Graph.from_W(queen_w)
>>> queen_graph
<Graph of 5 nodes and 10 nonzero edges indexed by
 ['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

classmethod from_adjacency(adjacency, focal_col='focal', neighbor_col='neighbor', weight_col='weight')[source]¶

Create a Graph from a pandas DataFrame formatted as an adjacency list

Parameters:¶

adjacency : pandas.DataFrame¶: a dataframe formatted as an ajacency list. Should have columns “focal”, “neighbor”, and “weight”, or columns that can be mapped to these (e.g. origin, destination, cost)
focal_col : str, optional¶: name of column holding focal/origin index, by default ‘focal’
neighbor_col : str, optional¶: name of column holding neighbor/destination index, by default ‘neighbor’
weight_col : str, optional¶: name of column holding weight values, by default ‘weight’

Returns:¶

libpysal.graph.Graph

Return type:¶

Graph

classmethod from_arrays(focal_ids, neighbor_ids, weight, **kwargs)[source]¶

Generate Graph from arrays of indices and weights of the same length

The arrays needs to be sorted in a way ensuring that focal_ids.unique() is equal to the index of original observations from which the Graph is being built

Parameters:¶

focal_ids : array-like¶: focal indices
neighbor_ids : array-like¶: neighbor indices
weight : array-like¶: weights
**kwargs¶: keyword arguments passed to the class constructor

Returns:¶

libpysal.graph.Graph based on arrays

Return type:¶

Graph

classmethod from_dense(dense, ids=None)[source]¶

Convert a numpy.ndarray of a shape (N, N) to a PySAL Graph object.

Parameters:¶

dense : numpy.ndarray¶: dense representation of a graph
ids : list-like, default None¶: list-like of ids for geometries that is mappable to positions from dense. If None, the positions are used as labels.

Returns:¶

libpysal.graph.Graph based on dense

Return type:¶

Graph

classmethod from_dicts(neighbors, weights=None)[source]¶

Generate Graph from dictionaries of neighbors and weights

Parameters:¶

neighbors : dict¶: dictionary of neighbors with the {focal: [neighbor1, neighbor2]} structure
weights : dict, optional¶: dictionary of neighbors with the {focal: [weight1, weight2]} structure. If None, assumes binary weights.

Returns:¶

libpysal.graph.Graph based on dictionaries

Return type:¶

Graph

Examples

>>> neighbors = {
...     'Africa': ['Asia'],
...     'Asia': ['Africa', 'Europe'],
...     'Australia': [],
...     'Europe': ['Asia'],
...     'North America': ['South America'],
...     'South America': ['North America'],
... }
>>> connectivity = graph.Graph.from_dicts(neighbors)
>>> connectivity.adjacency
focal          neighbor
Africa         Asia             1
Asia           Africa           1
               Europe           1
Australia      Australia        0
Europe         Asia             1
North America  South America    1
South America  North America    1
Name: weight, dtype: float64

You can also specify weights (for example based on the length of the shared border):

>>> weights = {
...     'Africa': [1],
...     'Asia': [0.2, 0.8],
...     'Australia': [],
...     'Europe': [1],
...     'North America': [1],
...     'South America': [1],
... }
>>> connectivity = graph.Graph.from_dicts(neighbors, weights)
>>> connectivity.adjacency
focal          neighbor
Africa         Asia             1.0
Asia           Africa           0.2
               Europe           0.8
Australia      Australia        0.0
Europe         Asia             1.0
North America  South America    1.0
South America  North America    1.0
Name: weight, dtype: float64

classmethod from_lattice(nrows=5, ncols=5, rook=True, index_type='int')[source]¶

Create a Graph object for a regular lattice.

Parameters:¶

nrows : int, default 5¶: Number of rows.
ncols : int, default 5¶: Number of columns.
rook : bool, default True¶: Type of contiguity. If False, queen contiguity is used.
index_type : {"int", "float", "string"}, default "int"¶: Type of index IDs to use in the final Graph object.

Returns:¶

Graph encoding lattice contiguity.

Return type:¶

Graph

Examples

>>> Graph.from_lattice()
<Graph of 25 nodes and 80 nonzero edges (1 component, 0 isolates) indexed by
 [0, 1, 2, 3, 4, ...]>

>>> Graph.from_lattice(3, 5, index_type="string")
<Graph of 15 nodes and 44 nonzero edges (1 component, 0 isolates) indexed by
 ['id0', 'id1', 'id2', 'id3', 'id4', ...]>

classmethod from_networkx(graph, weight=None)[source]¶

Generate a Graph from a NetworkX graph.

Parameters:¶

graph¶: representation of the graph as a networkx.Graph or networkx.DiGraph. Multi-graphs are not supported as they do not translate to a unique weight between two nodes.
weight : str | None, default None¶: name of the edge attribute to use as weights.

Returns:¶

libpysal.graph.Graph based on NetworkX graph

Return type:¶

Graph

Examples

>>> import networkx as nx
>>> nx_graph = nx.path_graph(5)
>>> g = graph.Graph.from_networkx(nx_graph)
>>> g.n
5

classmethod from_sparse(sparse, ids=None)[source]¶

Convert a scipy.sparse array to a PySAL Graph object.

Parameters:¶

sparse : scipy.sparse array¶: sparse representation of a graph
ids : list-like, default None¶: list-like of ids for geometries that is mappable to positions from sparse. If None, the positions are used as labels.

Returns:¶

libpysal.graph.Graph based on sparse

Return type:¶

Graph

classmethod from_weights_dict(weights_dict)[source]¶

Generate Graph from a dict of dicts

Parameters:¶

weights_dict : dictionary of dictionaries¶: weights dictionary with the {focal: {neighbor: weight}} structure.

Returns:¶

libpysal.graph.Graph based on weights dictionary of dictionaries

Return type:¶

Graph

generate_da(y)[source]¶

Creates xarray.DataArray object from passed data aligned with the Graph.

Parameters:¶

y : array_like¶: flat array that shall be reshaped into a DataArray with dimensionality conforming to Graph

Returns:¶

instance of xarray.DataArray that can be aligned with the DataArray from which Graph was built

Return type:¶

xarray.DataArray

higher_order(k=2, shortest_path=True, diagonal=False, lower_order=False)[source]¶

Contiguity weights object of order \(k\).

Proper higher order neighbors are returned such that \(i\) and \(j\) are \(k\)-order neighbors if the shortest path from \(i-j\) is of length \(k\).

Parameters:¶

k : int, optional¶: Order of contiguity. By default 2.
shortest_path : bool, optional¶: If True, \(i,j\) and \(k\)-order neighbors if the shortest path for \(i,j\) is \(k\). If False, \(i,j\) are k-order neighbors if there is a path from \(i,j\) of length \(k\). By default True.
diagonal : bool, optional¶: If True, keep \(k\)-order (\(i,j\)) joins when \(i==j\). If False, remove \(k\)-order (\(i,j\)) joins when \(i==j\). By default False.
lower_order : bool, optional¶: If True, include lower order contiguities. If False return only weights of order \(k\). By default False.

Returns:¶

higher order weights

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> gdf = gpd.read_file(get_path("geoda guerry"))
>>> contiguity = graph.Graph.build_contiguity(gdf)
>>> contiguity
<Graph of 85 nodes and 420 nonzero edges indexed by
 [0, 1, 2, 3, 4, ...]>

>>> contiguity.higher_order(k=2)
<Graph of 85 nodes and 756 nonzero edges indexed by
 [0, 1, 2, 3, 4, ...]>

>>> contiguity.higher_order(lower_order=True)
<Graph of 85 nodes and 1176 nonzero edges indexed by
 [0, 1, 2, 3, 4, ...]>

property index_pairs[source]¶

Return focal-neighbor index pairs

Returns:¶: tuple of two aligned pandas.Index objects encoding all edges of the Graph by their nodes
Return type:¶: tuple(Index, Index)

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> focal, neighbor = contiguity.index_pairs
>>> focal
Index(['Staten Island', 'Queens', 'Queens', 'Queens', 'Brooklyn', 'Brooklyn',
       'Manhattan', 'Manhattan', 'Manhattan', 'Bronx', 'Bronx'],
      dtype='object', name='focal')

>>> neighbor
Index(['Staten Island', 'Brooklyn', 'Manhattan', 'Bronx', 'Queens',
       'Manhattan', 'Queens', 'Brooklyn', 'Bronx', 'Queens', 'Manhattan'],
      dtype='object', name='neighbor')

intersection(right)[source]¶: Returns a binary Graph, that includes only those neighbor pairs that exist in both left and right.

intersects(right)[source]¶: Returns True if left and right share at least one link, irrespective of weights value.

property isolates[source]¶

Index of observations with no neighbors

Isolates are encoded as a self-loop with the weight == 0 in the adjacency table.

Returns:¶: Index with a subset of observations that do not have any neighbor
Return type:¶: pandas.Index

isomorphic(right)[source]¶: Check that two graphs are isomorphic. This requires that a re-labelling can be found to convert one graph into the other graph. Requires networkx.

issubgraph(right)[source]¶: Return True if every link in the left Graph also occurs in the right Graph. This requires both Graphs are labeled equally. Isolates are ignored.

lag(y, categorical=None, ties='raise')[source]¶

Spatial lag operator

Constructs spatial lag based on neighbor relations of the graph.

Parameters:¶

y : array_like¶: Array-like aligned with the graph. Can be 2-dimensional if all columns are numerical.
categorical : bool¶: True if y is categorical, False if y is continuous. If None, it is derived from the dtype of y.
ties : {'raise', 'random', 'tryself'}, optional¶: Policy on how to break ties when a focal unit has multiple modes for a categorical lag. - ‘raise’: This will raise an exception if ties are encountered to alert the user (Default). - ‘random’: modal label ties Will be broken randomly. - ‘tryself’: check if focal label breaks the tie between label modes. If the focal label does not break the modal tie, the tie will be be broken randomly. If the focal unit has a self-weight, focal label is not used to break any tie, rather any tie will be broken randomly.

Returns:¶

array of numeric|categorical values for the spatial lag

Return type:¶

numpy.ndarray

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> aus = gpd.read_file(get_path("abs.australia_states_territories")).set_index(
...     "STE_NAME21"
... )
>>> aus = aus[aus.geometry.notna()]
>>> contiguity = graph.Graph.build_contiguity(aus)

Spatial lag operator for continuous variables.

>>> y = np.arange(9)
>>> contiguity.lag(y)
array([21.,  3.,  9., 13.,  9.,  0.,  9.,  0.,  0.])

You can also perform transformation of weights.

>>> contiguity_r = contiguity.transform("r")
>>> contiguity_r.lag(y)
array([4.2, 1.5, 3. , 2.6, 4.5, 0. , 3. , 0. , 0. ])

make_symmetric(intersection=False, reduction=None)[source]¶

Create a symmetric version of this graph

Parameters:¶

intersection : bool, optional¶: whether to use the intersection of the neighbor set to make a symmetric graph. If True, then links are only dropped from the graph. If False, then links are only added to the graph.
reduction : str or None, optional¶: How to combine weights when the graph has links in both directions. Options are “sum”, “min”, “max”, or “mean”. By default, this is None, which means than an error is raised when the weight on the edge linking node i to j is not the same as the weight on the edge linking j to i.

property n[source]¶: Number of observations.

property n_components[source]¶

Get a number of connected components

Returns:¶: number of components
Return type:¶: int

property n_edges[source]¶: Number of edges.

property n_nodes[source]¶: Number of nodes.

property neighbors[source]¶

Get neighbors dictionary

Notes

It is recommended to work directly with Graph.adjacency() rather than using the Graph.neighbors().

Returns:¶: dict of tuples representing neighbors
Return type:¶: dict

property nonzero[source]¶: Number of nonzero weights.

property pct_nonzero[source]¶: Percentage of nonzero weights.

plot(gdf, focal=None, nodes=True, color='k', edge_kws=None, node_kws=None, focal_kws=None, ax=None, figsize=None, limit_extent=False)[source]¶

Plot edges and nodes of the Graph

Creates a maptlotlib plot based on the topology stored in the Graph and spatial location defined in gdf.

Parameters:¶

gdf : geopandas.GeoDataFrame¶: Geometries indexed using the same index as Graph. Geometry types other than points are converted to centroids encoding start and end point of Graph edges.
focal : hashable | array-like[hashable] | None, optional¶: ID or an array-like of IDs of focal geometries whose weights shall be plotted. If None, all weights from all focal geometries are plotted. By default None
nodes : bool, optional¶: Plot nodes as points, by default True
color : str, optional¶: The color of all objects, by default “k”
edge_kws : dict, optional¶: Keyword arguments dictionary to send to LineCollection, which provides fine-grained control over the aesthetics of the edges in the plot. By default None
node_kws : dict, optional¶: Keyword arguments dictionary to send to ax.scatter, which provides fine-grained control over the aesthetics of the nodes in the plot. By default None
focal_kws : dict, optional¶: Keyword arguments dictionary to send to ax.scatter, which provides fine-grained control over the aesthetics of the focal nodes in the plot on top of generic node_kws. Values of node_kws are updated from focal_kws. Ignored if focal=None. By default None
ax : matplotlib.axes.Axes, optional¶: Axis on which to plot the weights. If None, a new figure and axis are created. By default None
figsize : tuple, optional¶: figsize used to create a new axis. By default None
limit_extent : bool, optional¶: limit the extent of the axis to the extent of the plotted graph, by default False

Returns:¶

Axis with the resulting plot

Return type:¶

matplotlib.axes.Axes

Notes

If you’d like to overlay the actual geometries from the geopandas.GeoDataFrame, create an axis by plotting the GeoDataFrame and plot the Graph on top.

ax = gdf.plot() gdf_graph.plot(gdf, ax=ax)

property sparse[source]¶

Return a scipy.sparse array (CSR)

Returns:¶: sparse representation of the adjacency
Return type:¶: scipy.sparse.CSR

subgraph(ids)[source]¶

Returns a subset of Graph containing only nodes specified in ids

The resulting subgraph contains only the nodes in ids and the edges between them or zero-weight self-loops in case of isolates.

The order of ids reflects a new canonical order of the resulting subgraph. This means ids should be equal to the index of the DataFrame containing data linked to the graph to ensure alignment of sparse representation of subgraph.

Parameters:¶

ids : array-like¶: An array of node IDs to be retained

Returns:¶

A new Graph that is a subset of the original

Return type:¶

Graph

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity.subgraph(["Queens", "Brooklyn", "Manhattan", "Bronx"])
<Graph of 4 nodes and 10 nonzero edges indexed by
 ['Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

Notes

Unlike the implementation in networkx, this creates a copy since Graphs in libpysal are immutable.

summary(asymmetries=False)[source]¶

Summary of the Graph properties

Returns a GraphSummary object with the statistical attributes summarising the Graph and its basic properties. See the docstring of the GraphSummary for details and all the available attributes.

Parameters:¶

asymmetries : bool¶: whether to compute n_asymmetries, which is considerably more expensive than the other attributes. By default False.

Returns:¶

a class containing a summary statisitcs about the graph

Return type:¶

GraphSummary

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity
<Graph of 5 nodes and 10 nonzero edges indexed by
 ['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']>

>>> summary = contiguity.summary(asymmetries=True)
>>> summary
Graph Summary Statistics
========================
Graph indexed by:
['Staten Island', 'Queens', 'Brooklyn', 'Manhattan', 'Bronx']
==============================================================
Number of nodes:                                             5
Number of edges:                                            10
Number of connected components:                              2
Number of isolates:                                          1
Number of non-zero edges:                                   10
Percentage of non-zero edges:                           44.00%
Number of asymmetries:                                       0
--------------------------------------------------------------
Cardinalities
==============================================================
Mean:                       2    25%:                        2
Standard deviation:         1    50%:                        2
Min:                        0    75%:                        3
Max:                        3
--------------------------------------------------------------
Weights
==============================================================
Mean:                       1    25%:                        1
Standard deviation:         0    50%:                        1
Min:                        0    75%:                        1
Max:                        1
--------------------------------------------------------------
Sum of weights
==============================================================
S0:                                                         10
S1:                                                         20
S2:                                                        104
--------------------------------------------------------------
Traces
==============================================================
GG:                                                         10
G'G:                                                        10
G'G + GG:                                                   20

>>> summary.s1
20

symmetric_difference(right)[source]¶: Filter out links that are in both left and right Graph objects.

to_W()[source]¶

Convert Graph to a libpysal.weights.W object

Returns:¶: representation of graph as a weights.W object
Return type:¶: libpysal.weights.W

Examples

>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> nybb = gpd.read_file(get_path("nybb")).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]

>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Queens           1
               Manhattan        1
Manhattan      Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64

>>> w = contiguity.to_W()
>>> w.neighbors
{'Bronx': ['Queens', 'Manhattan'],
 'Brooklyn': ['Queens', 'Manhattan'],
 'Manhattan': ['Queens', 'Brooklyn', 'Bronx'],
 'Queens': ['Brooklyn', 'Manhattan', 'Bronx'],
 'Staten Island': []}

to_gal(path)[source]¶

Save Graph to a GAL file

Graph is serialized to the GAL file format.

Parameters:¶

path : str¶: path to the GAL file