libpysal.graph.Graph¶
- class libpysal.graph.Graph(adjacency, transformation='O', is_sorted=False)[source]¶
Graph class encoding spatial weights matrices
The Graph is currently experimental and its API is incomplete and unstable.
- __init__(adjacency, transformation='O', is_sorted=False)[source]¶
Weights base class based on adjacency list
It is recommended to use one of the from_* or build_* constructors rather than invoking __init__ directly.
Each observation needs to be present in the focal level, at least as a self-loop with a weight of 0.
- Parameters:
- adjacency
pandas.Series
A MultiIndexed pandas.Series with "focal" and "neighbor" levels encoding adjacency, and values encoding weights. By convention, isolates are encoded as self-loops with a weight of 0.
- transformation
str, default “O”
Weights transformation used to produce the table.
O – Original
B – Binary
R – Row-standardization (global sum \(=n\))
D – Double-standardization (global sum \(=1\))
V – Variance stabilizing
C – Custom
- is_sorted
bool, default False
The adjacency capturing the graph needs to be canonically sorted to initialize the class. The MultiIndex needs to be ordered i–>j on both focal and neighbor levels according to the order of ids in the original data from which the Graph is created. Sorting is performed by default based on the order of unique values in the focal level. Sorting needs to be reflected in both the values of the MultiIndex and the underlying MultiIndex.codes. Set is_sorted=True to skip this step if the adjacency is already canonically sorted and you are certain about it.
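The transformation codes listed above can be illustrated with a small pure-Python sketch. This is not libpysal's implementation: the helper name `transform_weights` and the plain-dict adjacency layout are assumptions made for illustration, and only the "B", "R" and "D" codes are covered.

```python
# Illustrative sketch only: a plain dict keyed by (focal, neighbor) pairs stands
# in for the adjacency table. `transform_weights` is a hypothetical helper.

def transform_weights(adjacency, transformation):
    """Return a new {(focal, neighbor): weight} dict with transformed weights."""
    code = transformation.upper()
    if code == "B":
        # Binary: every non-zero weight becomes 1, zero weights stay 0.
        return {pair: int(w != 0) for pair, w in adjacency.items()}
    if code == "R":
        # Row-standardization: divide each weight by the sum of its focal's
        # weights, so every row with neighbors sums to 1.
        row_sums = {}
        for (focal, _), w in adjacency.items():
            row_sums[focal] = row_sums.get(focal, 0) + w
        return {
            (focal, neighbor): (w / row_sums[focal] if row_sums[focal] else 0.0)
            for (focal, neighbor), w in adjacency.items()
        }
    if code == "D":
        # Double-standardization: divide by the global sum so all weights sum to 1.
        total = sum(adjacency.values())
        return {pair: w / total for pair, w in adjacency.items()}
    raise ValueError(f"unsupported transformation: {transformation!r}")


adjacency = {
    ("a", "b"): 2.0,
    ("a", "c"): 2.0,
    ("b", "a"): 4.0,
    ("c", "a"): 1.0,
    ("d", "d"): 0.0,  # isolate encoded as a self-loop with weight 0
}

row_standardized = transform_weights(adjacency, "R")
double_standardized = transform_weights(adjacency, "D")
```

The isolate keeps its zero-weight self-loop under every transformation, which matches the convention described for the adjacency parameter.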
Methods
__init__
(adjacency[, transformation, is_sorted])Weights base class based on adjacency list
aggregate
(func)Aggregate weights within a neighbor set
apply
(y, func, **kwargs)Apply a reduction across the neighbor sets
assign_self_weight
([weight])Assign values to edges representing self-weight.
asymmetry
([intrinsic])Asymmetry check.
build_block_contiguity
(regimes)Generate Graph from block contiguity (regime neighbors)
build_contiguity
(geometry[, rook, ...])Generate Graph from geometry based on contiguity
build_distance_band
(data, threshold[, ...])Generate Graph from geometry based on a distance band
build_fuzzy_contiguity
(geometry[, ...])Generate Graph from fuzzy contiguity
build_h3
(ids[, order, weight])Generate Graph from indices of H3 hexagons.
build_kernel
(data[, kernel, k, bandwidth, ...])Generate Graph from geometry data based on a kernel function
build_knn
(data, k[, metric, p, coplanar])Generate Graph from geometry data based on k-nearest neighbors search
build_spatial_matches
(data, k[, metric, ...])Match locations in one dataset to at least n_matches locations in another (possibly identical) dataset by minimizing the total distance between matched locations.
build_triangulation
(data[, method, ...])Generate Graph from geometry based on triangulation
copy
([deep])Make a copy of this Graph's adjacency table and transformation
describe
(y[, q, statistics])Describe the distribution of y values within the neighbors of each node.
difference
(right)Provide the set difference between the graph on the left and the graph on the right.
eliminate_zeros
()Remove graph edges with zero weight
equals
(right)Check that two graphs are identical.
explore
(gdf[, focal, nodes, color, ...])Plot graph as an interactive Folium Map
from_W
(w)Create an experimental Graph from libpysal.weights.W object
from_adjacency
(adjacency[, focal_col, ...])Create a Graph from a pandas DataFrame formatted as an adjacency list
from_arrays
(focal_ids, neighbor_ids, weight, ...)Generate Graph from arrays of indices and weights of the same length
from_dicts
(neighbors[, weights])Generate Graph from dictionaries of neighbors and weights
from_sparse
(sparse[, ids])Convert a scipy.sparse array to a PySAL Graph object.
from_weights_dict
(weights_dict)Generate Graph from a dict of dicts
higher_order
([k, shortest_path, diagonal, ...])Contiguity weights object of order \(k\).
intersection
(right)Returns a binary Graph, that includes only those neighbor pairs that exist in both left and right.
intersects
(right)Returns True if left and right share at least one link, irrespective of weights value.
isomorphic
(right)Check that two graphs are isomorphic.
issubgraph
(right)Return True if every link in the left Graph also occurs in the right Graph.
lag
(y[, categorical, ties])Spatial lag operator
plot
(gdf[, focal, nodes, color, edge_kws, ...])Plot edges and nodes of the Graph
subgraph
(ids)Returns a subset of Graph containing only nodes specified in ids
symmetric_difference
(right)Filter out links that are in both left and right Graph objects.
to_W
()Convert Graph to a libpysal.weights.W object
to_gal
(path)Save Graph to a GAL file
to_gwt
(path)Save Graph to a GWT file
to_networkx
()Convert Graph to a networkx graph.
to_parquet
(path, **kwargs)Save Graph to an Apache Parquet file
transform
(transformation)Transformation of weights
union
(right)Provide the union of two Graph objects, collecting all links that are in either graph.
Attributes
Return a copy of the adjacency list
Number of neighbors for each observation
Get component labels per observation
Index of observations with no neighbors
Number of observations.
Get a number of connected components
Number of observations.
Number of observations.
Get neighbors dictionary
Number of nonzero weights.
Percentage of nonzero weights.
Return a scipy.sparse array (CSR)
Unique IDs used in the Graph
Get weights dictionary
- property adjacency¶
Return a copy of the adjacency list
- Returns:
pandas.Series
Underlying adjacency list
- aggregate(func)[source]¶
Aggregate weights within a neighbor set
Apply a custom aggregation function to a group of weights of the same focal geometry.
- Parameters:
- func
callable
A callable accepted by the pandas groupby.agg method
- Returns:
pd.Series
Aggregated weights
- apply(y, func, **kwargs)[source]¶
Apply a reduction across the neighbor sets
Applies func over groups of y defined by neighbors for each focal.
- Parameters:
- y
array_like
Array of values to be grouped. Can be 1-D or 2-D and will be coerced to a pandas object.
- func
function, str, list, dict or None
Function to use for aggregating the data, passed to pandas GroupBy.apply.
- Returns:
Series | DataFrame
pandas object indexed by unique_ids
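The grouping that apply performs can be pictured with a pure-Python sketch. The helper name `apply_over_neighbors` and the dict-based inputs are assumptions for illustration; returning None for isolates is a choice made in this sketch, not documented library behavior.

```python
# Illustrative sketch only: neighbor sets as a plain dict and values as a dict
# keyed by the same ids. `apply_over_neighbors` is a hypothetical helper.
from statistics import mean


def apply_over_neighbors(neighbors, y, func):
    """Apply `func` to the list of y values found in each focal's neighbor set."""
    result = {}
    for focal, neighbor_ids in neighbors.items():
        values = [y[j] for j in neighbor_ids]
        # Isolates have no neighbors, so there is nothing to reduce.
        result[focal] = func(values) if values else None
    return result


neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
y = {"a": 10.0, "b": 20.0, "c": 40.0, "d": 5.0}

neighbor_means = apply_over_neighbors(neighbors, y, mean)
neighbor_max = apply_over_neighbors(neighbors, y, max)
```

Note that the focal's own value is not part of its group unless the focal is listed among its own neighbors.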
- assign_self_weight(weight=1)[source]¶
Assign values to edges representing self-weight.
The value for each focal == neighbor location in the graph is set to weight.
- Parameters:
- weight
float | array_like
Defines the value(s) to which the weight representing the relationship with itself should be set. If a constant is passed then each self-weight will get this value (default is 1). An array of length Graph.n can be passed to set explicit values to each self-weight (assumed to be in the same order as original data).
- Returns:
Graph
A new Graph with added self-weights.
- asymmetry(intrinsic=True)[source]¶
Asymmetry check.
- Parameters:
- intrinsic
bool, optional
Default is True. Intrinsic symmetry is defined as:
\[w_{i,j} == w_{j,i}\]
If intrinsic is False, symmetry is defined as:
\[i \in N_j \ \& \ j \in N_i\]
where \(N_j\) is the set of neighbors for \(j\). That is, True requires equality of the weights to consider two links equal, while False requires only the presence of a link with a non-zero weight.
- Returns:
pandas.Series
A Series of (i, j) pairs of asymmetries sorted ascending by the focal observation (index value), where i is the focal and j is the neighbor. An empty Series is returned if no asymmetries are found.
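The two symmetry definitions can be compared in a short pure-Python sketch. The helper `find_asymmetries` and the dict-based adjacency are hypothetical; the sketch only inspects links that are present in the dict, so it may report pairs differently from the library.

```python
# Illustrative sketch only: a plain dict keyed by (focal, neighbor) pairs.
# `find_asymmetries` is a hypothetical helper, not the library method.

def find_asymmetries(adjacency, intrinsic=True):
    """Return sorted (i, j) pairs whose reverse link breaks symmetry."""
    asymmetric = []
    for (i, j), w in adjacency.items():
        reverse = adjacency.get((j, i), 0)
        if intrinsic:
            # Intrinsic symmetry: w_ij must equal w_ji.
            broken = w != reverse
        else:
            # Otherwise only the presence of a non-zero link in both
            # directions is required.
            broken = (w != 0) != (reverse != 0)
        if broken:
            asymmetric.append((i, j))
    return sorted(asymmetric)


adjacency = {
    ("a", "b"): 1.0,
    ("b", "a"): 0.5,  # link exists both ways, but with different weights
    ("a", "c"): 1.0,  # one-directional link: there is no ("c", "a") entry
    ("c", "c"): 0.0,  # c has no outgoing links, encoded as a self-loop
}

intrinsic_pairs = find_asymmetries(adjacency, intrinsic=True)
structural_pairs = find_asymmetries(adjacency, intrinsic=False)
```

With intrinsic=True the unequal weights between a and b count as asymmetries; with intrinsic=False only the missing reverse link from c to a does.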
- classmethod build_block_contiguity(regimes)[source]¶
Generate Graph from block contiguity (regime neighbors)
Block contiguity structures are relevant when defining neighbor relations based on membership in a regime. For example, all counties belonging to the same state could be defined as neighbors, in an analysis of all counties in the US.
- Parameters:
- regimes
list-like
List-like of regimes. If pandas.Series, its index is used to encode Graph. Otherwise a default RangeIndex is used.
- Returns:
Graph
libpysal.graph.Graph encoding block contiguity
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> france = gpd.read_file(get_path('geoda guerry')).set_index('Dprtmnt')
In the GeoDa Guerry dataset, the Region column reflects the region (North, East, West, South or Central) to which each department belongs.
>>> france[['Region', 'geometry']].head()
             Region                                           geometry
Dprtmnt
Ain               E  POLYGON ((801150.000 2092615.000, 800669.000 2...
Aisne             N  POLYGON ((729326.000 2521619.000, 729320.000 2...
Allier            C  POLYGON ((710830.000 2137350.000, 711746.000 2...
Basses-Alpes      E  POLYGON ((882701.000 1920024.000, 882408.000 1...
Hautes-Alpes      E  POLYGON ((886504.000 1922890.000, 885733.000 1...
Using the "Region" labels as regimes then identifies all departments within the same region as neighbors.
>>> block_contiguity = graph.Graph.build_block_contiguity(france['Region'])
>>> block_contiguity.adjacency
focal   neighbor
Ain     Basses-Alpes       1
        Hautes-Alpes       1
        Aube               1
        Cote-d'Or          1
        Doubs              1
                          ..
Vienne  Mayenne            1
        Morbihan           1
        Basses-Pyrenees    1
        Deux-Sevres        1
        Vendee             1
Name: weight, Length: 1360, dtype: int32
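The regime logic itself is simple enough to sketch without any geometry. The following pure-Python illustration uses a hypothetical helper `block_contiguity` and a plain dict of labels; it is a sketch of the idea under those assumptions, not the library's implementation.

```python
# Illustrative sketch only: regimes as a {observation: label} dict.
# `block_contiguity` is a hypothetical helper, not the library method.

def block_contiguity(regimes):
    """Link every pair of distinct observations that share a regime label."""
    adjacency = {}
    for focal, focal_regime in regimes.items():
        members = [
            other
            for other, regime in regimes.items()
            if regime == focal_regime and other != focal
        ]
        if members:
            for neighbor in members:
                adjacency[(focal, neighbor)] = 1
        else:
            # A regime with a single member yields an isolate,
            # encoded as a self-loop with weight 0.
            adjacency[(focal, focal)] = 0
    return adjacency


# Labels taken from the first five rows of the Guerry example.
regimes = {
    "Ain": "E",
    "Aisne": "N",
    "Allier": "C",
    "Basses-Alpes": "E",
    "Hautes-Alpes": "E",
}
adjacency = block_contiguity(regimes)
```

In this five-row subset, Aisne and Allier are the only members of their regions and therefore become isolates.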
- classmethod build_contiguity(geometry, rook=True, by_perimeter=False, strict=False)[source]¶
Generate Graph from geometry based on contiguity
Contiguity builder assumes that all geometries are forming a coverage, i.e. a non-overlapping mesh and neighbouring geometries share only points or segments of their exterior boundaries. In practice, build_contiguity is capable of creating a Graph of partially overlapping geometries when strict=False, by_perimeter=False, but that would not strictly follow the definition of queen or rook contiguity.
- Parameters:
- geometry
array_like of shapely.Geometry objects
Could be geopandas.GeoSeries or geopandas.GeoDataFrame, in which case the resulting Graph is indexed by the original index. If an array of shapely.Geometry objects is passed, Graph will assume a RangeIndex.
- rook
bool, optional
Contiguity method. If True, two geometries are considered neighbours if they share at least one edge. If False, two geometries are considered neighbours if they share at least one vertex. By default True.
- by_perimeter
bool, optional
If True, weight represents the length of the shared boundary between adjacent units, by default False. For a row-standardized version of perimeter weights, use Graph.build_contiguity(gdf, by_perimeter=True).transform("r").
- strict
bool, optional
Use the strict topological method. If False, the contiguity is determined based on shared coordinates or coordinate sequences representing edges. This assumes geometry coverage that is topologically correct. This method is faster but can miss some relations. If True, the contiguity is determined based on geometric relations that do not require precise topology. This method is slower but will result in correct contiguity even if the topology of geometries is not optimal. By default False.
- Returns:
Graph
libpysal.graph.Graph encoding contiguity weights
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> nybb = gpd.read_file(get_path('nybb')).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]
>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Queens           1
               Manhattan        1
Manhattan      Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64
Weight by perimeter instead of binary weights:
>>> contiguity_perimeter = graph.Graph.build_contiguity(nybb, by_perimeter=True)
>>> contiguity_perimeter.adjacency
focal          neighbor
Staten Island  Staten Island        0.000000
Queens         Brooklyn         50867.502055
               Manhattan          103.745207
               Bronx                5.777002
Brooklyn       Queens           50867.502055
               Manhattan         5736.546898
Manhattan      Queens             103.745207
               Brooklyn          5736.546898
               Bronx             5258.300879
Bronx          Queens               5.777002
               Manhattan         5258.300879
Name: weight, dtype: float64
- classmethod build_distance_band(data, threshold, binary=True, alpha=-1.0, kernel=None, bandwidth=None)[source]¶
Generate Graph from geometry based on a distance band
- Parameters:
- data
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame
Geometries containing locations from which to compute the distance band. If a geopandas object with Point geometry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geometry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.
- threshold
float
Distance band
- binary
bool, optional
If True, \(w_{ij}=1\) if \(d_{ij} \leq \text{threshold}\), otherwise \(w_{ij}=0\). If False, \(w_{ij}=d_{ij}^{\alpha}\). By default True.
- alpha
float, optional
Distance decay parameter for weight (default -1.0). If alpha is positive, the weights will not decline with distance. Ignored if binary=True or kernel is not None.
- kernel
str, optional
Kernel function to use in order to weight the output graph. See Graph.build_kernel() for details. Ignored if binary=True.
- bandwidth
float (default: None)
Distance to use in the kernel computation. Should be on the same scale as the input coordinates. Ignored if binary=True or kernel=None.
- Returns:
Graph
libpysal.graph.Graph encoding distance band weights
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> nybb = gpd.read_file(get_path('nybb')).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]
Note that the method requires point geometry (or an array of coordinates representing points) as an input.
The threshold distance is in the units of the geometry projection. You can check it using the nybb.crs property.
>>> distance_band = graph.Graph.build_distance_band(nybb.centroid, 45000)
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
Brooklyn       Queens           1
Manhattan      Bronx            1
Bronx          Manhattan        1
Name: weight, dtype: int64
A larger threshold yields more neighbors.
>>> distance_band = graph.Graph.build_distance_band(nybb.centroid, 110000)
>>> distance_band.adjacency
focal          neighbor
Staten Island  Queens           1
               Brooklyn         1
               Manhattan        1
Queens         Staten Island    1
               Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Staten Island    1
               Queens           1
               Manhattan        1
               Bronx            1
Manhattan      Staten Island    1
               Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Brooklyn         1
               Manhattan        1
Name: weight, dtype: int64
Instead of binary weights you can use inverse distance.
>>> distance_band = graph.Graph.build_distance_band(
...     nybb.centroid,
...     45000,
...     binary=False,
... )
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0.000000
Queens         Brooklyn         0.000024
Brooklyn       Queens           0.000024
Manhattan      Bronx            0.000026
Bronx          Manhattan        0.000026
Name: weight, dtype: float64
Or specify the kernel function to derive weight from the distance.
>>> distance_band = graph.Graph.build_distance_band(
...     nybb.centroid,
...     45000,
...     binary=False,
...     kernel='bisquare',
...     bandwidth=60000,
... )
>>> distance_band.adjacency
focal          neighbor
Staten Island  Staten Island    0.000000
Queens         Brooklyn         0.232079
Brooklyn       Queens           0.232079
Manhattan      Bronx            0.309825
Bronx          Manhattan        0.309825
Name: weight, dtype: float64
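The binary and inverse-distance rules from the parameter descriptions can be worked through on a handful of points. This is a brute-force pure-Python sketch; the helper `distance_band` and the toy coordinates are assumptions, kernels are left out, and distinct (non-coincident) points are assumed so that a negative alpha never divides by zero.

```python
# Illustrative sketch only: brute-force distance band over a dict of 2D points.
# `distance_band` is a hypothetical helper, not the library method.
from math import dist


def distance_band(points, threshold, binary=True, alpha=-1.0):
    """Link points within `threshold`; weight is 1 or distance ** alpha."""
    adjacency = {}
    for i, p in points.items():
        found = False
        for j, q in points.items():
            if i == j:
                continue
            d = dist(p, q)
            if d <= threshold:
                adjacency[(i, j)] = 1 if binary else d ** alpha
                found = True
        if not found:
            # No point within the band: encode an isolate as a zero self-loop.
            adjacency[(i, i)] = 0
    return adjacency


points = {
    "a": (0.0, 0.0),
    "b": (3.0, 4.0),
    "c": (6.0, 8.0),
    "d": (100.0, 100.0),
}
binary_band = distance_band(points, threshold=5.0)
inverse_band = distance_band(points, threshold=5.0, binary=False)
```

Here a and b are exactly 5 units apart, so they fall inside the band (the comparison is inclusive), while d is too far from everything and becomes an isolate.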
- classmethod build_fuzzy_contiguity(geometry, tolerance=None, buffer=None, predicate='intersects', **kwargs)[source]¶
Generate Graph from fuzzy contiguity
Fuzzy contiguity relaxes the notion of contiguity neighbors for the case of geometry collections that violate the condition of planar enforcement. It handles three types of conditions present in such collections that would result in missing links when using the regular contiguity methods.
The first are edges for nearby polygons that should be shared, but are digitized separately for the individual polygons, so the resulting edges do not coincide but intersect instead. This case can also be covered by build_contiguity with the strict=False parameter.
The second case is similar to the first, only the resultant edges do not intersect but are “close”. The optional buffering of geometry then closes the gaps between the polygons and a resulting intersection is encoded as a link.
The final case arises when one polygon is “inside” a second polygon but is not encoded to represent a hole in the containing polygon.
It is also possible to create a contiguity based on a custom spatial predicate.
- Parameters:
- geometry
array_like of shapely.Geometry objects
Could be geopandas.GeoSeries or geopandas.GeoDataFrame, in which case the resulting Graph is indexed by the original index. If an array of shapely.Geometry objects is passed, Graph will assume a RangeIndex.
- tolerance
float, optional
The percentage of the length of the minimum side of the bounding rectangle for the geometry to use in determining the buffering distance. Either tolerance or buffer may be specified but not both. By default None.
- buffer
float, optional
Exact buffering distance in the units of geometry.crs. Either tolerance or buffer may be specified but not both. By default None.
- predicate
str, optional
The predicate to use for determination of neighbors. Default is ‘intersects’. If None is passed, neighbours are determined based on the intersection of bounding boxes. See the documentation of geopandas.GeoSeries.sindex.query for allowed predicates.
- **kwargs
Keyword arguments passed to geopandas.GeoSeries.buffer.
- Returns:
Graph
libpysal.graph.Graph encoding fuzzy contiguity
- classmethod build_h3(ids, order=1, weight='distance')[source]¶
Generate Graph from indices of H3 hexagons.
Encode a graph from a set of H3 hexagons. The graph is generated by considering the H3 hexagons as nodes and connecting them based on their contiguity. The contiguity is defined by the order parameter, which specifies the number of steps to consider as neighbors. The weight parameter defines the type of weight to assign to the edges.
Requires the h3 library.
- Parameters:
- ids
array_like
Array of H3 IDs encoding focal geometries
- order
int, optional
Order of contiguity, by default 1
- weight
str, optional
Type of weight. Options are:
distance: raw topological distance between cells
binary: 1 for neighbors, 0 for non-neighbors
inverse: 1 / distance between cells
By default “distance”.
- Returns:
- classmethod build_kernel(data, kernel='gaussian', k=None, bandwidth=None, metric='euclidean', p=2, coplanar='raise')[source]¶
Generate Graph from geometry data based on a kernel function
- Parameters:
- data
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame
Geometries over which to compute a kernel. If a geopandas object with Point geometry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geometry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates. If metric=”precomputed”, data is assumed to contain a precomputed distance metric.
- kernel
str or callable (default: ‘gaussian’)
Kernel function to apply over the distance matrix computed by metric. The following kernels are supported:
"triangular"
"parabolic"
"gaussian"
"bisquare"
"cosine"
"boxcar"/discrete: all distances less than bandwidth are 1, and all other distances are 0
"identity"/None: do nothing, weight similarity based on raw distance
callable: a user-defined function that takes the distance vector and the bandwidth and returns the kernel: kernel(distances, bandwidth)
- k
int (default: None)
Number of nearest neighbors used to truncate the kernel. This is assumed to be constant across samples. If None, no truncation is conducted.
- bandwidth
float (default: None)
Distance to use in the kernel computation. Should be on the same scale as the input coordinates.
- metric
str or callable (default: ‘euclidean’)
Distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.
- p
int (default: 2)
Parameter for minkowski metric, ignored if metric != “minkowski”.
- coplanar
str, optional (default “raise”)
Method for handling coplanar points when k is not None. Options are 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), and 'clique' (induce fully-connected sub cliques for coplanar points).
- Returns:
Graph
libpysal.graph.Graph encoding kernel weights
- classmethod build_knn(data, k, metric='euclidean', p=2, coplanar='raise')[source]¶
Generate Graph from geometry data based on k-nearest neighbors search
- Parameters:
- data
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame
Geometries over which to run the k-nearest neighbors search. If a geopandas object with Point geometry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geometry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.
- k
int
Number of nearest neighbors.
- metric
str or callable (default: ‘euclidean’)
Distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.
- p
int (default: 2)
Parameter for minkowski metric, ignored if metric != “minkowski”.
- coplanar
str, optional (default “raise”)
Method for handling coplanar points. Options include 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), and 'clique' (induce fully-connected sub cliques for coplanar points).
- Returns:
Graph
libpysal.graph.Graph encoding KNN weights
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> nybb = gpd.read_file(get_path('nybb')).set_index('BoroName')
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
>>> knn3 = graph.Graph.build_knn(nybb.centroid, k=3)
>>> knn3.adjacency
focal          neighbor
Staten Island  Queens           1
               Brooklyn         1
               Manhattan        1
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Staten Island    1
               Queens           1
               Manhattan        1
Manhattan      Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Brooklyn         1
               Manhattan        1
Name: weight, dtype: int32
Specifying k=1 identifies the nearest neighbor (note that this can be asymmetrical):
>>> knn1 = graph.Graph.build_knn(nybb.centroid, k=1)
>>> knn1.adjacency
focal          neighbor
Staten Island  Brooklyn    1
Queens         Brooklyn    1
Brooklyn       Queens      1
Manhattan      Bronx       1
Bronx          Manhattan   1
Name: weight, dtype: int32
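Why k-nearest-neighbor links can be asymmetrical is easy to see on four points along a line. The sketch below is a brute-force pure-Python illustration with a hypothetical helper `knn`; ties are broken by input order here, which is an assumption of the sketch and unrelated to the coplanar options described above.

```python
# Illustrative sketch only: brute-force k-nearest neighbors over 2D points.
# `knn` is a hypothetical helper, not the library method.
from math import dist


def knn(points, k):
    """Link each point to its k nearest other points with a weight of 1."""
    adjacency = {}
    for i, p in points.items():
        # Sort all other points by Euclidean distance from the focal point.
        others = sorted(
            (j for j in points if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            adjacency[(i, j)] = 1
    return adjacency


points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.0), "d": (7.0, 0.0)}
knn1 = knn(points, k=1)
```

The nearest neighbor of c is b, but the nearest neighbor of b is a, so the link from c to b has no counterpart in the other direction.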
- classmethod build_spatial_matches(data, k, metric='euclidean', solver=None, allow_partial_match=False, **metric_kwargs)[source]¶
Match locations in one dataset to at least n_matches locations in another (possibly identical) dataset by minimizing the total distance between matched locations.
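Letting \(d_{ij}\) be the distance between locations \(i\) and \(j\), and \(m_{ij}\) indicate whether they are matched, the problem solved is:
\[
\begin{aligned}
\text{minimize} \quad & \sum_i^n \sum_j^n d_{ij} m_{ij} \\
\text{subject to} \quad & \sum_j^n m_{ij} \geq k \quad \forall i \\
& m_{ij} \in \{0, 1\} \quad \forall ij
\end{aligned}
\]
- Parameters: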
- x
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame
Geometries that need matches. If a geopandas.Geo* object is provided, the .geometry attribute is used. If a numpy.ndarray with a geometry dtype is used, then the coordinates are extracted and used.
- y
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame (default: None)
Geometries that are used as a source for matching. If a geopandas object is provided, the .geometry attribute is used. If a numpy.ndarray with a geometry dtype is used, then the coordinates are extracted and used. If None, matches are made within x.
- n_matches
int (default: None)
Number of matches
- metric
str or callable (default: ‘euclidean’)
Distance function to apply over the input coordinates. Supported options depend on whether or not scikit-learn is installed. If so, then any distance function supported by scikit-learn is supported here. Otherwise, only euclidean, minkowski, and manhattan/cityblock distances are admitted.
- solver
solver from pulp (default: None)
A solver defined by the pulp optimization library. If no solver is provided, pulp’s default solver will be used. This is generally pulp.COIN(), but this may vary depending on your configuration.
- return_mip
bool (default: False)
Whether or not to return the instance of the pulp.LpProblem. By default, the problem is not returned to the user.
- allow_partial_match
bool (default: False)
Whether to allow for partial matching. A partial match may have a weight between zero and one, while a “full” match (by default) must have a weight of either zero or one. A partial matching may have a shorter total distance, but will result in a weighted graph.
- classmethod build_triangulation(data, method='delaunay', bandwidth=inf, kernel='boxcar', clip='bounding_box', rook=True, coplanar='raise')[source]¶
Generate Graph from geometry based on triangulation
- Parameters:
- data
numpy.ndarray, geopandas.GeoSeries, geopandas.GeoDataFrame
Geometries containing locations to compute the delaunay triangulation. If a geopandas object with Point geometry is provided, the .geometry attribute is used. If a numpy.ndarray with shapely geometry is used, then the coordinates are extracted and used. If a numpy.ndarray of a shape (2,n) is used, it is assumed to contain x, y coordinates.
- method
str (default: “delaunay”)
Method of extracting the weights from triangulation. Supports:
"delaunay"
"gabriel"
"relative_neighborhood"
"voronoi"
- bandwidth
float, optional
Distance to use in the kernel computation. Should be on the same scale as the input coordinates, by default numpy.inf
- kernel
str, optional
Kernel function to use in order to weight the output graph. See Graph.build_kernel() for details. By default “boxcar”
- clip
str (default: 'bounding_box')
Clipping method when method="voronoi". Ignored otherwise. Options are as follows.
None
No clip is applied. Voronoi cells may be arbitrarily larger than the source map. Note that this may lead to cells that are many orders of magnitude larger in extent than the original map. Not recommended.
'bounding_box'
Clip the voronoi cells to the bounding box of the input points.
'convex_hull'
Clip the voronoi cells to the convex hull of the input points.
'alpha_shape'
Clip the voronoi cells to the tightest hull that contains all points (e.g. the smallest alpha shape, using libpysal.cg.alpha_shape_auto()).
shapely.Polygon
Clip to an arbitrary Polygon.
- rook
bool, optional
Contiguity method when method="voronoi". Ignored otherwise. If True, two geometries are considered neighbours if they share at least one edge. If False, two geometries are considered neighbours if they share at least one vertex. By default True.
- coplanar
str, optional (default “raise”)
Method for handling coplanar points. Options include 'raise' (raising an exception when coplanar points are present), 'jitter' (randomly displace coplanar points to produce uniqueness), and 'clique' (induce fully-connected sub cliques for coplanar points).
- Returns:
Graph
libpysal.graph.Graph encoding triangulation weights
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> nybb = gpd.read_file(get_path('nybb')).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]
Note that the method requires point geometry (or an array of coordinates representing points) as an input.
>>> triangulation = graph.Graph.build_triangulation(nybb.centroid)
>>> triangulation.adjacency
focal          neighbor
Staten Island  Brooklyn         1
               Manhattan        1
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Staten Island    1
               Queens           1
               Manhattan        1
Manhattan      Staten Island    1
               Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64
- property cardinalities¶
Number of neighbors for each observation
- Returns:
pandas.Series
Series with a number of neighbors per each observation
- property component_labels¶
Get component labels per observation
- Returns:
numpy.array
Array of component labels
- describe(y, q=None, statistics=None)[source]¶
Describe the distribution of y values within the neighbors of each node.
Given the graph, computes the descriptive statistics of values within the neighbourhood of each node. Optionally, the values can be limited to a certain quantile range before computing the statistics.
- Parameters:
- y
NDArray[np.float64] | Series
A 1D array of numeric values to be described.
- q
tuple[float, float] | None, optional
Tuple of percentages for the percentiles to compute. Values must be between 0 and 100 inclusive. When set, values below and above the percentiles will be discarded before computation of the statistics. The percentiles are computed for each neighborhood. By default None.
- statistics
list[str] | None
A list of stats functions to compute. If None, compute all available functions: “count”, “mean”, “median”, “std”, “min”, “max”, “sum”, “nunique”, “mode”. By default None.
- Returns:
DataFrame
A DataFrame with descriptive statistics.
Notes
The index of values must match the index of the graph.
Weight values do not affect the calculations, only adjacency does.
Returns numpy.nan for all isolates.
The numba package is used extensively in this function to accelerate the computation of statistics. Without numba, these computations may become slow on large data.
- eliminate_zeros()[source]¶
Remove graph edges with zero weight
Eliminates edges with weight == 0 that do not encode an isolate. This is useful to clean up edges that have no effect in operations like lag().
- Returns:
Graph
Subset of Graph with zero-weight edges eliminated
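The distinction between a removable zero-weight edge and a zero-weight self-loop that encodes an isolate can be sketched in pure Python. The helper below is hypothetical and works on a plain dict; it assumes isolates are already encoded as self-loops, as described for the adjacency parameter.

```python
# Illustrative sketch only: a plain dict keyed by (focal, neighbor) pairs.
# This `eliminate_zeros` is a hypothetical stand-in, not the library method.

def eliminate_zeros(adjacency):
    """Drop zero-weight edges, keeping the self-loops that encode isolates."""
    # Focal ids that keep at least one non-zero link.
    connected = {focal for (focal, _), w in adjacency.items() if w != 0}
    return {
        (focal, neighbor): w
        for (focal, neighbor), w in adjacency.items()
        if w != 0 or (focal == neighbor and focal not in connected)
    }


adjacency = {
    ("a", "b"): 1.0,
    ("a", "c"): 0.0,  # zero-weight edge that contributes nothing
    ("b", "a"): 1.0,
    ("c", "c"): 0.0,  # isolate: this self-loop must be kept
}
cleaned = eliminate_zeros(adjacency)
```

Only the edge from a to c is dropped; c stays in the result through its self-loop, so the set of observations is unchanged.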
- explore(gdf, focal=None, nodes=True, color='black', edge_kws=None, node_kws=None, focal_kws=None, m=None, **kwargs)[source]¶
Plot graph as an interactive Folium Map
- Parameters:
- gdf
geopandas.GeoDataFrame
geodataframe used to instantiate to Graph
- focal
list
,optional
subset of focal observations to plot in the map, by default None. If none, all relationships are plotted
- nodesbool,
optional
whether to display observations as nodes in the map, by default True
- color
str
,optional
color applied to nodes and edges, by default “black”
- edge_kws
dict
,optional
additional keyword arguments passed to geopandas explore function when plotting edges, by default None
- node_kws
dict
,optional
additional keyword arguments passed to geopandas explore function when plotting nodes, by default None
- focal_kws
dict
,optional
additional keyword arguments passed to geopandas explore function when plotting focal observations, by default None. Only applicable when passing a subset of nodes with the focal argument
- m
folium.Map
,optional
folium map object to plot on top of, by default None
- **kwargs
dict
,optional
additional keyword arguments passed directly to geopandas.explore when
m=None
, by default None
- Returns:
folium.Map
folium map
- classmethod from_W(w)[source]¶
Create an experimental Graph from a libpysal.weights.W object
- Parameters:
- Returns:
Graph
libpysal.graph.Graph from W
- classmethod from_adjacency(adjacency, focal_col='focal', neighbor_col='neighbor', weight_col='weight')[source]¶
Create a Graph from a pandas DataFrame formatted as an adjacency list
- Parameters:
- adjacency
pandas.DataFrame
a dataframe formatted as an adjacency list. Should have columns “focal”, “neighbor”, and “weight”, or columns that can be mapped to these (e.g. origin, destination, cost)
- focal_col
str
,optional
name of column holding focal/origin index, by default ‘focal’
- neighbor_col
str
,optional
name of column holding neighbor/destination index, by default ‘neighbor’
- weight_col
str
,optional
name of column holding weight values, by default ‘weight’
- Returns:
Graph
libpysal.graph.Graph
- classmethod from_arrays(focal_ids, neighbor_ids, weight, **kwargs)[source]¶
Generate Graph from arrays of indices and weights of the same length
The arrays need to be sorted so that focal_ids.unique() equals the index of the original observations from which the Graph is being built.
- Parameters:
- focal_ids
array_like
focal indices
- neighbor_ids
array_like
neighbor indices
- weight
array_like
weights
- **kwargs
keyword arguments passed to the class constructor
- Returns:
Graph
libpysal.graph.Graph based on arrays
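The ordering requirement above can be checked before calling the constructor. The following is a plain-Python sketch under stated assumptions: `is_canonically_ordered` is a hypothetical helper, not part of libpysal, and it compares the unique focal ids in order of first appearance against the index of the original observations.

```python
def is_canonically_ordered(focal_ids, original_index):
    """Check that unique focal ids, in order of first appearance, equal the index."""
    # dict.fromkeys drops duplicates while keeping first-appearance order.
    return list(dict.fromkeys(focal_ids)) == list(original_index)


# Hypothetical data: three observations indexed "a", "b", "c".
original_index = ["a", "b", "c"]
focal_ids = ["a", "a", "b", "c"]
neighbor_ids = ["b", "c", "a", "a"]
weight = [1, 1, 1, 1]

ordered = is_canonically_ordered(focal_ids, original_index)
shuffled = is_canonically_ordered(["b", "a", "a", "c"], original_index)
```

If the check fails, reordering the three arrays together (so that focal ids follow the original index) restores the expected alignment.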
- classmethod from_dicts(neighbors, weights=None)[source]¶
Generate Graph from dictionaries of neighbors and weights
- Parameters:
- Returns:
Graph
libpysal.graph.Graph based on dictionaries
Examples
>>> from libpysal import graph
>>> neighbors = {
...     'Africa': ['Asia'],
...     'Asia': ['Africa', 'Europe'],
...     'Australia': [],
...     'Europe': ['Asia'],
...     'North America': ['South America'],
...     'South America': ['North America'],
... }
>>> connectivity = graph.Graph.from_dicts(neighbors)
>>> connectivity.adjacency
focal          neighbor
Africa         Asia             1
Asia           Africa           1
               Europe           1
Australia      Australia        0
Europe         Asia             1
North America  South America    1
South America  North America    1
Name: weight, dtype: float64
You can also specify weights (for example based on the length of the shared border):
>>> weights = {
...     'Africa': [1],
...     'Asia': [0.2, 0.8],
...     'Australia': [],
...     'Europe': [1],
...     'North America': [1],
...     'South America': [1],
... }
>>> connectivity = graph.Graph.from_dicts(neighbors, weights)
>>> connectivity.adjacency
focal          neighbor
Africa         Asia             1.0
Asia           Africa           0.2
               Europe           0.8
Australia      Australia        0.0
Europe         Asia             1.0
North America  South America    1.0
South America  North America    1.0
Name: weight, dtype: float64
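The mapping from the two dictionaries to the adjacency table shown above can be sketched without libpysal. The `dicts_to_adjacency` helper below is hypothetical and only mirrors the behaviour visible in the examples: missing weights default to 1, and a focal with no neighbors becomes a zero-weight self-loop.

```python
def dicts_to_adjacency(neighbors, weights=None):
    """Flatten neighbor and weight dicts into (focal, neighbor, weight) rows."""
    rows = []
    for focal, neigh in neighbors.items():
        if not neigh:
            # Isolates are encoded as a self-loop with weight 0.
            rows.append((focal, focal, 0.0))
            continue
        # Without explicit weights, every edge gets a weight of 1.
        w = weights[focal] if weights is not None else [1.0] * len(neigh)
        rows.extend((focal, n, float(x)) for n, x in zip(neigh, w))
    return rows


neighbors = {
    "Africa": ["Asia"],
    "Asia": ["Africa", "Europe"],
    "Australia": [],
    "Europe": ["Asia"],
}
weights = {"Africa": [1], "Asia": [0.2, 0.8], "Australia": [], "Europe": [1]}

binary_rows = dicts_to_adjacency(neighbors)
weighted_rows = dicts_to_adjacency(neighbors, weights)
```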
- classmethod from_sparse(sparse, ids=None)[source]¶
Convert a
scipy.sparse
array to a PySAL Graph object.
- Parameters:
- sparse
scipy.sparse
array
sparse representation of a graph
- ids
list-like
,default
None
list-like of ids for geometries that is mappable to positions from sparse. If None, the positions are used as labels.
- Returns:
Graph
libpysal.graph.Graph based on sparse
- classmethod from_weights_dict(weights_dict)[source]¶
Generate Graph from a dict of dicts
- Parameters:
- weights_dict
dictionary
of
dictionaries
weights dictionary with the
{focal: {neighbor: weight}}
structure.
- Returns:
Graph
libpysal.graph.Graph based on weights dictionary of dictionaries
- higher_order(k=2, shortest_path=True, diagonal=False, lower_order=False)[source]¶
Contiguity weights object of order \(k\).
Proper higher order neighbors are returned such that \(i\) and \(j\) are \(k\)-order neighbors if the shortest path from \(i\) to \(j\) is of length \(k\).
- Parameters:
- k
int
,optional
Order of contiguity. By default 2.
- shortest_path
bool
,optional
If True, \(i,j\) are \(k\)-order neighbors if the shortest path between \(i\) and \(j\) has length \(k\). If False, \(i,j\) are \(k\)-order neighbors if there is a path from \(i\) to \(j\) of length \(k\). By default True.
- diagonal
bool
,optional
If True, keep \(k\)-order (\(i,j\)) joins when \(i==j\). If False, remove \(k\)-order (\(i,j\)) joins when \(i==j\). By default False.
- lower_order
bool
,optional
If True, include lower order contiguities. If False, return only weights of order \(k\). By default False.
- Returns:
Graph
higher order weights
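The shortest-path definition above can be sketched with a breadth-first search. This is a plain-Python illustration of the default case (`shortest_path=True`, `diagonal=False`), not libpysal's implementation; the `neighbors` mapping and both helper functions are hypothetical.

```python
from collections import deque

# Hypothetical first-order contiguity: a simple chain 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def shortest_path_lengths(neighbors, source):
    """Breadth-first search returning hop counts from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def higher_order(neighbors, k=2, lower_order=False):
    """Neighbors whose shortest path has length k (or 1..k with lower_order)."""
    result = {}
    for focal in neighbors:
        dist = shortest_path_lengths(neighbors, focal)
        if lower_order:
            keep = [j for j, d in dist.items() if 1 <= d <= k]
        else:
            keep = [j for j, d in dist.items() if d == k]
        result[focal] = sorted(keep)
    return result


second = higher_order(neighbors, k=2)
up_to_second = higher_order(neighbors, k=2, lower_order=True)
```

The focal itself sits at distance 0 and is therefore never returned, which corresponds to `diagonal=False`.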
- property isolates¶
Index of observations with no neighbors
Isolates are encoded as a self-loop with the weight == 0 in the adjacency table.
- Returns:
pandas.Index
Index with a subset of observations that do not have any neighbor
- lag(y, categorical=False, ties='raise')[source]¶
Spatial lag operator
Constructs spatial lag based on neighbor relations of the graph.
- Parameters:
- y
array
numpy array with dimensionality conforming to the Graph
- categorical
bool
True if y is categorical, False if y is continuous.
- ties
{‘raise’, ‘random’, ‘tryself’}
,optional
Policy on how to break ties when a focal unit has multiple modes for a categorical lag.
- ‘raise’: raise an exception if ties are encountered, to alert the user (default).
- ‘random’: modal label ties are broken randomly.
- ‘tryself’: check whether the focal label breaks the tie between label modes. If it does not, the tie is broken randomly. If the focal unit has a self-weight, the focal label is not used to break any tie; ties are broken randomly instead.
- Returns:
numpy.ndarray
array of numeric|categorical values for the spatial lag
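For the continuous case, the spatial lag of a focal unit is the weighted sum of its neighbors' values. A minimal plain-Python sketch, assuming a hypothetical list of `(focal, neighbor, weight)` triples and a hypothetical `spatial_lag` helper (the categorical case and its tie policies are not covered here):

```python
# Hypothetical row-standardized adjacency as (focal, neighbor, weight) triples.
edges = [
    ("a", "b", 0.5),
    ("a", "c", 0.5),
    ("b", "a", 1.0),
    ("c", "a", 1.0),
]
# Hypothetical continuous variable keyed by the same ids.
y = {"a": 10.0, "b": 2.0, "c": 4.0}


def spatial_lag(edges, y):
    """Weighted sum of neighboring values for every focal unit."""
    lag = {focal: 0.0 for focal in y}
    for focal, neighbor, weight in edges:
        lag[focal] += weight * y[neighbor]
    return lag


lagged = spatial_lag(edges, y)
```

With row-standardized weights, as in this example, the lag is the mean of the neighboring values.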
- property n¶
Number of observations.
- property n_edges¶
Number of edges.
- property n_nodes¶
Number of observations.
- property neighbors¶
Get neighbors dictionary
- Returns:
dict
dict of tuples representing neighbors
Notes
It is recommended to work directly with
Graph.adjacency()
rather than using
Graph.neighbors()
.
- property nonzero¶
Number of nonzero weights.
- property pct_nonzero¶
Percentage of nonzero weights.
- plot(gdf, focal=None, nodes=True, color='k', edge_kws=None, node_kws=None, focal_kws=None, ax=None, figsize=None, limit_extent=False)[source]¶
Plot edges and nodes of the Graph
Creates a
matplotlib
plot based on the topology stored in the Graph and spatial location defined in
gdf
.
- Parameters:
- gdf
geopandas.GeoDataFrame
Geometries indexed using the same index as Graph. Geometry types other than points are converted to centroids encoding start and end point of Graph edges.
- focal
hashable
| array_like[hashable
] |None
,optional
ID or an array-like of IDs of focal geometries whose weights shall be plotted. If None, all weights from all focal geometries are plotted. By default None
- nodes
bool
,optional
Plot nodes as points, by default True
- color
str
,optional
The color of all objects, by default “k”
- edge_kws
dict
,optional
Keyword arguments dictionary to send to
LineCollection
, which provides fine-grained control over the aesthetics of the edges in the plot. By default None
- node_kws
dict
,optional
Keyword arguments dictionary to send to
ax.scatter
, which provides fine-grained control over the aesthetics of the nodes in the plot. By default None
- focal_kws
dict
,optional
Keyword arguments dictionary to send to
ax.scatter
, which provides fine-grained control over the aesthetics of the focal nodes in the plot on top of generic
node_kws
. Values of
node_kws
are updated from
focal_kws
. Ignored if
focal=None
. By default None
- ax
matplotlib.axes.Axes
,optional
Axis on which to plot the weights. If None, a new figure and axis are created. By default None
- figsize
tuple
,optional
figsize used to create a new axis. By default None
- limit_extent
bool
,optional
limit the extent of the axis to the extent of the plotted graph, by default False
- Returns:
matplotlib.axes.Axes
Axis with the resulting plot
Notes
If you’d like to overlay the actual geometries from the
geopandas.GeoDataFrame
, create an axis by plotting the
GeoDataFrame
and plot the Graph on top.
ax = gdf.plot()
gdf_graph.plot(gdf, ax=ax)
- property sparse¶
Return a scipy.sparse array (CSR)
- Returns:
scipy.sparse.CSR
sparse representation of the adjacency
- subgraph(ids)[source]¶
Returns a subset of Graph containing only nodes specified in ids
The resulting subgraph contains only the nodes in
ids
and the edges between them or zero-weight self-loops in case of isolates.
The order of
ids
reflects a new canonical order of the resulting subgraph. This means
ids
should be equal to the index of the DataFrame containing data linked to the graph to ensure alignment of sparse representation of subgraph.
- Parameters:
- ids
array_like
An array of node IDs to be retained
- Returns:
Graph
A new Graph that is a subset of the original
Notes
Unlike the implementation in
networkx
, this creates a copy since Graphs in
libpysal
are immutable.
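The subsetting rule can be sketched in plain Python. This illustrates the description above and is not the library's code: the `edges` data and the `subgraph` helper are hypothetical. Edges are kept only when both ends are in `ids`, nodes left without edges become zero-weight self-loops, and the output follows the order of `ids`.

```python
# Hypothetical adjacency list of (focal, neighbor, weight) triples.
edges = [
    ("a", "b", 1.0),
    ("a", "c", 1.0),
    ("b", "a", 1.0),
    ("c", "a", 1.0),
]


def subgraph(edges, ids):
    """Keep edges among ids; nodes left without edges become isolates."""
    keep = set(ids)
    # Dict insertion order follows ids, which defines the new canonical order.
    by_focal = {i: [] for i in ids}
    for focal, neighbor, weight in edges:
        if focal in keep and neighbor in keep:
            by_focal[focal].append((focal, neighbor, weight))
    result = []
    for i in ids:
        # A node with no remaining edges is encoded as a zero-weight self-loop.
        result.extend(by_focal[i] or [(i, i, 0.0)])
    return result


connected = subgraph(edges, ["b", "a"])
isolated = subgraph(edges, ["c", "b"])
```

A new list is returned and the input is left untouched, in line with the note that subsetting produces a copy.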
- to_W()[source]¶
Convert Graph to a libpysal.weights.W object
- Returns:
libpysal.weights.W
representation of graph as a weights.W object
Examples
>>> import geopandas as gpd
>>> from geodatasets import get_path
>>> from libpysal import graph
>>> nybb = gpd.read_file(get_path('nybb')).set_index("BoroName")
>>> nybb
               BoroCode  ...                                           geometry
BoroName                 ...
Staten Island         5  ...  MULTIPOLYGON (((970217.022 145643.332, 970227....
Queens                4  ...  MULTIPOLYGON (((1029606.077 156073.814, 102957...
Brooklyn              3  ...  MULTIPOLYGON (((1021176.479 151374.797, 102100...
Manhattan             1  ...  MULTIPOLYGON (((981219.056 188655.316, 980940....
Bronx                 2  ...  MULTIPOLYGON (((1012821.806 229228.265, 101278...
[5 rows x 4 columns]
>>> contiguity = graph.Graph.build_contiguity(nybb)
>>> contiguity.adjacency
focal          neighbor
Staten Island  Staten Island    0
Queens         Brooklyn         1
               Manhattan        1
               Bronx            1
Brooklyn       Queens           1
               Manhattan        1
Manhattan      Queens           1
               Brooklyn         1
               Bronx            1
Bronx          Queens           1
               Manhattan        1
Name: weight, dtype: int64
>>> w = contiguity.to_W()
>>> w.neighbors
{'Bronx': ['Queens', 'Manhattan'],
 'Brooklyn': ['Queens', 'Manhattan'],
 'Manhattan': ['Queens', 'Brooklyn', 'Bronx'],
 'Queens': ['Brooklyn', 'Manhattan', 'Bronx'],
 'Staten Island': []}
- to_gal(path)[source]¶
Save Graph to a GAL file
Graph is serialized to the GAL file format.
- Parameters:
- path
str
path to the GAL file
See also
read_gal
- to_gwt(path)[source]¶
Save Graph to a GWT file
Graph is serialized to the GWT file format.
- Parameters:
- path
str
path to the GWT file
See also
read_gwt
- to_networkx()[source]¶
Convert Graph to a
networkx
graph.
If Graph is symmetric, returns
nx.Graph
, otherwise returns a
nx.DiGraph
.
- Returns:
networkx.Graph
|networkx.DiGraph
Representation of libpysal Graph as networkx graph
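The choice between an undirected and a directed result depends on whether every edge has a matching reverse edge. A plain-Python sketch of such a symmetry check, assuming that symmetry means equal weights in both directions; `is_symmetric` is a hypothetical helper and the library's own check may differ in detail:

```python
def is_symmetric(edges):
    """True if every (i, j, w) edge has a matching (j, i, w) edge."""
    weights = {(f, n): w for f, n, w in edges}
    return all(weights.get((n, f)) == w for (f, n), w in weights.items())


# Hypothetical adjacency lists of (focal, neighbor, weight) triples.
symmetric_edges = [("a", "b", 1.0), ("b", "a", 1.0)]
directed_edges = [("a", "b", 1.0), ("b", "a", 0.5), ("b", "c", 1.0)]

# The label mirrors the return types named above.
kind_symmetric = "Graph" if is_symmetric(symmetric_edges) else "DiGraph"
kind_directed = "Graph" if is_symmetric(directed_edges) else "DiGraph"
```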
- to_parquet(path, **kwargs)[source]¶
Save Graph to an Apache Parquet file
Graph is serialized to Apache Parquet using the underlying adjacency object stored as a Parquet table, with custom metadata containing the transformation.
Requires pyarrow package.
- Parameters:
- path
str
|pyarrow.NativeFile
path or any stream supported by pyarrow
- **kwargs
additional keyword arguments passed to pyarrow.parquet.write_table
See also
read_parquet
- transform(transformation)[source]¶
Transformation of weights
- Parameters:
- transformation
str
|callable()
Transformation method. The following are valid transformations.
B – Binary
R – Row-standardization (global sum \(=n\))
D – Double-standardization (global sum \(=1\))
V – Variance stabilizing
Alternatively, you can pass your own callable, which is passed to
self.adjacency.groupby(level=0).transform()
.
- Returns:
Graph
transformed weights
- Raises:
ValueError
Raised when an unsupported transformation is passed
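The binary, row-standardized and double-standardized transformations listed above can be sketched in plain Python. This is an illustration of the stated global sums, not libpysal's implementation; the `edges` data and the `transform` helper are hypothetical, and the variance-stabilizing and custom options are left out.

```python
# Hypothetical adjacency list of (focal, neighbor, weight) triples, no isolates.
edges = [
    ("a", "b", 2.0),
    ("a", "c", 2.0),
    ("b", "a", 1.0),
    ("c", "a", 3.0),
]


def transform(edges, transformation):
    """Apply a B, R or D transformation to the edge weights."""
    if transformation == "B":
        # Binary: every nonzero weight becomes 1.
        return [(f, n, 1.0 if w != 0 else 0.0) for f, n, w in edges]
    if transformation == "R":
        # Row-standardization: weights of each focal sum to 1.
        row_sums = {}
        for f, _, w in edges:
            row_sums[f] = row_sums.get(f, 0.0) + w
        return [(f, n, w / row_sums[f] if row_sums[f] else 0.0) for f, n, w in edges]
    if transformation == "D":
        # Double-standardization: all weights together sum to 1.
        total = sum(w for _, _, w in edges)
        return [(f, n, w / total) for f, n, w in edges]
    raise ValueError(f"Unsupported transformation: {transformation!r}")


row = transform(edges, "R")
double = transform(edges, "D")
row_total = sum(w for _, _, w in row)
double_total = sum(w for _, _, w in double)
```

With three observations and no isolates, the row-standardized weights sum to \(n = 3\) and the double-standardized weights sum to 1, matching the global sums quoted in the list above.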
- property unique_ids¶
Unique IDs used in the Graph
- property weights¶
Get weights dictionary
- Returns:
dict
dict of tuples representing weights
Notes
It is recommended to work directly with
Graph.adjacency()
rather than using
Graph.weights()
.