libpysal.weights.KNN
- class libpysal.weights.KNN(data, k=2, p=2, ids=None, radius=None, distance_metric='euclidean', **kwargs)
Creates nearest neighbor weights matrix based on k nearest neighbors.
- Parameters:
- data
object
PySAL KDTree or ArcKDTree, where KDTree.data is an (n, k) array of n observations on k characteristics used to measure distances between the n objects
- k
int
number of nearest neighbors
- p
float
Minkowski p-norm distance metric parameter, 1 <= p <= infinity; p=2 gives Euclidean distance and p=1 gives Manhattan distance. Ignored if the KDTree is an ArcKDTree.
- ids
list
identifiers to attach to each observation
- Returns:
- w
W
instance; Weights object with binary weights
Notes
Ties between neighbors of equal distance are arbitrarily broken.
Further, if many points occupy the same spatial location (i.e. observations are coincident), then you may need to increase k for those observations to acquire neighbors at different spatial locations. For example, if five points are coincident, then their four nearest neighbors will all occupy the same spatial location; only the fifth nearest neighbor will result in those coincident points becoming connected to the graph as a whole.
Solutions to this problem include jittering the points (adding a small random value to each observation’s location) or adding higher-k neighbors only to the coincident points, using the weights.w_sets.w_union() function.
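The coincident-point case described above can be reproduced with a brute-force search in plain Python. This is only a sketch under stated assumptions: brute_force_knn is a hypothetical helper rather than libpysal's KDTree-based implementation, ties are broken by index order here (the library breaks them arbitrarily), and the final dictionary merge is a plain-dict stand-in for the union-based remedy, not a call to w_union().

```python
import math

def brute_force_knn(points, k):
    """Return {index: [k nearest indices]}; ties broken by index (an assumption)."""
    neighbors = {}
    for i, pt in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by Euclidean distance, then by index to make ties deterministic.
        others.sort(key=lambda j: (math.dist(pt, points[j]), j))
        neighbors[i] = others[:k]
    return neighbors

# Five coincident points (indices 0-4) plus three distinct locations.
points = [(0.0, 0.0)] * 5 + [(10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
coincident = set(range(5))

# With k=4, every coincident point only finds the other four copies.
w4 = brute_force_knn(points, 4)
isolated = all(set(w4[i]) <= coincident for i in coincident)

# With k=5, the fifth neighbor has to sit at a different location.
w5 = brute_force_knn(points, 5)
connected = all(set(w5[i]) - coincident for i in coincident)

# Remedy sketch: keep k=4 everywhere except the coincident points,
# which receive their higher-k neighbor lists.
patched = {i: (w5[i] if i in coincident else w4[i]) for i in w4}

print(isolated, connected)  # True True
```

Only the five coincident observations end up with five neighbors in patched; every other observation keeps four.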
Examples
>>> import libpysal
>>> import numpy as np
>>> points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> kd = libpysal.cg.KDTree(np.array(points))
>>> wnn2 = libpysal.weights.KNN(kd, 2)
>>> [1, 3] == wnn2.neighbors[0]
True
>>> wnn2 = libpysal.weights.KNN(kd, 2)
>>> wnn2[0]
{1: 1.0, 3: 1.0}
>>> wnn2[1]
{0: 1.0, 3: 1.0}
Now with 1 rather than 0 offset:
>>> wnn2 = libpysal.weights.KNN(kd, 2, ids=range(1, 7))
>>> wnn2[1]
{2: 1.0, 4: 1.0}
>>> wnn2[2]
{1: 1.0, 4: 1.0}
>>> 0 in wnn2.neighbors
False
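To see how k and p interact on the same six points, the neighbor sets can be recomputed by brute force without libpysal. The sketch below is illustrative only: minkowski and knn_neighbors are hypothetical helper names, the search is exhaustive rather than tree-based, and ties are resolved by index order, which is an assumption (the library makes no such guarantee).

```python
def minkowski(a, b, p=2):
    """Minkowski p-norm distance between two coordinate tuples, for p >= 1."""
    if p == float("inf"):
        return max(abs(x - y) for x, y in zip(a, b))
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_neighbors(points, k=2, p=2):
    """Return {index: [k nearest indices]} by exhaustive search."""
    neighbors = {}
    for i, pt in enumerate(points):
        ranked = sorted(
            (j for j in range(len(points)) if j != i),
            # Tie-break on index so the result is deterministic (assumption).
            key=lambda j: (minkowski(pt, points[j], p), j),
        )
        neighbors[i] = ranked[:k]
    return neighbors

points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]

w_euclid = knn_neighbors(points, k=2, p=2)
print(w_euclid[0])  # [1, 3]
print(w_euclid[1])  # [0, 3]

# Point 4 has two candidates (1 and 2) at equal Euclidean distance,
# so its second neighbor depends on how the tie is broken.
print(w_euclid[4][0])  # 5

# Under the Manhattan metric the tie disappears and point 3 wins.
w_manhattan = knn_neighbors(points, k=2, p=1)
print(w_manhattan[4])  # [5, 3]
```

Point 4 shows why the choice of p matters: the second neighbor is a tie between points 1 and 2 for p=2, but unambiguously point 3 for p=1.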
Methods
__init__(data[, k, p, ids, radius, ...])
asymmetry([intrinsic])
Asymmetry check.
from_WSP(WSP[, silence_warnings])
Create a pysal W from a pysal WSP object (thin weights matrix).
from_adjlist(adjlist[, focal_col, ...])
Create a weights object from an adjacency list representation.
from_array(array, *args, **kwargs)
Creates nearest neighbor weights matrix based on k nearest neighbors.
from_dataframe(df[, geom_col, ids, use_index])
Make KNN weights from a dataframe.
from_file([path, format])
Read a weights file into a W object.
from_networkx(graph[, weight_col])
Convert a networkx graph to a PySAL W object.
from_shapefile(filepath, *args, **kwargs)
Nearest neighbor weights from a shapefile.
from_sparse(sparse)
Convert a scipy.sparse array to a PySAL W object.
full()
Generate a full numpy.ndarray.
get_transform()
Getter for transform property.
plot(gdf[, indexed_on, ax, color, node_kws, ...])
Plot spatial weights objects.
remap_ids(new_ids)
Modify W in place, replacing the id values in w.id_order with new_ids throughout.
reweight([k, p, new_data, new_ids, inplace])
Redo K-Nearest Neighbor weights construction using given parameters.
set_shapefile(shapefile[, idVariable, full])
Adding metadata for writing headers of .gal and .gwt files.
set_transform([value])
Transformations of weights.
symmetrize([inplace])
Construct a symmetric KNN weight.
to_WSP()
Generate a WSP object.
to_adjlist([remove_symmetric, drop_islands, ...])
Compute an adjacency list representation of a weights object.
to_file([path, format])
Write a weights object to a file.
to_networkx()
Convert a weights object to a networkx graph.
to_sparse([fmt])
Generate a scipy.sparse array object from a pysal W.
Attributes
asymmetries
List of id pairs with asymmetric weights sorted in ascending index location order.
cardinalities
Number of neighbors for each observation.
component_labels
Store the graph component in which each observation falls.
diagW2
Diagonal of \(WW\).
diagWtW
Diagonal of \(W^{'}W\).
diagWtW_WW
Diagonal of \(W^{'}W + WW\).
histogram
Cardinality histogram as a dictionary where key is the id and value is the number of neighbors for that unit.
id2i
Dictionary where the key is an ID and the value is that ID's index in W.id_order.
id_order
Returns the ids for the observations in the order in which they would be encountered if iterating over the weights.
id_order_set
Returns True if user has set id_order, False if not.
islands
List of ids without any neighbors.
max_neighbors
Largest number of neighbors.
mean_neighbors
Average number of neighbors.
min_neighbors
Minimum number of neighbors.
n
Number of units.
n_components
Number of connected components in the graph; a value of 1 means the adjacency matrix is fully connected.
neighbor_offsets
Given the current id_order, neighbor_offsets[id] is the offsets of the id's neighbors in id_order.
nonzero
Number of nonzero weights.
pct_nonzero
Percentage of nonzero weights.
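s0
s0 is defined as \(s0 = \sum_i \sum_j w_{i,j}\).
s1
s1 is defined as \(s1 = 1/2 \sum_i \sum_j (w_{i,j} + w_{j,i})^2\).
s2
s2 is defined as \(s2 = \sum_j (\sum_i w_{i,j} + \sum_i w_{j,i})^2\).
s2array
Individual elements comprising s2.
sd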
Standard deviation of number of neighbors.
sparse
Sparse matrix object.
transform
Getter for transform property.
trcW2
Trace of \(WW\).
trcWtW
Trace of \(W^{'}W\).
trcWtW_WW
Trace of \(W^{'}W + WW\).
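Several of the summary attributes above can be checked by hand on a tiny directed KNN graph. The sketch below uses a plain nested dict in place of a W object and assumes the usual definitions: s0 is the sum of all weights, s1 is half the sum of squared (w_ij + w_ji), and s2 is the sum over units of the squared (column sum + row sum). The helper weight and all variable names are hypothetical, and the last step is one simple way to make the graph symmetric, not necessarily what symmetrize() does internally.

```python
def weight(weights, i, j):
    """Look up w_ij in a nested dict, returning 0.0 for missing links."""
    return weights.get(i, {}).get(j, 0.0)

# k=1 neighbors for three points on a line at x = 0, 1, 3:
# 0 -> 1, 1 -> 0, 2 -> 1. The link 2 -> 1 is not reciprocated.
weights = {0: {1: 1.0}, 1: {0: 1.0}, 2: {1: 1.0}}
ids = sorted(weights)
n = len(ids)

s0 = sum(weight(weights, i, j) for i in ids for j in ids)
s1 = 0.5 * sum(
    (weight(weights, i, j) + weight(weights, j, i)) ** 2
    for i in ids for j in ids
)
s2 = sum(
    (sum(weight(weights, i, j) for i in ids)      # column sum for unit j
     + sum(weight(weights, j, i) for i in ids))   # row sum for unit j
    ** 2
    for j in ids
)

nonzero = sum(len(row) for row in weights.values())
pct_nonzero = 100.0 * nonzero / (n * n)

# Ordered id pairs whose weights differ from their transpose.
asymmetric = sorted(
    (i, j) for i in ids for j in ids
    if weight(weights, i, j) != weight(weights, j, i)
)

# Add each missing reverse link so every relation is mutual.
symmetric = {i: dict(row) for i, row in weights.items()}
for i, j in asymmetric:
    if weight(weights, i, j) == 0.0:
        symmetric[i][j] = weight(weights, j, i)

print(s0, s1, s2)   # 3.0 5.0 14.0
print(asymmetric)   # [(1, 2), (2, 1)]
print(symmetric[1]) # {0: 1.0, 2: 1.0}
```

KNN graphs are asymmetric in general because being someone's nearest neighbor is not a mutual relation, which is why unit 2 lists unit 1 but not the reverse.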
- classmethod from_array(array, *args, **kwargs)
Creates nearest neighbor weights matrix based on k nearest neighbors.
- Parameters:
- array
np.ndarray
(n, k) array representing n observations on k characteristics used to measure distances between the n objects
- **kwargs
keyword arguments, see Rook
- Returns:
- w
W
instance; Weights object with binary weights
Notes
Ties between neighbors of equal distance are arbitrarily broken.
Examples
>>> from libpysal.weights import KNN
>>> points = [(10, 10), (20, 10), (40, 10), (15, 20), (30, 20), (30, 30)]
>>> wnn2 = KNN.from_array(points, 2)
>>> [1, 3] == wnn2.neighbors[0]
True
>>> wnn2 = KNN.from_array(points, 2)
>>> wnn2[0]
{1: 1.0, 3: 1.0}
>>> wnn2[1]
{0: 1.0, 3: 1.0}
Now with 1 rather than 0 offset:
>>> wnn2 = KNN.from_array(points, 2, ids=range(1, 7))
>>> wnn2[1]
{2: 1.0, 4: 1.0}
>>> wnn2[2]
{1: 1.0, 4: 1.0}
>>> 0 in wnn2.neighbors
False
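The effect of passing ids can be pictured as a relabelling of positional indices. The following sketch applies that relabelling to the first two rows reported in the example above, using plain dicts; positional, ids, and relabelled are hypothetical names, and this is not how the library stores its ids internally.

```python
# Neighbor lists keyed by position, copied from the 0-offset example output.
positional = {0: [1, 3], 1: [0, 3]}

# One identifier per observation, in input order.
ids = list(range(1, 7))

# Replace every positional index, key or neighbor, with its identifier.
relabelled = {
    ids[i]: [ids[j] for j in neighbor_positions]
    for i, neighbor_positions in positional.items()
}

print(relabelled)       # {1: [2, 4], 2: [1, 4]}
print(0 in relabelled)  # False
```

The relabelled rows match the 1-offset output in the example: the observation that used to be 0 is now 1, and its neighbors 1 and 3 are now 2 and 4.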
- classmethod from_dataframe(df, geom_col=None, ids=None, use_index=True, *args, **kwargs)
Make KNN weights from a dataframe.
- Parameters:
- df
pandas.DataFrame
a dataframe with a geometry column that can be used to construct a W object
- geom_col
str
the name of the column in df that contains the geometries. Defaults to the active geometry column.
- ids
list-like, str
a list-like of ids to use to index the spatial weights object, or the name of the column to use as IDs. If nothing is provided, the dataframe index is used if use_index=True, or a positional index is used if use_index=False. The order of this list does not determine the order of the resulting W.
- use_index
bool
use the index of df as ids to index the spatial weights object.
- classmethod from_shapefile(filepath, *args, **kwargs)
Nearest neighbor weights from a shapefile.
- Parameters:
- filepath
str
path to the shapefile containing attribute data.
- k
int
number of nearest neighbors
- p
float
Minkowski p-norm distance metric parameter, 1 <= p <= infinity; p=2 gives Euclidean distance and p=1 gives Manhattan distance.
- ids
list
identifiers to attach to each observation
- radius
float
If supplied, arc distances will be calculated based on the given radius and p will be ignored.
- Returns:
- w
KNN
instance; Weights object with binary weights.
Notes
Ties between neighbors of equal distance are arbitrarily broken.
Examples
Polygon shapefile
>>> import libpysal
>>> from libpysal.weights import KNN
>>> wc = KNN.from_shapefile(libpysal.examples.get_path("columbus.shp"))
>>> "%.4f" % wc.pct_nonzero
'4.0816'
>>> set([2, 1]) == set(wc.neighbors[0])
True
>>> wc3 = KNN.from_shapefile(libpysal.examples.get_path("columbus.shp"), k=3)
>>> set(wc3.neighbors[0]) == set([2, 1, 3])
True
>>> set(wc3.neighbors[2]) == set([4, 3, 0])
True
Point shapefile
>>> w = KNN.from_shapefile(libpysal.examples.get_path("juvenile.shp"))
>>> w.pct_nonzero
1.1904761904761905
>>> w1 = KNN.from_shapefile(libpysal.examples.get_path("juvenile.shp"), k=1)
>>> "%.3f" % w1.pct_nonzero
'0.595'
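The pct_nonzero values in these examples follow from simple arithmetic if every observation receives exactly k neighbors: an n x n matrix then holds n * k nonzero cells. In the sketch below, knn_pct_nonzero is a hypothetical helper, and the observation counts 49 and 168 are not stated in the text; they are inferred from the reported percentages with the default k=2, so treat them as assumptions.

```python
def knn_pct_nonzero(n, k):
    """Percentage of nonzero cells in an n x n matrix with k nonzeros per row."""
    return 100.0 * (n * k) / (n * n)

# Assumed n=49 for the polygon example and n=168 for the point example.
print("%.4f" % knn_pct_nonzero(49, 2))   # 4.0816
print("%.4f" % knn_pct_nonzero(168, 2))  # 1.1905
print("%.3f" % knn_pct_nonzero(168, 1))  # 0.595
```

Because the expression simplifies to 100 * k / n, halving k from 2 to 1 halves the percentage, as the point shapefile example shows.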
- reweight(k=None, p=None, new_data=None, new_ids=None, inplace=True)
Redo K-Nearest Neighbor weights construction using given parameters
- Parameters:
- new_data
np.ndarray
an array containing additional data to use in the KNN weight
- new_ids
list
a list aligned with new_data that provides the ids for each new observation
- inplace
bool
a flag denoting whether to modify the KNN object in place or to return a new KNN object
- k
int
number of nearest neighbors
- p
float
Minkowski p-norm distance metric parameter, 1 <= p <= infinity; p=2 gives Euclidean distance and p=1 gives Manhattan distance. Ignored if the KDTree is an ArcKDTree.
- Returns: