esda.path_silhouette

esda.path_silhouette(data, labels, W, D=None, metric=<function euclidean_distances>, closest=False, return_nbfc=False, return_nbfc_score=False, return_paths=False, directed=False)

Compute a path silhouette for all observations [Rou87, WKR19].

Parameters:
data : np.ndarray (N,P)

matrix of data with N observations and P covariates.

labels : np.ndarray (N,)

flat vector of the L labels assigned over N observations.

W : libpysal.weights.W | libpysal.graph.Graph

spatial weights object reflecting the spatial connectivity in the problem under analysis

D : np.ndarray (N,N)

a precomputed distance matrix to apply over W. If passed, takes precedence over data, and data is ignored.

metric : callable

function mapping the (N,P) data into an (N,N) dissimilarity matrix, like those found in sklearn.metrics.pairwise or scipy.spatial.distance

closest : bool

whether to measure an observation's connection to a cluster by its path-closest member or by the path cost of transiting through the whole cluster. If True, the path cost is assessed between i and the path-closest j in each cluster. If False, the path cost is assessed as the average of the path costs between i and all j in each cluster.
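The difference between the two settings can be illustrated with a small standalone sketch. This is not esda's implementation: the helper `cost_to_cluster` and the toy values are hypothetical, and the path costs from observation i are assumed to have been computed already.

```python
from statistics import mean


def cost_to_cluster(path_costs, labels, cluster, closest=False):
    """Summarize path costs from one observation i to the members of `cluster`.

    path_costs[j] is the (assumed precomputed) shortest-path cost from i to j,
    and labels[j] is the cluster label of j. Hypothetical helper for illustration.
    """
    costs = [c for c, lab in zip(path_costs, labels) if lab == cluster]
    # closest=True: cost to the path-closest member of the cluster.
    # closest=False: average cost over all members of the cluster.
    return min(costs) if closest else mean(costs)


# Hypothetical path costs from observation i to five other observations.
path_costs = [1.0, 3.0, 5.0, 2.0, 4.0]
labels = ["a", "a", "a", "b", "b"]

nearest_a = cost_to_cluster(path_costs, labels, "a", closest=True)
average_a = cost_to_cluster(path_costs, labels, "a", closest=False)
```

With `closest=True` a cluster counts as near once any one of its members is near; with `closest=False` distant members of the cluster pull the cost up.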

return_nbfc : bool

Whether or not to return the label of the next best fit cluster

return_nbfc_score : bool

Whether or not to return the score of the next best fit cluster.

return_paths : bool

Whether or not to return the matrix of shortest path lengths after having computed them.

directed : bool

whether to consider the weights matrix as directed or undirected. If directed, asymmetry in the input W is heeded. If not, asymmetry is ignored.

Returns:
An (N,) array of the path silhouette values, one for each observation.
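As a rough illustration of the quantity being computed, the following standard-library sketch restricts a distance matrix to the edges allowed by a connectivity structure, finds all-pairs shortest path costs, and then applies the silhouette formula (b - a) / max(a, b) from [Rou87] to those path costs. It is a sketch under stated assumptions, not esda's code: the function names, the dense adjacency matrix `A` standing in for W, the use of Floyd-Warshall, the exclusion of i from its own cluster average, and the toy data are all hypothetical choices, and it covers only the averaging case (closest=False) on a connected graph where every cluster has at least two members.

```python
import math


def all_pairs_path_costs(D, A):
    """Floyd-Warshall over the edges allowed by adjacency A, weighted by D."""
    n = len(D)
    cost = [[0.0 if i == j else (D[i][j] if A[i][j] else math.inf)
             for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if cost[i][k] + cost[k][j] < cost[i][j]:
                    cost[i][j] = cost[i][k] + cost[k][j]
    return cost


def path_silhouette_sketch(D, A, labels):
    """Silhouette scores computed on path costs instead of direct distances."""
    cost = all_pairs_path_costs(D, A)
    n = len(labels)
    scores = []
    for i in range(n):
        # Group the path costs from i by the cluster of the destination j.
        by_cluster = {}
        for j in range(n):
            if j != i:  # assumption: i is left out of its own cluster average
                by_cluster.setdefault(labels[j], []).append(cost[i][j])
        means = {lab: sum(c) / len(c) for lab, c in by_cluster.items()}
        a = means[labels[i]]  # average path cost to i's own cluster
        b = min(v for lab, v in means.items() if lab != labels[i])  # next best cluster
        scores.append((b - a) / max(a, b))
    return scores


# Toy data: four points on a line, connected as a chain 0-1-2-3.
x = [0.0, 1.0, 5.0, 6.0]
D = [[abs(xi - xj) for xj in x] for xi in x]
A = [[1 if abs(i - j) == 1 else 0 for j in range(4)] for i in range(4)]
labels = [0, 0, 1, 1]

scores = path_silhouette_sketch(D, A, labels)
```

In this toy chain the path costs coincide with the direct distances; the two differ when the connectivity forces a detour between observations that are close in attribute space.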