mapclassify.JenksCaspallSampled

class mapclassify.JenksCaspallSampled(y, k=5, pct=0.1)

Jenks Caspall Map Classification using a random sample

Parameters
y : array

(n,1), values to classify

k : int

number of classes required

pct : float

The fraction of n (for example, 0.1 for 10%) that should form the sample. If pct is specified such that n*pct > 1000, then pct is reset to 1000./n.
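The cap on pct can be worked through with a few lines of arithmetic. This is a minimal sketch of the rule as stated above, not the library's code: the helper name effective_sample, the cap argument, and the use of round() to turn the product into a sample size are assumptions made for illustration.

```python
def effective_sample(n, pct=0.1, cap=1000):
    """Return (pct, sample_size) after applying the documented cap.

    Hypothetical helper: if n * pct exceeds `cap`, pct is reset to cap / n.
    Rounding the product to an integer sample size is an assumption.
    """
    if n * pct > cap:
        pct = cap / n
    return pct, round(n * pct)


# A vector of 100,000 values with pct=0.1 would ask for 10,000 draws,
# so the cap applies and the sample shrinks to 1,000 values.
print(effective_sample(100000, 0.1))

# A vector of 5,000 values stays below the cap, so pct is left unchanged.
print(effective_sample(5000, 0.1))
```

Under this rule the sample never exceeds 1000 observations, which keeps the cost of the classification step on the sample bounded however large n grows.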

Notes

This is intended for large n problems. The logic is to apply JenksCaspall to a random subset of the y space and then bin the complete vector y on the bins obtained from the subset. This trades some accuracy for a gain in speed.
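The sample-then-bin logic described in these notes can be sketched with the standard library alone. The sketch below is illustrative and makes several assumptions: quantile_breaks is a simple equal-count stand-in for the Jenks-Caspall optimizer (it is not that algorithm), classify_sampled and its seed argument are hypothetical names, and forcing the last upper bound to max(y) is a choice made here so that every observation falls in some class.

```python
import bisect
import random


def quantile_breaks(values, k):
    """Stand-in classifier: upper bounds of k roughly equal-count classes."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, (i * n) // k - 1)] for i in range(1, k + 1)]


def classify_sampled(y, k=5, pct=0.1, seed=0):
    """Fit class bounds on a random sample of y, then bin all of y."""
    rng = random.Random(seed)
    n = len(y)
    size = min(n, max(k, int(n * pct)))
    sample = rng.sample(y, size)

    # Step 1: estimate the class upper bounds from the sample only.
    bins = quantile_breaks(sample, k)
    # Assumption of this sketch: the top bound must cover the full vector.
    bins[-1] = max(y)

    # Step 2: assign every value to the first class whose bound is >= value.
    yb = [bisect.bisect_left(bins, v) for v in y]
    counts = [yb.count(c) for c in range(k)]
    return bins, yb, counts


data_rng = random.Random(42)
y = [data_rng.random() for _ in range(20000)]
bins, yb, counts = classify_sampled(y, k=5, pct=0.05)
print(bins)
print(counts)
```

Only the bound estimation touches the sample; the final pass over y is a binary search per value, which is where the speed gain over classifying the full vector comes from.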

Examples

>>> import numpy as np
>>> import mapclassify as mc
>>> cal = mc.load_example()
>>> x = np.random.random(100000)
>>> jc = mc.JenksCaspall(x)
>>> jcs = mc.JenksCaspallSampled(x)
>>> jc.bins
array([0.1988721 , 0.39624334, 0.59441487, 0.79624357, 0.99999251])
>>> jcs.bins
array([0.20998558, 0.42112792, 0.62752937, 0.80543819, 0.99999251])
>>> jc.counts
array([19943, 19510, 19547, 20297, 20703])
>>> jcs.counts
array([21039, 20908, 20425, 17813, 19815])

# Not for testing, since timings differ across hardware;
# included only to document the likely speed gains.
# >>> t1 = time.time(); jc = mc.JenksCaspall(x); t2 = time.time()
# >>> t1s = time.time(); jcs = mc.JenksCaspallSampled(x); t2s = time.time()
# >>> t2 - t1; t2s - t1s
# 1.8292930126190186
# 0.061631917953491211

Attributes
yb : array

(n,1), bin ids for observations

bins : array

(k,1), the upper bounds of each class

k : int

the number of classes

counts : array

(k,1), the number of observations falling in each class

__init__(self, y, k=5, pct=0.1)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(self, y[, k, pct])

Initialize self.

find_bin(self, x)

Sort input or inputs according to the current bin estimate

get_adcm(self)

Absolute deviation around class median (ADCM).

get_fmt(self)

get_gadf(self)

Goodness of absolute deviation of fit

get_legend_classes(self[, fmt])

Format the strings for the classes on the legend

get_tss(self)

Total sum of squares around class means

make(*args, **kwargs)

Configure and create a classifier that will consume data and produce classifications, given the configuration options specified by this function.

plot(self, gdf[, border_color, …])

Plot Mapclassifier. NOTE: Requires matplotlib, and implicitly requires a geopandas dataframe as input.

set_fmt(self, fmt)

table(self)

update(self[, y, inplace])

Add data or change classification parameters.

Attributes

fmt
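The fit statistics listed above (get_adcm, get_tss, get_gadf) can be illustrated with plain Python. This is a sketch based on the one-line descriptions in the method table, not the library's implementation. The function names are hypothetical, and the GADF formula used here, one minus the ratio of ADCM to the absolute deviation around the overall median, is an assumption about how that measure is defined.

```python
from statistics import mean, median


def group_by_class(y, yb, k):
    """Collect the values of y into k lists according to their bin ids."""
    groups = [[] for _ in range(k)]
    for value, label in zip(y, yb):
        groups[label].append(value)
    return groups


def adcm(y, yb, k):
    """Sum of absolute deviations of each value from its class median."""
    total = 0.0
    for members in group_by_class(y, yb, k):
        if members:
            m = median(members)
            total += sum(abs(v - m) for v in members)
    return total


def tss(y, yb, k):
    """Sum of squared deviations of each value from its class mean."""
    total = 0.0
    for members in group_by_class(y, yb, k):
        if members:
            mu = mean(members)
            total += sum((v - mu) ** 2 for v in members)
    return total


def gadf(y, yb, k):
    """Assumed definition: 1 - ADCM / (absolute deviation around the overall median)."""
    m = median(y)
    adam = sum(abs(v - m) for v in y)
    return 1.0 - adcm(y, yb, k) / adam


# Two well-separated classes: small within-class deviation, high GADF.
values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]
print(adcm(values, labels, 2), tss(values, labels, 2), gadf(values, labels, 2))
```

Lower ADCM and TSS mean tighter classes, while a GADF closer to 1 means the classification removes most of the deviation around the overall median.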

update(self, y=None, inplace=False, **kwargs)

Add data or change classification parameters.

Parameters
y : array

(n,1) array of data to classify

inplace : bool

whether to conduct the update in place or to return a copy estimated from the additional specifications.

Additional parameters provided in **kwargs are passed to the init
function of the class. For documentation, check the class constructor.