mapclassify.BoxPlot

class mapclassify.BoxPlot(y, hinge=1.5)

BoxPlot Map Classification.

Parameters:
y : numpy.array

Attribute to classify.

hinge : float (default 1.5)

Multiplier for IQR.

Attributes:
yb : numpy.array

\((n,1)\), bin ids for observations.

bins : array

\((k,1)\), the upper bounds of each class (monotonic).

k : int

The number of classes.

counts : numpy.array

\((k,1)\), the number of observations falling in each class.

low_outlier_ids : numpy.array

Indices of observations that are low outliers.

high_outlier_ids : numpy.array

Indices of observations that are high outliers.
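
As a rough illustration of how yb, counts, and the two outlier-id arrays relate to bins, the following NumPy-only sketch classifies a small made-up array by hand. It assumes that each bound is inclusive on the upper side (an observation equal to bins[i] falls in class i) and that the last bound equals max(y) when high outliers exist. The array y and all variable names are hypothetical, and this is not mapclassify's own implementation.

```python
import numpy as np

# Hypothetical data: one very low value, one very high value.
y = np.array([-40.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 60.0])
hinge = 1.5

q = np.percentile(y, [25, 50, 75])        # first three quartiles
iqr = q[2] - q[0]
# Upper bounds of the six classes (assumption: top bound is max(y)).
bins = np.array([q[0] - hinge * iqr, q[0], q[1], q[2], q[2] + hinge * iqr, y.max()])

# side="left" puts a value equal to bins[i] into class i (inclusive upper bound).
yb = np.searchsorted(bins, y, side="left")
counts = np.bincount(yb, minlength=len(bins))

low_outlier_ids = np.nonzero(yb == 0)[0]    # at or below the lower fence
high_outlier_ids = np.nonzero(yb == 5)[0]   # above the upper fence

print(bins.tolist())              # [-5.0, 2.5, 5.0, 7.5, 15.0, 60.0]
print(yb.tolist())                # [0, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5]
print(counts.tolist())            # [1, 2, 3, 2, 2, 1]
print(low_outlier_ids.tolist())   # [0]
print(high_outlier_ids.tolist())  # [10]
```

Under these assumptions, class 0 holds the low outliers and class 5 the high outliers, so the outlier-id arrays are simply the positions where yb takes those values.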

Notes

The bins are set as follows:

bins[0] = q[0]-hinge*IQR
bins[1] = q[0]
bins[2] = q[1]
bins[3] = q[2]
bins[4] = q[2]+hinge*IQR
bins[5] = inf  (see below)

where \(q\) is an array of the first three quartiles of \(y\) and \(IQR=q[2]-q[0]\).

If \(q[2]+hinge*IQR > max(y)\), there will be only 5 classes and no high outliers; otherwise, there will be 6 classes and at least one high outlier.
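
The rule above can be sketched with NumPy alone. The helper name boxplot_bins is hypothetical and this is not the library's code. It also assumes that, when a sixth class exists, its upper bound is max(y), which is what the bp.bins output in the Examples section suggests, rather than a literal inf.

```python
import numpy as np

def boxplot_bins(y, hinge=1.5):
    """Illustrative box-plot class bounds; a sketch, not mapclassify's code."""
    y = np.asarray(y, dtype=float)
    q = np.percentile(y, [25, 50, 75])   # first three quartiles of y
    iqr = q[2] - q[0]
    lower_fence = q[0] - hinge * iqr
    upper_fence = q[2] + hinge * iqr
    bins = [lower_fence, q[0], q[1], q[2], upper_fence]
    # A sixth class is needed only if some observation exceeds the upper fence.
    if upper_fence < y.max():
        bins.append(y.max())             # assumption: top bound is max(y)
    return np.array(bins)

print(boxplot_bins(np.arange(100)).tolist())
# [-49.5, 24.75, 49.5, 74.25, 148.5]
```

For numpy.arange(100) the upper fence (148.5) exceeds the maximum (99), so only five bounds are returned, in agreement with the bx.bins output in the Examples section.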

Examples

>>> import mapclassify
>>> import numpy
>>> cal = mapclassify.load_example()
>>> bp = mapclassify.BoxPlot(cal)
>>> bp.bins
array([-5.287625e+01,  2.567500e+00,  9.365000e+00,  3.953000e+01,
        9.497375e+01,  4.111450e+03])
>>> bp.counts.tolist()
[0, 15, 14, 14, 6, 9]
>>> bp.high_outlier_ids.tolist()
[0, 6, 18, 29, 33, 36, 37, 40, 42]
>>> cal[bp.high_outlier_ids].values
array([ 329.92,  181.27,  370.5 ,  722.85,  192.05,  110.74, 4111.45,
        317.11,  264.93])
>>> bx = mapclassify.BoxPlot(numpy.arange(100))
>>> bx.bins
array([-49.5 ,  24.75,  49.5 ,  74.25, 148.5 ])

__init__(y, hinge=1.5)

Parameters:
y : numpy.array

\((n,1)\), attribute to classify.

hinge : float (default 1.5)

Multiple of inter-quartile range.

Methods

__init__(y[, hinge])

find_bin(x)

Sort input or inputs according to the current bin estimate.

get_adcm()

Absolute deviation around class median (ADCM).

get_fmt()

get_gadf()

Goodness of absolute deviation of fit.

get_legend_classes([fmt])

Format the strings for the classes on the legend.

get_tss()

Returns sum of squares over all class means.

make(*args, **kwargs)

Configure and create a classifier that will consume data and produce classifications, given the configuration options specified by this function.

plot(gdf[, border_color, border_width, ...])

Plot a mapclassifier object.

plot_histogram([color, linecolor, ...])

Plot histogram of y with bin values superimposed.

set_fmt(fmt)

table()

update([y, inplace])

Add data or change classification parameters.

Attributes

fmt

update(y=None, inplace=False, **kwargs)

Add data or change classification parameters.

Parameters:
y : numpy.array (default None)

\((n,1)\), array of data to classify.

inplace : bool (default False)

Whether to conduct the update in place or to return a copy estimated from the additional specifications.

**kwargs : dict

Additional parameters that are passed to the __init__ function of the class. For documentation, check the class constructor.