mapclassify.JenksCaspallSampled
- class mapclassify.JenksCaspallSampled(y, k=5, pct=0.1)
Jenks Caspall Map Classification using a random sample.
- Parameters:
- y
numpy.array
\((n,1)\), values to classify.
- k
int
(default 5) The number of classes required.
- pct
float
(default 0.10) The fraction of \(n\) that should form the sample. If pct is specified such that \(n \cdot pct > 1000\), then pct is reset to \(1000/n\), which caps the sample at roughly 1000 observations.
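The cap on pct described above can be illustrated with a small sketch. The helper name `effective_sample` and its `cap` argument are hypothetical and not part of mapclassify; the rounding of the sample size is also an assumption made here for illustration.

```python
def effective_sample(n, pct=0.10, cap=1000):
    """Hypothetical helper (not a mapclassify function).

    Applies the documented rule: if n * pct exceeds the cap,
    pct is reset to cap / n. Returns the adjusted pct and the
    resulting sample size (rounded; the rounding is an assumption).
    """
    if n * pct > cap:
        pct = float(cap) / n
    return pct, int(round(n * pct))


# 100000 * 0.10 = 10000 exceeds the cap, so pct shrinks to 1000 / 100000.
print(effective_sample(100000, 0.10))
# 5000 * 0.10 = 500 stays under the cap, so pct is left unchanged.
print(effective_sample(5000, 0.10))
```

For inputs larger than 10000 observations at the default pct, the sample size therefore stops growing with \(n\).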
- Attributes:
- yb
numpy.array
\((n,1)\), bin IDs for observations.
- bins
numpy.array
\((k,1)\), the upper bounds of each class.
- k
int
The number of classes.
- counts
numpy.array
\((k,1)\), the number of observations falling in each class.
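As a rough illustration of how yb and counts relate to bins, the sketch below assigns each value to the first class whose upper bound is at least that value. It assumes right-closed classes and that the last upper bound equals the maximum of the data; `assign_bins` is a hypothetical helper written for this example, not the library's own routine.

```python
from bisect import bisect_left


def assign_bins(values, upper_bounds):
    """Hypothetical helper: map values to class IDs and tally counts.

    Assumes upper_bounds is sorted, classes are right-closed, and the
    last bound is >= every value (otherwise the tally would fail).
    """
    # bisect_left returns the first index whose bound is >= the value.
    yb = [bisect_left(upper_bounds, v) for v in values]
    counts = [0] * len(upper_bounds)
    for class_id in yb:
        counts[class_id] += 1
    return yb, counts


yb, counts = assign_bins([0.1, 0.2, 0.25, 0.9, 1.0], [0.2, 0.5, 1.0])
print(yb)      # [0, 0, 1, 2, 2]
print(counts)  # [2, 1, 2]
```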
Notes
This is intended for large \(n\) problems. The logic is to apply JenksCaspall to a random subset of the \(y\) space and then bin the complete vector \(y\) using the bins obtained from the subset. This trades some accuracy for a gain in speed.
Examples
>>> import mapclassify
>>> import numpy
>>> cal = mapclassify.load_example()
>>> numpy.random.seed(0)
>>> x = numpy.random.random(100000)
>>> jc = mapclassify.JenksCaspall(x)
>>> jcs = mapclassify.JenksCaspallSampled(x)
>>> jc.bins
array([0.20108144, 0.4025151 , 0.60396127, 0.80302249, 0.99997795])
>>> jcs.bins
array([0.19978245, 0.40793025, 0.59253555, 0.78241472, 0.99997795])
>>> jc.counts.tolist()
[20286, 19951, 20310, 19708, 19745]
>>> jcs.counts.tolist()
[20147, 20633, 18591, 18857, 21772]
# not for testing since we get different times on different hardware
# just included for documentation of likely speed gains
#>>> import time
#>>> t1 = time.time(); jc = mapclassify.JenksCaspall(x); t2 = time.time()
#>>> t1s = time.time(); jcs = mapclassify.JenksCaspallSampled(x); t2s = time.time()
#>>> t2 - t1; t2s - t1s
#1.8292930126190186
#0.061631917953491211
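The sample-then-bin logic from the Notes can be sketched in plain Python. This is only an outline under stated assumptions: `quantile_breaks` is a simple equal-count stand-in and is not the Jenks-Caspall algorithm, `classify_sampled` is a hypothetical name, and forcing the global maximum into the sample is an assumption suggested by the example output above, where both classifiers share the same top bound.

```python
import random
from bisect import bisect_left


def quantile_breaks(sample, k):
    """Stand-in classifier (NOT Jenks-Caspall): k equal-count upper bounds."""
    s = sorted(sample)
    n = len(s)
    return [s[max(0, (i * n) // k - 1)] for i in range(1, k + 1)]


def classify_sampled(y, k=5, pct=0.10, seed=0):
    """Hypothetical sketch: derive bins from a sample, then bin all of y."""
    rng = random.Random(seed)
    n = len(y)
    if n * pct > 1000:           # documented cap on the sample size
        pct = 1000.0 / n
    size = max(k, int(round(n * pct)))
    # Draw the sample with replacement.
    sample = [y[rng.randrange(n)] for _ in range(size)]
    # Assumption: keep the global maximum so the top class covers all of y.
    sample[0] = max(y)
    bins = quantile_breaks(sample, k)
    # Bin the complete vector on the sample-derived upper bounds.
    yb = [bisect_left(bins, v) for v in y]
    counts = [0] * k
    for class_id in yb:
        counts[class_id] += 1
    return bins, yb, counts


data_rng = random.Random(1)
y = [data_rng.random() for _ in range(20000)]
bins, yb, counts = classify_sampled(y, k=5)
print(bins)
print(counts)
```

Only the break-finding step runs on the sample; the final assignment still touches every observation, which is why the counts always sum to \(n\).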
Methods
- __init__(y[, k, pct])
- find_bin(x)
Sort input or inputs according to the current bin estimate.
- get_adcm()
Absolute deviation around class median (ADCM).
- get_fmt()
- get_gadf()
Goodness of absolute deviation of fit.
- get_legend_classes([fmt])
Format the strings for the classes on the legend.
- get_tss()
Returns sum of squares over all class means.
- make(*args, **kwargs)
Configure and create a classifier that will consume data and produce classifications, given the configuration options specified by this function.
- plot(gdf[, border_color, border_width, ...])
Plot a mapclassifier object.
- plot_histogram([color, linecolor, ...])
Plot histogram of y with bin values superimposed.
- set_fmt(fmt)
- table()
- update([y, inplace])
Add data or change classification parameters.
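The fit measures listed above (ADCM, GADF, and the total sum of squares around class means) can be illustrated with a short sketch. It assumes the usual definitions, in particular that GADF is one minus the ratio of ADCM to the absolute deviation around the overall median; `fit_measures` is a hypothetical helper, and the library's own computations may differ in detail.

```python
from statistics import mean, median


def fit_measures(y, yb, k):
    """Hypothetical helper: ADCM, GADF and TSS for a given classification.

    Assumed definitions:
      ADCM = sum of |value - class median| over all classes
      GADF = 1 - ADCM / sum of |value - overall median|
      TSS  = sum of (value - class mean)^2 over all classes
    """
    groups = [[v for v, b in zip(y, yb) if b == c] for c in range(k)]
    groups = [g for g in groups if g]  # skip empty classes

    adcm = 0.0
    tss = 0.0
    for g in groups:
        g_median = median(g)
        g_mean = mean(g)
        adcm += sum(abs(v - g_median) for v in g)
        tss += sum((v - g_mean) ** 2 for v in g)

    overall_median = median(y)
    adam = sum(abs(v - overall_median) for v in y)
    gadf = 1 - adcm / adam
    return adcm, gadf, tss


adcm, gadf, tss = fit_measures([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1], k=2)
print(adcm, gadf, tss)
```

A GADF close to 1 indicates that the classes absorb most of the absolute deviation in the data, which gives one way to compare the sampled and full classifiers.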
Attributes
fmt
- update(y=None, inplace=False, **kwargs)
Add data or change classification parameters.
- Parameters:
- y
numpy.array
(default None) \((n,1)\), array of data to classify.
- inplace
bool
(default False) Whether to conduct the update in place or to return a copy estimated from the additional specifications.
- **kwargs
dict
Additional parameters that are passed to the __init__ function of the class. For documentation, check the class constructor.