class sklearn.preprocessing.KBinsDiscretizer(n_bins=5, encode='onehot', strategy='quantile') [source]
Bin continuous data into intervals.
Read more in the User Guide.
| Parameters: |
|
|---|---|
| Attributes: |
|
See also
sklearn.preprocessing.Binarizer
Class used to bin values as 0 or 1 based on a parameter threshold.
Notes
In the bin edges for feature i, the first and last values are used only for inverse_transform. During transform, bin edges are extended to:
np.concatenate([[-np.inf], bin_edges_[i][1:-1], [np.inf]])
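The effect of extending the edges can be illustrated with NumPy alone. The sketch below is not the library's implementation; it assumes the edges `[-2, -1, 0, 1]` (the ones fitted for feature 0 in the example on this page) and uses `np.digitize` as a stand-in for the bin lookup. The variable names are illustrative only.

```python
import numpy as np

# Edges assumed for one feature (same values as bin_edges_[0] in the example).
bin_edges = np.array([-2.0, -1.0, 0.0, 1.0])

# Replace the outer edges with -inf / +inf so that every value gets a bin.
extended = np.concatenate([[-np.inf], bin_edges[1:-1], [np.inf]])

# Includes values outside the fitted range [-2, 1].
values = np.array([-10.0, -2.0, -0.5, 0.5, 1.0, 10.0])

# np.digitize returns i such that extended[i-1] <= x < extended[i];
# subtracting 1 yields zero-based bin indices.
bins = np.digitize(values, extended) - 1
print(bins)  # [0 0 1 2 2 2]
```

Values below the fitted minimum land in the first bin and values above the fitted maximum land in the last one, which is why the outer edges matter only for inverse_transform.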
You can combine KBinsDiscretizer with sklearn.compose.ColumnTransformer if you only want to preprocess part of the features.
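As a minimal sketch of that combination, the snippet below bins only the first two columns and passes the remaining ones through unchanged. The step name `"binned"`, the column selection, and the use of `remainder="passthrough"` are choices made for this illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[-2, 1, -4, -1],
              [-1, 2, -3, -0.5],
              [0, 3, -2, 0.5],
              [1, 4, -1, 2]], dtype=float)

# Discretize columns 0 and 1 only; columns 2 and 3 are appended untouched.
ct = ColumnTransformer(
    [("binned",
      KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform"),
      [0, 1])],
    remainder="passthrough",
)
Xt_partial = ct.fit_transform(X)
print(Xt_partial)
```

The transformed columns come first in the output, followed by the passed-through columns.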
Examples
>>> from sklearn.preprocessing import KBinsDiscretizer
>>> X = [[-2, 1, -4,   -1],
...      [-1, 2, -3, -0.5],
...      [ 0, 3, -2,  0.5],
...      [ 1, 4, -1,    2]]
>>> est = KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform')
>>> est.fit(X)
KBinsDiscretizer(...)
>>> Xt = est.transform(X)
>>> Xt
array([[ 0., 0., 0., 0.],
       [ 1., 1., 1., 0.],
       [ 2., 2., 2., 1.],
       [ 2., 2., 2., 2.]])
Sometimes it may be useful to convert the data back into the original feature space. The inverse_transform method maps each bin index back to that space by replacing it with the center of its bin, that is, the mean of the bin's two edges.
>>> est.bin_edges_[0]
array([-2., -1., 0., 1.])
>>> est.inverse_transform(Xt)
array([[-1.5, 1.5, -3.5, -0.5],
       [-0.5, 2.5, -2.5, -0.5],
       [ 0.5, 3.5, -1.5, 0.5],
       [ 0.5, 3.5, -1.5, 1.5]])
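The bin-center rule can be checked by hand. The following NumPy-only sketch (an illustration, not the library's code) recomputes the first column of the output above from the edges of feature 0 and the ordinal codes in the first column of `Xt`.

```python
import numpy as np

edges = np.array([-2.0, -1.0, 0.0, 1.0])  # bin_edges_[0] from the example
codes = np.array([0, 1, 2, 2])            # first column of Xt, as integers

# Center of each bin = mean of its left and right edge.
centers = (edges[:-1] + edges[1:]) / 2

# Look up the center for every ordinal code.
restored = centers[codes]
print(restored)  # [-1.5 -0.5  0.5  0.5]
```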
| Method | Description |
|---|---|
| fit(X[, y]) | Fits the estimator. |
| fit_transform(X[, y]) | Fit to data, then transform it. |
| get_params([deep]) | Get parameters for this estimator. |
| inverse_transform(Xt) | Transforms discretized data back to original feature space. |
| set_params(**params) | Set the parameters of this estimator. |
| transform(X) | Discretizes the data. |
__init__(n_bins=5, encode='onehot', strategy='quantile') [source]
fit(X, y=None) [source]
Fits the estimator.
| Parameters: |
|
|---|---|
| Returns: |
|
fit_transform(X, y=None, **fit_params) [source]
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
| Parameters: |
|
|---|---|
| Returns: |
|
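For this estimator, fit_transform is expected to give the same result as calling fit and then transform on the same data. The short check below sketches that equivalence using the example data from this page; the variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[-2, 1, -4, -1],
              [-1, 2, -3, -0.5],
              [0, 3, -2, 0.5],
              [1, 4, -1, 2]], dtype=float)

# One call: fit and transform together.
one_step = KBinsDiscretizer(
    n_bins=3, encode="ordinal", strategy="uniform"
).fit_transform(X)

# Two calls: fit first, then transform the same data.
est = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform")
two_step = est.fit(X).transform(X)

print(np.array_equal(one_step, two_step))
```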
get_params(deep=True) [source]
Get parameters for this estimator.
| Parameters: |
|
|---|---|
| Returns: |
|
inverse_transform(Xt) [source]
Transforms discretized data back to original feature space.
Note that this method does not recover the original data exactly: discretization is lossy, so every value is mapped back to the center of its bin.
| Parameters: |
|
|---|---|
| Returns: |
|
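The loss of information can be seen in a round trip. In the sketch below, which reuses the example data from this page, the restored values differ from the input, yet discretizing them again yields the same bin indices.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[-2, 1, -4, -1],
              [-1, 2, -3, -0.5],
              [0, 3, -2, 0.5],
              [1, 4, -1, 2]], dtype=float)

est = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform").fit(X)
Xt = est.transform(X)
X_back = est.inverse_transform(Xt)

# The round trip does not reproduce X ...
is_lossy = not np.allclose(X_back, X)
# ... but the restored values fall into the same bins as the originals.
same_bins = np.array_equal(est.transform(X_back), Xt)
print(is_lossy, same_bins)
```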
set_params(**params) [source]
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
| Returns: |
|
|---|
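A brief sketch of both forms follows: a direct call on the estimator, and a nested update through a pipeline. The step names `"scale"` and `"bins"` are arbitrary labels chosen for this illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import KBinsDiscretizer, StandardScaler

# Simple estimator: parameters are addressed by their own names.
est = KBinsDiscretizer()
returned = est.set_params(n_bins=4, encode="ordinal")

# Nested object: parameters are addressed as <component>__<parameter>.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("bins", KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")),
])
pipe.set_params(bins__n_bins=3)

print(est.get_params()["n_bins"])        # 4
print(pipe.get_params()["bins__n_bins"]) # 3
```

Because set_params returns the estimator itself, calls can be chained.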
transform(X) [source]
Discretizes the data.
| Parameters: |
|
|---|---|
| Returns: |
|
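The shape and type of the transformed output depend on the encode parameter. As a hedged illustration with the example data from this page: `encode='ordinal'` keeps one column per feature, while the default `encode='onehot'` produces one column per bin of each feature and returns a sparse matrix.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[-2, 1, -4, -1],
              [-1, 2, -3, -0.5],
              [0, 3, -2, 0.5],
              [1, 4, -1, 2]], dtype=float)

# Ordinal encoding: one integer-valued column per feature.
ordinal = KBinsDiscretizer(
    n_bins=3, encode="ordinal", strategy="uniform"
).fit_transform(X)

# One-hot encoding: 4 features x 3 bins = 12 indicator columns (sparse).
onehot = KBinsDiscretizer(
    n_bins=3, encode="onehot", strategy="uniform"
).fit_transform(X)

print(ordinal.shape, onehot.shape, sparse.issparse(onehot))
```

Each row of the one-hot output contains exactly one active indicator per feature.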
sklearn.preprocessing.KBinsDiscretizer
© 2007–2018 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html