class sklearn.linear_model.HuberRegressor(epsilon=1.35, max_iter=100, alpha=0.0001, warm_start=False, fit_intercept=True, tol=1e-05) [source]
Linear regression model that is robust to outliers.
The Huber Regressor optimizes the squared loss for the samples where |(y - X'w) / sigma| < epsilon and the absolute loss for the samples where |(y - X'w) / sigma| > epsilon, where w and sigma are parameters to be optimized. Using the absolute loss for large residuals ensures that the loss function is not heavily influenced by the outliers, while not completely ignoring their effect.
The parameter sigma ensures that if y is scaled up or down by some factor, epsilon does not need to be rescaled to achieve the same robustness. Note that this does not account for the features of X themselves being on different scales.
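The piecewise loss described above can be sketched in NumPy. This is a minimal illustration, not the library's implementation: the helper name `huber_term` is hypothetical, the linear branch is written as `2 * epsilon * |z| - epsilon**2` (the usual Huber form, chosen so that the two pieces meet at `|z| = epsilon`), and both the joint optimization of sigma and the `alpha` penalty are left out.

```python
import numpy as np


def huber_term(z, epsilon=1.35):
    """Per-sample Huber term for scaled residuals z = (y - X @ w) / sigma.

    Hypothetical helper for illustration only.
    """
    z = np.asarray(z, dtype=float)
    abs_z = np.abs(z)
    quadratic = z ** 2                              # used where |z| < epsilon
    linear = 2.0 * epsilon * abs_z - epsilon ** 2   # used where |z| >= epsilon
    return np.where(abs_z < epsilon, quadratic, linear)


scaled_residuals = np.array([0.5, -1.0, 3.0, -10.0])
# Small residuals are penalized quadratically, large ones only linearly.
print(huber_term(scaled_residuals))
```

Because the penalty grows only linearly beyond epsilon, a single large residual contributes far less than it would under a pure squared loss.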
Read more in the User Guide
New in version 0.18.
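| Parameters: | |
|---|---|
| epsilon | float, greater than 1.0, default 1.35. Controls the number of samples that are classified as outliers. The smaller the epsilon, the more robust the fit is to outliers. |
| max_iter | int, default 100. Maximum number of iterations the underlying optimizer should run for. |
| alpha | float, default 0.0001. Regularization parameter. |
| warm_start | bool, default False. If True, the stored attributes of a previously fitted model are reused. If False, the coefficients are rewritten on every call to fit. |
| fit_intercept | bool, default True. Whether to fit the intercept. Can be set to False if the data is already centered around the origin. |
| tol | float, default 1e-05. Iteration stops when the largest absolute component of the projected gradient is at most tol. |

| Attributes: | |
|---|---|
| coef_ | array, shape (n_features,). Coefficients obtained by optimizing the Huber loss. |
| intercept_ | float. Bias. |
| scale_ | float. The value by which the absolute residual is scaled down. |
| n_iter_ | int. Number of iterations the optimizer has run for. |
| outliers_ | array, shape (n_samples,). Boolean mask that is True where samples are identified as outliers. |

| References: | |
|---|---|
| [1] | Peter J. Huber, Elvezio M. Ronchetti, Robust Statistics, Concomitant scale estimates, pg 172 |
| [2] | Art B. Owen (2006), A robust hybrid of lasso and ridge regression. http://statweb.stanford.edu/~owen/reports/hhu.pdf |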
>>> import numpy as np
>>> from sklearn.linear_model import HuberRegressor, LinearRegression
>>> from sklearn.datasets import make_regression
>>> np.random.seed(0)
>>> X, y, coef = make_regression(
...     n_samples=200, n_features=2, noise=4.0, coef=True, random_state=0)
>>> # Add four strong outliers to the dataset.
>>> X[:4] = np.random.uniform(10, 20, (4, 2))
>>> y[:4] = np.random.uniform(10, 20, 4)
>>> huber = HuberRegressor().fit(X, y)
>>> huber.score(X, y)
-7.284608623514573
>>> huber.predict(X[:1])
array([806.7200...])
>>> linear = LinearRegression().fit(X, y)
>>> print("True coefficients:", coef)
True coefficients: [20.4923...  34.1698...]
>>> print("Huber coefficients:", huber.coef_)
Huber coefficients: [17.7906... 31.0106...]
>>> print("Linear Regression coefficients:", linear.coef_)
Linear Regression coefficients: [-1.9221...  7.0226...]
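As a follow-up to the example above, the fitted estimator can be inspected to see which samples it treated as outliers. The sketch below rebuilds the same data with a local `RandomState` (an assumption made here to keep the block self-contained) and reads the `outliers_` and `scale_` attributes. No output is shown because the exact numbers depend on the installed version.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import HuberRegressor

# Rebuild the data: 200 samples, with the first four replaced by strong outliers.
rng = np.random.RandomState(0)
X, y = make_regression(n_samples=200, n_features=2, noise=4.0, random_state=0)
X[:4] = rng.uniform(10, 20, (4, 2))
y[:4] = rng.uniform(10, 20, 4)

huber = HuberRegressor().fit(X, y)

# Boolean mask, True where a sample falls in the absolute-loss regime.
outlier_mask = huber.outliers_
print("Estimated scale sigma:", huber.scale_)
print("Number of flagged samples:", outlier_mask.sum())
print("Flags for the four corrupted samples:", outlier_mask[:4])
```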
| Method | Description |
|---|---|
| fit(X, y[, sample_weight]) | Fit the model according to the given training data. |
| get_params([deep]) | Get parameters for this estimator. |
| predict(X) | Predict using the linear model. |
| score(X, y[, sample_weight]) | Returns the coefficient of determination R^2 of the prediction. |
| set_params(**params) | Set the parameters of this estimator. |
__init__(epsilon=1.35, max_iter=100, alpha=0.0001, warm_start=False, fit_intercept=True, tol=1e-05) [source]
fit(X, y, sample_weight=None) [source]
Fit the model according to the given training data.
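| Parameters: | |
|---|---|
| X | array-like, shape (n_samples, n_features). Training vectors, where n_samples is the number of samples and n_features is the number of features. |
| y | array-like, shape (n_samples,). Target vector relative to X. |
| sample_weight | array-like, shape (n_samples,), default None. Weight given to each sample. |

| Returns: | |
|---|---|
| self | object. The fitted estimator. |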
get_params(deep=True) [source]
Get parameters for this estimator.
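| Parameters: | |
|---|---|
| deep | bool, default True. If True, also return the parameters of contained sub-objects that are estimators. |

| Returns: | |
|---|---|
| params | mapping of string to any. Parameter names mapped to their values. |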
predict(X) [source]
Predict using the linear model.
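| Parameters: | |
|---|---|
| X | array-like or sparse matrix, shape (n_samples, n_features). Samples. |

| Returns: | |
|---|---|
| C | array, shape (n_samples,). Predicted values. |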
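For a linear model, prediction amounts to a matrix-vector product plus the intercept. The following sketch, on synthetic data chosen only for illustration, checks that `predict` agrees with computing `X @ coef_ + intercept_` by hand.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import HuberRegressor

# Small synthetic problem (illustrative data, not from the documentation).
X, y = make_regression(n_samples=50, n_features=2, noise=1.0, random_state=0)
model = HuberRegressor().fit(X, y)

# Manual prediction from the fitted coefficients and intercept.
manual = X @ model.coef_ + model.intercept_
print(np.allclose(model.predict(X), manual))
```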
score(X, y, sample_weight=None) [source]
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than the constant baseline. A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
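| Parameters: | |
|---|---|
| X | array-like, shape (n_samples, n_features). Test samples. |
| y | array-like, shape (n_samples,) or (n_samples, n_outputs). True values for X. |
| sample_weight | array-like, shape (n_samples,), default None. Sample weights. |

| Returns: | |
|---|---|
| score | float. R^2 of self.predict(X) with respect to y. |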
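The R^2 definition above can be reproduced by hand. This sketch uses synthetic data chosen only for illustration, computes u and v as defined, and compares the result with `score`. It also evaluates the constant-mean baseline that the text says scores 0.0.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import HuberRegressor

# Illustrative data; any regression dataset would do.
X, y = make_regression(n_samples=100, n_features=2, noise=4.0, random_state=0)
model = HuberRegressor().fit(X, y)
y_pred = model.predict(X)

u = ((y - y_pred) ** 2).sum()      # residual sum of squares
v = ((y - y.mean()) ** 2).sum()    # total sum of squares
r2_manual = 1 - u / v

# Baseline: always predict the mean of y, ignoring the features.
u_const = ((y - np.full_like(y, y.mean())) ** 2).sum()
r2_const = 1 - u_const / v

print(r2_manual, model.score(X, y), r2_const)
```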
set_params(**params) [source]
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form `<component>__<parameter>` so that it’s possible to update each component of a nested object.
| Returns: | |
|---|---|
| self | The estimator itself. |
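The nested-parameter form can be sketched with a pipeline. The choice of `StandardScaler` as a preceding step is an assumption for illustration. It also happens to address the earlier note that the estimator does not account for features on different scales. `make_pipeline` names each step after its lowercased class name, which gives the `huberregressor__` prefix used below.

```python
from sklearn.linear_model import HuberRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Nested object: the regressor lives inside a pipeline step named "huberregressor".
pipe = make_pipeline(StandardScaler(), HuberRegressor())
pipe.set_params(huberregressor__epsilon=1.5, huberregressor__alpha=0.001)

huber_step = pipe.named_steps["huberregressor"]
print(huber_step.get_params()["epsilon"], huber_step.get_params()["alpha"])

# Simple estimator: parameters are set by their plain names.
est = HuberRegressor().set_params(max_iter=200)
print(est.get_params()["max_iter"])
```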
sklearn.linear_model.HuberRegressor
    © 2007–2018 The scikit-learn developers
Licensed under the 3-clause BSD License.
    http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.HuberRegressor.html