What is Ridge Regression in Machine Learning?
Ridge regression (also called Tikhonov regularization) is a regularized version of Linear Regression: a regularization term is added to the cost function, which forces the learning algorithm to not only fit the data but also keep the model weights as small as possible.
The hyperparameter alpha controls how much you want to regularize the model. If alpha = 0, Ridge Regression is just Linear Regression. If alpha is very large, all weights end up very close to zero and the result is a flat line through the data's mean.
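A minimal sketch of these two extremes, using synthetic data (the feature matrix, true weights, and alpha values below are illustrative assumptions, not from the article): with a tiny alpha the fitted coefficients recover the true weights, while a huge alpha shrinks them almost to zero.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_weights = np.array([2.0, -1.0, 0.5])
y = X @ true_weights + rng.normal(scale=0.1, size=100)

# near-zero alpha: essentially ordinary Linear Regression
coefs_small = Ridge(alpha=1e-4).fit(X, y).coef_

# huge alpha: the penalty dominates and shrinks all weights toward zero
coefs_large = Ridge(alpha=1e6).fit(X, y).coef_

print(coefs_small)  # close to the true weights
print(coefs_large)  # all very close to zero
```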
How to train a Ridge Regression Model in Sklearn?
Let’s read a dataset to work with.
import pandas as pd
import numpy as np
from sklearn import datasets

housing = datasets.fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = housing.target
X.head()
Now split the data into a training and test set.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Now, let’s train a Ridge Regression model in sklearn.
Note – It is important to scale the data (e.g., using StandardScaler) before performing Ridge Regression, as it is sensitive to the scale of the input features. This is true of most regularized models.
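To see why scaling matters, here is a small sketch on synthetic data (the features, scales, and alpha below are illustrative assumptions): when one feature lives on a much larger scale, its coefficient is naturally tiny, so the penalty barely touches it while the other feature's coefficient gets shrunk. After standardizing, both are penalized evenly.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X[:, 1] *= 1000.0  # second feature on a much larger scale
y = X[:, 0] + X[:, 1] / 1000.0 + rng.normal(scale=0.1, size=200)

# without scaling: the penalty shrinks only the small-scale feature's weight
raw = Ridge(alpha=100.0).fit(X, y)

# with scaling: both standardized coefficients are shrunk by the same factor
scaled = make_pipeline(StandardScaler(), Ridge(alpha=100.0)).fit(X, y)
scaled_coefs = scaled.named_steps['ridge'].coef_

print(raw.coef_)      # first weight noticeably shrunk, second barely touched
print(scaled_coefs)   # both shrunk about equally
```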
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline

# create a ridge regression model
ridge_reg = make_pipeline(StandardScaler(), Ridge(alpha=1, solver='cholesky'))

# train it on the training data
ridge_reg.fit(X_train, y_train)

# make predictions on the test set
y_pred = ridge_reg.predict(X_test)

# measure error
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
# output 0.7455567442814783
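The alpha above is fixed at 1, but in practice you would tune it. One common approach (not shown in the article, so this is a sketch on synthetic data rather than the housing set) is sklearn's RidgeCV, which picks the best alpha from a list via built-in cross-validation:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=200)

candidate_alphas = [0.01, 0.1, 1.0, 10.0]

# RidgeCV evaluates each candidate alpha with cross-validation and keeps the best
model = make_pipeline(StandardScaler(), RidgeCV(alphas=candidate_alphas))
model.fit(X, y)

print(model.named_steps['ridgecv'].alpha_)  # the selected alpha
```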
You can also train a Ridge Regression model using Stochastic Gradient Descent.
from sklearn.linear_model import SGDRegressor

sgd_reg = make_pipeline(StandardScaler(), SGDRegressor(penalty='l2'))
sgd_reg.fit(X_train, y_train)
y_pred = sgd_reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
# output 0.7529495372586685
The penalty hyperparameter sets the type of regularization term to use. Specifying 'l2' tells SGD to add a regularization term to the cost function equal to half the square of the ℓ2 norm of the weight vector, which is simply Ridge Regression.
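Because the ridge cost function is quadratic, it also has a closed-form solution, theta = (XᵀX + αI)⁻¹Xᵀy, which the 'cholesky' solver used earlier exploits. A quick sketch on synthetic data (intercept omitted for simplicity; an assumption for this illustration) confirms it matches sklearn's Ridge:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
alpha = 1.0

# closed-form ridge solution: theta = (X^T X + alpha*I)^(-1) X^T y
theta = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)

# sklearn's Ridge with no intercept solves the same system
ridge = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

print(np.allclose(theta, ridge.coef_))  # True
```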