## What is Ridge Regression in Machine Learning?

Ridge regression (also called Tikhonov regularization) is a regularized version of Linear Regression: a regularization term is added to the cost function, which forces the learning algorithm not only to fit the data but also to keep the model weights as small as possible.

The hyperparameter **alpha** controls how much you want to regularize the model. If alpha = 0, Ridge Regression is just Linear Regression. If alpha is very large, all weights end up very close to zero and the result is a flat line through the data's mean.
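To see this effect, here is a minimal sketch on a small synthetic dataset (the data and the true weights `[3, -2, 1]` are made up for illustration): a tiny alpha recovers essentially the Linear Regression solution, while a huge alpha shrinks every coefficient toward zero.

```
import numpy as np
from sklearn.linear_model import Ridge

# synthetic data: y is a known linear combination of X plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=100)

# tiny alpha: behaves almost exactly like plain Linear Regression
small = Ridge(alpha=0.01).fit(X, y)
# very large alpha: heavy regularization crushes the weights
large = Ridge(alpha=1e6).fit(X, y)

print(small.coef_)  # close to the true weights [3, -2, 1]
print(large.coef_)  # all weights pushed very near zero
```

(With a very large alpha the model predicts roughly the intercept, i.e. the mean of y, for every input.)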

## How to train a Ridge Regression Model in Sklearn?

Let’s read a dataset to work with.

```
import pandas as pd
import numpy as np
from sklearn import datasets
housing = datasets.fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = housing.target
X.head()
```

Now split the data into a training and test set.

```
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

Now, let’s train a Ridge Regression model in sklearn.

**Note:** It is important to scale the data (e.g., using `StandardScaler`) before performing Ridge Regression, as it is sensitive to the scale of the input features. This is true of most regularized models.

```
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
# create a ridge regression model
ridge_reg = make_pipeline(StandardScaler(), Ridge(alpha=1, solver='cholesky'))
# train it on the training data
ridge_reg.fit(X_train, y_train)
# make predictions on the test set
y_pred = ridge_reg.predict(X_test)
# measure error
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
```

```
# output
0.7455567442814783
```

You can also train a Ridge Regression model using Stochastic Gradient Descent.

```
from sklearn.linear_model import SGDRegressor
sgd_reg = make_pipeline(StandardScaler(), SGDRegressor(penalty='l2'))
sgd_reg.fit(X_train, y_train)
y_pred = sgd_reg.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
```

```
# output
0.7529495372586685
```

The **penalty** hyperparameter sets the type of regularization term to use. Specifying **'l2'** indicates that you want SGD to add a regularization term to the cost function equal to half the square of the l2 norm of the weight vector, which is simply Ridge Regression.
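As a sketch of what that cost looks like, here is an illustrative computation of MSE plus the l2 term (this is not sklearn's internal implementation, which differs in details such as intercept handling and per-sample updates):

```
import numpy as np

def ridge_cost(w, X, y, alpha):
    # mean squared error on the data
    residuals = X @ w - y
    mse = np.mean(residuals ** 2)
    # l2 regularization term: alpha times half the squared l2 norm of w
    penalty = alpha * 0.5 * np.sum(w ** 2)
    return mse + penalty

w = np.array([3.0, -2.0])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([3.0, -2.0, 1.0])
# here w fits the data exactly, so the cost is purely the penalty:
# 1.0 * 0.5 * (3^2 + (-2)^2) = 6.5
print(ridge_cost(w, X, y, alpha=1.0))
```

Note that the intercept is conventionally left out of the penalty, which is why only the feature weights appear in the regularization term.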