Introduction to Polynomial Regression in Machine Learning

What is a Polynomial?

In general, a polynomial is the sum of a finite number of terms, where each term is a coefficient multiplied by a variable raised to a non-negative integer power.

The equation of a line y = mx + b is an example of a polynomial.

The degree of a polynomial is the highest exponent that the variable x is raised to.

For example, y = 2x**3 + 8x**2 - 40 has degree 3, since 3 is the highest exponent that the variable is raised to.
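As a quick check of the example above, here is a minimal sketch that builds the same polynomial with NumPy's `Polynomial` class and reads off its degree. The variable names `p`, `degree`, and `value_at_1` are only illustrative.

```python
from numpy.polynomial import Polynomial

# Coefficients are listed from the lowest power to the highest:
# -40 + 0*x + 8*x**2 + 2*x**3
p = Polynomial([-40, 0, 8, 2])

degree = p.degree()   # highest exponent with a stored coefficient
value_at_1 = p(1)     # 2*1**3 + 8*1**2 - 40

print(degree)
print(value_at_1)
```

Note that `Polynomial` expects coefficients in ascending order of power, which is the reverse of how the polynomial is usually written.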

What is Polynomial Regression in Machine Learning?

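Polynomial regression is used when we want to fit a linear model to a nonlinear dataset. It does this by adding powers of each feature as new features (Scikit-Learn's PolynomialFeatures also adds the products of features up to the chosen degree) and then training a linear model on this extended set of features. For degree d, PolynomialFeatures transforms an array containing n features into an array containing (n + d)! / (d! * n!) features. This count includes the constant bias column, so with include_bias=False there is one feature fewer.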
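To make the feature count concrete, the following sketch compares the formula with what `PolynomialFeatures` reports on a tiny array. The array `X_toy` and its values are made up purely for illustration.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Toy input: 3 samples, n = 2 features (values are arbitrary).
X_toy = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
n, d = X_toy.shape[1], 2

with_bias = PolynomialFeatures(degree=d, include_bias=True).fit(X_toy)
without_bias = PolynomialFeatures(degree=d, include_bias=False).fit(X_toy)

expected = comb(n + d, d)  # (n + d)! / (d! * n!)

print(expected)                         # count given by the formula
print(with_bias.n_output_features_)     # includes the constant column
print(without_bias.n_output_features_)  # one fewer, no constant column
```

For n = 2 and d = 2 the six columns are 1, a, b, a^2, a*b, and b^2. Dropping the bias column leaves five.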

How to train a Polynomial Regression Model in Scikit-Learn?

Let’s load a dataset to work with.

import pandas as pd
import numpy as np
from sklearn import datasets

housing = datasets.fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = housing.target
X.head()

Now, let’s split the data into a training and test set.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Now, we will train a polynomial regression model and measure the root mean squared error (RMSE) on the test set.

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# create polynomial features
poly_features = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly_features.fit_transform(X_train)
X_test_poly = poly_features.transform(X_test)

# create a linear regression model
lin_reg = LinearRegression()
lin_reg.fit(X_train_poly, y_train)

# predict on the test set
y_pred = lin_reg.predict(X_test_poly)

# measure error
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmse
# output: 0.68139674479311
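The two steps used above (expanding the features, then fitting a linear model) can also be chained into a single estimator. The sketch below is one possible way to do that with `make_pipeline`. It uses a small synthetic quadratic dataset rather than the housing data, and the names `X_syn`, `y_syn`, `linear_model`, and `poly_model` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for this sketch): y = 0.5*x**2 + x + 2 + noise
rng = np.random.default_rng(42)
X_syn = rng.uniform(-3, 3, size=(200, 1))
y_syn = 0.5 * X_syn[:, 0] ** 2 + X_syn[:, 0] + 2 + rng.normal(0, 0.5, size=200)

Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)

# Baseline: a straight line fitted to curved data
linear_model = LinearRegression().fit(Xs_train, ys_train)

# Polynomial regression: feature expansion and linear fit in one pipeline
poly_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(Xs_train, ys_train)

rmse_linear = np.sqrt(mean_squared_error(ys_test, linear_model.predict(Xs_test)))
rmse_poly = np.sqrt(mean_squared_error(ys_test, poly_model.predict(Xs_test)))

print(rmse_linear, rmse_poly)
```

Because the pipeline applies the same fitted transformation at prediction time, there is no need to call transform on the test set by hand. On data generated this way, the degree 2 model should give a lower test RMSE than the plain linear fit.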
