# Introduction

Quadratic regression, or polynomial regression of order 2, is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modeled as a second-degree polynomial. In quadratic regression, we aim to find the best-fitting curve, a parabola, for a set of data points.

Quadratic regression extends the simple linear regression model, which models the relationship between x and y as a straight line, by adding a quadratic term to the equation of the line, giving a model of the form y = a*x^2 + b*x + c. This additional term allows the model to capture nonlinear relationships between x and y.

In this article, we’ll walk through a comprehensive guide on how to perform quadratic regression in Python using different libraries such as NumPy, SciPy, statsmodels, and scikit-learn.

## Quadratic Regression with NumPy

We can perform quadratic regression in Python using NumPy's polyfit function. This function fits a polynomial of a specified degree to a set of data using the method of least squares, and returns the coefficients of the polynomial.

Here is an example of how to use polyfit to perform quadratic regression:

```python
import numpy as np
import matplotlib.pyplot as plt

# Define the data
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])

# Fit a second-degree polynomial and build a callable polynomial
coefficients = np.polyfit(x, y, 2)
polynomial = np.poly1d(coefficients)

# Plot the original data and the polynomial fit
plt.scatter(x, y)
plt.plot(x, polynomial(x), color='red')
plt.show()
```

In this example, np.polyfit(x, y, 2) fits a second-degree polynomial (a parabola) to the data. The function np.poly1d(coefficients) creates a polynomial function from the coefficients returned by polyfit, which we can then use to compute the y-values for the polynomial fit.
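Since the sample data above lies exactly on y = x^2, we can sanity-check the fit by inspecting the returned coefficients; polyfit lists them from the highest degree down. A minimal sketch using the same data:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])

# polyfit returns coefficients highest-degree first: [a, b, c]
coefficients = np.polyfit(x, y, 2)
print(coefficients)  # close to [1, 0, 0], since y = x^2 exactly
```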

## Quadratic Regression with SciPy

We can also use the curve_fit function from the SciPy library to perform quadratic regression. This function fits a function of your choice to a set of data using the method of least squares.

Here is an example of how to use curve_fit to perform quadratic regression:

```python
from scipy.optimize import curve_fit

# Define the form of the function we want to fit
def quadratic(x, a, b, c):
    return a * x**2 + b * x + c

params, params_covariance = curve_fit(quadratic, x, y)

# Print the coefficients
print(params)
```

In this example, we define the function quadratic(x, a, b, c), which corresponds to the equation of a parabola. We then pass this function, along with our data, to curve_fit, which returns the coefficients that best fit our data.
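curve_fit also accepts an initial guess via its p0 parameter, which can help the optimizer converge on harder problems. A self-contained sketch on the same data, checking that the fitted coefficients match the known curve:

```python
import numpy as np
from scipy.optimize import curve_fit

def quadratic(x, a, b, c):
    return a * x**2 + b * x + c

x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])

# p0 is an optional starting point for the nonlinear least-squares solver
params, params_covariance = curve_fit(quadratic, x, y, p0=[1.0, 0.0, 0.0])
a, b, c = params
print(a, b, c)  # close to 1, 0, 0 for this data
```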

## Quadratic Regression with scikit-learn

scikit-learn provides the PolynomialFeatures class for transforming our input data, allowing us to fit a linear model to the transformed data to perform polynomial regression:

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Reshape the data to fit the model
x = x.reshape(-1, 1)
y = y.reshape(-1, 1)

# Transform the data
poly = PolynomialFeatures(degree=2)
x_poly = poly.fit_transform(x)

# Fit the model
model = LinearRegression()
model.fit(x_poly, y)

# Predict y values
y_pred = model.predict(x_poly)

# Plot the original data and the polynomial fit
plt.scatter(x, y)
plt.plot(x, y_pred, color='red')
plt.show()
```

In this example, PolynomialFeatures(degree=2) generates a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
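We can verify that expansion directly with a single two-dimensional sample; a quick check:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample [a, b] = [2, 3]
sample = np.array([[2, 3]])
poly = PolynomialFeatures(degree=2)
print(poly.fit_transform(sample))  # [[1, 2, 3, 4, 6, 9]] -> [1, a, b, a^2, ab, b^2]
```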

## Quadratic Regression with statsmodels

Statsmodels is a powerful Python library for statistics and econometrics. It also allows us to perform quadratic regression by adding a quadratic term to our model:

```python
import pandas as pd
import statsmodels.api as sm

# Create a DataFrame
df = pd.DataFrame({
    'x': x.flatten(),
    'y': y.flatten()
})

# Add the quadratic term
df['x_squared'] = df['x'] ** 2

# Define our dependent variable
y = df['y']
# Define our independent variables, with an intercept column
X = sm.add_constant(df[['x', 'x_squared']])

# Fit the model and print the summary
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())
```