# How to Perform Quadratic Regression in R

Quadratic regression is a type of polynomial regression. Polynomial regression models the relationship between a dependent and an independent variable as an nth-degree polynomial; quadratic regression is the 2nd-degree case. It’s often used when the relationship between the variables shows curvature that a straight line can’t capture. In this guide, we’ll explore how to fit and interpret a quadratic regression in R.

Quadratic regression is given by the equation:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$$

Here:

• Y is the dependent variable.
• X is the independent variable.
• β0, β1, β2 are coefficients.
• ϵ represents the error term.

The X² term introduces the polynomial component, allowing the model to capture U-shaped or inverted-U-shaped patterns.
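As a small worked step (an addition to the original text, using the same symbols), differentiating the mean of the model with respect to X shows how the slope changes along the curve and where the curve turns:

```latex
\frac{\partial\, \mathbb{E}[Y \mid X]}{\partial X} = \beta_1 + 2\beta_2 X,
\qquad
X^{*} = -\frac{\beta_1}{2\beta_2} \quad (\beta_2 \neq 0)
```

When β2 > 0 the curve is U-shaped with its minimum at X*; when β2 < 0 it is an inverted U with its maximum there.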

### 2. Setting Up the R Environment

Ensure you have R and optionally RStudio installed.

#### Step 1: Install Required Packages

install.packages("ggplot2")

#### Step 2: Load Necessary Libraries

library(ggplot2)

Using R’s built-in mtcars dataset, we’ll try to model a quadratic relationship between mpg (miles per gallon) and hp (horsepower).

#### Step 1: Visualizing the Data

A scatter plot can help in visualizing the relationship:

ggplot(mtcars, aes(x=hp, y=mpg)) +
  geom_point() +
  ggtitle("Scatter plot of mpg vs. hp")
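To judge by eye whether a curve is plausible, the same scatter plot can carry a quadratic trend line. The following is a minimal sketch, assuming ggplot2 is installed; the object name `p` and the plot title are illustrative choices, not part of the original walkthrough.

```r
library(ggplot2)

# Scatter plot with a quadratic trend line drawn by geom_smooth().
# method = "lm" with formula y ~ x + I(x^2) fits a 2nd-degree polynomial
# to the plotted x and y (here hp and mpg); se = TRUE adds a confidence band.
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x + I(x^2), se = TRUE) +
  ggtitle("mpg vs. hp with a quadratic trend")

print(p)
```

Inside `geom_smooth()`, the formula is written in terms of the aesthetics `x` and `y`, not the column names.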

#### Step 2: Fitting the Quadratic Model

To incorporate the quadratic term, we’ll add an I(hp^2) term to our formula in lm(). The I() wrapper tells R to treat ^ as ordinary arithmetic rather than as formula syntax:

quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
summary(quad_model)
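The quadratic term can be written in more than one way. The sketch below compares `I(hp^2)` with base R’s `poly()` and shows what happens if `I()` is left out; the model names (`m_I`, `m_raw`, `m_orth`, `m_wrong`) are illustrative.

```r
# Three equivalent ways to specify a quadratic fit in lm().
m_I    <- lm(mpg ~ hp + I(hp^2), data = mtcars)             # explicit squared term
m_raw  <- lm(mpg ~ poly(hp, 2, raw = TRUE), data = mtcars)  # raw polynomial basis
m_orth <- lm(mpg ~ poly(hp, 2), data = mtcars)              # orthogonal polynomial basis

# The raw basis reproduces the I(hp^2) coefficients (only the names differ).
same_coefs <- all.equal(unname(coef(m_I)), unname(coef(m_raw)))

# The orthogonal basis has different coefficients but identical fitted values.
same_fit <- all.equal(fitted(m_I), fitted(m_orth))

# Without I(), ^ is formula syntax: hp^2 collapses to hp,
# so this is just the plain linear model mpg ~ hp.
m_wrong <- lm(mpg ~ hp + hp^2, data = mtcars)
n_wrong <- length(coef(m_wrong))  # intercept and hp only

print(same_coefs)
print(same_fit)
print(n_wrong)
```

The orthogonal form can be numerically more stable when the predictor takes large values, but its coefficients are harder to read directly, so `I(hp^2)` or `raw = TRUE` is usually easier for interpretation.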

### 4. Interpreting the Results

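• Coefficients: In a quadratic model the effect of a one-unit change in hp is not constant; it depends on the current value of hp. The hp coefficient is the slope of the curve at hp = 0, and the coefficient for I(hp^2) sets the curvature: a positive value gives a U shape, a negative value an inverted U.
• R-squared: The proportion of the variance in mpg explained by the model. A value closer to 1 denotes a better fit. When comparing against a plain linear model, adjusted R-squared is the fairer measure because it penalizes the extra term.
• p-value: A lower p-value (typically ≤ 0.05) for a coefficient suggests it’s statistically significant. For the I(hp^2) term, this suggests the curvature improves on a straight line.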
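The quantities listed above can be pulled out of the fitted model programmatically. The following sketch refits the model so that it runs on its own; `lin_model`, `slope_at`, and `turning_point` are hypothetical helper names introduced here for illustration.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
lin_model  <- lm(mpg ~ hp, data = mtcars)   # straight-line baseline

s <- summary(quad_model)
print(s$coefficients)    # estimates, standard errors, t values, p-values
print(s$r.squared)       # R-squared
print(s$adj.r.squared)   # adjusted R-squared

# Nested-model F test: does adding the squared term improve on the linear fit?
cmp <- anova(lin_model, quad_model)
print(cmp)

# Slope of the fitted curve at a given hp: b1 + 2 * b2 * hp
b <- coef(quad_model)
slope_at <- function(hp) unname(b["hp"] + 2 * b["I(hp^2)"] * hp)
print(slope_at(c(100, 200)))

# hp value where the fitted parabola turns: -b1 / (2 * b2)
turning_point <- unname(-b["hp"] / (2 * b["I(hp^2)"]))
print(turning_point)
```

A turning point near or beyond the edge of the observed hp range is a reminder that the fitted parabola should not be trusted far outside the data.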

### 5. Checking Assumptions

Quadratic regression, like linear regression, requires certain assumptions to be met:

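1. Linearity in Parameters: Even though the relationship between the variables is quadratic, the model must be linear in its coefficients: each β enters the equation only as a multiplier of a term (X or X²). This is what allows lm() to estimate the model by ordinary least squares.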

2. Independence: Observations should be independent of each other.

3. Homoscedasticity: Residuals should have constant variance. Plotting residuals can help check this:

plot(quad_model$residuals, ylab="Residuals", main="Residual Plot")
abline(h=0, col="red")

4. Normality of Residuals: The residuals should be normally distributed. A Q-Q plot can check this:

qqnorm(quad_model$residuals)
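The checks above can be extended slightly. This sketch plots residuals against fitted values (a common way to look for non-constant variance), adds a reference line to the Q-Q plot, and runs a Shapiro-Wilk test as one possible formal check of normality. The choice of test and the names `res` and `sw` are assumptions made for illustration.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
res <- residuals(quad_model)

# Residuals vs. fitted values: a funnel shape would suggest heteroscedasticity,
# a leftover curve would suggest the quadratic form is still inadequate.
plot(fitted(quad_model), res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. Fitted")
abline(h = 0, col = "red")

# Q-Q plot with a reference line; points far from the line suggest non-normality.
qqnorm(res)
qqline(res, col = "red")

# Shapiro-Wilk test: a small p-value is evidence against normal residuals.
sw <- shapiro.test(res)
print(sw$p.value)
```

Base R’s `plot(quad_model, which = 1:2)` produces similar residual and Q-Q diagnostics in one call.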

#### Making Predictions:

For new data points:

new_data <- data.frame(hp=c(100, 150))
predicted_values <- predict(quad_model, newdata=new_data)
print(predicted_values)
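Point predictions say nothing about uncertainty. As a sketch, `predict()` can also return 95% prediction intervals, and the point predictions can be checked by hand against the fitted equation. The names `point_pred`, `pred_int`, and `manual` are illustrative.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
new_data <- data.frame(hp = c(100, 150))

# Point predictions for the new hp values.
point_pred <- predict(quad_model, newdata = new_data)

# Predictions with 95% prediction intervals (columns: fit, lwr, upr).
pred_int <- predict(quad_model, newdata = new_data,
                    interval = "prediction", level = 0.95)
print(pred_int)

# Manual check: b0 + b1 * hp + b2 * hp^2 should match predict().
b <- coef(quad_model)
manual <- b[1] + b[2] * new_data$hp + b[3] * new_data$hp^2
print(all.equal(unname(point_pred), unname(manual)))
```

Both new values lie inside the range of hp observed in mtcars; predictions far outside that range rely on the parabola continuing to hold, which the data cannot confirm.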

#### Advantages:

1. Can capture non-linear relationships.
2. Doesn’t require transformation of variables.