Quadratic regression is a special case of polynomial regression, which models the relationship between a dependent variable and an independent variable as an nth-degree polynomial. In quadratic regression, the polynomial has degree 2. It’s often used when the relationship between the variables is curved rather than strictly linear. In this guide, we’ll explore how to fit and interpret a quadratic regression in R.

### 1. Understanding Quadratic Regression

Quadratic regression is given by the equation:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$$

Here:

- Y is the dependent variable.
- X is the independent variable.
- β0, β1, and β2 are the coefficients (intercept, linear term, and quadratic term).
- ϵ represents the error term.

The X² term introduces the polynomial component, allowing the model to capture U-shaped patterns (β2 > 0) or inverted-U-shaped patterns (β2 < 0).

### 2. Setting Up the R Environment

Ensure you have R and optionally RStudio installed.

#### Step 1: Install Required Packages

`install.packages("ggplot2")`

#### Step 2: Load Necessary Libraries

`library(ggplot2)`

### 3. Performing Quadratic Regression

Using R’s built-in `mtcars` dataset, we’ll try to model a quadratic relationship between `mpg` (miles per gallon) and `hp` (horsepower).

#### Step 1: Visualizing the Data

A scatter plot can help in visualizing the relationship:

```
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  ggtitle("Scatter plot of mpg vs. hp")
```

#### Step 2: Fitting the Quadratic Model

To incorporate the quadratic term, we’ll add an `I(hp^2)` term to our formula in `lm()`. The `I()` wrapper makes R treat `^` as ordinary arithmetic rather than as a formula operator:

```
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
summary(quad_model)
```
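One way to judge whether the squared term is worth keeping is to compare the quadratic model with a plain linear one. The sketch below is illustrative rather than prescriptive: the names `lin_model`, `comparison`, and `quad_poly` are introduced here and are not part of the original walkthrough. It also shows that `poly(hp, 2, raw = TRUE)` is an alternative spelling of the same model.

```r
# Fit the straight-line and quadratic models on the same data
lin_model  <- lm(mpg ~ hp, data = mtcars)
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)

# F-test for nested models: does adding hp^2 significantly reduce residual error?
comparison <- anova(lin_model, quad_model)
print(comparison)

# The same quadratic model written with raw polynomial terms
quad_poly <- lm(mpg ~ poly(hp, 2, raw = TRUE), data = mtcars)

# Coefficients should agree up to numerical tolerance (names differ, so drop them)
same_fit <- isTRUE(all.equal(unname(coef(quad_model)), unname(coef(quad_poly))))
print(same_fit)
```

A small p-value in the second row of the `anova()` table suggests the curvature term improves on the straight line.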

### 4. Interpreting the Results

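- **Coefficients**: The `hp` and `I(hp^2)` coefficients jointly describe the curve, so neither can be read on its own as a fixed change in `mpg` per unit of `hp`; the slope at a given horsepower is β1 + 2·β2·hp. The sign of the `I(hp^2)` coefficient indicates the curvature: positive means the curve opens upward (U-shaped), negative means it opens downward.
- **R-squared**: Indicates how well the model fits the data, as the proportion of variance in `mpg` it explains. A value closer to 1 denotes a better fit. When comparing against a linear model, prefer adjusted R-squared, because adding a term never lowers plain R-squared.
- **p-value**: A lower p-value (typically ≤ 0.05) for a coefficient suggests it’s statistically significant. For the `I(hp^2)` term, this indicates that the curvature improves on a straight-line fit.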
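To make the role of the coefficients concrete, the following sketch computes the slope of the fitted curve at chosen horsepower values and the turning point of the parabola. The helper `marginal_effect()` and the variable `vertex_hp` are hypothetical names used only for illustration.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)

# b[1] = intercept, b[2] = coefficient on hp, b[3] = coefficient on hp^2
b <- unname(coef(quad_model))

# Slope of the fitted curve at a given horsepower: d(mpg)/d(hp) = b1 + 2 * b2 * hp
marginal_effect <- function(hp) b[2] + 2 * b[3] * hp
print(marginal_effect(c(100, 200)))

# Turning point of the parabola, where the slope is zero
vertex_hp <- -b[2] / (2 * b[3])
print(vertex_hp)

# Check whether the turning point lies inside the observed data range
print(range(mtcars$hp))
```

If the turning point falls outside or near the edge of the observed range, the apparent reversal of the trend rests on few or no data points and should be read with caution.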

### 5. Checking Assumptions

Quadratic regression, like linear regression, requires certain assumptions to be met:

1. **Linearity in Parameters**: Even though the relationship between the variables is quadratic, the model must be linear in the coefficients β. The dependent variable is a weighted sum of 1, X, and X², which is why `lm()` can fit it.

2. **Independence**: Observations should be independent of each other.

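3. **Homoscedasticity**: Residuals should have constant variance. Plotting residuals against fitted values can help check this; the points should form a band of roughly even width around zero, with no funnel shape:

```
plot(fitted(quad_model), resid(quad_model),
     xlab="Fitted values", ylab="Residuals", main="Residuals vs. Fitted")
abline(h=0, col="red")
```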

4. **Normality of Residuals**: The residuals should be normally distributed. A Q-Q plot can check this:

```
qqnorm(quad_model$residuals)
qqline(quad_model$residuals)
```
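The visual checks above can be supplemented with R’s built-in diagnostics. The sketch below is one possible approach, not the only one: it draws the scale-location plot that `plot()` provides for `lm` objects and runs a Shapiro-Wilk test on the residuals. The names `res` and `sw` are illustrative.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
res <- resid(quad_model)

# Scale-location plot: a roughly flat trend suggests constant residual variance
plot(quad_model, which = 3)

# Shapiro-Wilk test: a small p-value suggests the residuals depart from normality
sw <- shapiro.test(res)
print(sw)
```

With only 32 observations in `mtcars`, formal tests have limited power, so it is reasonable to read the test alongside the plots rather than in place of them.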

### 6. Visualizing the Quadratic Fit

A graph can show how well the quadratic model fits the data:

```
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE, color = "red") +
  ggtitle("Quadratic Fit of mpg vs. hp")
```

### 7. Model Validation and Prediction

#### Model Validation:

Use the `predict()` function to get fitted values for the original data, which can then be compared with the observed `mpg`:

`mtcars$predicted_mpg <- predict(quad_model, mtcars)`
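Fitted values on the training data tend to flatter the model, so a held-out error estimate is a useful complement. The sketch below computes the in-sample RMSE and a leave-one-out RMSE using the hat-value identity for linear models; the variable names (`rmse_in`, `loo_res`, `rmse_loo`) are assumptions made for this example.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)

# In-sample root mean squared error
rmse_in <- sqrt(mean(resid(quad_model)^2))

# Leave-one-out residuals without refitting: for a linear model,
# e_i / (1 - h_ii) equals the prediction error when observation i is held out
loo_res  <- resid(quad_model) / (1 - hatvalues(quad_model))
rmse_loo <- sqrt(mean(loo_res^2))

print(c(in_sample = rmse_in, leave_one_out = rmse_loo))
```

A leave-one-out RMSE much larger than the in-sample value would hint that the model is fitting noise or leaning on a few influential cars.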

#### Making Predictions:

For new data points:

```
new_data <- data.frame(hp=c(100, 150))
predicted_values <- predict(quad_model, new_data)
print(predicted_values)
```
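Point predictions say nothing about their uncertainty. As a minimal sketch, `predict()` can also return prediction intervals through its `interval` argument; the check for extrapolation and the names `pred_int` and `outside` are additions for illustration.

```r
quad_model <- lm(mpg ~ hp + I(hp^2), data = mtcars)
new_data <- data.frame(hp = c(100, 150))

# 95% prediction intervals; the result is a matrix with columns fit, lwr, upr
pred_int <- predict(quad_model, new_data, interval = "prediction", level = 0.95)
print(pred_int)

# Flag inputs outside the observed hp range, where the parabola is extrapolating
outside <- new_data$hp < min(mtcars$hp) | new_data$hp > max(mtcars$hp)
print(outside)
```

Quadratic fits can bend sharply beyond the data, so predictions flagged as outside the observed range deserve extra caution.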

### 8. Advantages and Disadvantages

**Advantages**:

- Can capture non-linear relationships.
- Doesn’t require transforming the response variable; only a squared predictor term is added.

**Disadvantages**:

- Can easily overfit with higher-degree polynomials.
- Requires careful validation.
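The overfitting risk can be explored by fitting polynomials of increasing degree and comparing criteria that penalize complexity. This is a rough sketch under the assumption that degrees 1 to 5 are a sensible range to try; the `comparison` data frame and its column names are made up for the example.

```r
degrees <- 1:5

# Fit one polynomial model per degree (orthogonal polynomials via poly())
fits <- lapply(degrees, function(d) lm(mpg ~ poly(hp, d), data = mtcars))

# Plain R-squared never decreases with degree; adjusted R-squared and AIC
# penalize the extra terms
comparison <- data.frame(
  degree        = degrees,
  r_squared     = sapply(fits, function(f) summary(f)$r.squared),
  adj_r_squared = sapply(fits, function(f) summary(f)$adj.r.squared),
  aic           = sapply(fits, AIC)
)
print(comparison)
```

If adjusted R-squared stalls or AIC starts rising as the degree grows, the extra terms are probably not earning their keep.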

### 9. Conclusion

Quadratic regression is a powerful tool when faced with U-shaped or inverted-U-shaped patterns in data. It captures this kind of non-linear relationship without transforming the response, while still being fitted with ordinary `lm()`. However, like all models, its assumptions need to be checked carefully. Quadratic regression forms a stepping stone towards higher-degree polynomial regression models, and R provides a comprehensive suite of tools to work with these models efficiently.