# How to Plot Multiple Boxplots in One Chart in R

R is a popular language for statistical analysis and data visualization, with many built-in functions and libraries for generating a wide range of plots. One of these is the boxplot, which summarizes the distribution of a numeric variable along a single axis by showing its median, quartiles, and any outlying points.

A single boxplot is useful for getting a quick overview of one variable, but there are times when you want to compare distributions across multiple variables or groups. This is where multiple boxplots come in. In this article, we will cover how to plot multiple boxplots on one chart in R, first with base R and then with the ggplot2 package.

## Multiple Boxplots Using Base R

The basic function to create a box plot in R is boxplot(). The simplest way to plot multiple boxplots on a single chart is to give boxplot() a formula of the form y ~ x, where y is a numeric variable and x is a grouping variable (typically a factor) that divides y into groups.

Let’s use mtcars data to create multiple boxplots of mpg (miles per gallon) grouped by number of cylinders (cyl).

    # load data
    data(mtcars)

    # create boxplots
    boxplot(mpg ~ cyl, data = mtcars,
            main = "Boxplot of MPG grouped by number of cylinders",
            xlab = "Number of Cylinders",
            ylab = "Miles Per Gallon")

In the above code, mpg ~ cyl instructs R to split the mpg values by each distinct value of cyl (4, 6, and 8) and draw one box per group. The main, xlab, and ylab arguments set the title, x-axis label, and y-axis label, respectively.
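
To check how the formula split the data, it can help to look at the numbers behind the boxes. The following is a minimal sketch, not part of the original example: it calls boxplot() with plot = FALSE, which returns the computed statistics instead of drawing the chart. The variable name bp is just an illustrative choice.

```r
# Compute the boxplot statistics without drawing anything
bp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

bp$names  # group labels, taken from the distinct values of cyl
bp$n      # number of observations in each group
bp$stats  # one column per group; rows are lower whisker, lower hinge,
          # median, upper hinge, upper whisker
```

Each column of bp$stats corresponds to one box in the chart, in the same left-to-right order as bp$names.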

## Multiple Boxplots Using ggplot2

While base R graphics are useful, the ggplot2 package offers more flexibility and customization options. Below is how to generate the same plot as above using ggplot2.

First, load the ggplot2 package. If it is not installed yet, install it once with install.packages("ggplot2").

    library(ggplot2)

Next, we can use the ggplot() function to initialize our plot, the aes() function to specify our variables, and the geom_boxplot() function to create the boxplots:

    ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
      geom_boxplot() +
      labs(title = "Boxplot of MPG grouped by number of cylinders",
           x = "Number of Cylinders",
           y = "Miles Per Gallon")

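In mtcars, cyl is stored as a numeric column. Wrapping it in factor() tells ggplot2 to treat it as a discrete grouping variable, so that one box is drawn for each cylinder count. With a plain numeric x, ggplot2 would treat the axis as continuous and draw a single box for all cars.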
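
To see what the conversion to a factor does, and to confirm that ggplot2 builds one box per level, the plot can be inspected before it is drawn. This is a small sketch under the assumption that ggplot2 is installed; layer_data() is a ggplot2 helper that returns the data computed for a layer, and the names p and box_stats are illustrative.

```r
library(ggplot2)

# cyl is numeric in mtcars; factor() turns its distinct values into levels
class(mtcars$cyl)
levels(factor(mtcars$cyl))

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# layer_data() returns the statistics ggplot2 computed for the boxplot layer,
# one row per box
box_stats <- layer_data(p)
nrow(box_stats)
box_stats[, c("ymin", "lower", "middle", "upper", "ymax")]
```

The middle column holds the median of each group, and lower and upper hold the hinges that form the edges of each box.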

What if we want to include another variable in our boxplots? Let’s say we want to look at the effect of both the number of cylinders and the type of transmission on miles per gallon. In ggplot2, we can add a fill aesthetic to our boxplots to differentiate between automatic (am = 0) and manual (am = 1) transmission:

    ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(am))) +
      geom_boxplot() +
      labs(title = "Boxplot of MPG grouped by number of cylinders and transmission type",
           x = "Number of Cylinders",
           y = "Miles Per Gallon",
           fill = "Transmission Type") +
      scale_fill_discrete(labels = c("Automatic", "Manual"))

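Mapping fill to factor(am) draws a separate, differently colored box for each transmission type within every cylinder group. The scale_fill_discrete() function replaces the legend labels; the labels are applied in the order of the factor levels, so "Automatic" goes with am = 0 and "Manual" with am = 1.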
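
Because the legend labels above depend on the order of the factor levels, an alternative is to attach the labels to the data itself. The sketch below is one possible variation rather than the method used in this article: it works on a copy of mtcars (the name mtcars_labeled and the column transmission are illustrative) and recodes am with explicit levels and labels, so no scale_fill_discrete() call is needed.

```r
library(ggplot2)

# Work on a copy so the built-in dataset is left untouched
mtcars_labeled <- mtcars
mtcars_labeled$transmission <- factor(mtcars_labeled$am,
                                      levels = c(0, 1),
                                      labels = c("Automatic", "Manual"))

# Check the recoding before plotting
table(mtcars_labeled$transmission)

p <- ggplot(mtcars_labeled,
            aes(x = factor(cyl), y = mpg, fill = transmission)) +
  geom_boxplot() +
  labs(title = "Boxplot of MPG grouped by number of cylinders and transmission type",
       x = "Number of Cylinders",
       y = "Miles Per Gallon",
       fill = "Transmission Type")
p
```

Keeping the labels in the data means the mapping between codes and names is stated once and stays correct if the plot is later reordered or faceted.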

## Conclusion

Plotting multiple boxplots in one chart gives you a visual way to compare distributions across groups. Base R covers the basics with a single call to boxplot(), while ggplot2 makes it easier to add further grouping variables and customize the result, which is why many people prefer it for data visualization tasks.
