How to Plot Multiple Lines in One Chart in R

Data visualization is a fundamental part of data analysis and statistics, providing a visual interpretation of complex datasets. One of the most common and powerful ways to visualize data is the line chart. Line charts display a series of data points, known as ‘markers’, connected by straight line segments. They are particularly useful for showing how values change over an interval or time span, as in a time series. In many cases, we want to compare several data series in the same chart, which means plotting multiple lines in one chart. This article shows how to do this in R using the ggplot2 package.

Preparing the Data

For this guide, let’s consider a hypothetical dataset that contains the monthly sales data for two products, say ‘Product A’ and ‘Product B’. The dataset has three columns: “Month”, “Sales_A”, and “Sales_B”.

Here’s an example of how the data might look:

Month  Sales_A  Sales_B
Jan    200      250
Feb    215      270
Mar    250      280
Apr    230      290
May    290      310

You can create this data frame in R using the following code:

sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), 
                 levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)

Note that the factor() function specifies that the ‘Month’ variable is a categorical variable with a certain order. This tells ggplot2 that the months have an inherent order to be used when plotting.
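To see why the explicit levels matter, the short sketch below compares the default level order of a factor with the order obtained by passing levels. The variable names are illustrative and not part of the example dataset; month.abb is base R's built-in vector of abbreviated month names.

```r
# Month names as a plain character vector (illustrative data)
months_chr <- c("Jan", "Feb", "Mar", "Apr", "May")

# Without explicit levels, factor() sorts the levels alphabetically
alpha_levels <- levels(factor(months_chr))
alpha_levels

# With explicit levels, calendar order is kept;
# droplevels() removes the months that do not occur in the data
months_fct <- factor(months_chr, levels = month.abb)
calendar_levels <- levels(droplevels(months_fct))
calendar_levels
```

With alphabetical levels, a plot would start at Apr rather than Jan, which is why the data frame above sets the levels by hand.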

Reshaping the Data

Before plotting multiple lines, we need to reshape the data from wide format to long format. We’ll use the tidyverse package, which includes tidyr for data reshaping as well as ggplot2 for plotting. If you haven’t installed tidyverse yet, you can do so using install.packages("tidyverse").

To reshape the data, we’ll use the pivot_longer() function from tidyr. This function takes multiple columns and gathers them into key-value pairs, duplicating all other columns as needed:

library(tidyverse)

sales_long <- sales %>%
  pivot_longer(
    cols = c(Sales_A, Sales_B),
    names_to = "Product",
    values_to = "Sales"
  )

head(sales_long)

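Now, the first six rows of the dataset look like this:

Month  Product  Sales
Jan    Sales_A  200
Jan    Sales_B  250
Feb    Sales_A  215
Feb    Sales_B  270
Mar    Sales_A  250
Mar    Sales_B  280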
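If shorter product names are preferred, one possible variation is to strip the "Sales_" prefix while reshaping. The sketch below uses the names_prefix argument of pivot_longer(); the object name sales_short is hypothetical and is not used elsewhere in this guide.

```r
library(tidyr)

# Same sample data as above
sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), levels = month.abb),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)

# names_prefix removes the leading "Sales_" from the column names,
# so the Product column holds "A" and "B"
sales_short <- pivot_longer(
  sales,
  cols = starts_with("Sales_"),
  names_to = "Product",
  names_prefix = "Sales_",
  values_to = "Sales"
)

unique(sales_short$Product)
```

Note that the rest of this guide keeps the original values "Sales_A" and "Sales_B", so any named vectors passed to the scale functions would have to be adapted if this variant were used.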

Plotting Multiple Lines in One Chart

Now that the data is in the correct format, we can plot the two lines using ggplot2:

ggplot(data = sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line() +
  geom_point()

In the above code, aes(x = Month, y = Sales, color = Product, group = Product) maps the ‘Month’ column to the x-axis, the ‘Sales’ column to the y-axis, and ‘Product’ to both the color and group aesthetics. The color aesthetic differentiates the lines, and the group aesthetic tells ggplot2 to draw a separate line for each product. The group aesthetic is needed here because ‘Month’ is a factor: with a discrete x-axis, ggplot2 would otherwise treat every month and product combination as its own group and draw no connecting lines. The geom_line() function adds the lines to the plot, and geom_point() adds points at each data value.
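One way to check that the grouping works as intended is to inspect the data ggplot2 computes for the line layer. The following is a minimal sketch using layer_data() from ggplot2; the names p, line_data, and n_groups are illustrative.

```r
library(ggplot2)
library(tidyr)

# Same sample data and reshaping as above
sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), levels = month.abb),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)
sales_long <- pivot_longer(
  sales,
  cols = c(Sales_A, Sales_B),
  names_to = "Product",
  values_to = "Sales"
)

# Store the plot in an object instead of printing it
p <- ggplot(sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line() +
  geom_point()

# layer_data() returns the data computed for one layer; layer 1 is geom_line()
line_data <- layer_data(p, 1)

# One group id per product means one line per product
n_groups <- length(unique(line_data$group))
n_groups
```

Storing the plot in an object also makes it easy to add further layers or scales later, for example p + ggtitle("...").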

Customizing the Line Chart

There are many ways you can customize the line chart in R with the ggplot2 package.

Changing Line Types and Colors

To give each product its own line type, map linetype inside geom_line(); to choose specific colors, add scale_color_manual():

ggplot(data = sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line(aes(linetype = Product)) +
  scale_color_manual(values = c("Sales_A" = "blue", "Sales_B" = "red")) +
  geom_point()

In this code, aes(linetype = Product) inside geom_line() changes the line type for different products, while scale_color_manual(values = c("Sales_A" = "blue", "Sales_B" = "red")) changes the color of the lines for each product.
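The line types themselves can be chosen in the same way as the colors. The sketch below is one possible variation that adds scale_linetype_manual(); the particular line types ("solid" and "dashed") and the object name p_styled are arbitrary choices for illustration.

```r
library(ggplot2)
library(tidyr)

# Same sample data and reshaping as above
sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), levels = month.abb),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)
sales_long <- pivot_longer(
  sales,
  cols = c(Sales_A, Sales_B),
  names_to = "Product",
  values_to = "Sales"
)

p_styled <- ggplot(sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line(aes(linetype = Product)) +
  geom_point() +
  # Named vectors tie each value of Product to a specific color and line type
  scale_color_manual(values = c("Sales_A" = "blue", "Sales_B" = "red")) +
  scale_linetype_manual(values = c("Sales_A" = "solid", "Sales_B" = "dashed"))

p_styled
```

Using named vectors keeps the assignment explicit, so the styling does not depend on the order in which the products appear in the data.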

Adding Titles and Labels

You can add a title to your chart and labels to your axes using ggtitle(), xlab(), and ylab() functions:

ggplot(data = sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line() +
  geom_point() +
  ggtitle("Monthly Sales of Product A and Product B") +
  xlab("Month") +
  ylab("Sales")
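The same title and axis labels can also be set in a single call to labs(), which some readers may find easier to maintain. This is an equivalent sketch, not a required change; p_labeled is an illustrative name.

```r
library(ggplot2)
library(tidyr)

# Same sample data and reshaping as above
sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), levels = month.abb),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)
sales_long <- pivot_longer(
  sales,
  cols = c(Sales_A, Sales_B),
  names_to = "Product",
  values_to = "Sales"
)

p_labeled <- ggplot(sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line() +
  geom_point() +
  # labs() sets the title and both axis labels at once
  labs(
    title = "Monthly Sales of Product A and Product B",
    x = "Month",
    y = "Sales"
  )

p_labeled
```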

Adjusting the Legend

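You can adjust the legend of the plot using the labs() function to change the legend title, and scale_color_discrete() and scale_linetype_discrete() to change the legend labels. Give both scales the same title and the same labels so that ggplot2 keeps color and line type in a single combined legend: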

ggplot(data = sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line(aes(linetype = Product)) +
  geom_point() +
  labs(color = "Product", linetype = "Product") +
  scale_color_discrete(labels = c("Product A", "Product B")) +
  scale_linetype_discrete(labels = c("Product A", "Product B"))
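The position of the legend can be changed as well. As a small sketch, the code below moves the legend under the plot with theme(legend.position = "bottom"); the choice of position and the name p_legend are only examples.

```r
library(ggplot2)
library(tidyr)

# Same sample data and reshaping as above
sales <- data.frame(
  Month = factor(c("Jan", "Feb", "Mar", "Apr", "May"), levels = month.abb),
  Sales_A = c(200, 215, 250, 230, 290),
  Sales_B = c(250, 270, 280, 290, 310)
)
sales_long <- pivot_longer(
  sales,
  cols = c(Sales_A, Sales_B),
  names_to = "Product",
  values_to = "Sales"
)

p_legend <- ggplot(sales_long, aes(x = Month, y = Sales, color = Product, group = Product)) +
  geom_line() +
  geom_point() +
  labs(color = "Product") +
  scale_color_discrete(labels = c("Product A", "Product B")) +
  # Place the legend below the plot; "top", "left", "right", and "none" also work
  theme(legend.position = "bottom")

p_legend
```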

Conclusion

In this guide, we have covered how to plot multiple lines in one chart in R using the ggplot2 package. We have seen how to prepare and reshape the data for plotting, how to create a multi-line chart, and how to customize it. Although this guide only scratches the surface of what is possible with ggplot2, it should provide you with a solid foundation for creating multi-line plots to visualize your own data.
