Data reshaping is an essential part of data preparation and analysis, and R provides a versatile suite of tools for it. Depending on the structure and complexity of your data, you might want to reshape it from a wide format to a long format or vice versa. This article shows how to do both in R with pivot_longer() and pivot_wider().
1. Understanding Wide and Long Data Formats
1.1 Wide Data Format
In a wide data format, all of a subject's repeated responses are stored in a single row, with each response in its own column. For example, a subject measured at three time points has one row with three response columns.

1.2 Long Data Format
In a long data format, each row holds one time point for one subject, so a subject with three time points has three rows.
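To make the two layouts concrete, here is a minimal sketch that builds the same scores by hand in both formats, using base R only. The subject names, column names, and values are made up for illustration.

```r
# Hypothetical scores for two subjects at three time points

# Wide format: one row per subject, one column per time point
scores_wide <- data.frame(
  Subject = c("A", "B"),
  Time1 = c(10, 20),
  Time2 = c(11, 21),
  Time3 = c(12, 22)
)

# Long format: one row per subject and time point
scores_long <- data.frame(
  Subject = rep(c("A", "B"), each = 3),
  Time = rep(c("Time1", "Time2", "Time3"), times = 2),
  Value = c(10, 11, 12, 20, 21, 22)
)

print(scores_wide)  # 2 rows, 4 columns
print(scores_long)  # 6 rows, 3 columns
```

Both objects hold the same six values. Only the arrangement differs: the wide version spreads time points across columns, while the long version stacks them in rows.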

2. Wide to Long Format Conversion
We will use the pivot_longer() function from the tidyr package, which is loaded as part of the tidyverse, to convert from wide to long format.
2.1 The pivot_longer() Function
The pivot_longer() function takes multiple columns and collapses them into key-value pairs, duplicating all other columns as needed. The main arguments for this function are:
cols: The columns you want to gather into key-value pairs. You can use variable names (e.g., x, y, z) or select helpers (e.g., starts_with("x"), ends_with("z"), contains("y"), etc.).
names_to: The name of the new key column. It can also be a character vector defining multiple new columns.
values_to: The name of the new value column.
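As a small sketch of how the cols argument can be written, the following compares three ways of selecting the same columns. The data frame df and its columns id, x1, and x2 are hypothetical.

```r
library(tidyr)  # part of the tidyverse; provides pivot_longer() and the select helpers

# Hypothetical data frame with one id column and two measurement columns
df <- data.frame(id = 1:2, x1 = c(5, 6), x2 = c(7, 8))

# Three ways to choose the columns to pivot
by_name   <- pivot_longer(df, cols = c(x1, x2), names_to = "key", values_to = "value")
by_helper <- pivot_longer(df, cols = starts_with("x"), names_to = "key", values_to = "value")
by_negate <- pivot_longer(df, cols = -id, names_to = "key", values_to = "value")

print(by_name)
```

All three calls should return the same four-row result. Which form reads best depends on whether it is easier to name the columns to pivot or the columns to leave alone.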
Suppose you have the following data:
# Load the tidyverse
library(tidyverse)

# Create a data frame
df_wide <- data.frame(
  Subject = c("Joe", "Alice", "Bob"),
  Test1 = c(80, 70, 75),
  Test2 = c(85, 75, 80),
  Test3 = c(90, 85, 95)
)
print(df_wide)
To reshape this data from wide to long format using pivot_longer(), you can do the following:
df_long <- df_wide %>%
  pivot_longer(
    cols = starts_with("Test"),
    names_to = "Test",
    values_to = "Score"
  )
print(df_long)
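After this reshape, the Test column holds strings such as "Test1". If a numeric test index is more convenient, one possible refinement is sketched below. It uses the names_prefix and names_transform arguments of pivot_longer() and assumes a reasonably recent tidyr release; the name df_long_num is made up for this example.

```r
library(tidyr)

df_wide <- data.frame(
  Subject = c("Joe", "Alice", "Bob"),
  Test1 = c(80, 70, 75),
  Test2 = c(85, 75, 80),
  Test3 = c(90, 85, 95)
)

df_long_num <- df_wide %>%
  pivot_longer(
    cols = starts_with("Test"),
    names_to = "Test",
    names_prefix = "Test",                      # strip the "Test" prefix from the names
    names_transform = list(Test = as.integer),  # convert "1", "2", "3" to integers
    values_to = "Score"
  )
print(df_long_num)
```

The result still has nine rows, but Test is now an integer column, which makes sorting or plotting by test number simpler.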
3. Long to Wide Format Conversion
We will use the pivot_wider() function from the tidyr package, which is loaded as part of the tidyverse, to convert from long to wide format.
3.1 The pivot_wider() Function
The pivot_wider() function spreads a key-value pair across multiple columns. The main arguments for this function are:
names_from: The column that contains the variable names.
values_from: The column that contains the variable values.
id_cols: The columns that uniquely identify each observation and are left as is. If NULL, defaults to all columns not used in names_from or values_from.
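The effect of id_cols is easiest to see when the data has an extra column that varies from row to row. The sketch below uses a hypothetical Date column to show the difference between the default and an explicit id_cols; all names and values are illustrative.

```r
library(tidyr)

# Hypothetical long data with an extra column (Date) that differs on every row
df_dated <- data.frame(
  Subject = c("Joe", "Joe", "Alice", "Alice"),
  Date = as.Date(c("2024-01-01", "2024-02-01", "2024-01-03", "2024-02-03")),
  Test = c("Test1", "Test2", "Test1", "Test2"),
  Score = c(80, 85, 70, 75)
)

# Default: Subject and Date together identify rows, so each date gets its own row
wide_default <- df_dated %>%
  pivot_wider(names_from = Test, values_from = Score)

# With id_cols = Subject, Date is dropped and each subject gets one row
wide_by_subject <- df_dated %>%
  pivot_wider(id_cols = Subject, names_from = Test, values_from = Score)

print(wide_default)     # 4 rows, with NA where a test was not taken on that date
print(wide_by_subject)  # 2 rows, one per subject
```

If the wide result has more rows and more NA values than expected, an unintended identifier column like this is a likely cause.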
Suppose you have the following long format data:
# Create a data frame
df_long <- data.frame(
  Subject = c("Joe", "Joe", "Joe", "Alice", "Alice", "Alice", "Bob", "Bob", "Bob"),
  Test = c("Test1", "Test2", "Test3", "Test1", "Test2", "Test3", "Test1", "Test2", "Test3"),
  Score = c(80, 85, 90, 70, 75, 85, 75, 80, 95)
)
print(df_long)
To reshape this data from long to wide format using pivot_wider(), you can do the following:
df_wide <- df_long %>%
  pivot_wider(
    names_from = Test,
    values_from = Score
  )
print(df_wide)
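The example above is complete: every subject has a score for every test. When a combination is missing, pivot_wider() leaves NA in the corresponding cell. The sketch below shows this with hypothetical incomplete data, along with the values_fill argument as one way to supply a default instead.

```r
library(tidyr)

# Hypothetical long data in which Bob has no Test2 score
df_incomplete <- data.frame(
  Subject = c("Joe", "Joe", "Bob"),
  Test = c("Test1", "Test2", "Test1"),
  Score = c(80, 85, 75)
)

# Missing combinations become NA by default
wide_na <- df_incomplete %>%
  pivot_wider(names_from = Test, values_from = Score)

# values_fill supplies a replacement value for missing combinations
wide_filled <- df_incomplete %>%
  pivot_wider(names_from = Test, values_from = Score, values_fill = 0)

print(wide_na)      # Bob's Test2 is NA
print(wide_filled)  # Bob's Test2 is 0
```

Whether filling is appropriate depends on the data. For test scores, a 0 could be mistaken for a real result, so keeping NA is often the safer choice.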
4. Managing More Complex Data Structures
If your data set is more complex, for instance, if you have multiple variables measured over time, you can use multiple name and value columns in pivot_longer() and pivot_wider(). This allows you to manage the reshaping of more complex datasets.
4.1 Multiple Name and Value Columns in pivot_longer()
For instance, if you have data where two measurements, test and quiz, are taken over three time points, you can use pivot_longer() in the following way:
# Creating a data frame
df_wide <- data.frame(
  Subject = c("Joe", "Alice", "Bob"),
  Test1 = c(80, 70, 75),
  Test2 = c(85, 75, 80),
  Test3 = c(90, 85, 95),
  Quiz1 = c(50, 45, 55),
  Quiz2 = c(60, 55, 65),
  Quiz3 = c(70, 65, 75)
)
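# Reshaping to long format
# ".value" turns the first part of each name (Test, Quiz) into a value column;
# the last character of each name (1, 2, 3) goes into the Time column
df_long <- df_wide %>%
  pivot_longer(
    cols = -Subject,
    names_to = c(".value", "Time"),
    names_pattern = "(.*)(.)"
  )
print(df_long)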
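The pattern "(.*)(.)" treats only the last character of each column name as the time point, which works for Test1 to Test3 but would split a name such as Test10 into Test1 and 0. A stricter pattern is sketched below as one possible alternative; the data frame df_wide10 and its two-digit time point are hypothetical.

```r
library(tidyr)

# Hypothetical wide data with a two-digit time point (Test10, Quiz10)
df_wide10 <- data.frame(
  Subject = c("Joe", "Alice"),
  Test1 = c(80, 70),
  Test10 = c(90, 85),
  Quiz1 = c(50, 45),
  Quiz10 = c(70, 65)
)

df_long10 <- df_wide10 %>%
  pivot_longer(
    cols = -Subject,
    names_to = c(".value", "Time"),
    names_pattern = "([A-Za-z]+)(\\d+)"  # letters -> value column, digits -> Time
  )
print(df_long10)
```

Here the first capture group accepts only letters and the second only digits, so the whole number ends up in Time. Time is still a character column; it can be converted afterwards if a numeric type is needed.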
4.2 Multiple Name and Value Columns in pivot_wider()
On the other hand, you can also reshape the above long format data to wide format using pivot_wider() as follows:
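# With several values_from columns, the new column names combine the value
# name and the time point with "_": Test_1, Test_2, ..., Quiz_3
df_wide <- df_long %>%
  pivot_wider(
    names_from = Time,
    values_from = c(Test, Quiz)
  )
print(df_wide)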
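By default, pivot_wider() joins the value name and the time point with an underscore, so the columns come back as Test_1 rather than Test1. If the original names from section 4.1 are wanted, the names_sep argument is one way to get them, as in this sketch. The long data is rebuilt here so the block runs on its own, and the name df_wide_again is made up for the example.

```r
library(tidyr)

# Long data with one row per subject and time point (same shape as in section 4.1)
df_long <- data.frame(
  Subject = rep(c("Joe", "Alice", "Bob"), each = 3),
  Time = rep(c("1", "2", "3"), times = 3),
  Test = c(80, 85, 90, 70, 75, 85, 75, 80, 95),
  Quiz = c(50, 60, 70, 45, 55, 65, 55, 65, 75)
)

df_wide_again <- df_long %>%
  pivot_wider(
    names_from = Time,
    values_from = c(Test, Quiz),
    names_sep = ""  # join "Test" and "1" without the default "_"
  )
print(names(df_wide_again))
```

With an empty separator the wide columns are named Test1 through Quiz3 again, so the result matches the layout of the wide data the section started from.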
5. Conclusion
In conclusion, the tidyr package, part of the tidyverse, offers powerful and flexible functions for data reshaping in R, namely pivot_longer() and pivot_wider(). These functions can handle most of the reshaping tasks you will encounter, from a simple wide-to-long or long-to-wide conversion to more complex reshaping involving multiple variables measured over time.