
Problem –
Creating many intermediate variables in your code is tedious and verbose, while nesting R functions makes the code nearly unreadable.
Solution –
Use the pipe operator (%>%) to chain multiple functions into a single pipeline without intermediate variables. The resulting expression is easier to both read and write.
library(tidyverse)
data(mpg)
mpg %>%
  filter(cty > 21) %>%
  head(3) %>%
  print()
# A tibble: 3 × 11
manufactu…¹ model displ year cyl trans drv cty hwy fl class
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
1 chevrolet mali… 2.4 2008 4 auto… f 22 30 r mids…
2 honda civic 1.6 1999 4 manu… f 28 33 r subc…
3 honda civic 1.6 1999 4 auto… f 24 32 r subc…
# … with abbreviated variable name ¹manufacturer
Using the pipe is much cleaner and easier to read than using intermediate temporary variables.
temp1 <- filter(mpg, cty > 21)
temp2 <- head(temp1, 3)
print(temp2)
# A tibble: 3 × 11
manufactu…¹ model displ year cyl trans drv cty hwy fl class
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
1 chevrolet mali… 2.4 2008 4 auto… f 22 30 r mids…
2 honda civic 1.6 1999 4 manu… f 28 33 r subc…
3 honda civic 1.6 1999 4 auto… f 24 32 r subc…
# … with abbreviated variable name ¹manufacturer
Using the pipe operator greatly improves the readability of code. It takes the value of the expression on its left and passes it as the first argument to the function on its right.
Writing this
x %>% head()
is functionally the same as writing this
head(x)
In both cases x is the argument to head(). We can supply additional arguments, but the piped value x is always passed as the first argument.
These two lines are also functionally identical.
x %>% head(n=10)
head(x, n=10)
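The equivalence described above can be checked directly. The following is a minimal sketch, assuming only that the magrittr package (which provides %>% and is loaded as part of the tidyverse) is installed; the vector x and the names same_default and same_n are made up for illustration.

```r
library(magrittr)  # provides the %>% operator

# Hypothetical example data: any object that head() accepts would do
x <- 1:20

# Piping x into head() should give the same result as calling head(x)
same_default <- identical(x %>% head(), head(x))

# Extra arguments follow the piped value, which stays in first position
same_n <- identical(x %>% head(n = 10), head(x, n = 10))

print(same_default)
print(same_n)
```

Both comparisons print TRUE, since the pipe only changes how the call is written, not what is computed.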
Let’s look at one more example. First, we use intermediate variables to filter the data, select two columns, and plot them.
filtered_mpg <- filter(mpg, cty > 21)
selected_mpg <- select(filtered_mpg, cty, hwy)
ggplot(selected_mpg, aes(cty, hwy)) + geom_point()
An alternative is to nest the functions together.
ggplot(select(filter(mpg, cty > 21), cty, hwy), aes(cty, hwy)) + geom_point()
Now let’s use the pipe operator to do the same.
mpg %>%
  filter(cty > 21) %>%
  select(cty, hwy) %>%
  ggplot(aes(cty, hwy)) +
  geom_point()
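To confirm that the three styles are interchangeable, we can compare the data each one hands to ggplot(). This is a rough sketch, assuming the dplyr and ggplot2 packages are installed; the names with_temps, nested, piped, and all_same are hypothetical and introduced only for this check.

```r
library(dplyr)    # filter(), select(), and the %>% operator
library(ggplot2)  # supplies the mpg data set

# Style 1: intermediate variables
filtered_mpg <- filter(mpg, cty > 21)
with_temps <- select(filtered_mpg, cty, hwy)

# Style 2: nested function calls
nested <- select(filter(mpg, cty > 21), cty, hwy)

# Style 3: a pipeline
piped <- mpg %>%
  filter(cty > 21) %>%
  select(cty, hwy)

# All three should hold exactly the same rows and columns
all_same <- identical(with_temps, nested) && identical(nested, piped)
print(all_same)
```

Note that in the plotting pipeline the data-preparation steps are joined with %>%, while ggplot2 layers such as geom_point() are still added with +.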
