How to use the Pipe Operator (%>%) in R?

Problem –

Creating many intermediate variables in your code is tedious and verbose, while nesting R functions makes the code nearly unreadable.

Solution –

Use the pipe operator (%>%) to make your expressions easier to read and write. It chains multiple function calls into a single pipeline, so you need no intermediate variables.

library(tidyverse)
data(mpg)
mpg %>%
  filter(cty > 21) %>%
  head(3) %>%
  print()
# A tibble: 3 × 11
  manufactu…¹ model displ  year   cyl trans drv     cty   hwy fl    class
  <chr>       <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
1 chevrolet   mali…   2.4  2008     4 auto… f        22    30 r     mids…
2 honda       civic   1.6  1999     4 manu… f        28    33 r     subc…
3 honda       civic   1.6  1999     4 auto… f        24    32 r     subc…
# … with abbreviated variable name ¹​manufacturer

Using the pipe is much cleaner and easier to read than using intermediate temporary variables.

temp1 <- filter(mpg, cty > 21)
temp2 <- head(temp1, 3)
print(temp2)
# A tibble: 3 × 11
  manufactu…¹ model displ  year   cyl trans drv     cty   hwy fl    class
  <chr>       <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
1 chevrolet   mali…   2.4  2008     4 auto… f        22    30 r     mids…
2 honda       civic   1.6  1999     4 manu… f        28    33 r     subc…
3 honda       civic   1.6  1999     4 auto… f        24    32 r     subc…
# … with abbreviated variable name ¹​manufacturer

Using the pipe operator greatly improves the readability of code. It takes the value of the expression on its left and passes it as the first argument to the function call on its right.

Writing this

x %>% head()

is functionally the same as writing this

head(x)

In both cases, x is the argument to head(). We can supply additional arguments, but x is always passed as the first argument.

These two lines are also functionally identical.

x %>% head(n=10)

head(x, n=10)
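
To check the equivalence yourself, here is a minimal sketch. It assumes the magrittr package is installed; the vector x and the helper subtract() are made up for illustration and are not part of any package.

```r
library(magrittr)  # provides %>% (dplyr and the tidyverse re-export it)

x <- 1:20  # hypothetical sample vector

# The piped call and the plain call should give the same result
piped  <- x %>% head(n = 10)
direct <- head(x, n = 10)
identical(piped, direct)  # TRUE

# Hypothetical helper where argument order matters
subtract <- function(a, b) a - b

# The left-hand side becomes the first argument: subtract(10, 3)
result <- 10 %>% subtract(3)
result  # 7
```

Because the left-hand side always fills the first argument slot, 10 %>% subtract(3) computes 10 - 3, not 3 - 10.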

Let’s look at one more example, first written with intermediate variables that hold each result.

filtered_mpg <- filter(mpg, cty > 21)
selected_mpg <- select(filtered_mpg, cty, hwy)
ggplot(selected_mpg, aes(cty, hwy)) + geom_point()

An alternative is to nest the functions together.

ggplot(select(filter(mpg, cty > 21), cty, hwy), aes(cty, hwy)) + geom_point()

Now let’s use the pipe operator to do the same.

mpg %>%
  filter(cty > 21) %>%
  select(cty, hwy) %>%
  ggplot(aes(cty, hwy)) +
  geom_point()
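
Note that the pipeline ends at ggplot(): plot layers such as geom_point() are still added with +, not %>%. As a rough check that the three styles feed the same data into the plot, the sketch below compares the data frames they produce. It assumes dplyr and ggplot2 are installed; the variable names with_temps, nested and piped are made up for this illustration.

```r
library(dplyr)    # filter(), select(), %>%
library(ggplot2)  # provides the mpg dataset

# Style 1: intermediate variables
filtered_mpg <- filter(mpg, cty > 21)
with_temps   <- select(filtered_mpg, cty, hwy)

# Style 2: nested calls
nested <- select(filter(mpg, cty > 21), cty, hwy)

# Style 3: pipeline
piped <- mpg %>%
  filter(cty > 21) %>%
  select(cty, hwy)

# All three should hold the same rows and columns
identical(with_temps, nested)
identical(nested, piped)
```

Only the way the code reads differs between the three styles; the data handed to ggplot() is the same.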
