13 Summary

Tidy evaluation is a complex subject, but you don’t really need to understand how it works to start using it in your functions. The following is a brief, practical summary of the most common use cases.

13.1 Passing full expressions with ...

You can use ... to pass full expressions into a function. For example, the following function applies filter() to mpg.

mpg_filter <- function(...) {
  mpg %>% 
    filter(...)
}

mpg_filter(manufacturer == "audi", year == 1999)
#> # A tibble: 9 x 11
#>   manufacturer model  displ  year   cyl trans drv     cty   hwy fl    class
#>   <chr>        <chr>  <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
#> 1 audi         a4       1.8  1999     4 auto… f        18    29 p     comp…
#> 2 audi         a4       1.8  1999     4 manu… f        21    29 p     comp…
#> 3 audi         a4       2.8  1999     6 auto… f        16    26 p     comp…
#> 4 audi         a4       2.8  1999     6 manu… f        18    26 p     comp…
#> 5 audi         a4 qu…   1.8  1999     4 manu… 4        18    26 p     comp…
#> 6 audi         a4 qu…   1.8  1999     4 auto… 4        16    25 p     comp…
#> # … with 3 more rows

... can take any number of arguments, so we can filter by an unlimited number of conditions.

mpg_filter(
  manufacturer == "audi", 
  year == 1999, 
  drv == "f", 
  fl == "p"
)
#> # A tibble: 4 x 11
#>   manufacturer model displ  year   cyl trans  drv     cty   hwy fl    class
#>   <chr>        <chr> <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr>
#> 1 audi         a4      1.8  1999     4 auto(… f        18    29 p     comp…
#> 2 audi         a4      1.8  1999     4 manua… f        21    29 p     comp…
#> 3 audi         a4      2.8  1999     6 auto(… f        16    26 p     comp…
#> 4 audi         a4      2.8  1999     6 manua… f        18    26 p     comp…

Passing ... works anytime all you want to do is pass a full expression into a dplyr function. Here’s another example that uses select().

mpg_select <- function(...) {
  mpg %>% 
    select(...)
}

mpg_select(car = model, drivetrain = drv)
#> # A tibble: 234 x 2
#>   car   drivetrain
#>   <chr> <chr>     
#> 1 a4    f         
#> 2 a4    f         
#> 3 a4    f         
#> 4 a4    f         
#> 5 a4    f         
#> 6 a4    f         
#> # … with 228 more rows

13.2 Named arguments

Sometimes, ... won’t work because you’ll want to supply your function with named arguments.

In the introduction, we noted the following function that doesn’t work:

grouped_mean <- function(df, group_var, summary_var) {
  df %>% 
    group_by(group_var) %>% 
    summarize(mean = mean(summary_var))
}

grouped_mean(df = mpg, group_var = manufacturer, summary_var = cty)
#> Error: Column `group_var` is unknown

We can create a function that works by by using enquo() and !!.

grouped_mean <- function(df, group_var, summary_var) {
  group_var <- enquo(group_var)
  summary_var <- enquo(summary_var)
  
  df %>% 
    group_by(!! group_var) %>% 
    summarize(mean = mean(!! summary_var))
}

grouped_mean(df = mpg, group_var = manufacturer, summary_var = cty)
#> # A tibble: 15 x 2
#>   manufacturer  mean
#>   <chr>        <dbl>
#> 1 audi          17.6
#> 2 chevrolet     15  
#> 3 dodge         13.1
#> 4 ford          14  
#> 5 honda         24.4
#> 6 hyundai       18.6
#> # … with 9 more rows

Here’s the steps:

  • Apply enquo() to the arguments that refer to column names.
  • When you want to reference those arguments, put !! before their names.

13.3 Named arguments and any number of additional arguments

Sometimes, you’ll want your function to take named arguments, but you’ll also want to allow for any number of additional arguments. You can use enquo(), !!, and ....

grouped_mean_2 <- function(df, summary_var, ...) {
  summary_var <- enquo(summary_var)
  
  df %>% 
    group_by(...) %>% 
    summarize(mean = mean(!! summary_var))
}

grouped_mean_2(df = mpg, summary_var = cty, year, drv)
#> # A tibble: 6 x 3
#> # Groups:   year [2]
#>    year drv    mean
#>   <int> <chr> <dbl>
#> 1  1999 4      14.2
#> 2  1999 f      20.0
#> 3  1999 r      14  
#> 4  2008 4      14.4
#> 5  2008 f      20.0
#> 6  2008 r      14.1

With the ..., we can pass any number of grouping variables into group_by().

grouped_mean_2(df = mpg, summary_var = cty, year, drv, class)
#> # A tibble: 23 x 4
#> # Groups:   year, drv [6]
#>    year drv   class       mean
#>   <int> <chr> <chr>      <dbl>
#> 1  1999 4     compact     16.5
#> 2  1999 4     midsize     15  
#> 3  1999 4     pickup      13  
#> 4  1999 4     subcompact  19.5
#> 5  1999 4     suv         13.8
#> 6  1999 f     compact     20.4
#> # … with 17 more rows

13.4 Assigning names

Many of the dpylr verbs allow you to name or rename columns. When you want to pass the name of a column into your function, you need to:

  • Apply enquo() to the argument giving the name
  • Put !! before the name when you reference it
  • Use := instead of = to assign the name
summary_mean <- function(df, summary_var, summary_name) {
  summary_var <- enquo(summary_var)
  summary_name <- enquo(summary_name)
  
  df %>% 
    summarize(!! summary_name := mean(!! summary_var))
}

summary_mean(df = mpg, summary_var = cty, summary_name = cty_mean)
#> # A tibble: 1 x 1
#>   cty_mean
#>      <dbl>
#> 1     16.9

13.5 Recoding

recode() is also a dplyr verb. It is often useful to put your recode vector in the parameters section, instead of directly inside recode(). Use !!! to get this to work.

recode_drv <- c("f" = "front", "r" = "rear", "4" = "four")

mpg %>% 
  mutate(drv = recode(drv, !!! recode_drv))
#> # A tibble: 234 x 11
#>   manufacturer model displ  year   cyl trans  drv     cty   hwy fl    class
#>   <chr>        <chr> <dbl> <int> <int> <chr>  <chr> <int> <int> <chr> <chr>
#> 1 audi         a4      1.8  1999     4 auto(… front    18    29 p     comp…
#> 2 audi         a4      1.8  1999     4 manua… front    21    29 p     comp…
#> 3 audi         a4      2    2008     4 manua… front    20    31 p     comp…
#> 4 audi         a4      2    2008     4 auto(… front    21    30 p     comp…
#> 5 audi         a4      2.8  1999     6 auto(… front    16    26 p     comp…
#> 6 audi         a4      2.8  1999     6 manua… front    18    26 p     comp…
#> # … with 228 more rows