13 The Basics

Without really thinking about it, you’ve been using two types of variables in R. First, there are the “classic” variables, which are created by binding a value to a name. For example, here we bind the value 1 to y.

When we call y, we get 1.

There are also the “data” type variables that you’re used to working with in the context of data frames. For example, manufacturer exists as a column of mpg. When we refer to manufacturer in a dplyr verb, dplyr looks for a column in mpg with the name “manufacturer.”

Unlike y, manufacturer doesn’t exist outside the context of mpg.
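The difference between the two kinds of variables can be seen in a short sketch. It assumes dplyr is loaded and that mpg is the data set shipped with ggplot2; count() is used only as one illustrative dplyr verb.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# A "classic" variable: the value 1 is bound to the name y.
y <- 1
y

# A "data" variable: manufacturer is looked up as a column of mpg.
manufacturer_counts <- mpg %>%
  count(manufacturer)

# Outside of mpg, nothing is bound to the name manufacturer
# (assuming a fresh session in which you have not created such an object).
exists("manufacturer")
```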

Here’s our function again:
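A minimal sketch of a function like this, before any fix, might look as follows. The function name grouped_mean_naive and the argument df are hypothetical; only group_var, summary_var, and group_by() come from the text. Because the arguments are used without any tidyeval handling, the call is expected to fail, so the sketch captures the error instead of stopping.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Hypothetical function: group_var and summary_var are used as-is,
# so dplyr cannot tell that they stand for columns of df.
grouped_mean_naive <- function(df, group_var, summary_var) {
  df %>%
    group_by(group_var) %>%
    summarize(mean = mean(summary_var))
}

# Capture the error so the script keeps running.
naive_result <- tryCatch(
  grouped_mean_naive(mpg, manufacturer, hwy),
  error = function(e) e
)

# TRUE if the call failed, which is what we expect here.
inherits(naive_result, "error")
```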

Notice that we want our function to treat group_var and summary_var as a hybrid of the “classic” type variables and the “data” type variables. We want R to understand that group_var refers to manufacturer, but we also want group_by() to look only inside our data frame for a column called manufacturer. We don’t want it to look for a value bound to the name manufacturer, because there is none.

This sounds complicated, and, under the hood, it is. However, all we actually need to do to fix our function is wrap group_var and summary_var in double curly braces, {{ }}:
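As a sketch of the fix, the function below embraces both arguments. The name grouped_mean and the argument df are assumptions for illustration, and mpg is assumed to come from ggplot2.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Hypothetical function: {{ }} tells dplyr to treat the arguments
# as column names inside df rather than as ordinary values.
grouped_mean <- function(df, group_var, summary_var) {
  df %>%
    group_by({{ group_var }}) %>%
    summarize(mean = mean({{ summary_var }}))
}

# One row per manufacturer, with the mean of hwy in each group.
mean_hwy_by_manufacturer <- grouped_mean(mpg, manufacturer, hwy)
mean_hwy_by_manufacturer
```

The same function can be reused with other columns, for example grouped_mean(mpg, class, cty).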

This is called “embracing,” and it will cover almost all of your tidyeval needs. If you want to understand more of the theory, you can read the “Theory” chapter.

Let’s see another example. Say you want to create a function that will plot a discrete variable using geom_bar():

All we need to do is embrace var_plot.
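A sketch of such a plotting function with var_plot embraced is shown below. The function name plot_discrete and the argument df are hypothetical, and drv is just one discrete column of mpg chosen for the example.

```r
library(ggplot2)

# Hypothetical function: var_plot is embraced inside aes(), so ggplot2
# looks it up as a column of df.
plot_discrete <- function(df, var_plot) {
  ggplot(df, aes(x = {{ var_plot }})) +
    geom_bar()
}

# Bar chart of the number of cars for each drive type.
drv_plot <- plot_discrete(mpg, drv)
drv_plot
```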

Now, we’ll go over some other tidyeval use cases, while remaining light on theory.

13.1 Passing full expressions with ...

You can use ... to pass full expressions into a function. For example, the following function applies filter() to mpg.

... can take any number of arguments, so we can filter by an unlimited number of conditions.
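A minimal sketch of this pattern follows. The function name filter_mpg is hypothetical, and mpg is assumed to come from ggplot2.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Hypothetical function: every expression passed in ... is handed to filter().
filter_mpg <- function(...) {
  mpg %>%
    filter(...)
}

# One condition.
audi <- filter_mpg(manufacturer == "audi")

# Several conditions; filter() keeps rows that satisfy all of them.
audi_1999 <- filter_mpg(manufacturer == "audi", year == 1999)
audi_1999
```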

Passing ... works whenever all you want to do is pass full expressions into a dplyr function. Here’s another example that uses select().
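A sketch of the select() version might look like this. The function name select_mpg is hypothetical, and mpg is assumed to come from ggplot2.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Hypothetical function: the column selections in ... go straight to select().
select_mpg <- function(...) {
  mpg %>%
    select(...)
}

# Bare column names and selection helpers can both be passed through.
selected <- select_mpg(manufacturer, model, starts_with("c"))
names(selected)
```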

13.3 Assigning names

Many of the dplyr verbs allow you to name or rename columns. When you want to pass the name of a column into your function, you need to:

  • Wrap the name in {{}}
  • Use := instead of = to assign the name
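The two steps above can be sketched as follows. The function name summary_mean and its arguments are hypothetical, and the sketch assumes the new column name is passed as a bare (unquoted) name.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Hypothetical function: summary_name is embraced on the left-hand side,
# and := replaces = so that the embraced name is allowed there.
summary_mean <- function(df, summary_var, summary_name) {
  df %>%
    summarize({{ summary_name }} := mean({{ summary_var }}))
}

# The result has a single column called mean_hwy.
hwy_summary <- summary_mean(mpg, hwy, mean_hwy)
hwy_summary
```

The := operator is needed because R's syntax does not allow an expression such as {{ summary_name }} on the left-hand side of = in a function call.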

13.4 Recoding

This last example is going to look slightly different. !!! is another tidyeval operator, like {{}}.

Let’s say you want to recode a variable:

It’s often a good idea to store your recode mapping as a vector in your parameters section, instead of directly inside recode(). To get this to work, you’ll need to use !!!.
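As a sketch, the example below recodes the drv column of mpg both ways. The mapping and the name recode_drv are made up for illustration, and mpg is assumed to come from ggplot2.

```r
library(dplyr)
library(ggplot2)  # assumed source of the mpg data set

# Parameters: a hypothetical recode mapping stored as a named vector.
recode_drv <- c(`4` = "four-wheel", f = "front", r = "rear")

# Mapping written directly inside recode().
recoded_inline <- mpg %>%
  mutate(drv = recode(drv, `4` = "four-wheel", f = "front", r = "rear"))

# Same mapping, spliced in with !!! so that each element of the
# vector becomes a separate named argument to recode().
recoded_spliced <- mpg %>%
  mutate(drv = recode(drv, !!! recode_drv))

# Both approaches give the same result.
identical(recoded_inline$drv, recoded_spliced$drv)
```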

Another common use of !!! occurs when you want to count() all variables in a tibble. For example, say we want to understand the patterns of NAs in nycflights13::planes. First, we’ll transform every value into TRUE or FALSE depending on whether it’s NA. Then, we can use count(!!! .) to count all the variables.
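A sketch of these two steps is shown below. It assumes the nycflights13 package is installed and uses the magrittr pipe (%>%), which supplies the . placeholder; the choice of mutate() with across() for the first step and the sort = TRUE argument are assumptions, not necessarily the original code.

```r
library(dplyr)

na_patterns <- nycflights13::planes %>%
  # Step 1: replace every value with TRUE (is NA) or FALSE (is not NA).
  mutate(across(everything(), is.na)) %>%
  # Step 2: splice every column of the piped tibble into count(),
  # so each distinct pattern of NAs gets its own row and count n.
  count(!!! ., sort = TRUE)

na_patterns
```

Each row of the result describes one combination of missing and non-missing columns, and n gives the number of planes with that combination.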