bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

19.Rmd (9833B)


---
engine: knitr
title: Quasiquotation
---

## Learning objectives:

- What quasiquotation means
- Why it's important
- Learn some practical uses

```{r, message=FALSE}
library(rlang)
library(purrr)
```

## Introduction

Three pillars of *tidy* evaluation

   1. Quasiquotation
   2. Quosures (Chapter 20)
   3. Data masks (Chapter 20)

**Quasiquotation = quotation + unquotation**

- **Quote.** Capture an unevaluated expression ("defuse")
- **Unquote.** Evaluate selected parts of a quoted expression ("inject")
- Functions that use these features are said to use non-standard evaluation (NSE)
- Note: related to Lisp macros, and also exists in other languages with Lisp heritage, e.g. Julia

> On its own, quasiquotation is good for programming, but combined with other tools
> it is important for data analysis.

## Motivation

Simple *concrete* example:

`cement()` is a function that works like `paste()` but doesn't need quotes

(Think of automatically adding 'quotes' to the arguments)

```{r}
cement <- function(...) {
  args <- ensyms(...)
  paste(purrr::map(args, as_string), collapse = " ")
}

cement(Good, morning, Hadley)
```

What if we wanted to use variables? What is an object and what should be quoted?

This is where 'unquoting' comes in!

```{r}
name <- "Bob"
cement(Good, afternoon, !!name) # Bang-bang!
```

## Vocabulary {-}

We can think of `cement()` and `paste()` as being 'mirror-images' of each other.

- `paste()` - define what to quote - **Evaluates** arguments
- `cement()` - define what to unquote - **Quotes** arguments

**Quoting function** is a term similar to, but more precise than, **non-standard evaluation (NSE)**

- Tidyverse functions - e.g., `dplyr::mutate()`, `tidyr::pivot_longer()`
- Base functions - e.g., `library()`, `subset()`, `with()`

**Quoting function** arguments cannot be evaluated outside of the function:
```{r, error = TRUE}
cement(Good, afternoon, Cohort) # No problem
Good      # Error!
```

**Non-quoting (standard) function** arguments can be evaluated:
```{r}
paste("Good", "afternoon", "Cohort")
"Good"
```
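As a small illustration of the base quoting functions listed above, the sketch below uses `subset()` and `with()` on the built-in `mtcars` data. The object names `four_cyl` and `avg_mpg` are made up for this example. The point is only that `cyl` and `mpg` are quoted and then looked up in the data frame rather than evaluated as ordinary objects.

```{r}
# `cyl == 4` is quoted by subset() and evaluated inside mtcars
four_cyl <- subset(mtcars, cyl == 4)
nrow(four_cyl)

# `mean(mpg)` is quoted by with() and evaluated inside four_cyl
avg_mpg <- with(four_cyl, mean(mpg))
avg_mpg
```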


## Quoting

**Capture expressions without evaluating them**

```{r, echo = FALSE}
# Base R rows: quote()/alist() for the developer,
# substitute()/as.list(substitute(...())) for the user
data.frame(
  t = rep(c("One", "Many"), 3),
  Developer = c("`expr()`","`exprs()`", 
                "`quote()`", "`alist()`",
                "", ""),
  User = c("`enexpr()`", "`enexprs()`", 
           "`substitute()`", "`as.list(substitute(...()))`",
           "`ensym()`", "`ensyms()`"),
  type = c("Expression", "Expression", "R Base", "R Base", "Symbol", "Symbol")) |>
  dplyr::group_by(type) |>
  gt::gt() |>
  gt::tab_row_group(label = "R Base (Quotation)", rows = type == "R Base") |>
  gt::tab_row_group(label = "Symbol (Quasiquotation)", rows = type == "Symbol") |>
  gt::tab_row_group(label = "Expression (Quasiquotation)", rows = type == "Expression") |>
  gt::cols_label(t = "") |>
  gt::tab_options(row_group.font.weight = "bold") |>
  gt::tab_style(style = gt::cell_text(align = "center", weight = "bold"), 
                locations = gt::cells_column_labels()) |>
  gt::tab_style(style = gt::cell_borders(style = "hidden"), locations = gt::cells_body()) |>
  gt::tab_style(style = gt::cell_borders(sides = "top", style = "solid"),
                locations = gt::cells_body(rows = c(1, 3, 5))) |>
  gt::tab_style(style = gt::cell_borders(sides = "bottom", style = "solid"),
                locations = gt::cells_body(rows = c(2, 4))) |>
  gt::cols_align("center", columns = -1) |>
  gt::fmt_markdown() |>
  gt::cols_width(t ~ px(100))
```

- Non-base functions are from **rlang**
- **Developer** - From you, direct, fixed, interactive
- **User** - From the user, indirect, varying, programmatic

Also: 

- `bquote()` provides a limited form of quasiquotation
- `~`, the formula, is a quoting function (see [Section 20.3.4](https://adv-r.hadley.nz/evaluation.html#quosure-impl))

### `expr()` and `exprs()` {-}
```{r}
expr(x + y)
exprs(exp1 = x + y, exp2 = x * y)
```
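For comparison with `expr()`, here is a minimal sketch of the two base quoting tools mentioned above: `bquote()`, which unquotes with `.()`, and the formula operator `~`, which quotes both of its sides. The names `val` and `fml` are invented for the example.

```{r}
val <- 10

# quote() captures everything; bquote() evaluates only what is wrapped in .()
quote(val + y)
bquote(.(val) + y)

# A formula quotes its operands: neither `y` nor `x` needs to exist
fml <- y ~ x
class(fml)
all.vars(fml)
```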

### `enexpr()`^[`enexpr()` = **en**rich `expr()`] and `enexprs()` {-}
```{r}
f <- function(x) enexpr(x)
f(a + b + c)

f2 <- function(x, y) enexprs(exp1 = x, exp2 = y)
f2(x = a + b, y = c + d)
```

### `ensym()` and `ensyms()` {-}

- **[Remember](https://adv-r.hadley.nz/expressions.html#symbols):** A symbol represents the name of an object and can only be length 1.
- These are stricter than `enexpr()` and `enexprs()`: they accept only a symbol or a string (which is converted to a symbol)

```{r}
f <- function(x) ensym(x)
f(a)

f2 <- function(x, y) ensyms(sym1 = x, sym2 = y)
f2(x = a, y = "b")
```
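To make the "stricter" point concrete, the sketch below passes the same kinds of input to `enexpr()` and `ensym()`. The wrapper names `capture_expr()` and `capture_sym()` are hypothetical helpers for this illustration, and the error is caught with `tryCatch()` so the chunk runs without `error = TRUE`.

```{r}
library(rlang)

capture_expr <- function(x) enexpr(x)  # hypothetical helper
capture_sym  <- function(x) ensym(x)   # hypothetical helper

capture_expr(a + b)   # a call is fine for enexpr()
capture_sym("a")      # a string is converted to a symbol

# A call is rejected by ensym(); keep the error message instead of stopping
res <- tryCatch(capture_sym(a + b), error = function(e) conditionMessage(e))
res
```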


## Unquoting

**Selectively evaluate parts of an expression**

- Merges ASTs into a template
- 1 argument `!!` (**unquote**, **bang-bang**)
  - Unquoting a *function call* evaluates it and inserts the result
  - Unquoting a *function (name)* replaces the function (alternatively use `call2()`)
- \>1 arguments `!!!` (**unquote-splice**, **bang-bang-bang**, **triple bang**)
- `!!` and `!!!` only work like this inside quoting functions that use rlang

### Basic unquoting {-}

**One argument**
```{r}
x <- expr(a + b)
y <- expr(c / d)
```

```{r, collapse = TRUE}
expr(f(x, y))      # No unquoting
expr(f(!!x, !!y))  # Unquoting
```

**Multiple arguments**
```{r}
z <- exprs(a + b, c + d)
w <- exprs(exp1 = a + b, exp2 = c + d)
```

```{r, collapse = TRUE}
expr(f(z))      # No unquoting
expr(f(!!!z))   # Unquoting
expr(f(!!!w))   # Unquoting when named
```


### Special usages or cases {-}

For example, get the AST of an expression
```{r, collapse = TRUE}
lobstr::ast(x)
lobstr::ast(!!x)
```


Unquote *function call*
```{r, collapse = TRUE}
expr(f(!!mean(c(100, 200, 300)), y))
```

Unquote *function*
```{r, collapse = TRUE}
f <- expr(sd)
expr((!!f)(x))
expr((!!f)(!!x + !!y))
```
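Two of the bullets above can be sketched briefly: `call2()` as the alternative to unquoting a function, and the fact that `!!` has no special meaning outside a quoting function. This is a minimal illustration rather than a complete treatment. `x` is redefined here only so the chunk runs on its own.

```{r}
library(rlang)

x <- expr(a + b)

# call2() builds the call from its parts, so no extra parentheses are needed
call2("sd", x, na.rm = TRUE)
call2(expr(stats::sd), x)

# Outside a quoting function, `!!` is just double negation
!!TRUE
```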

## Non-quoting

In base R, only `bquote()` provides a limited form of quasiquotation.

The rest of base R selectively quotes or does not quote arguments (rather than unquoting). 

Four basic forms of quoting/non-quoting:

1. **Pair of functions** - Quoting and non-quoting
    - e.g., `$` (quoting) and `[[` (non-quoting)
2. **Pair of arguments** - Quoting and non-quoting
    - e.g., `rm(...)` (quoting) and `rm(list = c(...))` (non-quoting)
3. **Arg to control quoting**
    - e.g., `library(rlang)` (quoting) and `library(pkg, character.only = TRUE)` (where `pkg <- "rlang"`)
4. **Quote if evaluation fails**
    - `help(var)` (where `var` is not defined) - Evaluation fails, so quote: show help for var
    - `help(var)` (where `var <- "mean"`) - No quote, show help for mean
    - `help(var)` (where `var <- 10`) - Evaluation does not give a string, so quote: show help for var
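The first three forms can be tried directly in base R. The sketch below is only an illustration: the objects `df`, `var`, `a`, `b`, and `pkg` are invented for the example, and `"stats"` stands in for any installed package. The fourth form is left out so that the chunk does not open help pages.

```{r}
df <- data.frame(x = 1:3)
var <- "x"

# 1. Pair of functions: `$` quotes its right-hand side, `[[` evaluates it
df$x
df[[var]]

# 2. Pair of arguments: `...` is quoted, `list` is evaluated
a <- 1
b <- 2
rm(a)
rm(list = "b")

# 3. Argument that controls quoting
pkg <- "stats"
library(pkg, character.only = TRUE)
```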


## `...` (dot-dot-dot) [When using `...` with quoting]

- Sometimes we need to supply an *arbitrary* list of expressions or arguments to a function (`...`)
- But we need a way to use these when we don't necessarily have the names
- Remember `!!` and `!!!` only work with functions that use rlang
- Can use `list2(...)` to turn `...` into "tidy dots" which *can* be unquoted and spliced
- `list2()` is required if `!!`, `!!!`, or `:=` will be used in `...`
- `list2()` is a wrapper around `dots_list()` with the most common defaults

**No need for `list2()`**
```{r, collapse = TRUE}
d <- function(...) data.frame(list(...))
d(x = c(1:3), y = c(2, 4, 6))
```

**Require `list2()`**
```{r, collapse = TRUE, error = TRUE}
vars <- list(x = c(1:3), y = c(2, 4, 6))
d(!!!vars)
d2 <- function(...) data.frame(list2(...))
d2(!!!vars)
# Same result but x and y evaluated later
vars_expr <- exprs(x = c(1:3), y = c(2, 4, 6))
d2(!!!vars_expr)  
```

Setting argument names from variables with `:=`
```{r}
nm <- "z"
val <- letters[1:4]
d2(x = 1:4, !!nm := val)
```
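Since `list2()` wraps `dots_list()`, calling `dots_list()` directly gives access to its other options. The sketch below assumes you want to keep only the first of any repeated argument names, using the `.homonyms` argument. The helper `first_wins()` and the list `extra` are hypothetical names for this illustration.

```{r}
library(rlang)

# Hypothetical helper: like list2(), but keep only the first of any repeated names
first_wins <- function(...) dots_list(..., .homonyms = "first")

extra <- list(y = 2, x = 3)
first_wins(x = 1, !!!extra)
```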

## `exec()` [Making your own ...] {-}

What if your function doesn't have tidy dots?

You can't use `!!` or `:=` if the function doesn't use rlang's dynamic dots
```{r, collapse=TRUE, error = TRUE}
my_mean <- function(x, arg_name, arg_val) {
  mean(x, !!arg_name := arg_val)
}

my_mean(c(NA, 1:10), arg_name = "na.rm", arg_val = TRUE)
```

Let's use the `...` from `exec()`
```{r, eval = FALSE}
exec(.fn, ..., .env = caller_env())
```

```{r, collapse=TRUE}
my_mean <- function(x, arg_name, arg_val) {
  exec("mean", x, !!arg_name := arg_val)
}

my_mean(c(NA, 1:10), arg_name = "na.rm", arg_val = TRUE)
```

Note that you do not unquote `arg_val`.

Also, `exec()` is useful for mapping over a list of functions:

```{r}
x <- c(runif(10), NA)
funs <- c("mean", "median", "sd")
purrr::map_dbl(funs, exec, x, na.rm = TRUE)
```

## Base R `do.call()` {-}

`do.call(what, args)`

- `what` is a function to call
- `args` is a list of arguments to pass to the function.

```{r, collapse = TRUE}
nrow(mtcars)
mtcars3 <- do.call("rbind", list(mtcars, mtcars, mtcars))
nrow(mtcars3)
```
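`do.call()` can also cover the case where an argument name is stored in a variable, as in `my_mean()`: build a named list first, then pass it as `args`. The function name `my_mean_base()` is a hypothetical base R counterpart written for this sketch.

```{r}
# Hypothetical base R counterpart of my_mean(): build the argument list first
my_mean_base <- function(x, arg_name, arg_val) {
  args <- c(list(x), setNames(list(arg_val), arg_name))
  do.call("mean", args)
}

my_mean_base(c(NA, 1:10), arg_name = "na.rm", arg_val = TRUE)
```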


### Exercise 19.6.5 #1 {-}

One way to implement `exec()` is shown here. Describe how it works. What are the key ideas?

```{r}
exec_ <- function(f, ..., .env = caller_env()) {
  args <- list2(...)
  do.call(f, args, envir = .env)
}
```

## Case Studies (side note)

Sometimes you want to run a bunch of models, without having to copy/paste each one.

BUT, you also want the summary function to show the appropriate model call, 
not one with hidden variables (e.g., `lm(y ~ x, data = data)`). 

We can achieve this by building expressions and unquoting as needed:

```{r, collapse = TRUE}
library(purrr)

vars <- data.frame(x = c("hp", "hp"),
                   y = c("mpg", "cyl"))

x_sym <- syms(vars$x)
y_sym <- syms(vars$y)

formulae <- map2(x_sym, y_sym, \(x, y) expr(!!y ~ !!x))
formulae
models <- map(formulae, \(f) expr(lm(!!f, data = mtcars)))
summary(eval(models[[1]]))
```

As a function:
```{r, collapse = TRUE}
lm_df <- function(df, data) {
  x_sym <- map(df$x, as.symbol)
  y_sym <- map(df$y, as.symbol)
  data <- enexpr(data)  # capture the data argument so its name appears in the call
  
  formulae <- map2(x_sym, y_sym, \(x, y) expr(!!y ~ !!x))
  models <- map(formulae, \(f) expr(lm(!!f, data = !!data)))
  
  map(models, \(m) summary(eval(m)))
}

vars <- data.frame(x = c("hp", "hp"),
                   y = c("mpg", "cyl"))
lm_df(vars, data = mtcars)
```
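To see what "hidden variables" means in practice, the sketch below compares the call stored by a model fitted through a placeholder variable with one whose formula was injected before evaluation. The names `fml`, `naive`, and `built` are made up for this illustration. Both models have the same coefficients and only the recorded call differs.

```{r}
library(rlang)

# Naive approach: the stored call only mentions the placeholder `fml`
fml <- mpg ~ hp
naive <- lm(fml, data = mtcars)
naive$call

# Built approach: the symbols are injected before lm() is called
y <- sym("mpg")
x <- sym("hp")
built <- eval(expr(lm(!!y ~ !!x, data = mtcars)))
built$call
```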