---
engine: knitr
title: Evaluation
---

## Learning objectives:

- Learn evaluation basics
- Learn about **quosures** and **data masks**
- Understand tidy evaluation

```{r message=FALSE,warning=FALSE}
library(rlang)
library(purrr)
```

## A bit of a recap

- Metaprogramming: separating the description of an action from the action itself, that is, separating code from its evaluation.
- Quasiquotation: combine code written by the *function's author* with code written by the *function's user*.
- Unquotation: gives the *user* the ability to evaluate parts of a quoted argument.
- Evaluation: gives the *developer* the ability to evaluate quoted expressions in custom environments.

**Tidy evaluation**: quasiquotation, quosures and data masks

## Evaluation basics

We use `eval()` to evaluate, run, or execute expressions. It takes two main arguments:

- `expr`: the object to evaluate, either an expression or a symbol.
- `envir`: the environment in which to evaluate the expression, i.e. where to look up values.
  Defaults to the current environment.

```{r}
sumexpr <- expr(x + y)
x <- 10
y <- 40
eval(sumexpr)
```

```{r}
eval(sumexpr, envir = env(x = 1000, y = 10))
```


## Application: reimplementing `source()`

What do we need?

- Read the file being sourced.
- Parse its contents into a list of expressions (parsing returns them unevaluated, i.e. quoted).
- Evaluate each expression, saving the result.
- Return the last result.

```{r}
source2 <- function(path, env = caller_env()) {
  # Read the whole file into a single string
  file <- paste(readLines(path, warn = FALSE), collapse = "\n")
  # Parse the string into a list of expressions
  exprs <- parse_exprs(file)

  # Evaluate each expression in turn, keeping only the last value
  res <- NULL
  for (i in seq_along(exprs)) {
    res <- eval(exprs[[i]], env)
  }

  invisible(res)
}
```

The real `source()` is much more complex.
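
To see `source2()` in action, here is a minimal sketch. The two-line script, the temporary file and the names `tmp`, `e` and `out` are made up for illustration; the function definition is repeated only so that the chunk runs on its own.

```{r}
library(rlang)

# Same definition as above, repeated so this chunk is self-contained
source2 <- function(path, env = caller_env()) {
  file <- paste(readLines(path, warn = FALSE), collapse = "\n")
  exprs <- parse_exprs(file)

  res <- NULL
  for (i in seq_along(exprs)) {
    res <- eval(exprs[[i]], env)
  }

  invisible(res)
}

# Hypothetical script written to a temporary file
tmp <- tempfile(fileext = ".R")
writeLines(c("a <- 2", "a * 21"), tmp)

# Evaluate in a fresh environment so the current workspace is left untouched
e <- env()
out <- source2(tmp, env = e)

out              # value of the last expression in the file
env_get(e, "a")  # `a` was created inside `e`
unlink(tmp)
```

Passing an explicit `env` makes it easy to check where the sourced code creates its objects.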

## Quosures

**Quosures** are a data structure from `rlang` containing both an expression and an environment.

The name combines *quoting* + *closure*, because a quosure quotes the expression and encloses the environment.

Three ways to create them:

- Used mostly for learning: `new_quosure()`, which creates a quosure from its components.

```{r}
q1 <- rlang::new_quosure(expr(x + y),
                         env(x = 1, y = 10))
```

With a quosure, we can use `eval_tidy()` directly.

```{r}
rlang::eval_tidy(q1)
```

And get its components

```{r}
rlang::get_expr(q1)
rlang::get_env(q1)
```

Or set them

```{r}
q1 <- set_env(q1, env(x = 3, y = 4))
eval_tidy(q1)
```


- Used in the real world: `enquo()` or `enquos()`, to capture user-supplied expressions. They record the environment in which the user's expression was written.

```{r}
foo <- function(x) enquo(x)
quo_foo <- foo(a + b)
```

```{r}
get_expr(quo_foo)
get_env(quo_foo)
```

- Almost never used: `quo()` and `quos()`, which exist to match `expr()` and `exprs()`.

## Quosures and `...`

Quosures are often just a convenience, but they are essential when working with `...`, because each argument passed through `...` can be associated with a different environment.

```{r}
g <- function(...) {
  ## Creating our quosures from ...
  enquos(...)
}

createQuos <- function(...) {
  ## symbol from the function environment
  x <- 1
  g(..., f = x)
}
```

```{r}
## symbol from the global environment
x <- 0
qs <- createQuos(global = x)
qs
```

## Other facts about quosures

Formulas were the inspiration for quosures because they also capture an expression and an environment.

```{r}
f <- ~runif(3)
str(f)
```

There was an early version of tidy evaluation based on formulas, but there's no easy way to implement quasiquotation with them.
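
As a small illustration of the two pieces a formula carries, the sketch below pulls them apart with base R only. The name `fm` is just an example.

```{r}
fm <- ~ runif(3)

fm[[2]]          # the captured expression (right-hand side of the formula)
environment(fm)  # the environment where the formula was created
```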

Quosures are actually call objects

```{r}
q4 <- new_quosure(expr(x + y + z))
class(q4)
is.call(q4)
```

with an attribute to store the environment

```{r}
attr(q4, ".Environment")
```


**Nested quosures**

With quasiquotation we can embed quosures in expressions.

```{r}
q2 <- new_quosure(expr(x), env(x = 1))
q3 <- new_quosure(expr(x), env(x = 100))

nq <- expr(!!q2 + !!q3)
```

And evaluate them

```{r}
eval_tidy(nq)
```

But for printing it's better to use `expr_print(x)`

```{r}
expr_print(nq)
nq
```

## Data mask

A data mask is a data frame in which the evaluated code looks first for its variable definitions.

Used in packages like dplyr and ggplot2.

To use it, we supply the data mask as the second argument to `eval_tidy()`

```{r}
q1 <- new_quosure(expr(x * y), env(x = 100))
df <- data.frame(y = 1:10)

eval_tidy(q1, df)
```

Everything together, in one function.

```{r}
with2 <- function(data, expr) {
  expr <- enquo(expr)
  eval_tidy(expr, data)
}
```

But we need to create the objects that are not part of our data mask

```{r}
x <- 100
with2(df, x * y)
```

This is also doable with `base::eval()` instead of `rlang::eval_tidy()`, but we have to use `base::substitute()` instead of `enquo()` (like we did for `enexpr()`) and we need to specify the environment.

```{r}
with3 <- function(data, expr) {
  expr <- substitute(expr)
  eval(expr, data, caller_env())
}
```

```{r}
with3(df, x * y)
```

## Pronouns: .data$ and .env$

**Ambiguity!!**

The value of an object can come from the environment or from the data mask

```{r}
q1 <- new_quosure(expr(x * y + x), env = env(x = 1))
df <- data.frame(y = 1:5,
                 x = 10)

eval_tidy(q1, df)
```

To be explicit, we use pronouns:

- `.data$x`: `x` from the data mask
- `.env$x`: `x` from the environment


```{r}
q1 <- new_quosure(expr(.data$x * y + .env$x), env = env(x = 1))
eval_tidy(q1, df)
```

## Application: reimplementing `base::subset()`

`base::subset()` works like `dplyr::filter()`: it selects rows of a data frame given an expression.

What do we need?

- Quote the expression to filter
- Figure out which rows in the data frame pass the filter
- Subset the data frame

```{r}
subset2 <- function(data, rows) {
  rows <- enquo(rows)
  rows_val <- eval_tidy(rows, data)
  stopifnot(is.logical(rows_val))

  data[rows_val, , drop = FALSE]
}
```

```{r}
sample_df <- data.frame(a = 1:5, b = 5:1, c = c(5, 3, 2, 4, 1))

# Shorthand for sample_df[sample_df$b == sample_df$c, ]
subset2(sample_df, b == c)
```

## Using tidy evaluation

Most of the time we won't call `eval_tidy()` directly; instead we call a function that uses it (becoming developer AND user).

**Use case**: resample and subset

We have a function that resamples a dataset:

```{r}
resample <- function(df, n) {
  idx <- sample(nrow(df), n, replace = TRUE)
  df[idx, , drop = FALSE]
}
```

```{r}
resample(sample_df, 10)
```

But we also want to subset, so we want a function that allows us to resample and subset (with `subset2()`) in a single step.

First attempt:

```{r}
subsample <- function(df, cond, n = nrow(df)) {
  df <- subset2(df, cond)
  resample(df, n)
}
```

```{r error=TRUE}
subsample(sample_df, b == c, 10)
```

What happened?

`subsample()` doesn't quote any arguments, so `cond` is evaluated normally, outside the data mask.

So we have to quote `cond` and unquote it when we pass it to `subset2()`

```{r}
subsample <- function(df, cond, n = nrow(df)) {
  cond <- enquo(cond)

  df <- subset2(df, !!cond)
  resample(df, n)
}
```

```{r}
subsample(sample_df, b == c, 10)
```

**Be careful!** There is potential ambiguity:

```{r}
threshold_x <- function(df, val) {
  subset2(df, x >= val)
}
```

What would happen if `x` exists in the calling environment but doesn't exist in `df`? Or if `val` also exists in `df`?

So, as developers of `threshold_x()` and users of `subset2()`, we have to add some pronouns:

```{r}
threshold_x <- function(df, val) {
  subset2(df, .data$x >= .env$val)
}
```


Just remember:

> As a general rule of thumb, as a function author it’s your responsibility
> to avoid ambiguity with any expressions that you create;
> it’s the user’s responsibility to avoid ambiguity in expressions that they create.


## Base evaluation

Check Section 20.6 in the book!
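
As a starting point, here is a minimal sketch of a base-R counterpart to `subset2()`, using `substitute()` and `eval()` in place of `enquo()` and `eval_tidy()`. The names `subset_base`, `demo_df` and `res` are hypothetical, and this is an illustration of the idea rather than the actual code of `base::subset()`.

```{r}
# Hypothetical base-R version of subset2(); not the code of base::subset()
subset_base <- function(data, rows) {
  rows <- substitute(rows)                      # capture the expression unevaluated
  rows_val <- eval(rows, data, parent.frame())  # look in data first, then the caller
  stopifnot(is.logical(rows_val))

  data[rows_val, , drop = FALSE]
}

demo_df <- data.frame(a = 1:5, b = 5:1, c = c(5, 3, 2, 4, 1))
res <- subset_base(demo_df, b == c)
res
```

Because `substitute()` keeps only the expression and not its environment, this version cannot be wrapped with `!!` the way `subsample()` wraps `subset2()`.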