bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

commit 6cb4390eba13749debb57a7def53362d52ed8ff9
parent e4a24be756fa8442b88464f7de79dec12e64bd5c
Author: Steffi LaZerte <steffi@steffi.ca>
Date:   Thu, 26 Sep 2024 05:35:16 -0500

Steffi's Chp 17 (#71)

* First draft

* Final notes for Chp 17
Diffstat:
M 17_Big_picture.Rmd | 199 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 195 insertions(+), 4 deletions(-)

diff --git a/17_Big_picture.Rmd b/17_Big_picture.Rmd
@@ -2,12 +2,203 @@
 
 **Learning objectives:**
 
-- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
+- Become familiar with some metaprogramming principles and how they relate to each other
+- Review vocabulary associated with metaprogramming
 
-## SLIDE 1
+```{r}
+library(rlang)
+library(lobstr)
+```
+
+
+## Code is data
+
+- **expression** - Captured code (*call*, *symbol*, *constant*, or *pairlist*)
+- Use `rlang::expr()`[^1] to capture code directly
+
+```{r}
+expr(mean(x, na.rm = TRUE))
+```
+
+- Use `rlang::enexpr()` to capture code indirectly
+
+```{r}
+capture_it <- function(x) { # 'automatically quotes first argument'
+  enexpr(x)
+}
+capture_it(a + b + c)
+```
+
+- 'Captured' code can be modified (like a list)!
+  - First element is the function, next elements are the arguments
+
+```{r}
+f <- expr(f(x = 1, y = 2))
+names(f)
+
+ff <- fff <- f # Create two copies
+
+ff$z <- 3 # Add an argument to one
+fff[[2]] <- NULL # Remove an argument from another
+
+f
+ff
+fff
+```
+
+> More on this next week!
+
+[^1]: Equivalent to `base::bquote()`
+
+## Code is a tree
+
+- **Abstract syntax tree** (AST) - Almost every language represents code as a tree
+- Use `lobstr::ast()` to inspect these code trees
+
+```{r}
+ast(f1(f2(a, b), f3(1)))
+ast(1 + 2 * 3)
+```
+
+
+## Code can generate code
+
+- `rlang::call2()` creates a function call
+
+```{r}
+call2("f", 1, 2, 3)
+```
+
+- Going backwards from the tree, can use functions to create calls
+
+```{r}
+call2("f1", call2("f2", expr(a), expr(b)), call2("f3", 1)) # a and b as symbols, not strings
+call2("+", 1, call2("*", 2, 3))
+```
+
+- `!!` bang-bang - **unquote operator**
+  - inserts previously defined expressions into the current one
+
+```{r}
+xx <- expr(x + x)
+yy <- expr(y + y)
+expr(xx / yy) # Nope!
+
+expr(!!xx / !!yy) # Yup!
+```
+
+```{r}
+cv <- function(var) {
+  var <- enexpr(var) # Get user's expression
+  expr(sd(!!var) / mean(!!var)) # Insert user's expression
+}
+
+cv(x)
+cv(x + y)
+```
+
+- Avoid `paste()` for building code
+  - Problems with non-syntactic names and precedence among expressions
+
+> "You might think this is an esoteric concern, but not worrying about it when generating SQL code in web applications led to SQL injection attacks that have collectively cost billions of dollars."
+
+## Evaluation runs code
+
+- **evaluate** - run/execute an expression
+- need both expression and environment
+- `eval()` uses current environment if not set
+- manual evaluation means you can tweak the environment!
+
+```{r}
+xy <- expr(x + y)
+
+eval(xy, env(x = 1, y = 10))
+eval(xy, env(x = 2, y = 100))
+```
+
+
+## Customizing evaluations with functions
+- Can also bind names to functions in supplied environment
+- Allows overriding function behaviour
+- This is how dplyr generates SQL for working with databases
+
+For example...
+```{r}
+string_math <- function(x) {
+  e <- env(
+    caller_env(),
+    `+` = function(x, y) paste(x, y),
+    `*` = function(x, y) strrep(x, y)
+  )
+
+  eval(enexpr(x), e)
+}
+
+cohort <- 9
+string_math("Hello" + "cohort" + cohort)
+string_math(("dslc" + "is" + "awesome---") * cohort)
+```
+
+
+## Customizing evaluation with data
+
+- Look for variables inside data frame
+- **Data mask** - typically a data frame
+- use `rlang::eval_tidy()` rather than `eval()`
+
+```{r}
+df <- data.frame(x = 1:5, y = sample(5))
+eval_tidy(expr(x + y), df)
+```
+
+Catch user input with `enexpr()`...
+
+```{r}
+with2 <- function(df, expr) {
+  eval_tidy(enexpr(expr), df)
+}
+
+with2(df, x + y)
+```
+
+But there's a bug!
+
+- Evaluates in environment inside `with2()`, but the expression likely refers
+  to objects in the Global environment
+
+```{r}
+with2 <- function(df, expr) {
+  a <- 1000
+  eval_tidy(enexpr(expr), df)
+}
+
+df <- data.frame(x = 1:3)
+a <- 10
+with2(df, x + a)
+```
+
+- Solved with Quosures...
+
+## Quosures
+
+- A **quosure** bundles an expression with an environment
+- Use `enquo()` instead of `enexpr()` (with `eval_tidy()`)
+
+```{r}
+with2 <- function(df, expr) {
+  a <- 1000
+  eval_tidy(enquo(expr), df)
+}
+
+df <- data.frame(x = 1:3)
+a <- 10
+with2(df, x + a)
+```
+
+> "Whenever you use a data mask, you must always use `enquo()` instead of `enexpr()`."
+
+This comes back in Chapter 20.
 
-- ADD SLIDES AS SECTIONS (`##`).
-- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
 
 ## Meeting Videos