07.qmd (11138B)
---
engine: knitr
title: Environments
---

## Learning objectives:

- Create, modify, and inspect environments

- Recognize special environments

- Understand how environments power lexical scoping and namespaces

# 7.2 Environment Basics

## Environments are similar to lists

Generally, an environment is similar to a named list, with four important exceptions:

- Every name must be unique.

- The names in an environment are not ordered.

- An environment has a parent.

- Environments are not copied when modified.

::: {.notes}

- Lists can have duplicate names, e.g. x <- base::list(a = 1, a = 1)

- Lists have an inherent order, e.g. x[[1]] returns the first element of the list above

- Environments are modified in place (reference semantics) rather than copied on modification, e.g.:

Modifying a list produces a different memory address
base::identical(lobstr::obj_addr(x), lobstr::obj_addr({x[[1]] <- 2; x}))

Modifying an environment leaves the memory address unchanged
y <- rlang::env()
y$'a' <- 1

base::identical(
  lobstr::obj_addr(y),
  lobstr::obj_addr({y[['a']] <- 2; y})
)

:::

## Create a new environment with `{rlang}`

:::: {.columns}

::: {.column}

```{r}
e1 <- rlang::env(
  rlang::global_env(),
  a = FALSE,
  b = "a",
  c = 2.3,
  d = 1:3,
)
```

:::

::: {.column}

```{r}
e2 <- rlang::new_environment(
  data = list(
    a = FALSE,
    b = "a",
    c = 2.3,
    d = 1:3
  ),
  parent = rlang::global_env()
)
```

:::

::::

::: {.notes}

rlang::env() creates a child of the current environment by default and takes a variable number of named objects to populate it.

rlang::new_environment() creates a child of the empty environment by default and takes a named list of objects to populate it.
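
The two defaults described above can be checked directly. The sketch below is illustrative only: the names `child_of_current` and `child_of_empty` are made up for this note, and `{rlang}` is assumed to be installed. Both comparisons should return `TRUE`.

```r
# rlang::env() with no parent supplied: the parent is the calling environment
child_of_current <- rlang::env(a = 1)
parent_is_current <- identical(
  rlang::env_parent(child_of_current),
  rlang::current_env()
)

# rlang::new_environment() with no parent supplied: the parent is the empty environment
child_of_empty <- rlang::new_environment(list(a = 1))
parent_is_empty <- identical(
  rlang::env_parent(child_of_empty),
  rlang::empty_env()
)

parent_is_current
parent_is_empty
```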

:::

## An environment associates, or **binds**, a set of names to a set of values

:::: {.columns}

::: {.column}

- A bag of names with no implied order

- Bindings live within the environment



:::

::: {.column}

- Environments have reference semantics and thus can contain themselves

```{r}
#| eval: false
e1$d <- e1
```



:::

::::

::: {.notes}

No implied order (unlike a list, so no index subsetting)

The grey box represents an environment

The blue dot represents the parent environment

Letters represent variable names bound within the environment

---

With reference semantics, a name stores a reference to the object's memory address rather than a copy of its value (as happens with value semantics)

:::

## Inspect environments with `{rlang}`

```{r}
rlang::env_print(e1)
```

```{r}
rlang::env_names(e1)
```

```{r}
rlang::env_has(e1, "a")
```

```{r}
rlang::env_get(e1, "a")
```

```{r}
rlang::env_parent(e1)
```

::: {.notes}

base::print() displays only the memory address, so it is less helpful than rlang::env_print()

:::

## By default, the current environment is your global environment

- The current environment is where code is currently executing

- The global environment *is* your current environment when working interactively

```{r}
rlang::current_env()

rlang::global_env()

base::identical(
  rlang::current_env(),
  rlang::global_env()
)
```

::: {.notes}

If you open a new R session, you are in the global environment by default (unless this is changed by, say, the `.Rprofile` file)

The current environment isn't *always* your global environment. Your current environment changes as you move into and out of functions, for example.
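
A minimal sketch of that last point, assuming `{rlang}` is installed. `report_env()` is a hypothetical helper written only for this note. The first comparison should be `FALSE` and the second `TRUE`.

```r
# Return the environment in which the function body runs
report_env <- function() rlang::current_env()

inside <- report_env()

# The execution environment is a fresh environment, not the global one ...
is_global <- identical(inside, rlang::global_env())

# ... and its parent is the environment in which report_env() was defined
parent_is_fn_env <- identical(
  rlang::env_parent(inside),
  rlang::fn_env(report_env)
)

is_global
parent_is_fn_env
```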

:::

## Every environment has a parent environment

- Allows for lexical scoping

```{r}
e2a <- rlang::env(d = 4, e = 5)

e2b <- rlang::env(e2a, a = 1, b = 2, c = 3)

rlang::env_parent(e2b)

rlang::env_parents(e2b)
```



::: {.notes}

Lexical scoping means that if a name is not found in an environment, R looks in its parent (and so on)

Lexical scoping is in contrast to dynamic scoping, where a name is looked up at run time in the environment the function was called from

:::

## Only the **empty** environment does not have a parent

```{r}
e2c <- rlang::env(rlang::empty_env(), d = 4, e = 5)

e2d <- rlang::env(e2c, a = 1, b = 2, c = 3)
```

{width=50% height=50%}

::: {.notes}

The lack of a parent is shown by the hollow blue dot

:::

## All environments eventually terminate with the empty environment

```{r}
rlang::env_parents(e2b, last = rlang::empty_env())
```

::: {.notes}

The empty environment typically isn't shown but can be displayed by setting the `last` parameter of `rlang::env_parents()`

:::

## Be wary of using `<<-`

- Regular assignment (`<-`) always creates a variable in the current environment

- Super assignment (`<<-`) does one of two things:

    1. modifies the variable if it exists in a parent environment

    2. creates the variable in the global environment if it does not exist

::: {.notes}

`<<-` starts its search in the parent of the current environment and walks up the chain of parents, using the first instance of the variable it finds

Package environments sit above the global environment on the search path, so they are ancestors of the global environment

e1 <- rlang::env()
e2 <- rlang::env(e1)

rlang::env_poke(e1, "a", 1)

Evaluating the super assignment in e2 modifies `a` in its parent, e1
base::eval(base::quote(a <<- 2), e2)
:::


## Retrieve environment variables with `$`, `[[`, or `{rlang}` functions

```{r}
e3 <- rlang::env(x = 1, y = 2)

e3$x
```

```{r}
e3[["x"]]
```

```{r}
rlang::env_get(e3, "x")
```

```{r}
#| error: true
e3[[1]]
```

```{r}
#| error: true
e3["x"]
```

## Add bindings to an environment with `$`, `[[`, or `{rlang}` functions

```{r}
e3$z <- 3

e3[["z"]] <- 3

rlang::env_poke(e3, "z", 3)

rlang::env_bind(e3, z = 3, b = 20)

rlang::env_unbind(e3, "z")
```

::: {.notes}

rlang::env_has() is used to check if a variable exists within the environment

rlang::env_unbind() is used to unbind a variable from an environment

:::

## Special cases for binding environment variables

- `rlang::env_bind_lazy()` creates delayed bindings

    - evaluated the first time they are accessed

- `rlang::env_bind_active()` creates active bindings

    - re-computed every time they’re accessed

# 7.3 Recursing over environments

## Explore environments recursively

```{r}
where <- function(name, env = rlang::caller_env()) {
  if (identical(env, rlang::empty_env())) {
    # Base case
    stop("Can't find ", name, call. = FALSE)
  } else if (rlang::env_has(env, name)) {
    # Success case
    env
  } else {
    # Recursive case
    where(name, rlang::env_parent(env))
  }
}
```

::: {.notes}

Why is recursing over environments important?

Recursion is not the same thing as iteration

:::

# 7.4 Special Environments

## Attaching packages changes the search path

- The **search path** is the order in which R will look through environments for objects

- Attached packages become a parent of the global environment

- The immediate parent of the global environment is the last package attached



::: {.notes}

Autoloads and base are always the last two environments on the search path

Autoloads uses lazy loading to make large package objects (like datasets) available without taking up memory

Functions within base are used to load all other packages

:::

## Attaching packages changes the search path

:::: {.columns}

::: {.column}

```{r}
rlang::search_envs()
```

:::

::: {.column}

```{r}
library(rlang)

rlang::search_envs()
```

:::

::::

::: {.notes}

Attaching `{rlang}` modifies the search path

:::

## Functions enclose their current environment

- A function encloses the current environment when it is created

```{r}
y <- 1

f <- function(x) x + y

rlang::fn_env(f)
```



::: {.notes}

The function environment is represented by the black dot

The function `f()` knows where to look for `y` thanks to the function environment

:::

## Functions enclose their current environment

- `g()` is *bound by* the environment `e` but *binds* the global environment

- The function environment is the global environment but the binding environment is `e`

```{r}
e <- rlang::env()

e$g <- function() 1

rlang::fn_env(e$g)
```

## Functions enclose their current environment



## Namespaces ensure package environment independence

- Every package has an underlying namespace

- Every function in a package is associated with a package environment and a namespace environment

    - Package environments contain exported objects

    - Namespace environments contain exported and internal objects

::: {.notes}

Contrast `dplyr::across()` and `dplyr:::across_glue_mask()`

`sd()` is bound to the `{stats}` namespace environment

:::

## Namespaces ensure package environment independence



## Namespaces ensure package environment independence



::: {.notes}

`var()` is found in the stats namespace first, so that is the definition of `var()` that is used by `sd()`

If an object called by `sd()` wasn't found in the stats namespace, it would be searched for according to the search path

:::

## Functions use ephemeral execution environments

- Functions create a new environment to use whenever executed

- The execution environment is a child of the function environment

- Execution environments are garbage collected on function exit

## Functions use ephemeral execution environments



# 7.5 Call stacks

## The caller environment informs the call stack

- The caller environment is the environment from which the function was called

    - Accessed with `rlang::caller_env()`

- The call stack is created within the caller environment

:::: {.columns}

::: {.column}

```{r}
f <- function(x) {
  g(x = 2)
}
g <- function(x) {
  h(x = 3)
}
h <- function(x) {
  lobstr::cst()
}
```

:::

::: {.column}

```{r}
f(x = 1)
```

:::

::::

::: {.notes}

`traceback()`
is the base R approach

`lobstr::cst()` prints the call stack in order of call, the opposite of `traceback()`

Does `lobstr::cst()` now print the caller environment?

:::

## The caller environment informs the call stack

- The call stack is more complicated with lazy evaluation

```{r}
a <- function(x) b(x)
b <- function(x) d(x)
d <- function(x) x

a(f())
```

::: {.notes}

Do different branches represent different caller environments?

Note that `c()` was replaced with `d()` as it could not be rendered with `c()`

:::

## The caller environment informs the call stack



::: {.notes}

- Each frame contains:

    1. An expression

    2. An environment

    3. A parent

:::

## R uses lexical scoping, not dynamic scoping

> R uses lexical scoping: it looks up the values of names based on how a function is defined, not how it is called. “Lexical” here is not the English adjective that means relating to words or a vocabulary. It’s a technical CS term that tells us that the scoping rules use a parse-time, rather than a run-time structure. - [Chapter 6 - functions](https://adv-r.hadley.nz/functions.html)

- Dynamic scoping means functions use variables as they are defined in the calling environment

# 7.6 Data structures

## Environments are useful data structures

- Use cases include:

    1. Avoiding copies of large data

    2. Managing state within a package

    3. As a hashmap

::: {.notes}

Finding a function in a package uses constant time

:::
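
The hashmap use case can be sketched as follows. This is an illustrative example rather than code from the chapter: `word_counts` and `count_word()` are hypothetical names, and `{rlang}` is assumed to be installed. Because environments are modified in place, the function updates the caller's object without returning a copy.

```r
# An environment acting as a mutable hashmap from words to counts
word_counts <- rlang::env()

# Increment the count for `word`; `counts` is modified in place
count_word <- function(counts, word) {
  current <- rlang::env_get(counts, word, default = 0)
  rlang::env_poke(counts, word, current + 1)
  invisible(counts)
}

for (w in c("a", "b", "a")) {
  count_word(word_counts, w)
}

rlang::env_get(word_counts, "a")
rlang::env_get(word_counts, "b")
```

Here `"a"` ends up with a count of 2 and `"b"` with a count of 1, even though `count_word()` never returns the modified environment to the caller.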