bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

05.qmd (10371B)


---
engine: knitr
title: Control flow
---

## Learning objectives:

- Understand the two primary tools for control flow: **choices** and **loops**
- Learn best practices to avoid common pitfalls
- Distinguish when to use `if`, `ifelse()`, and `switch()` for choices
- Distinguish when to use `for`, `while`, and `repeat` for loops

::: {.callout-note}
Basic familiarity with choices and loops is assumed.
:::

# Choices

[Section 5.2 Choices]: #

## `if` is the basic statement for a **choice**

Single line format

```{r}
#| eval: false
if (condition) true_action
if (condition) true_action else false_action
```

Compound statement within `{`

```{r}
grade <- function(x) {
  if (x > 90) {
    "A"
  } else if (x > 80) {
    "B"
  } else if (x > 50) {
    "C"
  } else {
    "F"
  }
}
```
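
A quick check of how `grade()` behaves at its boundaries. The input scores are illustrative values chosen for this sketch; because every comparison uses `>`, a score that sits exactly on a threshold falls into the lower grade.

```r
grade <- function(x) {
  if (x > 90) {
    "A"
  } else if (x > 80) {
    "B"
  } else if (x > 50) {
    "C"
  } else {
    "F"
  }
}

grade(95)  # "A"
grade(90)  # "B": 90 is not strictly greater than 90
grade(50)  # "F": 50 is not strictly greater than 50
```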

## Results of `if` can be assigned

```{r}
x1 <- if (TRUE) 1 else 2
x2 <- if (FALSE) 1 else 2

c(x1, x2)
```

:::{.callout-tip}
Only recommended with single line format; otherwise hard to read.
:::

## `if` without `else` can be combined with `c()` or `paste()` to create compact expressions

- `if` without `else` invisibly returns `NULL` when the condition is `FALSE`.

```{r}
greet <- function(name, birthday = FALSE) {
  paste0(
    "Hi ", name,
    if (birthday) " and HAPPY BIRTHDAY"
  )
}
greet("Maria", FALSE)
greet("Jaime", TRUE)
```

[Section 5.2.1 Invalid Inputs]: #

## `if` should have a single `TRUE` or `FALSE` condition, other inputs generate errors

```{r}
#| error: true
if ("x") 1
if (logical()) 1
if (NA) 1
if (c(TRUE, FALSE)) 1
```

[Section 5.2.2 Vectorized If]: #

## Use `ifelse()` for vectorized conditions

```{r}
x <- 1:10
ifelse(x %% 5 == 0, "XXX", as.character(x))
ifelse(x %% 2 == 0, "even", "odd")
```

::: {.callout-tip}
Only use `ifelse()` if both results are of the same type; otherwise output type is hard to predict.
:::
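
A small sketch of why mixed types are hard to predict. The vector and the labels `"negative"` and `"small"` are made up for illustration; the point is that the result type depends on which branches the data actually reaches.

```r
x <- c(1, 2, 3)

# `yes` is numeric and `no` is character.
all_yes <- ifelse(x > 0, x, "negative")  # every element takes the `yes` branch
mixed   <- ifelse(x > 1, x, "small")     # both branches are used

typeof(all_yes)  # "double"
typeof(mixed)    # "character"
```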

## Use `dplyr::case_when()` for multiple condition-vector pairs

```{r}
dplyr::case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  is.na(x) ~ "???",
  TRUE ~ as.character(x)
)
```


[Section 5.2.3 switch()]: #

## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition="none-out"}

:::: {.columns}
::: {.column}
```{r}
x_option <- function(x) {
  if (x == "a") {
    "option 1"
  } else if (x == "b") {
    "option 2"
  } else if (x == "c") {
    "option 3"
  } else {
    stop("Invalid `x` value")
  }
}
```
:::

::: {.column}
```{r}
x_option <- function(x) {
  switch(x,
    a = "option 1",
    b = "option 2",
    c = "option 3",
    stop("Invalid `x` value")
  )
}
```
:::
::::

## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition="none-in"}

::: {.callout-tip}
- The last component of a `switch()` should always throw an error, otherwise unmatched inputs will invisibly return `NULL`.
- Only use `switch()` with character inputs. Numeric inputs are hard to read and have undesirable failure modes.
:::
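
A minimal sketch of the two failure modes named in the tip. `no_default()` is a hypothetical helper written only for this illustration.

```r
# No fallback component: an unmatched input silently yields NULL.
no_default <- function(x) {
  switch(x,
    a = "option 1",
    b = "option 2"
  )
}
unmatched <- no_default("z")
is.null(unmatched)  # TRUE

# A numeric input selects a component by position ...
by_position <- switch(2, "first", "second")
by_position  # "second"

# ... and an out-of-range position also yields NULL without any error.
out_of_range <- switch(3, "first", "second")
is.null(out_of_range)  # TRUE
```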

::: {.callout-caution}
Like `if`, `switch()` can only take a single condition, not vector conditions.
:::


## Avoid repeated outputs by leaving the right side of `=` empty

- Inputs will "fall through" to the next value.

```{r}
legs <- function(x) {
  switch(x,
    cow = ,
    horse = ,
    dog = 4,
    human = ,
    chicken = 2,
    plant = 0,
    stop("Unknown input")
  )
}
legs("cow")
legs("dog")
```

[Section 5.3 Loops]: #

# Loops

## To iterate over items in a vector, use a `for` **loop**

```{r}
#| eval: false
for (item in vector) perform_action
```

```{r}
for (i in 1:3) {
  print(i)
}
```

::: {.callout-note}
By convention, short variable names like `i`, `j`, or `k` are used for iterating over vector indices.
:::

## `for` will overwrite existing variables in the current environment

```{r}
i <- 100
for (i in 1:3) {}
i
```

## Use `next` or `break` to terminate loops early {transition="none-out"}
- `next` exits the current iteration, but continues the loop
- `break` exits the entire loop

## Use `next` or `break` to terminate loops early {transition="none-in"}
```{r}
for (i in 1:10) {
  if (i < 3)
    next

  print(i)

  if (i >= 5)
    break
}
```

[Section 5.3.1 Common pitfalls]: #

## Preallocate an output container to avoid slow loops

```{r}
means <- c(1, 50, 20)
out <- vector("list", length(means))
for (i in 1:length(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}
```

:::{.callout-tip}
The `vector()` function is helpful for preallocation.
:::
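
A brief sketch of what `vector()` creates for a few common modes, followed by a preallocated numeric loop. The names `n` and `squares` are illustrative.

```r
n <- 5

empty_list <- vector("list", n)       # list of n NULLs
empty_num  <- vector("numeric", n)    # n zeros
empty_chr  <- vector("character", n)  # n empty strings

# Fill a preallocated numeric vector instead of growing it with c().
squares <- vector("numeric", n)
for (i in seq_len(n)) {
  squares[[i]] <- i^2
}
squares  # 1 4 9 16 25
```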

## Use `seq_along(x)` instead of `1:length(x)` {transition="none-out"}
- `1:length(x)` causes unexpected failure for 0 length vectors
- `:` works with both increasing and decreasing sequences, so `1:0` counts down to `c(1, 0)` instead of being empty
```{r}
means <- c()
1:length(means)
seq_along(means)
```

## Use `seq_along(x)` instead of `1:length(x)` {transition="none-in"}
:::: {.columns}
::: {.column}
```{r}
#| error: true
out <- vector("list", length(means))
for (i in 1:length(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}
```
:::
::: {.column}
```{r}
out <- vector("list", length(means))
for (i in seq_along(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}
out
```
:::
::::

## Avoid problems when iterating over S3 vectors by using `seq_along(x)` and `x[[i]]`
::: {}
- `for` loops typically strip attributes
```{r}
xs <- as.Date(c("2020-01-01", "2010-01-01"))
```
:::
:::: {.columns}
::: {.column}
```{r}
for (x in xs) {
  print(x)
}
```
:::
::: {.column}
```{r}
for (i in seq_along(xs)) {
  print(xs[[i]])
}
```
:::
::::
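
The difference between the two loops above can be made explicit by recording the class seen inside each one. This is a small sketch; the containers `direct` and `indexed` and the counter `k` are names introduced here for illustration.

```r
xs <- as.Date(c("2020-01-01", "2010-01-01"))

# Iterating over the values directly: the Date class is stripped.
direct <- vector("character", length(xs))
k <- 0
for (x in xs) {
  k <- k + 1
  direct[[k]] <- class(x)
}

# Iterating over indices and extracting with [[ keeps the class.
indexed <- vector("character", length(xs))
for (i in seq_along(xs)) {
  indexed[[i]] <- class(xs[[i]])
}

direct   # "numeric" "numeric"
indexed  # "Date" "Date"
```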

[Section 5.3.2 Related tools]: #

## Use `while` or `repeat` when you don't know the number of iterations

- `while(condition) action`: performs `action` while `condition` is `TRUE`
- `repeat(action)`: repeats `action` forever (or until a `break`)
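
A minimal sketch of both forms, using a made-up task (halving a number until it drops below 1) where the number of iterations is not known up front. All variable names are illustrative.

```r
# while: the condition is checked before every iteration.
x <- 100
steps <- 0
while (x >= 1) {
  x <- x / 2
  steps <- steps + 1
}
steps  # 7

# repeat: no condition at all, so the body must call break itself.
y <- 100
n <- 0
repeat {
  if (y < 1) break
  y <- y / 2
  n <- n + 1
}
n  # 7
```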

## Always use the least-flexible loop option possible

- Use `for` before `while` or `repeat`
- In data analysis use `apply()` or `purrr::map()` before `for`
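
As a sketch of the second point, the preallocation loop from earlier can be written without an explicit `for`. Base R's `lapply()` and `vapply()` are used here as stand-ins for the apply family; that choice, and the seed, are assumptions made for this example.

```r
means <- c(1, 50, 20)

set.seed(1)  # illustrative seed so the draws are reproducible
# lapply() allocates the output list and handles the indexing itself.
out <- lapply(means, function(m) rnorm(10, mean = m))

# vapply() additionally checks the type and length of every result.
sizes <- vapply(out, length, integer(1))
sizes  # 10 10 10
```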

# Quiz & Exercises {visibility="uncounted"}

[Section 5.1 Quiz]: #

## What is the difference between `if` and `ifelse()`? {visibility="uncounted"}

::: {.fragment .fade-in}
`if` works with scalars; `ifelse()` works with vectors.
:::

## In the following code, what will the value of `y` be if `x` is `TRUE`? What if `x` is `FALSE`? What if `x` is `NA`? {visibility="uncounted"}


```{r}
#| eval: false
y <- if (x) 3
```

::: {.fragment .fade-up fragment-index=1}
When `x` is `TRUE`, `y` will be `3`; when `FALSE`, `y` will be `NULL`; when `NA` the `if` statement will throw an error.
:::

## What does `switch("x", x = , y = 2, z = 3)` return? {visibility="uncounted"}


```{r}
#| eval: false
switch(
  "x",
  x = ,
  y = 2,
  z = 3
)
```

::: {.fragment .fade-in}
This `switch()` statement makes use of fall-through so it will return `2`.
:::

[Section 5.2.4 Exercises]: #

## What type of vector does each of the following calls to `ifelse()` return? {visibility="uncounted" transition="none-out"}

Read the documentation and write down the rules in your own words.
```{r}
#| eval: false
ifelse(TRUE, 1, "no")
ifelse(FALSE, 1, "no")
ifelse(NA, 1, "no")
```

## What type of vector does each of the following calls to `ifelse()` return? {visibility="uncounted" transition="none"}

The arguments of `ifelse()` are named `test`, `yes` and `no`.
In general, `ifelse()` returns the entry for `yes` when `test` is `TRUE`,
the entry for `no` when `test` is `FALSE`
and `NA` when `test` is `NA`.

```{r}
ifelse(TRUE, 1, "no")
ifelse(FALSE, 1, "no")
ifelse(NA, 1, "no")
```

## What type of vector does each of the following calls to `ifelse()` return? {visibility="uncounted" transition="none-in"}
In practice, `test` is first converted to `logical` and if the result is neither `TRUE` nor `FALSE`, then `as.logical(test)` is returned.
```{r}
ifelse(logical(), 1, "no")
ifelse(NaN, 1, "no")
ifelse(NA_character_, 1, "no")
ifelse("a", 1, "no")
ifelse("true", 1, "no")
```

## Why does the following code work? {visibility="uncounted"}

```{r}
x <- 1:10
if (length(x)) "not empty" else "empty"
x <- numeric()
if (length(x)) "not empty" else "empty"
```
::: {.fragment .fade-up fragment-index=1}
`if()` expects a logical condition, but also accepts a length-one numeric vector where `0` is treated as `FALSE` and all other numbers are treated as `TRUE`.
Numerical missing values (including `NaN`) lead to an error in the same way that a logical missing, `NA`, does.
:::
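
The coercion rule can be checked directly. In this sketch the error case is wrapped in `tryCatch()` so the code runs to completion; the result names are illustrative.

```r
zero_case    <- if (0) "yes" else "no"     # 0 is treated as FALSE
nonzero_case <- if (-2.5) "yes" else "no"  # any other number is treated as TRUE

# A missing numeric value cannot be interpreted as TRUE or FALSE.
nan_case <- tryCatch(
  if (NaN) "yes" else "no",
  error = function(e) "error"
)

c(zero_case, nonzero_case, nan_case)  # "no" "yes" "error"
```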

[Section 5.3.3 Exercises]: #

## Why does this code succeed without errors or warnings? {visibility="uncounted" transition="none-out"}

```{r}
x <- numeric()
out <- vector("list", length(x))
for (i in 1:length(x)) {
  out[i] <- x[i] ^ 2
}
out
```

## Why does this code succeed without errors or warnings? {visibility="uncounted" transition="none-in"}
- Subsetting behavior for out-of-bounds & `0` indices when using `[<-` and `[`
- `x[1]` is out of bounds and returns `NA`. Assigning `NA^2` (still `NA`) to `out[1]` extends the zero-length list `out` to length 1.
- `x[0]` returns `numeric(0)`. `numeric(0)` is assigned to `out[0]`. Assigning a 0-length vector to a 0-length subset doesn't change the object.
- Each step includes valid R operations (even though the result may not be what the user intended).

## Walk-through {visibility="uncounted" transition="none-out"}

Setup
```{r}
x <- numeric()
out <- vector("list", length(x))
1:length(x)
```

## Walk-through {visibility="uncounted" transition="none"}

First Iteration
```{r}
x[1]
x[1]^2
out[1]
out[1] <- x[1]^2
out[1]
```

## Walk-through {visibility="uncounted" transition="none"}

Second Iteration
```{r}
x[0]
x[0]^2
out[0]
out[0] <- x[0]^2
out[0]
```

## Walk-through {visibility="uncounted" transition="none-in"}

Final Result
```{r}
out
```

## When the following code is evaluated, what can you say about the vector being iterated? {visibility="uncounted"}

```{r}
xs <- c(1, 2, 3)
for (x in xs) {
  xs <- c(xs, x * 2)
}
xs
```

::: {.fragment .fade-in}
In this loop `x` takes on the values of the initial `xs` (`1`, `2` and `3`), indicating that `xs` is evaluated just once at the beginning of the loop, not after each iteration. (Otherwise, we would run into an infinite loop.)
:::

## What does the following code tell you about when the index is updated? {visibility="uncounted"}

```{r}
for (i in 1:3) {
  i <- i * 2
  print(i)
}
```

::: {.fragment .fade-in}
In a `for` loop the index is updated at the beginning of each iteration. Therefore, reassigning the index symbol during one iteration doesn't affect the following iterations. (Again, we would otherwise run into an infinite loop.)
:::
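
A small sketch that records the value the loop hands to `i` before the body overwrites it. The container `seen` and the multiplier are illustrative.

```r
seen <- vector("numeric", 3)
for (i in 1:3) {
  seen[[i]] <- i  # value supplied by the loop at the start of the iteration
  i <- i * 100    # overwritten, but only until the next iteration begins
}

seen  # 1 2 3
i     # 300: the last reassignment survives after the loop ends
```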