bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

commit f0c408cc8875f0fb130b87978517540bdd5c6c4d
parent 3bc2c44fdc76c83e2b048649c4dafa36409f1661
Author: Jacob Schwan <blackbrass88@gmail.com>
Date:   Mon,  8 Sep 2025 05:21:32 -0500

Update Chapter 5 slides for Cohort 10 (#93)

* First draft chapter 5 for cohort 10

* Updated chapter 5 slides for cohort 10
Diffstat:
M _freeze/slides/05/execute-results/html.json | 13 ++++++++-----
D _freeze/slides/05/figure-html/unnamed-chunk-16-1.png | 0
D slides/05.Rmd | 420 -------------------------------------------------------------------------------
A slides/05.qmd | 479 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 487 insertions(+), 425 deletions(-)

diff --git a/_freeze/slides/05/execute-results/html.json b/_freeze/slides/05/execute-results/html.json @@ -1,15 +1,18 @@ { - "hash": "80a70ab33aa94f43d33430a9c9b397e1", + "hash": "c432b7bf08c41a9f48e329de63c095e1", "result": { - "engine": "knitr", - "markdown": "---\nengine: knitr\ntitle: Control flow\n---\n\n## Learning objectives:\n\n- Learn the **tools** for controlling flow of execution.\n\n- Learn some technical pitfalls and (perhaps lesser known) useful features.\n\n\n::: {.cell layout-align=\"left\"}\n::: {.cell-output-display}\n![](images/whatif2.png){fig-align='left' width=518}\n:::\n:::\n\n\n::: {.cell layout-align=\"right\"}\n::: {.cell-output-display}\n![](images/forloop.png){fig-align='right' width=520}\n:::\n:::\n\n\n---\n\n## Introduction\n\nThere are two main groups of flow control tools: **choices** and **loops**: \n\n- Choices (`if`, `switch`, `ifelse`, `dplyr::if_else`, `dplyr::case_when`) allow you to run different code depending on the input. \n \n- Loops (`for`, `while`, `repeat`) allow you to repeatedly run code \n\n\n---\n\n\n## Choices\n\n\n\n`if()` and `else`\n\nUse `if` to specify a block of code to be executed, if a specified condition is true. Use `else` to specify a block of code to be executed, if the same condition is false. 
\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (condition) true_action\nif (condition) true_action else false_action\n```\n:::\n\n\n(Note braces are only *needed* for compound expressions)\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (test_expression) { \n true_action\n} else {\n false_action\n}\n```\n:::\n\n\n\nCan be expanded to more alternatives:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (test_expression) { \n true_action\n} else if (other_test_expression) {\n other_action\n} else {\n false_action\n}\n```\n:::\n\n\n\n## Exercise {-}\nWhy does this work?\n```\nx <- 1:10\nif (length(x)) \"not empty\" else \"empty\"\n#> [1] \"not empty\"\n\nx <- numeric()\nif (length(x)) \"not empty\" else \"empty\"\n#> [1] \"empty\"\n```\n\n`if` returns a value which can be assigned\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx1 <- if (TRUE) 1 else 2\nx2 <- if (FALSE) 1 else 2\n\nc(x1, x2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 1 2\n```\n\n\n:::\n:::\n\n\nThe book recommends assigning the results of an if statement only when the entire expression fits on one line; otherwise it tends to be hard to read.\n\n\n## Single if without else {-}\n\nWhen you use the single argument form without an else statement, if invisibly (Section 6.7.2) returns NULL if the condition is FALSE. 
Since functions like c() and paste() drop NULL inputs, this allows for a compact expression of certain idioms:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ngreet <- function(name, birthday = FALSE) {\n paste0(\n \"Hi \", name,\n if (birthday) \" and HAPPY BIRTHDAY\"\n )\n}\ngreet(\"Maria\", FALSE)\n#> [1] \"Hi Maria\"\ngreet(\"Jaime\", TRUE)\n#> [1] \"Hi Jaime and HAPPY BIRTHDAY\"\n```\n:::\n\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nformat_lane_text <- function(number){\n\n paste0(\n number,\n \" lane\",\n if (number > 1) \"s\",\n \" of sequencing\"\n )\n}\n\nformat_lane_text(1)\n#> [1] \"1 lane of sequencing\"\nformat_lane_text(4)\n#> [1] \"4 lanes of sequencing\"\n```\n:::\n\n\n\n\n\n## Invalid inputs {-}\n\n- *Condition* must evaluate to a *single* `TRUE` or `FALSE`\n\nA single number gets coerced to a logical type. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (56) 1\n#> [1] 1\nif (0.3) 1\n#> [1] 1\nif (0) 1\n```\n:::\n\n\nIf the condition cannot evaluate to a *single* `TRUE` or `FALSE`, an error is (usually) produced.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (\"text\") 1\n#> Error in if (\"text\") 1: argument is not interpretable as logical\nif (\"true\") 1 \n#> 1\nif (numeric()) 1\n#> Error in if (numeric()) 1: argument is of length zero\nif (NULL) 1\n#> Error in if (NULL) 1 : argument is of length zero\nif (NA) 1\n#> Error in if (NA) 1: missing value where TRUE/FALSE needed\n```\n:::\n\n\n\nException is a logical vector of length greater than 1, which only generates a warning, unless you have `_R_CHECK_LENGTH_1_CONDITION_` set to `TRUE`. 
\nThis seems to have been the default since R-4.2.0\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (c(TRUE, FALSE)) 1\n#>Error in if (c(TRUE, FALSE)) 1 : the condition has length > 1\n```\n:::\n\n\n## Vectorized choices {-}\n\n- `ifelse()` is a vectorized version of `if`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- 1:10\nifelse(x %% 5 == 0, \"XXX\", as.character(x))\n#> [1] \"1\" \"2\" \"3\" \"4\" \"XXX\" \"6\" \"7\" \"8\" \"9\" \"XXX\"\n\nifelse(x %% 2 == 0, \"even\", \"odd\")\n#> [1] \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\"\n```\n:::\n\n\n- `dplyr::if_else()`\n\n- Book recommends only using `ifelse()` \"only when the yes and no vectors are the same type as it is otherwise hard to predict the output type.\" \n\n- `dplyr::if_else()` enforces this recommendation.\n\n**For example:**\n\n\n::: {.cell}\n\n```{.r .cell-code}\nifelse(c(TRUE,TRUE,FALSE),\"a\",3)\n#> [1] \"a\" \"a\" \"3\"\ndplyr::if_else(c(TRUE,TRUE,FALSE),\"a\",3)\n#> Error in `dplyr::if_else()`:\n#> ! 
`false` must be a character vector, not a double vector.\n```\n:::\n\n \n## Switch {-}\n\nRather then string together multiple if - else if chains, you can often use `switch`.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncentre <- function(x, type) {\n switch(type,\n mean = mean(x),\n median = median(x),\n trimmed = mean(x, trim = .1),\n stop(\"Invalid `type` value\")\n )\n}\n```\n:::\n\n\nLast component should always throw an error, as unmatched inputs would otherwise invisibly return NULL.\nBook recommends to only use character inputs for `switch()`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nvec <- c(1:20,50:55)\ncentre(vec, \"mean\")\n#> [1] 20.19231\ncentre(vec, \"median\")\n#> [1] 13.5\ncentre(vec, \"trimmed\")\n#> [1] 18.77273\n```\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nset.seed(123)\nx <- rlnorm(100)\n\ncenters <- data.frame(type = c('mean', 'median', 'trimmed'))\ncenters$value = sapply(centers$type, \\(t){centre(x,t)})\n\nrequire(ggplot2)\nggplot(data = data.frame(x), aes(x))+\n geom_density()+\n geom_vline(data = centers, \n mapping = aes(color = type, xintercept = value), \n linewidth=0.5,linetype=\"dashed\") +\n xlim(-1,10)+\n theme_bw()\n```\n\n::: {.cell-output-display}\n![](05_files/figure-html/unnamed-chunk-16-1.png){width=672}\n:::\n:::\n\n\n\nExample from book of \"falling through\" to next value\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlegs <- function(x) {\n switch(x,\n cow = ,\n horse = ,\n dog = 4,\n human = ,\n chicken = 2,\n plant = 0,\n stop(\"Unknown input\")\n )\n}\nlegs(\"cow\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 4\n```\n\n\n:::\n\n```{.r .cell-code}\n#> [1] 4\nlegs(\"dog\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 4\n```\n\n\n:::\n\n```{.r .cell-code}\n#> [1] 4\n```\n:::\n\n\n\n\n\n## Using `dplyr::case_when` {-}\n\n- `case_when` is a more general `if_else` and can be used often in place of multiple chained `if_else` or sapply'ing `switch`.\n\n- It uses a special syntax to allow any 
number of condition-vector pairs:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nset.seed(123)\nx <- rlnorm(100)\n\ncenters <- data.frame(type = c('mean', 'median', 'trimmed'))\n\ncenters$value = dplyr::case_when(\n centers$type == 'mean' ~ mean(x),\n centers$type == 'median' ~ median(x),\n centers$type == 'trimmed' ~ mean(x, trim = 0.1),\n .default = 1000\n )\n\ncenters\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> type value\n#> 1 mean 1.652545\n#> 2 median 1.063744\n#> 3 trimmed 1.300568\n```\n\n\n:::\n:::\n\n\n \n\n## Loops\n\n- Iteration over a elements of a vector\n\n`for (item in vector) perform_action`\n\n**First example**\n\n::: {.cell}\n\n```{.r .cell-code}\nfor(i in 1:5) {\n print(1:i)\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 1\n#> [1] 1 2\n#> [1] 1 2 3\n#> [1] 1 2 3 4\n#> [1] 1 2 3 4 5\n```\n\n\n:::\n\n```{.r .cell-code}\nx <- numeric(length=5L)\ndf <- data.frame(x=1:5)\n\nfor(i in 1:5) {\n df$y[[i]] <- i+1\n}\n```\n:::\n\n\n\n**Second example**: terminate a *for loop* earlier\n\n- `next` skips rest of current iteration\n- `break` exits the loop entirely\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:10) {\n if (i < 3) \n next\n\n print(i)\n \n if (i >= 5)\n break\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 3\n#> [1] 4\n#> [1] 5\n```\n\n\n:::\n:::\n\n\n## Exercise {-}\n\nWhen the following code is evaluated, what can you say about the vector being iterated?\n```\nxs <- c(1, 2, 3)\nfor (x in xs) {\n xs <- c(xs, x * 2)\n}\nxs\n#> [1] 1 2 3 2 4 6\n```\n\n## Pitfalls {-}\n\n- Preallocate output containers to avoid *slow* code. \n\n- Beware that `1:length(v)` when `v` has length 0 results in a iterating backwards over `1:0`, probably not what is intended. Use `seq_along(v)` instead.\n\n- When iterating over S3 vectors, use `[[]]` yourself to avoid stripping attributes. \n\n```\nxs <- as.Date(c(\"2020-01-01\", \"2010-01-01\"))\nfor (x in xs) {\n print(x)\n}\n#> [1] 18262\n#> [1] 14610\n```\nvs. 
\n```\nfor (i in seq_along(xs)) {\n print(xs[[i]])\n}\n#> [1] \"2020-01-01\"\n#> [1] \"2010-01-01\"\n```\n\n## Related tools {-}\n\n- `while(condition) action`: performs action while condition is TRUE.\n- `repeat(action)`: repeats action forever (i.e. until it encounters break).\n\n- Note that `for` can be rewritten as `while` and while can be rewritten as `repeat` (this goes in one direction only!); *however*:\n\n>Good practice is to use the least-flexible solution to a problem, so you should use `for` wherever possible.\nBUT you shouldn't even use for loops for data analysis tasks as `map()` and `apply()` already provide *less flexible* solutions to most problems. (More in Chapter 9.)\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:5) {\n print(i)\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 1\n#> [1] 2\n#> [1] 3\n#> [1] 4\n#> [1] 5\n```\n\n\n:::\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx_option <- function(x) {\n switch(x,\n a = \"option 1\",\n b = \"option 2\",\n c = \"option 3\"#,\n #stop(\"Invalid `x` value\")\n )\n}\n```\n:::\n\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ni <- 1\n\nwhile(i <=5 ) {\n print(i)\n i <- i+1\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 1\n#> [1] 2\n#> [1] 3\n#> [1] 4\n#> [1] 5\n```\n\n\n:::\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ni <- 1\n\nrepeat {\n print(i)\n i <- i+1\n if (i > 5) break\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 1\n#> [1] 2\n#> [1] 3\n#> [1] 4\n#> [1] 5\n```\n\n\n:::\n:::\n\n", + "markdown": "---\nengine: knitr\ntitle: Control flow\n---\n\n\n## Learning objectives:\n\n- Understand the two primary tools for control flow: **choices** and **loops**\n- Learn best practices to void common pitfalls\n- Distinguish when to use `if`, `ifelse()`, and `switch()` for choices\n- Distinguish when to use `for`, `while`, and `repeat` for loops\n\n::: {.callout-note}\nBasic familiarity with choices and loops is assumed.\n:::\n\n# Choices\n\n[Section 
5.2 Choices]: #\n\n## `if` is the basic statement for a **choice** \n\nSingle line format\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (condition) true_action\nif (condition) true_action else false_action\n```\n:::\n\n\nCompound statement within `{`\n\n\n::: {.cell}\n\n```{.r .cell-code}\ngrade <- function(x) {\n if (x > 90) {\n \"A\"\n } else if (x > 80) {\n \"B\"\n } else if (x > 50) {\n \"C\"\n } else {\n \"F\"\n }\n}\n```\n:::\n\n\n## Results of `if` can be assigned\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx1 <- if (TRUE) 1 else 2\nx2 <- if (FALSE) 1 else 2\n\nc(x1, x2)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1 2\n```\n:::\n:::\n\n\n:::{.callout-tip}\nOnly recommended with single line format; otherwise hard to read.\n:::\n\n## `if` without `else` can be combined with `c()` or `paste()` to create compact expressions\n\n - `if` without `else` invisibly returns `NULL` when `FALSE`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ngreet <- function(name, birthday = FALSE) {\n paste0(\n \"Hi \", name,\n if (birthday) \" and HAPPY BIRTHDAY\"\n )\n}\ngreet(\"Maria\", FALSE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"Hi Maria\"\n```\n:::\n\n```{.r .cell-code}\ngreet(\"Jaime\", TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"Hi Jaime and HAPPY BIRTHDAY\"\n```\n:::\n:::\n\n\n[Section 5.2.1 Invalid Inputs]: #\n\n## `if` should have a single `TRUE` or `FALSE` condition, other inputs generate errors\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (\"x\") 1\n```\n\n::: {.cell-output .cell-output-error}\n```\n#> Error in if (\"x\") 1: argument is not interpretable as logical\n```\n:::\n\n```{.r .cell-code}\nif (logical()) 1\n```\n\n::: {.cell-output .cell-output-error}\n```\n#> Error in if (logical()) 1: argument is of length zero\n```\n:::\n\n```{.r .cell-code}\nif (NA) 1\n```\n\n::: {.cell-output .cell-output-error}\n```\n#> Error in if (NA) 1: missing value where TRUE/FALSE needed\n```\n:::\n\n```{.r .cell-code}\nif (c(TRUE, 
FALSE)) 1\n```\n\n::: {.cell-output .cell-output-error}\n```\n#> Error in if (c(TRUE, FALSE)) 1: the condition has length > 1\n```\n:::\n:::\n\n\n[Section 5.2.2 Vectorized If]: #\n\n## Use `ifelse()` for vectorized conditions\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- 1:10\nifelse(x %% 5 == 0, \"XXX\", as.character(x))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"1\" \"2\" \"3\" \"4\" \"XXX\" \"6\" \"7\" \"8\" \"9\" \"XXX\"\n```\n:::\n\n```{.r .cell-code}\nifelse(x %% 2 == 0, \"even\", \"odd\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\" \"odd\" \"even\"\n```\n:::\n:::\n\n\n::: {.callout-tip}\nOnly use `ifelse()` if both results are of the same type; otherwise output type is hard to predict.\n:::\n\n## Use `dplyr::case_when()` for multiple condition-vector pairs\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndplyr::case_when(\n x %% 35 == 0 ~ \"fizz buzz\",\n x %% 5 == 0 ~ \"fizz\",\n x %% 7 == 0 ~ \"buzz\",\n is.na(x) ~ \"???\",\n TRUE ~ as.character(x)\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"1\" \"2\" \"3\" \"4\" \"fizz\" \"6\" \"buzz\" \"8\" \"9\" \"fizz\"\n```\n:::\n:::\n\n\n\n[Section 5.2.3 switch()]: #\n\n## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition=\"none-out\"}\n\n:::: {.columns}\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nx_option <- function(x) {\n if (x == \"a\") {\n \"option 1\"\n } else if (x == \"b\") {\n \"option 2\" \n } else if (x == \"c\") {\n \"option 3\"\n } else {\n stop(\"Invalid `x` value\")\n }\n}\n```\n:::\n\n:::\n\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nx_option <- function(x) {\n switch(x,\n a = \"option 1\",\n b = \"option 2\",\n c = \"option 3\",\n stop(\"Invalid `x` value\")\n )\n}\n```\n:::\n\n:::\n::::\n\n## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition=\"none-in\"}\n\n::: {.callout-tip}\n - 
The last component of a `switch()` should always throw an error, otherwise unmatched inputs will invisibly return `NULL`.\n - Only use `switch()` with character inputs. Numeric inputs are hard to read and have undesirable failure modes.\n:::\n\n::: {.callout-caution}\nLike `if`, `switch()` can only take a single condition, not vector conditions\n:::\n\n\n## Avoid repeat outputs by leaving the right side of `=` empty\n\n- Inputs will \"fall through\" to the next value.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlegs <- function(x) {\n switch(x,\n cow = ,\n horse = ,\n dog = 4,\n human = ,\n chicken = 2,\n plant = 0,\n stop(\"Unknown input\")\n )\n}\nlegs(\"cow\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 4\n```\n:::\n\n```{.r .cell-code}\nlegs(\"dog\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 4\n```\n:::\n:::\n\n\n[Section 5.3 Loops]: #\n\n# Loops\n\n## To iterate over items in a vector, use a `for` **loop**\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (item in vector) perform_action\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:3) {\n print(i)\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1\n#> [1] 2\n#> [1] 3\n```\n:::\n:::\n\n\n::: {.callout-note}\nConvention uses short variables like `i`, `j`, or `k` for iterating vector indices\n:::\n\n## `for` will overwrite existing variables in the current environment\n\n\n::: {.cell}\n\n```{.r .cell-code}\ni <- 100\nfor (i in 1:3) {}\ni\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 3\n```\n:::\n:::\n\n\n## Use `next` or `break` to terminate loops early {transition=\"none-out\"}\n- `next` exits the current iteration, but continues the loop\n- `break` exits the entire loop\n\n## Use `next` or `break` to terminate loops early {transition=\"none-in\"}\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:10) {\n if (i < 3) \n next\n\n print(i)\n \n if (i >= 5)\n break\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 3\n#> [1] 4\n#> [1] 
5\n```\n:::\n:::\n\n\n[Section 5.3.1 Common pitfalls]: #\n\n## Preallocate an output container to avoid slow loops\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmeans <- c(1, 50, 20)\nout <- vector(\"list\", length(means))\nfor (i in 1:length(means)) {\n out[[i]] <- rnorm(10, means[[i]])\n}\n```\n:::\n\n\n:::{.callout-tip}\n`vector()` function is helpful for preallocation\n:::\n\n## Use `seq_along(x)` instead of `1:length(x)` {transition=\"none-out\"}\n- `1:length(x)` causes unexpected failure for 0 length vectors\n- `:` works with both increasing and decreasing sequences\n\n::: {.cell}\n\n```{.r .cell-code}\nmeans <- c()\n1:length(means)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1 0\n```\n:::\n\n```{.r .cell-code}\nseq_along(means)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> integer(0)\n```\n:::\n:::\n\n\n## Use `seq_along(x)` instead of `1:length(x)` {transition=\"none-in\"}\n:::: {.columns}\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nout <- vector(\"list\", length(means))\nfor (i in 1:length(means)) {\n out[[i]] <- rnorm(10, means[[i]])\n}\n```\n\n::: {.cell-output .cell-output-error}\n```\n#> Error in rnorm(10, means[[i]]): invalid arguments\n```\n:::\n:::\n\n:::\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nout <- vector(\"list\", length(means))\nfor (i in seq_along(means)) {\n out[[i]] <- rnorm(10, means[[i]])\n}\nout\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> list()\n```\n:::\n:::\n\n:::\n::::\n\n## Avoid problems when iterating over S3 vectors by using `seq_along(x)` and `x[[i]]`\n::: {}\n- loops typically strip attributes\n\n::: {.cell}\n\n```{.r .cell-code}\nxs <- as.Date(c(\"2020-01-01\", \"2010-01-01\"))\n```\n:::\n\n:::\n:::: {.columns}\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (x in xs) {\n print(x)\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 18262\n#> [1] 14610\n```\n:::\n:::\n\n:::\n::: {.column}\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in seq_along(xs)) {\n 
print(xs[[i]])\n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"2020-01-01\"\n#> [1] \"2010-01-01\"\n```\n:::\n:::\n\n:::\n::::\n \n[Section 5.3.2 Related tools]: #\n \n## Use `while` or `repeat` when you don't know the number of iterations\n\n- `while(condition) action`: perfoms `action` while `condition` is `TRUE`\n- `repeat(action)`: repeats `action` forever (or until a `break`) \n \n## Always use the least-flexible loop option possible\n\n- Use `for` before `while` or `repeat`\n- In data analysis use `apply()` or `purrr::map()` before `for`\n\n# Quiz & Exercises {visibility=\"uncounted\"}\n\n[Section 5.1 Quiz]: #\n\n## What is the difference between if and ifelse()? {visibility=\"uncounted\"}\n\n::: {.fragment .fade-in}\n`if` works with scalars; `ifelse()` works with vectors.\n:::\n\n## In the following code, what will the value of `y` be if `x` is `TRUE`? What if `x` is `FALSE`? What if `x` is `NA`? {visibility=\"uncounted\"}\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ny <- if (x) 3\n```\n:::\n\n\n::: {.fragment .fade-up fragment-index=1}\nWhen `x` is `TRUE`, `y` will be `3`; when `FALSE`, `y` will be `NULL`; when `NA` the `if` statement will throw an error.\n:::\n\n## What does `switch(\"x\", x = , y = 2, z = 3)` return? {visibility=\"uncounted\"}\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nswitch(\n \"x\",\n x = ,\n y = 2,\n x = 3\n)\n```\n:::\n\n\n::: {.fragment .fade-in}\nThis `switch()` statement makes use of fall-through so it will return `2`.\n:::\n\n[Section 5.2.4 Exercises]: #\n\n## What type of vector does each of the following calls to ifelse() return? {visibility=\"uncounted\" transition=\"none-out\"}\n\nRead the documentation and write down the rules in your own words.\n\n::: {.cell}\n\n```{.r .cell-code}\nifelse(TRUE, 1, \"no\")\nifelse(FALSE, 1, \"no\")\nifelse(NA, 1, \"no\")\n```\n:::\n\n\n## What type of vector does each of the following calls to ifelse() return? 
{visibility=\"uncounted\" transition=\"none\"}\n\nThe arguments of `ifelse()` are named `test`, `yes` and `no`.\nIn general, `ifelse()` returns the entry for `yes` when `test` is `TRUE`,\nthe entry for `no` when `test` is `FALSE` \nand `NA` when `test` is `NA`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nifelse(TRUE, 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1\n```\n:::\n\n```{.r .cell-code}\nifelse(FALSE, 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"no\"\n```\n:::\n\n```{.r .cell-code}\nifelse(NA, 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n:::\n\n\n## What type of vector does each of the following calls to ifelse() return? {visibility=\"uncounted\" transition=\"none-in\"}\nIn practice, `test` is first converted to `logical` and if the result is neither `TRUE` nor `FALSE`, then `as.logical(test)` is returned.\n\n::: {.cell}\n\n```{.r .cell-code}\nifelse(logical(), 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> logical(0)\n```\n:::\n\n```{.r .cell-code}\nifelse(NaN, 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n\n```{.r .cell-code}\nifelse(NA_character_, 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n\n```{.r .cell-code}\nifelse(\"a\", 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n\n```{.r .cell-code}\nifelse(\"true\", 1, \"no\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1\n```\n:::\n:::\n\n\n## Why does the following code work? 
{visibility=\"uncounted\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- 1:10\nif (length(x)) \"not empty\" else \"empty\"\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"not empty\"\n```\n:::\n\n```{.r .cell-code}\nx <- numeric()\nif (length(x)) \"not empty\" else \"empty\"\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] \"empty\"\n```\n:::\n:::\n\n::: {.fragment .fade-up fragment-index=1}\n`if()` expects a logical condition, but also accepts a numeric vector where `0` is treated as `FALSE` and all other numbers are treated as `TRUE`.\nNumerical missing values (including `NaN`) lead to an error in the same way that a logical missing, `NA`, does.\n:::\n\n[Section 5.3.3 Exercises]: #\n\n## Why does this code succeed without errors or warnings? {visibility=\"uncounted\" transition=\"none-out\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- numeric()\nout <- vector(\"list\", length(x))\nfor (i in 1:length(x)) {\n out[i] <- x[i] ^ 2\n}\nout\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [[1]]\n#> [1] NA\n```\n:::\n:::\n\n\n## Why does this code succeed without errors or warnings? {visibility=\"uncounted\" transition=\"none-in\"}\n- Subsetting behavior for out-of-bounds & `0` indices when using `[<-` and `[`\n- `x[1]` generates an `NA`. `NA` is assigned to the empty length-1 list `out[1]`\n- `x[0]` returns `numeric(0)`. `numeric(0)` is assigned to `out[0]`. 
Assigning a 0-length vector to a 0-length subset doesn't change the object.\n- Each step includes valid R operations (even though the result may not be what the user intended).\n\n## Walk-through {visibility=\"uncounted\" transition=\"none-out\"}\n\nSetup\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- numeric()\nout <- vector(\"list\", length(x))\n1:length(x)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1 0\n```\n:::\n:::\n\n\n## Walk-through {visibility=\"uncounted\" transition=\"none\"}\n\nFirst Iteration\n\n::: {.cell}\n\n```{.r .cell-code}\nx[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n\n```{.r .cell-code}\nx[1]^2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] NA\n```\n:::\n\n```{.r .cell-code}\nout[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [[1]]\n#> NULL\n```\n:::\n\n```{.r .cell-code}\nout[1] <- x[1]^2\nout[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [[1]]\n#> [1] NA\n```\n:::\n:::\n\n\n## Walk-through {visibility=\"uncounted\" transition=\"none\"}\n\nSecond Iteration\n\n::: {.cell}\n\n```{.r .cell-code}\nx[0]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> numeric(0)\n```\n:::\n\n```{.r .cell-code}\nx[0]^2\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> numeric(0)\n```\n:::\n\n```{.r .cell-code}\nout[0]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> list()\n```\n:::\n\n```{.r .cell-code}\nout[0] <- x[0]^2\nout[0]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> list()\n```\n:::\n:::\n\n\n## Walk-through {visibility=\"uncounted\" transition=\"none-in\"}\n\nFinal Result\n\n::: {.cell}\n\n```{.r .cell-code}\nout\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [[1]]\n#> [1] NA\n```\n:::\n:::\n\n\n## When the following code is evaluated, what can you say about the vector being iterated? 
{visibility=\"uncounted\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nxs <- c(1, 2, 3)\nfor (x in xs) {\n xs <- c(xs, x * 2)\n}\nxs\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 1 2 3 2 4 6\n```\n:::\n:::\n\n\n::: {.fragment .fade-in}\nIn this loop `x` takes on the values of the initial `xs` (`1`, `2` and `3`), indicating that it is evaluated just once in the beginning of the loop, not after each iteration. (Otherwise, we would run into an infinite loop.)\n:::\n\n## What does the following code tell you about when the index is updated? {visibility=\"uncounted\"}\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:3) {\n i <- i * 2\n print(i) \n}\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n#> [1] 2\n#> [1] 4\n#> [1] 6\n```\n:::\n:::\n\n\n::: {.fragment .fade-in}\nIn a `for` loop the index is updated in the beginning of each iteration. Therefore, reassigning the index symbol during one iteration doesn't affect the following iterations. (Again, we would otherwise run into an infinite loop.)\n:::\n", "supporting": [ - "05_files" + "05_files/figure-revealjs" ], "filters": [ "rmarkdown/pagebreak.lua" ], - "includes": {}, + "includes": { + "include-after-body": [ + "\n<script>\n // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n // slide changes (different for each slide format).\n (function () {\n // dispatch for htmlwidgets\n function fireSlideEnter() {\n const event = window.document.createEvent(\"Event\");\n event.initEvent(\"slideenter\", true, true);\n window.document.dispatchEvent(event);\n }\n\n function fireSlideChanged(previousSlide, currentSlide) {\n fireSlideEnter();\n\n // dispatch for shiny\n if (window.jQuery) {\n if (previousSlide) {\n window.jQuery(previousSlide).trigger(\"hidden\");\n }\n if (currentSlide) {\n window.jQuery(currentSlide).trigger(\"shown\");\n }\n }\n }\n\n // hookup for slidy\n if (window.w3c_slidy) {\n 
window.w3c_slidy.add_observer(function (slide_num) {\n // slide_num starts at position 1\n fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n });\n }\n\n })();\n</script>\n\n" + ] + }, "engineDependencies": {}, "preserve": {}, "postProcess": true diff --git a/_freeze/slides/05/figure-html/unnamed-chunk-16-1.png b/_freeze/slides/05/figure-html/unnamed-chunk-16-1.png Binary files differ. diff --git a/slides/05.Rmd b/slides/05.Rmd @@ -1,420 +0,0 @@ ---- -engine: knitr -title: Control flow ---- - -## Learning objectives: - -- Learn the **tools** for controlling flow of execution. - -- Learn some technical pitfalls and (perhaps lesser known) useful features. - -```{r echo = FALSE, fig.align = 'left', fig.dim = '100%'} -knitr::include_graphics("images/whatif2.png") -``` -```{r echo = FALSE, fig.align = 'right', fig.dim = '100%'} -knitr::include_graphics("images/forloop.png") -``` - ---- - -## Introduction - -There are two main groups of flow control tools: **choices** and **loops**: - -- Choices (`if`, `switch`, `ifelse`, `dplyr::if_else`, `dplyr::case_when`) allow you to run different code depending on the input. - -- Loops (`for`, `while`, `repeat`) allow you to repeatedly run code - - ---- - - -## Choices - - - -`if()` and `else` - -Use `if` to specify a block of code to be executed, if a specified condition is true. Use `else` to specify a block of code to be executed, if the same condition is false. - -```{r, eval=FALSE} -if (condition) true_action -if (condition) true_action else false_action -``` - -(Note braces are only *needed* for compound expressions) - -```{r eval=FALSE, include=T} -if (test_expression) { - true_action -} else { - false_action -} -``` - - -Can be expanded to more alternatives: - -```{r, eval=FALSE} -if (test_expression) { - true_action -} else if (other_test_expression) { - other_action -} else { - false_action -} -``` - - -## Exercise {-} -Why does this work? 
-``` -x <- 1:10 -if (length(x)) "not empty" else "empty" -#> [1] "not empty" - -x <- numeric() -if (length(x)) "not empty" else "empty" -#> [1] "empty" -``` - -`if` returns a value which can be assigned - -```{r} -x1 <- if (TRUE) 1 else 2 -x2 <- if (FALSE) 1 else 2 - -c(x1, x2) -``` - -The book recommends assigning the results of an if statement only when the entire expression fits on one line; otherwise it tends to be hard to read. - - -## Single if without else {-} - -When you use the single argument form without an else statement, if invisibly (Section 6.7.2) returns NULL if the condition is FALSE. Since functions like c() and paste() drop NULL inputs, this allows for a compact expression of certain idioms: - -```{r, eval=FALSE} -greet <- function(name, birthday = FALSE) { - paste0( - "Hi ", name, - if (birthday) " and HAPPY BIRTHDAY" - ) -} -greet("Maria", FALSE) -#> [1] "Hi Maria" -greet("Jaime", TRUE) -#> [1] "Hi Jaime and HAPPY BIRTHDAY" -``` - - - -```{r, eval=FALSE} -format_lane_text <- function(number){ - - paste0( - number, - " lane", - if (number > 1) "s", - " of sequencing" - ) -} - -format_lane_text(1) -#> [1] "1 lane of sequencing" -format_lane_text(4) -#> [1] "4 lanes of sequencing" -``` - - - - -## Invalid inputs {-} - -- *Condition* must evaluate to a *single* `TRUE` or `FALSE` - -A single number gets coerced to a logical type. - -```{r, eval=FALSE} -if (56) 1 -#> [1] 1 -if (0.3) 1 -#> [1] 1 -if (0) 1 -``` - -If the condition cannot evaluate to a *single* `TRUE` or `FALSE`, an error is (usually) produced. 
- -```{r, eval=FALSE} -if ("text") 1 -#> Error in if ("text") 1: argument is not interpretable as logical -if ("true") 1 -#> 1 -if (numeric()) 1 -#> Error in if (numeric()) 1: argument is of length zero -if (NULL) 1 -#> Error in if (NULL) 1 : argument is of length zero -if (NA) 1 -#> Error in if (NA) 1: missing value where TRUE/FALSE needed -``` - - -Exception is a logical vector of length greater than 1, which only generates a warning, unless you have `_R_CHECK_LENGTH_1_CONDITION_` set to `TRUE`. -This seems to have been the default since R-4.2.0 - -```{r, eval=FALSE} -if (c(TRUE, FALSE)) 1 -#>Error in if (c(TRUE, FALSE)) 1 : the condition has length > 1 -``` - -## Vectorized choices {-} - -- `ifelse()` is a vectorized version of `if`: - -```{r, eval=FALSE} -x <- 1:10 -ifelse(x %% 5 == 0, "XXX", as.character(x)) -#> [1] "1" "2" "3" "4" "XXX" "6" "7" "8" "9" "XXX" - -ifelse(x %% 2 == 0, "even", "odd") -#> [1] "odd" "even" "odd" "even" "odd" "even" "odd" "even" "odd" "even" -``` - -- `dplyr::if_else()` - -- Book recommends only using `ifelse()` "only when the yes and no vectors are the same type as it is otherwise hard to predict the output type." - -- `dplyr::if_else()` enforces this recommendation. - -**For example:** - -```{r eval=FALSE, include=T} -ifelse(c(TRUE,TRUE,FALSE),"a",3) -#> [1] "a" "a" "3" -dplyr::if_else(c(TRUE,TRUE,FALSE),"a",3) -#> Error in `dplyr::if_else()`: -#> ! `false` must be a character vector, not a double vector. -``` - -## Switch {-} - -Rather then string together multiple if - else if chains, you can often use `switch`. - - -```{r message=FALSE, warning=FALSE} -centre <- function(x, type) { - switch(type, - mean = mean(x), - median = median(x), - trimmed = mean(x, trim = .1), - stop("Invalid `type` value") - ) -} -``` - -Last component should always throw an error, as unmatched inputs would otherwise invisibly return NULL. -Book recommends to only use character inputs for `switch()`. 
- -```{r, eval=FALSE} -vec <- c(1:20,50:55) -centre(vec, "mean") -#> [1] 20.19231 -centre(vec, "median") -#> [1] 13.5 -centre(vec, "trimmed") -#> [1] 18.77273 -``` - -```{r, message=FALSE} -set.seed(123) -x <- rlnorm(100) - -centers <- data.frame(type = c('mean', 'median', 'trimmed')) -centers$value = sapply(centers$type, \(t){centre(x,t)}) - -require(ggplot2) -ggplot(data = data.frame(x), aes(x))+ - geom_density()+ - geom_vline(data = centers, - mapping = aes(color = type, xintercept = value), - linewidth=0.5,linetype="dashed") + - xlim(-1,10)+ - theme_bw() -``` - - -Example from book of "falling through" to next value - -```{r} -legs <- function(x) { - switch(x, - cow = , - horse = , - dog = 4, - human = , - chicken = 2, - plant = 0, - stop("Unknown input") - ) -} -legs("cow") -#> [1] 4 -legs("dog") -#> [1] 4 -``` - - - - -## Using `dplyr::case_when` {-} - -- `case_when` is a more general `if_else` and can be used often in place of multiple chained `if_else` or sapply'ing `switch`. - -- It uses a special syntax to allow any number of condition-vector pairs: - -```{r message=FALSE, warning=FALSE} -set.seed(123) -x <- rlnorm(100) - -centers <- data.frame(type = c('mean', 'median', 'trimmed')) - -centers$value = dplyr::case_when( - centers$type == 'mean' ~ mean(x), - centers$type == 'median' ~ median(x), - centers$type == 'trimmed' ~ mean(x, trim = 0.1), - .default = 1000 - ) - -centers -``` - - - -## Loops - -- Iteration over a elements of a vector - -`for (item in vector) perform_action` - -**First example** -```{r} -for(i in 1:5) { - print(1:i) -} - -x <- numeric(length=5L) -df <- data.frame(x=1:5) - -for(i in 1:5) { - df$y[[i]] <- i+1 -} -``` - - -**Second example**: terminate a *for loop* earlier - -- `next` skips rest of current iteration -- `break` exits the loop entirely - -```{r} -for (i in 1:10) { - if (i < 3) - next - - print(i) - - if (i >= 5) - break -} -``` - -## Exercise {-} - -When the following code is evaluated, what can you say about the vector 
being iterated? -``` -xs <- c(1, 2, 3) -for (x in xs) { - xs <- c(xs, x * 2) -} -xs -#> [1] 1 2 3 2 4 6 -``` - -## Pitfalls {-} - -- Preallocate output containers to avoid *slow* code. - -- Beware that `1:length(v)` when `v` has length 0 results in a iterating backwards over `1:0`, probably not what is intended. Use `seq_along(v)` instead. - -- When iterating over S3 vectors, use `[[]]` yourself to avoid stripping attributes. - -``` -xs <- as.Date(c("2020-01-01", "2010-01-01")) -for (x in xs) { - print(x) -} -#> [1] 18262 -#> [1] 14610 -``` -vs. -``` -for (i in seq_along(xs)) { - print(xs[[i]]) -} -#> [1] "2020-01-01" -#> [1] "2010-01-01" -``` - -## Related tools {-} - -- `while(condition) action`: performs action while condition is TRUE. -- `repeat(action)`: repeats action forever (i.e. until it encounters break). - -- Note that `for` can be rewritten as `while` and while can be rewritten as `repeat` (this goes in one direction only!); *however*: - ->Good practice is to use the least-flexible solution to a problem, so you should use `for` wherever possible. -BUT you shouldn't even use for loops for data analysis tasks as `map()` and `apply()` already provide *less flexible* solutions to most problems. (More in Chapter 9.) 
- -```{r} -for (i in 1:5) { - print(i) -} - - -``` - -```{r} - -x_option <- function(x) { - switch(x, - a = "option 1", - b = "option 2", - c = "option 3"#, - #stop("Invalid `x` value") - ) -} - -``` - - - -```{r} -i <- 1 - -while(i <=5 ) { - print(i) - i <- i+1 -} -``` - -```{r} -i <- 1 - -repeat { - print(i) - i <- i+1 - if (i > 5) break -} - -``` diff --git a/slides/05.qmd b/slides/05.qmd @@ -0,0 +1,479 @@ +--- +engine: knitr +title: Control flow +--- + +## Learning objectives: + +- Understand the two primary tools for control flow: **choices** and **loops** +- Learn best practices to avoid common pitfalls +- Distinguish when to use `if`, `ifelse()`, and `switch()` for choices +- Distinguish when to use `for`, `while`, and `repeat` for loops + +::: {.callout-note} +Basic familiarity with choices and loops is assumed. +::: + +# Choices + +[Section 5.2 Choices]: # + +## `if` is the basic statement for a **choice** + +Single line format + +```{r} +#| eval: false +if (condition) true_action +if (condition) true_action else false_action +``` + +Compound statement within `{` + +```{r} +grade <- function(x) { + if (x > 90) { + "A" + } else if (x > 80) { + "B" + } else if (x > 50) { + "C" + } else { + "F" + } +} +``` + +## Results of `if` can be assigned + +```{r} +x1 <- if (TRUE) 1 else 2 +x2 <- if (FALSE) 1 else 2 + +c(x1, x2) +``` + +:::{.callout-tip} +Only recommended with single line format; otherwise hard to read. +::: + +## `if` without `else` can be combined with `c()` or `paste()` to create compact expressions + + - `if` without `else` invisibly returns `NULL` when `FALSE`. 
+ +```{r} +greet <- function(name, birthday = FALSE) { + paste0( + "Hi ", name, + if (birthday) " and HAPPY BIRTHDAY" + ) +} +greet("Maria", FALSE) +greet("Jaime", TRUE) +``` + +[Section 5.2.1 Invalid Inputs]: # + +## `if` should have a single `TRUE` or `FALSE` condition, other inputs generate errors + +```{r} +#| error: true +if ("x") 1 +if (logical()) 1 +if (NA) 1 +if (c(TRUE, FALSE)) 1 +``` + +[Section 5.2.2 Vectorized If]: # + +## Use `ifelse()` for vectorized conditions + +```{r} +x <- 1:10 +ifelse(x %% 5 == 0, "XXX", as.character(x)) +ifelse(x %% 2 == 0, "even", "odd") +``` + +::: {.callout-tip} +Only use `ifelse()` if both results are of the same type; otherwise output type is hard to predict. +::: + +## Use `dplyr::case_when()` for multiple condition-vector pairs + +```{r} +dplyr::case_when( + x %% 35 == 0 ~ "fizz buzz", + x %% 5 == 0 ~ "fizz", + x %% 7 == 0 ~ "buzz", + is.na(x) ~ "???", + TRUE ~ as.character(x) +) +``` + + +[Section 5.2.3 switch()]: # + +## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition="none-out"} + +:::: {.columns} +::: {.column} +```{r} +x_option <- function(x) { + if (x == "a") { + "option 1" + } else if (x == "b") { + "option 2" + } else if (x == "c") { + "option 3" + } else { + stop("Invalid `x` value") + } +} +``` +::: + +::: {.column} +```{r} +x_option <- function(x) { + switch(x, + a = "option 1", + b = "option 2", + c = "option 3", + stop("Invalid `x` value") + ) +} +``` +::: +:::: + +## `switch()` is a special purpose equivalent to `if` that can be used to compact code {transition="none-in"} + +::: {.callout-tip} + - The last component of a `switch()` should always throw an error, otherwise unmatched inputs will invisibly return `NULL`. + - Only use `switch()` with character inputs. Numeric inputs are hard to read and have undesirable failure modes. 
+::: + +::: {.callout-caution} +Like `if`, `switch()` can only take a single condition, not vector conditions +::: + + +## Avoid repeat outputs by leaving the right side of `=` empty + +- Inputs will "fall through" to the next value. + +```{r} +legs <- function(x) { + switch(x, + cow = , + horse = , + dog = 4, + human = , + chicken = 2, + plant = 0, + stop("Unknown input") + ) +} +legs("cow") +legs("dog") +``` + +[Section 5.3 Loops]: # + +# Loops + +## To iterate over items in a vector, use a `for` **loop** + +```{r} +#| eval: false +for (item in vector) perform_action +``` + +```{r} +for (i in 1:3) { + print(i) +} +``` + +::: {.callout-note} +Convention uses short variables like `i`, `j`, or `k` for iterating vector indices +::: + +## `for` will overwrite existing variables in the current environment + +```{r} +i <- 100 +for (i in 1:3) {} +i +``` + +## Use `next` or `break` to terminate loops early {transition="none-out"} +- `next` exits the current iteration, but continues the loop +- `break` exits the entire loop + +## Use `next` or `break` to terminate loops early {transition="none-in"} +```{r} +for (i in 1:10) { + if (i < 3) + next + + print(i) + + if (i >= 5) + break +} +``` + +[Section 5.3.1 Common pitfalls]: # + +## Preallocate an output container to avoid slow loops + +```{r} +means <- c(1, 50, 20) +out <- vector("list", length(means)) +for (i in 1:length(means)) { + out[[i]] <- rnorm(10, means[[i]]) +} +``` + +:::{.callout-tip} +`vector()` function is helpful for preallocation +::: + +## Use `seq_along(x)` instead of `1:length(x)` {transition="none-out"} +- `1:length(x)` causes unexpected failure for 0 length vectors +- `:` works with both increasing and decreasing sequences +```{r} +means <- c() +1:length(means) +seq_along(means) +``` + +## Use `seq_along(x)` instead of `1:length(x)` {transition="none-in"} +:::: {.columns} +::: {.column} +```{r} +#| error: true +out <- vector("list", length(means)) +for (i in 1:length(means)) { + out[[i]] <- rnorm(10, 
means[[i]]) +} +``` +::: +::: {.column} +```{r} +out <- vector("list", length(means)) +for (i in seq_along(means)) { + out[[i]] <- rnorm(10, means[[i]]) +} +out +``` +::: +:::: + +## Avoid problems when iterating over S3 vectors by using `seq_along(x)` and `x[[i]]` +::: {} +- loops typically strip attributes +```{r} +xs <- as.Date(c("2020-01-01", "2010-01-01")) +``` +::: +:::: {.columns} +::: {.column} +```{r} +for (x in xs) { + print(x) +} +``` +::: +::: {.column} +```{r} +for (i in seq_along(xs)) { + print(xs[[i]]) +} +``` +::: +:::: + +[Section 5.3.2 Related tools]: # + +## Use `while` or `repeat` when you don't know the number of iterations + +- `while(condition) action`: performs `action` while `condition` is `TRUE` +- `repeat(action)`: repeats `action` forever (or until a `break`) + +## Always use the least-flexible loop option possible + +- Use `for` before `while` or `repeat` +- In data analysis use `apply()` or `purrr::map()` before `for` + +# Quiz & Exercises {visibility="uncounted"} + +[Section 5.1 Quiz]: # + +## What is the difference between if and ifelse()? {visibility="uncounted"} + +::: {.fragment .fade-in} +`if` works with scalars; `ifelse()` works with vectors. +::: + +## In the following code, what will the value of `y` be if `x` is `TRUE`? What if `x` is `FALSE`? What if `x` is `NA`? {visibility="uncounted"} + + +```{r} +#| eval: false +y <- if (x) 3 +``` + +::: {.fragment .fade-up fragment-index=1} +When `x` is `TRUE`, `y` will be `3`; when `FALSE`, `y` will be `NULL`; when `NA` the `if` statement will throw an error. +::: + +## What does `switch("x", x = , y = 2, z = 3)` return? {visibility="uncounted"} + + +```{r} +#| eval: false +switch( + "x", + x = , + y = 2, + z = 3 +) +``` + +::: {.fragment .fade-in} +This `switch()` statement makes use of fall-through so it will return `2`. +::: + +[Section 5.2.4 Exercises]: # + +## What type of vector does each of the following calls to ifelse() return? 
{visibility="uncounted" transition="none-out"} + +Read the documentation and write down the rules in your own words. +```{r} +#| eval: false +ifelse(TRUE, 1, "no") +ifelse(FALSE, 1, "no") +ifelse(NA, 1, "no") +``` + +## What type of vector does each of the following calls to ifelse() return? {visibility="uncounted" transition="none"} + +The arguments of `ifelse()` are named `test`, `yes` and `no`. +In general, `ifelse()` returns the entry for `yes` when `test` is `TRUE`, +the entry for `no` when `test` is `FALSE` +and `NA` when `test` is `NA`. + +```{r} +ifelse(TRUE, 1, "no") +ifelse(FALSE, 1, "no") +ifelse(NA, 1, "no") +``` + +## What type of vector does each of the following calls to ifelse() return? {visibility="uncounted" transition="none-in"} +In practice, `test` is first converted to `logical` and if the result is neither `TRUE` nor `FALSE`, then `as.logical(test)` is returned. +```{r} +ifelse(logical(), 1, "no") +ifelse(NaN, 1, "no") +ifelse(NA_character_, 1, "no") +ifelse("a", 1, "no") +ifelse("true", 1, "no") +``` + +## Why does the following code work? {visibility="uncounted"} + +```{r} +x <- 1:10 +if (length(x)) "not empty" else "empty" +x <- numeric() +if (length(x)) "not empty" else "empty" +``` +::: {.fragment .fade-up fragment-index=1} +`if()` expects a logical condition, but also accepts a numeric vector where `0` is treated as `FALSE` and all other numbers are treated as `TRUE`. +Numerical missing values (including `NaN`) lead to an error in the same way that a logical missing, `NA`, does. +::: + +[Section 5.3.3 Exercises]: # + +## Why does this code succeed without errors or warnings? {visibility="uncounted" transition="none-out"} + +```{r} +x <- numeric() +out <- vector("list", length(x)) +for (i in 1:length(x)) { + out[i] <- x[i] ^ 2 +} +out +``` + +## Why does this code succeed without errors or warnings? 
{visibility="uncounted" transition="none-in"} +- Subsetting behavior for out-of-bounds & `0` indices when using `[<-` and `[` +- `x[1]` generates an `NA`. `NA` is assigned to the empty length-1 list `out[1]` +- `x[0]` returns `numeric(0)`. `numeric(0)` is assigned to `out[0]`. Assigning a 0-length vector to a 0-length subset doesn't change the object. +- Each step includes valid R operations (even though the result may not be what the user intended). + +## Walk-through {visibility="uncounted" transition="none-out"} + +Setup +```{r} +x <- numeric() +out <- vector("list", length(x)) +1:length(x) +``` + +## Walk-through {visibility="uncounted" transition="none"} + +First Iteration +```{r} +x[1] +x[1]^2 +out[1] +out[1] <- x[1]^2 +out[1] +``` + +## Walk-through {visibility="uncounted" transition="none"} + +Second Iteration +```{r} +x[0] +x[0]^2 +out[0] +out[0] <- x[0]^2 +out[0] +``` + +## Walk-through {visibility="uncounted" transition="none-in"} + +Final Result +```{r} +out +``` + +## When the following code is evaluated, what can you say about the vector being iterated? {visibility="uncounted"} + +```{r} +xs <- c(1, 2, 3) +for (x in xs) { + xs <- c(xs, x * 2) +} +xs +``` + +::: {.fragment .fade-in} +In this loop `x` takes on the values of the initial `xs` (`1`, `2` and `3`), indicating that it is evaluated just once in the beginning of the loop, not after each iteration. (Otherwise, we would run into an infinite loop.) +::: + +## What does the following code tell you about when the index is updated? {visibility="uncounted"} + +```{r} +for (i in 1:3) { + i <- i * 2 + print(i) +} +``` + +::: {.fragment .fade-in} +In a `for` loop the index is updated in the beginning of each iteration. Therefore, reassigning the index symbol during one iteration doesn't affect the following iterations. (Again, we would otherwise run into an infinite loop.) +:::
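The last two quiz answers in the added `slides/05.qmd` (the iterated vector is evaluated once; the index is reset at the start of each iteration) can be checked with a short R sketch. This is not part of the commit; the names `seen` and `doubled` are hypothetical and exist only for this illustration.

```r
# Illustration only, not part of the commit.
# `seen` and `doubled` are made-up names for this sketch.

# 1. The vector in `for (x in xs)` is evaluated once, before the first iteration.
xs <- c(1, 2, 3)
seen <- numeric()
for (x in xs) {
  seen <- c(seen, x)   # record the value the loop hands us
  xs <- c(xs, x * 2)   # growing `xs` inside the body does not extend the loop
}
stopifnot(identical(seen, c(1, 2, 3)))
stopifnot(identical(xs, c(1, 2, 3, 2, 4, 6)))

# 2. The index is reassigned from the sequence at the start of every iteration.
doubled <- numeric()
for (i in 1:3) {
  i <- i * 2           # affects only the remainder of this iteration
  doubled <- c(doubled, i)
}
stopifnot(identical(doubled, c(2, 4, 6)))
```

Collecting values into a vector instead of calling `print()` lets `stopifnot()` verify the behaviour directly; for anything beyond a toy example, the preallocation advice from the slides would apply.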