bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git
Log | Files | Refs | README | LICENSE

html.json (7773B)


      1 {
      2   "hash": "8b328be22c5a8793824ab48fed89ce11",
      3   "result": {
      4     "engine": "knitr",
      5     "markdown": "---\nengine: knitr\ntitle: Function operators\n---\n\n## Learning objectives:\n\n- Define function operator\n- Explore some existing function operators\n- Make our own function operator\n\n\n\n## Introduction \n\n<!--- ```{r 10-01, fig.align='center',fig.dim=\"50%\", fig.cap=\"Credits: 2001-2003 Michael P.Frank (https://slideplayer.com/slide/17666226/)\",echo=FALSE}\n\nknitr::include_graphics(\"images/11-maths_example.png\")\n``` --->\n\n- A **function operator** is a function that takes one (or more) functions as input and returns a function as output.\n\n- Function operators are a special case of **function factories**, since they return functions.\n\n- They are often used to wrap an existing function to provide additional capability, similar to python's **decorators**. \n\n\n::: {.cell}\n\n```{.r .cell-code}\nchatty <- function(f) {\n  force(f)\n  \n  function(x, ...) {\n    res <- f(x, ...)\n    cat(\"Processing \", x, \"\\n\", sep = \"\")\n    res\n  }\n}\n\nf <- function(x) x ^ 2\ns <- c(3, 2, 1)\n\npurrr::map_dbl(s, chatty(f))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> Processing 3\n#> Processing 2\n#> Processing 1\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 9 4 1\n```\n\n\n:::\n:::\n\n\n## Existing function operators \n\nTwo function operator examples are `purrr:safely()` and `memoise::memoise()`.  These can be found in `purr` and `memoise`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(purrr)\nlibrary(memoise)\n```\n:::\n\n\n## purrr::safely {-}\n\nCapturing Errors: turns errors into data!\n\n      \n\n::: {.cell}\n\n```{.r .cell-code}\nx <- list(\n  c(0.512, 0.165, 0.717),\n  c(0.064, 0.781, 0.427),\n  c(0.890, 0.785, 0.495),\n  \"oops\"\n)\n```\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmap_dbl(x, sum)\n#> Error in .Primitive(\"sum\")(..., na.rm = na.rm): invalid 'type' (character) of\n#> argument\n```\n:::\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# note use of map (not map_dbl), safely returns a lisst\n\nout <- map(x, safely(sum))\nstr(transpose(out))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> List of 2\n#>  $ result:List of 4\n#>   ..$ : num 1.39\n#>   ..$ : num 1.27\n#>   ..$ : num 2.17\n#>   ..$ : NULL\n#>  $ error :List of 4\n#>   ..$ : NULL\n#>   ..$ : NULL\n#>   ..$ : NULL\n#>   ..$ :List of 2\n#>   .. ..$ message: chr \"invalid 'type' (character) of argument\"\n#>   .. ..$ call   : language .Primitive(\"sum\")(..., na.rm = na.rm)\n#>   .. ..- attr(*, \"class\")= chr [1:3] \"simpleError\" \"error\" \"condition\"\n```\n\n\n:::\n:::\n\n      \n\n\n## Other `purrr` function operators {-}\n\n<!---\n\n::: {.cell layout-align=\"center\"}\n::: {.cell-output-display}\n![Credits: www.nextptr.com](images/11-function_operators.png){fig-align='center' width=350}\n:::\n:::\n\n--->\n\n> purrr comes with three other function operators in a similar vein:\n\n      \n      possibly(): returns a default value when there’s an error. It provides no way to tell if an error occured or not, so it’s best reserved for cases when there’s some obvious sentinel value (like NA).\n\n      quietly(): turns output, messages, and warning side-effects into output, message, and warning components of the output.\n\n      auto_browser(): automatically executes browser() inside the function when there’s an error.\n\n\n## memoise::memoise {-}\n\nCaching computations: avoid repeated computations!\n \n\n\n::: {.cell}\n\n```{.r .cell-code}\nslow_function <- function(x) {\n  Sys.sleep(1)\n  x * 10 * runif(1)\n}\nsystem.time(print(slow_function(1)))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 3.054764\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>    0.02    0.00    1.02\n```\n\n\n:::\n\n```{.r .cell-code}\nsystem.time(print(slow_function(1)))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 3.644634\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>    0.00    0.00    1.03\n```\n\n\n:::\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfast_function <- memoise::memoise(slow_function)\nsystem.time(print(fast_function(1)))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 7.451108\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>    0.00    0.00    1.01\n```\n\n\n:::\n\n```{.r .cell-code}\nsystem.time(print(fast_function(1)))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 7.451108\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>    0.00    0.00    0.01\n```\n\n\n:::\n:::\n\n\n> Be careful about memoising impure functions! \n\n## Exercise {-}\n\nHow does `safely()` work?  \nThe source code looks like this:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsafely\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> function (.f, otherwise = NULL, quiet = TRUE) \n#> {\n#>     .f <- as_mapper(.f)\n#>     force(otherwise)\n#>     check_bool(quiet)\n#>     function(...) capture_error(.f(...), otherwise, quiet)\n#> }\n#> <bytecode: 0x00000183867aacf0>\n#> <environment: namespace:purrr>\n```\n\n\n:::\n:::\n\n\nThe real work is done in `capture_error` which is defined in the package **namespace**. We can access it with the `:::` operator. (Could also grab it from the function's environment.)\n\n\n::: {.cell}\n\n```{.r .cell-code}\npurrr:::capture_error\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> function (code, otherwise = NULL, quiet = TRUE) \n#> {\n#>     tryCatch(list(result = code, error = NULL), error = function(e) {\n#>         if (!quiet) {\n#>             message(\"Error: \", conditionMessage(e))\n#>         }\n#>         list(result = otherwise, error = e)\n#>     })\n#> }\n#> <bytecode: 0x0000018386817b60>\n#> <environment: namespace:purrr>\n```\n\n\n:::\n:::\n\n\n## Case study: make your own function operator\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nurls <- c(\n  \"adv-r\" = \"https://adv-r.hadley.nz\", \n  \"r4ds\" = \"http://r4ds.had.co.nz/\"\n  # and many many more\n)\npath <- paste(tempdir(), names(urls), \".html\")\n\nwalk2(urls, path, download.file, quiet = TRUE)\n```\n:::\n\n\n\nHere we make a function operator that add a little delay in reading each page:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndelay_by <- function(f, amount) {\n  force(f)\n  force(amount)\n  \n  function(...) {\n    Sys.sleep(amount)\n    f(...)\n  }\n}\nsystem.time(runif(100))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>       0       0       0\n```\n\n\n:::\n\n```{.r .cell-code}\nsystem.time(delay_by(runif, 0.1)(100))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#>    user  system elapsed \n#>    0.00    0.00    0.11\n```\n\n\n:::\n:::\n\n\n\nAnd another to add a dot after nth invocation:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndot_every <- function(f, n) {\n  force(f)\n  force(n)\n  \n  i <- 0\n  function(...) {\n    i <<- i + 1\n    if (i %% n == 0) cat(\".\")\n    f(...)\n  }\n}\n\nwalk(1:100, dot_every(runif, 10))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> ..........\n```\n\n\n:::\n:::\n\n\nCan now use both of these function operators to express our desired result:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nwalk2(\n  urls, path, \n  download.file %>% dot_every(10) %>% delay_by(0.1), \n  quiet = TRUE\n)\n```\n:::\n\n\n## Exercise {-}\n\n2) Should you memoise file.download? Why or why not?\n",
      6     "supporting": [
      7       "11_files"
      8     ],
      9     "filters": [
     10       "rmarkdown/pagebreak.lua"
     11     ],
     12     "includes": {},
     13     "engineDependencies": {},
     14     "preserve": {},
     15     "postProcess": true
     16   }
     17 }