commit a48f32282aff6e9f6e9c1a6ab6a6cca7ecab7249
parent 2ff6abc737f0cb8988da428e921cd8d423597f8a
Author: Federica Gazzelloni <61802414+Fgazzelloni@users.noreply.github.com>
Date: Wed, 24 Aug 2022 16:49:57 +0200
chapter10 (#23)
Diffstat:
5 files changed, 501 insertions(+), 5 deletions(-)
diff --git a/10_Function_factories.Rmd b/10_Function_factories.Rmd
@@ -2,12 +2,505 @@
**Learning objectives:**
-- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
+- Understand what a function factory is
+- Recognise how function factories work
+- Learn about non-obvious combination of function features
+- Generate a family of functions from data
+
+
+
+## Introduction
+
+
+In this chapter we'll talk about function factories.
+
+A type of functions that make other functions.
+
+We'll tackle some challegenges in understanding how to make a workflow of functions.
+
+
+
+## What is a function factory?
+
+
+A **function factory** is a function that makes functions
+
+Factory made function are **manufactured functions**.
+
+```{r 10-1, echo=FALSE, fig.align='center', fig.dim="50%",fig.alt="https://epsis.com/no/operations-centers-focus-on-ways-of-working/",fig.cap="Function factory | Credits: epsis.com"}
+knitr::include_graphics("images/10-1-factories.png")
+```
+
+
+
+## How does a function factory work?
+```{r 10-2, echo=FALSE, fig.align='center', fig.dim="100%",fig.cap="How does it work? | Credits: kakaakigas.com/how-it-works/"}
+knitr::include_graphics("images/10-2-how.jpg")
+```
+
+```{r 10-ex1}
+power1 <- function(exp) {
+ function(x) {
+ x ^ exp
+ }
+}
+
+square <- power1(2)
+cube <- power1(3)
+```
+`power1()` is the function factory and `square()` and `cube()` are manufactured functions.
+
+
+**What we know already: Function Features**
+
+1. first-class functions
+```{r 10-ex2, eval=FALSE}
+name <- function(variables) {
+
+}
+```
+
+2. function environment `fn_env()`
+```{r 10-ex3}
+y <- 1
+f <- function(x) x + y
+rlang::fn_env(f)
+```
+
+3. function execution environment
+```{r 10-ex4}
+h <- function(x) {
+ # 1.
+ a <- 2 # 2. assingned value in the environment
+ x + a
+}
+y <- h(1) # 3.
+```
+
+
+
+A **function factory** is defined as a **functional programming tools**, as well as **functionals**, and **function operators**.
+
+An application would be to tackle the understanding of:
+
+- **Box-Cox transformation**
+- **maximum likelihood** problems
+- **bootstrap resamples**
+
+
+### Factory fundamentals
+
+- Environments
+- Force calculation
+- Super assignments
+- Cleaning up
+
+
+```{r 10-lib}
+library(rlang)
+library(ggplot2)
+library(scales)
+```
+
+Looking at manufactored functions, in our example the `square()` function, and imagine to do not know anything about it. The environment can be investigated with `env_print()` function. Two different enviroments can be found, with the same **parent**.
+
+```{r 10-6}
+rlang::env_print(square)
+```
+
+To know more about it, **exp** is visible as the engine of the function, the computed value (`square <- power1(2)`) can be retrieved like this:
+```{r 10-7}
+fn_env(square)$exp
+```
+```{r 10-8, echo=FALSE, fig.align="center", fig.dim="50%"}
+knitr::include_graphics("images/10-3-procedure.png")
+```
+
+To make sure every argument is evaluated, whenever x is assigned to a different value.
+```{r 10-9, eval=FALSE}
+force()
+```
+
+```{r 10-10}
+power1 <- function(exp) {
+ function(x) {
+ x ^ exp
+ }
+}
+square <- power1(x)
+x <- 2
+square(2)
+```
+Assign a value to the variable x, while it goes in the environment, see what's happen to our function:
+```{r 10-11}
+x <- 3
+square(2)
+```
+
+This is not necessarely wrong, but what if we'd like to set a variable **x** which can assume other values in the environment?
+
+```{r 10-12}
+power2 <- function(exp) {
+ force(exp)
+ function(x) {
+ x ^ exp
+ }
+}
+
+x <- 2
+square <- power2(x)
+square(2)
+```
+Look what's happen now:
+```{r 10-13}
+x <- 3
+square(2)
+```
+Even if we assigned a new value to **x** and it is stored in the environment, our function doesn't take consideration of it and keep doing its calculation based on the first variable assignment.
+
+**The super assignment operator**
+
+```{r 10-14, eval=FALSE}
+<<-
+```
+
+Another way is to set the function to **mantain the state**, in a way that each invocation will be completely independent, and a new *environment* is created, because considered as a **fresh start** (6.4.3).
+```{r 10-15}
+new_counter <- function() {
+ i <- 0 # first assignment inside the function (fresh start)
+ function() {
+ i <<- i + 1 # second assignment (super assignment)
+ i
+ }
+}
+
+counter_one <- new_counter()
+counter_two <- new_counter()
+counter_one()
+counter_one()
+counter_two()
+```
+
+
+> "As soon as your function starts managing the state of multiple variables, it’s better to switch to R6"
+
+**Cleaning up** using `rm()` inside a function:
+```{r 10-16}
+f1 <- function(n) {
+ x <- runif(n)
+ m <- mean(x)
+ function() m
+}
+
+g1 <- f1(1e6)
+lobstr::obj_size(g1)
+#> 8,013,104 B
+
+f2 <- function(n) {
+ x <- runif(n)
+ m <- mean(x)
+ rm(x)
+ function() m
+}
+
+g2 <- f2(1e6)
+lobstr::obj_size(g2)
+#> 12,944 B
+```
+
+
+
+### Example of: Graphical factories
+
+Graphical factories **useful function factories**, such as:
+
+1. Labelling with:
+
+ * formatter functions
+
+```{r 10-19}
+y <- c(12345, 123456, 1234567)
+comma_format()(y)
+```
+```{r 10-20}
+number_format(scale = 1e-3, suffix = " K")(y)
+```
+They are more commonly used inside a ggplot:
+```{r 10-21, include=FALSE}
+df <- data.frame(x = 1, y = y)
+a_ggplot_object <- ggplot(df, aes(x, y)) +
+ geom_point() +
+ scale_x_continuous(breaks = 1, labels = NULL) +
+ labs(x = NULL, y = NULL)
+```
+
+```{r 10-22,eval=T}
+a_ggplot_object +
+ scale_y_continuous(
+ labels = comma_format()
+)
+```
+
+2. Using binwidth in facet histograms
+
+ * binwidth_bins
+
+```{r}
+binwidth_bins <- function(n) {
+ force(n)
+
+ function(x) {
+ (max(x) - min(x)) / n
+ }
+}
+```
+
+Or use a concatenation of this typr of detecting number of bins functions:
+
+ - nclass.Sturges()
+ - nclass.scott()
+ - nclass.FD()
+
+```{r}
+base_bins <- function(type) {
+ fun <- switch(type,
+ Sturges = nclass.Sturges,
+ scott = nclass.scott,
+ FD = nclass.FD,
+ stop("Unknown type", call. = FALSE)
+ )
+
+ function(x) {
+ (max(x) - min(x)) / fun(x)
+ }
+}
+```
+
+
+3. Internals:
+
+ * ggplot2:::plot_dev()
+
+
+## Non-obvious combinations
+
+
+- The **Box-Cox** transformation.
+- **Bootstrap** resampling.
+- **Maximum likelihood** estimation.
+
+
+### Statistical factories
+
+The **Box-Cox** transformation towards normality:
+```{r}
+boxcox1 <- function(x, lambda) {
+ stopifnot(length(lambda) == 1)
+
+ if (lambda == 0) {
+ log(x)
+ } else {
+ (x ^ lambda - 1) / lambda
+ }
+}
+```
+
+
+```{r}
+boxcox2 <- function(lambda) {
+ if (lambda == 0) {
+ function(x) log(x)
+ } else {
+ function(x) (x ^ lambda - 1) / lambda
+ }
+}
+
+stat_boxcox <- function(lambda) {
+ stat_function(aes(colour = lambda), fun = boxcox2(lambda), size = 1)
+}
+
+plot1 <- ggplot(data.frame(x = c(0, 5)), aes(x)) +
+ lapply(c(0.5, 1, 1.5), stat_boxcox) +
+ scale_colour_viridis_c(limits = c(0, 1.5))
+
+# visually, log() does seem to make sense as the transformation
+# for lambda = 0; as values get smaller and smaller, the function
+# gets close and closer to a log transformation
+plot2 <- ggplot(data.frame(x = c(0.01, 1)), aes(x)) +
+ lapply(c(0.5, 0.25, 0.1, 0), stat_boxcox) +
+ scale_colour_viridis_c(limits = c(0, 1.5))
+library(patchwork)
+plot1+plot2
+```
+
+**Bootstrap generators**
+
+
+```{r}
+boot_permute <- function(df, var) {
+ n <- nrow(df)
+ force(var)
+
+ function() {
+ col <- df[[var]]
+ col[sample(n, replace = TRUE)]
+ }
+}
+
+boot_mtcars1 <- boot_permute(mtcars, "mpg")
+head(boot_mtcars1())
+#> [1] 16.4 22.8 22.8 22.8 16.4 19.2
+head(boot_mtcars1())
+#> [1] 17.8 18.7 30.4 30.4 16.4 21.0
+```
+```{r}
+boot_model <- function(df, formula) {
+ mod <- lm(formula, data = df)
+ fitted <- unname(fitted(mod))
+ resid <- unname(resid(mod))
+ rm(mod)
+
+ function() {
+ fitted + sample(resid)
+ }
+}
+
+boot_mtcars2 <- boot_model(mtcars, mpg ~ wt)
+head(boot_mtcars2())
+#> [1] 25.0 24.0 21.7 19.2 24.9 16.0
+head(boot_mtcars2())
+#> [1] 27.4 21.0 20.3 19.4 16.3 21.3
+```
+
+**Maximum likelihood estimation**
+
+$$P(\lambda,x)=\prod_{i=1}^{n}\frac{\lambda^{x_i}e^{-\lambda}}{x_i!}$$
+```{r}
+lprob_poisson <- function(lambda, x) {
+ n <- length(x)
+ (log(lambda) * sum(x)) - (n * lambda) - sum(lfactorial(x))
+}
+```
+
+```{r}
+x1 <- c(41, 30, 31, 38, 29, 24, 30, 29, 31, 38)
+```
+
+```{r}
+lprob_poisson(10, x1)
+#> [1] -184
+lprob_poisson(20, x1)
+#> [1] -61.1
+lprob_poisson(30, x1)
+#> [1] -31
+```
+
+```{r}
+ll_poisson1 <- function(x) {
+ n <- length(x)
+
+ function(lambda) {
+ log(lambda) * sum(x) - n * lambda - sum(lfactorial(x))
+ }
+}
+```
+```{r}
+ll_poisson2 <- function(x) {
+ n <- length(x)
+ sum_x <- sum(x)
+ c <- sum(lfactorial(x))
+
+ function(lambda) {
+ log(lambda) * sum_x - n * lambda - c
+ }
+}
+```
+
+```{r}
+ll1 <- ll_poisson2(x1)
+
+ll1(10)
+#> [1] -184
+ll1(20)
+#> [1] -61.1
+ll1(30)
+#> [1] -31
+```
+```{r}
+optimise(ll1, c(0, 100), maximum = TRUE)
+#> $maximum
+#> [1] 32.1
+#>
+#> $objective
+#> [1] -30.3
+```
+```{r}
+optimise(lprob_poisson, c(0, 100), x = x1, maximum = TRUE)
+#> $maximum
+#> [1] 32.1
+#>
+#> $objective
+#> [1] -30.3
+```
+
+## Function factory applications
+
+
+Combine functionals and function factories to turn data into many functions.
+
+### Function factories + functionals
+```{r}
+names <- list(
+ square = 2,
+ cube = 3,
+ root = 1/2,
+ cuberoot = 1/3,
+ reciprocal = -1
+)
+funs <- purrr::map(names, power1)
+
+funs$root(64)
+#> [1] 8
+funs$root
+#> function(x) {
+#> x ^ exp
+#> }
+#> <bytecode: 0x7fe85512a410>
+#> <environment: 0x7fe85b21f190>
+```
+```{r}
+with(funs, root(100))
+#> [1] 10
+```
+
+```{r}
+attach(funs)
+#> The following objects are masked _by_ .GlobalEnv:
+#>
+#> cube, square
+root(100)
+#> [1] 10
+detach(funs)
+```
+
+
+```{r}
+rlang::env_bind(globalenv(), !!!funs)
+root(100)
+#> [1] 10
+```
+
+```{r}
+rlang::env_unbind(globalenv(), names(funs))
+```
+
+
+## Conclusion
+
+
+
-## SLIDE 1
-- ADD SLIDES AS SECTIONS (`##`).
-- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
## Meeting Videos
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -23,4 +23,7 @@ Imports:
emoji,
lobstr,
rlang,
- glue
+ glue,
+ scales,
+ ggplot2,
+ patchwork
diff --git a/images/10-1-factories.png b/images/10-1-factories.png
Binary files differ.
diff --git a/images/10-2-how.jpg b/images/10-2-how.jpg
Binary files differ.
diff --git a/images/10-3-procedure.png b/images/10-3-procedure.png
Binary files differ.