06.Rmd (11290B)
1 --- 2 engine: knitr 3 title: Functions 4 --- 5 6 ## Learning objectives: 7 8 - How to make functions in R 9 - What are the parts of a function 10 - Nested functions 11 12 ## How to make a simple function in R 13 14 **Function components** 15 16 Functions have three parts, `formals()`, `body()`, and `environment()`. 17 18 ```{r 06-c01, echo=FALSE, fig.align='center', fig.dim="100%"} 19 DiagrammeR::mermaid(" 20 graph LR 21 A{formals}-->B(body) 22 B-->C(environment) 23 style A fill:#bbf,stroke:#f66,stroke-width:2px,color:#fff,stroke-dasharray: 5 5 24 ",height = '100%', width = '100%') 25 ``` 26 27 -------------------------------------------------------------------------------- 28 29 **Example: [coffee rations](https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-07-07/coffee_ratings.csv)** 30 31 ```{r, echo = FALSE, message = FALSE} 32 coffee_ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-07-07/coffee_ratings.csv') 33 library(tidyverse) 34 ``` 35 36 ```{r, function_example} 37 avg_points <- function(species){ 38 # this function is for calculating the mean 39 avg <- coffee_ratings |> 40 filter(species == species) |> 41 summarise(mean = mean(total_cup_points)) 42 return(avg) 43 } 44 ``` 45 46 47 ```{r} 48 avg_points("Arabica") 49 ``` 50 51 --------------------------------------------------------------------------------- 52 53 ```{r} 54 formals(avg_points) 55 ``` 56 57 ```{r} 58 body(avg_points) 59 ``` 60 61 ```{r} 62 environment(avg_points) 63 ``` 64 65 66 ----------------------------------------------------------------------------------- 67 68 Functions uses attributes, one attribute used by base R is `srcref`, short for **source reference**. 69 It points to the source code used to create the function. 70 It contains code comments and other formatting. 71 72 ```{r} 73 74 #| eval=FALSE 75 my_f <- function(){ 76 #test 77 return(1) 78 } 79 80 attr(my_f, "srcref") 81 ``` 82 83 ## Primitive functions 84 85 Are the core function in base R, such as `sum()` 86 ```{r} 87 sum 88 ``` 89 90 Type of primitives: 91 92 - builtin 93 - special 94 95 ```{r} 96 typeof(sum) 97 ``` 98 99 100 These core functions have components to NULL. 101 102 ## Anonymous functions 103 104 If you don't provide a name to a function 105 106 ```{r} 107 lapply(mtcars |> select(mpg, cyl), function(x) length(unique(x))) 108 ``` 109 110 111 ```{r} 112 vector_len <- function(x) { 113 length(unique(x)) 114 } 115 ``` 116 117 ```{r} 118 lapply(mtcars |> select(mpg, cyl), vector_len) 119 ``` 120 121 ------------------------------------------------------------------------------------------- 122 123 **Invoking a function** 124 125 ```{r} 126 args <- unique(coffee_ratings$species)[[1]] |> as.list() 127 128 do.call(avg_points, args) 129 ``` 130 131 `do.call` is used a lot with apply's family 132 133 ## Function composition 134 135 ```{r nestedfunctioncalls} 136 x <- runif(100) 137 # Nested function calls 138 square <- function(x) x^2 139 deviation <- function(x) x - mean(x) 140 sqrt(mean(square(deviation(x)))) 141 ``` 142 143 ```{r intermediateresults} 144 # intermediate result 145 out <- deviation(x) 146 out <- square(out) 147 out <- mean(out) 148 out <- sqrt(out) 149 out 150 ``` 151 152 153 ```{r pipebinaryoperator} 154 # Pipe workflow 155 x |> 156 deviation() |> 157 square() |> 158 mean() |> 159 sqrt() 160 ``` 161 162 ## More about functions insights 163 164 ### Lexical scoping 165 166 **Rules** 167 168 - Name masking 169 - Functions versus variables 170 - A fresh start 171 - Dynamic lookup 172 173 -------------------------------------------------------------------------------- 174 175 #### Name masking 176 177 Save us from our polluted global environment! 178 179 ```{r namemasking} 180 bob <- 1; alice <- 2 181 182 g02 <- function() {bob <- "test"; alice <- "bacon"; c(bob, alice)} 183 184 g03 <- function() {bob <- "test"; c(bob, alice)} 185 186 g02() 187 g03() 188 ``` 189 190 Note: also applied for functions names 191 192 ------------------------------------------------------------------------------------ 193 194 #### Functions versus variables 195 196 ::: {.callout-caution} 197 Do not be too smart! 198 ::: 199 200 ```{r donotdothat} 201 g09 <- function(x) x + 100 202 g10 <- function() { 203 g09 <- 10 204 g09(g09) 205 } 206 g10() 207 ``` 208 209 ------------------------------------------------------------------------------------ 210 211 #### Fresh start: 212 213 ```{r} 214 g11 <- function() { 215 if (!exists("a")) { 216 print("I am here") 217 a <- 1 218 } else { 219 print("Dead branch!") 220 a <- a + 1 221 } 222 a 223 } 224 225 g11() 226 g11() 227 ``` 228 229 ------------------------------------------------------------------------------------- 230 231 #### Dynamic scoping: 232 233 **"When"**: R looks for the value **when** the function is run 234 235 This function 236 ```{r} 237 g12 <- function() x + 1 238 x <- 15 239 g12() 240 x <- 20 241 g12() 242 ``` 243 244 ------------------------------------------------------------------------------------ 245 246 #### Debugging 247 248 ```{r} 249 codetools::findGlobals(g12) 250 ``` 251 252 You can change the function’s environment to an environment which contains nothing: 253 ```{r} 254 # environment(g12) <- emptyenv() 255 # g12() 256 # Error in x + 1 : could not find function "+" 257 ``` 258 259 ------------------------------------------------------------------------------------ 260 261 ## Lazy Evaluation: 262 263 264 ::: {.callout-important} 265 function arguments are only evaluated when accessed 266 ::: 267 268 ------------------------------------------------------------------------------------ 269 270 #### Promises 271 272 See you at chapter 20! 273 274 Promises: multiple definitions? 275 276 - [Futurverse](https://journal.r-project.org/archive/2021/RJ-2021-048/RJ-2021-048.pdf) 277 278 - [Promises](https://rstudio.github.io/promises/) in Shiny 279 280 ---------------------------------------------------------------------------------- 281 282 #### Default arguments 283 284 Arguments defined by other arguments or variables in Functions 285 286 ::: {.callout-caution} 287 Not recommended by Hadley! 288 ::: 289 290 ---------------------------------------------------------------------------------- 291 292 #### Missing arguments 293 294 `missing()` checks if an argument is missing (TRUE/FALSE) then you can "branch" according to it 295 296 ::: {.callout-caution} 297 Do not be like `sample()` ! 298 ::: 299 300 ------------------------------------------------------------------------------------ 301 302 ### ... (dot-dot-dot) 303 304 **Example** 305 306 ```{r} 307 i01 <- function(y, z) { 308 list(y = y, z = z) 309 } 310 i02 <- function(x, ...) { 311 i01(...) 312 } 313 str(i02(x = 1, y = 2, z = 3)) 314 ``` 315 316 Used a lot with higher ordered functions! 317 318 ----------------------------------------------------------------------------------- 319 320 ### Exiting a function 321 322 1. Implicit or explicit returns 323 324 2. Invisibility (`<-` most famous function that returns an invisible value) 325 326 3. `stop()` to stop a function with an error. 327 328 4. Exit handlers (`on.exit()`) 329 330 ------------------------------------------------------------------------------------ 331 332 ### Function forms 333 334 >Everything that exists is an object. 335 Everything that happens is a function call. 336 — John Chambers 337 338 339 ```{r echo=FALSE} 340 knitr::include_graphics("images/06_forms.png") 341 ``` 342 343 --- 344 345 346 ## Case Study: SIR model function 347 348 This is an interesting example taken from a course on Coursera: [Infectious disease modelling-ICL](https://www.coursera.org/specializations/infectious-disease-modelling) 349 350 The purpose of this example is to show how to make a model passing through making a function. 351 352 First we need to load some useful libraries: 353 354 ```{r message=FALSE, warning=FALSE, paged.print=FALSE} 355 library(deSolve) 356 library(reshape2) 357 ``` 358 359 360 Then set the model inputs: 361 362 - population size (N) 363 - number of susceptable (S) 364 - infected (I) 365 - recovered (R) 366 367 And add the model parameters: 368 369 - infection rate ($\beta$) 370 - recovery rate ($\gamma$) 371 372 373 ```{r} 374 N<- 100000 # population 375 376 state_values<- c(S = N -1, # susceptible 377 I = 1, # infected 378 R = 0) # recovered 379 380 parameters<- c(beta = 1/2, # infection rate days^-1 381 gamma = 1/4) # recovery rate days^-1 382 ``` 383 384 385 Then we set the **time** as an important factor, which defines the length of time we are looking at this model run. It is intended as the time range in which the infections spread out, let’s say that we are aiming to investigate an infection period of 100 days. 386 387 ```{r} 388 times<- seq(0, 100, by = 1) 389 ``` 390 391 Finally, we set up the **SIR model**, the susceptable, infected and recovered model. How do we do that is passing the paramenters through a function of the time, and state. 392 393 Within the model function we calculate one more paramenter, the **force of infection**: $\lambda$ 394 395 ```{r} 396 sir_model<- function(time, state, parameters){ 397 with(as.list(c(state, parameters)),{ 398 N<- S + I + R 399 lambda = beta * I/N # force of infection 400 dS<- - lambda * S 401 dI<- lambda * S - gamma * I 402 dR<- gamma * I 403 return(list(c(dS,dI,dR))) 404 }) 405 } 406 ``` 407 408 409 Once we have our **SIR model** function ready, we can calculate the **output** of the model, with the help of the function `ode()` from {deSolve} package. 410 411 ```{r} 412 output<- as.data.frame(ode(y = state_values, 413 times = times, 414 func = sir_model, 415 parms = parameters)) 416 output %>% head 417 ``` 418 419 In addition to our builtin SIR model function we can have a look at: 420 421 ```{r eval=FALSE, include=T} 422 ?deSolve::ode() 423 ``` 424 425 It solves **Ordinary Differential Equations**. 426 427 ```{r} 428 deSolve:::ode 429 ``` 430 431 ```{r} 432 methods("ode") 433 ``` 434 435 436 --- 437 438 With the help of the {reshape2} package we use the function `melt()` to reshape the output: 439 440 ```{r} 441 melt(output,id="time") %>% head 442 ``` 443 444 445 The same as usign `pivot_longer()` function. 446 ```{r} 447 output%>% 448 pivot_longer(cols = c("S","I","R"), 449 names_to="variable", 450 values_to="values") %>% 451 arrange(desc(variable)) %>% 452 head 453 ``` 454 455 --- 456 457 458 Before to proceed with the visualization of the SIR model output we do a bit of investigations. 459 460 **What if we want to see how `melt()` function works?** 461 462 **What instruments we can use to see inside the function and understand how it works?** 463 464 465 Using just the function name **melt** or `structure()` function with *melt* as an argument, we obtain the same output. To select just the argument of the function we can do `args(melt)` 466 467 ```{r} 468 reshape2:::melt 469 ``` 470 471 472 ```{r} 473 body(melt) 474 ``` 475 476 ```{r} 477 formals(melt) 478 ``` 479 480 ```{r} 481 environment(melt) 482 ``` 483 484 485 ```{r} 486 typeof(melt) 487 ``` 488 489 > "R functions simulate a closure by keeping an explicit reference to the environment that was active when the function was defined." 490 491 492 ref: [closures](https://www.r-bloggers.com/2015/03/using-closures-as-objects-in-r/) 493 494 495 496 Try with `methods()`, or `print(methods(melt))`: Non-visible functions are asterisked! 497 498 > The S3 method name is followed by an asterisk * if the method definition is not exported from the package namespace in which the method is defined. 499 500 ```{r} 501 methods("melt", data) 502 ``` 503 504 ```{r} 505 methods(class="table") 506 ``` 507 508 ```{r eval=FALSE, include=T} 509 help(UseMethod) 510 ``` 511 512 513 We can access to some of the above calls with `getAnywhere()`, for example here is done for "melt.data.frame": 514 ```{r} 515 getAnywhere("melt.data.frame") 516 ``` 517 518 519 References: 520 521 - [stackoverflow article](https://stackoverflow.com/questions/11173683/how-can-i-read-the-source-code-for-an-r-function) 522 - [Rnews bulletin: R Help Desk](https://www.r-project.org/doc/Rnews/Rnews_2006-4.pdf) 523 524 525 526 --- 527 528 Going back to out model output visualization. 529 530 ```{r} 531 output_full<- melt(output,id="time") 532 ``` 533 534 ```{r} 535 output_full$proportion<- output_full$value/sum(state_values) 536 ``` 537 538 ```{r} 539 ggplot(data = output, aes(x = time, y = I)) + 540 geom_line() + 541 xlab("Time(days)") + 542 ylab("Number of Infected") + 543 labs("SIR Model: prevalence of infection") 544 ``` 545 546 ```{r} 547 ggplot(output_full, aes(time, proportion, color = variable, group = variable)) + 548 geom_line() + 549 xlab("Time(days)") + 550 ylab("Prevalence") + 551 labs(color = "Compartment", title = "SIR Model") 552 ```