bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

23.Rmd (3100B)


---
engine: knitr
title: Measuring performance
---

## Learning objectives:

- Understand how to find the parts of your code that make it slow
- Learn which tools are available for measuring performance
- Practice profiling and microbenchmarking your code


## Introduction

> "Before you can make your code faster, you first need to figure out what’s making it slow."


```{r echo=FALSE, fig.align='center',fig.cap="SLOW DOWN TO LEARN HOW TO CODE FASTER | credits: packtpub.com"}
knitr::include_graphics("images/23_code_faster.jpeg")
```


- **profile** your code: measure the run-time of each line of code using realistic inputs
- **experiment** with alternatives to find faster code
- **microbenchmark** to measure the difference in performance.



## Profiling

```{r message=FALSE, warning=FALSE, paged.print=FALSE}
library(profvis)
library(bench)
```


The tool to use is a **profiler**. R's profiler is a **sampling** profiler: it stops the execution of code every few milliseconds and records the call stack, that is, which function is currently running and which functions called it.

Example:

```{r}
f <- function() {
  pause(0.1)
  g()
  h()
}
g <- function() {
  pause(0.1)
  h()
}
h <- function() {
  pause(0.1)
}
```
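
Profile the execution of `f()`:

- `profvis::pause()` is used instead of `Sys.sleep()`, because `Sys.sleep()` does not show up in profiling output
- profile `f()` with `utils::Rprof()`

```{r}
tmp <- tempfile()
Rprof(tmp, interval = 0.1)  # start profiling, sampling every 0.1 seconds
f()
Rprof(NULL)                 # stop profiling
writeLines(readLines(tmp))  # print the recorded call stacks
```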


**Visualising profiles**

Visualising a profile makes it easier to build up a mental model of what you need to change:

    profvis::profvis()
    utils::summaryRprof()

```{r}
source("scripts/profiling-example.R")
profvis(f())
```
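
For a text summary instead of the interactive view, `utils::summaryRprof()` aggregates the samples written by `Rprof()`. The sketch below is only an illustration under stated assumptions: `busy_wait()`, `slow_step()`, `fast_step()` and `run_all()` are hypothetical helpers made up for this example (a busy loop stands in for real work), and the exact timings will differ between machines.

```r
# Hypothetical helper: spin until `seconds` have passed, so that the time
# is visible to the sampling profiler (unlike Sys.sleep())
busy_wait <- function(seconds) {
  end <- Sys.time() + seconds
  while (Sys.time() < end) {}
  invisible(NULL)
}

# Hypothetical functions with different costs
slow_step <- function() busy_wait(0.2)
fast_step <- function() busy_wait(0.05)
run_all <- function() {
  slow_step()
  fast_step()
}

prof_file <- tempfile()
Rprof(prof_file, interval = 0.01)  # sample the call stack every 10 ms
run_all()
Rprof(NULL)                        # stop profiling

# Aggregate the samples: time per function, including the functions it calls
prof_summary <- summaryRprof(prof_file)
head(prof_summary$by.total)
```

`by.total` counts the time spent in a function and everything it calls, while `by.self` counts only the time spent in the function itself, so `run_all()` should rank high in the first table but not in the second.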

**Memory profiling and the garbage collector**

Profiling a loop that modifies an existing variable:

```{r}
profvis::profvis({
  x <- integer()
  for (i in 1:1e4) {
    x <- c(x, i)
  }
})
```

You can find the source of the problem by looking at the memory column, which shows where large amounts of memory are allocated and freed. In this case, **copy-on-modify** is the cause: each iteration of the loop creates another copy of `x`.
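
One common way to avoid the repeated copies is to allocate the full vector once and fill it in place. The following is a minimal sketch, not part of the original notes: `grow_vector()` and `fill_vector()` are hypothetical names, and the comparison assumes the {bench} package is installed. Actual numbers depend on the machine.

```r
# Hypothetical version 1: grow the vector by one element per iteration
grow_vector <- function(n) {
  x <- integer()
  for (i in seq_len(n)) {
    x <- c(x, i)  # builds a new, longer vector on every iteration
  }
  x
}

# Hypothetical version 2: allocate once, then assign by index
fill_vector <- function(n) {
  x <- integer(n)
  for (i in seq_len(n)) {
    x[i] <- i
  }
  x
}

# Both versions return the same result
identical(grow_vector(1e4), fill_vector(1e4))

# Compare run time and allocated memory
cmp <- bench::mark(
  grow_vector(1e4),
  fill_vector(1e4)
)
cmp[c("expression", "median", "mem_alloc")]
```

The `mem_alloc` column is the place to look here: the growing version is expected to allocate far more memory in total than the preallocated one.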
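

**Limitations**

- Profiling does not extend to C code: you can see that your R code calls C/C++ code, but not which functions are called inside it
- Anonymous functions make it hard to figure out which function is being called; naming your functions helps
- Because of lazy evaluation, arguments are often evaluated inside another function, which complicates the call stack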
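
The last limitation can be seen without a profiler. In this small sketch (the functions `i()` and `j()` are hypothetical and exist only for illustration), `i()` inspects the call stack with `sys.calls()` at the moment it runs. Because the argument of `j()` is evaluated lazily, `i()` runs while `j()` is already on the stack, which is also how it would appear in a profile.

```r
# i() returns the names of the functions on the call stack when it runs
i <- function() {
  calls <- as.list(sys.calls())
  vapply(calls, function(cl) deparse(cl[[1]])[1], character(1))
}

# j() does not touch its argument until it returns it
j <- function(x) {
  x
}

stack <- j(i())
"j" %in% stack  # TRUE: i() was evaluated inside j(), not before the call
```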
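

### Exercise

Profile `f()` with `torture = TRUE`. What is surprising? The help page and source code of `rm()` explain what is going on.

```{r eval=FALSE}
f <- function(n = 1e5) {
  x <- rep(1, n)
  rm(x)
}

# Call f() inside profvis() so that its execution is actually profiled
profvis::profvis(f(), torture = TRUE)
```

    ?rm()

[solution](https://advanced-r-solutions.rbind.io/measuring-performance.html)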

## Microbenchmarking


A **microbenchmark** is a *measurement of the performance of a very small piece of code*. It is useful for comparing small snippets of code that perform the same specific task.

```{r echo=FALSE, fig.align='center',fig.cap = "Credits: Google search-engine"}
knitr::include_graphics("images/23_microbenchmarking.jpeg")
```


The {bench} package uses a high-precision timer.

    bench::mark()


```{r}
library(bench)
x <- runif(100)
(lb <- bench::mark(
  sqrt(x),
  x ^ 0.5
))
```

- The distribution of timings tends to be heavily right-skewed, which is why you should avoid comparing means; focus on the minimum and the median instead


```{r}
library(ggbeeswarm)  # needed by the default plot type for bench::mark() results
plot(lb)
```
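
To read the result as numbers rather than as a plot, it can help to keep only the summary columns and to look at the raw timings. This is a hedged sketch that repeats the benchmark above so it runs on its own; the variable `timings` is introduced here for illustration, and the values will differ on every run.

```r
library(bench)

x <- runif(100)
lb <- bench::mark(
  sqrt(x),
  x^0.5
)

# Keep only the columns that summarise the timings
lb[c("expression", "min", "median", "itr/sec", "n_gc")]

# Raw timings (in seconds) for the first expression, sqrt(x)
timings <- as.numeric(lb$time[[1]])

# With a right-skewed distribution the mean is pulled up by a few slow runs
c(mean = mean(timings), median = median(timings))
```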


## Resources

- [profvis package](https://rstudio.github.io/profvis/)
- [bench package](https://cran.r-project.org/web/packages/bench/bench.pdf)
- [solutions](https://advanced-r-solutions.rbind.io/measuring-performance.html)