html.json (10676B)
1 { 2 "hash": "355b584146ead093c1e2e038c1794779", 3 "result": { 4 "engine": "knitr", 5 "markdown": "---\nengine: knitr\ntitle: Improving performance\n---\n\n## Overview\n\n1. Code organization\n2. Check for existing solutions\n3. Do as little as possible\n4. Vectorise\n5. Avoid Copies\n\n## Organizing code\n\n- Write a function for each approach\n\n::: {.cell}\n\n```{.r .cell-code}\nmean1 <- function(x) mean(x)\nmean2 <- function(x) sum(x) / length(x)\n```\n:::\n\n- Keep old functions that you've tried, even the failures\n- Generate a representative test case\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- runif(1e5)\n```\n:::\n\n- Use `bench::mark` to compare the different versions (and include unit tests)\n\n::: {.cell}\n\n```{.r .cell-code}\nbench::mark(\n mean1(x),\n mean2(x)\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 2 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 mean1(x) 146.4µs 166.8µs 5839.\n#> 2 mean2(x) 67.1µs 75.7µs 12950.\n```\n\n\n:::\n:::\n\n\n## Check for Existing Solution\n- CRAN task views (http://cran.rstudio.com/web/views/)\n- Reverse dependencies of Rcpp (https://cran.r-project.org/web/packages/Rcpp/)\n- Talk to others!\n - Google (rseek)\n - Stackoverflow ([R])\n - https://community.rstudio.com/\n - DSLC community\n\n## Do as little as possible\n- use a function tailored to a more specific type of input or output, or to a more specific problem\n - `rowSums()`, `colSums()`, `rowMeans()`, and `colMeans()` are faster than equivalent invocations that use `apply()` because they are vectorised\n - `vapply()` is faster than `sapply()` because it pre-specifies the output type\n - `any(x == 10)` is much faster than `10 %in% x` because testing equality is simpler than testing set inclusion\n- Some functions coerce their inputs into a specific type. If your input is not the right type, the function has to do extra work\n - e.g. `apply()` will always turn a dataframe into a matrix\n- Other examples\n - `read.csv()`: specify known column types with `colClasses`. (Also consider\n switching to `readr::read_csv()` or `data.table::fread()` which are \n considerably faster than `read.csv()`.)\n\n - `factor()`: specify known levels with `levels`.\n\n - `cut()`: don't generate labels with `labels = FALSE` if you don't need them,\n or, even better, use `findInterval()` as mentioned in the \"see also\" section\n of the documentation.\n \n - `unlist(x, use.names = FALSE)` is much faster than `unlist(x)`.\n\n - `interaction()`: if you only need combinations that exist in the data, use\n `drop = TRUE`.\n \n## Avoiding Method Dispatch\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- runif(1e2)\nbench::mark(\n mean(x),\n mean.default(x)\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 2 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 mean(x) 2.8µs 3.1µs 295383.\n#> 2 mean.default(x) 900ns 1.1µs 839532.\n```\n\n\n:::\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- runif(1e2)\nbench::mark(\n mean(x),\n mean.default(x),\n .Internal(mean(x))\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 3 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 mean(x) 2.7µs 2.9µs 310733.\n#> 2 mean.default(x) 900ns 1.1µs 849192.\n#> 3 .Internal(mean(x)) 100ns 200ns 4495392.\n```\n\n\n:::\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- runif(1e4)\nbench::mark(\n mean(x),\n mean.default(x),\n .Internal(mean(x))\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 3 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 mean(x) 16.6µs 17.3µs 55604.\n#> 2 mean.default(x) 14.4µs 14.9µs 64958.\n#> 3 .Internal(mean(x)) 13.6µs 13.7µs 70368.\n```\n\n\n:::\n:::\n\n\n## Avoiding Input Coercion\n- `as.data.frame()` is quite slow because it coerces each element into a data frame and then `rbind()`s them together\n- instead, if you have a named list with vectors of equal length, you can directly transform it into a data frame\n\n\n::: {.cell}\n\n```{.r .cell-code}\nquickdf <- function(l) {\n class(l) <- \"data.frame\"\n attr(l, \"row.names\") <- .set_row_names(length(l[[1]]))\n l\n}\nl <- lapply(1:26, function(i) runif(1e3))\nnames(l) <- letters\nbench::mark(\n as.data.frame = as.data.frame(l),\n quick_df = quickdf(l)\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 2 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 as.data.frame 589.5µs 656.2µs 1447.\n#> 2 quick_df 3.7µs 4.3µs 207929.\n```\n\n\n:::\n:::\n\n\n*Caveat!* This method is fast because it's dangerous!\n\n## Vectorise\n- vectorisation means finding the existing R function that is implemented in C and most closely applies to your problem\n- Vectorised functions that apply to many scenarios\n - `rowSums()`, `colSums()`, `rowMeans()`, and `colMeans()`\n - Vectorised subsetting can lead to big improvements in speed\n - `cut()` and `findInterval()` for converting continuous variables to categorical\n - Be aware of vectorised functions like `cumsum()` and `diff()`\n - Matrix algebra is a general example of vectorisation\n\n## Avoiding copies\n\n- Whenever you use c(), append(), cbind(), rbind(), or paste() to create a bigger object, R must first allocate space for the new object and then copy the old object to its new home.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrandom_string <- function() {\n paste(sample(letters, 50, replace = TRUE), collapse = \"\")\n}\nstrings10 <- replicate(10, random_string())\nstrings100 <- replicate(100, random_string())\ncollapse <- function(xs) {\n out <- \"\"\n for (x in xs) {\n out <- paste0(out, x)\n }\n out\n}\nbench::mark(\n loop10 = collapse(strings10),\n loop100 = collapse(strings100),\n vec10 = paste(strings10, collapse = \"\"),\n vec100 = paste(strings100, collapse = \"\"),\n check = FALSE\n)[c(\"expression\", \"min\", \"median\", \"itr/sec\", \"n_gc\")]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 4 × 4\n#> expression min median `itr/sec`\n#> <bch:expr> <bch:tm> <bch:tm> <dbl>\n#> 1 loop10 17.1µs 18µs 52743.\n#> 2 loop100 460.3µs 491.5µs 1959.\n#> 3 vec10 2.7µs 2.9µs 317912.\n#> 4 vec100 16.3µs 18.5µs 51297.\n```\n\n\n:::\n:::\n\n\n## Case study: t-test\n\n\n::: {.cell}\n\n```{.r .cell-code}\nm <- 1000\nn <- 50\nX <- matrix(rnorm(m * n, mean = 10, sd = 3), nrow = m)\ngrp <- rep(1:2, each = n / 2)\n```\n:::\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# formula interface\nsystem.time(\n for (i in 1:m) {\n t.test(X[i, ] ~ grp)$statistic\n }\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> user system elapsed \n#> 0.28 0.00 0.33\n```\n\n\n:::\n\n```{.r .cell-code}\n# provide two vectors\nsystem.time(\n for (i in 1:m) {\n t.test(X[i, grp == 1], X[i, grp == 2])$statistic\n }\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> user system elapsed \n#> 0.08 0.00 0.08\n```\n\n\n:::\n:::\n\n\nAdd functionality to save values\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncompT <- function(i){\n t.test(X[i, grp == 1], X[i, grp == 2])$statistic\n}\nsystem.time(t1 <- purrr::map_dbl(1:m, compT))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> user system elapsed \n#> 0.09 0.00 0.09\n```\n\n\n:::\n:::\n\n\nIf you look at the source code of `stats:::t.test.default()`, you’ll see that it does a lot more than just compute the t-statistic.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Do less work\nmy_t <- function(x, grp) {\n t_stat <- function(x) {\n m <- mean(x)\n n <- length(x)\n var <- sum((x - m) ^ 2) / (n - 1)\n list(m = m, n = n, var = var)\n }\n g1 <- t_stat(x[grp == 1])\n g2 <- t_stat(x[grp == 2])\n se_total <- sqrt(g1$var / g1$n + g2$var / g2$n)\n (g1$m - g2$m) / se_total\n}\nsystem.time(t2 <- purrr::map_dbl(1:m, ~ my_t(X[.,], grp)))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> user system elapsed \n#> 0.02 0.00 0.01\n```\n\n\n:::\n\n```{.r .cell-code}\nstopifnot(all.equal(t1, t2))\n```\n:::\n\n\nThis gives us a six-fold speed improvement!\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Vectorise it\nrowtstat <- function(X, grp){\n t_stat <- function(X) {\n m <- rowMeans(X)\n n <- ncol(X)\n var <- rowSums((X - m) ^ 2) / (n - 1)\n list(m = m, n = n, var = var)\n }\n g1 <- t_stat(X[, grp == 1])\n g2 <- t_stat(X[, grp == 2])\n se_total <- sqrt(g1$var / g1$n + g2$var / g2$n)\n (g1$m - g2$m) / se_total\n}\nsystem.time(t3 <- rowtstat(X, grp))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> user system elapsed \n#> 0.02 0.00 0.02\n```\n\n\n:::\n\n```{.r .cell-code}\nstopifnot(all.equal(t1, t3))\n```\n:::\n\n\n1000 times faster than when we started!\n\n## Other techniques\n* [Read R blogs](http://www.r-bloggers.com/) to see what performance\n problems other people have struggled with, and how they have made their\n code faster.\n\n* Read other R programming books, like The Art of R Programming or Patrick Burns'\n [_R Inferno_](http://www.burns-stat.com/documents/books/the-r-inferno/) to\n learn about common traps.\n\n* Take an algorithms and data structure course to learn some\n well known ways of tackling certain classes of problems. I have heard\n good things about Princeton's\n [Algorithms course](https://www.coursera.org/course/algs4partI) offered on\n Coursera.\n \n* Learn how to parallelise your code. Two places to start are\n Parallel R and Parallel Computing for Data Science\n\n* Read general books about optimisation like Mature optimisation\n or the Pragmatic Programmer\n \n* Read more R code. StackOverflow, R Mailing List, DSLC, GitHub, etc.\n", 6 "supporting": [ 7 "24_files" 8 ], 9 "filters": [ 10 "rmarkdown/pagebreak.lua" 11 ], 12 "includes": {}, 13 "engineDependencies": {}, 14 "preserve": {}, 15 "postProcess": true 16 } 17 }