22.Rmd - bookclub-advr - DSLC Advanced R Book Club

---
engine: knitr
title: Debugging
---

## Learning objectives:

- General strategy for finding and fixing errors.

- Explore the `traceback()` function to locate exactly where an error occurred.

- Explore how to pause the execution of a function and launch an environment where we can interactively explore what’s happening.

- Explore debugging when you’re running code non-interactively.

- Explore non-error problems that occasionally also need debugging.

## Introduction {-}

> Finding a bug in code is a process of confirming the many things that we believe are true, until we find one which is not true.

**-Norm Matloff**

> Debugging is like being the detective in a crime movie where you're also the murderer.

**-Filipe Fortes**

### Strategies for finding and fixing errors {-}

#### Google! {-}
Whenever you see an error message, start by googling it. We can automate this process with the [{errorist}](https://github.com/coatless-rpkg/errorist) and [{searcher}](https://github.com/coatless-rpkg/searcher) packages.

#### Make it repeatable {-}
To find the root cause of an error, you’re going to need to execute the code many times as you consider and reject hypotheses. It’s worth some upfront investment to make the problem both easy and fast to reproduce.

#### Figure out where it is {-}
To find the bug, adopt the scientific method: **generate hypotheses**, **design experiments to test them**, and **record your results**. This may seem like a lot of work, but a systematic approach will end up saving you time.

#### Fix it and test it {-}
Once you’ve found the bug, you need to figure out how to fix it and to check that the fix actually worked. It’s very useful to have automated tests in place.

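As a minimal sketch of such a test, assume a hypothetical function `add_ten()` whose bug (silently accepting non-numeric input) has just been fixed. Base R's `stopifnot()` can pin down both the fixed behaviour and the normal case; a real project would more likely keep these checks in a {testthat} file.

```r
# Hypothetical function under test: the fix added the is.numeric() check.
add_ten <- function(x) {
  if (!is.numeric(x)) {
    stop("`x` must be numeric", call. = FALSE)
  }
  x + 10
}

# Regression test: the bad input must now raise an error ...
bad_input <- tryCatch(add_ten("a"), error = function(e) e)
stopifnot(inherits(bad_input, "error"))

# ... and the normal case must still work.
stopifnot(identical(add_ten(1), 11))
```
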
## Locating errors {-}
The most important tool for finding errors is `traceback()`, which shows you the sequence of calls (also known as the **call stack**) that led to the error.

Here’s a simple example where `f()` calls `g()`, which calls `h()`, which calls `i()`, which checks if its argument is numeric:

![](images/locating-errors.png)

When we run `f("a")` in RStudio we see:

![](images/fa.png)

If you click **“Show traceback”** you see:

![](images/options.png)

You read the `traceback()` output from bottom to top: the initial call is `f()`, which calls `g()`, then `h()`, then `i()`, which triggers the error.

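For readers without the screenshots, the example can be sketched as follows. The definitions follow the corresponding example in *Advanced R*; treat them as an assumption about what `images/locating-errors.png` shows.

```r
f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}

# Interactively you would run f("a") and then call traceback().
# Here the error is caught instead, so the script keeps running.
err <- tryCatch(f("a"), error = function(e) conditionMessage(e))
print(err)
```
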
## Lazy evaluation {-}
One drawback to `traceback()` is that it always **linearises** the call tree, which can be confusing if there is much lazy evaluation involved. For example, in the following code the error happens when evaluating the first argument to `f()`:

![](images/lazy-evaluation.png)

![](images/traceback.png)

Note: `rlang::with_abort()` is no longer exported from {rlang}. There is an [open issue](https://github.com/hadley/adv-r/issues/1740) about a fix for the chapter, but no drop-in replacement.

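The linearisation can be seen without {rlang} by recording `sys.calls()` at the moment the error is signalled. This is only a base R sketch: `j()` and `k()` follow the example in *Advanced R*, and the `stack` variable and the handler are illustrative additions, not a replacement for `with_abort()`.

```r
f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) stop("`d` must be numeric", call. = FALSE)
  d + 10
}
j <- function() k()
k <- function() stop("Oops!", call. = FALSE)

# Record the call stack at the moment the error is signalled.
stack <- NULL
result <- tryCatch(
  withCallingHandlers(
    f(j()),
    error = function(e) {
      stack <<- vapply(sys.calls(), function(call) deparse(call)[1], character(1))
    }
  ),
  error = function(e) conditionMessage(e)
)

# j() and k() are listed after i(c): the argument is only evaluated once
# i() forces it, so the flat stack hides that j() was written as an
# argument to f().
print(stack)
```
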
## Interactive debugger {-}
The easiest way to enter the interactive debugger is with RStudio’s **“Rerun with Debug”** tool. This reruns the command that created the error, pausing execution where the error occurred. Otherwise, you can insert a call to `browser()` where you want to pause, and re-run the function.

![](images/browser.png)

`browser()` is just a regular function call, which means that you can run it conditionally by wrapping it in an `if` statement:

![](images/browser2.png)

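Both variants can be sketched as below. The function `h()` is a stand-in and `g_always()` is a hypothetical name used only to keep the two versions apart; the screenshots may differ in detail.

```r
h <- function(c) c + 10  # stand-in for the h() used in the screenshots

# Unconditional: pauses every time the function is called.
g_always <- function(b) {
  browser()
  h(b)
}

# Conditional: pauses only for the inputs you are suspicious of.
g <- function(b) {
  if (b < 0) {
    browser()
  }
  h(b)
}

g(1)  # the condition is FALSE, so this runs straight through and returns 11
```

Calling `g(-1)` in an interactive session would stop at the `Browse[1]>` prompt instead.
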
## `browser()` commands {-}
`browser()` provides a few special commands.

![](images/debug-toolbar.png)

- Next, `n`: executes the next step in the function.

- Step into, `s`: works like next, but if the next step is a function, it will step into that function so you can explore it interactively.

- Finish, `f`: finishes execution of the current loop or function.

- Continue, `c`: leaves interactive debugging and continues regular execution of the function.

- Stop, `Q`: stops debugging, terminates the function, and returns to the global workspace.

## Alternatives {-}
There are three alternatives to using `browser()`: setting breakpoints in RStudio, `options(error = recover)`, and `debug()` and other related functions.

## Breakpoints {-}
In RStudio, you can set a breakpoint by clicking to the left of the line number, or pressing **Shift + F9**. There are two small downsides to breakpoints:

- There are a few unusual situations in which breakpoints will not work. [Read breakpoint troubleshooting for more details](https://support.posit.co/hc/en-us/articles/200534337-Breakpoint-Troubleshooting).

- RStudio currently does not support conditional breakpoints.

## `recover()` {-}
After you set `options(error = recover)`, any error gives you an interactive prompt that displays the traceback and lets you interactively debug inside any of the frames:

![](images/recover.png)

You can return to default error handling with `options(error = NULL)`.

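A slightly more defensive pattern, sketched here as a suggestion rather than something the chapter prescribes, is to save the previous handler and restore it afterwards instead of resetting to `NULL`:

```r
# Save the current error handler, then switch to recover().
old <- options(error = recover)

# ... run the failing code interactively; on error, enter a frame number
# to browse that frame, or 0 to exit ...

# Restore whatever handler was in place before.
options(old)
```
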
## `debug()` {-}

Another approach is to call a function that inserts the `browser()` call:

- `debug()` inserts a browser statement in the first line of the specified function. `undebug()` removes it.

- `utils::setBreakpoint()` works similarly, but instead of taking a function name, it takes a file name and line number and finds the appropriate function for you.

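A short sketch of the `debug()` / `undebug()` cycle, using a hypothetical function `f()`. `isdebugged()` reports whether the flag is currently set, so the snippet runs without actually opening the browser.

```r
f <- function(x) x + 1          # hypothetical function to debug

debug(f)                        # every call to f() now starts in the browser
was_flagged <- isdebugged(f)    # TRUE

undebug(f)                      # remove the flag again
still_flagged <- isdebugged(f)  # FALSE

f(1)                            # runs normally once the flag is removed
```
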
## Call stack {-}
The call stacks printed by `traceback()`, `browser()` & `where`, and `recover()` are not consistent.

![](images/print-debug.png)

RStudio displays calls in the same order as `traceback()`. {rlang} functions use the same ordering and numbering as `recover()`, but also use indenting to reinforce the hierarchy of calls.

## Non-interactive debugging {-}

When you can’t explore interactively...

### `callr::r()` {-}

`callr::r(f, list(1, 2))` calls `f(1, 2)` in a fresh session to help diagnose:

- Is the global environment different? Have you loaded different packages? Are objects left from previous sessions causing differences?

- Is the working directory different?

- Is the `PATH` environment variable different?

- Is the `R_LIBS` environment variable different?

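The checklist above can be turned into a quick comparison between the current session and a fresh one. This is a sketch that assumes {callr} is installed; `session_info()` is a hypothetical helper written for this example, not part of any package.

```r
if (requireNamespace("callr", quietly = TRUE)) {
  # Hypothetical helper: collect the settings listed above.
  session_info <- function() {
    list(
      wd       = getwd(),
      path     = Sys.getenv("PATH"),
      r_libs   = Sys.getenv("R_LIBS"),
      attached = search()
    )
  }

  # Run the helper in a fresh R session ...
  fresh <- callr::r(session_info)

  # ... and compare with the current session.
  current <- session_info()
  print(fresh$wd == current$wd)
  print(setdiff(current$attached, fresh$attached))
}
```
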
### `dump.frames()` {-}

`dump.frames()` is the equivalent of `recover()` for non-interactive code.

![](images/non-interractive-debugging.png)

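The pattern, following the version given in *Advanced R* (assumed to be what the screenshot shows), is to dump the frames to a file in the batch process and inspect them later:

```r
# In the batch R process ----
dump_and_quit <- function() {
  # Save debugging info to the file last.dump.rda
  dump.frames(to.file = TRUE)
  # Quit R with a non-zero exit status
  q(status = 1)
}
options(error = dump_and_quit)

# In a later interactive session ----
# load("last.dump.rda")
# debugger()
```
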
### Print debugging {-}

Insert numerous print statements to precisely locate the problem, and see the values of important variables. Print debugging is particularly useful for compiled code.

![](images/print-debugging.png)

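A minimal illustration with a hypothetical function `mean_ratio()`: each `cat()` call marks how far execution got and what the intermediate values were.

```r
mean_ratio <- function(x, y) {   # hypothetical function being investigated
  cat("mean_ratio(): start\n")
  ratio <- x / y
  cat("mean_ratio(): ratio =", format(ratio), "\n")
  result <- mean(ratio)
  cat("mean_ratio(): result =", format(result), "\n")
  result
}

mean_ratio(c(2, 4), c(1, 2))
```
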
### RMarkdown {-}

- If you’re knitting the file using RStudio, switch to calling `rmarkdown::render("path/to/file.Rmd")` instead to run the code in the current session.

- For interactive debugging, you’ll need to call `sink()` in the error handler. For example, to use `recover()` with RMarkdown, you’d put the following code in your setup block:

![](images/print-recover.png){height="110"}

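In text form, the handler from *Advanced R* (assumed to match the screenshot) looks like this:

```r
options(error = function() {
  sink()     # stop diverting output so the recover() prompt is visible
  recover()
})
```
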
## Non-error failures {-}
There are other ways for a function to fail apart from throwing an error:

- A function may generate an unexpected warning. Convert warnings into errors with `options(warn = 2)` and use the call stack.

- A function may generate an unexpected message. The removal of `with_abort()` from {rlang} breaks this solution.

- A function might never return.

- The worst scenario is that your code might crash R completely, leaving you with no way to interactively debug your code. This indicates a bug in compiled (C or C++) code.

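The first two cases can be sketched in base R. `noisy()` and `chatty()` are hypothetical functions, and the calling handler for messages is one possible workaround, not the approach the chapter originally used.

```r
noisy <- function() {    # hypothetical function with a stray warning
  as.numeric("abc")
}

# Unexpected warning: promote warnings to errors while the code runs.
old <- options(warn = 2)
warn_result <- tryCatch(noisy(), error = function(e) conditionMessage(e))
options(old)

chatty <- function() {   # hypothetical function with a stray message
  message("hello")
  invisible(1)
}

# Unexpected message: promote messages to errors with a calling handler.
msg_result <- tryCatch(
  withCallingHandlers(
    chatty(),
    message = function(m) {
      stop("Unexpected message: ", conditionMessage(m), call. = FALSE)
    }
  ),
  error = function(e) conditionMessage(e)
)
```

In interactive use you would drop the `tryCatch()` wrappers and call `traceback()` after the resulting error to see where the warning or message came from.
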
## Links to some useful resources on debugging {-}

- Jenny Bryan's ["Object of type closure is not subsettable"](https://github.com/jennybc/debugging#readme) talk from rstudio::conf 2020

- Jenny Bryan and Jim Hester's book: ["What They Forgot to Teach You About R"](https://rstats.wtf/debugging-r) Ch12

- Hadley's video on a [minimal reprex for a shiny app](https://www.youtube.com/watch?v=9w8ANOAlWy4)