---
engine: knitr
title: Debugging
---

## Learning objectives:

- Learn a general strategy for finding and fixing errors.

- Explore the `traceback()` function to locate exactly where an error occurred.

- Explore how to pause the execution of a function and launch an environment where we can interactively explore what’s happening.

- Explore debugging when you’re running code non-interactively.

- Explore non-error problems that occasionally also need debugging.

## Introduction {-}

> Finding a bug in code is a process of confirming the many things that we believe are true, until we find one which is not true.

**-Norm Matloff**

> Debugging is like being the detective in a crime movie where you're also the murderer.

**-Filipe Fortes**

### Strategies for finding and fixing errors {-}

#### Google! {-}
Whenever you see an error message, start by googling it. We can automate this process with the [{errorist}](https://github.com/coatless-rpkg/errorist) and [{searcher}](https://github.com/coatless-rpkg/searcher) packages.

#### Make it repeatable {-}
To find the root cause of an error, you’re going to need to execute the code many times as you consider and reject hypotheses. It’s worth some upfront investment to make the problem both easy and fast to reproduce.

#### Figure out where it is {-}
To find the bug, adopt the scientific method: **generate hypotheses**, **design experiments to test them**, and **record your results**. This may seem like a lot of work, but a systematic approach will end up saving you time.

#### Fix it and test it {-}
Once you’ve found the bug, you need to figure out how to fix it and check that the fix actually worked. It’s very useful to have automated tests in place.

## Locating errors {-}
The most important tool for finding errors is `traceback()`, which shows you the sequence of calls (also known as the **call stack**) that led to the error.
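
To make the idea of a call stack concrete outside RStudio, here is a minimal sketch. It assumes a hypothetical chain of four small functions (`f()`, `g()`, `h()`, `i()`) whose bodies are written here purely for illustration, and it records the stack with base R's `sys.calls()` inside a calling handler. This is only one way to inspect the stack in a script; `traceback()` itself is meant to be called interactively right after an error.

```r
# Hypothetical call chain: f() calls g() calls h() calls i()
f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  # i() refuses non-numeric input
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}

# Record the call stack at the moment the error is signalled,
# then let tryCatch() turn the error into its message.
captured <- NULL
result <- tryCatch(
  withCallingHandlers(
    f("a"),
    error = function(e) captured <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# One line of text per frame, outermost call first
stack <- vapply(as.list(captured), function(cl) deparse(cl)[1], character(1))
print(stack)
```

Note that `sys.calls()` lists the outermost call first, whereas `traceback()` prints the most recent call at the top. The printed stack also contains frames for `tryCatch()`, `withCallingHandlers()`, and the condition-handling machinery in addition to the four user functions.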

- Here’s a simple example where `f()` calls `g()` calls `h()` calls `i()`, which checks if its argument is numeric:

 
When we run `f("a")` in RStudio, we see:

 


If you click **“Show traceback”** you see:
 


You read the `traceback()` output from bottom to top: the initial call is `f()`, which calls `g()`, then `h()`, then `i()`, which triggers the error.

## Lazy evaluation {-}
One drawback to `traceback()` is that it always **linearises** the call tree, which can be confusing if there is much lazy evaluation involved. For example, in the following code the error happens when evaluating the first argument to `f()`:

 

 

Note: `rlang::with_abort()` is no longer exported from {rlang}. There is an [open issue](https://github.com/hadley/adv-r/issues/1740) about a fix for the chapter, but no drop-in replacement.


## Interactive debugger {-}
The easiest way to enter the interactive debugger is with RStudio’s **“Rerun with Debug”** tool. This reruns the command that created the error, pausing execution where the error occurred. Otherwise, you can insert a call to `browser()` where you want to pause, and re-run the function.

 

`browser()` is just a regular function call, which means that you can run it conditionally by wrapping it in an `if` statement:

 


## `browser()` commands {-}
`browser()` provides a few special commands.

 

- Next, `n`: executes the next step in the function.

- Step into, `s`: works like next, but if the next step is a function, it will step into that function so you can explore it interactively.

- Finish, `f`: finishes execution of the current loop or function.

- Continue, `c`: leaves interactive debugging and continues regular execution of the function.

- Stop, `Q`: stops debugging, terminates the function, and returns to the global workspace.
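
To tie these commands to something you can try, here is a minimal sketch of a conditional `browser()` call. The functions `g()` and `h()` and the `b < 0` condition are hypothetical, chosen only for illustration; the extra `interactive()` guard is an assumption added so the same code runs unchanged in scripts, where there is nobody to answer the debugger prompt.

```r
# Hypothetical helper: adds 10 to its input
h <- function(c) c + 10

g <- function(b) {
  # Pause only for the suspicious case, and only when a user is present
  if (b < 0 && interactive()) {
    browser()
  }
  h(b)
}

g(5)  # condition is FALSE, so this runs straight through and returns 15

# In an interactive session, g(-1) would stop at the browser prompt,
# where n, s, f, c and Q behave as described in the list above.
```

Because `browser()` is an ordinary call, the condition can be any R expression, which gives a rough substitute for the conditional breakpoints that some other debuggers offer.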


## Alternatives {-}
There are three alternatives to using `browser()`: setting breakpoints in RStudio, `options(error = recover)`, and `debug()` and other related functions.

## Breakpoints {-}
In RStudio, you can set a breakpoint by clicking to the left of the line number, or pressing **Shift + F9**. There are two small downsides to breakpoints:

- There are a few unusual situations in which breakpoints will not work. [Read breakpoint troubleshooting for more details](https://support.posit.co/hc/en-us/articles/200534337-Breakpoint-Troubleshooting).

- RStudio currently does not support conditional breakpoints.

## `recover()` {-}
After you set `options(error = recover)`, any error opens an interactive prompt that displays the traceback and lets you interactively debug inside any of the frames:

 
You can return to default error handling with `options(error = NULL)`.

## `debug()` {-}

Another approach is to call a function that inserts the `browser()` call:

- `debug()` inserts a browser statement in the first line of the specified function. `undebug()` removes it.

- `utils::setBreakpoint()` works similarly, but instead of taking a function name, it takes a file name and line number and finds the appropriate function for you.


## Call stack {-}
The call stacks printed by `traceback()`, `browser()` & `where`, and `recover()` are not consistent.

 

RStudio displays calls in the same order as `traceback()`. rlang functions use the same ordering and numbering as `recover()`, but also use indenting to reinforce the hierarchy of calls.

## Non-interactive debugging {-}

When you can’t explore interactively...

### `callr::r()` {-}

`callr::r(f, list(1, 2))` calls `f(1, 2)` in a fresh session to help diagnose:

- Is the global environment different? Have you loaded different packages? Are objects left from previous sessions causing differences?

- Is the working directory different?

- Is the `PATH` environment variable different?

- Is the `R_LIBS` environment variable different?

### `dump.frames()` {-}

`dump.frames()` is the equivalent of `recover()` for non-interactive code.

 

### Print debugging {-}

Insert numerous print statements to precisely locate the problem and see the values of important variables. Print debugging is particularly useful for compiled code.

 


### RMarkdown {-}

- If you’re knitting the file using RStudio, switch to calling `rmarkdown::render("path/to/file.Rmd")` instead to run the code in the current session.

- For interactive debugging, you’ll need to call `sink()` in the error handler. For example, to use `recover()` with RMarkdown, you’d put the following code in your setup block:

{height="110"}



## Non-error failures {-}
There are other ways for a function to fail apart from throwing an error:

- A function may generate an unexpected warning. Convert warnings into errors with `options(warn = 2)` and use the call stack to find where the warning comes from.

- A function may generate an unexpected message. The book's solution relied on `rlang::with_abort()` to turn messages into errors; since `with_abort()` has been removed from {rlang}, that solution no longer works.

- A function might never return.

- The worst scenario is that your code might crash R completely, leaving you with no way to interactively debug your code. This indicates a bug in compiled (C or C++) code.

## Links to some useful resources on debugging {-}

- Jenny Bryan's ["Object of type closure is not subsettable"](https://github.com/jennybc/debugging#readme) talk from rstudio::conf 2020

- Jenny Bryan and Jim Hester's book: ["What They Forgot to Teach You About R"](https://rstats.wtf/debugging-r) Ch12

- Hadley's video on a [minimal reprex for a shiny app](https://www.youtube.com/watch?v=9w8ANOAlWy4)