bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git
Log | Files | Refs | README | LICENSE

commit b71d07101c9b29942bac8aa7283d4e25e4dcd8a2
parent d680e86b39b971a5e236ca006248b9d105e2e5c7
Author: Arthur Shaw <47256431+arthur-shaw@users.noreply.github.com>
Date:   Fri, 16 Sep 2022 11:26:33 -0400

Draft notes (#27)


Diffstat:
M13_S3.Rmd | 274+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 269 insertions(+), 5 deletions(-)

diff --git a/13_S3.Rmd b/13_S3.Rmd @@ -1,13 +1,277 @@ # S3 -**Learning objectives:** +## Basics -- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY +- Has class +- Uses a generic function to decide on method + - method = implementation for a specific class + - dispatch = process of searching for right method -## SLIDE 1 +## Classes -- ADD SLIDES AS SECTIONS (`##`). -- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF. +### Theory + +What is class? + + - No formal definition in S3 + - Simply set class attribute + +How to set class? + + - At time of object creation + - After object creation + +```{r} +# at time of object creation +x <- structure(list(), class = "my_class") + +# after object creation +x <- list() +class(x) <- "my_class" +``` + +Some advice on style: + + - Rules: Can be any string + - Advice: Consider using/including package name to avoid collision with name of another class (e.g., `blob`, which defines a single class; haven has `labelled` and `haven_labelled`) + - Convention: letters and `_`; avoid `.` since it might be confused as separator between generic and class name + +### Practice + +How to compose a class in practice? + +- **Constructor**, which helps the developer create new object of target class. Provide always. +- **Validator**, which checks that values in constructor are valid. May not be necessary for simple classes. +- **Helper**, which helps users create new objects of target class. May be relevant only for user-facing classes. + +### Constructors + +Help developers construct an object of the target class: + +```{r} +new_difftime <- function(x = double(), units = "secs") { + # check inputs + # issue generic system error if unexpected type or value + stopifnot(is.double(x)) + units <- match.arg(units, c("secs", "mins", "hours", "days", "weeks")) + + # construct instance of target class + structure(x, + class = "difftime", + units = units + ) +} +``` + +### Validators + +Contrast a constructor, aimed at quickly creating instances of a class, which only checks type of inputs ... + +```{r} +new_factor <- function(x = integer(), levels = character()) { + stopifnot(is.integer(x)) + stopifnot(is.character(levels)) + + structure( + x, + levels = levels, + class = "factor" + ) +} + +# error messages are for system default and developer-facing +new_factor(1:5, "a") +``` + + +... with a validator, aimed at emitting errors if inputs pose problems, which makes more expensive checks + +```{r} +validate_factor <- function(x) { + values <- unclass(x) + levels <- attr(x, "levels") + + if (!all(!is.na(values) & values > 0)) { + stop( + "All `x` values must be non-missing and greater than zero", + call. = FALSE + ) + } + + if (length(levels) < max(values)) { + stop( + "There must be at least as many `levels` as possible values in `x`", + call. = FALSE + ) + } + + x +} + +# error messages are informative and user-facing +validate_factor(new_factor(1:5, "a")) +``` + +### Helpers + +Some desired virtues: + +- Have the same name as the class +- Call the constructor and validator, if the latter exists. +- Issue error informative, user-facing error messages +- Adopt thoughtful/useful defaults or type conversion + +## Generics and methods + +### Generic functions + +- Consist of a call to `UseMethod()` +- Pass arguments from the generic to the dispatched method "auto-magically" + +```{r} +my_new_generic <- function(x) { + UseMethod("my_new_generic") +} +``` + +### Method dispatch + +- `UseMethod()` creates a vector of method names +- Dispatch + - Examines all methods in the vector + - Selects a method + +```{r} +x <- Sys.Date() +sloop::s3_dispatch(print(x)) +``` + +### Finding methods + +While `sloop::s3_dispatch()` gives the specific method selected for a specific call, on can see the methods defined: + +- For a generic +```{r} +sloop::s3_methods_generic("mean") +``` +- For a class +```{r} +sloop::s3_methods_class("ordered") +``` + +### Creating methods + +Two rules: + +- Only write a method if you own the generic. Otherwise, bad manners. +- Method must have same arguments as its generic--with one important exception: `...` + +### Examples caught in the wild + +- [`haven::zap_label`](https://github.com/tidyverse/haven/blob/main/R/zap_label.R), which removes column labels +- [`dplyr::mutate`](https://github.com/tidyverse/dplyr/blob/main/R/mutate.R) +- [`tidyr::pivot_longer`](https://github.com/tidyverse/tidyr/blob/main/R/pivot-long.R) + +## Inheritance + +Three ideas: + +1. Class is a vector of classes +```{r} +class(ordered("x")) +class(Sys.time()) +``` +2. Dispatch moves through class vector until it finds a defined method +```{r} +sloop::s3_dispatch(print(ordered("x"))) +``` +3. Method can delegate to another method via `NextMethod()`, which is indicated by `<-` as below: +```{r} +sloop::s3_dispatch(ordered("x")[1]) +``` + +### NextMethod() + +Consider `secret` class that masks each character of the input with `x` in output + +```{r} +new_secret <- function(x = double()) { + stopifnot(is.double(x)) + structure(x, class = "secret") +} + +print.secret <- function(x, ...) { + print(strrep("x", nchar(x))) + invisible(x) +} + +x <- new_secret(c(15, 1, 456)) +x +``` + +Notice that the `[` method is problematic in that it does not preserve the `secret` class + +```{r} +sloop::s3_dispatch(x[1]) +``` + +Fix this with a `[.secret` method: + +```{r} +`[.secret` <- function(x, i) { + # first, dispatch to `[` + # then, coerce subset value to `secret` class + new_secret(NextMethod()) +} +``` + +Notice that `[.secret` is selected for dispatch, but that the method delegates to the internal `[` + +```{r} +sloop::s3_dispatch(x[1]) +``` + +### Allowing subclassing + +Continue the example above to have a `supersecret` subclass that hides even the number of characters in the input (e.g., `123` -> `xxxxx`, 12345678 -> `xxxxx`, 1 -> `xxxxx`). + +To allow for this subclass, the constructor function needs to include two additional arguments: + +- `...` for passing an arbitrary set of arguments to different subclasses +- `class` for defining the subclass + +```{r} +new_secret <- function(x, ..., class = character()) { + stopifnot(is.double(x)) + + structure( + x, + ..., + class = c(class, "secret") + ) +} +``` + +To create the subclass, simply invoke the parent class constructor inside of the subclass constructor: + +```{r} +new_supersecret <- function(x) { + new_secret(x, class = "supersecret") +} + +print.supersecret <- function(x, ...) { + print(rep("xxxxx", length(x))) + invisible(x) +} +``` + +But this means the subclass inherits all parent methods and needs to overwrite all parent methods with subclass methods that return the sublclass rather than the parent class. + +There's no easy solution to this problem in base R. + +There is a solution in the vectors package: `vctrs::vec_restore()` + +<!-- TODO: read docs/vignettes to be able to summarize how this works --> ## Meeting Videos