html.json (13359B)
{ "hash": "9b125f32ffe5d76475c3174fbd6c7c25", "result": { "engine": "knitr", "markdown": "---\nengine: knitr\ntitle: S3\n---\n\n# Introduction\n\n## Basics\n\n- Has class\n- Uses a generic function to decide on method\n - method = implementation for a specific class\n - dispatch = process of searching for right method\n\n## Classes\n\n**Theory:**\n\nWhat is a class?\n\n - No formal definition in S3\n - Simply set the class attribute\n\nHow to set class?\n\n - At time of object creation\n - After object creation\n \n\n::: {.cell}\n\n```{.r .cell-code}\n# at time of object creation\nx <- structure(list(), class = \"my_class\")\n\n# after object creation\nx <- list()\nclass(x) <- \"my_class\"\n```\n:::\n\n\nSome advice on style:\n\n - Rules: Can be any string\n - Advice: Consider using/including package name to avoid collision with name of another class (e.g., `blob`, which defines a single class; haven has `labelled` and `haven_labelled`)\n - Convention: letters and `_`; avoid `.` since it might be confused as separator between generic and class name\n\n**Practice:**\n\nHow to compose a class in practice?\n\n- **Constructor**, which helps the developer create new objects of the target class. Provide always.\n- **Validator**, which checks that values in constructor are valid. May not be necessary for simple classes.\n- **Helper**, which helps users create new objects of the target class. 
May be relevant only for user-facing classes.\n\n### Constructors\n\nHelp developers construct an object of the target class:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_difftime <- function(x = double(), units = \"secs\") {\n # check inputs\n # issue generic system error if unexpected type or value\n stopifnot(is.double(x))\n units <- match.arg(units, c(\"secs\", \"mins\", \"hours\", \"days\", \"weeks\"))\n\n # construct instance of target class\n structure(x,\n class = \"difftime\",\n units = units\n )\n}\n```\n:::\n\n\n### Validators\n\nContrast a constructor, aimed at quickly creating instances of a class, which only checks type of inputs ...\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_factor <- function(x = integer(), levels = character()) {\n stopifnot(is.integer(x))\n stopifnot(is.character(levels))\n\n structure(\n x,\n levels = levels,\n class = \"factor\"\n )\n}\n\n# error messages are for system default and developer-facing\nnew_factor(1:5, \"a\")\n```\n\n::: {.cell-output .cell-output-error}\n\n```\n#> Error in as.character.factor(x): malformed factor\n```\n\n\n:::\n:::\n\n\n\n... with a validator, aimed at emitting errors if inputs pose problems, which makes more expensive checks\n\n\n::: {.cell}\n\n```{.r .cell-code}\nvalidate_factor <- function(x) {\n values <- unclass(x)\n levels <- attr(x, \"levels\")\n\n if (!all(!is.na(values) & values > 0)) {\n stop(\n \"All `x` values must be non-missing and greater than zero\",\n call. = FALSE\n )\n }\n\n if (length(levels) < max(values)) {\n stop(\n \"There must be at least as many `levels` as possible values in `x`\",\n call. = FALSE\n )\n }\n\n x\n}\n\n# error messages are informative and user-facing\nvalidate_factor(new_factor(1:5, \"a\"))\n```\n\n::: {.cell-output .cell-output-error}\n\n```\n#> Error: There must be at least as many `levels` as possible values in `x`\n```\n\n\n:::\n:::\n\n\nMaybe there is a typo in the `validate_factor()` function? Do the integers need to start at 1 and be consecutive? 
\n\n* If not, then `length(levels) < max(values)` should be `length(levels) < length(values)`, right?\n* If so, why do the integers need to start at 1 and be consecutive? And if they need to be as such, we should tell the user, right?\n\nProbably not a typo: the integer values are indices into `levels`, so each one must lie between 1 and `length(levels)`. They do not need to be consecutive, but `10:12` fails below because there is no tenth level.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nvalidate_factor(new_factor(1:3, levels = c(\"a\", \"b\", \"c\")))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] a b c\n#> Levels: a b c\n```\n\n\n:::\n\n```{.r .cell-code}\nvalidate_factor(new_factor(10:12, levels = c(\"a\", \"b\", \"c\")))\n```\n\n::: {.cell-output .cell-output-error}\n\n```\n#> Error: There must be at least as many `levels` as possible values in `x`\n```\n\n\n:::\n:::\n\n\n\n### Helpers\n\nSome desired virtues:\n\n- Have the same name as the class\n- Call the constructor and validator, if the latter exists.\n- Issue informative, user-facing error messages\n- Adopt thoughtful/useful defaults or type conversion\n\n\nExercise 5 in 13.3.4\n\nQ: Read the documentation for `utils::as.roman()`. How would you write a constructor for this class? Does it need a validator? What might a helper do?\n\nA: This function transforms numeric input into Roman numbers. It is built on the integer type, which results in the following constructor.\n \n \n\n::: {.cell}\n\n```{.r .cell-code}\nnew_roman <- function(x = integer()) {\n stopifnot(is.integer(x))\n structure(x, class = \"roman\")\n}\n```\n:::\n\n\nThe documentation tells us that only values between 1 and 3899 are uniquely represented, so we check for this range in our validation function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nvalidate_roman <- function(x) {\n values <- unclass(x)\n \n if (any(values < 1 | values > 3899)) {\n stop(\n \"Roman numbers must fall between 1 and 3899.\",\n call. 
= FALSE\n )\n }\n x\n}\n```\n:::\n\n\nFor convenience, we allow the user to also pass real values to a helper function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nroman <- function(x = integer()) {\n x <- as.integer(x)\n \n validate_roman(new_roman(x))\n}\n\n# Test\nroman(c(1, 753, 2024))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] I DCCLIII MMXXIV\n```\n\n\n:::\n\n```{.r .cell-code}\nroman(0)\n```\n\n::: {.cell-output .cell-output-error}\n\n```\n#> Error: Roman numbers must fall between 1 and 3899.\n```\n\n\n:::\n:::\n\n\n\n\n## Generics and methods\n\n**Generic functions:**\n\n- Consist of a call to `UseMethod()`\n- Pass arguments from the generic to the dispatched method \"auto-magically\"\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_new_generic <- function(x) {\n UseMethod(\"my_new_generic\")\n}\n```\n:::\n\n\n### Method dispatch\n\n- `UseMethod()` creates a vector of method names\n- Dispatch \n - Walks through the candidate methods in that vector, in order\n - Selects the first one that exists\n\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- Sys.Date()\nsloop::s3_dispatch(print(x))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> => print.Date\n#> * print.default\n```\n\n\n:::\n:::\n\n\n### Finding methods\n\nWhile `sloop::s3_dispatch()` shows the method selected for a specific call, one can also list all the methods defined:\n\n- For a generic\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_methods_generic(\"mean\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 7 × 4\n#> generic class visible source \n#> <chr> <chr> <lgl> <chr> \n#> 1 mean Date TRUE base \n#> 2 mean default TRUE base \n#> 3 mean difftime TRUE base \n#> 4 mean POSIXct TRUE base \n#> 5 mean POSIXlt TRUE base \n#> 6 mean quosure FALSE registered S3method\n#> 7 mean vctrs_vctr FALSE registered S3method\n```\n\n\n:::\n:::\n\n- For a class\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_methods_class(\"ordered\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> # A tibble: 4 × 4\n#> generic 
class visible source \n#> <chr> <chr> <lgl> <chr> \n#> 1 as.data.frame ordered TRUE base \n#> 2 Ops ordered TRUE base \n#> 3 relevel ordered FALSE registered S3method\n#> 4 Summary ordered TRUE base\n```\n\n\n:::\n:::\n\n\n### Creating methods\n\nTwo rules:\n\n- Only write a method if you own the generic or the class. Otherwise, bad manners.\n- A method must have the same arguments as its generic, with one important exception: if the generic has `...`, the method may take additional arguments\n\n**Example from text:**\n\nI thought it would be good for us to work through this problem.\n\n> Carefully read the documentation for `UseMethod()` and explain why the following code returns the results that it does. What two usual rules of function evaluation does `UseMethod()` violate?\n\n\n::: {.cell}\n\n```{.r .cell-code}\ng <- function(x) {\n x <- 10\n y <- 10\n UseMethod(\"g\")\n}\ng.default <- function(x) c(x = x, y = y)\n\nx <- 1\ny <- 1\ng(x)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> x y \n#> 1 1\n```\n\n\n:::\n\n```{.r .cell-code}\ng.default(x)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> x y \n#> 1 1\n```\n\n\n:::\n:::\n\n\n\n\n**Examples caught in the wild:**\n\n- [`haven::zap_label`](https://github.com/tidyverse/haven/blob/main/R/zap_label.R), which removes column labels\n- [`dplyr::mutate`](https://github.com/tidyverse/dplyr/blob/main/R/mutate.R)\n- [`tidyr::pivot_longer`](https://github.com/tidyverse/tidyr/blob/main/R/pivot-long.R)\n\n## Object styles\n\n## Inheritance\n\nThree ideas:\n\n1. The class attribute can be a character vector of several classes\n\n::: {.cell}\n\n```{.r .cell-code}\nclass(ordered(\"x\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] \"ordered\" \"factor\"\n```\n\n\n:::\n\n```{.r .cell-code}\nclass(Sys.time())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] \"POSIXct\" \"POSIXt\"\n```\n\n\n:::\n:::\n\n2. 
Dispatch moves through class vector until it finds a defined method\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_dispatch(print(ordered(\"x\")))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> print.ordered\n#> => print.factor\n#> * print.default\n```\n\n\n:::\n:::\n\n3. Method can delegate to another method via `NextMethod()`, which is indicated by `->` as below:\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_dispatch(ordered(\"x\")[1])\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [.ordered\n#> => [.factor\n#> [.default\n#> -> [ (internal)\n```\n\n\n:::\n:::\n\n\n### `NextMethod()`\n\nConsider `secret` class that masks each character of the input with `x` in output\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_secret <- function(x = double()) {\n stopifnot(is.double(x))\n structure(x, class = \"secret\")\n}\n\nprint.secret <- function(x, ...) {\n print(strrep(\"x\", nchar(x)))\n invisible(x)\n}\n\ny <- new_secret(c(15, 1, 456))\ny\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] \"xx\" \"x\" \"xxx\"\n```\n\n\n:::\n:::\n\n\nNotice that the `[` method is problematic in that it does not preserve the `secret` class. 
Additionally, it returns `15` as the first element instead of `xx`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_dispatch(y[1])\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [.secret\n#> [.default\n#> => [ (internal)\n```\n\n\n:::\n\n```{.r .cell-code}\ny[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 15\n```\n\n\n:::\n:::\n\n\nFix this with a `[.secret` method:\n\nThe first fix (not run) is inefficient because it creates a copy of `y`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# not run\n`[.secret` <- function(x, i) {\n x <- unclass(x)\n new_secret(x[i])\n}\n```\n:::\n\n\n`NextMethod()` is more efficient.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n`[.secret` <- function(x, i) {\n # first, dispatch to `[`\n # then, coerce subset value to `secret` class\n new_secret(NextMethod())\n}\n```\n:::\n\n\nNotice that `[.secret` is selected for dispatch, but that the method delegates to the internal `[`.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsloop::s3_dispatch(y[1])\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> => [.secret\n#> [.default\n#> -> [ (internal)\n```\n\n\n:::\n\n```{.r .cell-code}\ny[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] \"xx\"\n```\n\n\n:::\n:::\n\n\n\n### Allowing subclassing\n\nContinue the example above to have a `supersecret` subclass that hides even the number of characters in the input (e.g., `123` -> `xxxxx`, 12345678 -> `xxxxx`, 1 -> `xxxxx`).\n\nTo allow for this subclass, the constructor function needs to include two additional arguments:\n\n- `...` for passing an arbitrary set of arguments to different subclasses\n- `class` for defining the subclass\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_secret <- function(x, ..., class = character()) {\n stopifnot(is.double(x))\n\n structure(\n x,\n ...,\n class = c(class, \"secret\")\n )\n}\n```\n:::\n\n\nTo create the subclass, simply invoke the parent class constructor inside of the subclass constructor:\n\n\n::: {.cell}\n\n```{.r 
.cell-code}\nnew_supersecret <- function(x) {\n new_secret(x, class = \"supersecret\")\n}\n\nprint.supersecret <- function(x, ...) {\n print(rep(\"xxxxx\", length(x)))\n invisible(x)\n}\n```\n:::\n\n\nBut this means the subclass inherits every parent method, so any parent method that returns the parent class (such as `[.secret`) must be overridden with a subclass method that returns the subclass instead.\n\nThere's no easy solution to this problem in base R.\n\nThere is a solution in the vctrs package: `vctrs::vec_restore()`\n\n<!-- TODO: read docs/vignettes to be able to summarize how this works -->\n", "supporting": [ "13_files" ], "filters": [ "rmarkdown/pagebreak.lua" ], "includes": {}, "engineDependencies": {}, "preserve": {}, "postProcess": true } }