bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

html.json (9868B)


{
  "hash": "34d644ad4ea8b8873e90265633e52f27",
  "result": {
    "engine": "knitr",
    "markdown": "---\nengine: knitr\ntitle: Trade-offs\n---\n\n## Learning objectives:\n\n- Understand the trade-offs between S3, R6 and S4\n\n- Brief intro to S7 (the object system formerly known as R7)\n\n\n## Introduction {-}\n\n* We have three OOP systems introduced so far (S3, S4, R6) \n\n* At the current time (pre-S7?) Hadley recommends using S3 by default: it's simple and widely used throughout base R and CRAN.\n\n* If you have experience in other languages, *resist* the temptation to use R6, even though it will feel more familiar!\n\n\n## S4 versus S3 {-}\n\n**Which functional object system to use, S3 or S4?**\n\n- **S3** is a simple and flexible system.\n   \n   - Good for small teams who need flexibility and immediate payoffs.\n   \n   - Commonly used throughout base R and CRAN \n   \n   - Flexibility can cause problems; more complex systems might require formal conventions\n   \n\n- **S4** is a more formal, strict system. \n\n   - Good for large projects and large teams\n   \n   - Used by the Bioconductor project\n   \n   - Requires significant up-front investment in design, but the payoff is a robust system that enforces conventions.\n   \n   - S4 documentation is challenging to use. \n    \n\n\n## R6 versus S3 {-}\n\n**R6** is built on **encapsulated objects**, rather than generic functions.   \n\n\n**Big differences: general trade-offs**\n\n* A generic is a regular function so it lives in the global namespace. An R6 method belongs to an object so it lives in a local namespace. This influences how we think about naming.\n\n* R6's reference semantics allow methods to simultaneously return a value and modify an object. This solves a painful problem called \"threading state\".\n\n* You invoke an R6 method using `$`, which is an infix operator. 
If you set up your methods correctly you can use chains of method calls as an alternative to the pipe.\n\n## Namespacing {-}\n\n**Where are methods found?**\n\n- In S3, **generic functions** are **global** and live in the **global namespace**\n\n   - Advantage: Uniform API: `summary`, `print`, `predict` etc.\n   \n   - Disadvantage: Must be careful about creating new methods!  Homonyms must be avoided: don't define `plot(bank_heist)`\n \n\n- In R6, **encapsulated methods** are **local**: they live in the **scope** of their object\n\n   - Advantage: No problems with homonyms:  meaning of `bank_heist$plot()` is clear and unambiguous.\n   \n   - Disadvantage: Lack of a uniform API, except by convention.\n   \n\n## Threading state {-}\n\n\nIn S3, the challenge is to return a value and modify the object at the same time. \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nnew_stack <- function(items = list()) {\n  structure(list(items = items), class = \"stack\")\n}\n\npush <- function(x, y) {\n  x$items <- c(x$items, list(y))\n  x\n}\n```\n:::\n\n\nNo problem with that, but what about when we want to pop a value?  We need to return two things.\n\n\n::: {.cell}\n\n```{.r .cell-code}\npop <- function(x) {\n  n <- length(x$items)\n  \n  item <- x$items[[n]]\n  x$items <- x$items[-n]\n  \n  list(item = item, x = x)\n}\n```\n:::\n\n\nThe usage is a bit awkward:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns <- new_stack()\ns <- push(s, 10)\ns <- push(s, 20)\n\nout <- pop(s)\n# Update state:\ns <- out$x\n\nprint(out$item)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 20\n```\n\n\n:::\n:::\n\n\n\nIn Python and other languages, structured binding (unpacking) makes this less awkward.  R has the {zeallot} package. 
For more, see this vignette:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nvignette('unpacking-assignment')\n```\n:::\n\n\nHowever, this is all easier in R6 due to the reference semantics!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nStack <- R6::R6Class(\"Stack\", list(\n  items = list(),\n  push = function(x) {\n    self$items <- c(self$items, x)\n    invisible(self)\n  },\n  pop = function() {\n    item <- self$items[[self$length()]]\n    self$items <- self$items[-self$length()]\n    item\n  },\n  length = function() {\n    length(self$items)\n  }\n))\n\ns <- Stack$new()\ns$push(10)\ns$push(20)\ns$pop()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 20\n```\n\n\n:::\n:::\n\n\n\n## Method chaining {-}\n\nMethod chaining composes calls from left to right.\n\nThe chaining operators:\n\n- S3: `|>` or `%>%`\n\n- R6: `$`\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns$push(44)$push(32)$pop()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n#> [1] 32\n```\n\n\n:::\n:::\n\n\n\n## Umm... what about S7 ? {-}\n\n\n::: {.cell}\n::: {.cell-output-display}\n![https://xkcd.com/927/](https://imgs.xkcd.com/comics/standards_2x.png)\n:::\n:::\n\n\n### Primary references: {-}\n\n* Docs: <https://rconsortium.github.io/S7/>\n\n* Talk by Hadley Wickham <https://www.youtube.com/watch?v=P3FxCvSueag>\n\n## S7 briefly {-}\n\n* S7 is a 'better' version of S3 with some of the 'strictness' of S4.\n\n```\n\"A little bit more complex than S3, with almost all of the features, \nall of the payoff of S4\" - rstudio::conf(2022), Hadley Wickham\n```\n* S3 + S4 = S7\n\n* Compatible with S3: S7 objects are S3 objects!  Can even extend an S3 object with S7\n\n* Somewhat compatible with S4, see the [compatibility vignette](https://rconsortium.github.io/S7/articles/compatibility.html) for details. \n\n* Helpful error messages! \n\n* Note that it was previously called R7, but it was changed to \"S7\" to better reflect that it is functional, not encapsulated! 
\n\n## Abbreviated introduction based on the vignette {-}\n\nTo install (it's now on CRAN): \n\n::: {.cell}\n\n```{.r .cell-code}\ninstall.packages(\"S7\")\n```\n:::\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(S7)\ndog <- new_class(\"dog\", properties = list(\n  name = class_character,\n  age = class_numeric\n))\ndog\n\n\n#> <S7_class>\n#> @ name  :  dog\n#> @ parent: <S7_object>\n#> @ properties:\n#>  $ name: <character>          \n#>  $ age : <integer> or <double>\n```\n:::\n\n\nNote `class_character` and `class_numeric`: these are S7 classes that correspond to the base classes.\n\nNow use it to create an object of class _dog_:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlola <- dog(name = \"Lola\", age = 11)\nlola\n\n#> <dog>\n#>  @ name: chr \"Lola\"\n#>  @ age : num 11\n```\n:::\n\n\nProperties can be set/read with `@`, with automatic validation ('safety rails') based on the type!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlola@age <- 12\nlola@age\n\n#> [1] 12\n\nlola@age <- \"twelve\"\n\n#> Error: <dog>@age must be <integer> or <double>, not <character>\n```\n:::\n\n\nNote the helpful error message!\n\nLike S3 (and S4), S7 has generics: create one with `new_generic()` and register methods for particular classes with `method<-`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nspeak <- new_generic(\"speak\", \"x\")\n\nmethod(speak, dog) <- function(x) {\n  \"Woof\"\n}\n  \nspeak(lola)\n\n#> [1] \"Woof\"\n```\n:::\n\n\nIf we have another class, we can implement the generic for that too:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ncat <- new_class(\"cat\", properties = list(\n  name = class_character,\n  age = class_double\n))\nmethod(speak, cat) <- function(x) {\n  \"Meow\"\n}\n\nfluffy <- cat(name = \"Fluffy\", age = 5)\nspeak(fluffy)\n\n#> [1] \"Meow\"\n```\n:::\n\n\nHelpful messages:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nspeak\n\n#> <S7_generic> speak(x, ...) 
with 2 methods:\n#> 1: method(speak, cat)\n#> 2: method(speak, dog)\n```\n:::\n\n\n\nS7 also works with existing S3 generics: \"most usage of S7 with S3 will just work\"\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmethod(print, cat) <- function(x, ...) {\n  print(\"I am a cat.\")\n}\n\nprint(fluffy)\n#> [1] \"I am a cat.\"\n```\n:::\n\n\n*For validators, inheritance, dynamic properties and more,  see the [vignette!](https://rconsortium.github.io/S7/articles/S7.html)*\n\n\n## So... switch to S7 ? {-}\n\n$$\n\\huge\n\\textbf{Soon}^{tm}\n$$\n\n* Not yet... still in development! ![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)\n\n* But consider trying it out:\n\n   * To stay ahead of the curve... the long-term goal is for S7 to be integrated into base R!\n   \n   * To contribute feedback to the S7 team!\n\n   * To get \"almost all\" of the benefits of S4 without the complexity!  \n   \n* In particular, if you have a new project that might require the complexity of S4, consider S7 instead!\n\n## OOP system comparison {-}\n\n| Characteristic | S3 | S4 | S7 | R6 |\n|-------|------|------|------|------|\n| _Package_ | base R | base R  | [S7](https://rconsortium.github.io/S7/)  | [R6](https://r6.r-lib.org/)  |\n| _Programming type_ | Functional | Functional | Functional | Encapsulated |\n| _Complexity_ | Low  | High  | Medium  | High  |\n| _Payoff_ | Low  | High  | High  | High  |\n| _Team size_ | Small | Small-large | Large  | ?  |\n| _Namespace_ | Global | Global?  | Global?  | Local  |\n| _Modify in place_ | No | No  | No  | Yes  |\n| _Method chaining_ | `|>` | `|>`?  | `|>`?  
| `$`  |\n| _Get/set component_ | `$` | `@` | `@` | `$` |\n| _Create class_ | `class()` or `structure()` with `class` argument | `setClass()` | `new_class()` | `R6Class()` |\n| _Create validator_ | `function()` | `setValidity()` or `validity` argument in `setClass()` | `validator` argument in `new_class()` | `$validate()` |\n| _Create generic_ | `UseMethod()` | `setGeneric()` | `new_generic()` | NA |\n| _Create method_ | `function()` assigned to `generic.method` | `setMethod()` | `method<-` | `R6Class()` |\n| _Create object_ | `class()` or `structure()` with `class` argument or constructor function | `new()` | Call the class object returned by `new_class()` | `$new()` |\n| _Additional components_ | attributes  | slots  | properties  |  |\n",
    "supporting": [
      "16_files"
    ],
    "filters": [
      "rmarkdown/pagebreak.lua"
    ],
    "includes": {},
    "engineDependencies": {},
    "preserve": {},
    "postProcess": true
  }
}