bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git
16.Rmd (8605B)


---
engine: knitr
title: Trade-offs
---

## Learning objectives:

- Understand the trade-offs between S3, R6 and S4

- Brief intro to S7 (the object system formerly known as R7)

## Introduction {-}

* We have been introduced to three OOP systems so far (S3, S4, R6).

* For now (pre-S7?), Hadley recommends using S3 by default: it's simple and widely used throughout base R and CRAN.

* If you have experience in other languages, *resist* the temptation to use R6 even though it will feel more familiar!

## S4 versus S3 {-}

**Which functional object system to use, S3 or S4?**

- **S3** is a simple and flexible system.

   - Good for small teams who need flexibility and immediate payoffs.

   - Commonly used throughout base R and CRAN.

   - Flexibility can cause problems; more complex systems may need formal conventions.

- **S4** is a more formal, strict system.

   - Good for large projects and large teams.

   - Used by the Bioconductor project.

   - Requires significant up-front investment in design, but the payoff is a robust system that enforces conventions.

   - S4 documentation is challenging to use.

## R6 versus S3 {-}

**R6** is built on **encapsulated objects**, rather than generic functions.

**Big differences: general trade-offs**

* A generic is a regular function so it lives in the global namespace. An R6 method belongs to an object so it lives in a local namespace. This influences how we think about naming.

* R6's reference semantics allow methods to simultaneously return a value and modify an object. This solves a painful problem called "threading state".

* You invoke an R6 method using `$`, which is an infix operator. If you set up your methods correctly you can use chains of method calls as an alternative to the pipe.

## Namespacing {-}

**Where are methods found?**

- In S3, **generic functions** are **global**: they live in the **global namespace**.

   - Advantage: Uniform API: `summary`, `print`, `predict` etc.

   - Disadvantage: You must be careful when creating new methods! Homonyms must be avoided: don't define `plot(bank_heist)`.

- In R6, **encapsulated methods** are **local**: they belong to objects, each with its own **scope**.

   - Advantage: No problems with homonyms: the meaning of `bank_heist$plot()` is clear and unambiguous.

   - Disadvantage: Lack of a uniform API, except by convention.

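To make the "uniform API" point concrete, here is a minimal base R sketch. The generic `area()` and the classes `circle` and `square` are hypothetical names made up for this illustration; they do not come from the book.

```r
# One generic, defined once and visible to every caller
area <- function(shape, ...) UseMethod("area")

# Each class supplies its own method; dispatch picks one by class
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2

# Hypothetical example objects
shapes <- list(
  structure(list(r = 1), class = "circle"),
  structure(list(side = 2), class = "square")
)

# The same call works for every class that has a method
areas <- vapply(shapes, area, numeric(1))
```

Because `area` is a single shared name, a second package that wanted `area()` to mean something unrelated would collide with it. That is the homonym problem described above.
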
## Threading state {-}

In S3, the challenge is to return a value and modify the object at the same time.

```{r}
new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}

# Returns a modified copy of the stack with `y` on top
push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}
```

No problem with that, but what about when we want to pop a value? We need to return two things: the popped item and the updated stack.

```{r}
pop <- function(x) {
  n <- length(x$items)

  item <- x$items[[n]]
  x$items <- x$items[-n]

  # Bundle both results, since a function returns a single value
  list(item = item, x = x)
}
```

The usage is a bit awkward:

```{r}
s <- new_stack()
s <- push(s, 10)
s <- push(s, 20)

out <- pop(s)
# Update state:
s <- out$x

print(out$item)
```

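The reassignments are needed because S3 objects follow R's usual copy-on-modify semantics. The sketch below repeats the `new_stack()` and `push()` definitions so that it stands alone; `n_discarded` and `n_kept` are illustrative names only.

```r
new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}

push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}

s <- new_stack()

# Calling push() without reassigning: the modified copy is thrown away
invisible(push(s, 10))
n_discarded <- length(s$items)

# Reassigning threads the new state back into `s`
s <- push(s, 10)
n_kept <- length(s$items)
```

Here `n_discarded` is 0 and `n_kept` is 1: the caller has to carry ("thread") the updated object forward by hand.
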
In Python and other languages, structured binding (destructuring assignment) makes this less awkward. R has the {zeallot} package. For more, see this vignette:

```{r 16-Trade-offs-5, eval=FALSE}
vignette("unpacking-assignment", package = "zeallot")
```

However, this is all easier in R6 due to the reference semantics!

```{r}
Stack <- R6::R6Class("Stack", list(
  items = list(),
  # Modifies the object in place; returns it invisibly to allow chaining
  push = function(x) {
    self$items <- c(self$items, x)
    invisible(self)
  },
  # Modifies the object in place and returns the popped item
  pop = function() {
    item <- self$items[[self$length()]]
    self$items <- self$items[-self$length()]
    item
  },
  length = function() {
    length(self$items)
  }
))

s <- Stack$new()
s$push(10)
s$push(20)
s$pop()
```

## Method chaining {-}

Method chaining is useful for composing function calls from left to right.

Operators used:

- S3: `|>` or `%>%`

- R6: `$`

```{r}
s$push(44)$push(32)$pop()
```

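For comparison, a minimal sketch of the S3 side using the native pipe (assuming R 4.1 or later). The `new_stack()` and `push()` definitions are repeated so the block stands alone; `s3` and `top` are illustrative names.

```r
new_stack <- function(items = list()) {
  structure(list(items = items), class = "stack")
}

push <- function(x, y) {
  x$items <- c(x$items, list(y))
  x
}

# Each push() returns a new stack, which the pipe passes to the next call
s3 <- new_stack() |> push(44) |> push(32)

# Peek at the most recently pushed item
top <- s3$items[[length(s3$items)]]
```

Chaining works here because `push()` returns the stack. A `pop()` that returns both the item and the stack does not compose this way, which is where the R6 version has the edge.
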
## Umm... what about S7? {-}

```{r standards, echo = FALSE, fig.cap = "https://xkcd.com/927/"}

knitr::include_graphics("https://imgs.xkcd.com/comics/standards_2x.png")

```

### Primary references: {-}

* Docs: <https://rconsortium.github.io/S7/>

* Talk by Hadley Wickham: <https://www.youtube.com/watch?v=P3FxCvSueag>

## S7 briefly {-}

* S7 is a 'better' version of S3 with some of the 'strictness' of S4.

```
"A little bit more complex than S3, with almost all of the features, 
all of the payoff of S4" - rstudio::conf(2022), Hadley Wickham
```
* S3 + S4 = S7

* Compatible with S3: S7 objects are S3 objects! You can even extend an S3 object with S7.

* Somewhat compatible with S4; see the [compatibility vignette](https://rconsortium.github.io/S7/articles/compatibility.html) for details.

* Helpful error messages!

* Note that it was previously called R7, but the name was changed to "S7" to better reflect that it is functional, not encapsulated!

## Abbreviated introduction based on the vignette {-}

To install (it's now on CRAN):

```{r, eval=FALSE}
install.packages("S7")
```


```{r, eval=FALSE}
library(S7)
dog <- new_class("dog", properties = list(
  name = class_character,
  age = class_numeric
))
dog


#> <S7_class>
#> @ name  :  dog
#> @ parent: <S7_object>
#> @ properties:
#>  $ name: <character>          
#>  $ age : <integer> or <double>
```

Note `class_character` and `class_numeric`: these are S7 classes corresponding to the base classes.

Now use it to create an object of class _dog_:

```{r, eval = FALSE}
lola <- dog(name = "Lola", age = 11)
lola

#> <dog>
#>  @ name: chr "Lola"
#>  @ age : num 11
```

Properties can be set/read with `@`, with automatic validation ('safety rails') based on the type!

```{r, eval = FALSE}

lola@age <- 12
lola@age

#> [1] 12

lola@age <- "twelve"

#> Error: <dog>@age must be <integer> or <double>, not <character>

```

Note the helpful error message!

Like S3 (and S4), S7 has generics. Create one with `new_generic()` and register methods for particular classes with `method()`:

```{r, eval = FALSE}
speak <- new_generic("speak", "x")

method(speak, dog) <- function(x) {
  "Woof"
}
  
speak(lola)

#> [1] "Woof"
```

If we have another class, we can implement the generic for that too:

```{r, eval = FALSE}
# Note: an S7 class is itself a function (its constructor), so naming it
# `cat` masks base::cat() for the rest of the session
cat <- new_class("cat", properties = list(
  name = class_character,
  age = class_double
))
method(speak, cat) <- function(x) {
  "Meow"
}

fluffy <- cat(name = "Fluffy", age = 5)
speak(fluffy)

#> [1] "Meow"
```

Helpful messages:

```{r, eval = FALSE}
speak

#> <S7_generic> speak(x, ...) with 2 methods:
#> 1: method(speak, cat)
#> 2: method(speak, dog)
```


"most usage of S7 with S3 will just work"

For example, a method for the S3 generic `print` can be registered on an S7 class:

```{r, eval = FALSE}
# Match the signature of the S3 generic print(x, ...)
method(print, cat) <- function(x, ...) {
  print("I am a cat.")
}

print(fluffy)
#> [1] "I am a cat."

```

*For validators, inheritance, dynamic properties and more, see the [vignette!](https://rconsortium.github.io/S7/articles/S7.html)*

## So... switch to S7? {-}

$$
\huge
\textbf{Soon}^{tm}
$$

* Not yet... still in development! ![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)

* But consider trying it out:

   * To stay ahead of the curve... the aim is for S7 to be integrated into base R someday!

   * To contribute feedback to the S7 team!

   * To get "almost all" of the benefits of S4 without the complexity!

* In particular, if you have a new project that might require the complexity of S4, consider S7 instead!

## OOP system comparison {-}

| Characteristic | S3 | S4 | S7 | R6 |
|-------|------|------|------|------|
| _Package_ | base R | base R  | [S7](https://rconsortium.github.io/S7/)  | [R6](https://r6.r-lib.org/)  |
| _Programming type_ | Functional | Functional | Functional | Encapsulated |
| _Complexity_ | Low  | High  | Medium  | High  |
| _Payoff_ | Low  | High  | High  | High  |
| _Team size_ | Small | Small-large | Large  | ?  |
| _Namespace_ | Global | Global?  | Global?  | Local  |
| _Modify in place_ | No | No  | No  | Yes  |
| _Method chaining_ | `|>` | `|>`?  | `|>`?  | `$`  |
| _Get/set component_ | `$` | `@` | `@` | `$` |
| _Create class_ | `class()` or `structure()` with `class` argument | `setClass()` | `new_class()` | `R6Class()` |
| _Create validator_ | `function()` | `setValidity()` or `validity` argument in `setClass()` | `validator` argument in `new_class()` | User-defined method, e.g. `$validate()` |
| _Create generic_ | `UseMethod()` | `setGeneric()` | `new_generic()` | NA |
| _Create method_ | `function()` assigned to `generic.class` | `setMethod()` | `method()` | `R6Class()` |
| _Create object_ | `class()` or `structure()` with `class` argument or constructor function | `new()` | Call the class constructor returned by `new_class()` | `$new()` |
| _Additional components_ | attributes  | slots  | properties  |  |

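As a rough illustration of the S3 column, here is a base R sketch of a constructor, a validator, a generic and a method. The names `new_dog()`, `validate_dog()` and `speak()` are hypothetical, chosen to mirror the S7 example earlier in this chapter; S3 itself enforces none of these conventions.

```r
# Create class / create object: a constructor built on structure()
new_dog <- function(name = character(), age = numeric()) {
  structure(list(name = name, age = age), class = "dog")
}

# Create validator: a plain function that checks the object and returns it
validate_dog <- function(x) {
  if (!is.character(x$name) || length(x$name) != 1) {
    stop("`name` must be a single string", call. = FALSE)
  }
  if (!is.numeric(x$age) || length(x$age) != 1 || x$age < 0) {
    stop("`age` must be a single non-negative number", call. = FALSE)
  }
  x
}

# Create generic / create method
speak <- function(x, ...) UseMethod("speak")
speak.dog <- function(x, ...) "Woof"

lola <- validate_dog(new_dog("Lola", 11))
sound <- speak(lola)

# Unlike S7, nothing runs the validator automatically; it must be called
bad <- tryCatch(
  validate_dog(new_dog("Lola", "eleven")),
  error = function(e) conditionMessage(e)
)
```

The contrast with S7 is that its property types and `validator` are checked by the system on construction and on `@<-`, whereas the S3 version relies on everyone remembering to call `validate_dog()`.
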