12.Rmd (7739B)
---
engine: knitr
title: Base types
---

## Learning objectives:

- Understand what OOP means--at the very least for R
- Know how to discern an object's nature--base or OO--and type

![John Chambers, creator of S programming language](images/base_types/john_chambers_about_objects.png)

<details>
<summary>Session Info</summary>
```{r}
library("DiagrammeR")
```

```{r}
utils::sessionInfo()
```

</details>

## Why OOP is hard in R

- Multiple OOP systems exist: S3, R6, S4, and (now/soon) S7.
- Multiple preferences: some users prefer one system; others, another.
- R's OOP systems are different enough that prior OOP experience may not transfer well.

[![XKCD 927](images/base_types/standards.png)](https://xkcd.com/927/)

## OOP: Big Ideas

1. **Polymorphism.** A function has a single interface (outside) but contains (inside) several class-specific implementations.
```{r, eval=FALSE}
# imagine a function with object x as an argument
# from the outside, users interact with the same function
# but inside the function, there are provisions to deal with objects of different classes
some_function <- function(x) {
  if (is.numeric(x)) {
    # implementation for numeric x
  } else if (is.character(x)) {
    # implementation for character x
  } else {
    # ... implementations for other classes
  }
}
```

<details>
<summary>Example of polymorphism</summary>

```{r polymorphism_example}
# data frame
summary(mtcars[,1:4])

# statistical model
lin_fit <- lm(mpg ~ hp, data = mtcars)
summary(lin_fit)
```

</details>

2. **Encapsulation.** An object "encapsulates"--that is, encloses in an inviolate capsule--both data and the behavior that acts on that data. Think of a REST API: a client interacts with an API only through a set of discrete endpoints (i.e., things to get or set), but the server does not otherwise give access to its internal workings or state. As with an API, this creates a separation of concerns: the object takes inputs and yields results; users only consume those results.

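
The REST API analogy above can be approximated in base R with a closure. This is a minimal sketch, not how R6 or RC are implemented; the names `make_counter`, `increment`, and `get` are hypothetical and chosen only for illustration.

```{r}
# Hypothetical constructor: the state lives in the enclosing environment,
# so callers reach it only through the returned "endpoints".
make_counter <- function(start = 0) {
  count <- start  # internal state, not a field of the returned object

  list(
    # endpoint that modifies the internal state
    increment = function(by = 1) {
      count <<- count + by
      invisible(count)
    },
    # endpoint that reads the internal state
    get = function() count
  )
}

counter <- make_counter()
counter$increment()
counter$increment(by = 2)
counter$get()

# the internal variable is not exposed as an element of the object
"count" %in% names(counter)
```

The capsule here is a convention rather than a guarantee: the state is still reachable through `environment(counter$get)`, which is one reason dedicated systems such as R6 exist.
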

## OOP: Properties

### Objects have class

- Class defines:
  - Methods (i.e., what can be done with an object)
  - Fields (i.e., data that defines an instance of the class)
- An object is an instance of a class

### Class is inherited

- Class is defined:
  - By an object's class (e.g., ordered factor)
  - By the parent of the object's class (e.g., factor)
- Inheritance matters for method dispatch
  - If a method is defined for an object's class, use that method
  - If no method is defined for an object's class, use the method of the parent class
  - The process of finding a method is called dispatch

## OOP in R: Two Paradigms

**1. Encapsulated OOP**

- Objects "encapsulate"
  - Methods (i.e., what can be done)
  - Fields (i.e., data on which things are done)
- Calls communicate this encapsulation, since form follows function
  - Form: `object.method(arg1, arg2)`
  - Function: for `object`, apply `method` for `object`'s class with arguments `arg1` and `arg2`

**2. Functional OOP**

- Methods belong to "generic" functions
- From the outside, generics look like regular functions: `generic(object, arg2, arg3)`
- From the inside, components are also functions

### Concept Map

```{r, echo = FALSE, eval = TRUE}
DiagrammeR::mermaid("
graph LR

OOP --> encapsulated_OOP
OOP --> functional_OOP

functional_OOP --> S3
functional_OOP --> S4

encapsulated_OOP --> R6
encapsulated_OOP --> RC
")
```

<details>
<summary>Mermaid code</summary>
```{r, echo = TRUE, eval = FALSE}
DiagrammeR::mermaid("
graph LR

OOP --> encapsulated_OOP
OOP --> functional_OOP

functional_OOP --> S3
functional_OOP --> S4

encapsulated_OOP --> R6
encapsulated_OOP --> RC
")
```
</details>

## OOP in base R

- **S3**
  - Paradigm: functional OOP
  - Noteworthy: R's first OOP system
  - Use case: low-cost solution for common problems
  - Downsides: no guarantees
- **S4**
  - Paradigm: functional OOP
  - Noteworthy: rewrite of S3, used by `Bioconductor`
  - Use case: "more guarantees and greater encapsulation" than S3
  - Downsides: higher setup cost than S3
- **RC**
  - Paradigm: encapsulated OOP
  - Noteworthy: a special type of S4 object that is mutable--in other words, it can be modified in place (instead of R's usual copy-on-modify behavior)
  - Use cases: problems that are hard to tackle with functional OOP (in S3 and S4)
  - Downsides: harder to reason about (because of modify-in-place logic)

## OOP in packages

- **R6**
  - Paradigm: encapsulated OOP
  - Noteworthy: resolves issues with RC
- **S7** (formerly R7)
  - Paradigm: functional OOP
  - Noteworthy:
    - best parts of S3 and S4
      - ease of S3
      - power of S4
    - See more in [rstudio::conf(2022) talk](https://www.rstudio.com/conference/2022/talks/introduction-to-r7/)
- **R.oo**
  - Paradigm: hybrid
    functional and encapsulated (?)
- **proto**
  - Paradigm: prototype OOP
  - Noteworthy: OOP style used in `ggplot2`

## How can you tell if an object is base or OOP?

### Functions

Two functions:

- `base::is.object()`, which returns `TRUE` or `FALSE` depending on whether the object is an OO object
- `sloop::otype()`, which reports the kind of object: `"base"`, `"S3"`, etc.

A few examples:

```{r}
# Example 1: a base object
is.object(1:10)
sloop::otype(1:10)

# Example 2: an OO object
is.object(mtcars)
sloop::otype(mtcars)
```

### sloop

* **S** **L**anguage **O**bject-**O**riented **P**rogramming

[![XKCD 927](images/base_types/sloop_john_b.png)](https://en.wikipedia.org/wiki/Sloop_John_B)

### Class

OO objects have a "class" attribute:

```{r}
# base object has no class
attr(1:10, "class")

# OO object has one or more classes
attr(mtcars, "class")
```

## What about types?

Only OO objects have a "class" attribute, but every object--whether base or OO--has a base type.

### Vectors

```{r}
typeof(NULL)
typeof(c("a", "b", "c"))
typeof(1L)
typeof(1i)
```

### Functions

```{r}
# "normal" function (closure)
my_fun <- function(x) { x + 1 }
typeof(my_fun)
# primitive function of type "special"
typeof(`[`)
# primitive function of type "builtin"
typeof(sum)
```

### Environments

```{r}
typeof(globalenv())
```

### S4

```{r}
mle_obj <- stats4::mle(function(x = 1) (x - 2) ^ 2)
typeof(mle_obj)
```

### Language components

```{r}
typeof(quote(a))
typeof(quote(a + 1))
typeof(formals(my_fun))
```

### Concept Map

![Base types in R](images/base_types/base_types_Sankey_graph.png)

<details>
<summary>Sankey graph code</summary>

The graph above was made with [SankeyMATIC](https://sankeymatic.com/)

```
// toggle "Show Values"
// set Default Flow Colors from "each flow's Source"

base\ntypes [8] vectors
base\ntypes [3] functions
base\ntypes [1] environments
base\ntypes [1] S4 OOP
base\ntypes [3] language\ncomponents
base\ntypes [6] C components

vectors [1] NULL
vectors [1] logical
vectors [1] integer
vectors [1] double
vectors [1] complex
vectors [1] character
vectors [1] list
vectors [1] raw

functions [1] closure
functions [1] special
functions [1] builtin

environments [1] environment

S4 OOP [1] S4

language\ncomponents [1] symbol
language\ncomponents [1] language
language\ncomponents [1] pairlist

C components [1] externalptr
C components [1] weakref
C components [1] bytecode
C components [1] promise
C components [1] ...
C components [1] any
```

</details>

## Be careful about the numeric type

1. Often "numeric" is treated as a synonym for double:

```{r}
# create double and integer objects
one <- 1
oneL <- 1L
typeof(one)
typeof(oneL)

# check their type after as.numeric()
one |> as.numeric() |> typeof()
oneL |> as.numeric() |> typeof()
```

2. In S3 and S4, "numeric" is taken to mean either integer or double when choosing methods:

```{r}
sloop::s3_class(1)
sloop::s3_class(1L)
```

3. `is.numeric()` tests whether an object behaves like a number:

```{r}
typeof(factor("x"))
is.numeric(factor("x"))
```

Advanced R, however, consistently uses "numeric" to mean an object of type integer or double.
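
The different senses of "numeric" described above can be compared side by side. This is a minimal sketch using only base R; the names `objs` and `type_summary` are illustrative and not part of the original notes.

```{r}
# a few objects whose base type is integer or double
objs <- list(
  double  = 1,
  integer = 1L,
  factor  = factor("x"),   # integer type, but an OO object
  date    = Sys.Date()     # double type, but an OO object
)

# base type, is.numeric() result, and OO status for each object
type_summary <- data.frame(
  type       = vapply(objs, typeof, character(1)),
  is_numeric = vapply(objs, is.numeric, logical(1)),
  is_object  = vapply(objs, is.object, logical(1)),
  stringsAsFactors = FALSE
)
type_summary
```

The base type alone does not say whether an object behaves like a number: the factor and the date are stored as integer and double, yet `is.numeric()` rejects both.
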