bookclub-advr

DSLC Advanced R Book Club
git clone https://git.eamoncaddigan.net/bookclub-advr.git

commit d680e86b39b971a5e236ca006248b9d105e2e5c7
parent 8b2d993208a8a0ca7c641956eed8d731e75651a6
Author: Arthur Shaw <47256431+arthur-shaw@users.noreply.github.com>
Date:   Wed,  7 Sep 2022 10:51:49 -0400

Ch 12 oop intro (#26)

* Add sloop to so can inspect objects

* Draft notes
Diffstat:
M 12_Base_types.Rmd | 219 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
M DESCRIPTION | 3 ++-
2 files changed, 217 insertions(+), 5 deletions(-)

diff --git a/12_Base_types.Rmd b/12_Base_types.Rmd
@@ -2,12 +2,223 @@
 **Learning objectives:**
-- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
+- Understand what OOP means--at the very least for R
+- Know how to discern an object's nature--base or OO--and type
+
+## Why OOP is hard in R
+
+- Multiple OOP systems exist: S3, R6, S4, and (now/soon) R7.
+- Multiple preferences: some users prefer one system; others, another.
+- R's OOP systems are different enough that prior OOP experience may not transfer well.
+
+## What is OOP in general?
+
+### Two big ideas
+
+1. **Polymorphism.** A function has a single interface (outside) but contains (inside) several class-specific implementations.
+```{r, eval=FALSE}
+# imagine a function with object x as an argument
+# from the outside, users interact with the same function
+# but inside the function, there are provisions to deal with objects of different classes
+some_function <- function(x) {
+  if (is.numeric(x)) {
+    # implementation for numeric x
+  } else if (is.character(x)) {
+    # implementation for character x
+  } # ...
+}
+```
+2. **Encapsulation.** An object "encapsulates"--that is, encloses in an inviolate capsule--both data and the code that acts on that data. Think of a REST API: a client interacts with an API only through a set of discrete endpoints (i.e., things to get or set), but the server does not otherwise give access to its internal workings or state. As with an API, this creates a separation of concerns: the object's methods take inputs and yield results; users only consume those results.
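A minimal sketch of the two big ideas above, offered as an illustration and not part of the committed notes. The names `describe` and `make_counter` are hypothetical. The first half shows polymorphism through S3 dispatch instead of an explicit `if`/`else` chain; the second half approximates encapsulation with a plain closure, which is not one of R's OOP systems but shows state reachable only through a small interface.

```r
# Polymorphism: one generic, with class-specific methods chosen by dispatch.
describe <- function(x) UseMethod("describe")  # hypothetical generic
describe.numeric <- function(x) paste("a number with mean", mean(x))
describe.character <- function(x) paste("text with", length(x), "element(s)")
describe.default <- function(x) "something else"

describe(c(1, 2, 3))    # uses describe.numeric
describe(c("a", "b"))   # uses describe.character
describe(TRUE)          # falls back to describe.default

# Encapsulation (closure-based approximation): `count` lives in the
# enclosing environment and is reachable only via the returned functions.
make_counter <- function() {
  count <- 0
  list(
    increment = function() {
      count <<- count + 1
      invisible(count)
    },
    value = function() count
  )
}

counter <- make_counter()
counter$increment()
counter$increment()
counter$value()
```

Callers of `describe()` never choose an implementation themselves, and callers of `counter` never touch `count` directly; those are the two properties the notes describe.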
+
+### A few practical ideas
+
+#### Objects have class
+
+- Class defines:
+  - Methods (i.e., what can be done with the object)
+  - Fields (i.e., data that defines an instance of the class)
+- An object is an instance of a class
+
+#### Class is inherited
+
+- Class is defined:
+  - By an object's class (e.g., ordered factor)
+  - By the parent of the object's class (e.g., factor)
+- Inheritance matters for method dispatch
+  - If a method is defined for an object's class, use that method
+  - If the object's class doesn't have a method, use the method of the parent class
+  - The process of finding a method is called dispatch
+
+## What about OOP in R?
+
+### Two main paradigms
+
+**1. Encapsulated OOP**
+
+- Objects "encapsulate"
+  - Methods (i.e., what can be done)
+  - Fields (i.e., data on which things are done)
+- Calls communicate this encapsulation, since form follows function
+  - Form: `object.method(arg1, arg2)`
+  - Function: for `object`, apply `method` for `object`'s class with arguments `arg1` and `arg2`
+
+**2. Functional OOP**
+
+- Methods belong to "generic" functions
+- From the outside, calls look like regular function calls: `generic(object, arg2, arg3)`
+- From the inside, components are also functions
+
+### Overview of the systems
+
+In base R:
+
+- **S3**
+  - Paradigm: functional OOP
+  - Noteworthy: R's first OOP system
+  - Use case: low-cost solution for common problems
+  - Downsides: no guarantees
+- **S4**
+  - Paradigm: functional OOP
+  - Noteworthy: formal rewrite of S3
+  - Use case: "more guarantees and greater encapsulation" than S3
+  - Downsides: higher setup cost than S3
+- **RC**
+  - Paradigm: encapsulated OOP
+  - Noteworthy: a special type of S4 object that is mutable--in other words, that can be modified in place (instead of following R's usual copy-on-modify behavior)
+  - Use cases: problems that are hard to tackle with functional OOP (in S3 and S4)
+  - Downsides: harder to reason about (because of modify-in-place logic)
+
+In packages:
+
+- **R6**
+  - Paradigm: encapsulated OOP
+  - Noteworthy: resolves some important issues with RC
+- **R7**
+  - Paradigm: functional OOP
+  - Noteworthy:
+    - best parts of S3 and S4
+    - ease of S3
+    - power of S4
+    - See more in [rstudio::conf(2022) talk](https://www.rstudio.com/conference/2022/talks/introduction-to-r7/)
+- **R.oo**
+  - Paradigm: hybrid functional and encapsulated (?)
+- **proto**
+  - Paradigm: prototype OOP
+  - Noteworthy: OOP style once used in ggplot2
+
+## How can you tell if an object is base or OOP?
+
+### Functions
+
+Two functions:
+
+- `base::is.object()`, which returns TRUE/FALSE according to whether the object is an OO object
+- `sloop::otype()`, which reports which kind of object it is: `"base"`, `"S3"`, etc.
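For readers without sloop installed, the same question can be roughly approximated with base R and the methods package. The sketch below is not part of the committed notes and is not sloop's implementation; `guess_system`, `ToyS4`, and `ToyCounter` are hypothetical names made up for illustration.

```r
library(methods)

# Rough base-R approximation of "which system does this object belong to?"
guess_system <- function(x) {
  if (!is.object(x)) {
    "base"
  } else if (!isS4(x)) {
    "S3 or R6"  # R6 objects are built on S3, so they land here too
  } else if (is(x, "refClass")) {
    "RC"        # RC objects are a special kind of S4 object
  } else {
    "S4"
  }
}

# One toy object per system (all names are illustrative assumptions)
s3_obj <- structure(list(), class = "toy")
setClass("ToyS4", slots = c(x = "numeric"))
s4_obj <- new("ToyS4", x = 1)
ToyCounter <- setRefClass("ToyCounter", fields = list(n = "numeric"))
rc_obj <- ToyCounter$new(n = 0)

guess_system(1:10)
guess_system(s3_obj)
guess_system(s4_obj)
guess_system(rc_obj)
```

The order of the checks matters: RC objects also satisfy `isS4()`, so the `refClass` test has to come before the plain S4 branch.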
+
+A few examples:
+
+```{r}
+# Example 1: a base object
+is.object(1:10)
+sloop::otype(1:10)
+
+# Example 2: an OO object
+is.object(mtcars)
+sloop::otype(mtcars)
+```
+
+### Class
+
+OO objects have a "class" attribute:
+
+```{r}
+# base object has no class attribute
+attr(1:10, "class")
+
+# OO object has one or more classes
+attr(mtcars, "class")
+```
+
+## What about types?
+
+Only OO objects have a "class" attribute, but every object--whether base or OO--has a base type
+
+### Vectors
+
+```{r}
+typeof(NULL)
+typeof(c("a", "b", "c"))
+typeof(1L)
+typeof(1i)
+```
+
+
+### Functions
+
+```{r}
+# "normal" function
+my_fun <- function(x) { x + 1 }
+typeof(my_fun)
+# internal function
+typeof(`[`)
+# primitive function
+typeof(sum)
+```
+
+### Environments
+
+```{r}
+typeof(globalenv())
+```
-## SLIDE 1
-- ADD SLIDES AS SECTIONS (`##`).
-- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
+### S4
+
+```{r}
+mle_obj <- stats4::mle(function(x = 1) (x - 2) ^ 2)
+typeof(mle_obj)
+```
+
+
+### Language components
+
+```{r}
+typeof(quote(a))
+typeof(quote(a + 1))
+typeof(formals(my_fun))
+```
+
+## Be careful about the numeric type
+
+1. Often "numeric" is treated as synonymous with double:
+
+```{r}
+# create a double and an integer object
+one <- 1
+oneL <- 1L
+typeof(one)
+typeof(oneL)
+
+# check their type after as.numeric()
+one |> as.numeric() |> typeof()
+oneL |> as.numeric() |> typeof()
+```
+
+2. In S3 and S4, "numeric" is taken as either integer or double when choosing methods:
+
+```{r}
+sloop::s3_class(1)
+sloop::s3_class(1L)
+```
+
+3. `is.numeric()` tests whether an object behaves like a number
+
+```{r}
+typeof(factor("x"))
+is.numeric(factor("x"))
+```
+
+But Advanced R consistently uses numeric to mean an object of integer or double type.
 ## Meeting Videos
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -28,4 +28,5 @@ Imports:
     ggplot2,
     patchwork,
     memoise,
-    purrr
+    purrr,
+    sloop
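The second point in the numeric-type section of the notes (S3 treats "numeric" as integer or double when choosing methods) can be checked with base R alone. This sketch is not part of the commit; the generic `kind` and its methods are hypothetical names used only for illustration.

```r
# A hypothetical generic with a method for "numeric" and a fallback.
kind <- function(x) UseMethod("kind")
kind.numeric <- function(x) "numeric method"
kind.default <- function(x) "default method"

kind(1)            # double: dispatches to kind.numeric
kind(1L)           # integer: also dispatches to kind.numeric
kind("a")          # character: falls through to kind.default
kind(factor("x"))  # integer type, but class "factor": kind.default
```

The factor case lines up with the third point in that section: its base type is integer, yet it carries a class attribute, so dispatch follows the class and never reaches the "numeric" method.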