commit e2dff30303ebdd2389ee59d3f57a7934f4745f35
parent cc279c49c1522df914e326a6e22ad282ed84ffac
Author: Arthur Shaw <47256431+arthur-shaw@users.noreply.github.com>
Date:   Wed,  8 Jun 2022 11:01:10 -0400
Draft notes on chapter 3. (#8)
* Draft notes on chapter 3.
* Update README and GHA to latest standards.
* Move knitr_opts to index.Rmd.
* Ignore transient html files during build.
This prevents weird things from getting checked in if a build fails.
Co-authored-by: Jon Harmon <jonthegeek@gmail.com>
Diffstat:
11 files changed, 781 insertions(+), 16 deletions(-)
diff --git a/.github/workflows/deploy_bookdown.yml b/.github/workflows/deploy_bookdown.yml
@@ -1,6 +1,8 @@
 on:
   push:
     branches: main
+    paths-ignore:
+      - 'README.md'
   workflow_dispatch:
 
 name: renderbook
diff --git a/.github/workflows/pr_check.yml b/.github/workflows/pr_check.yml
@@ -1,10 +1,11 @@
+name: pr_check
 on:
   pull_request:
     branches: main
+    paths-ignore:
+      - 'README.md'
   workflow_dispatch:
 
-name: pr_check
-
 jobs:
   bookdown:
     name: pr_check_book
diff --git a/.github/workflows/pr_check_readme.yml b/.github/workflows/pr_check_readme.yml
@@ -0,0 +1,14 @@
+name: pr_check
+on:
+  pull_request:
+    branches: main
+    paths:
+      - 'README.md'
+  workflow_dispatch:
+
+jobs:
+  bookdown:
+    name: pr_check_book
+    runs-on: ubuntu-latest
+    steps:
+      - run: 'echo "No build required" '
diff --git a/.gitignore b/.gitignore
@@ -10,3 +10,4 @@ bookclub-advr.html
 bookclub-advr.knit.md
 bookclub-advr_files
 libs
+*.html
diff --git a/03_Vectors.Rmd b/03_Vectors.Rmd
@@ -2,12 +2,740 @@
 
 **Learning objectives:**
 
-- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
+- Learn about different types of vectors
+- Learn how these types relate to one another
 
-## SLIDE 1
+## Types of vectors
+
+The family tree of vectors:
+
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+- **Atomic.** Elements all the same type.
+- **List.** Elements can be of different types.
+- **NULL.** No elements. Length zero.
+
+## Atomic vectors
+
+### Types
+
+- The vector family tree revisited. 
+- Meet the children of atomic vectors
+
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+### Length one
+
+"Scalars" that consist of a single value.
+
+```{r vec_lgl}
+# Logicals
+lgl1 <- TRUE
+lgl2 <- T
+```
+
+```{r vec_dbl}
+# Doubles
+# integer, decimal, scientific, or hexadecimal format
+dbl1 <- 1
+dbl2 <- 1.234
+dbl3 <- 1.234e0
+dbl4 <- 0xcafe
+```
+
+```{r vec_int}
+# Integers
+# Note: the L suffix denotes an integer; integers cannot have a fractional part
+int1 <- 1L
+int2 <- 1234L
+int3 <- 1e4L
+int4 <- 0xcafeL
+```
+
+```{r vec_str}
+# Strings
+str1 <- "hello" # double quotes
+str2 <- 'hello' # single quotes
+str3 <- "مرحبًا" # Unicode
+str4 <- "\U0001f605" # sweaty_smile
+```
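+
+A minimal sketch, not taken from the book, that checks the type of a few scalars with `typeof()`. The `example_*` names are made up for illustration.
+
+```{r vec_typeof_sketch}
+# each literal form produces a length-one vector of a specific type
+example_lgl <- TRUE
+example_dbl <- 0xcafe   # hexadecimal literal, still a double
+example_int <- 0xcafeL  # the L suffix makes it an integer
+example_chr <- "hello"
+
+typeof(example_lgl)
+typeof(example_dbl)
+typeof(example_int)
+typeof(example_chr)
+
+# scalars are just vectors of length one
+length(example_chr)
+```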
+
+### Longer
+
+Several ways to make longer:
+
+**1. With single values**
+
+```{r long_single}
+lgl_vec <- c(TRUE, FALSE)
+
+```
+
+
+**2. With other vectors**
+
+```{r long_vec}
+c(c(1, 2), c(3, 4))
+```
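+
+As a rough illustration, `c()` flattens when its inputs are atomic vectors, so combining two length-two vectors gives one vector of four elements rather than two nested ones. The name `flat_vec` is hypothetical.
+
+```{r long_vec_flat_sketch}
+flat_vec <- c(c(1, 2), c(3, 4))
+
+# one flat double vector of length 4
+typeof(flat_vec)
+length(flat_vec)
+identical(flat_vec, c(1, 2, 3, 4))
+```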
+
+**See also**
+
+`{rlang}` has [vector constructor functions too](https://rlang.r-lib.org/reference/vector-construction.html):
+
+- `rlang::lgl(...)`
+- `rlang::int(...)`
+- `rlang::dbl(...)`
+- `rlang::chr(...)`
+
+They appear to do both more and less than `c()`.
+
+- More: 
+  - Enforce type
+  - Splice lists
+  - More types: `rlang::bytes()`, `rlang::cpl(...)`
+- Less: 
+  - Stricter rules on names
+
+Note: these constructors currently have a `questioning` lifecycle badge, since they may be moved to `vctrs`.
+
+### Missing values
+
+**Contagion**
+
+For most computations, an operation on values that include a missing value yields a missing value (unless you're careful).
+
+```{r na_contagion}
+# contagion
+5*NA
+sum(c(1, 2, NA, 3))
+
+# inoculate
+sum(c(1, 2, NA, 3), na.rm = TRUE)
+
+```
+**Types**
+
+Each type has its own missing value:
+
+- Logical: `NA`
+- Integer: `NA_integer_`
+- Double: `NA_real_`
+- Character: `NA_character_`
+
+This may not matter in many contexts.
+
+But this does matter for operations where types matter like `dplyr::if_else()`.
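+
+A small base R sketch of the typed missing values. It only shows that each `NA` constant has its own type and that `c()` coerces a bare `NA` to match its neighbours; the `na_types` name is made up.
+
+```{r na_types_sketch}
+na_types <- c(
+  logical   = typeof(NA),
+  integer   = typeof(NA_integer_),
+  double    = typeof(NA_real_),
+  character = typeof(NA_character_)
+)
+na_types
+
+# a bare NA is logical, but it is coerced to the type of the other elements
+typeof(c(NA, 1L))
+typeof(c(NA, "a"))
+```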
+
+### Testing
+
+**What type of vector `is.*()` it?**
+
+Test data type:
+
+- Logical: `is.logical()`
+- Integer: `is.integer()`
+- Double: `is.double()`
+- Character: `is.character()`
+
+**What type of object is it?**
+
+Don't test objects with these tools:
+
+- `is.vector()`
+- `is.atomic()`
+- `is.numeric()`
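+
+One possible illustration of why these base predicates can surprise, using only base R. The name `with_attr` is hypothetical.
+
+```{r test_base_pitfalls_sketch}
+with_attr <- structure(1:3, some_attribute = "a")
+
+# is.vector() is FALSE for vectors with attributes other than names
+is.vector(1:3)
+is.vector(with_attr)
+
+# is.vector() is TRUE for lists
+is.vector(list(1, 2))
+
+# is.numeric() is TRUE for both integers and doubles, but FALSE for factors
+is.numeric(1L)
+is.numeric(1.5)
+is.numeric(factor("a"))
+```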
+
+Instead, maybe, use `{rlang}`
+
+- `rlang::is_vector`
+- `rlang::is_atomic`
+
+```{r test_rlang}
+# vector
+rlang::is_vector(c(1, 2))
+rlang::is_vector(list(1, 2))
+
+# atomic
+rlang::is_atomic(c(1, 2))
+rlang::is_atomic(list(1, "a"))
+
+```
+
+
+See more [here](https://rlang.r-lib.org/reference/type-predicates.html)
+
+### Coercion
+
+When types are mixed, R coerces toward the most flexible type, in a fixed order: character → double → integer → logical
+
+R can coerce either automatically or explicitly.
+
+**Automatic**
+
+Two contexts for automatic coercion:
+
+1. Combination
+1. Mathematical
+
+Combination:
+
+```{r coerce_c}
+str(c(TRUE, "TRUE"))
+```
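+
+A short sketch of the coercion order at work when combining values of different types. In each case the result takes the more flexible of the two types.
+
+```{r coerce_order_sketch}
+typeof(c(TRUE, 1L))   # logical + integer
+typeof(c(1L, 1.5))    # integer + double
+typeof(c(1.5, "a"))   # double + character
+
+# with all four types, everything becomes character
+all_types <- c(TRUE, 1L, 1.5, "a")
+all_types
+```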
+
+Mathematical operations
+
+```{r coerce_math}
+# imagine a logical vector about whether an attribute is present
+has_attribute <- c(TRUE, FALSE, TRUE, TRUE)
+
+# number with attribute
+sum(has_attribute)
+```
+
+**Explicit**
+
+Use `as.*()`
+
+- Logical: `as.logical()`
+- Integer: `as.integer()`
+- Double: `as.double()`
+- Character: `as.character()`
+
+But note that coercions may fail in one of two ways, or both:
+
+- With warning/error
+- NAs
+
+```{r coerce_error}
+as.integer(c(1, 2, "three"))
+```
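+
+A hedged sketch of how a failed coercion might be detected after the fact: the warning is silenced here only to keep the output short, and the failed element shows up as `NA`. The name `converted` is made up.
+
+```{r coerce_na_sketch}
+converted <- suppressWarnings(as.integer(c("1", "2", "three")))
+converted
+
+# which elements could not be converted?
+is.na(converted)
+```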
+
+## Attributes
+
+- What
+- How
+- Why
+
+### What
+
+Two perspectives:
+
+- Name-value pairs
+- Metadata
+
+**Name-value pairs**
+
+Formally, attributes have a name and a value.
+
+**Metadata**
+
+- Not data itself
+- But data about the data
+
+### How
+
+Two operations:
+
+1. Get
+1. Set
+
+Two cases:
+
+1. Single attribute
+2. Multiple attributes
+
+**Single attribute**
+
+Use `attr()`
+
+```{r attr_single}
+# some object
+a <- c(1, 2, 3)
+
+# set attribute
+attr(x = a, which = "some_attribute_name") <- "some attribute"
+
+# get attribute
+attr(x = a, which = "some_attribute_name")
+```
+**Multiple attributes**
+
+- To set multiple attributes, use `structure()`
+- To get multiple attributes, use `attributes()`
+
+```{r attr_multiple}
+b <- c(4, 5, 6)
+
+# set
+b <- structure(
+  .Data = b,
+  attrib1 = "one",
+  attrib2 = "two"
+)
+
+# get
+str(attributes(b))
+```
+
+### Why
+
+Two common use cases:
+
+- Names
+- Dimensions
+
+**Names**
+
+~~Three~~ Four ways to name:
+
+```{r}
+# 1. At creation
+one <- c(one = 1, two = 2, three = 3)
+
+# 2. By assigning a character vector of names
+two <- c(1, 2, 3)
+names(two) <- c("one", "two", "three")
+
+# 3. By setting names--with base R
+three <- c(1, 2, 3)
+stats::setNames(
+  object = three, 
+  nm = c("One", "Two", "Three")
+)
+
+# 4. By setting names--with {rlang}
+rlang::set_names(
+  x = three,
+  nm = c("One", "Two", "Three")
+)
+```
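+
+Note that `stats::setNames()` returns a named copy rather than modifying its input. A minimal sketch with base R only; `unnamed` and `named_copy` are illustrative names.
+
+```{r names_copy_sketch}
+unnamed <- c(1, 2, 3)
+named_copy <- stats::setNames(unnamed, c("One", "Two", "Three"))
+
+# the original is untouched; only the copy has names
+names(unnamed)
+names(named_copy)
+
+# unname() strips the names again
+unname(named_copy)
+```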
+
+Thematically but not directly related: labelled class vectors with `haven::labelled()`
+
+**Dimensions**
+
+Important for arrays and matrices.
+
+```{r}
+# length 6 vector spread across 2 rows of 3 columns
+matrix(1:6, nrow = 2, ncol = 3)
+```
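+
+To make the link to attributes explicit, here is a sketch that builds the same matrix by setting the `dim` attribute on a plain vector. The names `flat_int` and `as_matrix` are made up.
+
+```{r dim_attr_sketch}
+flat_int <- 1:6
+
+# adding a dim attribute turns the vector into a 2 x 3 matrix
+as_matrix <- flat_int
+dim(as_matrix) <- c(2L, 3L)
+
+attributes(as_matrix)
+identical(as_matrix, matrix(1:6, nrow = 2, ncol = 3))
+```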
+
+## S3 atomic vectors
+
+- The vector family tree revisited. 
+- Meet the children of typed atomic vectors
+
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+This list could (more easily) be expanded to new  vector types with [`{vctrs}`](https://vctrs.r-lib.org/). See [rstudio::conf(2019) talk on the package around 18:27](https://www.rstudio.com/resources/rstudioconf-2019/vctrs-tools-for-making-size-and-type-consistent-functions/). See also [rstudio::conf(2020) talk on new vector types for dealing with non-decimal currencies](https://www.rstudio.com/resources/rstudioconf-2020/vctrs-creating-custom-vector-classes-with-the-vctrs-package/).
+
+What makes S3 atomic vectors different from their parents?
+
+Two things:
+
+1. Class
+2. Attributes (typically)
+
+### Factors
+
+Factors are integer vectors with:
+
+- Class: "factor"
+- Attributes: "levels", or the set of allowed values
+
+```{r factor}
+# Build a factor
+a_factor <- factor(
+  # values
+  x = c(1, 2, 3),
+  # exhaustive list of values
+  levels = c(1, 2, 3, 4)
+)
+
+# Inspect
+a_factor
+
+# Dissect
+# - type
+typeof(a_factor)
+
+# - attributes
+attributes(a_factor)
+```
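+
+A small sketch that strips the class to show the integer codes underneath a factor. `sketch_factor` is a hypothetical example, separate from `a_factor` above.
+
+```{r factor_unclass_sketch}
+sketch_factor <- factor(c("b", "a", "b"), levels = c("a", "b", "c"))
+
+# the underlying integer codes, plus the levels attribute
+unclass(sketch_factor)
+
+# levels with no observations still show up in tabulations
+table(sketch_factor)
+```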
+
+Factors can be ordered. This can be useful for models or visualizations where order matters.
+
+```{r factor_ordered}
+# Build
+ordered_factor <- ordered(
+  # values
+  x = c(1, 2, 3),
+  # levels in ascending order
+  levels = c(4, 3, 2, 1)
+)
+
+# Inspect
+ordered_factor
+```
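+
+One thing ordering buys you is comparison. A minimal sketch, assuming a made-up `sizes` vector:
+
+```{r factor_ordered_compare_sketch}
+sizes <- ordered(
+  c("small", "large", "medium"),
+  levels = c("small", "medium", "large")
+)
+
+# comparisons follow the order of the levels, not the alphabet
+sizes < "large"
+max(sizes)
+
+# ordered factors carry two classes
+class(sizes)
+```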
+
+### Dates
+
+Dates are:
+
+- Double vectors
+- With class "Date"
+
+The double component represents the number of days since `1970-01-01`.
+
+```{r dates}
+notes_date <- Sys.Date()
+
+# type
+typeof(notes_date)
+
+# class
+attributes(notes_date)
+```
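+
+A quick sketch with a fixed date, so the underlying number is easy to check by hand. The name `day_after_epoch` is illustrative.
+
+```{r dates_unclass_sketch}
+day_after_epoch <- as.Date("1970-01-02")
+
+# strip the class to see the number of days since 1970-01-01
+unclass(day_after_epoch)
+
+# and go the other way, from a number of days to a Date
+as.Date(1, origin = "1970-01-01")
+```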
+
+### Date-times
+
+There are 2 Date-time representations in base R:
+
+- POSIXct, where "ct" denotes calendar time
+- POSIXlt, where "lt" designates local time.
+
+Let's focus on POSIXct because:
+
+- Simplest
+- Built on an atomic vector
+- Most apt to be in a data frame
+
+Let's now build and deconstruct a Date-time
+
+```{r date_time}
+# Build
+note_date_time <- as.POSIXct(
+  # time
+  x = Sys.time(),
+  # time zone, used only for formatting (use a tz database name, not an abbreviation)
+  tz = "America/New_York"
+)
+
+# Inspect
+note_date_time
+
+# Dissect
+# - type
+typeof(note_date_time)
+# - attributes
+attributes(note_date_time)
+```
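+
+A sketch with a fixed instant, showing that the stored value is a number of seconds and that the `tzone` attribute only affects how it is displayed. The names `one_minute_in` and `in_tokyo` are made up.
+
+```{r date_time_tzone_sketch}
+one_minute_in <- as.POSIXct("1970-01-01 00:01:00", tz = "UTC")
+
+# seconds since 1970-01-01 00:00:00 UTC
+as.numeric(one_minute_in)
+
+# changing tzone changes how the value prints, not the value itself
+in_tokyo <- structure(one_minute_in, tzone = "Asia/Tokyo")
+format(in_tokyo)
+as.numeric(in_tokyo)
+```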
+
+
+### Durations
+
+Durations are:
+
+- Double vectors
+- Class: "difftime"
+- Attributes: "units", or the unit of duration (e.g., weeks, hours, minutes, seconds, etc.)
+
+```{r durations}
+# Construct
+one_minute <- as.difftime(1, units = "mins")
+
+# Inspect
+one_minute
+
+# Dissect
+# - type
+typeof(one_minute)
+# - attributes
+attributes(one_minute)
+```
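+
+The stored double depends on the units. A minimal sketch of converting between units, using a separate hypothetical object `sketch_duration`:
+
+```{r durations_units_sketch}
+sketch_duration <- as.difftime(1, units = "mins")
+
+# convert to a plain number of seconds
+as.numeric(sketch_duration, units = "secs")
+
+# or change the units in place; the stored double changes with them
+units(sketch_duration) <- "secs"
+unclass(sketch_duration)
+```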
+
+See also:
+
+- [`lubridate::make_difftime()`](https://lubridate.tidyverse.org/reference/make_difftime.html)
+- [`clock::date_time_build()`](https://clock.r-lib.org/reference/date_time_build.html)
+
+## Lists
+
+Sometimes called a generic vector, a list can be composed of elements of different types.
+
+### Constructing
+
+Simple lists:
+
+```{r list_simple}
+# Construct
+simple_list <- list(
+  # logicals
+  c(TRUE, FALSE),
+  # integers
+  1:20,
+  # doubles
+  c(1.2, 2.3, 3.4),
+  # characters
+  c("primo", "secundo", "tercio")
+)
+
+# Inspect
+# - type
+typeof(simple_list)
+# - structure
+str(simple_list)
+
+```
+Nested lists:
+
+```{r list_nested}
+nested_list <- list(
+  # first level
+  list(
+    # second level
+    list(
+      # third level
+      list(1)
+    )
+  )
+)
+
+str(nested_list)
+```
+
+Like JSON.
+
+Combined lists
+
+```{r list_combined}
+# with list()
+list_comb1 <- list(list(1, 2), list(3, 4))
+# with c()
+list_comb2 <- c(list(1, 2), list(3, 4))
+
+# compare structure
+str(list_comb1)
+str(list_comb2)
+```
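+
+The difference is easiest to see in the lengths. A self-contained sketch with made-up names (`nested_pair`, `flat_list`, `mixed`):
+
+```{r list_combined_length_sketch}
+nested_pair <- list(list(1, 2), list(3, 4))
+flat_list <- c(list(1, 2), list(3, 4))
+
+# list() keeps the two lists as elements; c() merges them into one list
+length(nested_pair)
+length(flat_list)
+
+# c() coerces atomic vectors to lists before combining them with a list
+mixed <- c(list(1, 2), c(3, 4))
+str(mixed)
+```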
+
+### Testing
+
+Check whether an object is a list:
+
+- `is.list()`
+- `rlang::is_list()`
+
+The two do the same thing, except that the latter can also check the number of elements.
+
+```{r list_test}
+# is list
+base::is.list(list_comb2)
+rlang::is_list(list_comb2)
+
+# is list of 4 elements
+rlang::is_list(x = list_comb2, n = 4)
+
+# is a vector (of a special type)
+# remember the family tree?
+rlang::is_vector(list_comb2)
+```
+
+
+### Coercion
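+
+A minimal sketch of list coercion in base R, assuming this section is meant to cover `as.list()` and its rough inverse `unlist()`. The name `int_list` is made up.
+
+```{r list_coerce_sketch}
+# atomic vector to list: one element per value
+int_list <- as.list(1:3)
+str(int_list)
+
+# unlist() flattens back to an atomic vector, applying the usual coercion rules
+unlist(int_list)
+typeof(unlist(list(1L, "a")))
+```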
+
+## Data frames and tibbles
+
+- The vector family tree revisited. 
+- Meet the children of lists
+
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+### Data frame
+
+A data frame is a:
+
+- Named list of vectors (the names are the column names)
+- Class: "data.frame"
+- Attributes:
+  - (column) `names`
+  - `row.names`
+
+```{r data_frame}
+# Construct
+df <- data.frame(
+  # named atomic vector
+  col1 = c(1, 2, 3),
+  # another named atomic vector
+  col2 = c("un", "deux", "trois"),
+  # not necessary since R 4.0.0, where FALSE became the default
+  stringsAsFactors = FALSE
+)
+
+# Inspect
+df
+
+# Deconstruct
+# - type
+typeof(df)
+# - attributes
+attributes(df)
+```
+
+
+Unlike other lists, the length of each vector must be the same (i.e. as many vector elements as rows in the data frame).
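+
+A sketch of both points: a data frame behaves like a list of columns, and columns that cannot be recycled to a common length are rejected. The names `sketch_df` and `result`, and the message string, are made up for illustration.
+
+```{r data_frame_length_sketch}
+sketch_df <- data.frame(x = 1:3, y = c("a", "b", "c"))
+
+# a data frame is a list of columns, so its length is the number of columns
+is.list(sketch_df)
+length(sketch_df)
+
+# columns of length 3 and 2 cannot be combined
+result <- tryCatch(
+  data.frame(x = 1:3, y = 1:2),
+  error = function(e) "error: differing number of rows"
+)
+result
+```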
+
+### Tibble
+
+As compared to data frames, tibbles are data frames that are:
+
+- Lazy
+- Surly
+
+#### Lazy
+
+Tibbles do not:
+
+- Coerce strings
+- Transform non-syntactic names
+- Recycle vectors of length greater than 1
+
+**Coerce strings**
+
+```{r tbl_no_coerce}
+chr_col <- c("don't", "factor", "me", "bro")
+
+# data frame
+df <- data.frame(
+  a = chr_col,
+  # before R 4.0.0, this was the default
+  stringsAsFactors = TRUE
+)
+
+# tibble
+tbl <- tibble::tibble(
+  a = chr_col
+)
+
+# contrast the structure
+str(df$a)
+str(tbl$a)
+
+```
+
+**Transform non-syntactic names**
+
+```{r tbl_col_name}
+# data frame
+df <- data.frame(
+  `1` = c(1, 2, 3)
+)
+
+# tibble
+tbl <- tibble::tibble(
+  `1` = c(1, 2, 3)
+)
+
+# contrast the names
+names(df)
+names(tbl)
+```
+
+**Recycle vectors of length greater than 1**
+
+```{r tbl_recycle, error=TRUE}
+# data frame
+df <- data.frame(
+  col1 = c(1, 2, 3, 4),
+  col2 = c(1, 2)
+)
+
+# tibble
+tbl <- tibble::tibble(
+  col1 = c(1, 2, 3, 4),
+  col2 = c(1, 2)
+)
+```
+
+
+#### Surly
+
+Tibbles do only what they're asked and complain if what they're asked doesn't make sense:
+
+- Subsetting always yields a tibble
+- Complains if cannot find column
+
+**Subsetting always yields a tibble**
+
+```{r tbl_subset}
+# data frame
+df <- data.frame(
+  col1 = c(1, 2, 3, 4)
+)
+
+# tibble
+tbl <- tibble::tibble(
+  col1 = c(1, 2, 3, 4)
+)
+
+# contrast
+df_col <- df[, "col1"]
+str(df_col)
+tbl_col <- tbl[, "col1"]
+str(tbl_col)
+
+# to select a vector, do one of these instead
+tbl_col_1 <- tbl[["col1"]]
+str(tbl_col_1)
+tbl_col_2 <- dplyr::pull(tbl, col1)
+str(tbl_col_2)
+```
+
+**Complains if cannot find column**
+
+```{r tbl_col_match, warning=TRUE}
+names(df)
+df$col
+
+names(tbl)
+tbl$col
+```
+
+### Testing
+
+Whether data frame: `is.data.frame()`. Note: both data frame and tibble are data frames.
+
+Whether tibble: `tibble::is_tibble()`. Note: only tibbles are tibbles. Vanilla data frames are not.
+
+### Coercion
+
+- To data frame: `as.data.frame()`
+- To tibble: `tibble::as_tibble()`
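+
+A short sketch of both coercions. The tibble part is guarded so the chunk still runs if `{tibble}` is not installed; `from_list` and `as_tbl` are hypothetical names.
+
+```{r df_tbl_coerce_sketch}
+# a named list of equal-length vectors becomes a data frame
+from_list <- as.data.frame(list(x = 1:2, y = c("a", "b")))
+class(from_list)
+
+if (requireNamespace("tibble", quietly = TRUE)) {
+  as_tbl <- tibble::as_tibble(from_list)
+
+  # a tibble is still a data frame, with extra classes in front
+  print(class(as_tbl))
+  print(is.data.frame(as_tbl))
+
+  # a vanilla data frame is not a tibble
+  print(tibble::is_tibble(from_list))
+}
+```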
+
+## `NULL`
+
+A special type of object that:
+
+- Has length 0
+- Cannot have attributes
+
+```{r null, error=TRUE}
+typeof(NULL)
+#> [1] "NULL"
+
+length(NULL)
+#> [1] 0
+
+x <- NULL
+attr(x, "y") <- 1
+```
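+
+A few more base R checks, as a sketch: `is.null()` is the test for `NULL`, and `NULL` disappears when combined with other values.
+
+```{r null_sketch}
+is.null(NULL)
+
+# c() with no arguments is NULL
+is.null(c())
+
+# NULL vanishes inside c(), but is kept as an element inside list()
+combined <- c(NULL, 1, NULL)
+combined
+length(list(NULL))
+```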
 
-- ADD SLIDES AS SECTIONS (`##`).
-- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
 
 ## Meeting Videos
 
diff --git a/README.md b/README.md
@@ -27,16 +27,27 @@ The slides from the old clubs are in a [separate repository](https://github.com/
 This repository is structured as a [{bookdown}](https://CRAN.R-project.org/package=bookdown) site.
 To present, follow these instructions:
 
+Do these steps once:
+
 1. [Setup Github Locally](https://www.youtube.com/watch?v=hNUNPkoledI) (also see [_Happy Git and GitHub for the useR_](https://happygitwithr.com/github-acct.html))
-2. Install {usethis} `install.packages("usethis")`
-3. `usethis::create_from_github("r4ds/bookclub-advr")` (cleanly creates your own copy of this repository).
-4. `usethis::pr_init("my-chapter")` (creates a branch for your work, to avoid confusion).
-5. Edit the appropriate chapter file, if necessary. Use `##` to indicate new slides (new sections).
-7. If you use any packages that are not already in the `DESCRIPTION`, add them. You can use `usethis::use_package("myCoolPackage")` to add them quickly!
-8. Build the book! ctrl-shift-b (or command-shift-b) will render the full book, or ctrl-shift-k (command-shift-k) to render just your slide. Please do this to make sure it works before you push your changes up to the main repo!
-9. Commit your changes (either through the command line or using Rstudio's Git tab).
-10. `usethis::pr_push()` (pushes the changes up to github, and opens a "pull request" (PR) to let us know your work is ready).
-11. (If we request changes, make them)
-12. When your PR has been accepted ("merged"), `usethis::pr_finish()` to close out your branch and prepare your local repository for future work.
+2. Install {usethis} and {devtools} `install.packages(c("usethis", "devtools"))`
+3. Set up a default {usethis} directory:
+  - `usethis::edit_r_profile()` to open your profile for editing.
+  - Add this line: `options(usethis.destdir = "YOURDIR")` (replace `YOURDIR` with the root directory under which you want your R projects to appear; or you can skip these steps, and the project will be saved to your Desktop).
+  - Restart your R session (Session/Restart R in RStudio).
+4. `usethis::create_from_github("r4ds/bookclub-advr")` (cleanly creates your own copy of this repository).
+
+Do these steps each time you present another chapter:
+
+1. Open your project for this book.
+2. `usethis::pr_init("my-chapter")` (creates a branch for your work, to avoid confusion, making sure that you have the latest changes from other contributors; replace `my-chapter` with a descriptive name, ideally).
+3. `devtools::install_dev_deps()` (installs any packages used by the book that you don't already have installed).
+4. Edit the appropriate chapter file, if necessary. Use `##` to indicate new slides (new sections).
+5. If you use any packages that are not already in the `DESCRIPTION`, add them. You can use `usethis::use_package("myCoolPackage")` to add them quickly!
+6. Build the book! ctrl-shift-b (or command-shift-b) will render the full book, or ctrl-shift-k (command-shift-k) to render just your slide. Please do this to make sure it works before you push your changes up to the main repo!
+7. Commit your changes (either through the command line or using RStudio's Git tab).
+8. `usethis::pr_push()` (pushes the changes up to GitHub, and opens a "pull request" (PR) to let us know your work is ready).
+9. (If we request changes, make them)
+10. When your PR has been accepted ("merged"), `usethis::pr_finish()` to close out your branch and prepare your local repository for future work.
 
 When your PR is checked into the main branch, the bookdown site will rebuild, adding your slides to [this site](https://r4ds.io/advr).
diff --git a/images/vectors/summary-tree-atomic.png b/images/vectors/summary-tree-atomic.png
Binary files differ.
diff --git a/images/vectors/summary-tree-s3-1.png b/images/vectors/summary-tree-s3-1.png
Binary files differ.
diff --git a/images/vectors/summary-tree-s3-2.png b/images/vectors/summary-tree-s3-2.png
Binary files differ.
diff --git a/images/vectors/summary-tree.png b/images/vectors/summary-tree.png
Binary files differ.
diff --git a/index.Rmd b/index.Rmd
@@ -13,6 +13,14 @@ description: "This is the product of the R4DS Online Learning Community's Advanc
 
 # Welcome {-}
 
+```{r knitr_opts, echo=FALSE, message=FALSE, warning=FALSE}
+knitr::opts_chunk$set(
+  echo = TRUE,
+  comment = "#>",
+  collapse = TRUE
+)
+```
+
 Welcome to the bookclub! 
 
 This is a companion for the book [_Advanced R_](https://adv-r.hadley.nz/) by Hadley Wickham (Chapman & Hall, copyright 2019, [9780815384571](https://www.routledge.com/Advanced-R-Second-Edition/Wickham/p/book/9780815384571)).