commit cc607fe9afa0887d0c24fc4cd952e322354194e7
parent ddee529ad1dc157a6ae90c16f8fdbe9d7bd42643
Author: Trevin Flickinger <twflick@gmail.com>
Date: Wed, 15 Jun 2022 16:41:15 -0400
subsetting chapter cohort 6 (#12)
Diffstat:
7 files changed, 509 insertions(+), 116 deletions(-)
diff --git a/03_Vectors.Rmd b/03_Vectors.Rmd
@@ -2,29 +2,27 @@
**Learning objectives:**
-- Learn about different types of vectors
-- Learn how these types relate to one another
+- Learn about different types of vectors
+- Learn how these types relate to one another
## Types of vectors
The family tree of vectors:
-
-Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+ Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
-- **Atomic.** Elements all the same type.
-- **List.** Elements are different Types.
-- **NULL** Null elements. Length zero.
+- **Atomic.** Elements all the same type.
+- **List.** Elements can be of different types.
+- **NULL** Null elements. Length zero.
## Atomic vectors
### Types
-- The vector family tree revisited.
-- Meet the children of atomic vectors
+- The vector family tree revisited.
+- Meet the children of atomic vectors
-
-Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+ Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
### Length one
@@ -73,7 +71,6 @@ lgl_vec <- c(TRUE, FALSE)
```
-
**2. With other vectors**
```{r long_vec}
@@ -84,19 +81,19 @@ c(c(1, 2), c(3, 4))
`{rlang}` has [vector constructor functions too](https://rlang.r-lib.org/reference/vector-construction.html):
-- `rlang::lgl(...)`
-- `rlang::int(...)`
-- `rlang::dbl(...)`
-- `rlang::chr(...)`
+- `rlang::lgl(...)`
+- `rlang::int(...)`
+- `rlang::dbl(...)`
+- `rlang::chr(...)`
They look to do both more and less than `c()`.
-- More:
- - Enforce type
- - Splice lists
- - More types: `rlang::bytes()`, `rlang::cpl(...)`
-- Less:
- - Stricter rules on names
+- More:
+ - Enforce type
+ - Splice lists
+ - More types: `rlang::bytes()`, `rlang::cpl(...)`
+- Less:
+ - Stricter rules on names
Note: currently has `questioning` lifecycle badge, since these constructors may get moved to `vctrs`
@@ -115,14 +112,15 @@ sum(c(1, 2, NA, 3))
sum(c(1, 2, NA, 3), na.rm = TRUE)
```
+
**Types**
Each type has its own NA type
-- Logical: `NA`
-- Integer: `NA_integer`
-- Double: `NA_double`
-- Character: `NA_character`
+- Logical: `NA`
+- Integer: `NA_integer_`
+- Double: `NA_real_`
+- Character: `NA_character_`
This may not matter in many contexts.
@@ -134,23 +132,23 @@ But this does matter for operations where types matter like `dplyr::if_else()`.
Test data type:
-- Logical: `is.logical()`
-- Integer: `is.integer()`
-- Double: `is.double()`
-- Character: `is.character()`
+- Logical: `is.logical()`
+- Integer: `is.integer()`
+- Double: `is.double()`
+- Character: `is.character()`
**What type of object is it?**
Don't test objects with these tools:
-- `is.vector()`
-- `is.atomic()`
-- `is.numeric()`
+- `is.vector()`
+- `is.atomic()`
+- `is.numeric()`
Instead, maybe, use `{rlang}`
-- `rlang::is_vector`
-- `rlang::is_atomic`
+- `rlang::is_vector`
+- `rlang::is_atomic`
```{r test_rlang}
# vector
@@ -163,7 +161,6 @@ rlang::is_atomic(list(1, "a"))
```
-
See more [here](https://rlang.r-lib.org/reference/type-predicates.html)
### Coercion
@@ -176,8 +173,8 @@ R can coerce either automatically or explicitly
Two contexts for automatic coercion:
-1. Combination
-1. Mathematical
+1. Combination
+2. Mathematical
Combination:
@@ -199,15 +196,15 @@ sum(has_attribute)
Use `as.*()`
-- Logical: `as.logical()`
-- Integer: `as.integer()`
-- Double: `as.double()`
-- Character: `as.character()`
+- Logical: `as.logical()`
+- Integer: `as.integer()`
+- Double: `as.double()`
+- Character: `as.character()`
But note that coercions may fail in one of two ways, or both:
-- With warning/error
-- NAs
+- With warning/error
+- NAs
```{r coerce_error}
as.integer(c(1, 2, "three"))
@@ -215,16 +212,16 @@ as.integer(c(1, 2, "three"))
## Attributes
-- What
-- How
-- Why
+- What
+- How
+- Why
### What
Two perspectives:
-- Name-value pairs
-- Metadata
+- Name-value pairs
+- Metadata
**Name-value pairs**
@@ -232,20 +229,20 @@ Formally, attributes have a name and a value.
**Metadata**
-- Not data itself
-- But data about the data
+- Not data itself
+- But data about the data
### How
Two operations:
-1. Get
-1. Set
+1. Get
+2. Set
Two cases:
-1. Single attribute
-2. Multiple attributes
+1. Single attribute
+2. Multiple attributes
**Single attribute**
@@ -261,10 +258,10 @@ attr(x = a, which = "some_attribute_name") <- "some attribute"
# get attribute
attr(x = a, which = "some_attribute_name")
```
+
**Multiple attributes**
-To set multiple attributes, use `structure()`
-To get multiple attributes, use `attributes()`
+To set multiple attributes, use `structure()`. To get multiple attributes, use `attributes()`.
```{r attr_multiple}
b <- c(4, 5, 6)
@@ -284,8 +281,8 @@ str(attributes(b))
Two common use cases:
-- Names
-- Dimensions
+- Names
+- Dimensions
**Names**
@@ -326,27 +323,26 @@ matrix(1:6, nrow = 2, ncol = 3)
## S3 atomic vectors
-- The vector family tree revisited.
-- Meet the children of typed atomic vectors
+- The vector family tree revisited.
+- Meet the children of typed atomic vectors
-
-Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+ Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
-This list could (more easily) be expanded to new vector types with [`{vctrs}`](https://vctrs.r-lib.org/). See [rstudio::conf(2019) talk on the package around 18:27](https://www.rstudio.com/resources/rstudioconf-2019/vctrs-tools-for-making-size-and-type-consistent-functions/). See also [rstudio::conf(2020) talk on new vector types for dealing with non-decimal currencies](https://www.rstudio.com/resources/rstudioconf-2020/vctrs-creating-custom-vector-classes-with-the-vctrs-package/).
+This list could (more easily) be expanded to new vector types with [`{vctrs}`](https://vctrs.r-lib.org/). See [rstudio::conf(2019) talk on the package around 18:27](https://www.rstudio.com/resources/rstudioconf-2019/vctrs-tools-for-making-size-and-type-consistent-functions/). See also [rstudio::conf(2020) talk on new vector types for dealing with non-decimal currencies](https://www.rstudio.com/resources/rstudioconf-2020/vctrs-creating-custom-vector-classes-with-the-vctrs-package/).
What makes S3 atomic vectors different than their parents?
Two things:
-1. Class
-2. Attributes (typically)
+1. Class
+2. Attributes (typically)
### Factors
Factors are integer vectors with:
-- Class: "factor"
-- Attributes: "levels", or the set of allowed values
+- Class: "factor"
+- Attributes: "levels", or the set of allowed values
```{r factor}
# Build a factor
@@ -387,8 +383,8 @@ ordered_factor
Dates are:
-- Double vectors
-- With class "Date"
+- Double vectors
+- With class "Date"
The double component represents the number of days since `1970-01-01`
@@ -406,14 +402,14 @@ attributes(notes_date)
There are 2 Date-time representations in base R:
-- POSIXct, where "ct" denotes calendar time
-- POSIXlt, where "lt" designates local time.
+- POSIXct, where "ct" denotes calendar time
+- POSIXlt, where "lt" designates local time.
Let's focus on POSIXct because:
-- Simplest
-- Built on an atomic vector
-- Most apt to be in a data frame
+- Simplest
+- Built on an atomic vector
+- Most apt to be in a data frame
Let's now build and deconstruct a Date-time
@@ -436,14 +432,13 @@ typeof(note_date_time)
attributes(note_date_time)
```
-
### Durations
Durations are:
-- Double vectors
-- Class: "difftime"
-- Attributes: "units", or the unit of duration (e.g., weeks, hours, minutes, seconds, etc.)
+- Double vectors
+- Class: "difftime"
+- Attributes: "units", or the unit of duration (e.g., weeks, hours, minutes, seconds, etc.)
```{r durations}
# Construct
@@ -461,8 +456,8 @@ attributes(one_minute)
See also:
-- [`lubridate::make_difftime()`](https://lubridate.tidyverse.org/reference/make_difftime.html)
-- [`clock::date_time_build()`](https://clock.r-lib.org/reference/date_time_build.html)
+- [`lubridate::make_difftime()`](https://lubridate.tidyverse.org/reference/make_difftime.html)
+- [`clock::date_time_build()`](https://clock.r-lib.org/reference/date_time_build.html)
## Lists
@@ -492,6 +487,7 @@ typeof(simple_list)
str(simple_list)
```
+
Nested lists:
```{r list_nested}
@@ -528,8 +524,8 @@ str(list_comb2)
Check that is a list:
-- `is.list()`
-- `rlang::is_list()``
+- `is.list()`
+- `rlang::is_list()`
The two do the same, except that the latter can check for the number of elements
@@ -546,26 +542,24 @@ rlang::is_list(x = list_comb2, n = 4)
rlang::is_vector(list_comb2)
```
-
### Coercion
## Data frames and tibbles
-- The vector family tree revisited.
-- Meet the children of lists
+- The vector family tree revisited.
+- Meet the children of lists
-
-Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+ Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
### Data frame
A data frame is a:
-- Named list of vectors (i.e., column names)
-- Class: "data frame"
-- Attributes:
- - (column) `names`
- - `row.names``
+- Named list of vectors (i.e., column names)
+- Class: "data.frame"
+- Attributes:
+ - (column) `names`
+ - `row.names`
```{r data_frame}
# Construct
@@ -588,23 +582,22 @@ typeof(df)
attributes(df)
```
-
Unlike other lists, the length of each vector must be the same (i.e. as many vector elements as rows in the data frame).
### Tibble
As compared to data frames, tibbles are data frames that are:
-- Lazy
-- Surly
+- Lazy
+- Surly
#### Lazy
Tibbles do not:
-- Coerce strings
-- Transform non-syntactic names
-- Recycle vectors of length greater than 1
+- Coerce strings
+- Transform non-syntactic names
+- Recycle vectors of length greater than 1
**Coerce strings**
@@ -663,13 +656,12 @@ tbl <- tibble::tibble(
)
```
-
#### Surly
Tibbles do only what they're asked and complain if what they're asked doesn't make sense:
-- Subsetting always yields a tibble
-- Complains if cannot find column
+- Subsetting always yields a tibble
+- Complains if cannot find column
**Subsetting always yields a tibble**
@@ -715,15 +707,15 @@ Whether tibble: `tibble::is_tibble`. Note: only tibbles are tibbles. Vanilla dat
### Coercion
-- To data frame: `as.data.frame()`
-- To tibble: `tibble::as_tibble()`
+- To data frame: `as.data.frame()`
+- To tibble: `tibble::as_tibble()`
## `NULL`
Special type of object that:
-- Length 0
-- Cannot have attributes
+- Length 0
+- Cannot have attributes
```{r null, error=TRUE}
typeof(NULL)
@@ -736,7 +728,6 @@ x <- NULL
attr(x, "y") <- 1
```
-
## Meeting Videos
### Cohort 1
@@ -764,9 +755,13 @@ attr(x, "y") <- 1
`r knitr::include_url("https://www.youtube.com/embed/URL")`
<details>
-<summary> Meeting chat log </summary>
-```
-LOG
-```
+<summary>
+
+Meeting chat log
+
+</summary>
+
+ LOG
+
</details>
diff --git a/04_Subsetting.Rmd b/04_Subsetting.Rmd
@@ -2,12 +2,406 @@
**Learning objectives:**
-- THESE ARE NICE TO HAVE BUT NOT ABSOLUTELY NECESSARY
+- Learn about the 6 ways to subset atomic vectors
+- Learn about the 3 subsetting operators: `[[`, `[`, and `$`
+- Learn how subsetting works with different vector types
-## SLIDE 1
+## Selecting multiple elements
+
+### Atomic Vectors
+
+- 6 ways to subset atomic vectors
+
+Let's take a look with an example vector.
+
+```{r atomic_vector}
+x <- c(3.1, 2.2, 1.3, 4.4)
+```
+
+**Positive integers**
+
+```{r positive_int}
+# return elements at specified positions
+x[c(4, 1)]
+
+# duplicate indices return duplicate values
+x[c(2, 2)]
+
+# real numbers truncate to integers
+x[c(3.2, 3.8)]
+```
+
+**Negative integers**
+
+```{r, eval=FALSE}
+# excludes elements at specified positions
+x[-c(1, 3)] # same as x[c(-1, -3)]
+
+# mixing positive and negative integers is an error
+x[c(-1, 3)]
+```
+
+**Logical Vectors**
+
+```{r logical_vec}
+x[c(TRUE, TRUE, FALSE, TRUE)]
+
+x[x < 3]
+```
+
+- **Recycling rules** apply when subsetting this way: `x[y]`
+- Easy to understand if `x` or `y` has length 1; best to avoid other lengths
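+
+A minimal sketch of recycling with a logical index, reusing the example vector `x`. The values in the comments follow from the recycling rule; they are not output copied from the original notes.
+
+```{r}
+x <- c(3.1, 2.2, 1.3, 4.4)
+
+# a length-2 index is recycled to c(TRUE, FALSE, TRUE, FALSE)
+x[c(TRUE, FALSE)]
+#> [1] 3.1 1.3
+
+# a length-1 index is recycled to every position
+x[TRUE]
+#> [1] 3.1 2.2 1.3 4.4
+```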
+
+```{r missing}
+# missing value in index will also return NA in output
+x[c(NA, TRUE)]
+```
+
+
+**Nothing**
+
+```{r nothing}
+# returns the original vector
+x[]
+```
+
+**Zero**
+
+```{r zero}
+# returns a zero-length vector
+x[0]
+```
+
+**Character vectors**
+
+```{r character}
+# if the vector is named, character indices return the matching elements
+(y <- setNames(x, letters[1:4]))
+
+y[c("d", "b", "a")]
+```
+
+### Lists
+
+- Subsetting works the same way
+- `[` always returns a list, `[[` and `$` let you pull elements out of a list
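+
+A short sketch of the difference, using a hypothetical list `lst` that is not part of the original notes:
+
+```{r}
+lst <- list(a = 1:3, b = "z")
+
+# `[` keeps the list wrapper
+is.list(lst["a"])
+#> [1] TRUE
+
+# `[[` and `$` return the element itself
+lst[["a"]]
+#> [1] 1 2 3
+lst$b
+#> [1] "z"
+```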
+
+### Matrices and arrays
+
+You can subset higher-dimensional structures in three ways:
+
+- with multiple vectors
+- with a single vector
+- with a matrix
+
+```{r, eval=FALSE}
+a <- matrix(1:9, nrow = 3)
+colnames(a) <- c("A", "B", "C")
+a[1:2, ]
+#> A B C
+#> [1,] 1 4 7
+#> [2,] 2 5 8
+a[c(TRUE, FALSE, TRUE), c("B", "A")]
+#> B A
+#> [1,] 4 1
+#> [2,] 6 3
+a[0, -2]
+#> A C
+
+a[1, ]
+#> A B C
+#> 1 4 7
+
+a[1, 1]
+#> A
+#> 1
+```
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+Matrices and arrays are just vectors with special attributes, so you can subset them with a single vector
+(arrays in R are stored in column-major order)
+
+```{r}
+vals <- outer(1:5, 1:5, FUN = "paste", sep = ",")
+vals
+
+vals[c(3, 15)]
+```
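+
+The third way, subsetting with a matrix, can be sketched with the same `vals` example. Each row of the index matrix `select` (a name chosen here for illustration) gives the (row, column) location of one value.
+
+```{r}
+vals <- outer(1:5, 1:5, FUN = "paste", sep = ",")
+
+# one row per element to extract: (row, column)
+select <- rbind(
+  c(1, 1),
+  c(3, 1),
+  c(2, 4)
+)
+vals[select]
+#> [1] "1,1" "3,1" "2,4"
+```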
+
+### Data frames and tibbles
+
+Data frames act like both lists and matrices:
+
+- single index -> behaves like a list (selects columns)
+- two indices -> behaves like a matrix (selects rows and columns)
+
+```{r penguins}
+library(palmerpenguins)
+
+# single index
+penguins[1:2]
+
+penguins[c("species","island")]
+
+# two indices
+penguins[1:2, ]
+```
+
+Subsetting a tibble with `[` always returns a tibble
+
+### Preserving dimensionality
+
+- Data frames and tibbles behave differently when `[` selects a single column
+- Tibbles preserve dimensionality by default; data frames simplify a single column to a vector
+- This can lead to unexpected behavior and code that breaks later
+
+Use `drop = FALSE` when subsetting a data frame, or use tibbles
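+
+A minimal sketch of the difference with a base data frame. The data frame `df` is made up for illustration.
+
+```{r}
+df <- data.frame(a = 1:2, b = 3:4)
+
+# selecting one column with two indices simplifies to a vector
+df[, "a"]
+#> [1] 1 2
+
+# drop = FALSE keeps the data frame structure
+is.data.frame(df[, "a", drop = FALSE])
+#> [1] TRUE
+```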
+
+## Selecting a single element
+
+`[[` and `$` are used to extract single elements
+
+### `[[`
+
+```{r train}
+x <- list(1:3, "a", 4:6)
+```
+
+Credit: [Advanced R](https://adv-r.hadley.nz/index.html) by Hadley Wickham
+
+### `$`
+
+- `x$y` is equivalent to `x[["y"]]`
+
+The `$` operator doesn't work when the name is stored in a variable
+
+```{r, eval=FALSE}
+var <- "cyl"
+# Doesn't work - mtcars$var translated to mtcars[["var"]]
+mtcars$var
+#> NULL
+
+# Instead use [[
+mtcars[[var]]
+#> [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
+```
+
+`$` allows partial matching, `[[` does not
+
+```{r, eval=FALSE}
+x <- list(abc = 1)
+x$a
+#> [1] 1
+x[["a"]]
+#> NULL
+```
+
+Hadley advises setting a global option so that partial matching with `$` gives a warning:
+
+```{r, eval=FALSE}
+options(warnPartialMatchDollar = TRUE)
+x$a
+#> Warning in x$a: partial match of 'a' to 'abc'
+#> [1] 1
+```
+
+Tibbles don't do partial matching:
+```{r}
+penguins$s
+```
+
+### Missing and out-of-bounds indices
+
+- R handles such indices inconsistently, so `purrr::pluck()` (returns `NULL` or a default when the element is missing) and `purrr::chuck()` (throws an error instead) are recommended
+```{r, eval=FALSE}
+x <- list(
+ a = list(1, 2, 3),
+ b = list(3, 4, 5)
+)
+purrr::pluck(x, "a", 1)
+# [1] 1
+purrr::pluck(x, "c", 1)
+# NULL
+purrr::pluck(x, "c", 1, .default = NA)
+# [1] NA
+```
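+
+To see the inconsistency that motivates `pluck()`, here is a base R sketch using a made-up list `lst` and atomic vector `vec`. The helper `errors()` is hypothetical and exists only so the chunk runs to completion.
+
+```{r}
+lst <- list(a = 1, b = 2)
+vec <- c(a = 1, b = 2)
+
+# hypothetical helper: TRUE if evaluating the expression throws an error
+errors <- function(expr) {
+  tryCatch({ expr; FALSE }, error = function(e) TRUE)
+}
+
+# out-of-bounds name on a list: NULL
+lst[["c"]]
+#> NULL
+
+# out-of-bounds position on a list: error
+errors(lst[[3]])
+#> [1] TRUE
+
+# out-of-bounds name on an atomic vector: error
+errors(vec[["c"]])
+#> [1] TRUE
+```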
+
+### `@` and `slot()`
+
+- `@` is the equivalent of `$` for S4 objects (to be revisited in Chapter 15)
+- `slot()` is the equivalent of `[[` for S4 objects
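+
+A small sketch with a hypothetical S4 class `Person`. The class and slot names are invented for illustration.
+
+```{r}
+# define a class with one character slot and create an instance
+methods::setClass("Person", slots = c(name = "character"))
+p <- methods::new("Person", name = "Ada")
+
+p@name
+#> [1] "Ada"
+
+# slot() takes the slot name as a string, so it can come from a variable
+slot_name <- "name"
+methods::slot(p, slot_name)
+#> [1] "Ada"
+```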
+
+## Subsetting and Assignment
+
+- Subsetting can be combined with assignment to edit values
+
+```{r}
+x <- c("Tigers", "Royals", "White Sox", "Twins", "Indians")
+
+x[5] <- "Guardians"
+
+x
+```
+
+- The subset and the assigned vector should have the same length to avoid recycling
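+
+A sketch of what happens when the lengths differ. The vector `v` is made up for illustration.
+
+```{r}
+v <- 1:6
+
+# the two replacement values are recycled across the four selected positions
+v[1:4] <- c(10, 20)
+v
+#> [1] 10 20 10 20  5  6
+```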
+
+You can use NULL to remove a component
+
+```{r}
+x <- list(a = 1, b = 2)
+x[["b"]] <- NULL
+str(x)
+```
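+
+To store a literal `NULL` instead of removing the component, one option is to assign `list(NULL)` with `[`. A sketch, using a separate list `y` so the example above is left untouched:
+
+```{r}
+y <- list(a = 1, b = 2)
+y["b"] <- list(NULL)
+
+# the component is still there, but its value is NULL
+length(y)
+#> [1] 2
+is.null(y$b)
+#> [1] TRUE
+```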
+
+Subsetting with nothing preserves the structure of the original object, because the assignment replaces the contents rather than the object itself
+
+```{r, eval=FALSE}
+mtcars[] <- lapply(mtcars, as.integer)
+is.data.frame(mtcars)
+#> [1] TRUE
+
+mtcars <- lapply(mtcars, as.integer)
+is.data.frame(mtcars)
+#> [1] FALSE
+```
+
+## Applications
+
+Applications copied from cohort 2 slide
+
+### Lookup tables (character subsetting)
+```{r, eval=FALSE}
+x <- c("m", "f", "u", "f", "f", "m", "m")
+lookup <- c(m = "Male", f = "Female", u = NA)
+lookup[x]
+# m f u f f m m
+# "Male" "Female" NA "Female" "Female" "Male" "Male"
+```
+
+### Matching and merging by hand (integer subsetting)
+- The `match()` function allows merging a vector with a table
+```{r, eval=FALSE}
+grades <- c("D", "A", "C", "B", "F")
+info <- data.frame(
+ grade = c("A", "B", "C", "D", "F"),
+ desc = c("Excellent", "Very Good", "Average", "Fair", "Poor"),
+ fail = c(FALSE, FALSE, FALSE, FALSE, TRUE)
+)
+id <- match(grades, info$grade)
+id
+# [1] 4 1 3 2 5
+info[id, ]
+# grade desc fail
+# 4 D Fair FALSE
+# 1 A Excellent FALSE
+# 3 C Average FALSE
+# 2 B Very Good FALSE
+# 5 F Poor TRUE
+```
+
+
+### Random samples and bootstrapping (integer subsetting)
+```{r, eval=FALSE}
+mtcars[sample(nrow(mtcars), 3), ] # add replace = TRUE to sample with replacement
+# mpg cyl disp hp drat wt qsec vs am gear carb
+# Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
+# Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
+# Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
+```
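+
+A sketch of a single bootstrap resample with base R. The seed value is arbitrary and only makes the draw reproducible.
+
+```{r}
+set.seed(123)
+
+# draw as many row positions as there are rows, with replacement
+boot_rows <- sample(nrow(mtcars), replace = TRUE)
+boot <- mtcars[boot_rows, ]
+
+# the resample has the same number of rows as the original
+nrow(boot) == nrow(mtcars)
+#> [1] TRUE
+```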
+
+
+### Ordering (integer subsetting)
+```{r, eval=FALSE}
+mtcars[order(mtcars$mpg), ]
+# mpg cyl disp hp drat wt qsec vs am gear carb
+# Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
+# Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
+# Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
+# Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
+# Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
+# Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8
+# ...
+```
+
+
+### Expanding aggregated counts (integer subsetting)
+- We can expand a count column by using `rep()`
+```{r, eval=FALSE}
+df <- tibble::tibble(x = c("Amy", "Julie", "Brian"), n = c(2, 1, 3))
+df[rep(seq_len(nrow(df)), df$n), ]
+# A tibble: 6 x 2
+# x n
+# <chr> <dbl>
+# 1 Amy 2
+# 2 Amy 2
+# 3 Julie 1
+# 4 Brian 3
+# 5 Brian 3
+# 6 Brian 3
+```
+
+
+
+### Removing columns from data frames (character)
+- We can drop a column by selecting only the columns to keep; this returns a new object and leaves `df` unchanged
+```{r, eval=FALSE}
+df[, 1]
+# A tibble: 3 x 1
+# x
+# <chr>
+# 1 Amy
+# 2 Julie
+# 3 Brian
+```
+- We can also delete the column using `NULL`
+```{r, eval=FALSE}
+df$n <- NULL
+df
+# A tibble: 3 x 1
+# x
+# <chr>
+# 1 Amy
+# 2 Julie
+# 3 Brian
+```
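+
+When only the unwanted column names are known, `setdiff()` can work out which columns to keep. A sketch with character subsetting, using a plain data frame as a stand-in for the tibble above:
+
+```{r}
+df <- data.frame(x = c("Amy", "Julie", "Brian"), n = c(2, 1, 3))
+
+# keep every column except `n`
+kept <- df[setdiff(names(df), "n")]
+names(kept)
+#> [1] "x"
+```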
+
+
+
+### Selecting rows based on a condition (logical subsetting)
+
+```{r, eval=FALSE}
+mtcars[mtcars$gear == 5, ]
+# mpg cyl disp hp drat wt qsec vs am gear carb
+# Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.7 0 1 5 2
+# Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.9 1 1 5 2
+# Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.5 0 1 5 4
+# Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6
+# Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.6 0 1 5 8
+```
+
+
+
+### Boolean algebra versus sets (logical and integer)
+- `which()` gives the positions of the `TRUE` values in a logical vector
+
+```{r, eval=FALSE}
+(x1 <- 1:10 %% 2 == 0) # 1-10 divisible by 2
+# [1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
+(x2 <- which(x1))
+# [1] 2 4 6 8 10
+(y1 <- 1:10 %% 5 == 0) # 1-10 divisible by 5
+# [1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE
+(y2 <- which(y1))
+# [1] 5 10
+x1 & y1
+# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
+```
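+
+The same results can be computed on the integer positions with set functions. A sketch continuing the example, with the objects redefined so the chunk stands alone:
+
+```{r}
+x1 <- 1:10 %% 2 == 0
+y1 <- 1:10 %% 5 == 0
+x2 <- which(x1)
+y2 <- which(y1)
+
+# x1 & y1 corresponds to set intersection
+intersect(x2, y2)
+#> [1] 10
+
+# x1 | y1 corresponds to set union
+union(x2, y2)
+#> [1]  2  4  6  8 10  5
+
+# x1 & !y1 corresponds to set difference
+setdiff(x2, y2)
+#> [1] 2 4 6 8
+```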
-- ADD SLIDES AS SECTIONS (`##`).
-- TRY TO KEEP THEM RELATIVELY SLIDE-LIKE; THESE ARE NOTES, NOT THE BOOK ITSELF.
## Meeting Videos
@@ -42,3 +436,6 @@
LOG
```
</details>
+
+
+
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -16,4 +16,5 @@ Imports:
bookdown,
rmarkdown,
tidyverse,
- DiagrammeR
+ DiagrammeR,
+ palmerpenguins
diff --git a/images/subsetting/hadley-tweet.png b/images/subsetting/hadley-tweet.png
Binary files differ.
diff --git a/images/subsetting/train-1.png b/images/subsetting/train-1.png
Binary files differ.
diff --git a/images/subsetting/train-2.png b/images/subsetting/train-2.png
Binary files differ.
diff --git a/images/subsetting/train-3.png b/images/subsetting/train-3.png
Binary files differ.