index.Rmd (3915B)
1 --- 2 title: "Riddler #2" 3 author: "Eamon Caddigan" 4 date: "2015-12-19" 5 output: html_document 6 layout: post 7 summary: Solving and simulating a probability puzzle. 8 categories: R stats 9 --- 10 11 FiveThirtyEight has a new puzzle feature called The Riddler. This week they posted [their second puzzle](http://fivethirtyeight.com/features/which-geyser-gushes-first/), which involves probability: 12 13 > You arrive at the beautiful Three Geysers National Park. You read a placard 14 > explaining that the three eponymous geysers — creatively named A, B and C — 15 > erupt at intervals of precisely two hours, four hours and six hours, 16 > respectively. However, you just got there, so you have no idea how the three 17 > eruptions are staggered. Assuming they each started erupting at some 18 > independently random point in history, what are the probabilities that A, B 19 > and C, respectively, will be the first to erupt after your arrival? 20 21 ### Analytic solution 22 23 Responses were due last night at midnight, so I hope I'm not spoiling anything by sharing mine. 24 25 When you arrive at the park, the first eruption of geyser A is likely to occur at any time within the next two hours with a uniform probability. Denoting the number of hours until the first eruption of the geyser as “A”, A ~ Uniform(0, 2). Similarly, B ~ Uniform(0, 4) and C ~ Uniform(0, 6). 26 27 To compute the probability of geyser A erupting first, separately consider the cases where geysers B and C do and do not erupt within the first two hours of waiting. 28 29 Using Bayes Theorem: 30 31 $$Pr(A first \cap B<2, C<2) = Pr(A first | B < 2, C < 2) Pr(B < 2, C < 2)$$ 32 33 Assuming independence of the geysers: 34 35 $$= Pr(A first | B < 2, C < 2) Pr(B < 2) Pr(C < 2)$$ 36 37 Considering the case here, in which all three geysers erupt in the first two hours, the probability of any geyser erupting first is 1/3. The probability of geysers B and C erupting in the first two hours is easy to calculate. 38 39 $$= (1/3) \times (1/2) \times (1/3) = 1/18$$ 40 41 Using similar logic: 42 43 $$Pr(A first \cap B > 2, C < 2) = (1/2) \times (1/2) \times (1/3) = 1/12\\ 44 Pr(A first \cap B < 2, C > 2) = (1/2) \times (1/2) \times (2/3) = 1/6\\ 45 Pr(A first \cap B > 2, C > 2) = 1 \times (1/2) \times (2/3) = 1/3$$ 46 47 These disjoint events partition the sample space, so the law of total probability dictates: 48 49 $$Pr(A first) = 1/18 + 1/12 + 1/6 + 1/3 = 23/36 \approx 0.6389$$ 50 51 Do the same thing to calculate the probability of geyser B erupting before the others: 52 53 $$Pr(B first \cap B < 2, C < 2) = (1/3) \times (1/2) \times (1/3) = 1/18\\ 54 Pr(B first \cap B < 2, C > 2) = (1/2) \times (1/2) \times (2/3) = 1/6$$ 55 56 Note that when geyser B does not erupt within the first two hours, another geyser is guaranteed to erupt before it: 57 58 $$Pr(B first \cap B > 2, C < 2) = 0\\ 59 Pr(B first \cap B > 2, C > 2) = 0$$ 60 61 Therefore: 62 63 $$Pr(B first) = 1/18 + 1/6 = 4/18 \approx 0.2222$$ 64 65 And geyser C is similar: 66 67 $$Pr(C first \cap B < 2, C < 2) = (1/3) \times (1/2) \times (1/3) = 1/18\\ 68 Pr(C first \cap B > 2, C < 2) = (1/2) \times (1/2) \times (1/3) = 1/12\\ 69 Pr(C first) = 1/18 + 1/12 = 5/36 \approx 0.1389$$ 70 71 ### Quick simulation 72 73 It never hurts to check results with a simulation. After all, math is hard and programming is easy. 74 75 ```{r, warning=FALSE, message=FALSE} 76 library(dplyr) 77 library(ggplot2) 78 79 numSamples <- 1e6 80 81 # Generate random wait times for each geyser's next eruption 82 geysers <- data_frame(A = runif(numSamples, 0, 2), 83 B = runif(numSamples, 0, 4), 84 C = runif(numSamples, 0, 6)) 85 geyserNames <- colnames(geysers) 86 87 # Identify the geyser with the smallest time-to eruption 88 geysers <- geysers %>% 89 mutate(first_geyser = geyserNames[max.col(-geysers[, geyserNames])]) 90 91 # Plot the results 92 geysers %>% 93 ggplot(aes(x = first_geyser)) + geom_bar() 94 95 # Count observations and estimate probabilities 96 geysers %>% 97 count(first_geyser) %>% 98 mutate(prob_first = n / numSamples) 99 ``` 100 101 Close enough!