---
title: "Weeknote for 2025-W21"
description: "A bunch of stuff relevant to data scientists, actually"
date: 2025-05-24T15:33:30-07:00
draft: false
categories:
  - Weeknotes
tags:
  - Statistics
---

It’s all stats and data science for this post, folks!

## Weird ad-hoc analysis work vs. repeatable infrastructure work

I enjoy reading how people with jobs similar to mine approach their
work[^work], and here Au explains his workflow as a “mostly ad-hoc” data
scientist. For example:

> \[T\]here's a very strong tension between how much I leverage pre-existing
> code and my effectiveness writing new code.

My habits aren’t unlike his when I’m doing ad-hoc stuff; I’ll write “the same
code” dozens of times and find I can approach a problem a bit better with each
iteration.

[Randy Au---As a mostly ad-hoc'er, I've got workflow
issues](https://www.counting-stuff.com/as-a-mostly-ad-hocer-ive-got-workflow-issues/)

## A Systematic Literature Review of Undergraduate Data Science Education Research

We’re “data people”, right? Surely that means there is a solid body of research
on the best practices for inducting students into our field, from which our
course materials are drawn? The bad news: “no, not really.”

> The undergraduate data science literature that we identified often lacks
> empirical data, research questions and reproducibility.

The good news: work from folks like Dogucu can help correct this.

[Mine Dogucu---We read many data science education papers so that you don’t
have to](https://www.datapedagogy.com/posts/2025-05-17-ds-ed/ds-ed-paper)

## Everything you need to know about multivariate normal sampling

If you’re the sort of person who would enjoy reading a 13k-word blog post
comprising a deep dive on procedures for generating random samples from the
multivariate normal distribution, then:

1. We should be friends
2. Have I found the blog post for _you_

It’s been a while since I’ve had to think deeply about some of the technical
topics discussed here (e.g., matrix decomposition), but Navarro does a good job
of reminding the reader what these things are and why they’re
relevant---without making anyone feel weird about forgetting so much stuff from
college.

[Danielle Navarro---When good pseudorandom numbers go
bad](https://blog.djnavarro.net/posts/2025-05-18_multivariate-normal-sampling-floating-point/)

[^work]: Honestly, I will listen to anybody talk about their avocations,
    especially when they approach them as crafts. But it’s a welcome bonus when
    their experiences directly parallel mine.