www.eamoncaddigan.net

Content and configuration for https://www.eamoncaddigan.net
git clone https://git.eamoncaddigan.net/www.eamoncaddigan.net.git

index.md (2590B)


      1 ---
      2 title: "Weeknote for 2025-W21"
      3 description: "A bunch of stuff relevant to data scientists, actually"
      4 date: 2025-05-24T15:33:30-07:00
      5 draft: false
      6 categories:
      7 - Weeknotes
      8 tags:
      9 - Statistics
     10 ---
     11 
     12 It’s all stats and data science for this post, folks!
     13 
     14 ## Weird ad-hoc analysis work vs. repeatable infrastructure work
     15 
     16 I enjoy reading how people with jobs similar to mine approach their
     17 work[^work], and here Au explains his workflow as a “mostly ad-hoc” data
     18 scientist. For example:
     19 
     20 > \[T\]here's a very strong tension between how much I leverage pre-existing
     21 > code and my effectiveness writing new code.
     22 
     23 My habits aren’t unlike his when I’m doing ad-hoc stuff; I’ll write “the same
     24 code” dozens of times and find I can approach a problem a bit better with
     25 each iteration.
     26 
     27 [Randy Au---As a mostly ad-hoc'er, I've got workflow
     28 issues](https://www.counting-stuff.com/as-a-mostly-ad-hocer-ive-got-workflow-issues/)
     29 
     30 ## A Systematic Literature Review of Undergraduate Data Science Education Research
     31 
     32 We’re “data people”, right? Surely that means there is a solid body of research
     33 on the best practices for inducting students into our field, from which our
     34 course materials are drawn? The bad news: “no, not really.” 
     35 
     36 > The undergraduate data science literature that we identified often lacks
     37 > empirical data, research questions and reproducibility.
     38 
     39 The good news: work from folks like Dogucu can help correct this.
     40 
     41 [Mine Dogucu---We read many data science education papers so that you don’t
     42 have to](https://www.datapedagogy.com/posts/2025-05-17-ds-ed/ds-ed-paper)
     43 
     44 ## Everything you need to know about multivariate normal sampling
     45 
     46 If you’re the sort of person who would enjoy reading a 13k word blog post
     47 comprising a deep dive on procedures for generating random samples from the
     48 multivariate normal distribution, then:
     49 
     50 1. We should be friends
     51 2. Have I found the blog post for _you_
     52 
     53 It’s been a while since I’ve had to think deeply about some of the technical
     54 topics discussed here (e.g., matrix decomposition), but Navarro does a good job
     55 of reminding the reader what these things are and why they’re
     56 relevant---without making anyone feel weird about forgetting so much stuff from
     57 college.
     58 
     59 [Danielle Navarro---When good pseudorandom numbers go
     60 bad](https://blog.djnavarro.net/posts/2025-05-18_multivariate-normal-sampling-floating-point/)
     61 
     62 [^work]: Honestly, I will listen to anybody talk about their avocations,
     63     especially when they approach them as crafts. But it’s a welcome bonus when
     64     their experiences directly parallel mine.