www.eamoncaddigan.net

Content and configuration for https://www.eamoncaddigan.net
git clone https://git.eamoncaddigan.net/www.eamoncaddigan.net.git

commit 3778f66e7bca3d99b652eb20545de5363218d4db
parent a727ff5d4677c86ba399191f980be0b5a5db911b
Author: Eamon Caddigan <eamon.caddigan@gmail.com>
Date:   Sat, 24 May 2025 20:19:58 -0700

Add weeknote for 2025-W21

Diffstat:
A content/posts/weeknotes/2025-w21/index.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+), 0 deletions(-)

diff --git a/content/posts/weeknotes/2025-w21/index.md b/content/posts/weeknotes/2025-w21/index.md
@@ -0,0 +1,64 @@
+---
+title: "Weeknote for 2025-W21"
+description: "A bunch of stuff relevant to data scientists, actually"
+date: 2025-05-24T15:33:30-07:00
+draft: false
+categories:
+- Weeknotes
+tags:
+- Statistics
+---
+
+It’s all stats and data science for this post folks!
+
+## Weird ad-hoc analysis work vs. repeatable infrastructure work
+
+I enjoy reading how people with jobs similar to mine approach their
+work[^work], and here Au explains his workflow as a “mostly ad-hoc” data
+scientist. For example:
+
+> \[T\]here's a very strong tension between how much I leverage pre-existing
+> code and my effectiveness writing new code.
+
+My habits aren’t unlike his when I’m doing ad-hoc stuff; I’ll write “the same
+code” dozens of times and find I can approach a problem a bit better with
+each iteration.
+
+[Randy Au---As a mostly ad-hoc'er, I've got workflow
+issues](https://www.counting-stuff.com/as-a-mostly-ad-hocer-ive-got-workflow-issues/)
+
+## A Systematic Literature Review of Undergraduate Data Science Education Research
+
+We’re “data people”, right? Surely that means there is a solid body of research
+on the best practices for inducting students into our field, from which our
+course materials are drawn? The bad news: “no, not really.”
+
+> The undergraduate data science literature that we identified often lacks
+> empirical data, research questions and reproducibility.
+
+The good news: work from folks like Dogucu can help correct this.
+
+[Mine Dogucu---We read many data science education papers so that you don’t
+have to](https://www.datapedagogy.com/posts/2025-05-17-ds-ed/ds-ed-paper)
+
+## Everything you need to know about multivariate normal sampling
+
+If you’re the sort of person who would enjoy reading a 13k word blog post
+comprising a deep dive on procedures for generating random samples from the
+multivariate normal distribution, then:
+
+1. We should be friends
+2. Have I found the blog post for _you_
+
+It’s been a while since I’ve had to think deeply about some of the technical
+topics discussed here (e.g., matrix decomposition), but Navarro does a good job
+of reminding the reader what these things are and why they’re
+relevant---without making anyone feel weird about forgetting so much stuff from
+college.
+
+[Danielle Navarro---When good pseudorandom numbers go
+bad](https://blog.djnavarro.net/posts/2025-05-18_multivariate-normal-sampling-floating-point/)
+
+[^work]: Honestly, I will listen to anybody talk about their avocations,
+    especially when they approach them as crafts. But it’s a welcome bonus when
+their experiences directly parallel mine.