www.eamoncaddigan.net

Content and configuration for https://www.eamoncaddigan.net
git clone https://git.eamoncaddigan.net/www.eamoncaddigan.net.git

index.md (3643B)


---
title: "Weeknote for 2025-W17"
description: "Tools for reproducible R, Python data classes, the wrong abstraction"
date: 2025-04-21T19:48:11-07:00
draft: false
categories:
- Weeknotes
tags:
- R
- Python
---

## Managing reproducible environments in R

I saw [a toot on
Mastodon](https://mastodon.social/@joranelias/114361012740025621) bemoaning the
experience of using `renv` to manage “reproducible environments” in R. I’ve had
the exact experience the author describes:

> More than half the time when I actually have to *use* it to restore a project
> that hasn't been touched in more than a year or two `renv::restore()` fails
> with tons of errors, I struggle for hours to figure it out and typically just
> give up and simply install current versions of all the packages and pray.

To give a concrete example, I was recently contacted by a journalist who
wondered if I could update a quick and dirty analysis that I posted to Twitter
_years_ ago, and `renv` failed because the older versions of packages wouldn’t
build with the version of GCC on my computer. (Fortunately, my prayers were
answered, and installing the current versions of all the packages worked just
fine.)

[A helpful reply](https://fosstodon.org/@brodriguesco/114361023856952425) to
the toot recommended `rix`, and I plan to check it out!

[rix: Reproducible Environments with Nix](https://docs.ropensci.org/rix/)

## Using data classes in Python

I’ve been writing a lot of Python at work lately, primarily using
[PySpark](https://spark.apache.org/docs/latest/api/python/index.html). Now, I’ve
been using Python for a long time,[^old] but after doing some work in Julia I
fell in love with [multiple
dispatch](https://docs.julialang.org/en/v1/manual/methods/), and lately I feel
encumbered by Python’s approach to OOP.

My interest was piqued by the post linked below, which is about leveraging
Python’s [data classes](https://docs.python.org/3/library/dataclasses.html)
(and relying more heavily on factory methods) to simplify object definitions.
The approach it advocates feels like an improvement over the Python classes
I’ve written. I would still rather extend a class’s capabilities by writing new
“outer methods” (as in Julia) than create a whole subclass (as Python requires
me to do). I expect subclassing data classes will be less onerous than the
status quo, but I just have to try this whole thing out and see how it feels.

[Glyph --- Stop Writing `__init__`
Methods](https://blog.glyph.im/2025/04/stop-writing-init-methods.html)

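To make that concrete for myself, here is a rough sketch of the pattern as I
understand it. It is not code from the linked post: `Temperature`, its
`from_celsius` and `from_fahrenheit` factories, and `describe` are all
hypothetical names I made up for illustration. The data class gets its
`__init__` generated for it, and class methods cover the different ways of
building one. The `describe` function uses the standard library's
`functools.singledispatch`, which is the closest thing to an “outer method” I
know of in Python, though it only dispatches on the type of the first argument
and so is not a real substitute for Julia's multiple dispatch.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class Temperature:
    """Hypothetical value object; the generated __init__ just stores the field."""

    kelvin: float

    # Factory methods cover each way of building the object, so there is
    # no hand-written __init__ juggling optional arguments.
    @classmethod
    def from_celsius(cls, celsius: float) -> "Temperature":
        return cls(kelvin=celsius + 273.15)

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        return cls.from_celsius((fahrenheit - 32.0) * 5.0 / 9.0)


@singledispatch
def describe(value: object) -> str:
    """Fallback used for any type without a registered implementation."""
    return f"unknown value: {value!r}"


# Registered outside the class body, so Temperature gains a new behavior
# without being edited or subclassed. Dispatch looks only at the type of
# the first argument.
@describe.register(Temperature)
def _(value: Temperature) -> str:
    return f"{value.kelvin:.2f} K"


if __name__ == "__main__":
    freezing = Temperature.from_fahrenheit(32.0)
    print(describe(freezing))
    print(describe("not a temperature"))
```

Supporting another type later would mean registering another function on
`describe`, not touching `Temperature` or adding a subclass.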
## “Duplication is far cheaper than the wrong abstraction”

[Maya (of maya.land)
shared](https://maya.land/responses/2025/04/15/specific-advice-refactoring-code.html)
this excellent and practical advice on how to identify and fix the problems
that arise when a piece of code is committed to the “wrong abstraction”. I’ve
been guilty of overenthusiastically introducing quickly-outgrown abstractions,
so don’t take the following as an indictment of my colleagues’ skill. However,
the aforementioned Python code happens to be part of a major refactoring effort
(which involves changing languages and platforms), so now I’m inheriting years
of accumulated tweaks and kludges, and I need to decide what to keep[^pm]. This
advice couldn’t have come at a better time for me professionally.

[Sandi Metz --- The Wrong
Abstraction](https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction)

[^old]: I picked up Python to replace C for a project in 2004, and it’s been my
    “primary language” on and off for the past 2+ decades. Hey, I’m old.

[^pm]: And also help others figure out what to keep.