www.eamoncaddigan.net

Content and configuration for https://www.eamoncaddigan.net
git clone https://git.eamoncaddigan.net/www.eamoncaddigan.net.git

commit c7c6ce7dbb37d2e776df139eb494cc34c0d7cf0b
parent a90858a8387b18e7bf510688721d53693a43c461
Author: Eamon Caddigan <eamon.caddigan@gmail.com>
Date:   Thu, 22 Feb 2024 20:26:48 -0800

Add a post about coding assistants

Diffstat:
A content/posts/coding-assistants/index.md | 243 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 243 insertions(+), 0 deletions(-)

diff --git a/content/posts/coding-assistants/index.md b/content/posts/coding-assistants/index.md
@@ -0,0 +1,243 @@

---
title: "Programmers should reject LLM-based coding assistants"
description: "Even without the ethical issues present in the coding assistants that exist, these tools are fundamentally unfit for the job."
date: 2024-02-22T20:36:52-08:00
draft: false
categories:
- Programming
- Data Science
tags:
- R
---

The complexity of our world is beyond the limits of human comprehension. In spite of this, we generally feel like we understand what’s going on around us. Each of us achieves this feat of self-deception by cobbling together an assortment of abstractions, mental models, schemas, and metaphors[^science]. When life confronts us with yet another task that demands attention, we select the most appropriate of these and ignore the universe of details that are (hopefully) irrelevant. This approach generally works surprisingly well!

Computers are less complex than the whole world, but they still resist human comprehension. Computer practitioners once again rely on abstractions, etc., in order to muddle their way through things—it’s the only hope we’ve got[^comprehend]. Programming languages are among the best tools in our arsenal, allowing us to transform human-written source code (which is a sort of mashup of human language and mathematical notation—another good tool for approximating the world) into the list of numbers comprising a CPU’s instructions. Sometimes our programs even work.

Some people truly love programming for its own sake, but there aren’t enough of them to fill all the jobs that require doing so. Further complicating matters, even these folks only _really_ like writing certain kinds of code, which generally represents a minority of the code employers need. Taken together, these observations imply that most code is written begrudgingly—it is not exactly [contributing to self-discovery or spiritual growth](https://codeberg.org/oneirophage/practice-guide-for-computer/src/branch/main/guide.pdf).

This is probably one reason that large language model-based coding assistants (LLMCAs) are becoming popular with some programmers. The most well-known of these tools is GitHub Copilot, developed by Microsoft and OpenAI[^ai]. LLMs work by learning representations of language (including, in the case of LLM-based coding assistants, programming languages) that result in good performance at predicting the next token in a sequence. A programmer using an LLMCA to help with their work experiences “auto-complete for code”. In short, these tools speed up the process of writing programs, and “writing programs” is the thing that programmers are paid to do.

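As a rough illustration of what “predicting the next token” means, here is a toy sketch of my own; the four-token vocabulary and the probabilities are invented, and a real model scores tens of thousands of tokens with learned weights rather than a hard-coded vector.

```r
# Toy sketch of greedy next-token prediction. The vocabulary and the
# probabilities are made up purely for illustration.
vocab <- c("filter(", "mutate(", "summarize(", ")")
next_token_probs <- c(0.15, 0.55, 0.25, 0.05)
names(next_token_probs) <- vocab

# "Auto-complete": emit the single most probable next token.
names(which.max(next_token_probs))
#> [1] "mutate("
```
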
There are ethical issues with the use of the LLMCAs that currently exist. Copilot specifically was trained on code that was posted to GitHub, and the authors of this code were not asked for their informed consent to have their work used this way[^ethics]. LLM-based models are also particularly energy-intensive, which should concern anybody who cares about climate change[^climate]. LLMCAs are also probably illegal[^illegal], as Copilot is known to have violated the licenses of most of the open source code posted there. Especially damning is the use of [“copyleft” code]({{< ref "use-the-gpl" >}}) in its training corpus. Such code was licensed in a manner that allows for its adaptation and reuse (which is what Copilot is ultimately doing—adapting and reusing code at scale), but _only_ when the resulting code is also shared under the same license. Whether or not you’d like to see the proliferation of Copilot-generated code result in _all_ programs becoming copyleft, I don’t think that’s what its users (or their employers) intend to have happen.

But the above issues with LLMCAs are at least solvable in theory. Viz: a company as well-resourced as Microsoft _could_ train its model using code that was collected with the authors’ explicit consent, and advances in energy infrastructure and algorithmic efficiency _might_ bring the climate impact of coding assistants down to acceptable levels. However, there is an existential issue with LLMCAs that should inspire programmers to reject them out of hand: even though they address a real problem, they are the wrong tool for the job.

The real problem that LLMCAs attempt to address is that many programmers are ill-served by the rest of their tooling. I don’t have the personal experience with web programming to opine on the state of the JavaScript ecosystem, but there is an emerging recognition that the current status quo (which starts by reaching for the JavaScript framework du jour, and solves the problems that arise from using it by bolting on additional dependencies) is unpleasant and untenable. This approach to developing applications may generate a lot of code, but it isn’t really _programming_[^kids]; while bolting together disparate parts is sometimes an appropriate way to build something, it can’t be the only way we build things. As the early 20th-century biologist Edward J. v. K. Menge noted[^menge]:

> Breeding homing pigeons that could cover a given space with ever increasing rapidity did not give us the laws of telegraphy, nor did breeding faster horses bring us the steam locomotive.

Sometimes people get the opportunity to apply cleverness and creativity to find new solutions to problems. This usually starts by taking a step back and understanding the problem space in a holistic manner, and then finding a different way to think about it. Coders working with LLMCAs[^unlucky] won’t be able to do this very often.

So what does a good solution to this tooling problem look like? Here I’ll share an example from the R world, since it’s the primary language I’ve programmed in for the past ten years. I’ve been doing statistics and “data science” for longer than that, and programming longer still, but two important things happened ten years ago that turned me into an “R programmer”: I started a new job that was going to require more statistical programming than I’d done in academia, and Hadley Wickham was hard at work on a new R package called dplyr (which was to become the centerpiece of a family of packages collectively called the Tidyverse[^hadley]).

I used R before 2014, but I went to tremendous lengths to avoid actually programming in it. Instead, I would do all of my data wrangling in Python (version 2, which was the style at the time) and then load “tidy data” into R to perform t-tests and ANOVA. In my experiments with R as a programming language, I found its native interface for manipulating data frames[^data-frame] (now frequently called “base R” to distinguish it from Tidyverse-dependent approaches) to be clunky and unintuitive. The Tidyverse changed all that; dplyr introduced a suite of “pure” functions[^pure-function] for data transformation. They had easy-to-remember names (all verbs, since they performed actions on data frames), consistent argument ordering, and were designed to work well with [the forward pipe operator from the magrittr package]({{< ref "r-pipe-equality" >}}).

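To make that contrast concrete, here is a small sketch of my own (it uses R’s built-in `iris` data rather than an example from the post): the same summary written against the base R data frame interface, and then as a dplyr pipeline.

```r
library(dplyr)

# Base R: mean petal length by species, keeping only wider-petaled flowers.
aggregate(Petal.Length ~ Species,
          data = iris[iris$Petal.Width > 0.5, ],
          FUN = mean)

# The same operation as a dplyr pipeline: each verb is a pure function
# that takes a data frame and returns a new one.
iris %>%
  filter(Petal.Width > 0.5) %>%
  group_by(Species) %>%
  summarize(mean_petal_length = mean(Petal.Length))
```

Both compute the same summary; the difference is in how the steps read.
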
Data wrangling in the Tidyverse just _feels_ different (and better) than working with its predecessors. While doing a live coding exercise as I interviewed for a previous job, one of my then-future colleagues—a die-hard Python user—commented on how “fluid” programming in the Tidyverse seemed. Compared to the syntax of Pandas, a Python data frame library that provides an interface not too different from base R’s, the Tidyverse is a fundamentally different beast.

That stuff about metaphors and abstractions is relevant here, because it explains why the Tidyverse feels different. It operates on a different level of abstraction than base R’s data frame operations; i.e., it depends on a different mental model of the underlying data structures. Just to be clear: its advantages do come at some cost, and not everybody agrees that these trade-offs are justified. But based on the popularity of the Tidyverse, I am not alone in thinking they are. Almost everything we do on computers follows this pattern. Writing our data analyses in R and Python is much easier than using a “low-level” language like C, but this additional layer of abstraction can make our programs slower and less memory-efficient. For that matter, carefully optimized assembly code can outperform C, and I haven’t met anybody who analyzes data using assembly. Programming languages (and paradigms, libraries, frameworks, etc.) proliferate because they solve different problems, generally by working at different levels of abstraction.

LLMCAs also introduce trade-offs: for example, programmers can generate code more quickly, but they don’t understand it as deeply as if they had written it themselves. Rather than simply argue about when (if ever) this trade-off is worth making, I invite you to imagine that Copilot had come to R before the Tidyverse had. Instead of getting an interface that allows data scientists to work faster by operating at a more comfortable level of abstraction, we’d be writing base R at faster rates using its suggestions. Both approaches result in programming tasks being finished more quickly. However, the programmer using the Tidyverse knows exactly why and how their code works (at least at one level of abstraction) and enjoys the time they spend on it. The programmer using Copilot would only have a sketchy sense that their code seems to work, and they probably wouldn’t have much fun getting there. This is why I fundamentally oppose LLMCAs: the challenges that drive programmers to use them would be better solved with their own “Tidyverses”.

From a business perspective, it might seem less risky to rent access to LLMCAs than to invest in the development of new tooling, but this is a mistake. The tools may be _relatively_ inexpensive to use now, but that’s bound to change eventually. The cost of building and deploying a new LLMCA ensures that only a few Big Tech players can compete, and these companies have a track record of collusion[^collusion].
I also find that many hiring managers underestimate how much more productive their workers can be when given a challenging but fun task than when asked to do something “easier” that’s boring.

I’m no business guy, so my call to action is directed primarily to my fellow “senior” developers. Don’t evangelize for LLMCAs—instead, push harder for the resources to develop better tooling. If you currently use LLMCAs yourself, identify the types of tasks that benefit the most from them, and note these as spaces in need of creative solutions. Encourage junior programmers to develop a deeper understanding of the tools they currently use, and insist that your colleagues at all levels imagine something better.

[^science]: For its part, science can be a great tool for exposing the limitations of these mental models. But at the end of the day, it’s still only producing different, hopefully better models, operating at specific levels of abstraction.

[^comprehend]: I invite any extremely hardcore coder who scoffs at my claim that computers are difficult to comprehend to reflect on the last time they were required to think about the physics of transistors or the capacitance of circuit traces while programming their web app or whatever.

[^ai]: Like many LLM-based technologies, these are currently being marketed as “AI”. There’s no reason to believe that these machine learning technologies will bring us closer to “general” artificial intelligence.

[^ethics]: Somebody may argue that this sort of use was permitted by GitHub’s Terms of Service, but there are two flaws in this argument. First, the people posting code to GitHub are not necessarily the code’s authors; plenty of projects have been “mirrored” there by people who were only hoping to make them more accessible. The more glaring error in this argument is that it commits the cardinal sin of mistaking “not illegal” for “ethical”. Plenty of atrocious behaviors have been perfectly legal. Stealing people’s computer code is certainly not in the same class of behavior as slavery and genocide, but I’ve learned not to assume that those who are quick to point to a Terms of Service are taking ethical issues seriously.

[^climate]: [ChatGPT alone is already consuming the energy of 33,000 homes](https://www.nature.com/articles/d41586-024-00478-x).

[^illegal]: I say “probably” because the courts have yet to rule on the matter. At least [one lawsuit](https://githubcopilotlitigation.com/) has already been filed, but I can’t say that I’m particularly optimistic that the courts would rule in favor of individual hobbyists against the interests of some of the wealthiest individuals and corporations in the world.

[^kids]: Just to be clear, this isn’t a “kids these days” rant about the skills of junior programmers; if anybody deserves the blame here, it’s the managers and senior programmers who allowed this rotten situation to fester.

[^menge]: Menge, E. J. v. K. (1930). Biological problems and opinions. _The Quarterly Review of Biology, 5_(3), 348-359.

[^unlucky]: And also the doubly-unlucky coders who are neither allowed to use LLMCAs nor given the resources and opportunity to be clever and creative.

[^hadley]: There was a brief period when this was colloquially called “the Hadleyverse” in homage to Wickham, but he insisted on the new name.

[^data-frame]: A “data frame” is another abstraction, used by many programming languages and libraries to represent observations about distinguishable units; it uses the same metaphor of rows and columns that spreadsheet programs like Microsoft Excel use.

[^pure-function]: A “pure function” is one that “doesn’t have side effects”. In slightly plainer English, a pure function doesn’t do anything other than (potentially) return a new thing. Object methods that update object attributes aren’t pure functions, nor are any functions that modify their arguments or global variables.

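    A minimal R sketch of the distinction (my own illustration; the function names are invented):

    ```r
    # Pure: computes a result from its input and touches nothing else.
    add_one <- function(x) x + 1

    # Impure: changes a variable outside its own scope as a side effect.
    counter <- 0
    bump_counter <- function() counter <<- counter + 1
    ```
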
[^collusion]: For instance, several Big Tech companies [recently reached a settlement](https://www.npr.org/sections/thetwo-way/2014/04/24/306592297/tech-giants-settle-wage-fixing-lawsuit) with workers alleging that the companies had engaged in illegal wage-fixing.