commit c7c6ce7dbb37d2e776df139eb494cc34c0d7cf0b
parent a90858a8387b18e7bf510688721d53693a43c461
Author: Eamon Caddigan <eamon.caddigan@gmail.com>
Date: Thu, 22 Feb 2024 20:26:48 -0800
Add a post about coding assistants
Diffstat:
1 file changed, 243 insertions(+), 0 deletions(-)
diff --git a/content/posts/coding-assistants/index.md b/content/posts/coding-assistants/index.md
@@ -0,0 +1,243 @@
+---
+title: "Programmers should reject LLM-based coding assistants"
+description: "Even setting aside the ethical issues with existing coding
+assistants, these tools are fundamentally unfit for the job."
+date: 2024-02-22T20:36:52-08:00
+draft: false
+categories:
+- Programming
+- Data Science
+tags:
+- R
+---
+
+The complexity of our world is beyond the limits of human comprehension. In
+spite of this, we generally feel like we understand what’s going on around
+us. Each of us achieves this feat of self-deception by cobbling together an
+assortment of abstractions, mental models, schemas, and metaphors[^science].
+When life confronts us with yet another task that demands attention, we
+select the most appropriate of these and ignore the universe of details that
+are (hopefully) irrelevant. This approach generally works surprisingly well!
+
+Computers are less complex than the whole world, but they still resist human
+comprehension. Computer practitioners once again rely on abstractions, etc.,
+in order to muddle their way through things—it’s the only hope we’ve
+got[^comprehend]. Programming languages are among the best tools in our
+arsenal, allowing us to transform human-written source code (which is a sort
+of mashup of human language and mathematical notation—another good tool for
+approximating the world) into the list of numbers comprising a CPU’s
+instructions. Sometimes our programs even work.
+
+Some people truly love programming for its own sake, but there aren’t enough
+of them to fill all the jobs that require doing so. Further complicating
+matters, even these folks only _really_ like writing certain kinds of code,
+which generally represents a minority of the code employers need. When taken
+together, these observations imply that most code is written begrudgingly—it
+is not exactly [contributing to self-discovery or spiritual
+growth](https://codeberg.org/oneirophage/practice-guide-for-computer/src/branch/main/guide.pdf).
+
+This is probably one reason that large language model-based coding
+assistants (LLMCAs) are becoming popular with some programmers. The most
+well-known of these tools is GitHub Copilot, developed by Microsoft and
+OpenAI[^ai]. LLMs work by learning representations of language (including,
+in the case of LLM-based coding assistants, programming languages) that
+result in good performance at predicting the next token in a sequence.
+A programmer using an LLMCA to help with their work experiences this as
+“auto-complete for code”. In short, these tools speed up the process of writing
+programs, and “writing programs” is the thing that programmers are paid to
+do.
+
+There are ethical issues with the use of the LLMCAs that currently exist.
+Copilot specifically was trained on code that was posted to GitHub, and the
+authors of this code were not asked for their informed consent to have their
+work used this way[^ethics]. LLMs are also particularly
+energy intensive, which is something that should concern anybody who cares
+about climate change[^climate]. LLMCAs are also probably illegal[^illegal]
+as Copilot is known to have violated the licenses of most open source code
+posted to GitHub. Especially damning is the use of [“copyleft” code]({{< ref
+"use-the-gpl" >}}) in its training corpus. Such code was licensed in
+a manner that allows for its adaptation and reuse (which is what Copilot is
+ultimately doing—adapting and reusing code at scale), but _only_ when the
+resulting code is also shared with the same license. Whether or not you’d
+like to see the proliferation of Copilot-generated code result in _all_
+programs becoming copyleft, I don’t think that’s what its users (or their
+employers) intend to have happen.
+
+But the above issues with LLMCAs are at least solvable in theory. Viz:
+a company as well-resourced as Microsoft _could_ train its model using code
+that was collected with the authors’ explicit consent, and advances in
+energy infrastructure and algorithmic efficiency _might_ bring the climate
+impact of coding assistants down to acceptable levels. However, there is
+an existential issue with LLMCAs that should inspire programmers to reject
+them out of hand: even though they address a real problem, they are the
+wrong tool for the job.
+
+The real problem that LLMCAs attempt to address is that many programmers are
+ill-served by the rest of their tooling. I don’t have the personal
+experience with web programming to opine on the state of the JavaScript
+ecosystem, but there is an emerging recognition that the current status quo
+(which starts by reaching for the JavaScript framework du jour, and solves
+the problems that arise from using it by bolting on additional dependencies)
+is unpleasant and untenable. This approach to developing applications may
+generate a lot of code, but it isn’t really _programming_[^kids]; while
+bolting together disparate parts is sometimes an appropriate way to build
+something, it can’t be the only way we build things. As the early 20th
+century biologist Edward J. v. K. Menge noted[^menge]:
+
+> Breeding homing pigeons that could cover a given space with ever
+> increasing rapidity did not give us the laws of telegraphy, nor did
+> breeding faster horses bring us the steam locomotive.
+
+Sometimes people get the opportunity to apply cleverness and creativity to
+find new solutions to problems. This usually starts by taking a step back
+and understanding the problem space in a holistic manner, and then finding
+a different way to think about it. Coders working with LLMCAs[^unlucky]
+won’t be able to do this very often.
+
+So what does a good solution to this tooling problem look like? Here I’ll
+share an example from the R world, since it’s the primary language I’ve
+programmed in for the past ten years. I’ve been doing statistics and “data
+science” for longer than that, and programming longer still, but two
+important things happened ten years ago that turned me into an “R
+programmer”: I started a new job that was going to require more statistical
+programming than I’d done in academia, and Hadley Wickham was hard at work
+on a new R package called dplyr (which was to become the centerpiece of
+a family of packages collectively called the Tidyverse[^hadley]).
+
+I used R before 2014, but I went to tremendous lengths to avoid actually
+programming it. Instead, I would do all of my data wrangling in Python (in
+version 2, which was the style at the time) and then load “tidy data” into
+R to perform t-tests and ANOVA. In my experiments with R as a programming
+language, I found its native interface for manipulating data
+frames[^data-frame] (now frequently called “base R” to distinguish it from
+Tidyverse-dependent approaches) to be clunky and unintuitive. The Tidyverse
+changed all that; dplyr introduced a suite of “pure”
+functions[^pure-function] for data transformation. They had easy-to-remember
+names (all verbs, since they performed actions on data frames), consistent
+argument ordering, and were designed to work well with [the forward pipe
+operator from the magrittr package]({{< ref "r-pipe-equality" >}}).
+
+Data wrangling in the Tidyverse just _feels_ different (and better) than
+working with its predecessors. While doing a live coding exercise as
+I interviewed for a previous job, one of my then-future-colleagues—a
+die-hard Python user—commented on how “fluid” programming in the Tidyverse
+seemed. Compared to the syntax of pandas, a Python data frame library that
+provides an interface not too different from base R’s, it’s a fundamentally
+different beast.
+
+That stuff about metaphors and abstractions is relevant here, because these
+explain why the Tidyverse feels different. It operates on a different level
+of abstraction than base R’s data frame operations; i.e., it depends on
+a different mental model of the underlying data structures. Just to be
+clear: its advantages do come at some cost, and not everybody agrees that
+these trade-offs are justified. But based on the popularity of the
+Tidyverse, I am not alone in thinking they are. Almost everything we do on
+computers follows this pattern. Writing our data analyses in R and Python is
+much easier than using a “low-level” language like C, but this additional
+layer of abstraction can make our programs slower and less memory-efficient.
+For that matter, carefully optimized assembly code can outperform C, yet
+I haven’t met anybody who analyzes data using assembly. Programming
+languages (and paradigms, libraries, frameworks, etc.) proliferate because
+they solve different problems, generally by working at different levels of
+abstraction.
+
+LLMCAs also introduce trade-offs: for example, programmers can generate code
+more quickly, but they don’t understand it as deeply as if they had written
+it themselves. Rather than simply argue about when (if ever) this trade-off
+is worth making, I invite you to imagine that Copilot had come to R before
+the Tidyverse had. Instead of getting an interface that allows data
+scientists to work faster by operating at a more comfortable level of
+abstraction, we’d be writing base R at faster rates using its suggestions.
+Both approaches result in programming tasks being finished more quickly.
+However, the programmer using the Tidyverse knows exactly why and how their
+code works (at least at one level of abstraction) and enjoyed the time they
+spent on it. The programmer using Copilot has only a sketchy sense
+that their code seems to work, and they probably didn’t have much fun
+getting there. This is why I fundamentally oppose LLMCAs: the challenges
+that drive programmers to use them would be better solved with their own
+“Tidyverses”.
+
+From a business perspective, it might seem less risky to rent access to
+LLMCAs than to invest in the development of new tooling, but this is a mistake.
+The tools may be _relatively_ inexpensive to use now, but that’s bound to
+change eventually. The cost of building and deploying a new LLMCA ensures
+that only a few Big Tech players can compete, and these companies have a
+track record of collusion[^collusion]. I also find that many hiring managers
+underestimate how much more productive their workers can be when given
+a challenging but fun task than when asked to do something “easier” that’s
+boring.
+
+I’m no business guy, so my call to action is directed primarily to my fellow
+“senior” developers. Don’t evangelize for LLMCAs—instead push harder for the
+resources to develop better tooling. If you currently use LLMCAs yourself,
+identify the types of tasks that benefit the most from them, and note these
+as spaces in need of creative solutions. Encourage junior programmers to
+develop a deeper understanding of the tools they use currently, and insist
+that your colleagues at all levels imagine something better.
+
+[^science]: For its part, science can be a great tool for exposing the
+ limitations of these mental models. But at the end of the day, it’s
+ still only producing different, hopefully better models, operating at
+ specific levels of abstraction.
+
+[^comprehend]: I invite any extremely hardcore coder who scoffs at my claim
+ that computers are difficult to comprehend to reflect on the last time
+ they were required to think about the physics of transistors or the
+ capacitance of circuit traces when they were programming their web app
+ or whatever.
+
+[^ai]: Like many LLM-based technologies, these are currently being marketed
+ as “AI”. There’s no reason to believe that these machine learning
+ technologies will bring us closer to “general” artificial intelligence.
+
+[^ethics]: Somebody may argue that this sort of use was permitted by
+ GitHub’s Terms of Service, but there are two flaws in this argument.
+ First, the people posting code to GitHub are not necessarily the code’s
+ authors; plenty of projects have been “mirrored” there by people who
+ were only hoping to make them more accessible. The more glaring error in
+ this argument is that it commits the cardinal sin of mistaking “not
+ illegal” for “ethical”. Plenty of atrocious behaviors have been
+ perfectly legal. Stealing people’s computer code is certainly not in the
+ same class of behavior as slavery and genocide, but I’ve still learned not
+ to assume that those who are quick to point to a Terms of Service are
+ taking ethical issues seriously.
+
+[^climate]: [ChatGPT alone is already consuming the energy of 33,000
+ homes](https://www.nature.com/articles/d41586-024-00478-x).
+
+[^illegal]: I say “probably” because the courts have yet to rule on the
+ matter. At least [one lawsuit](https://githubcopilotlitigation.com/) has
+ already been filed, but I can’t say that I’m particularly optimistic
+ that the courts would rule in favor of individual hobbyists against the
+ interests of some of the wealthiest individuals and corporations in the
+ world.
+
+[^kids]: Just to be clear, this isn’t a “kids these days” rant about the
+ skills of junior programmers; if anybody deserves the blame here it’s
+ the managers and senior programmers who allowed this rotten situation to
+ fester.
+
+[^menge]: Menge, E. J. v. K. (1930). Biological problems and opinions. _The
+ Quarterly Review of Biology, 5_(3), 348-359.
+
+[^unlucky]: And also the doubly-unlucky coders who are neither allowed to
+ use LLMCAs nor given the resources and opportunity to be clever and
+ creative.
+
+[^hadley]: There was a brief period where this was colloquially called “the
+ Hadleyverse” in homage to Wickham, but he insisted on the new name.
+
+[^data-frame]: A “data frame” is another abstraction, used by many
+ programming languages and libraries to represent observations about
+ distinguishable units; it uses the same metaphor of rows and columns
+ that spreadsheet programs like Microsoft Excel use.
+
+[^pure-function]: A “pure function” is one that “doesn’t have side effects”.
+ In slightly plainer English, a pure function doesn’t do anything other
+ than (potentially) return a new thing. Object methods that update object
+ attributes aren’t pure functions, nor are any functions that modify
+ their arguments or global variables.
+
+[^collusion]: For instance, several Big Tech companies [reached
+ a settlement](https://www.npr.org/sections/thetwo-way/2014/04/24/306592297/tech-giants-settle-wage-fixing-lawsuit)
+ in 2014 with workers alleging that they had engaged in illegal wage-fixing.