By Serdar Yegulalp on 2023-05-22 08:00:00-04:00
By now it's obvious to most anyone that ChatGPT and other large language models amount to plagiarism as a service. That may not have been the intent behind them, but it's what they've turned into. People, myself included, have good reason to worry about how these things can be used to displace creative workers: not because they are that good, but because cheap replacements for human labor of any kind are superficially attractive.
What's been kicking around in my head is how drawing parallels between ChatGPT and actual creativity obscures the mechanisms by which transformative creative work gets made. We've become so accustomed to devaluing real human creative work that maybe it's no surprise this next set of adulterations comes so readily.
ChatGPT and other LLMs work by using statistics to guess how likely it is that a given thing will be followed by another given thing. But transformative work isn't about taking something and following it with whatever is statistically likely to come next. It's about doing the exact opposite: following it with something unlikely, but in a way that insists on its own connection with what came before and defends that connection. That is far more difficult, but far more useful and rewarding.
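To make the contrast concrete, here's a toy sketch of that mechanism. The context, candidates, and probabilities are all invented for illustration; a real LLM computes a distribution like this with a neural network over a vocabulary of tens of thousands of tokens, but the shape of the operation is the same: score the candidates, then pick from the likely end.

```python
# Toy illustration of next-token prediction. Everything here is made up;
# the point is only the shape of the mechanism: score, then pick what's likely.
toy_model = {
    "it was a dark and": {"stormy": 0.62, "quiet": 0.21, "lonely": 0.12, "luminous": 0.05},
}

def most_likely_next(context: str) -> str:
    """Greedy decoding: take the statistically safest continuation."""
    candidates = toy_model[context]
    return max(candidates, key=candidates.get)

print(most_likely_next("it was a dark and"))  # -> "stormy", the cliche
# A transformative choice would be something like the low-probability "luminous",
# chosen and then justified, which is exactly what this mechanism never does on its own.
```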
Call it the "constructive surprise" criterion: we love it when we are shown how two apparently unrelated things can share common ground in a way that enriches them both. I remember Douglas Hofstadter arguing that one of the hallmarks of consciousness is the ability to analogize. The value of a transformative work isn't proportionate to the statistical likelihood of the elements it arises from, but inversely so. LLMs don't seem equipped to make such leaps.
I am also reminded of Sir Karl Popper's criterion for the value of a theory. A theory that predicts something with a high probability of happening isn't very interesting, because it tells us little we didn't already expect. A theory that predicts something improbable is more interesting, especially if it has high predictive power. "It will rain soon" is not very useful; "soon" could be "in an hour" or "within the week". "It will rain for 1 1/2 hours starting at 9am tomorrow" is far more useful, especially if it turns out to be accurate.
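For what it's worth, information theory makes the same point with a number: the information an outcome carries (its surprisal, -log2 of its probability) goes up as its probability goes down. The probabilities below are invented, but they show why the narrow forecast is worth so much more than the vague one when it comes true.

```python
import math

def surprisal_bits(p: float) -> float:
    """Shannon surprisal: information carried by an outcome of probability p, in bits."""
    return -math.log2(p)

# Made-up probabilities for the two forecasts above.
p_vague    = 0.95   # "it will rain soon" is nearly guaranteed to come true somehow
p_specific = 0.02   # "1 1/2 hours of rain starting at 9am tomorrow" is a long shot

print(f"vague forecast:    {surprisal_bits(p_vague):.2f} bits")    # about 0.07 bits
print(f"specific forecast: {surprisal_bits(p_specific):.2f} bits") # about 5.64 bits
```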
We value new information, improbable information, more than what's predictable and probable. This is not to say predictable things are useless, only that having a way to know about less predictable things, and thus increase our knowledge, is more useful. The same goes for creative work: the new is more useful to us than some recombination of the old. Having a large pool of the old to rejigger back into the new doesn't change the underlying issue. There's room in this world for autocomplete, but autocomplete — especially "page-level autocomplete", as Brad DeLong thinks of LLMs — is not a creative system.
All this further reinforces my feeling that the computational models we have for human thought are simply wrong. A brain isn't a computer and a computer isn't a brain. But more than that, a statistical model of human language is not a brain, either. Language is an artifact of human intelligence, but it seems increasingly improbable that we can reverse-engineer the behaviors that constitute human intelligence simply by accruing a large enough statistical representation of language.
But then again, when your goal is to displace human labor with the cheapest possible simulation of it you can get away with, maybe you don't need full-blown AGI anyway. Just a fair forgery.