This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
A response to Freddie deBoer on AI hype
Bulverism is a waste of everyone's time
Freddie deBoer has a new edition of the article he writes about AI. Not, you’ll note, a new article about AI: my use of the definite article was quite intentional. For years, Freddie has been writing exactly one article about AI, repeating the same points he always makes more or less verbatim, repeatedly assuring his readers that nothing ever happens and there’s nothing to see here. Freddie’s AI article always consists of two discordant components inelegantly and incongruously kludged together:
- sober-minded appeals to AI maximalists to temper their most breathless claims about the capabilities of this technology by carefully pointing out shortcomings therein
- childish, juvenile insults directed at anyone who is even marginally more excited about the potential of this technology than he is, coupled with armchair psychoanalysis of the neuroses undergirding said excitement
What I find most frustrating about each repetition of Freddie's AI article is that I agree with him on many of the particulars. While Nick Bostrom's Superintelligence is, without exaggeration, the most frightening book I've ever read in my life, and I do believe that our species will eventually invent artificial general intelligence — I nevertheless think the timeline for that event is quite a bit further out than the AI utopians and doomers would have us believe, and I think a lot of the hype around large language models (LLMs) in particular is unwarranted. And to lay my credentials on the table: I'm saying this as someone who doesn't work in the tech industry, who doesn't have a background in computer science, who hasn't been following developments in the AI space as closely as many have (presumably including Freddie), and who (contrary to the occasional accusation my commenters have levelled at me) has never used generative AI to compose text for this newsletter and never intends to.
I’m not here to take Freddie to task on his needlessly confrontational demeanour (something he rather hypocritically decries in his interlocutors), or attempt to put manners on him. If he can’t resist the temptation to pepper his well-articulated criticisms of reckless AI hypemongering with spiteful schoolyard zingers, that’s his business. But his article (just like every instance in the series preceding it) contains many examples of a particular species of fallacious reasoning I find incredibly irksome, regardless of the context in which it is used. I believe his arguments would have a vastly better reception among the AI maximalists he claims to want to persuade if he could only exercise a modicum of discipline and refrain from engaging in this specific category of argument.
Quick question: what’s the balance in your checking account?
If you’re a remotely sensible individual, it should be immediately obvious that there are a very limited number of ways in which you can find the information to answer this question accurately:
1. Dropping into the nearest branch of your bank and asking them to confirm your balance (or phoning them).
2. Logging into your bank account on your browser and checking the balance (or doing so via your banking app).
3. Perhaps you did either #1 or #2 a few minutes before I asked the question, and can recite the balance from memory.
Now, suppose you answered the question to the best of your knowledge, claiming that the balance of your checking account is, say, €2,000. Imagine that, in response, I rolled my eyes and scoffed that there's no way your bank balance could possibly be €2,000, and that the only reason you're claiming that's the real figure is that you're embarrassed about your reckless spending habits. You would presumably retort that it's very rude of me to accuse you of lying, that you were accurately reciting your bank balance to the best of your knowledge, and furthermore how dare I suggest that you're bad with money when in fact you're one of the most fiscally responsible people in your entire social circle—
Wait. Stop. Can you see what a tremendous waste of time this line of discussion is for both of us?
Either your bank balance is €2,000, or it isn’t. The only ways to find out what it is are the three methods outlined above. If I have good reason to believe that the claimed figure is inaccurate (say, because I was looking over your shoulder when you were checking your banking app; or because you recently claimed to be short of money and asked me for financial assistance), then I should come out and argue that. But as amusing as it might be for me to practise armchair psychoanalysis about how the only reason you’re claiming that the balance is €2,000 is because of this or that complex or neurosis, it won’t bring me one iota closer to finding out what the real figure is. It accomplishes nothing.
This particular species of fallacious argument is called Bulverism, and refers to any instance in which, rather than debating the truth or falsity of a specific claim, an interlocutor assumes that the claim is false and expounds on the underlying motivations of the person who advanced it. The checking account balance example above is not original to me, but comes from C.S. Lewis, who coined the term.
As Lewis notes, if I have definitively demonstrated that the claim is wrong — that there’s no possible way your bank balance really is €2,000 — it may be of interest to consider the psychological factors that resulted in you claiming otherwise. Maybe you really were lying to me because you’re embarrassed about your fiscal irresponsibility; maybe you were mistakenly looking at the balance of your savings account rather than your checking account; maybe you have undiagnosed myopia and you misread a 3 as a 2. But until I’ve established that you are wrong, it’s a colossal waste of my time and yours to expound at length on the state of mind that led you to erroneously conclude that the balance is €2,000 when it’s really something else.
In the eight decades since Lewis coined the term, the popularity of this fallacious argumentative strategy shows no signs of abating: it is routinely employed by people at every point on the political spectrum against everyone else. You'll have evolutionists claiming that the only reason people endorse young-Earth creationism is because the idea of humans evolving from animals makes them uncomfortable; creationists claiming that the only reason evolutionists endorse evolution is because they've fallen for the epistemic trap of Scientism™ and can't accept that not everything can be deduced from observation alone; climate-change deniers claiming that the only reason environmentalists claim that climate change is happening is because they want to instate global communism; environmentalists claiming that the only reason people deny that climate change is happening is because they're shills for petrochemical companies. And of course, identity politics of all stripes (in particular standpoint epistemology and other "ways of knowing") is Bulverism with a V8 engine: is there any debate strategy less productive than "you're only saying that because you're a privileged cishet white male"? It's all wonderfully amusing — what could be more fun than confecting psychological just-so stories about your ideological opponents in order to insult them with a thin veneer of cod-academic therapyspeak?
But it's also, ultimately, a waste of time. The only way to find out the balance of your checking account is to check the balance on your checking account — idle speculation on the psychological factors that caused you to claim that the balance was X when it was really Y is futile until it has been established that it really is Y rather than X. And so it goes with all claims of truth or falsity. Hypothetically, it could be literally true that 100% of the people who endorse evolution have fallen for the epistemic trap of Scientism™ and so on and so forth. Even if that were the case, it wouldn't tell us a thing about whether evolution is literally true.
To give Freddie credit where it’s due, the various iterations of his AI article do not consist solely of him assuming that AI maximalists are wrong and speculating on the psychological factors that caused them to be so. He does attempt, with no small amount of rigour, to demonstrate that they are wrong on the facts: pointing out major shortcomings in the current state of the LLM art; citing specific examples of AI predictions which conspicuously failed to come to pass; comparing the recent impact of LLMs on human society with other hugely influential technologies (electricity, indoor plumbing, antibiotics etc.) in order to make the case that LLMs have been nowhere near as influential on our society as the maximalists would like to believe. This is what a sensible debate about the merits of LLMs and projections about their future capabilities should look like.
But poor Freddie just can’t help himself, so in addition to all of this sensible sober-minded analysis, he insists on wasting his readers’ time with endless interminable paragraphs of armchair psychoanalysis about how the AI maximalists came to arrive at their deluded worldviews:
Am I disagreeing with any of the above? Not at all: whenever anyone is making breathless claims about the potential near-future impacts of some new technology, I have to assume there’s some amount of wishful thinking or motivated reasoning at play.
No: what I’m saying to Freddie is that his analysis, even if true, doesn’t fucking matter. It’s irrelevant. It could well be the case that 100% of the AI maximalists are only breathlessly touting the immediate future of AI on human society because they’re too scared to confront the reality of a world characterised by boredom, drudgery, infirmity and mortality. But even if that was the case, that wouldn’t tell us one single solitary thing about whether this or that AI prediction is likely to come to pass or not. The only way to answer that question to our satisfaction is to soberly and dispassionately look at the state of the evidence, the facts on the ground, resisting the temptation to get caught up in hype or reflexive dismissal. If it ultimately turns out that LLMs are a blind alley, there will be plenty of time to gloat about the psychological factors that caused the AI maximalists to believe otherwise. Doing so before it has been conclusively shown that LLMs are a blind alley is a waste of words.
Freddie, I plead with you: stay on topic. I’m sure it feels good to call everyone who’s more excited than you about AI an emotionally stunted manchild afraid to confront the real world, but it’s not a productive contribution to the debate. Resist the temptation to psychoanalyse people you disagree with, something you’ve complained about people doing to you (in the form of suggesting that your latest article is so off the wall that it could only be the product of a manic episode) on many occasions. The only way to check the balance of someone’s checking account is to check the balance on their checking account. Anything else is a waste of everyone’s time.
You used to get this sorta thing on ratsphere tumblr, where "rapture of the nerds" was so common as to be a cliche. I kinda wonder if deBoer's "imminent AI rupture" follows from that and he edited it, or if it's just a coincidence. There's a fun Bulverist analysis of why religion was the focus there and 'the primacy of material conditions' from deBoer, but that's even more of a distraction from the actual discussion matter.
There's a boring sense in which it's kinda funny how bad deBoer is at this. I'll overlook the typos, because lord knows I make enough of those myself, but look at the actual central example he opens his story around.
There's a steelman of deBoer's argument here. But the one he actually presented isn't engaging, in the very slightest, with what Scott is trying to bring up, or even with a strawman of what Scott was trying to bring up. What, exactly, does deBoer believe a cure to aging (or even just a treatment for diabetes, if we want to go all tech-hyper-optimism) would look like, if not new medical technology? What, exactly, does deBoer think of the actual problem of long-term commitment strategies in a rapidly changing environment?
Okay, deBoer doesn't care, and/or doesn't even recognize those things as questions. It's really just a springboard for I Hate Advocates For This Technology. To whatever extent he's engaging with the specific claims, it's just a tool to get to that point. Does he actually do his chores or eat his broccoli?
Well, no.
Ah, nobody makes that claim, r-
Okay, so 'nobody' includes the very person making this story.
This isn't even a good technical understanding of how ChatGPT, as opposed to just the underlying LLM, works. Even if I'm not willing to go as far as self_made_human does with people raising the parrots critique here, I'm still pretty critical of it, but the more damning bit is where deBoer is either unfamiliar with or choosing to ignore the many other domains in favor of One Study Rando With A Chess Game. Will he change his mind if someone presents a chess-focused LLM with a high ELO score? I could break into his examples and values a lot deeper -- the hallucination problem is actually a lot more interesting and complicated, and questions of bias are usually just smuggling in 'doesn't agree with the writer's politics', though there are some genuine technical questions -- but if you locked the two of us in a room and only provided escape if we agreed, I still don't think either of us would find discussing it with each other more interesting than talking to the walls. It's not just that we have different understandings of what we're debating; it's whether we're even trying to debate something that can be changed by actual changes in the real world.
Okay, deBoer isn't debating honestly. His claim about the New York Times fact-checking everything is hilarious, but he links to a special issue about which he literally claims "not a single line of real skepticism appears", even though its first headline is "Everyone is Using AI for Everything. Is That Bad?" and it includes the phrase "The mental model I sometimes have of these chatbots is as a very smart assistant who has a dozen Ph.D.s but is also high on ketamine like 30 percent of the time". He tries to portray Mounk as outraged by the "indifference of people like Tolentino (and me) to the LLM 'revolution.'" But look at Mounk's or Tolentino's actual pieces, and there are actual factual claims being made, not just vague vibes bouncing off each other; the central criticism Mounk has is whether Tolentino's piece and its siblings are actually engaging with what LLMs can change, rather than complaining about a litany of lizardman evils. (At least deBoer's not falsely calling anyone a rapist this time.)
((Tbf, Mounk, in turn, is just using Tolentino as a springboard; her piece is actually about digital disassociation and the increasing power of AIgen technologies that she loathes. It's not really the sorta piece that's supposed to talk about how you grapple with things, for better or worse.))
But ultimately, that's just not the point. None of deBoer's readers are going to treat him any less seriously because of ChessLLM (or because many LLMs will, in fact, both say they reason and quod erat demonstrandum), or because deBoer turns "But in practice, I too find it hard to act on that knowledge." into "I too find it hard to act on that knowledge [of our forthcoming AI-driven species reorganization]" when commenting on an essay that does not use the word "species" at all, only uses "organization" twice in the same paragraph to talk about regulatory changes, and where "that knowledge" is actually just Mounk's (imo, wrong) claim that AI is under-hyped. That's not what his readers are paying him for, and that's not why anyone who links to him in even a mildly laudatory manner is doing so.
The question of Bulverism versus factual debate is an important one, but it's undermined when the facts don't matter, either.
Huh. I was confident that I had a better writeup about why "stochastic parrots" are a laughable idea, at least as a description for LLMs. But no, after getting a minor headache figuring out the search operators here, it turns out that's all I've written on the topic.
I guess I never bothered because it's a Gary Marcus-tier critique, and anyone using it loses about 20 IQ points in my estimation.
But I guess now is as good a time as any? In short, it is a pithy, evocative critique that makes no sense.
LLMs are not inherently stochastic. They have a setting called temperature (not usually exposed to the end user except via the API). Without going into how that works, suffice it to say that by setting the value to zero, their output becomes deterministic: the exact same prompt gives the exact same output.

The reason temperature isn't just set to zero all the time is that the ability to choose something other than the single most likely next token has benefits when it comes to creativity. At the very least it saves you from getting stuck with the same subpar result.
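For anyone who wants to see the mechanics rather than take my word for it, here is a minimal sketch in plain Python/NumPy of how temperature affects next-token selection. The logits are made up for illustration and nothing here is tied to any particular model or vendor API; the point is just that the zero-temperature limit collapses to deterministic greedy argmax, while higher temperatures sample from a flattened distribution.

```python
import numpy as np

def next_token(logits, temperature=1.0, rng=None):
    """Pick a next-token index from raw logits.

    temperature ~ 0 collapses to greedy argmax (deterministic);
    higher temperatures flatten the distribution (more varied output).
    """
    logits = np.asarray(logits, dtype=np.float64)
    if temperature < 1e-6:                 # treat ~0 as greedy decoding
        return int(np.argmax(logits))
    z = logits / temperature
    z -= z.max()                           # numerical stability before exponentiating
    probs = np.exp(z)
    probs /= probs.sum()                   # softmax over temperature-scaled logits
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(probs), p=probs))

# Toy logits for four candidate tokens (made up for illustration).
logits = [2.0, 1.5, 0.3, -1.0]
print([next_token(logits, temperature=0.0) for _ in range(5)])  # always the same index
print([next_token(logits, temperature=1.0) for _ in range(5)])  # varies from run to run
```

In real serving stacks there can still be minor nondeterminism from floating-point and batching effects, but the broader point stands: the "stochastic" part is an optional dial on the sampler, not something baked into the model.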
Alas, this means that LLMs aren't stochastic parrots. Minus the stochasticity, are they just "parrots"? Anyone thinking this is on crack, since Polly won't debug your Python no matter how many crackers you feed her.
If LLMs were merely interpolating between memorized n-grams or "stitching together" text, their performance would be bounded by the literal contents of their training data. They would excel at retrieving facts and mimicking styles present in the corpus, but would fail catastrophically at any task requiring genuine abstraction or generalization to novel domains. This is not what we observe.
Let’s get specific. The “parrot” model implies the following:
1. LLMs can only repeat (paraphrase, interpolate, or permute) what they have seen.
2. They lack generalization, abstraction, or true reasoning.
3. They are, in essence, Markov chains on steroids.
To disprove any of those claims, just *gestures angrily* look at the things they can do. If winning gold at the latest IMO (International Mathematical Olympiad) is something a "stochastic parrot" can pull off, then, well, the only valid takeaway is that the damn parrot is smarter than we thought. Definitely smarter than the people who use the phrase unironically.
Emily Bender (lead author of the paper that coined the phrase) and Alexander Koller gave two toy "gotchas" that they claimed no pure language model could ever solve: (1) a short vignette about a bear chasing a hiker, and (2) the spelled-out arithmetic prompt "Three plus five equals". GPT-3 solved both within a year. The response? Crickets, followed by goal-post shifting: "Well, it must have memorized those exact patterns." But the bear prompt isn't in any training set at scale, and GPT-3 could generalize the schema to new animals, new hazards, and new resolutions. Memorization is a finite resource; generalization is not.
(I hope everyone here recalls that GPT-3 is ancient now)
On point 2: Consider the IMO example. Or better yet, come up with a rigorous definition of reasoning by which we can differentiate a human from an LLM. It's all word games, or word salad.
On 3: Just a few weeks back, I was trying to better understand the actual difference between a Markov chain and an LLM, and I asked o3 whether it wasn't possible to approximate the latter with the former. After all, I wondered, if MCs only consider the previous unit (usually a word, or a few words/an n-gram), couldn't we just train the MC to output the next word conditioned on every word that came before? The answer was yes in principle, but that this is completely computationally intractable. The fact that we can run LLMs on something smaller than a Matrioshka brain is down to the brilliance of the transformer architecture/attention mechanism, which compresses that conditional distribution into a fixed set of parameters instead of enumerating every possible context.
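To put a rough number on "completely computationally intractable", here is a back-of-the-envelope sketch in Python. The vocabulary size and context length are round figures I've assumed for illustration, not the specs of any real model; the point is only that a lookup-table chain conditioned on the full history needs a row for every possible context, which blows up exponentially, while a parametric model's size stays fixed no matter how long the context is.

```python
import math

# Assumed round numbers for illustration only (not real model specs).
VOCAB = 50_000      # size of the token vocabulary
CONTEXT = 1_000     # number of preceding tokens we condition on

# A table-based "Markov chain over the full history" needs one row per
# possible context: VOCAB ** CONTEXT rows.
log10_rows = CONTEXT * math.log10(VOCAB)
print(f"full-history lookup table: ~10^{log10_rows:.0f} rows")
# -> roughly 10^4699 rows, absurdly more than the ~10^80 atoms in the observable universe

# A transformer instead stores a fixed parameter count (order 10^9 to 10^12
# for current large models), independent of how long the conditioning context is.
```

That gap is the whole story: attention lets one fixed set of weights produce a context-dependent next-token distribution without ever materializing that table.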
Overall, even the steelman interpretation of the parrot analogy is only as helpful as this meme, which I have helpfully appended below. It is a bankrupt notion, a thought-terminating cliché at best, and I wouldn't cry if anyone using it met a tiger outside the confines of a cage.
[attached meme image: /images/17544215520465958.webp]
I liked using the stochastic parrot idea as a shorthand for the way most of the public use llms. It gives non-computer savvy people a simple heuristic that greatly elevates their ability to use them. But having read this I feel a bit like Charlie and Mac when the gang wrestles.
I would consider myself an LLM evangelist, and have introduced quite a few not particularly tech-savvy people to them, with good results.
I've never been tempted to call them stochastic parrots. The term harms more than it helps. My usual shortcut is to tell people to act as if they're talking to a human, a knowledgeable but fallible one, and they should double check anything of real consequence. This is a far more relevant description of the kind of capabilities they possess than any mention of a "parrot".
The fact you've never been tempted to use the 'stochastic parrot' idea just means you haven't dealt with the specific kind of frustration I'm talking about.
Yeah, the 'fallible but super intelligent human' is my first shortcut too, but it actually contributes to the failure mode the stochastic parrot concept helps alleviate. The concept is useful for those who reply, 'Yeah, but when I tell a human they're being an idiot, they change their approach.' For those who want to know why it can't consistently generate good comedy or poetry. For people who don't understand that rewording the prompt can drastically change the response, or who don't understand, or feel bad about, regenerating or ignoring the parts of a response they don't care about, like follow-up questions.
In those cases, the stochastic parrot is a more useful model than the fallible human. It helps them understand they're not talking to a who, but interacting with a what. It explains the lack of genuine consciousness, which is the part many non-savvy users get stuck on. Rattling off a bunch of info about context windows and temperature is worthless, but saying "it's a stochastic parrot" to themselves helps them quickly stop identifying it as conscious. Claiming it 'harms more than it helps' seems more focused on protecting the public image of LLMs than on actually helping frustrated users. Not every explanation has to be a marketing pitch.
I still don't see why that applies, and I'm being earnest here. What about the "stochastic parrot" framing keys the average person into the fact that they're good at code and bad at poetry? That has more to do with mode collapse and the downsides of RLHF than it does with lacking "consciousness". Like, even on this forum, we have no shortage of users who are great at coding but can't write a poem to save their lives; what does that say about their consciousness? Are parrots known to be good at Ruby on Rails but bad at poetry?
My explanation of temperature is, at the very least, meant as a high-level explainer; it doesn't come up in normal conversation. Context windows? They're so large now that they're not worth mentioning except in passing.
My point is that the parrot metaphor adds nothing. It is, at best, irrelevant, when it comes to all the additional explainers you need to give to normies.
I thought I explained it pretty well, but I will try again. It is a cognitive shortcut, a shorthand people can use when they are still modelling it as a 'fallible human' and expecting it to respond like one. Mode collapse and RLHF have nothing to do with it, because it isn't a server-side issue; it is a user issue: the user is anthropomorphising a tool.
Yes, temperature and context windows (although I actually meant to say max tokens, good catch) don't come up in normal conversation; they mean nothing to a normie. When a normie is annoyed that ChatGPT doesn't "get" them, the parrot model helps them pivot from "How do I make this understand me?" to "What kind of input does this tool need to give me the output I want?"
You can give them a bunch of additional explanations about mode collapse and max tokens that they won't understand (and then they will just stop using it), or you can give them a simple concept that cuts through the anthropomorphising immediately, so that when they are sitting at their computer getting frustrated at poor-quality writing, or feeling bad about ignoring the LLM's prodding to take the conversation in a direction they don't care about, they can think 'wait, it's a stochastic parrot' and switch gears. It works.
A human who fails at poetry has the mind, the memories and the grounding in reality, but lacks the skill to match the patterns we see as poetic. An LLM has the skill, but lacks the mind, memories and grounding in reality. What about the parrot framing triggers that understanding? Memetics, I guess. We have been using parrots to describe non-thinking pattern-matchers for centuries. "Parroting" a phrase goes back to the 18th century. "The parrot can speak, and yet is nothing more than a bird" is a phrase from the ancient Chinese Book of Rites.
Also, I didn't address this earlier because I thought it was just amusing snark, but you appear to be serious about it. Yes, you are correct that a parrot can't code. Do you have a similar problem with the fact that a computer virus can't be treated with medicine? Or that the cloud is actually a bunch of servers and can't be shifted by the wind? Or that the World Wide Web wasn't spun by a world wide spider? Attacking a metaphor is not an argument.