This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
Notes -
I may or may not be an AI skeptic by your definition - I think it's quite likely that 2030 is a real year, and plausible that even 2050 is a real year. But I think there genuinely is something missing from today's LLMs: they generally fail to exhibit even the level of fluid intelligence exhibited by the average toddler, though they can compensate to a surprising degree by leveraging encyclopedic knowledge.
My sneaking suspicion is that the "missing something" from today's LLMs is just "scale". We're trying to match the capability of humans, who have roughly 200M interconnected cortical microcolumns, using transformers with only ~30k attention heads, and we're trying to draw an equivalence between one LLM token and one human word. (The analogy isn't perfectly isomorphic - you could make the case that the correct comparison is microcolumn : attention head at a particular position - but each microcolumn can have its own "weights", whereas the same attention head has the same weights at every position.) If you had an LLM agent that forked a new sub-agent in every situation where a human would notice a new thing to track in the back of their mind, and allowed each of those forked agents to define some test data and fine-tune / RL on it, I bet that would look much more impressive - but it would also cost OOMs more than the current stuff you pay $200/mo for.
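To make that concrete, here's a rough sketch of the forking idea. Everything in it is hypothetical: the `Tracker` class, `clone()`, and `finetune()` stand in for whatever agent and fine-tuning machinery you actually have. It's meant to show the shape of the architecture, not a real implementation.

```python
# Hypothetical sketch: fork a dedicated "tracker" sub-agent for each thing a human
# would keep in the back of their mind. `clone()` and `finetune()` are stand-ins
# for whatever agent/fine-tuning API you actually have, not a real library.
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """One forked sub-agent responsible for a single thing worth tracking."""
    concern: str                                   # e.g. "does this refactor preserve retry semantics?"
    test_cases: list = field(default_factory=list)

    def add_test(self, prompt: str, expected: str) -> None:
        # The forked agent defines its own eval data as it goes.
        self.test_cases.append((prompt, expected))


class ForkingAgent:
    def __init__(self, base_model):
        self.base_model = base_model
        self.trackers: list[Tracker] = []

    def notice(self, concern: str) -> Tracker:
        # Where a human would "note that for later", fork a tracker instead.
        tracker = Tracker(concern)
        self.trackers.append(tracker)
        return tracker

    def consolidate(self):
        # Each tracker gets its own tuned copy of the base model,
        # which is where the orders-of-magnitude cost blowup comes from.
        for tracker in self.trackers:
            specialist = self.base_model.clone()      # hypothetical API
            specialist.finetune(tracker.test_cases)   # hypothetical API
            yield tracker.concern, specialist
```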
LLMs are increasingly good at solving a particular subset of bugs, and that subset does not perfectly overlap with the bugs humans are good at solving. Concretely, LLMs are much better at bugs that require knowing or shallowly inferring some particular fact about how a piece of code is supposed to be written, and then fixing it in an obvious way. They are much, much worse at bugs that require the solver to build up an internal model of what the code is supposed to do, an internal model of what the code actually does, and then spot (and fix) the difference. A particularly tough category is "user reports this weird behavior": the usual way a human attacks this is to figure out how to reproduce the issue in a controlled environment, and then iteratively validate their expectations against that reproduction. LLMs struggle at both the "figure out a repro case" step and the "iteratively validate assumptions" step.
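For the "figure out a repro case" step, the human workflow looks roughly like this in code. The `order_total` function and its discount bug are invented purely for illustration, not taken from any real codebase:

```python
# Invented example: `order_total` and its discount bug are made up to show the
# shape of repro-driven debugging.
def order_total(prices, discount_code=None):
    total = sum(prices)
    if discount_code == "SAVE10":
        total -= 10          # bug: flat discount can drive small orders negative
    return total


# Step 1: turn "user reports this weird behavior" into a controlled reproduction.
def test_repro_negative_total():
    assert order_total([3.00], discount_code="SAVE10") >= 0


# Step 2: iteratively validate assumptions ("the discount is applied once",
# "totals are never negative") as separate checks rather than one guess-and-fix.
def test_discount_applied_once():
    assert order_total([20.00, 20.00], discount_code="SAVE10") == 30.00
```

This is where LLMs tend to stall: they often jump straight to editing the code instead of first pinning down a failing test like the one above.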
In principle there is no reason LLMs can't come up with new words. There is precedent for the straight-up invention of language among groups of RL agents that start with no communication abilities and are incentivized to develop them. So it's not some secret sauce that only humans have - but it is a secret sauce that LLMs don't yet seem to have all of.
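If you want a feel for what that looks like, the classic toy version is a Lewis signaling game: two agents start with no shared protocol and get rewarded only when communication succeeds. A deliberately tiny sketch (nothing like the scale of the actual research):

```python
# Toy Lewis signaling game: a speaker and a listener start with no shared protocol
# and are rewarded only when the listener recovers the speaker's concept.
import random

N_CONCEPTS, N_SYMBOLS, EPISODES, EPS, LR = 5, 5, 20_000, 0.1, 0.1

# Bandit-style value tables: speaker maps concept -> symbol, listener maps symbol -> concept.
speaker = [[0.0] * N_SYMBOLS for _ in range(N_CONCEPTS)]
listener = [[0.0] * N_CONCEPTS for _ in range(N_SYMBOLS)]


def pick(row):
    # Epsilon-greedy choice over one row of a value table.
    if random.random() < EPS:
        return random.randrange(len(row))
    return row.index(max(row))


for _ in range(EPISODES):
    concept = random.randrange(N_CONCEPTS)
    symbol = pick(speaker[concept])     # speaker "says" something
    guess = pick(listener[symbol])      # listener interprets it
    reward = 1.0 if guess == concept else 0.0
    speaker[concept][symbol] += LR * (reward - speaker[concept][symbol])
    listener[symbol][guess] += LR * (reward - listener[symbol][guess])

# The learned "vocabulary": which symbol each concept ends up mapped to.
print({c: speaker[c].index(max(speaker[c])) for c in range(N_CONCEPTS)})
```

After enough episodes the two tables usually settle on an arbitrary but shared mapping from concepts to symbols, which is about as much "invented language" as a toy this small can manage.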
LLMs do have some ingredients of the secret sauce: if you have some nebulous concept and you want to put a name to it, you can usually ask your LLM of choice and it will do a better job than 90% of the professional humans who would otherwise be making that naming decision. Still, LLMs have a tendency not to actually coin new terms, and to fail to use a newly coined term fluently in the rare cases where they do coin one (which is probably why they don't do it - if coining a new term were effective for problem solving, it would have been chiseled into their cognition by the RLVR process).
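By "ask your LLM of choice" I mean nothing fancier than something like this, using the OpenAI Python client as one example; any chat API works the same way, and the model name is just whatever you happen to have access to:

```python
# Example of asking a model to name a nebulous concept. Assumes the OpenAI Python
# client and an API key in the environment; substitute whatever model/provider you use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any capable chat model will do
    messages=[{
        "role": "user",
        "content": (
            "I need a short name for this concept: a bug that only appears when "
            "two independently correct components interact. Suggest 5 candidate "
            "terms and briefly justify each."
        ),
    }],
)
print(response.choices[0].message.content)
```

The model will happily generate candidates on request; what it won't do is spontaneously coin a term and then keep using it the way a person working the problem would.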
In terms of why this happens, Nostalgebraist has an excellent post on how LLMs process text, and how that processing is very different from how humans process text.
So there's a sense in which an LLM can coin a new term, but there's a sense in which it can't "practice" using that new term, and so can't really benefit from developing a cognitive shorthand. You can see the same thing with humans who try to learn all the jargon for a new field at once, before they've really grokked how it all fits together. I've seen it in programming, and I'm positive you've seen it in medicine.
BTW, regarding the original point about LLM code introducing bugs: absolutely it does. The bugginess situation has gotten quite a bit worse as everyone tries to please investors by shoving AI this and AI that into every available workflow, whether it makes sense to or not. But we developed tools to mitigate human fallibility, and we will develop tools to mitigate AI fallibility, so I am not particularly concerned about that problem over the long term.
Absolutely not, at least by my standards! You acknowledge the possibility that we might get AGI in the near term, and I see no firm reason to over-index on a given year. Most people I'd call "skeptics" deny the possibility of AGI at all, rule out any significant chance of near-term AGI, or have modal timelines >30 years.
I agree that LLMs are missing something, but I'm agnostic on whether brute-force scaling will get us to indisputable AGI. It may or may not. Perhaps online learning, as you hint at, might suffice.
I wonder if RLHF plays a role. I don't think human data annotators would be positively inclined towards models that made up novel words.
Thank you for taking the time to respond!