
Culture War Roundup for the week of February 23, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Gemini's sample is impressive! Color me impressed, especially that a straight-up prompt produced that (though I suppose if any technique would get it with current models, it'd be "one shotting through a prompt" rather than "iterative refinement towards a target").

My impression is that Gemini's output was unusually good and Claude’s was unusually bad. But both 3.1 Pro and 4.6 Sonnet are new enough that my intuition based on extensive interaction with previous models might no longer be applicable. For what it's worth, both were n=1 samplings with zero cherrypicking.

since you don't tend to drop spurious technical details into your walls of text unless they serve a purpose (and also because I half suspect you're not a fan of the amyloid theory of alzheimers)

Looks around shiftily why, I'd never throw in spurious technical details into an essay. Couldn't be me!

(I probably wouldn't use the specific Tau and amyloid phrasing, since you are correct that I have very mixed feelings about the amyloid hypothesis)

Interestingly, your results look much, much better to me than the ones I get myself. I ran the same test as you did against Gemini, and got these not-very-good attempts: 1 2 3. Gemini took distinctive phrases (e.g. "85% agree") and ideas (e.g. "claude code as supply chain risk") I have used once in the corpus, fixated on them, and stitched them together into a skinsuit which superficially resembles my writing but doesn't hold up under scrutiny. Interestingly, that's a very base model flavored failure mode. I have grown unused to seeing base-model-flavored failure modes, and as such Gemini is much more interesting to me now.

The examples seem to channel your "LessWrong" blogging voice. I am unable to critique the technical details or identify (what I expect are many) confabulations, but if I saw this posted there in your name I wouldn't bat an eye.

I haven't really futzed around with base models since GPT-3, though I might have tried one of the Llama 3s at some point. They're non-trivial to access, and have limited utility for me. Mainly because of the added difficulty of prompting base models, and the fact that the publicly accessible ones are nowhere near as intelligent as proprietary dedicated assistants. If you think I'm wrong about this, I'd be curious to hear about it.

In general, I get the strong impression that while the author of the corpus might be able to pinpoint specific issues in terms of style or stance, it's much harder for others to spot those tells.

The biggest pitfalls are the tendency to adopt em-dashes (models are more than capable of not doing that if you specifically prompt them not to), and other stock "AI" phrases like:

There is a very specific failure mode in modern LLMs

Which can show up even if you're using models merely to edit/format a draft, and not just when they write an essay from scratch.

I must also continue stressing the point that this isn't quite representative of my usual informal benchmark:

  • I'd also ask the model to first output a list of essay topics that it thinks I would write, of which I'd choose a specific one that sounded interesting, perhaps asking it to propose an outline first.
  • I would definitely run multiple iterations of the prompt or suggest specific corrections and check their adherence.
  • I would also index heavily on their ability to mimic authors I know very well. Can they pass as Gwern, or Scott, or Richard Watts? Can they take an existing essay I've written and rewrite it in an arbitrary style and produce something interesting, if not superior as a whole?

It's enough for me to spot a better way to say a specific thing I'm already saying. A single vivid metaphor or interesting analogy that is worth co-opting can make the practical purpose of the exercise worth it.