Culture War Roundup for the week of August 4, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

I have some experience with games and algorithms, and that leads to some thoughts.

The big headline is that all the various methods we know (including humans) have problems. They often all have some strengths, too. The extremely big-picture conceptual hook to hang a variety of particulars on is the No Free Lunch Theorem. Now, when we dig into some of the details of the ways in which algorithms/people are good/bad, we often see that they're entirely different in character: what happens when you tweak details of the game; what happens when you make a qualitative shift in the game; what happens at the extremes of performance; what you can/can't prove mathematically; etc.
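
To make the No Free Lunch point concrete, here's a toy sketch in Python (the setup and names are mine, purely illustrative): averaged over *all* possible objective functions on a tiny domain, two different fixed query orders find equally good values, no matter how many queries you allow.

```python
# Toy No Free Lunch illustration: over ALL functions from a small domain to a
# small codomain, any two fixed non-revisiting query orders do equally well
# on average. A sketch of the intuition, not a proof of the theorem.
from itertools import product

DOMAIN_SIZE = 4          # functions f: {0,1,2,3} -> {0,1,2}
VALUES = range(3)

def best_after_k(order, f, k):
    # Best objective value seen in the first k queries.
    return max(f[x] for x in order[:k])

order_a = [0, 1, 2, 3]   # scan left to right
order_b = [2, 0, 3, 1]   # an arbitrary other fixed order

all_functions = list(product(VALUES, repeat=DOMAIN_SIZE))  # all 3^4 = 81
for k in range(1, DOMAIN_SIZE + 1):
    avg_a = sum(best_after_k(order_a, f, k) for f in all_functions) / len(all_functions)
    avg_b = sum(best_after_k(order_b, f, k) for f in all_functions) / len(all_functions)
    print(k, avg_a, avg_b)  # the two averages match for every k
```

Any strategy that looks clever on one subset of functions pays for it on another; the gains only show up once you restrict the problem class.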

To stick with the chess example, one can easily think about minor chess variants. One that has gotten popular lately is Chess960. Human players are able to adapt decently well in some ways. For example, they hardly ever play illegal moves, at least once they're remotely experienced. You miiiiight screw up castling at some point, or you could forget about it in your calculation, but if/when you do, it will 'prompt' you to ruminate on the rule a bit, really commit it to your thought process, and then you're mostly fine. At top-level human play, we almost never saw illegal moves, even right at the beginning of when it became a thing. Of course, humans clearly take a substantial performance hit.

Traditional engines require a minor amount of human reprogramming, particularly for the changed castling rules. But other than that, they can pretty much just go. They may also suffer a bit in performance, since they haven't built up opening books yet, but probably not as much as humans do.
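
For a sense of why the reprogramming burden is so light, here's a minimal sketch assuming the python-chess library (`pip install chess`): once the castling logic is parameterized, legal move generation just works from any of the 960 starting positions.

```python
# Minimal sketch using python-chess: a rules engine handles Chess960 once
# castling is generalized, so move generation works from any start position.
import random

import chess

# Scharnagl number 518 is the classical setup; any value 0-959 gives a
# Chess960 start. Pick one at random here.
board = chess.Board.from_chess960_pos(random.randrange(960))
print(board.fen())

# No per-variant retraining needed: the same generator enumerates legal
# moves, castling included, for whatever back-rank shuffle we drew.
for move in list(board.legal_moves)[:5]:
    print(board.san(move))
```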

An LLM? Ehhhh. It depends? If it's been trained, like the typical chess LLM, entirely on full move sequences of traditional chess games, I can't imagine that it won't be spewing illegal moves left and right. The new starting positions are just completely out of distribution. The typical answer here is that you need to curate a new dataset (somehow encoding the initial position) and retrain the whole thing. Can it eventually work? Yeah, maybe. But all these things are different.
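
If you wanted to test that suspicion, the harness is simple. A hedged sketch: `predict_next_san` below is a hypothetical stand-in for whatever interface your chess LLM exposes; python-chess does the legality checking. The interesting number is how the illegal-move rate jumps when the start position goes out of distribution.

```python
# Measurement harness sketch: what fraction of a model's suggested moves are
# illegal from a given (possibly out-of-distribution) starting position?
# `predict_next_san` is hypothetical -- swap in your model's actual interface.
import chess

def illegal_move_rate(predict_next_san, board, n_moves=40):
    """Play the model against itself and count illegal suggestions."""
    illegal = total = 0
    history = []
    for _ in range(n_moves):
        if board.is_game_over():
            break
        san = predict_next_san(history)  # hypothetical model call
        total += 1
        try:
            move = board.parse_san(san)  # raises ValueError if illegal
        except ValueError:
            illegal += 1
            continue  # skip the bad move; a stricter harness could stop here
        history.append(san)
        board.push(move)
    return illegal / total if total else 0.0
```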

You can run thought experiments with allll sorts of variants. Humans mostly adapt pretty quickly to the new ruleset, with not many illegal moves, but a performance hit. I'm sure I can come up with variants that require minimal coding modification to traditional engines; I'm sure I can come up with variants that require substantial coding modification (think especially of cases where your evaluation function needs significant reworking; the NNs used for evaluation in modern 'traditional' engines may also require complete retraining of that component); others may even require some modification to other core engine components, which may be more or less annoying. LLMs? Man, I don't know. Are we going to get to a point where they have 'internalized' enough about the game that you could throw a variant at one, turn thinking mode up to max, and it'll manage to think its way through the rule changes, even though you've only trained it on traditional games? Maybe? I don't know! I kind of don't have a clue. But I also slightly lean toward thinking it's unlikely. [EDIT: This paper may be mildly relevant.]
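
To make the 'how much recoding' question concrete, here's a toy material evaluator (python-chess again; the weights are the folklore-standard ones, my choice for illustration). A variant that changes what pieces are worth means reworking a table like this, which is trivial; if the evaluation is a trained network instead, the analogous change means retraining the whole component.

```python
# Toy handcrafted evaluation: positive favors White. A real engine adds
# positional terms, but the point is that these weights are an explicit,
# editable table -- the part a variant might force you to rework.
import chess

PIECE_VALUES = {
    chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
    chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0,
}

def material_eval(board: chess.Board) -> int:
    """Sum piece values for White minus piece values for Black."""
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * len(board.pieces(piece_type, chess.WHITE))
        score -= value * len(board.pieces(piece_type, chess.BLACK))
    return score

print(material_eval(chess.Board()))  # 0 in the starting position
```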

Again, I'm thinking about a whole world of variants that I can come up with; I imagine that with an interesting selection of variants, we could see all sorts of effects for different methods. It would be quite the survey paper, but probably difficult to have a great classification scheme for the qualitative types of differences. Some metric for 'how much' recoding would need to happen for a traditional engine? Some metric on LLMs for retraining or fine-tuning, or something else, and sort of 'how much'? It's messy.

But yeah, one of the conclusions that I wanted to get to is that I sort of doubt that LLMs (even with max thinking mode) are likely to do all that well on even very minor variants that we could probably come up with. And I think that likely speaks to something on the matter of 'general'. It's not like the benchmark for 'general' is that you have to maintain the same performance on the variant. We see humans take a performance hit, but they generally get the rules right and do at least sort of okay. But it speaks to the fact that different things are different, there's no free lunch, and sometimes it's really difficult to put measures on what's going on between the different approaches. Some people will call it 'jagged' or whatever, but I sort of interpret that as 'not general in the kind of way that humans are general'. Maybe they're still 'general' in a different way! But I tend to think that these various approaches are mostly just completely alien to each other, and they just have very different properties/characteristics all the way down the line.

Indeed, and as I argued in my own post on the subject, I think this element of general applicability/adaptability is a key component of what most people think of as "intelligence". A book may contain knowledge, but a book is generally not seen as "intelligent" in the way that, say, an orangutan or a human is. I also think that recognizing this neatly explains the seeming bifurcation in opinions on AI between those in "Bouba" (i.e. soft/non-rigorous) disciplines and "Kiki" (i.e. hard) disciplines, where there are clear right and wrong answers.