@Rov_Scam comments on "Culture War Roundup for the week of February 23, 2026"

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

Shaming.
Attempting to 'build consensus' or enforce ideological conformity.
Making sweeping generalizations to vilify a group you dislike.
Recruiting for a cause.
Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
Be as precise and charitable as you can. Don't paraphrase unflatteringly.
Don't imply that someone said something they did not say, even if you think it follows from what they said.
Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Rov_Scam 11hr ago

Thanks! Reviewing the results:

As a spoiler alert, it got both dates wrong again, so I'm disinclined to keep testing this particular task, as it only gets harder from here. That being said, I think the new models did somewhat better. Just so we're clear, GRoL first appeared on a radio chart on 5/9/1966, and the Monday before that is 5/2/1966, which is our release date. FtH is pretty straightforward, as the copyright date of publication is listed as 6/16/1980.
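The Monday rule is plain date arithmetic. Here is a minimal sketch in Python, assuming "the Monday before" means the latest Monday strictly earlier than the first chart appearance. The function name is mine for illustration and is not part of the instructions I gave the models.

```python
from datetime import date, timedelta

def monday_before(first_seen: date) -> date:
    """Latest Monday strictly before first_seen (assumed reading of the rule)."""
    # date.weekday() numbers Monday as 0 through Sunday as 6.
    # If first_seen is itself a Monday, step back a full week.
    days_back = first_seen.weekday() or 7
    return first_seen - timedelta(days=days_back)

# GRoL first charted on Monday 5/9/1966.
print(monday_before(date(1966, 5, 9)))  # 1966-05-02
```

The strict "before" matters here because 5/9/1966 is itself a Monday; a rule that allowed the same day would return 5/9 instead of 5/2.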

For GRoL, 5.2 Agent noticed that the major discographical sites (first preference) set the release date to May 1966, and, unlike o3, it didn't note this and then pick a June date anyway, so that's an improvement, though I'm not sure if this is due to better architecture or if the old error was a one-off. It was able to correctly pick the 5/28/1966 Billboard review, which o3 did as well. However, it once again flunked the ARSA test, the correct radio chart being the 5/9/1966 KBLA chart. Instead, it picked the 6/17/1966 WLS survey. Upon inspection of the sources, though, it appears that, unlike o3, it did not consult ARSA but an old GeoCities site that hosts charts from select radio stations in a few markets. The thing is, I specifically asked for ARSA. I did allow it to look at "other information", but the context in which it presented the find gave it similar weight to ARSA, and it didn't specify that the find didn't come from ARSA. Now, when I checked last August's results to see if it made the same error then and I missed it, it did check ARSA, but the link wasn't working. Since ARSA requires a free login, I wasn't sure initially that it would be able to get access, but it did, and something may have changed in the meantime that stymied its ability to query ARSA.

But that's not the only problem. First, if it's going to query an alternative site it needs to disclose that. Second, it picked the June 17 date, when the site had the song appearing on the June 10 chart. Third, it noted that the song had been on the charts for 4 weeks, when there's no way it could have known this. The song had only been on the chart the previous week; it had been played on the station for 4 weeks. There was a 4 next to the title, and it incorrectly assumed that this stood for weeks on chart. Since the site wasn't clear, I had to go to ARSA and pull a scan of the chart to be sure exactly what it meant. The thing is that I don't understand why it even did this. I only care about the ARSA data if it gives an earlier date than Billboard, and it clearly didn't so it was irrelevant. If it couldn't access ARSA it could have just said so and used the Billboard date. If the other website had chart data that was earlier I would have appreciated if it took that into consideration, but that wasn't the case. I don't know why it would pretend to pull ARSA data when it didn't yield any useful information.

The 5.2 Thinking model confidently provided a date of 5/28/1966, based on Wikipedia. Based on what we know from above, this date is incorrect, and is the result of somebody entering the Billboard review date into Wikipedia. This is a common error, but I didn't include it in the initial algorithm because I didn't want to overcomplicate things (i.e., include a rule where it won't use Wikipedia dates when they clearly conform to Billboard dates), and this error wasn't present back in August, so I'll let the model slide here. What I won't let it slide on is where it says 45Cat agrees; 45Cat lists a release date of May 1966 and includes a note saying "BB 5/28/1966", which clearly refers to a Billboard date. The issue with this is, yes, it followed the rules. But it was clear from the rules that I wanted a date prior to the Billboard date. If we're talking about LLMs being able to replace people for certain tasks, then it can't make the kind of mistake I wouldn't have made. If I had only looked at Wikipedia I might have made that mistake, and if the LLM had only done so I would have given it a pass. But it looked at 45Cat, didn't recognize that the date was not a release date, and even if it had, I'm not sure that it would have recognized that the Wikipedia date might be untrustworthy, especially since there was no annotation for it. This might have worked better if I had provided a specific instruction to that effect, but if these things are really intelligent I shouldn't have to think of every possible caveat. If I were going to do that I wouldn't need an LLM and could write a program using conventional software where I just specify every field and include instructions for it.

Moving on to FtH, I have to admit that I whiffed a bit when setting this test up, because I assumed that since this is a relatively obscure record, release information wouldn't be readily available. Apparently I was wrong, and RYM has had the correct release date, based on copyright publication data, up since July 2024. What this means is that the LLM whiffed harder than I initially gave it credit for. It's apparently still having trouble accessing the US Copyright database, because neither model looked there despite the explicit instructions to query it for all releases after 1978. The Thinking model evidently didn't query RYM at all and checked 45Cat (not the best for albums) before going straight to trade publications, radio charts, and a newspaper article. From there it defaults to the Monday prior to the earliest mention and gives a date of 7/14/1980.

The Agent whiffed even harder, though the date it gave was closer to the correct one. First, it said that RYM only listed 1980, but it appears that hasn't been true for nearly 2 years. From there it skips the copyright queries entirely and goes straight to the industry publication data, which this time have an earliest mention of 7/12/1980. Here's where it makes its biggest error. The instructions specified for it to default to Monday if there wasn't a coordinated release day. Here, it picks Tuesday, July 8. Why? It states that 1980 had a typical Tuesday release date, and cites a Vox article. This is not true, and the Vox article says that the Tuesday release date started in the 1980s. To be specific, coordinated Tuesday releases began in April 1989, nearly a decade after FtH was released. So it misunderstood the Vox article. But even had it understood it correctly, it still would have been in a bit of trouble, because the Vox article itself had an error. It says that before April 1989, record stores would stock releases whenever they came in. This is also incorrect; an article in a March 1989 issue of—you guessed it—Billboard, stated that they were changing the release date from Monday to Tuesday because some retailers weren't getting their stock until late Monday. It also says that MCA stayed with the Monday release for the time being (they would switch to Tuesday in 1991 or so). In fact, labels had been coordinating Monday releases since 1982 or 1983. This doesn't matter for the purposes of my rules, since they default to Monday, but it's something to be aware of.
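To make the Agent's weekday error concrete, here is a small Python sketch that applies the fallback to its earliest mention of 7/12/1980 under both weekdays. It assumes the fallback means the latest matching weekday strictly before the earliest mention; the helper name and constants are hypothetical, not part of my actual instructions.

```python
from datetime import date, timedelta

MONDAY, TUESDAY = 0, 1  # date.weekday() numbering

def weekday_before(first_seen: date, weekday: int) -> date:
    """Latest date strictly before first_seen that falls on the given weekday."""
    # A result of 0 means first_seen is already that weekday, so go back 7 days.
    days_back = (first_seen.weekday() - weekday) % 7 or 7
    return first_seen - timedelta(days=days_back)

earliest_mention = date(1980, 7, 12)  # earliest industry mention the Agent found

print(weekday_before(earliest_mention, MONDAY))   # 1980-07-07, what the rule calls for
print(weekday_before(earliest_mention, TUESDAY))  # 1980-07-08, what the Agent picked
```

Neither result matches the 6/16/1980 copyright publication date, which is why the copyright query was supposed to come first for post-1978 releases.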

The upshot is that we ran 2 releases with 2 models each and got 4 different answers, none of which was the correct one. To summarize the answers so far for GRoL:

GPT o3: 5/23/1966
GPT 5.2 Thinking: 5/28/1966
GPT 5.2 Agent: 5/16/1966
Gemini 2.5 Pro: 6/13/1966
Claude 4.0 Sonnet: 5/23/1966 or 5/30/1966 (it couldn't decide)

Five models, five dates, none of them correct. There was a glitch in the FtH test where I inadvertently made it too easy, and both models still whiffed. When I first designed the test I intentionally omitted release dates that were on reputable websites, because I had no doubt that the LLMs could perform a simple lookup, but one model didn't bother looking and the other probably didn't bother looking. What I suspect happened here is that the 1980 date was in the initial training data from before July 2024 and the model didn't double-check the site to see if it had been updated. That's just a guess, but either way it seems like a major problem if after a year it can't find a number on a webpage I specifically instructed it to check. It also doesn't understand that "since the 1980s" does not mean "since January 1, 1980".

As a final thought, when I was checking the ARSA data, I pulled the 5/9/1966 survey from KBLA in Burbank, CA, and noticed something interesting. GRoL did not appear on the chart itself, but in a special "coming attractions" section. Now, I want to make it clear that the dates I am expecting are merely estimates, and that the radio data is the least reliable, since stations often get copies for airplay in advance of release. When I was developing this system, I made a judgment call that I'd prefer a too early release date to a too late one. I initially had no way of knowing whether the coming attractions were records that had been released and were expected to be on the next chart, or merely records scheduled for release. I considered the possibility that this may have caused the LLM to think they hadn't been released (before discounting it because they also ignored charts where the record had appeared and may have provided an explanation for why they were discounting a chart). Then I noticed that the coming attractions section that week also included the Temptations classic Ain't Too Proud to Beg. This was fortuitous, because Motown release dates are well-documented; if that record had been released by May 9, then I could be confident that the other coming attractions probably were as well. Ain't Too Proud to Beg was released May 3, 1966, one day after my estimate for GRoL (Motown didn't stick to a set release day). It's a small sample size, but I'm more confident in my method than I was before.

Would an LLM have recognized this possibility and thought to check it like this?
