Culture War Roundup for the week of May 18, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

AI bros still in shambles, news at 7.

A few weeks ago, Anthropic made a post about their new model, Mythos. As other members of the AI industry have done as far back as the release of GPT-2, its creators said it was too dangerous to release. The headline feature of Mythos, at least as described by Anthropic, was not code generation. Instead, they specifically hyped it as the most amazing thing ever for finding security vulnerabilities in code.

Several people, including here on this forum, shared the hype. As usual, I remained unconvinced. I've mentioned elsewhere that I don't think AIs are inherently incapable of finding security vulnerabilities in code; my main skepticism is that they will generate lots of false positives in the process, which will make them a lot less useful than the companies selling them have advertised. And more importantly, I think they are currently incapable of designing and maintaining any significant projects that go beyond a basic bitch CRUD application or things of that sort. I'm also skeptical that there is all that much room for growth or improvement beyond their current capabilities, for a number of reasons that I won't get into right now.

But enough about my opinions, I'm just a retarded code monkey doing API integrations for boring tax software. Enter Daniel Stenberg, the creator and maintainer of curl. For those who don't know, if you have a program or library that makes HTTP requests, there is an extremely high likelihood that it is using curl under the hood. It's basically one of the foundational pieces of modern digital infrastructure, a "project some random person in Nebraska has been thanklessly maintaining since 2003", as XKCD might put it: https://xkcd.com/2347/

Stenberg/curl was one of the projects that was offered early access to Mythos. However, despite being promised access initially, it took several weeks to get it. And even then, he was suddenly no longer offered direct access; instead, he was offered the option of having someone else run Mythos against his codebase and share the results with him. This is a big red flag for me, because if Mythos does actually generate a lot of noise/false positives, it would make sense that Anthropic would want to hide that by running it themselves as many times as they could until it actually generated some real, actionable results.

In any case, the results that Stenberg got back were underwhelming. Mythos claimed to have identified 5 vulnerabilities. After investigating all of them, Stenberg and his team determined that only one of those was a vulnerability, and a low severity one at that. In Stenberg's own words: "curl is certainly getting better thanks to this report, but counted by the volume of issues found, all the previous AI tools we have used have resulted in larger bugfix amounts."

Most damning from Stenberg is this: "My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing."

So I'm asking @self_made_human and others who seem more on-board with the AI hype train: does this report from a knowledgeable and experienced developer change your opinions on the future trajectory of AI at all?

Full article by Stenberg can be found here:

https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/

For the record, while I appreciate the name-drop, I've largely checked out of this debate. I read the article when it crossed HN, which I browse daily. The strongest critique of Mythos is that GPT 5.5 Pro reaches similar benchmarks while being cheaper and generally available. Which is to say: Mythos isn't quite as special as Anthropic would like, because a competing frontier model already demonstrates equivalent capabilities. See the problem there? Or, from my vantage point, the absence of one?

Why so checked out?

Not because I've recanted, and not because I've stopped believing my own forecasts. It's that anyone who hasn't gotten the memo by now is beyond my ability to help. I've been on this beat for years, sounding the alarm for about as long. Litigating whether each fresh data point lands above or below the trendline has stopped feeling like a useful expenditure of my evenings. I still have the arguments cocked and loaded, still bookmark whatever catches my eye, with roughly the clinical curiosity of an ICU physician watching creatinine and urea climb and eGFR slide in a patient with end-stage renal disease. Erdős problems falling like dominoes and Terence Tao watching from the sidelines, Tim Gowers writing up breakthroughs from OpenAI's unreleased general-purpose models, METR's task-horizon metrics snapping like a mediocre school psychologist trying to score Einstein on the Stanford-Binet. (At some point the instrument stops measuring the subject and starts measuring its own inadequacy.)*

TL;DR: my supply of fucks is running thin. If you're pinging me hoping to extract an argument about AI capabilities, calibrate accordingly. I've got bigger fish to fry before I get thrown into the fryer myself. Good luck to whoever still has the energy for it.

*Go ask ChatGPT for citations and actual links.

METR's task-horizon metrics snapping like a mediocre school psychologist trying to score Einstein on the Stanford-Binet.

Einstein's IQ was probably about 140, so no. You don't understand what you're talking about here. Nice try though.

Sure buddy. The psychiatry resident who reads up on psychometrics for fun and is fully aware of the unreliability of standard IQ testing when we're going several sigmas away from the median of the distribution wouldn't have any idea about what he's talking about. Especially when his actual point is that trying to IQ test someone as far out of distribution as Einstein (famous for being the dimmest bulb in the shed)* is going to give unreliable results?

You wanna try telling me Feynman had an IQ of 125? I'll believe you, or at least humor you.

Aight. Gotta hand it to you. Never had a chance of winning this argument. I concede.

*He's famous for only emitting a singular photon.

Especially when his actual point is that trying to IQ test someone as far out of distribution as Einstein (famous for being the dimmest bulb in the shed)* is going to give unreliable results?

Einstein was not out of the distribution. He was very smart, but not in a reality-shattering way, and he was very focused on his craft, like all successful smart people.

The psychiatry resident who reads up on psychometrics for fun and is fully aware of the unreliability of standard IQ testing when we're going several sigmas away from the median of the distribution wouldn't have any idea about what he's talking about.

I have a PhD in statistics. With all due respect, I know what physicians study, and while many of you are great healthcare practitioners, you do not study the quantitative.

Einstein was not out of the distribution. He was very smart, but not in a reality-shattering way, and he was very focused on his craft, like all successful smart people.

Einstein. The gentleman responsible for Special Relativity, General Relativity, the photoelectric effect (which actually got him the Nobel), Brownian motion as proof of atoms, mass-energy equivalence, and Bose-Einstein statistics. "Very smart, but not in a reality-shattering way."

I'm going to need a minute to process that. Possibly three. Fortunately my psychiatry experience prepares me well; I can usually recover from being utterly flabbergasted in 5 seconds or bust.

If two complete overhauls of how humanity understands space, time, gravity, and matter don't clear your bar for "reality-shattering," I'd genuinely love to know what does. Should he have collapsed the lightcone via propagating false vacuum decay? Manually torn the curvature tensor out of the universe and presented it to Bohr in a jar? What are you on about? What are you smoking?

As for “Einstein was probably about 140”: probably according to what? A preserved Wechsler protocol from 1905? A Stanford-Binet administered by divine revelation? Some conversion table from “invented general relativity” to “moderately gifted but not too spooky”? I am genuinely curious how you got to “Einstein probably had IQ 140”. I presume you've heard of something called a ceiling effect?

"Focused on his craft, like all successful smart people" really makes me wonder which Einstein you mean. The one who played violin semi-seriously, wrote political and philosophical essays by the bushel, corresponded with Freud about the psychology of war, lobbied Roosevelt about the bomb, and turned down the presidency of Israel? Monomaniacal indeed. The phrase "successful smart people" is also wonderfully convenient as a construction, since any polymath counter-example presumably just gets retroactively reclassified as unsuccessful.

I have a PhD in statistics. With all due respect, I know what physicians study, and while many of you are great healthcare practitioners, you do not study the quantitative.

With whatever respect you're due, and without further comment on the magnitude of that debt: British psychiatrists are held to higher standards than that. I'm held to higher standards than that, mostly by myself. I know the difference between Cohen's d and Hedges' g. My interest in entering a d-measuring contest with you is, by consensus values, small. It is roughly equivalent to my interest in arguing with you about the psychometric validity of the other form of g.

Don't believe me? Here's the MRCPsych Paper B critical appraisal syllabus.

I gave it last week. The headache is still bad enough that I'm not going to dig through my own post history to surface the times I've gone several layers deep into statistics arguments on this site. You're welcome to spend your time doing so, I value mine.

Lumping me in with the median doctor who thinks p<0.05 gud? Nice try though.

I'm going to need a minute to process that.

I heard you can do that in just 5 seconds if you just run really really fast. Like, really fast.

And shrink when viewed by an external observer? I'm a grower, not a shower.