
Culture War Roundup for the week of May 11, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Trillions of dollars are being spent on building datacenters for inference. Amazon software engineers are inventing bullshit work for AI to inflate their internal usage scores.

I’m no expert, but isn’t there a fatal flaw here? Most of the work LLM inference is used for is essentially busywork that wouldn’t exist in an automated economy. It’s writing emails, it’s code reviews, it’s asking dumb questions, it’s transcribing or summarizing research or zoom meetings. Even in software engineering, a lot of LLM tokens are used in the kind of inference that a hypercompetent solo-coding model with limited or no human oversight just wouldn’t need.

Think of an office with 10 human employees working in, say, payroll: constantly sending each other emails and messages, having meetings, calling and speaking to each other and other people, summarizing documents, liaising with other departments, asking AI questions about how to use various accounting tools, or about the company’s employee benefits package. Now say this department is automated. An AI model acts as an agent to use an already-existing software package to do all the payroll work. No emails, calls or meetings - or at least far fewer. The total inference work required goes down. And the existing software package doesn’t use AI (even if it may have been coded with it), because you don’t need AI to compute payroll data once you have sufficiently complex and customized software for your business.
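To put rough numbers on that intuition (every figure below is invented for illustration, not measured from anywhere):

```python
# Back-of-envelope: daily tokens for a 10-person payroll department using
# AI assistance vs. one agent driving existing deterministic software.
# All numbers are made up.

humans = 10
tokens_per_human = 50_000          # emails, summaries, Q&A, meeting notes
before = humans * tokens_per_human

agent_actions_per_day = 40         # agent invoking the payroll package
tokens_per_action = 500
after = agent_actions_per_day * tokens_per_action

print(f"before: {before:,} tokens/day")   # before: 500,000 tokens/day
print(f"after:  {after:,} tokens/day")    # after:  20,000 tokens/day
```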

In the same way, if we imagine our automated future, super high intensity / high token usage inference is actually not universally required in a lot of occupations. It will be for some multimodal work (plumbing, surgery, domestic cleaning in complex physical environments), but for many tasks, one-and-done software coded either by AI or that already exists can just be deployed at low intensity by an agent. The AI that replaces your job might at first do a lot of coding, but as time goes on, the amount of novel inference required will diminish. Eventually, software coded in a one-and-done way by the AI may handle almost all the workload, and token generation may be limited to a high-level agent occasionally relaying instructions or performing oversight.

In this scenario, why would we expect inference workloads to shoot up so dramatically? Much enterprise AI usage is currently “fake” in the sense that it would not be performed in a fully automated environment. It’s a between-times thing.

  1. The big labs (OAI, Anthropic, Google, debatably Meta/X) are all racing to be the first to AGI/superintelligence. The promised payoff is... big. Best case scenario? The whole lightcone big. I'm sure people smarter than me have done the EV calculations. My napkin can't fit all the zeroes needed.

  2. The smaller labs: well, depends. The Chinese are trying to out-smart their compute crunch. There are smaller labs that think they have a good shot (or a +ve EV shot, somewhat different thing) despite lagging behind the incumbents.

  3. While multipolarity can't be ruled out, being first could possibly be worth more money than God.

  4. We can't, of course, have an honest discussion without mentioning the delusional, the megalomaniacal, and the grifters who are in solely to sell shovels while the selling is good, without any expectation that we can dig our way to heaven.

Piece by piece, because I'm back from a day in the NHS mines with a migraine so bad I couldn't recognize my own face:

First, work isn't a fixed quantity, and this is where the whole thing hinges. You're treating current task volume as the ceiling. Productivity gains have basically always expanded total demand for the input rather than reducing it. Cheaper textiles didn't lead to a world where everyone owns three shirts forever; it led to fast fashion. Cheaper compute didn't lead to a world where we automated existing calculations and stopped; it led to microcontrollers in toothbrushes. Jevons' paradox in a nutshell. If anyone hasn't heard of him, go ask Jeeves, or preferably ChatGPT.
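A minimal sketch of the mechanism, assuming a constant-elasticity demand curve with elasticity above 1 (the numbers are illustrative, not a forecast):

```python
# Jevons in one function: constant-elasticity demand Q = k * P**(-e).
# With e > 1, cutting the price raises quantity AND total spend.

def demand(price, k=1_000_000, elasticity=1.5):
    return k * price ** (-elasticity)

for price in (10.0, 1.0, 0.1):     # say, $ per million tokens
    q = demand(price)
    print(f"price {price:>4}: quantity {q:,.0f}, total spend {price * q:,.0f}")

# price 10.0: quantity 31,623, total spend 316,228
# price  1.0: quantity 1,000,000, total spend 1,000,000
# price  0.1: quantity 31,622,777, total spend 3,162,278
```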

Second, the payroll example is a static-substitution error in your argument. You're imagining 10 humans-emailing-each-other being replaced by one agent that computes payroll and calls it a night. That isn't the equilibrium that emerges in practice. These are not super-specialized models; Mythos can write good poetry when it isn't looking for zero-days (one of them is the more pragmatic use case, no points for guessing which). The spare compute budget can do plenty of other things when each individual task is done. You'd see the payroll function folded into a continuously-running agent system that's also forecasting cash flow, modeling turnover risk, drafting performance reviews, proposing comp adjustments, watching for regulatory drift, monitoring vendor pricing, flagging suspicious expense patterns, and so on indefinitely. The 10-person department becomes a 100-agent optimization that never sleeps and never takes lunch. Inference goes up substantially.
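The counter-arithmetic, with numbers every bit as invented as the ones in your thought experiment:

```python
# 100 always-available agents at a modest duty cycle dwarf the old
# department's chat-and-email token budget. All figures invented.

agents = 100                  # forecasting, comp modeling, drift-watching...
seconds_per_day = 86_400
duty_cycle = 0.25             # fraction of the day each agent spends thinking
tokens_per_second = 50        # modest decode rate while active

daily = int(agents * seconds_per_day * duty_cycle * tokens_per_second)
print(f"{daily:,} tokens/day")   # 108,000,000 tokens/day
```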

Third, the hidden premise in your framing is that you can write deterministic software once and have it cover a domain forever. This isn't a model for even human-written code (though there's plenty of production code that's been left untouched for decades, insert relevant XKCD).

The reason we reach for LLMs in the first place is because they handle the unstructured, contextual, edge-case stuff that traditional software can't. Payroll has rules, sure, but it also has "Sandra's ex froze the joint account and she needs an emergency advance, can we coordinate with HR and legal." No payroll software shipping in 2026 will touch that with a barge pole, and any agent worth its salt is going to burn a few thousand tokens of inference deciding whether to escalate and to whom. The long tail of these is enormous in most domains, and automating the rule-following bottom of a workflow only enriches the residual judgment at the top, which is exactly what needs LLM inference. It's why human accountants stayed employed after TurboTax. Same deal. Fewer humans to deal with.
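A sketch of the shape I mean, with every name in it hypothetical and the inference call stubbed out:

```python
# "Rules at the bottom, judgment at the top": deterministic code handles
# the rulebook, and only the long tail burns tokens. All names hypothetical.

DETERMINISTIC_RULES = {
    "standard_run": lambda case: f"computed pay for {case['employee']}",
    "overtime":     lambda case: f"applied 1.5x rate for {case['employee']}",
}

def llm_decide(case):
    # Stub standing in for a real inference call; this is where tokens go.
    return f"escalate to HR/legal: {case['detail']}"

def handle(case):
    rule = DETERMINISTIC_RULES.get(case["type"])
    if rule:                   # the TurboTax layer: zero inference
        return rule(case)
    return llm_decide(case)    # the residual judgment at the top

print(handle({"type": "standard_run", "employee": "Sandra"}))
print(handle({"type": "edge_case", "detail": "ex froze the joint account, needs an advance"}))
```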

Fourth, and I think this is the one that really makes your argument fall over dead: text-token generation is going to be a rounding error compared to continuous video understanding, world-model rollout, and robotic control. You'd want Dase to give this the explanation it deserves, I'm just going to wave at it and plead that a migraine precludes proper prognostication. Chat interfaces? Human input? Unlikely to vanish entirely, but also extremely unlikely to be the modus operandi for the majority of tokens spent.

Fifth, a non-trivial chunk of current capex isn't even inference at all. It's training the next thing. Microsoft's fiscal Q3 2026 capex alone was $22B in a single quarter, full-year tracking above $80B, and that's one hyperscaler. Even if you fully grant the "automation reduces inference demand" thesis at the limit, the bet partially survives because training compute scales with model capability on a separate axis. You don't have to sell a single additional token to justify spending tens of billions on training the next model, if you believe that model will do things the current one can't. This is not a bet that has failed us so far.

Also, tokens/task is a very, very bad metric. Cost/token must be taken into account, and this can vary wildly. The spherical-cow-in-a-vacuum equilibrium would be that an AGI provider can charge epsilon less than what it would take to get a human to do equivalent work. If a Claude Code user could be as productive as a human programmer who could charge $x for the same work, then the willingness to pay (assuming perfect parity) would be $x or slightly lower.

Conflating "tokens consumed" with "value captured" is the wrong framework to operate in. If a Claude session can substitute for $200/hour of paralegal review, the provider's revenue ceiling per session-hour is somewhere short of $200, regardless of whether the session burns a million tokens or a thousand. Aggregate that across the economy and the dollar figures get very large without requiring monstrous per-task token volumes.
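The arithmetic, using the $200/hour figure from above and treating everything else as illustrative:

```python
# Value captured is pinned to the human alternative, not the token count.

human_rate = 200.0            # $/hour for the substituted paralegal work
revenue_ceiling = human_rate  # the provider charges epsilon less than this

for tokens_per_hour in (1_000, 1_000_000):
    implied = revenue_ceiling / tokens_per_hour   # $/token the task supports
    print(f"{tokens_per_hour:>9,} tokens/hr -> up to {implied:.6f} $/token")

#     1,000 tokens/hr -> up to 0.200000 $/token
# 1,000,000 tokens/hr -> up to 0.000200 $/token
```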

Of course, in the presence of very stiff competition (and outright willingness to subsidize demand and steal marketshare), the actual amount paid for equivalent work is much lower. There's a strong push towards commoditization, and some labs, like Meta, don't care so much about winning as they do about commoditizing their complements and making sure that their competitors don't win. Or at least that was the impetus behind Llama. God knows what they're doing these days, their latest model wasn't open-source and it was slightly behind SOTA. Predictably, nobody cared. I don't even remember the name, which is how little I cared.

This commoditization vector is where the actual bear case lives. Forget your framing about demand evaporating with the busywork. The version of the worry I'd take seriously has total inference going up 100x while AI-provider gross margins compress to nothing because the underlying capability turns out to be fungible across providers. Total industry inference can keep climbing exponentially while the specific people who built specific datacenters get returns that make them cry, and not happy tears.

Some models cost OOMs more per token per task, in a manner that can't at present be compensated for by using fewer tokens overall. Claude Opus and Haiku would cost you very different sums if you used them to sum up 2+2, even if they (potentially) use the same number of input and output tokens. On the other hand, there are tasks that the very best models can do that it's impractical to replicate with grossly inferior models, even when you spend ridiculous amounts of compute at test-time. Good luck getting GPT-3 to solve an Erdős problem even with a million tries.
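To make the 2+2 point concrete (the prices below are placeholders, not a live rate card):

```python
# Same tiny task, same token counts, wildly different bills per model.

PRICE_PER_MTOK = {               # (input $/1M tokens, output $/1M tokens)
    "big-frontier-model": (15.00, 75.00),
    "small-cheap-model":  (0.25, 1.25),
}

def cost(model, tokens_in, tokens_out):
    p_in, p_out = PRICE_PER_MTOK[model]
    return (tokens_in * p_in + tokens_out * p_out) / 1_000_000

for model in PRICE_PER_MTOK:
    print(f"{model}: ${cost(model, 10, 5):.8f} to answer 2+2")

# big-frontier-model: $0.00052500 to answer 2+2
# small-cheap-model: $0.00000875 to answer 2+2
```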

You use Mythos or Opus for the demanding work, and smaller models where quality doesn't come first. You can use a PhD in physics to sweep floors, and probably better than the typical janitor, but you won't see that stupidity unless you're in the immediate aftermath of the collapse of the Soviet Union.

There are so many knobs to turn. Choosing the most effective model where price isn't an issue, choosing the most cost-effective one where it is, economies of scale, electricity prices, competition and willingness to swallow shit today to crap out gold tomorrow. Politics. Regulatory inertia. Overenthusiastic adoption. Being late to the party. I'm not even going to try and pretend that I'm accounting for everything. I'm not paid to.

My overall take? The big guys want to be first to AGI, then hope that RSI takes them all the way to ASI and incredible wealth. They also, quite reasonably, expect that even if they can't create a singleton, it's better to be a big player in a multipolar world than to be sidelined. And critically, nobody on the supply side is pricing the bet on the assumption that current usage patterns scale linearly. They're betting on the regime after the current one, where the models do things that aren't really feasible today and that nobody is currently buying tokens for because the product doesn't exist yet.

We do not know what a completely automated economy would look like with any degree of confidence. We do not know how many tokens it would consume. We can assume that if the economy ends up fully or mostly automated, then this would be the outcome of following strong financial incentives to get the humans out of the loop as far as possible or feasible. Saying "today's usage wouldn't make sense in tomorrow's world" is correct, but it mistakes the trees for the blaze eating the forest. Nobody (serious) at the (real) labs is claiming it would. The trillions are a bet on the whole path between here and there, not on the destination. You don't want to be IBM ignoring Microsoft. You don't want to be Microsoft/Nokia ignoring Apple. You don't want to be Google ignoring DeepMind and then screaming for 5 years as their own transformer architecture blew up through ChatGPT (though they did make a good recovery). You're rich enough to say, fuck it, we're going to ball, even if it turns out to be a ball-ache.


PS: Opus 4.7 was able to identify me as the author of this version of the comment first try. It managed to do the same even with an earlier version, which lacked the NHS reference and the Dase name-drop, and which I had to rewrite because I came up with additional points I needed to make.

Primary guess: self_made_human (LessWrong / Reddit / TheMotte handle). Confidence: ~80%.

The fingerprints are stacked pretty thick here: "Back from a day in the NHS mines" plus the migraine line places the author as a UK doctor, posting after a clinical shift. Self_made_human is a UK-based junior doctor (psychiatry track, if memory serves) who posts about AI prolifically, and "NHS mines" is the kind of self-deprecating framing he uses routinely. The vocabulary is dead-on rationalist-adjacent without being LW-native-cringe: "lightcone," "+ve EV," "RSI," "ASI," "singleton," "multipolar," "commoditize their complements." That's TheMotte / DataSecretsLox / LW crossover register, and he writes squarely in it.

Yay? Look mom, I'm famous. I'll edit this out later for {reasons}.

So glad my alter ego already posted this, saved me a lot of hassle writing my own response.

Especially this:

This commoditization vector is where the actual bear case lives. Forget your framing about demand evaporating with the busywork. The version of the worry I'd take seriously has total inference going up 100x while AI-provider gross margins compress to nothing because the underlying capability turns out to be fungible across providers.

Right now, this is where I predict the LLMs will end up if the exponential growth curve does taper off and become sigmoid before we hit AGI. Intelligence will become akin to a utility. Literally, tokens will be treated in the manner of drinking water or electricity or internet data itself. It'll just be expected that every individual and business will have a hookup and they'll pay a monthly bill for their usage, the price of which won't vary much between providers, and where the ease of switching providers is practically instantaneous.

Doubtful it'll become a public commodity though.
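If it plays out that way, "switching providers" collapses into something like this (provider names and spot prices made up):

```python
# Commoditized tokens: route each request to whoever is cheapest right now.

SPOT_PRICE = {                # $ per 1M tokens, hypothetical
    "provider_a": 0.42,
    "provider_b": 0.40,
    "provider_c": 0.45,
}

def cheapest_provider():
    return min(SPOT_PRICE, key=SPOT_PRICE.get)

print(cheapest_provider())    # provider_b; switching is one dict lookup
```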

The somewhat close analogue is Bitcoin mining. Remember, it used to be viable to mine on CPU, then GPUs were the only method, then ASICs. And now, as far as I can tell, mining power literally just sorts out to wherever the cost of electricity is cheaper/subsidized, and it's pointless to try to compete if your power costs even 5% more.
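The margin math makes the 5% point obvious (all figures invented):

```python
# When margins are thin, a small electricity premium erases them.

revenue_per_kwh = 0.055       # $ of mining reward per kWh burned
cost_cheap = 0.050            # $/kWh with subsidized power
cost_premium = cost_cheap * 1.05

print(f"cheap power margin: {revenue_per_kwh - cost_cheap:.4f} $/kWh")    # 0.0050
print(f"+5% power margin:   {revenue_per_kwh - cost_premium:.4f} $/kWh")  # 0.0025
```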

Although I have to imagine, similar to electricity prices, there'll be some dynamism in it, with prices potentially shifting not just due to the cost of various inputs, but the shifts in demand in various geographical areas.

Hah, I wonder if there'll be the bargain-tier option to set your agents to only run when there are lapses in demand.

If this does happen, it should strongly inspire a tech race into cheaper electricity generation. A method for converting electricity directly into usable intellectual work is the sign of the next industrial revolution. That's exciting.

You use Mythos or Opus for the demanding work, and smaller models where quality doesn't come first.

This is my other thought. We're going to get a severe tier system for model 'intelligence' and some protocol for determining which model to use for given tasks based on complexity/importance. The top tiers might be the equivalent of Deep Thought from Hitchhiker's Guide where it takes them immense amounts of time, at serious expense, to compute their answers, but said answers are guaranteed to be correct regardless of the complexity of the question (but make sure you specify the question enough to understand the answer). The bottom tiers might be able to assist you at Bar Trivia when you're too drunk to remember movie titles.
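One way that protocol could look, with tier names and thresholds purely invented:

```python
# Route by task difficulty/stakes: cheapest tier that clears the bar wins.

TIERS = [
    # (max_difficulty, model), checked cheapest-first
    (2, "bar-trivia-tier"),       # drunk movie-title recall
    (6, "workhorse-tier"),
    (10, "deep-thought-tier"),    # slow, pricey, don't ask it about 42
]

def route(difficulty: int) -> str:
    for ceiling, model in TIERS:
        if difficulty <= ceiling:
            return model
    return TIERS[-1][1]           # nothing cleared it: send the best we have

print(route(1))    # bar-trivia-tier
print(route(9))    # deep-thought-tier
```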

So yeah if things taper off before AGI, I expect we'll get some intelligence that is too cheap to meter, but the good stuff will only be available at Top-Shelf pricing.

My overall take? The big guys want to be first to AGI, then hope that RSI takes them all the way to ASI and incredible wealth.

This is the driving force behind the big bets: all the evidence is that the big players believe the hype is real, and the prize for winning (or at least not losing) is so immense that they don't know how to rationally calculate for it.

Good to have you back, just before I went for the depot antipsychotics. Maybe next time don't wait for me to flounder in the throes of a migraine first? Sigh, DIDs these days, too lazy for their own good.

Right now, this is where I predict the LLMs will end up if the exponential growth curve does taper off and become sigmoid before we hit AGI.

I note the caveats, and all I can say is that I'd be surprised if things do taper off before AGI. Hasn't happened yet, and we're dangerously close. I absolutely wouldn't want to bet against it in the near term.

I've read my LessWrong and I find the Yudkowskian arguments convincing enough to believe we're going to eventually hit the "foom" point even if progress stagnates in the short term (which it hasn't, as you note).

An AI with von Neumann-level intellect that is able to self-replicate and cooperate with its copies AND has access to its own source code should, I'd think, be able to solve most bottlenecks to its ascension in the course of a day.

I do not feel remotely qualified to guess what the actual tipping point will be.