Culture War Roundup for the week of August 18, 2025

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

I was browsing through the news today and I found an interesting article about the current state of AI for corporate productivity.

MIT report: 95% of generative AI pilots at companies are failing

Despite the rush to integrate powerful new models, about 5% of AI pilot programs achieve rapid revenue acceleration; the vast majority stall, delivering little to no measurable impact on P&L.

There seems to have been a feeling over the last few years that generative AI was going to gut white collar jobs the same way that offshoring gutted blue collar jobs in the 1980s and 90s, and that it was going to happen any day now.

If this study is trustworthy, the promise of AI appears to be less concrete and less imminent than many would hope or fear.

I've been thinking about why that might be, and I've reached three non-exclusive but somewhat unrelated thoughts.

The first is that the Gartner hype cycle is real. With almost every new technology, investors tend to think that every sigmoid curve is an exponential curve that will grow without bound. Few actually are. Are we reaching the point where the practical gains available in each iteration of our current models are beginning to bottom out? I'm not deeply plugged in to the industry, nor the research, nor the subculture, but it seems like the substantive value increase per watt is rapidly diminishing. If that's true, and there aren't any efficiency improvements hiding around the next corner, it seems like we may be entering the trough of disillusionment soon.

The other thought that occurs to me is that people seem to be absolutely astounded by the capabilities of LLMs and similar technology.

Caveat: My own experience with LLMs is that it's like talking to a personable schizophrenic from a parallel earth, so take my ramblings with a grain of salt.

It almost seems like LLMs exist in an area similar to very early claims of humanoid automata, like the Mechanical Turk. They can do things that seem human, and as a result, we naturally and unconsciously ascribe other human capabilities to them while downplaying their limits. Eventually, the discrepancy grows too great - usually when somebody notices the cost.

On the third hand, maybe it is a good technology and 95% of companies just don't know how to use it?

Does anyone have any evidence that might lend weight to any of these thoughts, or discredit them?

There are two recent companion articles I'd add as commentary on this.

  1. Why LLMs can't actually build software

This one is pretty short and to the point. LLMs, without any companion data management component, are prediction machines. They predict the next n tokens based on the preceding (input) tokens. The context window functions like a very rough analog to a "memory" but it's really better to compare it to priors or biases in the Bayesian sense. (This is why you can gradually prompt an LLM into and out of rabbit holes.) Crucially, LLMs neither have nor hold an idea of state. They don't have a mental model of anything because they don't have a mental anything (re-read that twice, slowly).
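To make the "prediction machine" framing concrete, here is a minimal sketch in Python. It is a toy bigram predictor, not how a real LLM works internally: the names `train_bigram`, `predict_next`, and `generate` are made up for illustration, and a real model conditions on the whole context window rather than just the last word. The one thing it is meant to show is the point above: the predictor is a pure function of fixed weights plus the context it is handed, and the only "memory" is the context being fed back in.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count which word follows which. These counts stand in for model weights."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context):
    """Pure function of (weights, context); nothing is remembered between calls."""
    if not context:
        return None
    followers = counts.get(context[-1])
    if not followers:
        return None
    # Pick the most frequent follower of the last word in the context.
    return followers.most_common(1)[0][0]


def generate(counts, prompt, max_tokens):
    """Repeatedly predict one token and append it to the context."""
    context = prompt.split()
    for _ in range(max_tokens):
        token = predict_next(counts, context)
        if token is None:
            break
        context.append(token)  # the growing context is the only "memory"
    return " ".join(context)


weights = train_bigram("red green blue")
print(generate(weights, "red", 5))
```

Calling `predict_next` twice with the same context gives the same answer, because there is no hidden state to update. Anything that looks like memory has to be carried in the context by the caller.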

In terms of corporate adoption, companies are seeing that once you get into complex, multi-stage tasks, especially those that might involve multiple teams working together, LLMs break down in hilarious ways. Software devs have been seeing this for months (years?). An LLM can make a nice little toy Python class or method pretty easily, but when you're getting into complex full stack development, all sorts of failure modes pop up (the best is when it nukes its own tests to make everything pass).

"Complexity is the enemy" may be a cliche but it remains true. For any company above a certain size, any investment has to answer the question "will this reduce or increase complexity?" The answer may not need to be "reduce." There could be a tradeoff there that actually results in more revenue / reduced cost. But still, the question will come up. With LLMs, the answer, right now, is 100% "increase." Again, that's not a show stopper, but it makes the bar for actually going through with the investment higher. And the returns just aren't there at scale. From friends at large corporations in the middle of this, the anec-data is all the same: "we realized pretty early that we'd have to build a whole new team of 'LLM watchers' for at least the first version of the rollout. We didn't want to hire and manage all of that."

  2. AWS may have shown what true pricing looks like

TLDR for this one: for LLM providers to actually break even, it might cost $2k/month per user.

There's room to disagree with that figure, but even the pro versions of the big models that cost $200+ per month are probably being heavily subsidized through burning VC cash. A Hacker News comment framed it well - "$24k / yr is 20% of a $120k / yr salary. Do we think that every engineer using LLMs for coding is seeing a 20% overall productivity boost?"

Survey says no. (Note: there are more than a few "AI makes devs worse" research papers floating around right now. I haven't fully developed my own evaluation of them - I think a few conflate things - but the early data, such as it is, paints a grim picture.)
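For anyone who wants to plug in their own numbers, the break-even arithmetic behind the $24k / $120k comparison can be sketched in a few lines of Python. The function name is made up for illustration, and the model is deliberately crude: it assumes an engineer's output is worth exactly their salary and ignores overhead, benefits, and any subsidy in the sticker price.

```python
def breakeven_productivity_gain(monthly_cost, annual_salary):
    """Fraction of salary spent on the tool per year.

    Under the crude assumption that output is worth exactly salary,
    this is the minimum productivity gain needed to break even.
    """
    return (monthly_cost * 12) / annual_salary


# $2k/month is the hypothesized unsubsidized price; $200/month is the
# current pro-tier price mentioned above. $120k is the salary from the quote.
for monthly_cost in (2000, 200):
    gain = breakeven_productivity_gain(monthly_cost, 120_000)
    print(f"${monthly_cost}/month -> break-even gain of {gain:.0%}")
```

At $2k/month the bar is a 20% gain; at the $200/month tier it drops to 2%, which is why the size of the subsidy matters so much to the argument.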


I believe LLMs are a transformational technology, but I think our first attempt with them - as a society - is going to be kind of a wet fart. Neither "space-faring giga-civilization" nor "paperclips ate my robot girlfriend." Two topical predictions are 1) One of the Big AI companies is going to go to zero. 2) A Fortune 100 company is going to go nearly bankrupt because of negligent use of AI, but not in a spectacular "it sent all of our money to China" way ... it'll be a 1 - 2 year slow creep of fucked up internal reporting and management before, all of a sudden, "we've entered a death spiral of declining revenue and rising costs."

An LLM can make a nice little toy Python class or method pretty easily, but when you're getting into complex full stack development, all sorts of failure modes pop up

I'm using it for full stack development on a $20 plan and it works. I guess it depends on what you mean by complex full stack development, how complex is complex? I wouldn't try to make an MMO or code global air traffic controls with AI but it can definitely handle frontend (if supervised by a human with eyes), backend, database, API calls, logging, cybersecurity...

And sure it does fail sometimes with complex requests, once you go above 10K lines in one context window the quality lowers. But you can use it to fix errors it makes and iterate, have it help with troubleshooting, refactor, focus the context length on what's critical... Seems like there are many programmers who expect it to one-shot everything and if it doesn't one-shot a task they just give up on it entirely.

The METR paper is somewhat specialized. As the authors themselves mention, it tests only experienced devs working on repositories they're already familiar with, which is the most favourable condition for human workers over AI: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

Secondly, Claude 3.7 is now obsolete. I recall someone on Twitter saying he was one of the devs in that study. He said that modern reasoning models are much more helpful than what they had then, and that people are getting better at using them.

The general trend in AI is that inference costs are declining while capability increases, since the production frontier is moving outwards, so investment will probably pay off. Usage of OpenRouter in terms of tokens has increased 30x within a year. The top 3 users of tokens there are coding tools. People clearly want AI and they're prepared to pay for it; I see no reason why their revealed preference should be disbelieved.

https://openrouter.ai/rankings