This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
-
Shaming.
-
Attempting to 'build consensus' or enforce ideological conformity.
-
Making sweeping generalizations to vilify a group you dislike.
-
Recruiting for a cause.
-
Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
-
Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
-
Be as precise and charitable as you can. Don't paraphrase unflatteringly.
-
Don't imply that someone said something they did not say, even if you think it follows from what they said.
-
Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
Jump in the discussion.
No email address required.
Notes -
There are two companion articles of late that I'd add to comment on this.
This one is pretty short and to the point. LLMs, without any companion data management component, are prediction machines. They predict the next n-number of tokens based on the preceding (input) tokens. The context window functions like a very rough analog to a "memory" but it's really better to compare it to priors or biases in the bayesian sense. (This is why you can gradually prompt an LLM into and out of rabbit holes). Crucially, LLMs don't have nor hold an idea of state. They don't have a mental model of anything because they don't have a mental anything (re-read that twice, slowly).
In terms of corporate adoption, companies are seeing that once you get into complex, multi-stage tasks, especially those that might involve multiple teams working together, LLMs break down in hilarious ways. Software devs have been seeing this for months (years?). An LLM can make nice little toy python class or method pretty easily, but when you're getting into complex full stack development, all sorts of failure modes pop up (the best is when it nukes its own tests to make everything pass.)
"Complexity is the enemy" may be a cliche but it remains true. For any company above a certain size, any investment has to answer the question "will this reduce or increase complexity?" The answer may not need to be "reduce." There could be a tradeoff there that actually results in more revenue / reduced cost. But still, the question will come up. With LLMs, the answer, right now, is 100% "increase." Again, that's not a show stopper, but it makes the bar for actually going through with the investment higher. And the returns just aren't there at scale. From friends at large corporations in the middle of this, their anec-data is all the same "we realized pretty early that we'd have to build a whole new team of 'LLM watchers' for at least the first version of the rollout. We didn't want to hire and manage all of that."
TLDR for this one: for LLM providers to actually break even, it might cost $2k/month per user.
There's room to disagree with that figure, but even the pro version of the big models that cost $200+ per month are probably being heavily subsidized through burning VC cash. A hackernews comment framed it well - "$24k / yr is 20% of a $120k / yr salary. Do we think that every engineer using LLMs for coding is seeing a 20% overall productivity boost?"
Survey says no (Note: there are more than a few "AI makes devs worse" research papers floating around right now. I haven't fully developed my own evaluation of them - I think a few conflate things - but the early data, such as it is, paints a grim picture)
I'm a believer in LLMs to be a transformational technology, but I think our first attempt with them - as a society - is going to be kind of a wet fart. Neither "spacing faring giga-civilizaiton" nor "paperclips ate my robot girlfriend." Two topical predictions are 1) One of the Big AI companies is going to go to zero. 2) A Fortune 100 company is going to go nearly bankrupt because of negligent use of AI, but not in a spectacular "it sent all of our money to china" way ... it'll be about 1 - 2 years slow creep of fucked up internal reporting and management before, all of a sudden, "we've entered a death spiral of declining revenue and rising costs."
I'm using it for full stack development on a $20 plan and it works. I guess it depends on what you mean by complex full stack development, how complex is complex? I wouldn't try to make an MMO or code global air traffic controls with AI but it can definitely handle frontend (if supervised by a human with eyes), backend, database, API calls, logging, cybersecurity...
And sure it does fail sometimes with complex requests, once you go above 10K lines in one context window the quality lowers. But you can use it to fix errors it makes and iterate, have it help with troubleshooting, refactor, focus the context length on what's critical... Seems like there are many programmers who expect it to one-shot everything and if it doesn't one-shot a task they just give up on it entirely.
The metr paper is somewhat specialized. It tests only experienced devs working on repositories they're already familiar with as they mention within, the most favourable conditions for human workers over AI: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Secondly, Claude 3.7 is now obsolete. I recall someone on twitter saying they were one of the devs in that study. He said that modern reasoning models are much more helpful than what they had then + people are getting better at using them.
Given that the general trend in AI is that inference costs are declining while capability increases, since the production frontier is moving outwards, then investment will probably pay off. Usage of Openrouter in terms of tokens has increased 30x within a year. The top 3 users of tokens there are coding tools. People clearly want AI and they're prepared to pay for it, I see no reason why their revealed preference should be disbelieved.
https://openrouter.ai/rankings
More options
Context Copy link
If the Big AI companies try to actually implement that kind of pricing, they will face significant competition from local models. Right now you can run Qwen3-30B-A3B at ridiculous speeds on medium-end gaming rig or a decent Macbook, or if you're a decently sized company, you could rent a 8xH200 rig 8h/day, every workday, for ~$3.5k/mo, and give 64 engineers simultaneous, unlimited access to Deepseek R1 with comparable speed and performance to the big known models, so like... $55/month per engineer. And I highly doubt they're going to fully saturate it every minute of every workday, so you could probably add even more users, or use a quantized/smaller model.
Yes.
Which is why the Big AI companies are looking to tightly couple with existing enterprise SaaS and/or consumer hardware as fast as possible. And I'm reasonably sure that the large hardware companies may want to aid them. NVIDIA keeps making noise about "AI first" hardware at, I think, a consumer level.
They really do want a version of Sky Net.
More options
Context Copy link
More options
Context Copy link
More options
Context Copy link