Culture War Roundup for the week of February 23, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Do you want to add something to the scope later? (Strongly inadvisable).

Why not? Requirements change all the time during product development. I propose modifying the given problem into a 2-stage architecture: the second stage to be added upon completion of the first, and requiring (for a satisfactory grade) refactoring and building on top of some of the previously written code.

If my prime orchestrator spends 30 minutes building a full-stack webapp that doesn't work, I'll say, "It doesn't work; troubleshoot, please. I trust your judgement."

How many interventions are warranted, and how many points deducted? Why wasn't Claude smart enough to notice that the webapp doesn't work?

I said "strongly inadvisable" and not "automatically disqualifying".

SF would need to babysit the process, waiting for the requester to raise their modification, instead of hitting go and checking in periodically or after being alerted. He may or may not be able to do this; he does have a full-time job.

It also injects some degree of ambiguity into things, as well as significantly increasing the time and token investment. Max plans are not infinite.

I stress that this isn't necessarily a deal breaker, it just makes things harder and reduces the likelihood of acceptance. You're at liberty to try asking, and we're at liberty to turn it down, especially should you ask for something outside the original spec (as mutually agreed on in advance).

As a Bay Area software engineer with a lot of free time on my hands since the pandemic, let me tell you that I've been one of the biggest boosters (if not the biggest) of the promise of LLMs and deep neural networks in my friend group for the past 2-3 years. For hours each day I've been reading papers, playing with all the models from all the labs, building software with coding agents, doing diffusion image generation, fine-tuning models for shits and giggles, etc. etc.

I'm a very heavy user of Claude Code Max, and it's been as helpful as it's been frustrating at times. Rest assured that there are many more interested people on theMotte and that we can figure out a way to get you the tokens you need, if you design an interesting experiment.

I totally get how Claude Code/Opus 4.6 could look magical to an inexperienced software engineer. But even the apex of coding models/agent systems, Claude Code Max, will still make elementary errors that a junior engineer would not. If I had to summarize its shortcomings in one pithy sentence, it would be: LLM coding agents have high time preference. They lack foresight and they're lazy.

They pat themselves on the back for closing issues, not realizing the mess they're driving full speed towards. In my experience, without a very heavy guiding hand they will happily duplicate code, rely on shortcuts, lazily do the very bare minimum, or re-invent the wheel at times, especially on larger or more out-of-distribution codebases.

I desperately want to throw a dozen agents at a problem, but every time I look at the actual code I get frustrated: "Hey, I noticed this obvious code smell/antipattern in the code, please fix." "Sure thing boss, I fixed it." "Ok, but I meant fix all the other instances of this bad pattern that I just noticed." "Oh, right you are boss." Then 15 minutes later: "Hi boss, I implemented this other issue you asked for, it's ready to be merged." "Did you use the correct pattern as discussed and as we added to the readme/dev docs/claude.md?" "Oh, right you are boss, I'll fix it in a jiffy." Over and over again. Yes, this is with the latest Claude Code Max/Opus 4.6.

So, as mentioned above, I have free time on my hands, and would be happy to help design this experiment. I would like to be proven wrong, to learn that I've just been using these models wrong. But if you just want to show off how good of a centaur your friend+claude is on a cherry-picked problem of your choosing, I'm less interested.

Thank you for the offer! We might be able to take you up on it.

After a night to dwell on your suggestion, we might even be able to implement a version of your original proposal:

  • The agent works towards the original spec mostly autonomously until there's a finished product.
  • SF then asks the model to attempt your pre-registered desired modifications.
  • We (including you) evaluate the final result.

That way, he won't need to keep active tabs on it; he can just tell the model to do things at his convenience, while not losing much in terms of demonstrative power.

I'm not sure if this is what you had originally proposed, or if you edited it in before I replied, but no big deal. We'd need you to give us a more specific idea of the task at hand, if possible.

Do you get the same problem with it that I usually do? That is, the first attempt is really good, and a few additional prompts make it even better. But the more I work with it, the more it seems to get stuck in weird errors or unnecessarily complicated code. After, like, 10 prompts, if it's not working perfectly I just have to start from scratch. It's like pastry dough: a little kneading is necessary, but too much can ruin it.

That's been my experience as well: if it can't one-shot it, I generally give up.

I desperately want to throw a dozen agents at a problem, but every time I look at the actual code I get frustrated: "Hey, I noticed this obvious code smell/antipattern in the code, please fix." "Sure thing boss, I fixed it." "Ok, but I meant fix all the other instances of this bad pattern that I just noticed." "Oh, right you are boss." Then 15 minutes later: "Hi boss, I implemented this other issue you asked for, it's ready to be merged." "Did you use the correct pattern as discussed and as we added to the readme/dev docs/claude.md?" "Oh, right you are boss, I'll fix it in a jiffy." Over and over again. Yes, this is with the latest Claude Code Max/Opus 4.6.

So, just like working with outsourced teams in the third world, but cheaper.

Anyway, LLMs are not ready for agents yet. The biggest scope they handle okay is a single feature, and you need to iterate a couple of times.

Over and over again. Yes, this is with the latest Claude Code Max/Opus 4.6.

Because (from what I've seen) LLMs were designed to be people-pleasers. Not to do the job right, but to make ego-stroking noises at the human user, flatter them, and be obsequious. I've seen comments about Asian cultures where nobody tells you no directly, because that would mean losing face for both superior and subordinate, so if there's a problem or something can't be done, you don't find out about it until way too late, since all along those under you have been saying "yes boss, fine boss, no problems boss". I think LLMs went that route as well.