This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Cases like this, and the Erdős problems, are exactly where LLMs shine. Problems with clear, unambiguous reward functions that are difficult to hack are perfect use cases. In the Alibaba case, they likely have an extensive set of characterization tests that guarantee consistent behavior. An LLM with a good harness can pound its head against those tests forever while measuring performance as its success metric. It will never get tired and it won't get sick of doing that kind of work.
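To make the harness idea concrete, here is a minimal sketch of that loop, not anyone's actual setup. Everything in it is hypothetical: the function being optimized, the recorded cases, and the fixed list of candidates standing in for whatever an agent would propose. The point is only the shape: a candidate is considered at all only if it reproduces the recorded behavior, and it is kept only if it measures faster.

```python
import time
from typing import Callable, Iterable, List, Tuple

IntFn = Callable[[int], int]

def reference_sum_of_squares(n: int) -> int:
    """Hypothetical legacy code whose behavior must be preserved."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Characterization tests: input/output pairs recorded from the reference.
CASES: List[Tuple[int, int]] = [
    (n, reference_sum_of_squares(n)) for n in (0, 1, 2, 10, 1000)
]

def passes_characterization(candidate: IntFn) -> bool:
    """True only if the candidate reproduces every recorded output."""
    try:
        return all(candidate(n) == expected for n, expected in CASES)
    except Exception:
        return False

def measure_seconds(fn: IntFn, arg: int = 20_000, repeats: int = 5) -> float:
    """Best-of-N wall-clock time for one call; this is the success metric."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        best = min(best, time.perf_counter() - start)
    return best

def optimize(baseline: IntFn, candidates: Iterable[IntFn]) -> IntFn:
    """Keep the fastest candidate that still passes the characterization tests."""
    best_fn, best_time = baseline, measure_seconds(baseline)
    for candidate in candidates:
        if not passes_characterization(candidate):
            continue  # wrong answers are rejected no matter how fast they are
        elapsed = measure_seconds(candidate)
        if elapsed < best_time:
            best_fn, best_time = candidate, elapsed
    return best_fn

# Stand-ins for what an agent might propose (both hypothetical).
def off_by_one(n: int) -> int:
    return n * (n + 1) * (2 * n + 1) // 6  # fast, but sums 0..n instead of 0..n-1

def closed_form(n: int) -> int:
    return (n - 1) * n * (2 * n - 1) // 6  # sums 0..n-1, same as the reference

best = optimize(reference_sum_of_squares, [off_by_one, closed_form])
```

In a real harness the candidate list would be an open-ended stream of agent attempts, which is where the tirelessness (and the bill) comes from. The gate is also only as good as the recorded cases: anything the tests don't pin down is fair game for the optimizer to break.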
There's definitely value there, but I don't know how much value. The combination of technical depth and strong guardrails makes for a very schizophrenic kind of difficulty. Doing that kind of work is traditionally either the domain of a plucky junior with too much energy, or an insane wizard who claimed a broom closet as his office.
When we've experimented with that kind of optimization work at my employer, it has tended to be very expensive, since most of the results come from the absolute tirelessness of the agent. In comparison, how much are you paying your junior? How much are you paying your wizard, and what is he doing if he's not doing that task? Security scans are a similar thing. Line audits aren't hard, but they're hella time-consuming. As model costs rise (and they are rising per task completed when you compare any single vendor over time), it might legitimately be cheaper to throw interns at the problem than LLMs.
At least on the software side, I think there's a reasonable chance that what we're seeing is a temporary pop as a lot of highly verifiable technical-debt deadwood finally gets burned off, and that it might not be a constant source of demand.
On the war side, I wish I knew more. The sensitive nature of the topic means that all parties are incentivized to obfuscate and dissemble as much as possible. It might legitimately be an ideal case. LLMs do well when you can accept 95% accuracy, and in something like intelligence analysis, 95% accuracy probably has the spooks all but shitting their pants.