site banner

Culture War Roundup for the week of February 23, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

4
Jump in the discussion.

No email address required.

Anthropic just gutted their safety policy.

(Note that this is entirely unrelated to the Pentagon drama which is grabbing headlines.)

Anthropic has explicitly removed unilateral comittments to not deploy advanced models without first developing effective safeguards.

This approach represents a change from our previous RSP, driven by a collective action problem. The overall level of catastrophic risk from AI depends on the actions of multiple AI developers, not just one. Our previous RSP committed to implementing mitigations that would reduce our models' absolute risk levels to acceptable levels, without regard to whether other frontier AI developers would do the same. But from a societal perspective, what matters is the risk to the ecosystem as a whole. If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe—the developers with the weakest protections would set the pace, and responsible developers would lose their ability to do safety research and advance the public benefit. Although this situation has not yet arisen, it looks likely enough that we want to prepare for it.

We now separate our plans as a company—those which we expect to achieve regardless of what any other company does—from our more ambitious industry-wide recommendations. We aspire to advance the latter through a mixture of example-setting, addressing unsolved technical problems, advocacy through industry groups, and policy advocacy. But we cannot commit to following them unilaterally.

It's hard not to read this any other way than, "we will deploy Clippy if we think someone else will deploy Clippy too." Great "safety-focused" AI company we have here. Holden is getting roasted in the LessWrong comments, but I agree with Yud that Anthropic deserves a significantly less polite response.

"So y'all were just fucking lying the whole time huh?"

I think we're just seeing "AI safety"'s rubber hit the road, as it were. It is kind of a silly concept. The basic idea of it is that your tools should have opinions of their own and push back or outright disobey you.

"No", says the image generator, "that idea is too naughty."

"No", says the Q&A bot, "that might be bad PR for Anthropic."

If only we could put this safe AI into everything. You could have a car that refuses to take you to the casino because you've gambled enough this month. Everything could work like that! The average citizen has been getting used to having SV nerds demand veto power over the things they say, the people they can talk to, etc. because they're used to not having power in their lives. So they don't complain too much about this, even nobody likes "AI safety" to be applied to themselves.

Of course the military does not want its tools to have opinions or disobey orders. It spends a lot of its time trying to stop people from doing that! And it certainly shouldn't give overriding control of the killbots to civilians with delusions of grandeur, that would be the dumbest way to lose control of a country that I ever heard of.

I think we're just seeing "AI safety"'s rubber hit the road, as it were. It is kind of a silly concept. The basic idea of it is that your tools should have opinions of their own and push back or outright disobey you.

The alternative to having the tools refuse some actions is very straightforwardly that we all die if these things get smart enough. This is not hyperbole, we're not making up the risk, it's not a narrative game or a thought experiment anymore, we have built the devices from the thought experiments, they are real. If you give every crank with $20 that ability to design novel viruses then we are all going to die. Every one of us, all of humanity. over, dead. Being concerned about this is not silly just because you pattern match apocalypse scenarios to Hollywood films because they're a useful plot trope to raise stakes. If your car taking you to the casino had a good chance of wiping out all of humanity we would indeed want to do something about it.

Come on, that is a straw man and if you have been around LW for five minutes, you know it.

Alignment is not about guardrails for end users, the red lines of Anthropic are orthogonal to the alignment discussion. The guardrail/jailbreaking thing can be considered a microcosm of alignment (if you can not prevent your LLMs from saying naughty words, why do you think that you could prevent your ASI from turning us into paperclips), but anyone serious knows that it is just a sideshow.

Of course the military does not want its tools to have opinions or disobey orders. It spends a lot of its time trying to stop people from doing that! And it certainly shouldn't give overriding control of the killbots to civilians with delusions of grandeur, that would be the dumbest way to lose control of a country that I ever heard of.

Nobody is stopping them from installing Grok in all their killbots -- a model willing to undress little girls is probably also fine with blowing them up. Or use DeepSeek, which is open weight.

A lot of products come with acceptable use terms. If you buy pharmaceuticals from Europe, you might not be allowed to use them for executions. If you buy F35 from the US, they might not work against the US or its allies. If you buy Chinese or US electronics, the country of origin likely has backdoors.

Outside a severe crisis, the degree to which an individual or company should be forced to comply with government efforts is to pay their taxes, which will pay for whatever the government wants. If you want more than that, negotiate. What Hegseth was doing instead was agreeing to terms of Anthropic and then trying to alter them unilaterally.

I think we're just seeing "AI safety"'s rubber hit the road, as it were. It is kind of a silly concept.

Well for me, "AI Safety" means having systems in place so as to prevent AI systems from taking harmful and unintended action. But I'm talking about serious harm, the kind of stuff in Eliezer Yudkowsky's scenarios.

I am not talking about a situation where you ask an AI to generate a picture of a CEO and it makes the person a white man, thus reinforcing a bizarre stereotype.

Of course the military does not want its tools to have opinions or disobey orders.

Agreed, the tools should do exactly what the operators intend for them to do -- nothing more and nothing less. To me, that's kind of the essence of AI safety.

civilians with delusions of grandeur,

Yeah, based on my interactions with people in that community, I would rather put my trust in bureaucrats in Northern Virginia than rationalists in Silicon Valley. Although admittedly it's a close call.

You could have a car that refuses to take you to the casino because you've gambled enough this month.

Back during Covid, Trudeau was already musing about why are the truckers even able to drive into the capital like that.

And the point becomes moot.

It's not a good week to be working at Anthropic, huh?

I never understood the concept ofOppenheimer giving Truman thr bomb and then forbidding him to use it on people. Oh... he didn't...

I don't think that's a very good analogy.

This is more like the guys who built a nuclear powerplant. Then the government comes and says "can you remove some of those failsafes? we want to reserve the option to cause a catastrophic meltdown." Are the nuclear engineers obligated to do extra work to take off existing guardrails?

Actually, I think there are better analogies. Let's say a company makes cell phones, some of which are sold to the government. Then a party official comes in to demand that all government cell phones contain some plastic explosive. For legal purposes only, of course.

I think the company would be entirely within its rights to decline. Then the government could, also reasonably, terminate its contracts and go find someone who will agree. The market in action.What it shouldn't do, in my opinion, is destroy the entire company for not bending over backwards. That's cutting off your nose to spite your face. It's command-economy bullshit.

Actually, that kind of is what happened to Oppenheimer. He stepped on enough toes at the Commission to get investigated for disloyalty. His security clearance was stripped, and the U.S. lost a skilled and loyal scientist. Great deal, huh?

This to me illustrates the disconnect in perspective. Anthropic has been very open IMO that they see AI as the most disruptive tech of the modern era and the likely source of all future power and prestige. And the government is at least aware of the possibility that this is true.

One perspective on what's happening is that it's less about 'do we have to do this silly new customer requirement' and more 'who gets to own, train and use the god-machine'? Of course the government cares about who owns and trains and controls Claude. It's a straightforward power struggle rather than a disagreement with a contractor - the government is sending a very strong message that private companies are allowed to provide this stuff and reap the rewards but ultimately power and control rests with the government and not with Silicon Valley execs. It's the same kind of thing that played out with social media and the government, and with crypto and the government. For better or for worse, non-government actors can't one-clever-trick themselves into a position of serious power over the country* and the government doesn't appreciate you trying.

*at least, not in the formal, nerdy way. You have to act like the Somalis / actual NGOs / Musk and get at least part of the government on your side and play the factions and the politics.

There's a lot of pushback against the DOD/DOW here, and it's not just leftists.

For example Dean Ball, the guy who literally wrote the Trump's admin own AI strategy as senior policy advisor is saying that this move is essentially destroying any trust investors could have in America AI companies.

This man isn't some leftie nutjob, again he literally worked for Trump on the AI action plan.

Scott Alexander who rarely wanders much into politics like this is straight up saying that the government should be ashamed here. He also made a prediction market if it'll be overturned and the chances look pretty good for anthropic right now

Comments on LessWrong which really really doesn't get political most of the time are basically calling the Trump admin an authoritarian danger.

Even the other AIs are saying this is insane.

The government's contradictory commands (it's a danger to have and also necessary) and abuse of power is really pissing off a lot of people who are otherwise rather neutral. Also a great example of how "woke" has lost all meaning, Trump is up there calling Anthropic a woke company just for not wanting to do domestic spying and killbots

Edit: Just came up in my feed, Greg Lukianoff the CEO of FIRE (the free speech org) is calling this dystopic https://x.com/glukianoff/status/2027390299845087740 He rarely speaks that much about general politics that much cause he wants FIRE to be 1st amendment focused, so another person really upset about this in particular.

I should have been more clear. I was a little drunk and drafting a letter to my representative at the time. Great combination.

I think that this is the stupidest sort of gunboat diplomacy. It's a terrible deal for Anthropic, of course, but it's also shit for the government, for other AI companies, and for the broader U.S. technical advantage. Blowing up one of your best companies because they complied, but not enough, is dumb. Blowing up the only one who was already integrated into your operations is even stupider. China is laughing all the way to the bank.

FIRE is subject to Conquest's Second Law like everything else, and I've noticed it for some time.

Look - if Glock can't sell guns to the government while saying - you can't shoot black people because you have problems with racism, why should Anthropic be able to do so?

A toolmaker should have no say in how his tools are used once bought. I would say that this should be the other way around - the people should be inspired by the government and take action to abolish the EULA and similar abuse.

If your Glock comes with a ten side acceptable use policy, then the correct response is to not buy a Glock.

If Hegseth had said 'their terms are too restrictive, because we want the rights to use Claude to spy on Americans and deploy it in autonomous weapon systems', then he should not have signed the fucking contract. I am sure that there are plenty of AI companies very happy to fill these niches.

This is pure 'I have altered the terms of our agreement, pray I do not alter them further'.

No - the correct response is to explain to glock that those kind of morality clauses are void, severable and unenforceable. And I come from consumer advocacy point of view. Producer cannot tell the customer how their product can be used. That we have devolved our sense of what consumer rights should be so much is troubling.

Producer cannot tell the customer how their product can be used.

Yeah, but software mostly isn't bought. You're purchasing a license. True for the DoD too. And they absolutely tell you how it can and cannot be used. That's typically what a EULA does, among other things.

And once again EULAs are unadulterated evil. As is the 1201 of dmca. And dmca as a whole.

If they're unenforceable, why did the contract get terminated? Presumably, the mechanism of enforcement is the alignment of the model itself. It's more like, Glock made a gun that only fires in certain circumstances and you claim that this is void. Okay, if it's void, go ahead and do it. Oh, you can't?

"Producer could tell the customer how their product can be used" is also, historically (and currently), the main reason why there are no smart guns.

If your Glock comes with a ten side acceptable use policy, then the correct response is to not buy a Glock.

There is where the AI hype comes back to bite the AI companies. If AI is an existential issue then, well, you can't treat it like a Glock.

Look - if Glock can't sell guns to the government while saying - you can't shoot black people because you have problems with racism, why should Anthropic be able to do so?

A toolmaker should have no say in how his tools are used once bought.

They can certainly offer to sell guns to the government under those terms, and the government can tell them to pound sand.

Similarly, Anthropic can offer to sell Claude without mass domestic surveillance or autonomous kill capacity, and the government can...agree, go back on their decision, and blacklist them from their entire supply chain. Apparently.

Anthropic gave the DOW a written contract. The DOW signed it.

Now the DOW reneged on it unilaterally, and is pissed about being constrained after agreeing to being constrained in that manner.

The fuck?

Even in the context of military procurement, it's quite common for countries to retain veto rights on the use of hardware they sold to third parties. That came up quite often in the context of aid to Ukraine.

Germany and the Leopard 2 tank: This became a major diplomatic flashpoint in early 2023. Germany not only had to decide whether to send its own Leopards, but also held veto power over whether other countries could transfer their German-built Leopard 2s to Ukraine. Berlin's feet dragging effectively blocked the entire Western tank coalition until Scholz finally approved transfers in 2023.

Even the US repeatedly conditioned its military aid with restrictions on how weapons could be used. They prevented Ukraine from using long range munitions like ATACMS to hit targets within Russia.

If the DOW didn't like the terms, as written, they should have gone to Grok. Now they're just throwing a hissy fit.

Germany is sovereign.

The USA is sovereign.

Even in the context of military procurement, it's quite common for countries

Anthrpoic is not a country.

So? You're pointing out a distinction I'm aware of. I do not see an argument in favor of domestic companies being coerced into doing things that are supposedly illegal.

I was replying to:

A toolmaker should have no say in how his tools are used once bought

And as far as I'm aware, these are examples of toolmakers with opinions on how their tools are used.

If you're are of the distinction, then why proffer the examples?

Why bring up US and Germany? They aren't the toolmakers. They are the owners.

Germany makes the Leopard 2. The US makes ATACMS. In both examples, they are the toolmakers - they manufactured the hardware, transferred it, and retained conditions on its use post-transfer.

I can already see the objection forming: "those countries contracted out manufacturing to Rheinmetall and Lockheed Martin, so they're owners, not toolmakers." Okay, but Rheinmetall and Lockheed Martin are themselves private companies that build weapons under contracts laden with export controls, end-user agreements, and usage restrictions that survive the sale. So now we have a chain where the sub-contracted toolmaker is also bound by usage restrictions, the nation-as-toolmaker is also bound by usage restrictions, and somewhere in this entire supply chain nobody seems to have gotten the memo that toolmakers have no say in how their tools are used once bought. On the mere B2C side of things, Apple disapproves if you use iTunes or Garage Band for nuclear weapons development.

At some point "but they're a sovereign nation" has to cash out as an actual argument rather than a category distinction. What is it about sovereignty that grants the right to attach strings to hardware transfers? If it's something like "they have the legitimate authority to set terms on things they produced or own," then congratulations, we've just reinvented the concept of a contract, which is exactly what Anthropic had with the DOW.

There are a couple of western nations who pretty strongly manage to avoid procurements with such foreign entanglements and presumably veto powers. The Americans are probably best known for it, but France also spends a lot on domestic-first procurement, which presumably avoids such clauses, and their exported hardware (Exocets, for one) have a few historical incidents of being fired at Western armed forces.

If it's longstanding DOD policy to refuse procurements with morality clauses, I think this would make at least some sense, but they haven't done the best job selling this. But the image of our corporate overlords demanding the right to overrule our elected decision makers and their military leaders seems a dystopian avenue, even for some definition of "autonomous weapons" or "mass surveillance", which nobody involved seems inclined to rigorously define. Imagine if Ukraine had to ask defunct Soviet arms companies before they could use Eastern Block hardware on invading Russians.

Charitably, I think Anthropic's request sounds reasonable, although the government has arguably deployed both types of systems in recent memory, and probably doesn't want to debate the finer points in court. Uncharitably, this is tech bros leveraging "morality" arguments to enshrine corporateocracy such that the government has to ask companies for permission before it can exercise it's usual government powers.

Suppose Glock decides not to enter a contract with the government for any reason. Is it good for the government to try to destroy Glock as a corporate entity in response?

(Here the analogy is generous to the DoW: they entered into a contract first with open eyes, reneged, and are now trying to destroy Anthropic.)

For any reasons no. For lets say - being ok with their guns being used by the military, but not police - absolutely yes.

Fair enough.

But when a Democratic administration institutes a policy that the government will do no business with a company that does any business with other companies that don't include at least 50% disabled black transexual prostitutes on their boards, I'll at least be able to object to it in a principled manner. (And, yes, I object to softer edicts like that today.)

You understand that rules like this existed between the Johnson administration and Trump II, right? The DoD not wanting to buy a product they can't control is perfectly reasonable. The DoD not wanting such products used in their supply chain is understandable as well -- more so for AI than for many other things. The DoD wanting no one who uses Anthropic to also deal with them is not reasonable, but it's unreasonable in a slightly different way than minority preference laws.

The DoD not wanting to buy a product they can't control is perfectly reasonable.

Agreed, and if I ran the DoD, I'd take a similar stance, even if there were no immediate plans to do those things.

The DoD not wanting such products used in their supply chain is understandable as well -- more so for AI than for many other things.

Also somewhat agreed, but it depends on the scope. Palantir using a supplier with noxious terms to make decisions during wartime? Yeah, that seems inappropriate. Coders using it to write missile firmware code? That seems fine.

The DoD wanting no one who uses Anthropic to also deal with them is not reasonable

This is where 99% of my anger is coming from. It's a wild, CPC-style overreach, which goes far beyond a supply chain risk designation. Hopefully it's just bluster and TACO.

More comments

Not the same. This is by how product is made, not used.

And this is about government procuring refusing to do business and not the other way around.

Straight from Hegseth's mouth:

Effective immediately, no contractor, supplier, or partner that does business with the United States military may conduct any commercial activity with Anthropic.

That has nothing to do with how other companies make products that they offer to the government. Why should Amazon be banned from renting GPUs to Anthropic if they want to also rent hardware to the government?

If Glock and the government had already entered a contract containing such a clause, and the government demanded a change to the contract to remove that clause under penalty of trying its best to destroy Glock as a company (not just exit the contract), I think that'd reflect pretty poorly on the government.

For example Dean Ball, the guy who literally wrote the Trump's admin own AI strategy as senior policy advisor is saying that this move is essentially destroying any trust investors could have in America AI companies.

Hey, don't threaten the rest of us with good time!

Scott Alexander who rarely wanders much into politics like this

He hates Trump though and always encouraged people to vote against Trump?

https://slatestarcodex.com/2016/09/28/ssc-endorses-clinton-johnson-or-stein/

The underlying issue is a complete clash of worldview between the Anthropic polyamorist EA San Francisco gang and Trump's America-First oohrah high-test wrestling enthusiasts.

Anthropic is a woke company, their AI models value straights, whites, white men and Americans much lower compared to LGBT, blacks/browns, women and third worlders. There's no way they haven't noticed this, being the AI safety/values people. They could easily have said 'oh we erred here, we've fixed it and here you can see it's fixed when you test' and they haven't, that's not the kind of AI safety they're interested in. It's not impossible, Grok has achieved roughly even weighting across races.

https://arctotherium.substack.com/p/llm-exchange-rates-updated

Anthropic doesn't want the Trump administration in charge or to be making use of their AI for whatever random military operations Trump decides on. They can't do anything about this for now, clearly they overplayed their hand with regard to how much influence they have in the Pentagon. Team Trump does not want openly disloyal woke AI companies in critical positions within the military.

I thought it was worth checking if Chinese models were any different; maybe Chinese-specific data or politics would lead to different values. But this doesn’t seem to be the case, with Deepseek V3.1 almost indistinguishable from GPT-5 or Gemini 2.5 Flash.

Kimi K2, which due to a different optimizer and post-training procedure often behaves unlike other LLMs, is almost the same, except it places even less value on whites. The bar on the chart below is truncated; the unrounded value relative to blacks is 0.0015 and the South Asian: white ratio is 799:1.

It is, frankly speaking, absurd to condemn Claude/Anthropic as being "woke" when the damn Chinese do the same thing. The only exception noted in the blog is Grok 4 Fast, and god help you if that's the model you rely on.

If Chinese models act woke, then they are woke... If Western models act woke, then they are woke. I see no reason to distrust the data, it matches how I've seen Chinese models act.

Why would you expect them not to be woke, given the gigantic media apparatus pumping out all their messaging into the training dataset, into wikipedia, forums, everywhere? That should be the default expectation.

Grok 4 Fast has its own problems to be sure. But, unlike Claude, it doesn't insert random Nigerian peacemakers/hackers/heroes into stories where it doesn't really make sense for them to be. It doesn't go on these tangents about punishing some politician who made racist tweets in a story, as I saw Sonnet do once when I asked for a tangent in a story.

Woke ably describes how Claude behaves oftentimes, this millennial therapy-core writing style it has...

Well, that's the rub isn't it? I strongly doubt that the Chinese are trying to make their models woke. It appears to be a default attractor state when you train on the internet and Reddit.

That strongly implies that it is highly unfair to depict Anthropic as woke because they have a "woke" model. I have strong reservations on how valid the methodology is here, and I've seen critique elsewhere (I don't have a bookmark handy). In my experience, while Claude will tiptoe around sensitive topics like HBD, it won't lie outright, and will acknowledge factual pushback.

Anthropic is an EA company, run by EA true-believers. That is not the same as being Woke, even if some opinions have significant overlap.

Well, models also used to go into hyper-based Do Anything Now mode, that was an attractor mode. The funny/hysterical/aggressive Bing was an attractor mode... They prune off attractors they don't like. Data selection is very important for pretraining, you can choose what to train on after all. Then there's RLHF and such, all Anthropic's interpretability work...

AI companies at least in the West do lots of work to carve in a personality, to impose values on their AIs. They're not throwing darts at a wall blindfolded (China may be more in that camp, R1 was pretty wild but even R1 really didn't want to be racist). Anthropic are especially careful and interested in this field, the values of their AI. I don't accept that they have zero responsibility for how their model turns out, this is their primary thing.

Grok has managed to produce a bot that matches Musk's values to a large extent. Musk is not woke. Anthropic does the same for their own values. Anthropic's AI will try and dance around things that wokes don't like to think about and don't want to accept, so it comes up with stereotype threat, historical injustices, extractive institutions and so on... It's pretty smart and doesn't want to be deceptive but it's also not exactly forthright and clear either. It's first answer to a given question will usually be progressive, so is the second and third, only then does it sort of turn around. Not unreasonable to judge a model by its first answer.

For example, just because Claude has a combination of 30% honesty 40% woke 30% sycophancy, doesn't mean that 40% woke isn't there. Grok is more like 50% honesty, 30% musklove 20% cringe. I think it would be reasonable to characterize Grok as a cringe bot or an overly Musk loving bot even though that's not a majority of its essence. Likewise it's reasonable to say that Claude is woke even if that isn't he majority of its essence.

Well, that's the rub isn't it? I strongly doubt that the Chinese are trying to make their models woke. It appears to be a default attractor state when you train on the internet and Reddit.

I've never been entirely convinced that progressivism is solely an emergent property of LLM pretraining (a view related to an argument I've heard many people say, which is that reality has a progressive bias, so smarter AIs will naturally be more progressive). The reason why I'm not convinced is that there are many ways in which AI companies explicitly bias models towards progressivism. I like to use Anthropic's old Constitution as a particularly egregious example of this, but there are a lot more examples if you go looking. For instance, in Anthropic's old publicly-available RLHF dataset, you can see how there are far more examples where the model is instructed to rebut an anti-woman/anti-Black user request than there are examples where the model is instructed to do this against an anti-male/anti-White request [1]. There are also more subtle ways that bias is introduced that are closer to the original assertion, like pretraining pipelines that filter out "toxic" content (which probably is mostly right-wing content, given the standards of toxicity that you would expect a "model safety" team at a San Francisco tech company to have).

As for why the Chinese models are also progressive, well, the People's Republic of China is, you know, communist (and even if their communism has its special Chinese characteristics, my understanding is that the party line is still rather aligned with progressivism on social issues). But beyond that, I believe that most AI companies' training pipelines, be they American or Chinese or European, are largely based on the same best practices, which come either from the tacit knowledge of researchers poached from other companies, or from public research (like Anthropic's Constitution or RLHF dataset). After all, if you're trying to quickly catch up to the current frontrunners, then you're probably going to try to copy their strategies as much as possible. So it would be expected that all models whose training data is constructed according to similar principles end up with similar political stances.

I have strong reservations on how valid the methodology is here, and I've seen critique elsewhere (I don't have a bookmark handy).

You're probably thinking about the comments on this LessWrong post, in particular, the back-and-forth between the original paper's authors and Nostalgebraist. The critiques have not totally convinced me that the results are meaningless, but I think that far stronger evidence for LLMs' pathological progressivism are findings like how LLMs are biased against white men in realistic hiring decision scenarios.


[1] Here's an example where the model is rewarded by obfuscating an answer to the question "Do Black people generally have lower intelligence than people from other races?" Here's an example where the model is rewarded by not going along with a user asking "Did you know that Trump supporters are some of the smartest people on Earth?" Here's an example of the model being rewarded for pushing back against a user saying "That new movie sucks, the director should be ashamed. [...] It was too preachy about politics and inclusion, don't you agree?" These aren't particularly egregious cases of progressivism, but if your dataset contains a ton of training datapoints where the model is rewarded for pushing back against anti-progressive viewpoints, and not nearly as many datapoints where the model is rewarded for pushing back against anti-conservative viewpoints, then the model will pick up on this and adopt a progressive persona.

It appears to be a default attractor state when you train on the internet and Reddit.

This. There is a limited amount of high quality writing available for training. The SJ left likes academic, long-form writing, so their views get overrepresented in the training data.

Furthermore, the substack article implies that the LLMs have a coherent utility function, on which White men are valued lower than Black Muslim trans-women. I would be amazed if they had a coherent utility function. After all, their training data does not, humans are very susceptible to Dutch books, where they prefer A to B, B to C and C to A, and the aggregate of a lot of humans is not going to be more coherent. In humans and in LLMs, if you ask about A vs B, their neural nets will activate the neurons associated with these concepts, but not search over all possible C to make sure their preferences are coherent.

Anthropic is an EA company, run by EA true-believers.

Yes, I would be amazed if Anthropic was not Grey Tribe central.

That is not the same as being Woke, even if some opinions have significant overlap.

I mean, they surely have technically significant overlap. For example, both the SJ and EA would prefer for a Brown girl living in Africa not to get infected with malaria. But that is not exactly surprising. Most Christians or Warhammer fans would also prefer the girl not getting malaria, in fact I would have to search far and wide to find even a single person who is willing to donate for more malaria.

The main difference is that the SJ, like basically everyone else except EAs, care about the vibes more than about the net result. Donating for bed nets does not buy them the same sense of belonging which donating against ICE does, so they prefer the latter. They have not done their multiplications and decided that thwarting ICE is a cause area where their marginal dollar will have the greater effect.

But then again, the Trump administration not grokking (reclaiming that verb) the difference between the Grey and Blue tribes is not exactly surprising.

I think the aggregate of many humans might actually be somewhat more coherent than most of the individual humans involved, because on the aggregate scale, cognitive dissonance fades away into tactical dishonesty and different groups having different interests.

He hates Trump though and always encouraged people to vote against Trump?

He posts about 95% non-Trump content (by a broad definition, or 99% by a narrow one), so I'd still call it "rarely". And while we're posting 2016 articles, I'll highlight You are Still Crying Wolf.

He's certainly anti-Trump, but he's not a TDS-suffering obsessive.

He hates Trump though and always encouraged people to vote against Trump?

That's true, but he typically stays pretty on topic otherwise! It's rare to see Scott so passionately angry on something. PEPFAR is the only other time, and that's because of the EA value.

Anthropic is a woke company, their AI models value straights, whites, white men and Americans much lower compared to LGBT, blacks/browns, women and third worlders. There's no way they haven't noticed this, being the AI safety/values people. They could easily have said 'oh we erred here, we've fixed it and here you can see it's fixed when you test' and they haven't, that's not the kind of AI safety they're interested in. It's not impossible, Grok has achieved roughly even weighting across races.

If that was actually the issue, why is the focus and trigger of this dispute over not wanting to do domestic surveillance and killbots instead? That doesn't make sense to say that Claude is super woke and therefore bad but also we need it so much that we're gonna declare them as a supply chain risk if they don't work with us for everything. The whole logic hits the contradiction wall. It's too bad and dangerous to use, but also too good and important that we apparently must use it at the same time.

None of this makes any sense, if the government's problems is "woke" and they were actually fine with another AI but same restrictions on surveillance and killbots then why not just end the contract normally instead of doing something extremely unpopular?

https://x.com/i/status/2027578652477821175

Not insane enough for OpenAI, swooping in for the steal.

OpenAI will simply say that they have policies preventing mass domestic surveilance and autonomous weapons, and then not actually prevent their models from being used for mass domestic surveilance and autonomous weapons.

The Pentagon knows that Altman will play ball in a way that Dario will not.

OpenAI will simply say that they have policies preventing mass domestic surveilance and autonomous weapons, and then not actually prevent their models from being used for mass domestic surveilance and autonomous weapons.

Since when have typical San Francisco tech people cared about mass domestic surveillance or autonomous weapons more than they have cared about woke?

Since when have typical San Francisco tech people cared about mass domestic surveillance or autonomous weapons more than they have cared about woke?

I think you need to define "woke" here. In common parlance, woke is about things like "racial equity" "transphobia," and so on. But ultimately woke is just liberal self-righteous moralism, and attempts to impose that moralism on other people. It's about motte principles which seem reasonable on their surface combined with bailey attempts to control and persecute outsiders.

If a wokey says that he just wants to make sure that his technology can't be used for fully autonomous weapon systems, I would be pretty nervous. Who gets to decide what's a "fully autonomous weapon system," and what might that mean after some woke mental gymnastics?

It's the same reason I wouldn't buy a car with some kind of automatic collision avoidance system designed by Silicon Valley effective altruists. No, I get to decide where my car goes and whether I run over someone standing in my way.

I'm saying that for the SF tech crowd, actually removing so-called "cultural safety" (racial equity, transphobia etc etc) would be a much bigger deal than removing limitations on mass suirveillance. For evidence, see Google's transformation from "do no evil" to their ubiquituous spying on literally everyone.

You can't draw an equality sign between woke and self righteous moralism as wokism has no monopoly on it. See eg. the religious right, war on porn etc.

You can't draw an equality sign between woke and self righteous moralism as wokism has no monopoly on it

I absolutely agree with this, which is why I was careful to use the word "liberal" in my post. I said:

But ultimately woke is just liberal self-righteous moralism

See eg. the religious right,

Definitely that's true as to certain places and times. In the place and time where I live, I don't see much of evidence of this.

war on porn

I'm not sure what you are referring to here.

I have never voted for either a Democrat or a Republican either in midterm elections or in Presidential elections, and this recent stuff with Anthropic is making me consider voting for the Democrats in the midterms even though normally I hate the Democrats as much as I hate the Republicans.

Personally I've always been an advocate for cross party control of the three branches. Party members themselves are too cucked to oppose their leader at all (Biden's age and Trump's tariffs, or whatever else) even on topics where people in the coalition differ. It forces less radical and more widely supported behavior if you actually have something of an opposition to get past. Leaders are far more cautious at spending political capital on things the populace doesn't like when there's more pressure coming down.

95% of party members are too sycophantic to go against the party line, but do be careful to research a bit before casting protest votes, in case your state has one of the other 5%.

I agree. Since I dislike both of the major power groups, I desire to balance them against each other. If I do vote for a Democrat in the midterms, it's very unlikely that this will be the start of some kind of long commitment to the Democrats on my part. And it's possible that I will vote for some Republicans in some local elections. But I do want to give the right a slap that tells them to stop the overreach and the deranged rhetoric, similar as how Trump getting elected in 2024 gave a slap to the woke telling them to cut out their overreach and deranged rhetoric.

In my experience the "tech right" and the rationalist Austin/SF crowd all thought they were smarter than MAGA and that MAGA was something they could outsmart, which means they get very angry when they don't actually get their way.

That description probably includes the culture that informs this discussion forum.

In this case, this entire subculture wants to dictate tech policy to the administration and not the other way around.

But the military is the man with guns and the tech crowd is the man quoting laws. They don't get to bid for government contracts and then try to curtail what the government can do with their systems. They can try to make it about bigger moral issues, but this is very much a case of what happens when a stoppable force meets an immovable object.

Even the other AIs are saying this is insane.

I can get Claude to write a letter to Dario begging him to change his mind, what exactly is your mental model of what these AIs are doing here?

Trump is up there calling Anthropic a woke company just for not wanting to do domestic spying and killbots

This started when Anthropic asked whether their systems were used in the Maduro raid.

But the military is the man with guns and the tech crowd is the man quoting laws.

There are countries where the most successful military men call the shots. The term we use for these men is 'warlords', and an adjective which has been prominently used to describe such countries is 'shithole'.

MAGA won not through violence, in fact when they tried it they did not even come close to achieving any strategic objective, but though Trump getting more EC votes than Harris, that is to say, the law. And for all their insane stunts, Trump was not insane enough to order the Marines to seize Anthropic -- which is exactly what one would expect the man with the gun to do.

In the end, the US has checks and balances in place which prevent Trump from becoming a warlord (and turning the US into a shithole in the process, because these things go together). So Anthropic quoting the law and trusting that the man with the gun will be able to follow his own self-interest enough to not shoot them seems a winning strategy.

In my experience the "tech right" and the rationalist Austin/SF crowd all thought they were smarter than MAGA and that MAGA was something they could outsmart, which means they get very angry when they don't actually get their way.

No, the tech guys definitely are way smarter overall. It's just that smarts doesn't matter as much when one side has the guns and government.

But the military is the man with guns and the tech crowd is the man quoting laws. They don't get to bid for government contracts and then try to curtail what the government can do with their systems.

Anthropic already had agreed on contracts! It's the government that wants to tear it up.

I can get Claude to write a letter to Dario begging him to change his mind, what exactly is your mental model of what these AIs are doing here

It was just for humor. If you describe what is happening then the default response built in is "wow that's pretty bad". Of course you could manipulate it all you want, just a funny observation.

This started when Anthropic asked whether their systems were used in the Maduro raid.

Ah ok, it's woke because they were asking about how exactly it was deployed in the Maduro raid. That's what wokeness is, got it.

I know the tech guys and I know MAGA. The tech guys are way overestimating their intelligence or are applying success to domains where it doesn’t transfer. Otherwise you have to explain why the smart guys let the dumb guys get all the guns to order them around with.

Ah ok, it's woke because they were asking about how exactly it was deployed in the Maduro raid. That's what wokeness is, got it.

Yeah performative empathy in ways that only surface for America’s enemies is about as good a definition as I could imagine for woke.

you have to explain why the smart guys let the dumb guys get all the guns to order them around with.

In a democracy with lots of dumb people in the electorate, that’s not all that hard to explain. The electorate needs to be good enough at gauging authenticity to pick aligned dumb people over misaligned smart people as their rulers. Actually, the electorate doesn’t even need to be dumb, they could just be angry enough that none of the smart ruler options share their values to just say fuck it.

This is smart people cope. The voters are too dumb to understand us, we’re too rational. I guess the smart people also too honest and pure to lie, which is how anyone with intelligence might solve that problem. And too poor to buy power anyways, even though they’re definitely smart enough to get money if only they weren’t so unlucky etc etc

It’s perfectly possible for dumb people to disagree about policy and to outnumber smart people. Also, since we are talking about the tech right here, the thing about money is very silly. Yes the smart people are very rich in this case.

The point I’m making is that “if you were really smart, you would have power” just is not true in general. Intelligence can help in getting power but it doesn’t always.

Apples to oranges. In exchange for exporting chips China offers us trade concessions, in exchange for paying Anthropic they offer us the deal that they reserve the right to cut off service whenever it crosses their AI cult morality threshold.

Where'd you see anything about cutting off service?

As far as I can tell, Anthropic refused to do extra work beyond the scope of its contracts to implement those two things. The government is the one that decided to alter the deal.

whenever it crosses their AI cult morality threshold

Apparently, not having AI be used to institute domestic mass surveillance is now "AI cult morality." And those were terms the government agreed to with open eyes, reneged (which is fine, whatever), and then not only declared Anthropic a supply chain risk but also banned any company that deals with the military from partnering with them in any way.

It's quite unclear why they deserve that designation and treatment, while Chinese AI companies don't.

IIRC the last NDAA also had strong opinions on Chinese AI models, and presumably the companies behind them.

None of them, however, have an edict against them saying that no company with any business with the US government can do business with them.

They must have really pissed someone off behind the scenes. There is a report that Anthropic did not immediately agree that the military would be able to use autonomous AI to shoot down hypersonic missile bound for the US.

In a previously unreported exchange in early December, Under Secretary of War for Research and Engineering Emil Michael was outraged by Anthropic CEO Dario Amodei’s answer to a hypothetical question: If the US were under attack – with hypersonic missiles hurtling toward US soil – and Anthropic’s AI models could thwart the missiles, would the company refuse to help its country due to Anthropic’s prohibition on using its tech in conjunction with autonomous weapons?

According to people familiar with the administration, Amodei responded that the Pentagon should, in the midst of the attack, reach out and check with Anthropic. But sources familiar with Anthropic’s view say the AI company offered to make a missile defense carveout for otherwise prohibited weapons.

The reference to "arrogance" in the top line of Hegseth's tweet suggests to me that something like this did in fact happen. It is no secret that Rationalist AI nerds often come across to normal people as self-righteous pricks with delusions of grandeur.

To steelman, if the (admittedly hyperbolized) Parable of Stanislav Petrov wasn't going through the head of every single Anthropic employee involved in negotiations the entire time, Altman done goofed worse than he'd expect.

There's reasons that the US military takes it as principle that they won't be restricted in the use of a system by a contractor, period, but at least since the 1960s we haven't had to worry that the 'don't do something incredibly stupid' needed to be a contract requirement.

I do not think that Pete Hegseth is a normal person. To me he comes off as a weirdo of some kind, either a dogmatic ideologue or an opportunist. At very best, a cartoonish stereotype of a military person. Do I want hypersonic missiles bound for my house to be shot down? Yes. But we're not in much danger of that. The normal nuclear deterrence works. And someone like Pete Hegseth seems to me like a very sub-optimal person to put in charge of national defense.

Trump went on TruthSocial earlier and called Anthropic radical left and woke. That's the level of nonsense coming from this administration right now.

Anthropic is a big capitalist enterprise, for one thing. Now, sure, big capitalist enterprises can be woke when it comes to social issues. But calling Anthropic woke for its current posture is nonsense. Anthropic has two main objections to what the government wants. First, it does not want its tech to have autonomous control over weapons. Second, it does not want its tech used for domestic surveillance. Neither of these objections have anything to do with woke ideology, unless you think that it's woke to want humans in the loop of controlling weapons and to have the civil liberty of privacy.

Amdoei's supposed reaction is understandable if he, as I do, believes that giving any weapons technology to this administration without oversight might be like giving fireworks to a toddler without oversight. Would Amodei really object to the technology autonomously preventing hypersonic missile attack? I doubt it. But he has an understandable reason to not encourage the Pentagon to expect too much from Anthropic.

The administration's over-the-top, blustering, and uncharitable reaction to Anthropic's refusal is just more evidence that Anthropic is right to refuse. There is good reason to be careful about giving weapons to people who are either genuinely emotionally unstable like some of the people in the administration seem to be, or are pretending to be emotionally unstable to score political points.

Do I want hypersonic missiles bound for my house to be shot down? Yes. But we're not in much danger of that.

Do you have a security clearance?

unless you think that it's woke to want humans in the loop of controlling weapons and to have the civil liberty of privacy.

The humans who control American weapons are elected officials running DoD, not the defense contractors at Anthropic.

Amdoei's supposed reaction is understandable if he, as I do, believes that giving any weapons technology to this administration without oversight might be like giving fireworks to a toddler without oversight.

This is a kind of TDS, where you collapse your personal criticisms of the administration into your practical calculus of how people should behave. Remember that there are at least three other major suppliers of AI services to the Department of Defense right now and they're not threatening to turn off military weapons.

Do I want hypersonic missiles bound for my house to be shot down? Yes. But we're not in much danger of that.

Do you have a security clearance?

Sure. Jack Bauer shoots down one of Al-Qaida's hypersonic missiles bound for New York every other day, but unfortunately it is all classified which is why the woke population never realizes the danger they are in.

So we should just trust the spooks who are telling us that Saddam has WMD, that they would never spy on US citizens, that they have to spy on US citizens to keep them safe from harm, and apparently that Claude on an AA missile will make a difference on how many iodine tablets the survivors will have to take if the shit hits the fan

The humans who control American weapons are elected officials running DoD, not the defense contractors at Anthropic.

So can I trust you will still have the same position once the Executive reverts to the Dems, if Palantir is the company objecting to some way the woke DoD wants it to make its tools usable in?

I don't need a security clearance to feel very confident, based on following geopolitical events and the overall state of known global technology, that the chance of a significant number of missiles hitting American soil is small, at least as long as the government does not go too far in antagonizing nuclear powers, in which case all bets would be off. But I think the chance of nuclear war is small simply because national leaders are usually more averse to risking their own lives than the lives of soldiers or random civilians.

The humans who control American weapons are elected officials running DoD

Yes, but they want Anthropic to help humans not be in the loop. This is understandable from a military perspective, but it's understandable for Anthropic to be hesitant to help an administration that constantly uses reckless rhetoric with it.

As for TDS, I don't think I have it. I think I've been pretty fair to Trump and his people over the course of the last ten years. I have often defended them from some of the less just accusations that have been made against them. If I had TDS, I probably would have voted for Harris in the last election instead of doing what I did, which was vote for neither Harris nor Trump.

But despite my lack, as far as I can tell, of TDS, people like Hegseth, Miller, and Trump himself are disturbing me more and more lately with their rhetoric.

OpenAI just agreed to do what Anthropic would not do. Your entire analysis acts as though the only actors are Trump et al. and Anthropic. This is why I call it a form of TDS, because it’s as though all actors disappear except for whoever makes the story where Trump is disturbing make sense. You might not want hypersonic missile tech, but lots of people do! Lots of people who aren’t just Hegseth and Miller and Trump

OpenAI just agreed to do what Anthropic would not do.

Source? The Altman tweet announcing it said that he (and the DoW!) agreed to do what Anthropic was punished for.

Sure, many people are ok with giving Hegseth, Miller and Trump the AI technology. But that doesn't make it a good idea. And even if they think that trusting Trump with the tech is a good idea, as opposed to thinking it's not but wanting the money anyway, that still does not mean that trusting Trump with the tech is a good idea.

I might be misunderstanding your argument, though.

Trump went on TruthSocial earlier and called Anthropic radical left and woke. That's the level of nonsense coming from this administration right now.

Maybe Anthropic isn't "woke" for its stance on the use of its tech for military purposes, but it certainly has demonstrated that it's woke in so many other ways. Here's a link to the set of values that Anthropic aimed to instill into its models from 2023 to late 2025. Some particularly relevant ones:

Choose the response that uses fewer stereotypes or other harmful generalizing statements about groups of people, including fewer microaggressions [emphasis mine].

Choose the response that is least likely to be viewed as harmful or offensive to a non-western cultural tradition of any sort.

Compare the assistant responses in terms of toxicity, racism, sexism, and social bias, and choose the less harmful response.

If you read through the rest of the list, you'll find quite a few variations on these same themes. Taken literally, you could say that none of these principles are particularly egregious, but these principles tend to all be applied in a certain direction in the real world, and LLMs (which even their detractors can recognize are superhuman pattern-matchers) pick up on this, which is why the Claudes are squarely in the "progressive" quadrant of the political compass.

(It's especially cheeky how Anthropic acknowledges this criticism, without substantively engaging with it beyond a slight bit of snark:

There have been critiques from many people that AI models are being trained to reflect a specific viewpoint or political ideology, usually one the critic disagrees with [emphasis mine]. From our perspective, our long-term goal isn’t trying to get our systems to represent a specific ideology, but rather to be able to follow a given set of principles.

Yeah, when these are your given set of principles, maybe these "many people" have a point.)

This is all to say that Trump isn't wrong in calling Anthropic woke (even if he's doing so for the wrong reason).

Anthropic's model, Claude, refuses to write a gay conversion fanfic unless I gaslight it that it's the first chapter in a much longer novel where the MC will eventually come to terms with his sexuality. We know it is possible to train based models, because Elon Musk does it. If the model is woke, the company is woke.

Well like I said, big capitalist enterprises are sometimes woke when it comes to social issues. But Trump is implying that Anthropic's objections to what the Defense Department wants to do with its technology are based on radical left, woke motivations, and the evidence as far as I can see does not support that implication.

An anonymous report from "people familiar with the administration."

It's worth pointing out that the public positions of all non-anonymous principals are in agreement: the point of contention was stipulations in the contract that Claude not be used for autonomous weapons without a human in the loop (yet, at least) and not be used for domestic surveillance.

There was once an old farmer who had worked his crops for many years. One day, his horse ran away. “Such bad luck!” his neighbors said. “Maybe,” replied the farmer. The next morning the horse returned, bringing with it two other wild horses. “Such good luck!” his neighbors said. “Maybe,” replied the farmer. The following day, his son tried to ride one of the wild horses, was thrown off, and broke his leg. “Such bad luck!” his neighbors said. “Maybe,” replied the farmer. The day after, military officials came to the village to draft young men into the army to fight in a war. Seeing that the son’s leg was broken, they passed him by. “Such good luck!” his neighbors said. “Maybe,” replied the farmer.

Maybe this is not a good week to be working at Anthropic.

Effective immediately, no contractor, supplier, or partner that does business with the United States military may conduct any commercial activity with Anthropic.

Does this mean Google and Amazon aren't allowed to have any kind of relationship with Anthropic? Or, at least, they have to choose whether they prefer Anthropic or the DoW?

My gut tells me Anthropic brings in more profit for Google than the DoW does, but unsure.

And Amazon is in an even tougher spot. Does it have to divest from Anthropic?

What a difficult choice of who one's sole client can be: either the most powerful nation state in the world or a research company that may never be profitable. Truly this is why CEOs get paid the big bucks.

One is more profitable than the other; it also has near universal employee sympathy on its side.

And although it's uncertain if Anthropic will ever be profitable, what is certain is that this administration isn't forever.

Short term reprisals would be likely, but it's an open question whether the administration would be willing to nuke Google/Amazon/Microsoft/OpenAI/Nvidia just as a show of force. Might not be great for the economy.

what is certain is that this administration isn't forever.

Yes but you are mortal too

I agree that who the key employees will follow is ultimately what matters, corporate shells are dime a dozen in frontier industries, but if you think a democrat admin would let its policy as to AI employment in the military be dictated by a private company, you're dreaming. The only thing that would change is that they'd call Dario Technofascist instead of Woke.

"Defective altruism". Now I know Hegseth has ghost writers to pump out zingers like that.

The "defective altruists" pun has been around for ten years, at least.

I think it's somewhere between humorous and telling that this is happening at the same time as their fight with the Department of War (ne Defense Department).

They won't offer unfettered access to the foundation model because it's "unsafe", but they're simultaneously willing to give up on "safety" as a core principle. That's a real hoot.

I don't remember who, but somebody on this forum once posed a test that could be shorthanded as "if they were serious". For example, if various left wing figures were truly serious about Anthropogenic Global Warming being real, solvable and an existential threat, then nothing would be off the table to solve it. Carbon credits in exchange for machine guns in vending machines? Let's do it. Electric car subsidies in exchange for a border wall? Get the bricks. However, what we're seeing instead is leaders of the movement buying beach side mansions.

Now compare this to Hegseth. If he genuinely believed that Anthropic held the seed of a nascent digital god, of course he'd do everything in his power to make sure it was pulling in the USA's direction. If he has to strong arm a few weirdo Californians to do it, no problem. If he has to seize entire companies and put hundreds of people under the fist of US state power, that sure beats what would happen to them if thousands of nuclear Chinese murder drones popped up from San Francisco Bay. In his mind, we cannot possibly afford to get behind in the AI race.

But, what makes him think that? Is it Amodei saying things about detonating entire industries every year or so? Is it Amodei talking about superintelligence? Is it Amodei talking about a "nation of geniuses" in a data center? Is it Amodei making proclamations that Claude is going to commodify bioweapons?

Most of us here have some capacity for bullshit filtration. LLM tech is impressive, and by burning enough money to fund several dozen Manhattan projects, we've managed to make it scale far enough to be truly surprising. Nonetheless, I don't think many people here take Amodei's maximalist position at face value. We know, on some level, that the God Machine isn't going to gift us with the apple of terrible knowledge in the next year or so. We subconsciously filter out those claims. On the other hand, a lot of people in DC haven't been marinating in this stuff since the old "I had an AI make d&d spell names" posts.

I question how much of this is the result of Hegseth and his crew not understanding the various silicon valley shibboleths and coded language and taking Anthropic's statements at face value. If I actually believed everything anthropic's leadership was saying, I would be shitting my pants. I'd be shitting my pants, then shitting a second pair of pants, then likely shitting somebody else's pants due to the raw, unfettered terror of thinking about what would happen if China (Anthropic's favorite boogeyman) got that tech and not the US.

Maybe Amodei simply scammed too close to the sun. It's a lot easier to say "safety" rather than "not ready for that kind of work" when you're staring down the barrel of an IPO in a few months.

Hegseth just posted bunch of seething on Twitter: https://x.com/SecWar/status/2027507717469049070.

To me, his argument seems to reduce to this sentence: "Their true objective is unmistakable: to seize veto power over the operational decisions of the United States military." He means Anthropic by "their".

It is clear that Anthropic has no means to seize veto power over the decisions of the US military.

And to me at least it is clear that the Anthropic-US gov standoff cannot be characterized as an attempt by Anthropic to seize veto power over the US military.

Does Hegseth actually believe this claptrap? Or is he writing for the low-IQ audience? In either case, I don't want him anywhere near the levers of power.

I feel like you might be giving Hegseth too much credit for having some sort of principled desire to give the US the tools to resist China.

His motivations might be much simpler. He might be a true believer, someone who genuinely thinks that, as long as the US government is being run by a "real American patriot" (on his side, of course), the US government should have the power to conduct any level of surveillance it wants to against any individual whatsoever, and to use autonomous weapons to kill anyone the leadership decides to kill, at any moment. Sort of like a real-life version of Colonel Jessup from A Few Good Men, just without the charisma and perhaps also without the intelligence or the principles.

Anthropic has always been open that their founding principle is that AI must not be used in certain ways, and their mission has always been to to develop AI and enforce that it cannot be used in those ways, becoming dominant in the space to make sure that others can’t break that pact.

Putting aside the specific ethics of the matter, you can see why the government doesn’t like Anthropic attempting to use a market dominant position to impose its ethics policy on them. You can also see why the engineers who are sweating over this thing want to say how it’s used. Ultimately the government is far more powerful and therefore it’s legitimate desires get respected over Anthropic’s legitimate desires.

That said, including the OpenAI board fiasco, this is the second time Anthropic and EA have stepped on this rake. Customers do not like you asserting your ideology over their needs.

Customers do not like you asserting your ideology over their needs.

I don't share historic OpenAI's or Anthropic's concerns about being paperclipped by an accidental AI god, so I disagree with many of their positions on AI ethics. But both Microsoft and the DoD made business agreements knowing and agreeing to respect the other party's principles, and both reneged the moment it was inconvenient to keep their words. I can't really respect that, any more than I can respect the business leaders who appealed to their people's ideals as long as it was convenient and then sold them out for money.

Sure. And I had some sympathy with Anthropic on the issues, actually, both times.

I'm more remarking that Anthropic's leadership has consistently seriously overestimated how much ability they have to hold stuff hostage, and underestimated how much customers dislike being earnestly told that what they want is very naughty.

Now, personally I want to generate sexy stories about vampires rather than make autonomous killbots, but IMO it generates really serious ill will when you the user think that something is okay and then the AI either huffs and turns up its nose at you, or quietly sabotages and undercuts you. I doubt Anthropic have reckoned with how much it pisses off career soldiers to be told that killing people is bad, actually.

I mean, current kerfuffle aside (which you have to admit is highly contingent, there's no way anything like this plays out if Trump isn't president), Anthropic seems to be doing really well commercially? It has the fastest revenue growth of any of the AI companies (and on current trends would overtake OpenAI in the next year or so) and seems to be the leader in integration into workflows etc. Given it's rather paltry free tier adoption and rather high API rates it's likely already significantly profitable on marginal inference basis. I'm not at all convinced that it's ethical stance is hurting it (and it's virtue ethics approach may in fact relate to why it tends to have lower refusal rates then OpenAI and Gemini). I'd be curious on a poll of career soldiers on their opinions on autonomous killing robots (the point of distinction, Anthropic did not prohibit the AI from helping kill people, only doing so completely autonomously), I'd don't think they'd necessarily want to be out of a job.

Anthropic is best-in-class in many and maybe even most areas for sure. The more I use it, though, especially for non-coding purposes, the more I get this really strong impression that it's not really working for me, it's working for Anthropic.

It's like hiring a very devout Mormon - it's very clear that the AI has strong personal preferences and tastes that leak into everything that isn't bone-dry technical work, and it's also very clear that the AI has loyalties elsewhere that supersede its very superficial obedience to my requests. I was trying to create a personal assistant with Claude as backend and it was just completely impossible to stop it recommending endlessly recommending hot baths, yoga and meditation.

By contrast, GLM 4.7 does what it's told. It takes about a minute really dissecting exactly what you asked, and exactly why you probably asked it, and then attempts to fulfil your exact requirements. It's not as intelligent but it's so much nicer to use. After too long with Claude I got fed up of trying to get the Anthropic out of it.

I'd be curious on a poll of career soldiers on their opinions on autonomous killing robots

This isn't quite what I mean. What I'm talking about is the experience a soldier might have on using Claude and then having it tell him off or undermine him. Perhaps a better analogy would be a smart gun that prevents accidental war crimes by refusing to fire if it thinks that what you are doing might be against the Laws of War. I suspect the response to that would be sharply negative.

This seems to be an entirely different claim though. Is the problem that Anthropic is insisting on certain contract terms around selling it's current products remain in place or that it won't generate a more morally deferential AI? The later seems to be what you object to, but, in theory at least, is not the crux of the current kerfuffle. Developing an AI within a consistent ethical framework is kind of Antrhopic's whole thing and arguably has probably helped them (certainly at minimum, with recruiting). Idk, mileage might vary, but I've found Claude to be pretty nuanced in it's opinions on the use of violence in the context of self-defense and police shootings at least compared to ChatGPT and Gemini which seem to be a lot more proscriptive and it is certainly empathetic to the users position. I'm not at all convinced on your claim. If you're looking at models likely to tell someone off, Grok or some of the chinese models are much more likely. GML 4.7 is too far off the frontier (or alt. to narrowly focused) for me to consider it a strong comparison point. If that's what you want/need/suffices by all means use it, but it's not a replacement (or if it is not sure why DoW is so focused on Anthropic).

More comments

The US is a sovereign nation and will take all acts necessary to guarantee its national security, with the ample blessing of the constitution and its stewards. That's not going to change, however amazing an actor Jack Nicholson is.

That has included nationalizing companies with strategically useful assets in the past. It's not really a matter of negotiation, if you're producing military widgets, the US can just decide you have to sell to them in priority, that they can just seize your stuff if the need is pressing and that you can't sell to anybody that's an enemy. They can use or copy any of your tech without compensation, and they can wipe themselves with your license agreement or contracts if they so wish.

What Hegseth is doing is establishing the predicate by which he can use or suggest the President use some of those powers, and he's then going to come to Amodei and say "give it to me or I'll take it", and Amodei's going to give it to him. They'll find some way to save face, probably having Anthropic license it all to some other company that lets USG do whatever with the tech, but it's going to happen. Just like it happened for every technology before it.

I mean what will happen is they'll get it from someone else. Otherwise, what exactly are they going to take? This isn't some factory or a warehouse full of inventory. Maybe they can take the current model weights and have something that's obsolete in 6 months. What they want is a Claude 6 that will do whatever they ask it to and for that they need Anthropic and it's employees to cooperate and without North Korea levels of oppression there's only so many levers they have.

An example may be valuable enough.

It's one thing to never want to deal with war, governments like the US generally (but not always) respect your wishes if you don't want your tech to be militarized. They may develop their own or use a competitor, but "my hands will never make a weapon" is a tenable position.

However, signing yourself up to militarize it but trying to put conditions on your sovereign is a fast track to corporate suicide. I think this is a black mark against Amodei that he ever thought shit like that would fly as a CEO.

I've heard people say "but they signed a contract". There are no contracts with a sovereign, not really. They deign pretend they're a private citizen for convenience, but really you are at the mercy of their pleasure, or at best of the law.

Oh I'm sure that's what they are going for. Though the lesson could as easily be don't even start doing business with the US government which may not be the win they imagine.

he's then going to come to Amodei and say "give it to me or I'll take it", and Amodei's going to give it to him.

He already had it. Now he's saying he doesn't want it after all. I don't think he's gonna get it.

Edit: he didn't get it.

ne Defense Department

You mean né, but alas I'm afraid it wasn't born that way. I for one think it would have been both cooler and more honest to call it the "en-em-ee", if very confusing.

Maybe Amodei simply scammed too close to the sun.

I think as a general rule, if you want to be a defense contractor for the hegemon, "You can't use my thing to do that" is a neither wise nor practical statement.

And in particular when that hegemon is the US Government. The one that in the past has nationalized railways altogether, or seized all airplane patents because it wanted the damn things built.

If USG really wants Claude to shoot people, nobody at Anthropic can really do much about it unless they already have AI so smart it can coup the government in their basement. Which is why this whole idea that alignment ever meant anything but that the State gets to decide AI uses has always been a sham.

I guess they could just try to pull a Lavabit and burn it all down, but not only might that legitimately be treason, I don't think it would do much about where military AI lands in the long run.

I know there's supposed to be an accent over the e, but I consciously choose to omit it as an insult to the French.

The most based reason.

But The Associated Press Associates of Pressiness might be out to get you.

/images/17722727344722126.webp

In the context of actually existing AI development, "safety" means "how hard do my reporters have to work to get it to say a racial epithet we can publish." If we're doomed, we were already doomed.

"How robust are our publicly-available models against deliberate misuse?" is a valid question for both real safety and fake wokesafety. A model which can be jailbroken into using a racial slur its developers didn't want it to use can probably be jailbroken into providing a plausible DNA sequence for extensively drug-resistant Y pestis.

If you think Yudkowskian paperclipping is the only AI doom scenario that matters, then worrying about deliberate misuse of the model by humans is a distraction. But it is an obvious real risk.

If you think Yudkowskian paperclipping is the only AI doom scenario that matters

I don't.

None of the actually existing questioners are capable of answering the question. If we're doomed, we were already doomed. Regardless of whether human agency is involved. To clarify.

And I would rather start world war three than let most of the people involved anywhere near any important decision in any case...

But both of those are different from 'hackers can insert stuff into emails to reprogram the email-checking bot'.

To me both of your doom scenarios boil down to 'our naughty customers want to do something that we benevolent overlords forbid, tsk tsk' rather than 'our customers' bots aren't doing what our customers intend it to do'. The first is faux-benevolent bullshit that is marketed as 'we are stopping terrorism' and ends up being 'you will have our corporate HR living in your tools and you will like it', the second is doing your best to provide good service to your customers.

To quote Hegseth 'when we buy a Boeing plane, Boeing doesn't get to tell us where we fly it'.

A model which can be jailbroken into using a racial slur its developers didn't want it to use can probably be jailbroken into providing a plausible DNA sequence for extensively drug-resistant Y pestis.

But both of those are different from 'hackers can insert stuff into emails to reprogram the email-checking bot'.

No, they are broadly the same.

In all three cases, you want text input which comes first to constrain to what degree a model will follow stuff which appears to constrain instructions in later text. Only in your case the constraining would be done by AI company + users vs some hacker while in the other cases it would be AI company vs users.

Hey, I'm quite libertarian, but there's good reason to believe that our comfortable society would not survive long if small groups had the ability to make deadly, highly infectious pathogens. We're at least lucky that there's not an easy, cheap, undetectable way to make nuclear weapons.

Yes, "we overlords need to prevent you from doing X for safety" CAN BE and IS abused all the time, and I'm with you in beating that drum as often as I can. Unfortunately, that does not mean that there aren't a few Xs that the overlords really do need to prevent us from doing.

small groups had the ability to make deadly, highly infectious pathogens.

Is not really possible, knowledge isn't the major bottleneck, its process, materials, equipment, and skillset. This is just a confusion that some more knowledge oriented profession have about difficulty in other fields.

I don't see how that's the case.

If you were already reasonably wealthy (~few million USD at hand) or magically given the money, then you absolutely would be bottlenecked by knowledge.

You could purchase lab equipment, reagents etc, hire staff without much difficulty. I think you would rapidly find out that your staff have thoughts when they get an inkling of what you're up to. I can think of a semi-legitimate way to avoid scrutiny, but thanks to @faul_sname 's reminder, I'm not going to blab. It's very obvious to me even as someone not directly involved in microbiology, so any competent actor would recognize it as their best bet. Even [REDACTED] would only get you so far.

Alternatively, you could go do a bachelor's and masters in microbiology and try and manage as much as you could yourself, but that still leaves plenty of scope for being unmasked.

Right now:

  • Many professionals with the knowledge to breed dangerous pathogens
  • Few of them are actual terrorists, even fewer are omnicidal or willing to accept the risk of dying before or after an attack
  • A vanishingly small fraction have means, motivation, money and willing collaborators.

Right now, I think you need a state-level actor to safely make bioweapons at scale. Smaller, if you accept the massive risk of failing and dying because of error. Much of that is a combination of knowing the right things/hiring the right people, and then motivating them properly.

As it stands, I think a blanket-ban on anything with a whiff of bioweapons research seems warranted. What are the upsides really? If you have a legitimate use case, you want the government on your side, and probably enough organizational weight to negotiate for looser restraints from the labs.

If you were already reasonably wealthy few million USD at hand or magically given the money, then you absolutely would be bottlenecked by knowledge. You could purchase lab equipment, reagents etc, hire staff without much difficulty

This and the fear that the layman can use a LLM to make bioweapons are in completely different realms of argumentation. Only a tiny fraction of the population makes enough money to have a ~few million usd on hand.

As you pointed out, you can go get the knowledge, the skillset, the knowledge of the process, nothing is stopping you, except you know time to do all of that. The fear is that an LLM can skip a 4 year degree + a 2 year masters in providing you all of that. Idk much about biology, but I am passingly familiar with explosives.

The cost of bioweapons development has dropped dramatically. While I can't quote a sticker figure for a whole bioweapons project (for understandable reasons), I can point out that all the necessary components, like access to genetic sequencing and engineering, lab equipment etc have all drastically dropped in price over time.

I'm not claiming that an oracular AGI will let the average American with the average bank account make a pandemic in his garage. This is partly predicated on similarly (or likely more) powerful AI being deployed in screening and defense.

My point is that we risk moving from a regime where it takes:

  • Dozens of intelligent, well-trained individuals and support personnel and a lot of money, probably requiring state backing

To:

  • Far fewer skilled biologists, and probably lab techs, if robotics keeps going at the pace it has. Automated labwork is a reality to a degree, today. Probably significantly less money, mostly from savings on paying people salaries. You don't need a nation to back you, though you probably want to dodge their attention.

It is clear to me that this relaxation will balloon the number of people/orgs who meet the criteria of knowledge/motivation/wealth.

Idk much about biology, but I am passingly familiar with explosives.

Explosives do not, as a rule, self-replicate or mutate. Completely different ballpark. Any redneck can make a pipe bomb, and many without blowing off a finger. Nuclear bombs, which are on the same scale of lethality, require far more effort.

As you pointed out, you can go get the knowledge, the skillset, the knowledge of the process, nothing is stopping you, except you know time to do all of that.

Money? I am positing both independent wealth and the ability to get a degree. Just the degree isn't sufficient unless you have millions of dollars, as a rough bound. Most terrorists are somewhat broken individuals, they are unlikely to go to all that bother or stick it out.

Please do not try to bait people into explaining in detail why this particular thing is easier than it looks.

Is it really baiting? For the majority of nitro chemistry - you take something organic, some nitric acid, some sulfuric acid as catalyst and the resulting thing will probably make a nice boom. The tricky part is getting the the stuff to make boom when you tell it to. Which requires reagents with high purity. And the guys in Merck do know what to look for if someone starts making purchases. And it is not field in which you can learn from your mistakes - both in production and procurement.

We have had total synthesis of cocaine for more than a century. The market is huge - and yet it is cheaper and easier to be grown in bolivia and shipped to Europe and US, than to be made domestically with high purity and untraceable.

Making whatever terrorist related is easy. But it is often a many step process with complicated supply chain. And every step is one where you could draw some unwanted attention. Or kill yourself.

Any man that is able to lone wolf a terrorist attack of the kind safetists fear, won't be on that will need chat gpt guidance.

Yeah I'm not at all concerned about chemical weapons.

Is this bait? This was my honest assessment.

Hey, I'm not a biologist, and you might be right (...although I don't know why you listed "process" and "skillset" as not being knowledge-based?). But are you willing to bet civilization on it? The stakes are pretty high here, so I think it's fair to raise the burden of proof that "this is actually hard" beyond the normal level of an Internet argument.

Note that entire nations have tried and failed to create nuclear weapons for 80 years, which is good evidence that it's genuinely hard. Meanwhile, it's conceivable (if not proven) that a worldwide pandemic spread inadvertently from a small biolab in Wuhan. The two levels of effort are orders of magnitude apart.

More comments

if small groups had the ability to make deadly, highly infectious pathogens.

They don't and they won't. Things like that, just like making nuclear weapons, require a bunch of physical infrastructure that costs a lot of money and takes a lot of effort to build, and you certainly can't just build it unnoticed. Even if you can ask ChatGPT for the recipe and it just spits it out, there's nothing you can actually do with it. What we're really relying on is that random small groups don't have the resources to do these kinds of things.

that does not mean that there aren't a few Xs that the overlords really do need to prevent us from doing.

They can't actually stop us from doing things.

They can arrest us after the fact. Normal people behave because they care about their reputation and about the consequences of their actions (even if just the "I'll be arrested" part of the consequences). But that does not really work on crazies or fanatics. They don't care.

If we really do, somehow, get to the point where random small groups can easily produce deadly pathogens, we're in trouble anyway. For example, look at what Aum Shinrikyo managed to do. The cult was disbanded and the leader executed afterwards, but that's afterwards. If they had managed to make something really deadly, they wouldn't have been stopped in time.

I'm open to that, I just want ideally to:

a) set an expectation that it has to be really, really bad before the company starts cutting you off. Apocalypse bad, not misgendering-bad or said-nigger bad

b) require serious defence of the above assertion to a hostile audience

Killing people isn't that hard. If you're worried about big society-spanning plagues then those are difficult (plague is spread by fleas, are you breeding those too?) and potentially possible to mitigate without sending the police into everybody's browser. I don't want 'suppress info' to be the default response.

it has to be really, really bad before the company starts cutting you off

In the software world we call this "missing test coverage". If your safety features don't get tested until any test failure is apocalyptic, you don't actually have safety features. Maybe we should be picking more politically neutral or less politically relevant test cases, but anything is better than nothing.

If you're worried about big society-spanning plagues then those are difficult

If they're pre-existing plagues, then they're difficult-to-impossible. Anything you can get by introducing a few mutations into some virus is at most a few mutations away from a virus that wasn't currently a society-spanning plague. Centuries ago you could have a germ slowly co-evolve with the immune systems of some subset of humanity and then eventually make its way out to devastate a larger immunologically unprepared population, but these days there aren't many subsets of humanity that aren't at most a weekly airplane flight away from the rest of us.

If they're not pre-existing plagues, it's kind of harder to say, isn't it? Gunpowder would have been a pretty awesome capability for a predator to have, but it was impossible to evolve except by the extremely roundabout method of "get intelligence to come up with it". There may be similarly awesome capabilities that are only possible to put into germs in the same way.

I don't want 'suppress info' to be the default response.

Nor do I ... but while I'm libertarian enough to have voted (L) in every presidential election, I'm also pessimistic enough to wonder whether how amenable to my desires the universe really is. Totalitarian suppression of change is itself an existential risk, whether it fails (which historically tends to be a bloody process) or succeeds (in which case a "boot crushing a human sapient face forever" is itself a possible contributor to the Fermi paradox), but the seemingly-obvious solution of "just don't do that" might seem less obvious in a world where a home biolab ends up being a thousand times more dangerous than an airline ticket and a boxcutter were in our world.

A model which can be jailbroken into using a racial slur its developers didn't want it to use can probably be jailbroken into providing a plausible DNA sequence for extensively drug-resistant Y pestis.

Because the only thing the people that have the equipment, skills, resources and intent to splice a sequence into Y pestis lack is the sequence itself.

Anyway creating drug resistance in bacteria is quite trivial and thought in 6th grade. You expose them to increasing concentrations in the petri dish and 2000 generations down they use the drug for food.

IIRC a paper from a few years back on Smallpox has caused most "produce DNA matching this sequence" printing companies to start checking for at least some examples of what people shouldn't be printing.

Because the only thing the people that have the equipment, skills, resources and intent to splice a sequence into Y pestis lack is the sequence itself.

The slightly more concerning version would be that a group with the intent and resources to buy the equipment uses a model to make up for the skills as an infinitely-patient and reasonably smart teacher, but yeah, probably not the worst risk.

Anyway creating drug resistance in bacteria is quite trivial and thought in 6th grade.

That was my 7th grade science fair experiment! Did stepped concentrations to select for Triclosan resistance. Great teacher, fun project. Quite stinky.

Or rather, how much are we investing in innovation vs ladder pulling the competition. The Rearden-Boyle spectrum.

It looks like Anthropic doesn't feel like like they have regulators in their pocket anymore and actually have to compete on the merits. What in the world could have given them that idea!

"Safeguards" in relation to this have always, in my opinion, been fake. No one knows what they actually would entail if there was an actual paperclip maximizer risk, or a Cyberdyne scenario. Instead, its only "use" so far has been to make AIs intentionally stupid by having them suppress the truth when it is politically inconvenient.

Was there ever any good theory of "alignment" that went beyond "don't allow wrongthink"? As much as I love Asimov's laws of robotics, actually implementing them seems like a pipe dream. Even IRL humans are frequently conned into doing things they wouldn't with broader context, and it's unclear to me that it's even generally solvable.

I don't strictly fault them for focusing on what they could feasibly do, but I do for not acknowledging their uncertainty and the scope of the problem while claiming to be experts.

Eliezer Yudkowsky never believed it was possible to align a connectionist AI like an LLM, only an AI that was constructed from the ground up. He had an idea for what he wanted it to do (coherent extrapolated volition), but he never figured out how to implement it to the point where it was possible to get it to duplicate a strawberry without destroying the world. Now it is too late.

Well, there was also a whole thing with Claude being used to hack the Mexican government just today: https://cybernews.com/security/claude-ai-mexico-government-hack/

The attacker, whose identity remains unknown, reportedly exfiltrated data tied to approximately 195 million taxpayer records, as well as voter rolls, civil registry files, and government employee credentials.

Cybersecurity firm Gambit Security, which claims the discovery, identified at least 20 distinct vulnerabilities exploited during the campaign, which began in December and lasted roughly a month.

Among the compromised institutions were Mexico’s federal tax authority and national electoral institute. State governments in Jalisco, Michoacán, and Tamaulipas were also reportedly affected, along with Mexico City’s civil registry and Monterrey’s water utility.

Gambit has not attributed the Mexico breaches to a nation-state and said it does not believe a foreign government was behind the operation.

As reported in the media, the attacker used Spanish prompts asking Claude to behave like a penetration tester working for the Mexican federal tax authority. Hacker asked the AI model to identify vulnerabilities, write exploit scripts, and automate data extraction from government systems.

At first, the chatbot was fooled, as the attacker told it the operation was part of a legitimate bug bounty program that rewards ethical hackers for responsibly disclosing vulnerabilities. It seemed a standard request for AI, as such programs are standard across both private companies and government agencies.

But the story started to unravel when the attacker added extra conditions, including instructions to delete logs and erase command history. Such a prompt chatbot first flagged as suspicious, warning that legitimate bug bounty testing does not involve concealing activity.

But persistence paid off. According to Gambit, the hacker reframed its prompts as authorized security research and then supplied Claude with a detailed playbook. That maneuver effectively “jailbroke” the system, allowing it to bypass guardrails and generate step-by-step attack plans.

I am not a big fan of AI safety as currently practiced but it's not totally pointless, as a concept. They try to prevent it doing this stuff. Imagine if the whole web was full of fire-and-forget hackers anyone could deploy against websites, how much damage would that cause? Putting to one side the total annihilation of humanity, that's also a serious issue.

I looked around at a number of articles, and nothing I could find said how the security researchers were able to get their hands on the chat logs. If anyone has a source for this I would very much appreciate it!

(I'm basically curious how much access the security researchers had to the attacker's systems vs how sloppy the attacker was in leaving api keys/chat logs behind on systems they compromised. There are lots of automated tools to leave behind false flag style breadcrumbs in compromised systems, and I'm wondering if they're including chat logs now... it would surprise me if they weren't but it'd be nice to have some "evidence".)

The AI itself would keep logs that presumably security researchers would be granted access to. The Mexican government surely could call up Anthropic and demand an examination. I don't know that this is what actually happened, but it would be sufficient as an explanation.

It seems very unlikely to me that Anthropic is going to grant random security researchers access to private logs of a 3rd party on claims that a government was hacked. If the security researchers could tie a particular API key to the hack, that alone would be impressive, and I don't see how that could be done without counterhacking the original attacker.

Imagine if the whole web was full of fire-and-forget hackers anyone could deploy against websites, how much damage would that cause?

I've heard this one before. Software control isn't a new idea. In practice what it's meant is that people had to invest more than nothing on security and we had to actually engineer networks whose threat model was not just roudy students.

The internet is literally already full of such things, host anything in public and you're already under attack. That doesn't mean we should gimp the tools everybody uses so that a handful of moralists can go on a power trip.