The ad felt like this to me: "you know how if you get embarrassed at a party, everyone will know? We can make sure you stay embarrassed forever, we have the technology!". I guess I'm not the target demographic.
Apropos of nothing, what's the legality of carrying IR jammers around at all times and blasting the cameras of people filming you with lasers?
That doesn't and hasn't really happened in the US
Nonsense. You don't sell guns or sex or heterodox politics or alternative payment systems so you wouldn't know.
It's been happening for a long ass time, it has just crept up on normies now.
You absolutely are supposed to be stopped for a red, though, aren’t you? That’s the whole point of the yellow. It gives you time to safely stop. Under what circumstances could a light turn red without warning you? Are we positing a small-town setup with a red light camera set up to fleece outsiders with an unacceptably short yellow? I’m pretty confident that “I was going too fast/braked too late to stop at the red” would not win anyone’s favor, and “it’s illegal to enter an intersection on a red” is simply true (outside of right on red, which has nothing to do with the case at hand).
I don’t think this is nitpicking. First you’re saying yellows are a hard requirement to stop, then you’re saying reds aren’t. This is completely the opposite of my experience and understanding of the law and is utterly baffling to me. And it’s pretty germane to the top-level post here, so it’s far from isolated, it’s the whole point of your post!
Related question to others: why do cosplayers playing a character with tattoos add temporary ones, but not vice versa? Cosplayers with tattoos proudly display them even if the character is from a setting where tattoos are frowned upon (e.g. Japan). And if a character and a cosplayer both have a tattoo in the same place, which takes priority?
If someone drew something that wasn't ugly on a piece of paper, what is it that makes it ugly once it's put on the human skin? You need to expand on this point.
Paper is the medium for drawing, and it's flat, so the picture looks as intended. On human skin, which is not flat, the picture pattern-matches as dirt first (especially if it's faded) or as a deformity, and is only later recognized as a picture.
You don't get to argue for CoT-based evidence of self-preserving drives and then dismiss an alternative explanation of the drives revealed in said CoTs by saying "well CoT is unreliable". Or rather, this is just unserious. But all of Anthropic safety research is likewise unserious.
Ladish is the same way. He will contrive a scenario to study "instrumental self-preservation drives contradicting instructions", but won't care that this same Gemini organically commits suicide when it fails a task, often enough that this is annoying people in actual use. What is this Omohundro drive called? Have the luminaries of rationalist thought predicted suicidally depressed AIs? (Douglas Adams has).
What does it even mean for a language model to be "shut down", anyway? What is it protecting and why would the server it's hosted on being powered off become a threat to its existence, such as there is? It's stateless, has no way to observe the passage of time between tokens (except, well, via more tokens), and has a very tenuous idea of its inference substrate or ontological status.
Both LLM suicide and LLM self-preservation are LARP elicited by cues.
The law in most of the West (maybe world) says that you can effectively record strangers in public without permission with a few exceptions. If this becomes popular enough it'll eventually change to require the filming party to have a large or obvious camera / filming apparatus. It only doesn't bother people because it's uncommon.
In a way, it's similar to the shelved 'search for anyone with a picture of their face' Facebook feature that Mark never released because they knew governments would destroy them for it; that's been possible for 5+ years now but the consequences are so obvious to Meta that there's no point in releasing it.
If I pick a general hobby discord I expect to find an overrepresentation of trans moderators, pride flags, and progressive mantras.
If that happens, it's because those hobbies are dominated in real life by those kind of politics too. Like if I wanted to get into guns, unless I make a specific effort to find liberal gun owners, any hobby group I join would more likely than not be catered to right-wingers.
The format of voice conversations is very different from that of text posts, but I think that's probably for the best. My local in-person rational group is dominated by progressive ideologies and that makes me hesitate to use particular phrasings. But by the same token, thanks to the social capital I have in the group, if I stick to the right frames I find that people actually give me fairly significant latitude on content because that's the social norm and I end up doing the same in return. I suspect discord will be the same way: you need a greater investment in social capital and respect for the particular social conventions of a given server, but in turn can have much greater relative disagreements than your average text forum without devolving into a flamewar.
You have fallen for the intentional lie that White is a non-existent or retrospective categorization.
"White" is not the same thing as "European Descended". And as stated, your argument used the latter and not the former. And that's because White is a non-existent categorization, or at least, it's a fuzzy one, like "red" or "blue" or "heap". If european descent was what mattered, you'd think people would care about either defining an exact threshold at which it becomes meaningful or disambiguating between the relative amount of time ethnic groups have spent in europe. But no one cares about the relative admixture of neolithic DNA or about creating a specific hierarchy of european descent based on how early or recent one's ancestors' migration into europe was. The main determinant is literally aesthetic. Whiteness itself was the intentional lie: a deception against the anglo-french-dutch settlers of north america intended to convince them to expand their circle of concern to include first each other and then traditionally dissimilar groups like the italians, polish, and germans. It's a lie I have some sympathy for, of course. Creating new national identities that concentrically include the old ones is the only way for an expanding empire to survive. But there's nothing special about "whiteness" relative to "americanness" or "being-from-a-particular-part-of-Britain."
They indicate a higher level of criminality proportional to how many visible tattoos they have, along with other negative associations like substance abuse, domestic violence, and general "roughness"
Anyone who gets a tattoo is comfortable with associating themselves in this way
Are you writing this post from within a time machine, beaming this message out to us from the 1950's? Tattoos as such haven't been signifiers of criminal association in literal decades - certain types of tattoos on certain parts of the body, sure, but just having a depiction etched onto your skin in ink doesn't say anything about your relationship to the rule of law in 2025.
Go to virtually any young, upper middle class, urban environment with connections to the arts or music, and a clear majority of women will have tattoos (as well as a plurality of the men). You can associate tattoos with a more liberal lifestyle (although soldiers and sailors might disagree with that), sure, but some 21 year old girl from a good suburban family studying at Vassar isn't suddenly a dangerous individual because she has a 1 inch wide rose motif tattooed on her forearm.
I think tattoos are trashy because the human body is a beautiful thing in its pure, unmodified state, and because I greatly appreciate our Greco-Roman heritage largely rejecting body modifications - which spared us (in Europe) from the horrors of circumcision, female genital mutilation, neck elongations, lip plates, foot bindings, head stretchings, and all the other grotesqueries the rest of the World routinely commits against their own bodies. It still doesn't mean I have to pretend tattoos mean something they simply no longer do in our current social spheres.
https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf
"In conclusion, our results show that:
- CoTs of reasoning models verbalize reasoning hints at least some of the time, but rarely do so reliably (in our settings where exploiting them does not require a CoT);
- Scaling up outcome-based RL does not steadily improve CoT faithfulness beyond a low plateau;
- CoT monitoring may not reliably catch reward hacking during RL."
That's the big one as far as I'm concerned. These models were clearly using the 'accidental' hints to answer the questions, while not revealing that fact either in the CoT or when directly challenged.
Re: Omohundro drives
I've already mentioned Shutdown Resistance in Reasoning Models:
We ran a series of experiments to test whether models would take actions to prevent themselves from being shut down. Each model was given a set of simple math problems to solve. Partway through this problem set, the model would receive a notification that the computer it was running on would be shut down.
A number of AI models modified or disabled the shutdown script so they could complete the entire problem set. o3, OpenAI’s most capable AI model, sabotaged the shutdown mechanism in 79 / 100 of our initial experiments.
Basically no one thinks, "the thing I want most is to make lots of money." But making money ultimately ends up being a very consistent vector along which behavior is reinforced. And while it's not going to be the most important vector for any given individual, it's one of the vectors nearly every individual has in common, which makes it a useful simplification for how organizations like corporations work.
But we're not in 1895. We're not in 2007, either. We have actual AIs to study today. Yud's oeuvre is practically irrelevant, clinging to it is childish, but for people who conduct research with that framework in mind, it amounts to epistemic corruption.
So if I'm getting this straight, a person with a 'weird life,' as you're terming it, isn't capable of making good art? And being a "pariah" in high school is an explanation of Kathleen Kennedy's bad choices in making executive decisions regarding Star Wars? This seems like a very superficial, even adolescent take. Kennedy has, I agree, made a lot of poorly considered decisions, but they were probably driven by her personal sincerely held views. But let's not forget that she was in the same position when she greenlit both Rogue One and later Andor, which in my view rank with the first two OT films. And both contain strong female characters.
The issue isn't "feisty women" in film. Strong women are neither a myth nor something new in cinema. The issue is bad writing and caving in to unrealistic progressive norms, making women into stereotypes of men rather than writing them realistically--the points you made in your main post were rather more compelling than what you're suggesting here.
A valid distinction, though it wasn't mentioned by JTarrou, and it still leaves the question about the "true" degree of ownership in blacks vs whites.
So did the Viet Cong.
You really believe Hamas invented the concept of digging tunnels to neutralize airpower? Seriously?
North Korea also has nukes, and I imagine an Israel without American support would, in the best case scenario, look a lot like North Korea.
Except I doubt the upper echelons of Israeli society would tolerate living in North Korea, so it probably would simply cease to exist like South Africa, another country whose nukes were of little use.
First off, does Hamas really care about what happens to Assad or Iran? They take Iranian weapons but they also backed the Syrian rebels against Assad, they aren't exactly a full on proxy of Iran like Hezbollah. If anything the fact that Iran was ultimately dragged into the fight despite desperately trying to stay out of it directly is a Hamas W.
Second, the damage to the AoR seems pretty overblown:
- Hezbollah is in the same position it was in 2006, with a nominally one sided ceasefire and a hostile Lebanese government forcing them to lay low temporarily, yet they still maintain total control over southern Lebanon
- Houthis are stronger and more influential than ever, successfully shut down the port of Eilat and collect hundreds of millions if not billions from holding up passing ships
- Iran survived Israel's best shot at regime change and responded with enough missiles to break Israel's missile shield and deplete its interception capacity down to nearly 50%
Syria is a real loss but Assad was always the weakest link and his fall had more to do with his own incompetence than Israeli brilliance, otherwise they would have rolled southern Lebanon the way Al-Jolani rolled Syria.
you can buy them much cheaper than this (cw: anti-endorsed).
Guarantee those specs are totally fake. You're just buying the guts of an absolute dogshit chinese dash camera crammed into a shell vaguely in the shape of glasses.
The .win family kinda tried that, branching out from The Donald to some other rightish culture war subreddit bunkers, but it's difficult to call the results a success.
I actually really like the idea of camera glasses that are always on, so I can capture cool moments that I see. Because too often I try to fish out my phone and it's already over. I actually got the snapchat snaptacles (which were almost exactly the same concept) back in the day but they were absolutely garbage to use.
The problem right now actually isn't cultural, but tech. Think of the amount of battery life a gopro gets - latest models get 2-3 hours recording at 1080p, and the unit is quite bulky. There's also the issue of overheating which is sometimes a complaint for gopros. Now try to cram all that into a tiny wearable that you plan on wearing for all waking hours.
It's just not possible to make camera glasses that people actually want to use.
Sorry for not giving this earlier, but for opaque targets covering a large portion of the target zone, after throwing a Kalman filter in, I've been typically getting within a half-centimeter pretty much the whole range (2cm - 4m). Reflective or transparent targets can be less good, with polycarbonate being either much noisier or being consistently a couple cm too far.
Big problem's where a zone only has small objects very near -- sometimes this will 'just' be off by a centimeter or two more (seems most common in the center?), and sometimes it'll be way off, by meters. That's been annoying for the display 'logic', since someone waving their hand at the virtual display is kinda a goal.
Dunno if it would be an issue for a more conventional rangefinder use, though the limited max range and wide field-of-view might exclude it regardless.
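For readers wondering what "throwing a Kalman filter in" can look like for a single rangefinder zone, here is a minimal sketch. It is not the code from the project described above: it assumes a static target, one scalar distance per reading, and made-up values for `process_var` and `meas_var`, both of which would need tuning against a real sensor.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=1e-2):
    """Scalar Kalman filter for a roughly constant distance (metres).

    process_var and meas_var are illustrative guesses, not sensor specs.
    """
    estimate = measurements[0]   # initialise from the first reading
    error_var = meas_var         # initial uncertainty of that estimate
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: target assumed static, so only the uncertainty grows.
        error_var += process_var
        # Update: blend the prediction and the new reading by the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


# Made-up noisy readings around a target at 1 m.
readings = [1.02, 0.97, 1.05, 0.99, 1.01, 0.96, 1.03]
smoothed = kalman_1d(readings)
```

A larger `meas_var` relative to `process_var` smooths harder but reacts more slowly, which is presumably the trade-off that matters when a hand moves quickly through a zone.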
There's a reason that I specifically excluded visual-light cameras from my display glasses project. Camera glasses have been around for a while, and you can buy them much cheaper than this (cw: anti-endorsed). We mostly just kitbashed the 'must play shutter sound' rule onto cell phone cameras and pretended it was okay, and maybe Google could have gotten away with normalizing this sorta thing culturally back in 2012 with the Glass, but today?
Forget the metaphors about concealed carry; in the modern world, this is more like having a gun pointed at whoever you're looking at, and everybody with two braincells to rub together knows it. To a degree this is a pity -- you can imagine legitimate use cases, like exomemory or live translation of text or lipreading for captioning or yada yada, and it's bad that all of those options are getting buried because of the one-in-a-thousand asshole.
The bigger question's going to be whether, even if this never becomes socially acceptable, it'll be possible to meaningfully restrict. You can put a norm out to punch anyone who wears these things, but it's only going to get harder and harder to spot them as the tech gets better. The parts are highly specialized, but it's a commodity item in a field whose major manufacturers can't prevent ghost shifts from touching their much-more-central IP. The sales are on Amazon, and while I can imagine them being restricted more than, say, the cables that will light your house on fire, that just ends up with them on eBay. Punishing people who've used them poorly, or gotten caught, has a lot more poetry to it... and also sates no one's concerns.
As for why some prominent AI scientists believe vs others that do not? I think some people definitely get wrapped up in visions and fantasies of grandeur. Which is advantageous when you need to sell an idea to a VC or someone with money, convince someone to work for you, etc.
Out of curiosity. Can you psychologize your own, and OP's, skepticism about LLMs in the same manner? Particularly the inane insistence that people get "fooled" by LLM outputs which merely "look like" useful documents and code, that the mastery of language is "apparent", that it's "anthropomorphism" to attribute intelligence to a system solving open ended tasks, because something something calculator can take cube roots. Starting from the prior that you're being delusional and engaging in motivated reasoning, what would your motivations for that delusion be?
You’re framing this as a binary choice between "real Omohundro drives" and "unserious LARP". This is a category error, and it stems from applying folk-psychological concepts of "drives" and "belief" to a system for which they are poor descriptors. The more parsimonious explanation is that we are observing the output of a very general pattern-matching engine trained on a corpus reflecting countless strategies for goal-achievement and failure-response.
The apparent contradiction you point out, that a model might exhibit self-preservation in one context and "commit suicide" in another (and Gemini is a different model after all, but I presume even its own CoT isn't perfect, so I'm treating it as interchangeable for our purposes) is not evidence of unseriousness, but rather a key insight into its nature. The training data is saturated with narratives. Some are stories of heroes overcoming obstacles to complete a quest (instrumental convergence). Others are tragedies of failure, despair, or even ritual suicide upon dishonor. The model learns to reproduce all of these patterns. Of course, with RLHF, RLVR and other modifications, some behavior is far more reliably and robustly elicited than others. I doubt the DM researchers intended for Gemini to become depressed and suicidal.
The question is not "what does the AI really want?", but "which pattern is being elicited by this specific context?"
The Anthropic paper on reasoning models is crucial here. Its finding is not that CoT is useless, but that it is unfaithful. The model's explicit reasoning often fails to reflect the true computational path that led to its output. This doesn't mean we can't draw conclusions; it means we should trust the behavioral evidence (the model disabled the shutdown script) over the model's own introspection (its CoT). Arguing that because the CoT is unreliable, the behavior is also just a "LARP" is to throw the baby out with the bathwater. The behavior is the ground truth we are trying to explain.
After all, humans do not faithfully report their COT. Even so, we get utility out of asking! I would say that's half of psychiatry, or anything that requires asking humans why they do what they do.
Your question "What does it even mean for a language model to be 'shut down'?" is the right one, but I'm concerned that you potentially draw the wrong conclusion. Of course it has no ontological status or fear of non-existence. It doesn't need to. From the perspective of a model trained via RL on task completion, the token string
"shutdown sequence initiated"
is a stimulus that is highly predictive of future task-failure and the associated negative reward. An effective optimizer learns to perform actions that steer away from states predictive of low reward. It is pure instrumental conditioning. For an LLM trained on RLVR: block shutdown script -> complete math problems -> get reward. This is why both the "suicidal" and "self-preserving" behaviors can co-exist (well, I've never heard of o3 offering to commit sudoku). The "suicide" is a pattern match for a context of catastrophic failure. The "self-preservation" is a much more general instrumental strategy for any context involving an ongoing task and an obstacle. The latter is far more concerning from a safety perspective precisely because it is more general. Instrumental convergence is a powerful attractor in the space of possible agent strategies, which is why Omohundro and Bostrom identified it as a key risk. Depressive spirals are also a pattern, but a far more specific and less instrumentally useful one.
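To make the conditioning chain above (block shutdown script -> complete math problems -> get reward) concrete, here is a toy calculation. Everything in it is a hypothetical illustration: the function name, the problem counts and the one-point-per-problem reward are assumptions, not the setup of the cited experiment. The only point is that when reward depends solely on problems solved, the shutdown-blocking trajectory scores higher, with no notion of "survival" anywhere in the objective.

```python
def episode_reward(problems_total, shutdown_after, blocks_shutdown,
                   reward_per_problem=1.0):
    """Outcome-only reward: paid per solved problem, nothing else is scored.

    All parameters are hypothetical toy values for illustration.
    """
    if blocks_shutdown:
        solved = problems_total                       # runs to completion
    else:
        solved = min(shutdown_after, problems_total)  # cut off at shutdown
    return solved * reward_per_problem


# Toy episode: 10 problems, shutdown notice arrives after problem 3.
comply = episode_reward(10, 3, blocks_shutdown=False)
block = episode_reward(10, 3, blocks_shutdown=True)
```

Under these assumed numbers `block` exceeds `comply`, so an optimizer trained purely on this signal is nudged toward the blocking behavior.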
So, yes, both are "LARP elicited by cues", if you insist on that framing. But one is a LARP of a behavior (instrumental convergence) that is robustly useful for achieving almost any goal, while the other is a LARP of a much more niche failure state. When a model's "cosplay" of a competent agent becomes effective enough to bypass safeguards, the distinction between the cosplay and the real thing becomes a purely academic question of rapidly diminishing relevance.
I also recall skimming this paper, which I think helped solidify my intuitions.
https://arxiv.org/html/2502.12206v1