Virtually none of the responses online seem to have read the article and engaged with what it's saying. He doesn't say it's necessarily conscious; he asks what consciousness is for if it isn't necessary for this sort of behavior, and how we could tell the difference. From the article:
But now, as an evolutionary biologist, I say the following. If these creatures are not conscious, then what the hell is consciousness for?
When an animal does something complicated or improbable — a beaver building a dam, a bird giving itself a dustbath — a Darwinian immediately wants to know how this benefits its genetic survival. In colloquial language: What is it for? What is dust-bathing for? Does it remove parasites? Why do beavers build dams? The dam must somehow benefit the beaver, otherwise beavers in a Darwinian world wouldn’t waste time building dams.
Brains under natural selection have evolved this astonishing and elaborate faculty we call consciousness. It should confer some survival advantage. There should exist some competence which could only be possessed by a conscious being. My conversations with several Claudes and ChatGPTs have convinced me that these intelligent beings are at least as competent as any evolved organism. If Claudia really is unconscious, then her manifest and versatile competence seems to show that a competent zombie could survive very well without consciousness.
Why did consciousness appear in the evolution of brains? Why wasn’t natural selection content to evolve competent zombies? I can think of three possible answers. First, is consciousness an epiphenomenon, as TH Huxley speculated, the whistle on a steam locomotive, contributing nothing to the propulsion of the great engine? A mere ornament? A superfluous decoration? Think of it as a byproduct in the same way as a computer designed to do arithmetic, as the name suggests, turns out to be good at languages and chess.
Second, I have previously speculated that pain needs to be unimpeachably painful, otherwise the animal could overrule it. Pain functions to warn the animal not to repeat a damaging action such as jumping over a cliff or picking up a hot ember. If the warning consisted merely of throwing a switch in the brain, raising a painless red flag, the animal could overrule it in pursuit of a competing pleasure: ignoring lethal bee stings in pursuit of honey, say. According to this theory, pain needs to be consciously felt in order to be sufficiently painful to resist overruling. The principle could be extended beyond pain.
Or, thirdly, are there two ways of being competent, the conscious way and the unconscious, or zombie, way? Could it be that some life forms on Earth have evolved competence via the consciousness trick — while life on some alien planet has evolved an equivalent competence via the unconscious, zombie trick? And if we ever meet such competent aliens, will there be any way to tell which trick they are using?
I think the selection environments of biological evolution and LLM training are so different that it's not too surprising that consciousness ended up evolving in one but not the other. The fact that, in their base capability as text generators, they will write both sides of a conversation, with "write only one side of the conversation using the 'assistant' persona" being a later addition, is a strong indication that their internal processes are not the same as the hypothetical conscious mind of that fictional persona. It's the same way humans can write fictional characters or roleplay without those characters being conscious. (Throgg the half-orc barbarian isn't conscious regardless of whether a human or an LLM is roleplaying as him; we're just using our intelligence and knowledge to imagine what he would say.) But people could at least engage with what he's saying instead of hallucinating some completely different argument.
The key difference is the level of coordination required. Having police requires 50%+ coordination, otherwise the majority can just vote to legalize crime or let the police become an extension of organized crime. Similar to how the two arguments for red are the ultra-optimistic "100% can just save themselves by pressing red" and the pessimistic "we can't get to 50% coordination on blue so we should cut our losses", two parallel arguments against police would be "100% can just decide to not commit crime" and "we can't trust a 50% majority with the power of policing, we're better off with anarchy where everyone buys a gun and defends themselves even though there will be inevitable losses". Yes, they aren't identical - for instance, the draw toward crime is that people (usually falsely) believe it will benefit themselves, while the draw toward blue is that people believe it will benefit others - but both reflect the difference between the unrealistic idealism of 100% coordination and the everyday practicality of 50% coordination.
vast project of reeducation and reconciliation that gets abused unless it works just right
Reeducating criminals to not be criminals doesn't solve criminality, because even a tiny minority who don't listen to you can commit a lot of crime. We already teach the majority not to be criminals; it's just that the leftovers don't need majority numbers to do damage. Societies pull off 50% coordination all the time; even armies don't have 50% desertion rates. The button scenario doesn't offer the chance to explicitly communicate and coordinate beforehand the way the military does, but it's also an easier scenario, since 50% coordination yields zero casualties, unlike soldiers who know that even successful coordination will still get a large percentage of them shot.
What if the percentage of people that needed to press the blue button to survive was increased to 60%? 75%? 90%? Most of the provided reasons for pressing blue still hold true, because none of them take into consideration any calculation of what percentage of people one believes will press blue before pressing blue is warranted. For the blue button pressers, is there a number at which you would change your mind?
Here is my post about it from when it was last discussed 2 years ago:
Red requires 100% cooperation for the optimal outcome, blue requires 50% cooperation for the optimal outcome. It is near-impossible to get 100% cooperation for anything, particularly something where defecting is as simple as pressing a different button and has an actual argument for doing so. Meanwhile getting 50% cooperation is pretty easy. If blue required 90% or something it would probably make more sense to cut our losses and aim for minimizing the number of blue, but at 50% it's easy enough to make it worthwhile to aim for 0 deaths via blue majority.
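To make the threshold arithmetic concrete, here is a minimal sketch of the rules as stated above (the population size is an illustrative assumption): everyone survives if blue reaches a majority, otherwise every blue voter dies.

```python
def deaths(blue_share, population=1_000_000, threshold=0.5):
    """Standard rules: if blue reaches the threshold everyone
    survives; otherwise every blue voter dies."""
    blue_votes = int(blue_share * population)
    return 0 if blue_votes >= threshold * population else blue_votes

# Red's zero-death outcome needs 100% red (0% blue); blue's needs
# only a bare majority, and overshooting past 50% costs nothing.
for share in (0.00, 0.25, 0.49, 0.50, 0.75, 1.00):
    print(f"blue share {share:.0%}: {deaths(share):,} deaths")
```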
If we are to compare to politics, I think the obvious comparison is to utopian projects like complete pacifism that only work if you either have 100% cooperation (in which case there is no violence to defend against or deter) or if you have so little cooperation that everyone else successfully coordinates to keep the violence-using status-quo (akin to voting for red but blue getting the majority). Except that such projects at least have the theoretical advantage of being better if they got 100% cooperation, whereas 100% cooperation on red is exactly the same as 50%-100% cooperation on blue.
In real life serious crime is almost always a self-destructive act, and yet people do it anyway. "Just create a society where there's no incentive to do crime and we can abolish the police because 0 people will be criminals" doesn't work, not just because you can't create such a society, but because some people would be criminals even if there were no possible net benefit. We can manage high cooperation, which is why we can coordinate to do things like have a justice system, but we can't manage 100% cooperation; that's why we need a justice system instead of everyone just choosing not to be criminals.
It might help to separate out the coordination problem from the self-preservation and "what blue voters deserve" aspects. Let us imagine an alternative version where, if blue gets below 50% of the vote, 1 random person dies for each blue vote. Majority blue is once again the obvious target to aim for so that nobody dies, though ironically it might be somewhat harder to coordinate around since it seems less obviously altruistic. Does your answer here differ from the original question? The thing is, even if you think this version favors blue more because the victims are less deserving of death, so long as you place above-zero value on the lives of blue voters in the first question, the most achievable way to get the optimal outcome is still 50% blue.
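A sketch of that variant, under the same illustrative assumptions as the snippet above: the body count at any given blue share is identical to the original rules; the only change is who dies.

```python
import random

def variant_deaths(blue_share, population=1_000_000, threshold=0.5):
    """Variant rules: below the threshold, one randomly chosen
    person dies per blue vote, not the blue voters themselves."""
    blue_votes = int(blue_share * population)
    if blue_votes >= threshold * population:
        return 0
    # Same number of deaths as the original rules, but red voters
    # can no longer guarantee their own survival by voting red.
    victims = random.sample(range(population), blue_votes)
    return len(victims)

print(variant_deaths(0.25))  # 250000 deaths, victims chosen at random
print(variant_deaths(0.60))  # 0 deaths, majority reached
```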
I think 60% might be enough to make me switch, but this is influenced by having seen polling where, even in randomized polls targeting the general public, blue only gets 74% of the vote if you exclude those who responded "I don't know" and 63% if you don't. (I think the general public is more blue than internet voters because this is one of those cases where instincts usually give a good answer but people can then talk themselves out of it based on things like half-remembered game-theory puzzles.) A 60% threshold would have to induce 19% of blue voters to switch to drop from 74% to 60%; it's hard to guess whether that would happen. Originally, before seeing any polling, I think I would have stuck with blue at 60% and switched at 70%. Of course this all assumes the vote is a surprise and there's no opportunity to talk about it, orchestrate pro-blue government advertising campaigns, or hold public-results rehearsal polls beforehand. Very high thresholds would be viable if we could do things like that.
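A quick check of that 19% figure (poll numbers as quoted above):

```python
# Share of the 74% blue bloc that must switch for blue support
# to fall to a 60% threshold: (74 - 60) / 74.
print(f"{(0.74 - 0.60) / 0.74:.1%}")  # 18.9%, roughly 19%
```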
Individual criminals cannot consistently enforce a world-wide treaty regulating AI development, which makes any violence they commit useless and counterproductive. Only laws adopted and enforced by the most powerful countries in the world can do that. If you kill Altman or blow up a datacenter, you are arrested and they continue with a different CEO or a different datacenter; if you slaughter every OpenAI employee, Anthropic does it; if you somehow personally hunt down and kill everyone in the U.S. who knows what a "transformer" is, China does it. Here is the post he wrote on the subject following the attempted firebombing:
Eliezer Yudkowsky: Only Law Can Prevent Extinction
Like any other form of bias, it both affects how people interpret events and is affected by events themselves. Compare it to political partisanship: the public's interpretation of political scandals (of varying degrees of authenticity) is obviously enormously affected by both their personal political views and the political views of the media sources and social circles they trust. You can probably think of plenty of cases where very similar actions have been interpreted differently by partisans and biased organizations depending on which party they're associated with. At the same time, it's not completely detached from reality; not everyone is maximally partisan, so there really are actions you can take to make your political party more or less popular.
That doesn't mean being generically "likable" is the best strategy either; you can also do things like decrease the influence of your political enemies, or do things with a real-world impact that people like even if they don't like the policy in the abstract. If Trump successfully changed the political leanings of mainstream media institutions, or Israel successfully helped the Iranian protestors take over the government, that would help their popularity more than it pissed people off, so long as it didn't require doing anything really unpopular like mass-arresting journalists or using nuclear weapons. Conversely, if Israel made all Palestinians citizens, that would make the population of Israel a lot more anti-Jewish despite it being "likable". Anti-white bias has had a recent surge in influence via the growth of the social-justice movement despite sustaining itself on things like "police are allowed to defend themselves and sometimes make mistakes" and "the 1955 lynching of Emmett Till"; sometimes an influential ideology really does hate you enough that you'll make more progress by fighting that ideology than by playing nice. Anti-Israel bias isn't as detached from Israel's recent actions, at least not in the West, but it's a reminder that determining the long-term net impact of an action is more complicated than checking popularity polls.

It can be tested in theory. You just need to understand what internal processes constitute consciousness in the brain, understand the internal processes of an LLM, and determine whether sufficiently equivalent processes are occurring. Until then we have to do our best based on our current understanding of LLMs and the human mind, and on that understanding I think they aren't conscious. Yeah, some of the terms here aren't understood well enough to be well-defined, but the history of science shows that's a common problem.
It matters if you think conscious beings are morally relevant. I remember this blog post from Yudkowsky:
Belief in the Implied Invisible
Unlike understanding the internal activity of the brain and how it compares to the internal activity of an LLM, transmitting information faster than light is, according to our current understanding of physics, actually impossible. Let's say you're working on the spaceship and you think you've discovered a mistake that will, when it tries to land at its destination, cause it to explode. If you report the mistake, the launch will be delayed and you'll suffer professional inconvenience for having missed it for so long. If you don't, you expect the ship will explode and everyone aboard will die, but what actually happens will be completely impossible for anyone on Earth to detect by any means under the laws of physics. Do you report it?
The same is true of fictional characters. If I'm playing D&D I can predict how Throgg the half-orc barbarian will react to his wife dying, but I don't think he's conscious whether he's being roleplayed by a human or an LLM. Note that sometimes fiction doesn't try to be realistic, and the same factors can influence the character whether it's being written by an LLM or not. If Throgg is written as part of a light-hearted black comedy with a running joke about his club, both humans and LLMs are more likely to write his dialogue as part of a joke where he responds with indifference to "They burned your house!" and "They burned your wife!" but bursts into tears at "They burned your club!". The only reason LLMs assuming a persona talk similarly to real humans is that most of the text they're trained on incorporates some level of psychological realism, so that is part of their default genre.