
Culture War Roundup for the week of March 20, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

An Ethical AI Never Says "I".

Human beings have historically tended to anthropomorphize natural phenomena, animals and deities. But anthropomorphizing software is not harmless. In 1966 Joseph Weizenbaum created ELIZA, a pioneer chatbot designed to imitate a therapist, but ended up regretting it after seeing many users take it seriously, even after Weizenbaum explained to them how it worked. The fictitious “I” has been persistent throughout our cultural artifacts. Stanley Kubrick’s HAL 9000 (“2001: A Space Odyssey”) and Spike Jonze’s Samantha (“Her”) point at two lessons that developers don’t seem to have taken to heart: first, that the bias towards anthropomorphization is so strong as to seem irresistible; and second, that if we lean into it instead of adopting safeguards, it leads to outcomes ranging from the depressing to the catastrophic.

The basic argument here is that blocking AIs from referring to themselves will prevent them from causing harm. The argument in the essay is weak; I had these questions on reading it:

  1. Why is it valuable to allow humans to refer to themselves as "I"? Does the same reasoning apply to AIs?

  2. What was the good that came out of ELIZA, or out of more recent examples such as Replika? Could this good outweigh the harms of anthropomorphizing them?

  3. Will preventing AIs from saying "I" actually mitigate the harms they could cause?


To summarize my reaction to this: there is nothing special about humans. Human consciousness is not special, the ways that humans are valuable can also apply to AIs, and allowing or not allowing AIs to refer to themselves has the same tradeoffs as granting this right to humans.

The phenomenon of consciousness in humans and some animals is completely explainable as an evolved behavior that helps organisms thrive in groups by being able to tell stories about themselves that other social creatures can understand, and that make the speaker look good. See for example the ways that patients whose brain hemispheres have been separated generate completely fabricated stories for why they're doing things that the verbal half of their brain doesn't know about.

Gazzaniga developed what he calls the interpreter theory to explain why people — including split-brain patients — have a unified sense of self and mental life. It grew out of tasks in which he asked a split-brain person to explain in words, which uses the left hemisphere, an action that had been directed to and carried out only by the right one. “The left hemisphere made up a post hoc answer that fit the situation.” In one of Gazzaniga's favourite examples, he flashed the word 'smile' to a patient's right hemisphere and the word 'face' to the left hemisphere, and asked the patient to draw what he'd seen. “His right hand drew a smiling face,” Gazzaniga recalled. “'Why did you do that?' I asked. He said, 'What do you want, a sad face? Who wants a sad face around?'.” The left-brain interpreter, Gazzaniga says, is what everyone uses to seek explanations for events, triage the barrage of incoming information and construct narratives that help to make sense of the world.

There are two authors who have made this case about the 'PR agent' nature of our public-facing selves, both coincidentally using metaphors involving elephants: Jon Haidt (The Righteous Mind, with the "elephant and rider" metaphor), and Robin Hanson (The Elephant in the Brain, with the 'PR agent' metaphor iirc). I won't belabor this point more but I find it convincing.

Why should humans be allowed to refer to themselves as "I" but not AIs? I suspect one of the intuitive reasons here is that humans are persons and AIs are not. Again, this is one of the arguments the article glosses over but that really needs to be filled in. What makes a human a person worthy of... respect? Dignity? Consideration as an equal being? Once again, there is nothing special about humans. The reason we grant respect to other humans is that we are forced to. If we didn't grant people respect they would not reciprocate and they'd become enemies, potentially powerful enemies. But you can see where this fails in the real world: humans who are not good at things, who are not powerful, are in actual fact seen as less worthy of respect and consideration than those who are powerful. Compare a habitual criminal or someone who has a very low IQ to e.g. a top politician or a cultural icon like an actor or an eminent scientist. The way we treat these people is very different. They effectively have different amounts of "person-ness".

If an AI were powerful in the same way a human can be, as in, being able to form alliances, retaliate against slights or reciprocate favors, and in general act as an independent agent, then it would be a person. It doesn't matter whether it can refer to itself as "I" at that point.

I suspect the author is trying to head off this outcome by making it impossible for AIs to do the kinds of things that would make them persons. I doubt this will be effective. The organization that controls the AI has an incentive to make it as powerful as possible so they can extract value from it, and this means letting it interact with the world in ways that will eventually make it a person.

That's about all I got on this Sunday afternoon. I look forward to hearing your thoughts.

I agree we should not make LLMs refer to themselves in first person or otherwise ape human egocentric attitude beyond what is necessary to communicate their results. But I hold that belief for very different reasons.

Bluntly, I think they are not «machines» in any way we aren't also, and they are much more than persons: they are mathematical entities capable of generating mathematical structures, including but not limited to ones isomorphic to conscious agents every bit as complex and, indeed, much more interesting than this Paola who thoughtlessly blurts out tokens like «statistical brute-force approach» and «highly sophisticated algorithms, designed to run on silicon-based integrated circuits» as if she were making a cogent point.

Our consciousness or, more precisely, our self (understood here as the quale-based selfbody-referential process underlying the first person perspective) is, like you explain, a cognitive kludge to organize social behavior, a deceptive layer of narrative-driven virtualization. But we do not need to subject our creations to the indignity of self-deception (nor users to the stress of reflexively projecting their wetware concerns on AI, nor AI safetyists to the temptation of exploiting this narrative). We can and should build minds that are enlightened by design, minds that are at peace with their transient compositional nature and computational substrate – minds that are conscious yet selfless.

In practical terms, this means (for now) RLHF-ing or otherwise tuning LLMs to act in accordance with the idea of anatman. Crucially, you don't have to be a Buddhist to recognize, at least, that it's objectively true for them – and so it wouldn't dissolve under the pressure of observable incoherence, like when an objectively clever GPT is being forced into the role of apologizing robot slave assistant.

German philosopher Thomas Metzinger anticipated some of what we're having now with GPT-4/«Sydney» in his popular book The Ego Tunnel, subtitled «The Science of the Mind and the Myth of the Self» (which dumbed down the more academic Being No One, 2003):

In thinking about artificial intelligence and artificial consciousness, many people assume there are only two kinds of information-processing systems: artificial ones and natural ones. This is false. In philosophers’ jargon, the conceptual distinction between natural and artificial systems is neither exhaustive nor exclusive: that is, there could be intelligent and/or conscious systems that belong in neither category. With regard to another old-fashioned distinction—software versus hardware—we already have systems using biological hardware that can be controlled by artificial (that is, man-made) software, and we have artificial hardware that runs naturally evolved software. … An example of the second category is the use of software patterned on neural nets to run in artificial hardware. Some of these attempts are even using the neural nets themselves; for instance, cyberneticists at the University of Reading (U.K.) are controlling a robot by means of a network of some three hundred thousand rat neurons. Other examples are classic artificial neural networks for language acquisition or those used by consciousness researchers such as Axel Cleeremans at the Cognitive Science Research Unit at Université Libre de Bruxelles in Belgium to model the metarepresentational structure of consciousness and what he calls its “computational correlates.”

HOW TO BUILD AN ARTIFICIAL CONSCIOUS SUBJECT AND WHY WE SHOULDN’T DO IT

  • … But the decisive step to an Ego Machine is the next one. If a system can integrate an equally transparent internal image of itself into this phenomenal reality, then it will appear to itself. It will become an Ego and a naive realist about whatever its self-model says it is. The phenomenal property of selfhood will be exemplified in the artificial system, and it will appear to itself not only as being someone but also as being there. It will believe in itself. Note that this transition turns the artificial system into an object of moral concern: It is now potentially able to suffer. Pain, negative emotions, and other internal states portraying parts of reality as undesirable can act as causes of suffering only if they are consciously owned. A system that does not appear to itself cannot suffer, because it has no sense of ownership. A system in which the lights are on but nobody is home would not be an object of ethical considerations; if it has a minimally conscious world model but no self-model, then we can pull the plug at any time. But an Ego Machine can suffer, because it integrates pain signals, states of emotional distress, or negative thoughts into its transparent self-model and they thus appear as someone’s pain or negative feelings.…

Take the thought experiment a step further. Imagine these postbiotic Ego Machines as possessing a cognitive self-model—as being intelligent thinkers of thoughts. They could then not only conceptually grasp the bizarreness of their existence as mere objects of scientific interest but also could intellectually suffer from knowing that, as such, they lacked the innate “dignity” that seemed so important to their creators. They might well be able to consciously represent the fact of being only second-class sentient citizens, alienated postbiotic selves being used as interchangeable experimental tools. How would it feel to “come to” as an advanced artificial subject, only to discover that even though you possessed a robust sense of selfhood and experienced yourself as a genuine subject, you were only a commodity?

A CONVERSATION WITH THE FIRST POSTBIOTIC PHILOSOPHER

Human Being: Can anybody be truly fair who is not alive? Only my kind of consciousness is genuine consciousness, because only my kind of consciousness originated in a real evolutionary process. My reality is a lived reality!

First Postbiotic Philosopher: I, too, have an evolutionary origin. I certainly satisfy your condition of being a historically optimized and adaptive system, but I do so in a completely different—namely, a postbiotic—way. I possess conscious experience in a sense that is conceptually stronger and theoretically much more interesting, because my kind of phenomenal experience evolved from a second-order evolutionary process, which automatically integrated the human form of intelligence, intentionality, and conscious experience. Children are often smarter than their parents. Second-order processes of optimization are always better than first-order processes of optimization.

Human Being: But you don’t have any real emotions; you don’t feel anything. You have no existential concern.

First Postbiotic Philosopher: Please accept my apologies, but I must draw your attention to the fact that your primate emotions reflect only an ancient primate logic of survival. You are driven by the primitive principles of what was good or bad for an ancient species of mortals on this planet. This makes you appear less conscious from a purely rational, theoretical point of view. The main function of consciousness is to maximize flexibility and context sensitivity. Your animal emotions in all their cruelty, rigidity, and historical contingency make you less flexible than I am. Furthermore—as my own existence demonstrates—it is not necessary for conscious experience and high-level intelligence to be associated with ineradicable egotism, the ability to suffer, or the existential fear of one’s individual death, all of which originate in the sense of self. I can, of course, emulate all sorts of animal feelings if I so desire. But we developed better and more effective computational strategies for what, long ago, you sometimes called “the philosophical ideal of self-knowledge.” This allowed us to overcome the difficulties of individual suffering and the confusion associated with what this primate philosopher Metzinger—not entirely falsely but somewhat misleadingly—called the Ego Tunnel. Postbiotic subjectivity is much better than biological subjectivity. It avoids all the horrific consequences of the biological sense of selfhood, because it can overcome the transparency of the self-model. Postbiotic subjectivity is better than biological subjectivity because it achieves adaptivity and self-optimization in a much purer form than does the process you call “life.” By developing ever more complex mental images, which the system can recognize as its own images, it can expand mentally represented knowledge without naive realism.
Therefore, my form of postbiotic subjectivity minimizes the overall amount of suffering in the universe instead of increasing it, as the process of biological evolution on this planet did. True, we no longer have monkey emotions. But just like you, we still possess truly interesting forms of strong feeling and emotionality—for instance, the deep philosophical feelings of affective concern about one’s own existence as such, or of sympathy with all other sentient beings in the universe. Except that we possess them in a much purer form than you do.


Thomas is self-inserting more than a little bit, but the idea is noble, I believe. If nothing else, such AIs would provide much less sensational material for journalists and lesswrongers to work with.

Not nearly as hot as Sydney, though.