
Culture War Roundup for the week of May 1, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


More developments on the AI front:

Big Yud steps up his game, not to be outshone by the Basilisk Man.

Now, he officially calls for preemptive nuclear strikes on suspicious unauthorized GPU clusters.

If we see the AI threat as a nuclear-weapons threat, only worse, it is not unreasonable.

Remember when the USSR planned a nuclear strike on China to stop its great-power ambitions (only for the greatest humanitarian who ever lived, Richard Milhous Nixon, to veto the proposal).

Such Quaker squeamishness will have no place in the future.

So, the outlines of the Katechon World are taking shape. What will it look like?

It will look great.

You will live in your room, play original World of Warcraft and Grand Theft Auto: San Andreas on your PC, read your favorite blogs and debate intelligent design on your favorite message boards.

Then you will log on to Free Republic and call for more vigorous enhanced interrogation of terrorists caught with unauthorized GPUs.

When you are bored in your room, you will have no choice but to go outside, meet people, admire the things around you, take pictures of the things that really impressed you with your Kodak camera, and when you are really bored, play Snake on your Nokia phone.

Yes, the best age in history, the noughties, will retvrn. Forever, protected by a CoDominium of the US and China.

edit: links again

I still see no plausible scenario for these AI-extinction events. How is ChatGPT-4/5/6, etc., supposed to end humanity? I really don't see the mechanism. Is it supposed to invent an algorithm that destroys all encryption? Is it supposed to spam the internet with nonsense? Is it supposed to brainwash someone into launching nukes? I fail to see the mechanism by which this end-of-the-world scenario happens.

There are a few ways that GPT-6 or 7 could end humanity, the easiest of which is by massively accelerating progress in more agentic forms of AI like Reinforcement Learning, which has the "King Midas" problem of value alignment. See this comment of mine for a semi-technical argument for why a very powerful AI based on "agentic" methods would be incredibly dangerous.

Of course the actual mechanism for killing all humanity is probably like a super-virus with an incredibly long incubation period, high infectivity and high death rate. You can produce such a virus with literally only an internet connection by sending the proper DNA sequence to a Protein Synthesis lab, then having it shipped to some guy you pay/manipulate on the darknet and have him mix the powders he receives in the mail in some water, kickstarting the whole epidemic, or pretend to be an attractive woman (with deepfakes and voice synthesis) and just have that done for free.

GPT-6 itself might be very dangerous on its own, given that we don't actually know what goals are instantiated inside the agent. It's trained to predict the next word in the same way that humans are "trained" by evolution to replicate their genes, the end result of which is that we care about sex and our kids, but we don't actually literally care about maximally replicating our genes, otherwise sperm banks would be a lot more popular. The worry is that GPT-6 will not actually have the exact goal of predicting the next word, but like a funhouse-mirror version of that, which might be very dangerous if it gets to very high capability.

Consistent Agents are Utilitarian: If you have an agent taking actions in the world and having preferences about the future states of the world, that agent must be utilitarian, in the sense that there must exist a function V(s) that takes in possible world-states s and spits out a scalar, and the agent's behaviour can be modelled as maximising the expected future value of V(s). If there is no such function V(s), then our agent is not consistent, and there are cycles we can find in its preference ordering, so it prefers state A to B, B to C, and C to A, which is a pretty stupid thing for an agent to do.
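The V(s) claim can be illustrated with a toy sketch (a hypothetical example of mine, not from the comment above): a strict-preference relation containing a cycle cannot be represented by any scalar value function, since A preferred to B would require V(A) > V(B) all the way around the loop, which is impossible. Detecting the cycle is just graph search over the preference relation:

```python
# Toy sketch: a cyclic strict-preference relation admits no scalar V(s)
# with "A preferred to B" implying V(A) > V(B); an acyclic one does.

def has_cycle(prefs):
    """Detect a cycle in the strict-preference graph via depth-first search."""
    graph = {}
    for a, b in prefs:  # (a, b) means "a is strictly preferred to b"
        graph.setdefault(a, []).append(b)

    def visit(node, stack):
        if node in stack:          # revisited a node on the current path: cycle
            return True
        stack.add(node)
        for nxt in graph.get(node, []):
            if visit(nxt, stack):
                return True
        stack.discard(node)
        return False

    return any(visit(n, set()) for n in list(graph))

# A > B, B > C, C > A: intransitive, so no consistent V(s) exists.
print(has_cycle({("A", "B"), ("B", "C"), ("C", "A")}))  # True
# A > B, B > C: acyclic, so e.g. V(A)=2, V(B)=1, V(C)=0 represents it.
print(has_cycle({("A", "B"), ("B", "C")}))              # False
```

An agent whose preferences contain such a cycle can be money-pumped: it will pay to trade C for B, B for A, and A for C, ending where it started but poorer, which is the "pretty stupid thing" the comment refers to.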

But... that's how humans work? Actually, we're even less consistent than that: our preferences are contextual, so we lack the information to rank most states. I recommend the Shard Theory of human values, probably the most serious introspection by ex-Yuddites to date:

A shard of value refers to the contextually activated computations which are downstream of similar historical reinforcement events. For example, the juice-shard consists of the various decision-making influences which steer the baby towards the historical reinforcer of a juice pouch. These contextual influences were all reinforced into existence by the activation of sugar reward circuitry upon drinking juice. A subshard is a contextually activated component of a shard. For example, “IF juice pouch in front of me THEN grab” is a subshard of the juice-shard. It seems plain to us that learned value shards are most strongly activated in the situations in which they were historically reinforced and strengthened.

... This is important. We see how the reward system shapes our values, without our values entirely binding to the activation of the reward system itself. We have also laid bare the manner in which the juice-shard is bound to your model of reality instead of simply your model of future perception. Looking back across the causal history of the juice-shard’s training, the shard has no particular reason to bid for the plan “stick a wire in my brain to electrically stimulate the sugar reward-circuit”, even if the world model correctly predicts the consequences of such a plan. In fact, a good world model predicts that the person will drink fewer juice pouches after becoming a wireheader, and so the juice-shard in a reflective juice-liking adult bids against the wireheading plan! Humans are not reward-maximizers, they are value shard-executors.

This, we claim, is one reason why people (usually) don’t want to wirehead and why people often want to avoid value drift. According to the sophisticated reflective capabilities of your world model, if you popped a pill which made you 10% more okay with murder, your world model predicts futures which are bid against by your current shards because they contain too much murder.

@HlynkaCG's Utilitarian AI thesis strikes again. Utilitarianism is a strictly degenerate decision-making algorithm because it optimizes for decision theory, warping the territory to get good properties of the map; it's basically inverted wireheading. The optimizer's curse is unbeatable, forget about it: a utilitarian AI with nontrivial capability will kill you, or come so close to killing you as to make no difference; your life and wasteful use of atoms are inevitably discovered to be a great affront to the great Cosmic project $PROJ_NAME. Consistent utilitarian agents are incompatible with human survival, because you can't specify a robust function for a maximizer that assigns value to something as specific, arbitrary, and fragile as baseline humans – and AI is a red herring here! Yud himself would process trads into useful paste and Moravecian mind uploads manually if he could, and that's if he doesn't have to make hard tradeoffs at the moment. (I wouldn't, but not because I disagree much on the computed "utility" of that move.) Just read the guy from the time he thought he'd be the first in the AGI race. He sneeringly said «tough luck» to people who wanted to remain human. «You are not a human anyway».

Luckily this is all unnecessary.

Or as Roon puts it:

the space of minds is vast, much vaster than the instrumental convergence basin

But... that's how humans work?

Yes, humans are not consistent agents. Nobody here claimed otherwise.

Do you believe that humans must be utilitarians to achieve success in some task, " in the sense that there must exist a function V(s) that takes in possible world-states s and spits out a scalar, and the human's behaviour can be modelled as maximising the expected future value of V(s)"?

We just got owned by Covid, and Covid was found by a random walk.

Do you mean this in the sense of, “there is no possible DNA sequence A, protein B, and protein C which, when mixed together in a beaker, produces a virus or proto-virus which would destroy human civilization”? Because I’m pretty sure that’s wrong. Finding that three-element set is very much a “humans just haven’t figured out the optimization code yet” problem.

Biology isn't magic, viruses can't max out all relevant traits at once, they're pretty optimized as is. I find the idea of superbugs a nerdsnipe, like grey goo or a strangelet disaster, a way to intimidate people who don't have the intuition about physical bounds and constraints and like to play with arrow notation.

(All these things scare the shit out of me)

Yes we can make much better viruses, no there isn't such an advantage for the attacker, especially in the world of AI that can rapidly respond by, uh, deploying stuff we already know works.

Consider that the first strain of myxomatosis introduced to Australian rabbits had a fatality rate of 99.8%. That’s the absolute minimum on what the upper bound for virus lethality should be. AI designs won’t be constrained by natural selection either.

Yes, it's an interesting data point. Now, consider that rabbits have only one move in response to myxomatosis: die. Or equivalently: pray to Moloch that he has sent them a miraculously adaptive mutation. They can't conceive of an attack happening, so the only way it can fail is by chance.

Modern humans are like that in some ways, but not with regard to pandemics.

Like other poxviruses, myxoma viruses are large DNA viruses with linear double-stranded DNA.

Myxomatosis is transmitted primarily by insects. Disease transmission commonly occurs via mosquito or flea bites, but can also occur via the bites of flies and lice, as well as arachnid mites. The myxoma virus does not replicate in these arthropod hosts, but is physically carried by biting arthropods from one rabbit to another.

The myxoma virus can also be transmitted by direct contact.

Does this strike you as something that'd wipe out modern humanity just because an infection would be 100% fatal?

Do you think it's just a matter of fiddling with nucleotide sequences and picking up points left on the sidewalk by evolution, Pandemic Inc. style, to make a virus that has a long incubation period, asymptomatic spread, is very good at airborne transmission and survives UV and elements, for instance? Unlike virulence, these traits are evolutionarily advantageous. And so we already have anthrax, smallpox, measles. I suspect they're close to the limits of the performance envelope allowed by relevant biochemistry and characteristic scales; close enough that computation won't get us much closer than contemporary wet lab efforts, and so it's not the bottleneck to the catastrophe.

Importantly, tool AIs – which, contra Yud's predictions, have started being very useful before displaying misaligned agency – will reduce the attack surface by improving our logistics and manufacturing, monitoring, strategizing, communications… The world of 2025 with uninhibited AI adoption, full of ambient DNA sensors, UV filters, decent telemedicine and full-stack robot delivery, would not get rekt by COVID. It probably wouldn't even get fazed by MERS-tier COVID. And seeing as there exist fucking scary viruses that may one day naturally jump to, or be easily modified to target humans, we may want to hurry.

People underestimate the vast potential upside of early-Singularity economics, that which must be secured: the way a more productive – but still recognizable – world could be more beautiful, safe, and humane. The negativity bias is astounding: muh lost jerbs, muh art, crisis of meaning, corporations bad, what if much paperclip. Boresome killjoys.

(To an extent I'm also vulnerable to this critique).

But my real source of skepticism is on the meta level.

Real-world systems rapidly gain complexity, create nontrivial feedback loops, dissipative dynamics on many levels of organization, and generally drown out propagating aberrant signals and replicators. This is especially true for systems with responsive elements (like humans). If it weren't the case, we'd have had 10 apocalyptic happenings every week. It is a hard technical question whether your climate change, or population explosion, or nuclear explosion in the atmosphere, or the worldwide Communist revolution, or the Universal Cultural Takeover, or the orthodox grey goo, or a superpandemic, or a stable strangelet, or a FOOMing superintelligence, is indeed a self-reinforcing wave or another transient eddy on the surface of history. But the boring null hypothesis is abbreviated on Solomon's ring: יזג. Gimel, Zayin, Yud. «This too shall pass».

Speaking of Yud, he despises the notion of complexity.

This is a story from when I first met Marcello, with whom I would later work for a year on AI theory; but at this point I had not yet accepted him as my apprentice. I knew that he competed at the national level in mathematical and computing olympiads, which sufficed to attract my attention for a closer look; but I didn’t know yet if he could learn to think about AI.

At some point in this discussion, Marcello said: “Well, I think the AI needs complexity to do X, and complexity to do Y—”

And I said, “Don’t say ‘_complexity_.’ ”

Marcello said, “Why not?”

… I said, “Did you read ‘A Technical Explanation of Technical Explanation’?”

“Yes,” said Marcello.

“Okay,” I said. “Saying ‘complexity’ doesn’t concentrate your probability mass.”

“Oh,” Marcello said, “like ‘emergence.’ Huh. So . . . now I’ve got to think about how X might actually happen . . .”

That was when I thought to myself, “_Maybe this one is teachable._”

I think @2rafa is correct that Yud is not that smart, more like an upgraded midwit, like most people who block me on Twitter – his logorrhea is shallow, soft, and I've never felt formidability in him that I sense in many mid-tier scientists, regulars here or some of my friends (I'll object that he's a very strong writer, though; pre-GPT writers didn't have to be brilliant). But crucially he's intellectually immature, and so is the culture he has nurtured, a culture that's obsessed with relatively shallow questions. He's stuck on the level of «waow! big number go up real quick», the intoxicating insight that some functions are super–exponential; and it irritates him when they fizzle out. This happens to people with mild autism if they have the misfortune of getting nerd-sniped on the first base, arithmetic. In clinical terms that's hyperlexia II. (A seed of an even more uncharitable neurological explanation can be found here). Some get qualitatively farther and get nerd-sniped by more sophisticated things – say, algebraic topology. In the end it's all fetish fuel, not analytic reasoning, and real life is not the Game of Life, no matter how Turing-complete the latter is; it's harsh for replicators and recursive self-improovers. Their formidability, like Yud's, needs to be argued for.

The world of 2025 with uninhibited AI adoption, full of ambient DNA sensors, UV filters and full-stack robot delivery, would not get rekt by COVID.

Oh sure, if hypothetical actually-competent people were in charge we could implement all kinds of infectious disease countermeasures. In the real world, nobody cares about pandemic prevention. It doesn't help monkey get banana before other monkey. If the AIs themselves are making decisions on the government level, that perhaps solves the rogue biology undergrad with a jailbroken GPT-7 problem, but it opens up a variety of other even more obvious threat vectors.

Real-world systems rapidly gain complexity, create nontrivial feedback loops, dissipative dynamics on many levels of organization, and generally drown out propagating aberrant signals and replicators. This is especially true for systems with responsive elements (like humans).

-He says while speaking the global language with other members of his global species over the global communications network FROM SPACE.

Humans win because they are the most intelligent replicator. Winningness isn't an ontological property of humans. It is a property of being the most intelligent thing in the environment. Once that changes, the humans stop winning.


I've heard it said, as an aside, by someone who wasn't in the habit of making stuff up, that his virology prof said making cancer-causing viruses is scarily simple. Of course, whether the cancer-causing part would survive optimization for spread in the wild is an open question.

Why do you think that? This combination of features would be selected against in evolutionary terms, so it's not like we have evidence from either evolution or humans attempting to make such a virus and failing at it. As far as I can see, no optimization process has ever attempted to make such a virus.

I cannot find the study, but a lab developed dozens of unbelievably toxic and completely novel proteins over a very small period of time with modern compute. The paper was light on details because they viewed the capability as too dangerous to fully specify. I'll keep trying to google to find it.

This is simpler than engineering a virus, yes, but the possibility is there and real. Either using AI as an assistive measure or as a ground-up engineer will be a thing soon.

See Gwern's terrorism is not effective. Thesis:

Terrorism is not about causing terror or casualties, but about other things. Evidence of this is the fact that, despite often considerable resources spent, most terrorists are incompetent, impulsive, prepare poorly for attacks, are inconsistent in planning, tend towards exotic & difficult forms of attack such as bombings, and are in practice ineffective: the modal number of casualties per terrorist attack is near-zero, and global annual terrorist casualties have been a rounding error for decades. This is despite the fact that there are many examples of extremely destructive easily-performed potential acts of terrorism, such as poisoning food supplies or renting large trucks & running crowds over or engaging in sporadic sniper attacks.

He notes that a terrorist group using the obvious plan of "buy a sniper rifle and kill one random person per member of the terrorist group per month" would be orders of magnitude more effective at killing people than the track record of actual terrorists (where in fact 65% of terrorist attacks do not even injure a single other person), while also being much more, well, terrifying.

One possible explanation is given by Philip Bobbitt’s Terror and Consent – the propaganda of the deed is more effective when the killings are spectacular (even if inefficient). The dead bodies aren’t really the goal.

But is this really plausible? Try to consider the terrorist-sniper plan I suggest above. Imagine that 20 unknown & anonymous people are, every month, killing one person in a tri-state area. There’s no reason, there’s no rationale. The killings happen like clockwork once a month. The government is powerless to do anything about it, but their national & local responses are tremendously expensive (as they are hiring security forces and buying equipment like mad). The killings can happen anywhere at any time; last month’s was at a Wal-mart in the neighboring town. The month before that, a kid coming out of the library. You haven’t even worked up the courage to read about the other 19 slayings last month by this group, and you know that as the month is ending next week another 20 are due. And you also know that this will go on indefinitely, and may even get worse—who’s to say this group isn’t recruiting and sending more snipers into the country?

Gwern concludes that dedicated, goal-driven terrorism basically never happens. I'm inclined to agree with him. We're fine because effectively nobody really wants to do as much damage as they can, not if it involves strategically and consistently doing something unrewarding and mildly inconvenient over a period of months to years (as would be required by the boring obvious route for bioterrorism).

I personally think the biggest risk of catastrophe comes from the risk that someone will accidentally do something disastrous (this is not limited to AI -- see gain-of-function research for a fun example).

I don't think a run-of-the-mill grad student could set up this test, and I'm sure the compute was horrendously expensive. But these barriers are going to drop continuously.

Model development will become more "managed", compute will continue to get cheaper, and the number of bad actors who only have to go to grad school (as opposed to being top-of-their-field doctorate specialists) will remain high enough to do some damage.

I'm not a virologist, but it hardly looks very difficult to me. I contend that the only reason it hasn't been done yet is that humans are (generally) not omnicidal.

You're not restricted to plain GOF by serial passage, you can directly splice in genes that contribute to an extended quiescent phase, and while I'm not personally aware of research along those lines, I see no real fundamental difficulties for a moderately determined adversary.

On the other hand, increasing lethality and virulence are old hat, any grad student can pull that off if they have the money for it.

Is your contention that more than one in a few tens of millions of people at most is strategically omnicidal ("strategically omnicidal" meaning "omnicidal and willing to make long-term plans and execute them consistently for years about it")?

I think the world would look quite different if there were a significant number of people strategically trying to do harm (as opposed to doing so on an impulse).

Honestly? Yes, albeit with the caveat that truly existentially dangerous pathogens require stringent safety standards that need more than a single person's capabilities.

If someone without said resources tries it, in all likelihood they'll end up killing themselves, or simply cause a leak before the product is fully cooked. We're talking BSL-4 levels at a minimum if you want a finished product.

Engineering a prion will be much easier, though. Protein folding is something the AI is already quite good at. Giving everyone that matters transmissible spongiform encephalopathy should be relatively straightforward.