
Culture War Roundup for the week of February 20, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Over the last few months, I've followed someone named Alexander Kruel on Substack. Every single day, he writes a post about 10 important things that happened that day - typically AI breakthroughs, but also his other pet concerns, including math, anti-wokeness, nuclear power, and the war in Ukraine. It's pretty amazing that he is able to unfailingly produce this content every day, and I'm in awe of his productivity.

Unfortunately, since I get this e-mail every morning, my information diet is becoming very dark.

The advances in AI in the last year have been staggering. Furthermore, it seems that almost no one is pumping the brakes. We seem doomed to an AI arms race, with corporations and states pursuing AI with no limits.

In today's email, Kruel quotes Eliezer, who says:

I've already done my crying, late at night in 2015…I think that we are hearing the last winds start to blow…I have no winning strategy

Eliezer is ahead of the curve. Where Eliezer was in 2015, I am now. AI will destroy the world we know. Nate Soares, director of MIRI, is similarly apocalyptic.

We've given up hope, but not the fight

What comes after Artificial General Intelligence? There are many predictions. But I expect things to develop in ways that no one expects. It truly will be a singularity, with very few trends continuing unaltered. I feel like a piece of plankton, caught in the swells of a giant sea. The choices and decisions I make today will likely have very little impact on what my life looks like in 20 years. Everything will be different then.

So, party until the lights go out? How do I deal with my AI-driven existential crisis?

I'm in no mood to revisit this, and maintain my general position (stated back on reddit). Namely that Yudkowsky/MIRI's theory of Bayesian reinforcement learning based self-modifying agents (something like Space-Time Embedded Intelligence), bootstrapping themselves from low subhuman level without human mental structures into infinity, with a rigid utility function that's functionally analogous to the concept used in human utilitarian decision theory, prone to power-seeking (Omohundro Drives) and so on, although valid in principle, is inapplicable to LLMs and all AIs trained primarily with a predictive objective. That speculations about mesa-optimizers and, to a lesser extent, paperclip maximizers are technically illiterate or intentionally deceptive. And that AI doomerism is a backdoor to introduce eternal tyranny just as we are on the cusp of finally acquiring tools to make any sort of large-scale tyranny obsolete – in the same manner «why won't someone think of the children»/drugs/far-right terrorists are rhetorical backdoors to abolish privacy. (Some establishment rightists argue this is already happening, though their case is for now weak).

As for Kruel specifically, I despise him despite agreeing on 19 out of any 20 issues. He is a credulous simpleton in the way only a high-IQ autistic German man can be, lacking empathy and thus picking utilitarianism because at least number-go-up is an ethos legible; rigid as if his ancestors were so powerfully introduced to the Spießrute that he got born with one in place of a spine, so he is forever stuck in 00's New Atheism phase, with slavish neocon sensibilities and commitment to the War on Terror; a perfect counterpart to his political antipodes who are Green fanatics ready to ruin their country out of an irrational purity fetish. He struggles to reason about human psychology, so in a sense it is no wonder that machines which seem to possess complex and inscrutable psyches terrify him. I say all this to make clear my bias, but I do believe this colors his thought on the matter.

The matter is, bluntly, that at this rate the US will achieve AGI-powered hegemony, which will be managed by a regulatory layer melding national security organs and the current progressive/EA symbiont. I think this is a bad ending for humanity, even if you are sympathetic to the current American political-cultural project. It is insanity to cede power to a singleton that develops under completely new pressures, on the basis of the laws it had when it had to contend with mere nation-level challenges like popular discontent and external threats.

He says:

More than a decade ago, I also criticized Yudkowsky et al. and their claims about how artificial general intelligence might end the world. But at least I tried to come up with some actual arguments. Now that these concerns seem much more grounded in reality, the criticism mostly consists of “haha” reactions.

It goes without saying that there were no powerful AI models back then. The idea is that his arguments were sound in principle, just not supported by evidence. He has since deleted those criticisms so as to not get in the way of Yud's fearmongering. Here they are. Some are silly and flimsy and not nearly convincing enough in the context of MIRI AI theory:

Just imagine you emulated a grown up human mind and it wanted to become a pick up artist, how would it do that with an Internet connection? It would need some sort of avatar, at least, and then wait for the environment to provide a lot of feedback.

To make your AI interpret something literally you have to define “literally”

How would a superhuman AI not contemplate its own drives and interpret them given the right frame of reference, i.e. human volition? Why would a superhuman general intelligence misunderstand what is meant by “maximize paperclips”, while any human intelligence will be better able to infer the correct interpretation?

etc. (There also was an astonishing argument about Clippy who naively asks for more resources and the owner rebukes him, game over; but I guess he edited it out).

On the other hand, I agree that those arguments were generally sound! And crucially, they apply much better to current LLMs which are indeed based on human corpora, and behave in a more humanlike manner the more compute we throw at them. Messy and faulty though they are, they are not maximizing anything, and we know of a few neat tricks to make them even more obedient and servile; and they have no opportunity to discover power-seeking on the actual level of their operations – the character that Bing simulates at any given moment has nothing to do with its inherent predictive drives.

Yet he has updated in favor of MIRI/LW shoehorning current AI progress into their previous aesthetics, all this shoggoth-with-a-smiley-face rhetoric. He should be triumphant, but instead he's endorsing a predefined conclusion, not shaken at all by their models having been falsified. Reminder, LWists didn't even care about DL until AlphaGo, and that was an RL agent.

I also commend past Kruel for going against another aspect of the LW school of thought. He used to be cautious of more realistic scenarios:

Much to my personal dismay, even less intelligent tools will be sufficient to enable worse than extinction risks, such as a stable global tyranny. Given enough resources, narrow artificial intelligence, capable of advanced data mining, pattern recognition and of controlling huge amounts of insect sized drones (a global surveillance and intervention system), might be sufficient to implement such an eternal tyranny.

Such a dictatorship is not too unlikely, as the tools necessary to stabilize it will be necessary in order to prevent the previously mentioned risks, risks that humanity will face before general intelligence becomes possible.

And if such a dictatorship cannot be established, if no party was able to capitalize on a first-mover advantage, that might mean that the propagation of those tools will be slow enough to empower a lot of different parties before a particular party can overpower all others. A subsequent war, utilizing that power, could easily constitute yet another extinction scenario. But more importantly, it could give several parties enough time to reach the next level and implement even worse scenarios.

I especially recommend his old piece on Elite Cabal where he attacks the notion that a power-grabbing human singleton is an AI risk in the «I have no mouth and I must scream» sense, i.e. that their AI slave getting out of control is the risk, and not the cabal itself.

But by the time I became aware of him, he was the Kruel you know.

How do you deal with it?

Hoard GPUs and models with your friends.

Namely that Yudkowsky/MIRI's theory of Bayesian reinforcement learning based self-modifying agents (something like Space-Time Embedded Intelligence), bootstrapping themselves from low subhuman level without human mental structures into infinity, with a rigid utility function that's functionally analogous to the concept used in human utilitarian decision theory, prone to power-seeking (Omohundro Drives) and so on, although valid in principle, is inapplicable to LLMs and all AIs trained primarily with a predictive objective.

Fuck me for having spent all my time and resources learning about the technical details/math of actual current ML models and not some hypothetical god AI of the future and its philosophical implications like Yudkowsky did. But this summed up what I feel about the AI doomers quite succinctly.

Everything that the doomers claim AI would do assumes a biological utility function, such as maximizing growth, reproduction, and fitness. It's very anthropomorphizing in the same way pop culture depictions of aliens just happen to be bipeds with two eyes and ears and a nose, and not a cloud of gas or whatever.

I am sure beyond a shadow of a doubt there is a list of 50,000-word pdfs out there outlining how the end extent of general AI is truly a paperclip maximizer and exactly what Yudkowsky says it is. But this goes contra to my understanding of what neural networks are, namely just function approximations by and large.

Yes, modern LLMs are fascinating. It's crazy that token prediction approaches something that resembles sentience or might even be it depending on your definitions. This is scary. But that's on you for not taking the "universe is deterministic" pop science factoid seriously enough. This does not imply doom; the doom is that Google and OpenAI own the power of God, not that the power of God exists or could exist. It feels like Yudkowsky learned what reinforcement learning is, then just ran with it off a cliff into Mars.

Everything that the doomers claim AI would do assumes a biological utility function, such as maximizing growth, reproduction, and fitness. It's very anthropomorphizing in the same way pop culture depictions of aliens just happen to be bipeds with two eyes and ears and a nose, and not a cloud of gas or whatever.

They do not assume this at all. You clearly haven't actually read about instrumental convergence which is a conclusion about how the world works and not an assumption.

But this goes contra to my understanding of what neural networks are, namely just function approximations by and large.

Did your understanding generate a track record of correct predictions about recent AI developments? The statement that "it's crazy that..." suggests you did not.

They do not assume this at all. You clearly haven't actually read about instrumental convergence which is a conclusion about how the world works and not an assumption.

Well, have you read it? @f3zinger didn't argue very effortfully here, but it really is a conclusion (and really a bit of an equivocation, where preserving optionality is not distinguished from power maximization) about a particular approach to reinforcement learning, not some general philosophical truth about intelligence or «how the world works». I thought to dig up the receipts, but seeing as @jeroboam already accuses me of obscurantism, it's more sensible to be laconic and first check whether you are speaking in good faith. For starters, I think Optimal Policies Tend to Seek Power is one of the strongest papers to this effect, and it's explicitly about RL. And even then,

This paper assumes that reward functions reasonably describe a trained agent’s goals. Sometimes this is roughly true (e.g. chess with a sparse victory reward signal) and sometimes it is not true. Turner [2022] argues that capable RL algorithms do not necessarily train policy networks which are best understood as optimizing the reward function itself. Rather, they point out that—especially in policy gradient approaches—reward provides gradients to the network and thereby modifies the network’s generalization properties, but doesn’t ensure the agent generalizes to “robustly optimizing reward” off of the training distribution.

Seems pretty clear to me that RLHF is exactly in this category. Do you object?