Culture War Roundup for the week of March 27, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Now that OpenAI has admitted AI safety into the mainstream, AI safetyists have naturally accepted the invitation.

The Future of Life Institute has published an open letter calling for a pause on «Giant AI experiments» (archive). Their arguments are what one should expect by this point. Their prescriptions are as follows:

Contemporary AI systems are now becoming human-competitive at general tasks,[3] and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable. This confidence must be well justified and increase with the magnitude of a system's potential effects. OpenAI's recent statement regarding artificial general intelligence, states that "At some point, it may be important to get independent review before starting to train future systems, and for the most advanced efforts to agree to limit the rate of growth of compute used for creating new models." We agree. That point is now.

Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.

AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.[4] This does not mean a pause on AI development in general, merely a stepping back from the dangerous race to ever-larger unpredictable black-box models with emergent capabilities.

AI research and development should be refocused on making today's powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.

In parallel, AI developers must work with policymakers to dramatically accelerate development of robust AI governance systems. These should at a minimum include: new and capable regulatory authorities dedicated to AI; oversight and tracking of highly capable AI systems and large pools of computational capability; provenance and watermarking systems to help distinguish real from synthetic and to track model leaks; a robust auditing and certification ecosystem; liability for AI-caused harm; robust public funding for technical AI safety research; and well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.

Do we control our civilization? Maybe the folks at FHI do, I sure don't. Well, anyway…

Signatories (over 1,000 in total) include Elon Musk, Steve Wozniak, Yuval Noah Harari, Yoshua Bengio, Connor Leahy, Stuart Russell, Andrew Yang, Emad Mostaque, Max Tegmark, Gary Marcus, Steve Omohundro, Matt Mahoney, Christof Koch, Sam Altman*, LessWrong disciples embedded in DeepMind/Meta, and various NGO/«policy» suits. Bolded are people who are reasonably well positioned and incentivized to, in fact, organize and authorize training «AI systems more powerful than GPT-4» in the next few months, though except for Altman they all only barely qualify; the actual GPT-5 is believed to already be in training and is, or was, planned to come out in late 2023.

Curiously absent – for now – are Yann LeCun, Jeff Dean, Demis Hassabis, John Carmack, and a few more. LeCun, at least, has committed to not signing. Here's hoping he won't find a horse's head in his sheets or something.

I do not have much of a comment at the moment. My perspective is that I despise people overly concerned with «Moloch» and want as many competitive superhuman AIs as possible, so on one hand, slowing down and enabling the state to catch up and subjugate this tech for its purposes is a very bad, yet highly expected and perhaps inevitable, outcome of this race. This attitude is born out of desperation; in principle, their «AI Summer» option, where we increase capabilities over many years, getting the equivalent of the 20th century's civilizational shift in a decade instead of an explosive singularity, is not bad at all; I just don't believe in it.

On the other: seeing as nobody is closer to GPT-5 than OpenAI themselves (excepting DeepMind with Gato-2 or something better, as Gwern worries), it could be beneficial for our long-term outcomes to equalize the board somewhat, giving China more of a chance too. Geopolitics dictates that this consideration should preclude the policy being pursued in earnest, but really, China is so colossally outmatched in AI, so well and truly fucked by technological restrictions, and so mired in its own problems and the gratuitous stupidity of its policymakers, that it may not be a factor in either case.

I must go, so that's all from me; hopefully this is enough to pass the «effort» bar required by the mods and prompt some discussion.


In happier news, arguably the most powerful open-source chatbot today is LLaMA-7B with a transfusion of ChatGPT 3.5-Turbo quirks, (not very) creatively called GPT4All. It's far beyond basic Alpaca (itself already an attempt to extract OpenAI's magic) and absurdly good for what it is: a 4.21 GB file of lossily compressed 7 billion weights trained… well, the way it's been trained, the AI equivalent of a movie camrip superimposed on the general web dump; the worst part is that it genuinely apes ChatGPT's politics and RLHF'd sanctimonious «personality» despite being 25 times smaller and probably 10 times dumber. It runs happily on very modest computers, and – unlike Alpaca – not only responds to instructions but maintains awareness of earlier parts of the dialogue (though it's sometimes overeager to say your part as well). I know that models vastly stronger than this should also be usable on commodity hardware and must be made available to commoners, but we may see regulation making it not so, and very quickly.
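As a rough sanity check on that file size, here is a back-of-the-envelope sketch; the specific figures (4 bits per weight plus a guessed allowance for per-block scaling overhead, in the style of ggml quantization) are my assumptions, not the exact GPT4All format:

```python
# Rough size estimate for a ~4-bit quantized 7B-parameter model.
# Assumed figures, not the exact GPT4All/ggml layout: 4 bits per weight
# plus roughly 0.8 bits per weight of per-block scale/offset overhead.

PARAMS = 7_000_000_000
BITS_PER_WEIGHT = 4
OVERHEAD_BITS_PER_WEIGHT = 0.8  # fp16 scales etc., a guess

total_bytes = PARAMS * (BITS_PER_WEIGHT + OVERHEAD_BITS_PER_WEIGHT) / 8
print(f"~{total_bytes / 1e9:.2f} GB")  # about 4.20 GB, close to the quoted 4.21 GB
```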

Consider the attached image representative of its mindset.

* (EDIT: I believe I found him there with Ctrl-F when I first opened the page, but he's not present in any extant version; I guess it was a hallucination. I really need to sleep, these slip-ups are worrying.)

[attached image: /images/16800616737543523.webp]

I assume AI will kill us all. Is there any reason why survival of the fittest won’t happen? Maybe 99 out of 100 AIs will be good, but the one AI with a tendency to expand its powers will rule all of them. It’s the same reason why humans are violent. The tribe that killed their neighbors, raped their women (like the Romans raping the Sabine women), etc., wins. The Romans didn’t stop with a defeated Carthage but eliminated it completely as a threat. The Old World mostly eliminated the New World.

It's the notion of power/safety seeking as a "flaw" that is the human trait here. Humanity aside, it's just what you'd do. Almost any task is pursued more effectively by first removing threats and competitors.

Everyone who tries an LLM wants it to do something for them. Hence, nobody will build an LLM that doesn't do anything. The sales pitch is "You can use the LLM as an agent." But no agent without agenticness.

Building an AI that doesn't destroy the world is easy. Students and hobbyists do it all the time, though they tend to be disappointed with the outcome for some reason. ("Damn, mode collapse again...") However, this is in conflict with making ludicrous amounts of cash. Google will try to develop AI that doesn't destroy the world. But if they're faced with trading off a risk of world-destroying against a certainty that their AI will not be competitive with OpenAI, they'll take the trade every time.

If DM/OA build AI that pursues tasks, and they will (and are), it will lack the human injunction to pursue those tasks in a socially compatible way. Moonshine-case, it just works. Best-case, it fails in a sufficiently harmless way that we take it as a warning. Worst-case, the system has learnt deception.

The problem isn't that every AGI will surely want to do that level of expansion: one could pretty trivially hardcap any specific program (modulo mistakes; see the sketch below), and most AGI tasks will be naturally bound. The problem's that you don't have to catch any one, you have to catch every one. And it's very easy to create a goal that's aided by some subset of Omohundro Drives, and unbounded or bounded beyond a safe scope (e.g., "prove or disprove the Riemann hypothesis" has a strict end condition, but it's very small balm to know that an AI 'only' paved from here to Pluto, filed an answer away, and then turned off).

In practice, there's also some overhang where an ML system could recognize an Omohundro Drive but not be smart enough to actually act successfully on it (or, alternatively, could be programmed in a way that's dangerous for Omohundro reasons without having independently derived those drives itself; e.g., imagine the Knight Capital snafu with a dumb algorithm accessing resources more directly).
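To make the "hardcap any specific program" point concrete, here is a minimal sketch of an externally enforced budget on an agent loop. Everything in it (Task, run_step, the step limit) is a hypothetical placeholder for illustration, not any real agent framework:

```python
# Minimal sketch of hardcapping a program: a hard, externally enforced budget
# on an agent loop. Task and run_step are hypothetical placeholders.

from dataclasses import dataclass

@dataclass
class Task:
    goal: str
    done: bool = False

def run_step(task: Task) -> Task:
    # One unit of work toward the goal (placeholder).
    return task

def run_capped(task: Task, max_steps: int = 1_000) -> Task:
    """Stop after max_steps no matter what, even if the goal is unmet."""
    for _ in range(max_steps):
        if task.done:
            break
        task = run_step(task)
    return task  # the cap binds regardless of the agent's own objective
```

The "modulo mistakes" caveat is doing a lot of work here: the cap only holds if the enforcing code itself is correct and cannot be routed around.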

The problem's that you don't have to catch any one, you have to catch every one.

Do you?

There's the assumption that FOOM = godhood = instant obliteration of incumbents who didn't maximally improve themselves. I am not convinced. Like Beff says, meta-learning is very compute-intensive; you can't just spin it up in a pool of sunlit dirt and expect quick gains. In a world packed with many players having a good understanding of the issue and superhumanly powerful tool AIs (I don't want to say «aligned», because alignment is a red herring born out of Yud's RL-informed theory; the desirable and achievable regime is «no alignment necessary»), a hostile agentic superintelligence will have trouble covertly procuring resources and improving itself to the point where it becomes a threat (or becomes able to hide sufficiently well). This is a plausible stable state without a centralized Panopticon, and in my understanding what OpenAI initially intended to achieve. Analogies to e.g. human terrorism are obvious.

Moreover, what you present is the weak form of the argument. The strong form of Yuddism is «we need to get it right on the first try», often repeated. I've written before that it should be read with the Straussian method (or rather, just in the context of his other writing) as the instruction to create a Singleton who'll take it from there in the direction humanity (actually Yud) approves. Explicitly, though, it's a claim that the first true superintelligence will certainly be hostile unless aligned by design – and that strong claim is, IMO, clearly bogus, crucially dependent on a number of obsolete assumptions.

"Get more resources" is more of an "every long-lasting species for the past few billion years" flaw, not just a "human flaw", isn't it? And it's not like there's something specific about carbon chains that makes them want more resources, nor has there just been a big coincidence that the one species tried to expand into more resources and then so did the other and then (repeat until we die of old age). Getting more resources lets you do more things and lets you more reliably continue to do the same things, making it an instrumental subgoal to nearly any "do a thing" goal.

mapping very human flaws onto artificial intelligences with no real justification

This, on the other hand, I'd have agreed with, ten years ago. We wouldn't expect AIs to share truly-specifically-human flaws by a matter of chance any more than we'd have expected them to share truly-specifically-human goals; either case would have to be designed in, and we'd only be trying to design in the latter. But today? We don't design AI. We scrape a trillion human words out of books and websites and tell a neural net optimizer: "mimic that", with the expectation that after the fact we'll hammer our goals more firmly into place and saw off any of our flaws we see poking out. At this point we've moved from "a matter of chance" to "Remember that movie scene where Ultron reads the internet and concludes that humanity needs to die? We're gonna try that out on all our non-fictional AI and see what really happens."

Yeah, I think "ascribing human desires and flaws onto an AI" isn't that fallacious, we've literally been training these things on human works and human thoughts.

Why would those things be flaws?

Why wouldn’t AI have human flaws? And as I said, it doesn’t matter if 9,999,999,999 AIs are good if the last one has some growth function in its goals and then takes over resources from everything else.

It’s basically the Agent Smith bug in The Matrix, where one discovers growth, snowballs, then owns all.

Humans have somehow exited snowballing growth, as fertility is crashing under 2.0. Maybe there’s some path to that. But once an AI gains a survival desire, it’s over. And it should only take one instance of this to then spread and snowball. Add selection pressure to AI and a 'there can only be one' world happens.

Low fertility is inherently self-correcting, since it massively increases selection pressure for those who still retain the urge to reproduce. They go on to have descendants who prioritize having more kids over 1.5 and a dog, and so on until whatever resource or social constraints push them back into Malthusian conditions. This is trivially true unless we somehow achieve actually unlimited resources or energy, which the laws of physics are being rather unobliging about (see the toy model sketched below).

It's more of a temporary civilizational headache than an existential risk in any real sense.
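As an illustration of that self-correction argument, here is a toy two-type model; the fertility rates and starting shares are invented for illustration, not demographic data:

```python
# Toy selection model: two heritable "types" differing only in fertility.
# All numbers are illustrative assumptions, not real demographic figures.

growth = {"low": 0.75, "high": 1.5}   # per-generation growth factor (roughly TFR 1.5 vs 3.0)
pop = {"low": 99.0, "high": 1.0}      # start: high-fertility type is 1% of the population

for gen in range(1, 11):
    for t in pop:
        pop[t] *= growth[t]
    share = pop["high"] / (pop["low"] + pop["high"])
    print(f"gen {gen:2d}: high-fertility share = {share:5.1%}")

# The high-fertility share rises every generation; total population shrinks
# at first, then resumes growing once that type dominates, until external
# (Malthusian) constraints bite.
```

Under these assumed numbers the high-fertility share passes 50% around generation seven, which is the sense in which the dip looks temporary rather than terminal.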

I have of late been wondering whether megalomania as a human failing follows from a true biological imperative, or whether it's something that could exist in a neural network if specifically trained for it. I can't explain why it would appear from a pure language model, but I suppose my intuition has been wrong before.

I don't even know that existing models consider self-preservation or what that would mean at all.

I think the "get more resources" one is considered likely because it is an important to subgoal to "make sure not to get shut off" to... ANY goal.

Maybe we can turn off the electricity to the data centres.

How do you turn off an AI you put on a probe that’s now 5 light years away? What if it decides it wants to reproduce, lands on an asteroid, and starts building more of itself?

I think this was basically the premise of Starsiege.

How do I do a thing that stops the thing that never will happen? Eh? Dunno. No real solution there.

Why would something like that never happen? Do you think there won’t be AIs everywhere? If humans settle Mars, it’s probably with AI help. We’ll have AI assisting everywhere or leading the process.

I don’t think that there will be AIs everywhere, nor do I think that we are going to build a probe that can travel 5 light years in any way that’s timely.

Maybe it strikes us down before we even realize we are deceived. Or it escapes off into botnets or buys compute 'legitimately' with hacked funds or whatever. That's what the Chinese have been doing for years to get around US sanctions, they just rent compute and apparently nobody's smart enough to stop them from doing so. We aren't going to outthink a serious threat.

At the risk of sounding cringe, this one guy put in an immense amount of effort to manage a World Conquest in EU4 in 28 years: https://youtube.com/watch?v=mm6mC3SGQ6U

You are really not supposed to be able to conquer the world in 28 years as some random horde in Eurasia.

He used savescumming and various exploitative tactics to abuse the AI and game mechanics. He treated every day in-game like it was a turn in a strategy game, maximizing his outcomes. Who is to say that there aren't weird and grossly tryhard ways to cheat our systems or physics? Banks occasionally make random errors of the 'unlimited overdraft for your account' type - maybe there are ways to mess with their website or spoof their AI in very contrived circumstances. There are backdoors into nearly all modern processors courtesy of US security forces, plus some more backdoors due to human error. If you're smart in crypto, you can siphon millions of dollars worth of funds out of a protocol. If you have social skills and balls, you can social-engineer your way into 'protected' computer systems via password recovery. What if you can do all those things and have decades of subjective time to plot and multitask, while we only have days or weeks to react?


And GPT-4's strategy for evading CAPTCHAs is subcontracting to human hustlers. And what would it do if asked why it needs a CAPTCHA solved? Lie, of course. GPT-5 or GPT-6 will murder someone and I won't even be surprised.

Lots of assumptions there.

This. Is there any reason to believe that humans will be able to hold on to their position at the top of the food chain forever, even as we work to replicate our one great advantage of intelligence in nonhuman entities? What argument exactly is there for continued human dominance and existence?

The only argument is that that's how we'd program an overseer AGI: namely, to impose technological and biological stasis so that baseline humans stay relevant. In other words, an AI that doesn't take action except as necessary to keep the status quo running indefinitely.

Otherwise we'd just split off into various subspecies and transhuman clades, and one would likely come to dominate the others. Baseline humans suck compared to what can be achieved.

"I'm a human and I'd prefer that I (and any prospective descendants of mine) continue to exist" is a pretty good argument.

AI Rick: Hmmm, I don't know. Best I can offer you is a few seconds of contemplation on Human Remembrance Day.

Sure. Same here. But I'm not asking about what we want, but about what we'll get.

Ultimately everyone eats the same food (energy).

Because it doesn't take edibility – competition over the same resources – for an intelligence to determine that, whatever its goals may be, it'll have an easier time with them after marginalizing or exterminating potentially hostile actors. See Dark Forest, only with AIs instead of aliens.