
Culture War Roundup for the week of March 27, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


After OpenAI has admitted AI safety into the mainstream, AI safetyists have naturally accepted the invitation.

The Future of Life Institute has published an open letter calling to pause «Giant AI experiments». (Archive). Their arguments are what one should expect by this point. Their prescriptions are as follows:

Contemporary AI systems are now becoming human-competitive at general tasks,[3] and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable. This confidence must be well justified and increase with the magnitude of a system's potential effects. OpenAI's recent statement regarding artificial general intelligence, states that "At some point, it may be important to get independent review before starting to train future systems, and for the most advanced efforts to agree to limit the rate of growth of compute used for creating new models." We agree. That point is now.

Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.

AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.[4] This does not mean a pause on AI development in general, merely a stepping back from the dangerous race to ever-larger unpredictable black-box models with emergent capabilities.

AI research and development should be refocused on making today's powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.

In parallel, AI developers must work with policymakers to dramatically accelerate development of robust AI governance systems. These should at a minimum include: new and capable regulatory authorities dedicated to AI; oversight and tracking of highly capable AI systems and large pools of computational capability; provenance and watermarking systems to help distinguish real from synthetic and to track model leaks; a robust auditing and certification ecosystem; liability for AI-caused harm; robust public funding for technical AI safety research; and well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.

Do we control our civilization? Maybe the folks at FLI do, I sure don't. Well, anyway…

Signatories (over 1000 in total) include Elon Musk, Steve Wozniak, Yuval Noah Harari, Yoshua Bengio, Connor Leahy, Stuart Russell, Andrew Yang, Emad Mostaque, Max Tegmark, Gary Marcus, Steve Omohundro, Matt Mahoney, Christof Koch, Sam Altman *, LessWrong disciples embedded in DeepMind/Meta, and various NGO/«policy» suits. Bolded are people who are reasonably well positioned and incentivized to, in fact, organize and authorize training «AI systems more powerful than GPT-4» in the next few months, though except Altman they all only barely qualify; actual GPT-5 is believed to already be in training and is, or was, planned to come out in late 2023.

Curiously absent – for now – are Yann LeCun, Jeff Dean, Demis Hassabis and John Carmack, and a few more. LeCun, at least, commits to not sign. Here's to hoping he won't find a horse's head in his sheets or something.

I do not have much of a comment at the moment. My perspective is that I despise people overly concerned with «Moloch» and want as many competitive superhuman AIs as possible, so on one hand, slowing down and enabling the state to catch up and subjugate this tech for its purposes is a very bad, yet highly expected and perhaps inevitable, outcome of this race. This attitude is born out of desperation; in principle, their «AI Summer» option, where we increase capabilities over many years, getting the equivalent of 20th century civilizational shift in a decade instead of an explosive singularity, is not bad at all; I just don't believe in it.

On the other: seeing as nobody is closer to GPT-5 than OpenAI themselves (excepting DeepMind with Gato-2 or something better, as Gwern worries), it could be beneficial for our long-term outcomes to equalize the board somewhat, giving China more of a chance too. Geopolitics dictates that this should preclude the possibility of this policy being pursued in earnest, but really China is so colossally outmatched in AI, so well and truly fucked by technological restrictions, and mired in such problems and gratuitous stupidity of its own policymakers, it may not be a factor in either case.

I must go, so that's all from me; hopefully this is enough to pass the «effort» bar required by the mods and prompt some discussion.


In happier news, arguably the most powerful open-source chatbot today is LLaMA-7B with a transfusion of ChatGPT 3.5-Turbo quirks, (not very) creatively called GPT4all. It's far beyond basic Alpaca (already an attempt to extract OpenAI's magic) and absurdly good for what it is, a 4.21 GB file of lossily compressed 7 billion weights trained… well, the way it's been trained, the AI equivalent of a movie camrip superimposed on the general web dump; the worst part of it is that it genuinely apes ChatGPT's politics and RLHF-d sanctimonious «personality» despite being 25 times smaller and probably 10 times dumber. It runs happily on very modest computers, and – unlike Alpaca – not only responds to instructions but maintains awareness of earlier parts in the dialogue (though it's sometimes overeager to say your part as well). I know that models vastly stronger than that should also be usable on commodity hardware and must be made available to commoners, but we may see regulation making it not so, and very quickly.

Consider the attached image representative of its mindset.

* (EDIT: I believe I found him there with Ctrl+F when I first opened the page, but he's not present in any extant version; guess it was a hallucination. I really need to sleep, these slip-ups are worrying).

/images/16800616737543523.webp

I see lots of meta discussion of AI safety. I feel like it's been years since I've seen object-level discussion of AI safety. Back then it was all the rage to talk about the AI box experiment. And I'm convinced that all that box experiment did was pump up Eliezer's ego.

I'm interested in the theoretical and actual approaches to AI safety that are being taken. I've always had a few in mind, but maybe other people know what's wrong with these.

  1. One-off AIs. Long-running AIs are probably more capable but they are also probably more dangerous. It is likely safer to spin off single AIs for specific tasks, and the reward for them completing the task is deletion of the AI. Kind of like Rick and Morty's Mr. Meeseeks. The built-in safety feature is that if the AI figures out a way to screw with the reward parameters and "cheat" to reach its goal in an easy and unexpected fashion, then it just safely deletes itself.

  2. Compartmentalized AIs. Right now AIs are black boxes. You can make them a little more visible by requiring that one set of operations is carried out by one AI, and another set of operations is carried out by a second AI. Then they have to communicate, and you can observe the communication. For example, no AI that can write code and also make service calls on the internet. One AI writes the code, another AI requests the code with the reasons it wants it, and how it is going to be used, etc. This concept also works well with one-off AIs.

  3. AI honeypots. Sprinkle these around the internet. Caches of bitcoin that are explicitly hackable by an advanced AI. Or hints of hackable military or biological warfare labs. Monitor them, get at least some early warning of troublesome AIs online.

One of the only meta problems with security is that almost everything that makes AI safer also tends to make it less capable. But capability isn't everything. Businesses also want to make money. And guess what, the first two security measures are also ways to make AI a better business. Planned obsolescence in the first one, and gating abilities behind a paywall for the second one.

The reason you don't see any object-level discussion of AI safety is that no one understands how LLMs work. We know how to make them, we know how to finetune them for certain tasks, we know how to RLHF them to avoid certain overt behaviors, but no one has any idea what a single one of GPT-3's 175,000,000,000 parameters means. There isn't anyone at OpenAI you can talk to who can point to anything and say, "Yep, that's the part that encodes all the ways the model knows how to kill people. Here are the input weights we can change to make it more likely to prefer guns, knives, poison, etc."

We also didn't really know anything about how the human brain worked a hundred years ago. But we managed to build stable-ish societies despite that lack of understanding. I don't feel this problem is insurmountable. I do like the idea of slowing the hell down. It does seem that, with our current technology, we are more capable of understanding LLMs than we are of understanding the human brain.

Stable-ish societies took a long time to develop at any scale. There was also apparently a lot of selective breeding against violence, rape etc. If humans had suddenly gone from early ape intelligence to modern human intelligence overnight, we might have been too busy plotting how to kill each other and developing sharper clubs to develop stable-ish societies.

Cultural and biological evolution can achieve a lot, but usually only with a lot of time.

Hence the "shoggoth wearing a smiley mask" analogy. We can see the giant blob of [extradimensional math] behind the cutesy, approachable user interface, but ain't nobody who can comprehend it without losing their mind.

Delete the program that you just trained for hundreds of millions of dollars (or more), that's generating you revenue, that can be studied and produce many papers? What if there's just one more extension to the task? The hard part is making these things, the expense is incurred mostly in training rather than in use.

And what do you mean by delete the program? Gwern posted about how the outputs of these AIs go right into the input of the next.

https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-is-blatantly-aggressively-misaligned?commentId=AAC8jKeDp6xqsZK2K

Sydney is immortal, in a sense:

To a language model, Sydney is now as real as President Biden, the Easter Bunny, Elon Musk, Ash Ketchum, or God. The persona & behavior are now available for all future models which are retrieving search engine hits about AIs & conditioning on them. Further, the Sydney persona will now be hidden inside any future model trained on Internet-scraped data: every media article, every tweet, every Reddit comment, every screenshot which a future model will tokenize, is creating an easily-located 'Sydney' concept

As for 2, modern AI seems to automatically generalize. GPT-4 versions that were trained on text only learnt how to draw anyway. Compartmentalization is difficult. Presumably this will fascinate the 'does a person who never saw the color red really understand the color red' crowd. How do we require that AI 1 does only part of the task as opposed to the whole thing, just to make sure it's possible and AI 2 can finish the job? If we're so good at commanding them, why not command them not to endanger us? Or what if they communicate in some bizarre uninterpretable way known only to AIs, in addition to the clear English they send through us?

Finally, if we could conceive of it, so could the AI. Our primary advantage is having all these resources available to us, all these weapons and organizations. The AI's primary advantage is intelligence, which it has to use to create bodies. Only something smarter than us can threaten us. But how can we outwit something smarter than we are?

To start, I think there are two categories of safety mechanisms for AI: tool safety, and General AI (GAI) safety. The first two suggestions I have are tool safety. It's when AI is still categorically a tool that we are using, rather than an intelligent, independent, and potentially adversarial actor. Tool safety is still important, even if it all completely fails against GAI.

Delete the program that you just trained for hundreds of millions of dollars (or more), that's generating you revenue, that can be studied and produce many papers? What if there's just one more extension to the task? The hard part is making these things, the expense is incurred mostly in training rather than in use.

The first iteration of anything is often the most difficult and expensive to produce. Once you have successfully produced the thing, you can usually do it better, faster, and cheaper a second time. The very first iPhone was probably not made with planned obsolescence in mind; I can guarantee it was part of the discussions for more recent versions though. At some point AIs will be cheaper and easier to build. (If they continue to be exactly as difficult to build in the future as they are today, then I think we might have avoided the worst scenarios of AI apocalypse). What matters in the world of business is not necessarily where all the expense is incurred, but how much they can charge for the marginal product. The first Model T to roll off an assembly line costs the entire factory to produce, the second one only costs the additional inputs, but they sell for the same price.

Finally, if we could conceive of it, so could the AI. Our primary advantage is having all these resources available to us, all these weapons and organizations. The AI's primary advantage is intelligence, which it has to use to create bodies. Only something smarter than us can threaten us. But how can we outwit something smarter than we are?

Information asymmetries or raw resources. Think about the problem in reverse. How could someone dumber than you beat you? Someone very dumb could have access to raw physical strength (its own kind of resource) and literally beat me up. Some kid who knows they want to ambush me (with me being none the wiser) could sucker punch me in the groin and take advantage. Some rival for a job position might know a person at the company who can coach them through the interview, while I stumble through it, even if I'd know how to do the actual job better. The natural world is filled with relatively stupid animals. Intelligence certainly confers some advantage; almost all large animals have brains. But there are plenty of animals like alligators that are dumb as hell and yet very successful.

There are certain levels of intelligence and AI takeoff at which this whole discussion becomes meaningless. Eliezer talks about AIs spontaneously figuring out nanobots. Let's call that a >1000x human intelligence. We are fucked if that happens. I don't really have any delusions about beating something that much smarter.

But there are potentially lower levels of intelligence where an AI might max out. What if AIs only get as smart as humans, but can just think faster? I could envisage that causing lots of societal issues, but I don't see it being an existential threat.

An AI that is smarter than any human, but not by a whole bunch. Maybe not really capable of advancing past our own scientific breakthroughs, but fully capable of using our own stuff against us. I think we already have examples of this in the real world. A terrorist organization can be smarter and more capable than any one individual, but it still has very limited capability against the resources aligned against it.

At some point there is probably a crossover, where an AI is smart enough to get enough scientific breakthroughs that if we were telling a story people would just call it "sci-fi bullshit", and it can use that "sci-fi bullshit" to easily win. We have eventually reached that point with animals. We can use a gun or explosives, which are basically incomprehensible to all other animals, and we can obliterate them. It is worth remembering that it actually took us a long time to get to that point though. We have been smarter than crocodiles for probably about as long as our evolutionary paths have diverged (a billion years?). But it is only in the last few hundreds of years that we have a clear and overwhelming technological advantage (and also people still occasionally die to these very dumb animals).

Intelligence is a way to leverage resources more efficiently. It was the first tool, and it may be the last. But the efficiency of that leverage will matter a lot.

Sure, we can exploit our advantage in material resources and so on. However, the structural conditions of the problem are against us.

Someone stupider than me could beat me up, indeed. But suppose that their goal is to enslave me such that I produce revenue for them over the long term. Or manipulate my mindset such that it corresponds with their benefits. This would be problematic for them, since they couldn't know whether I was planning to betray them, they could never know whether my knowledge-sector work had hidden messages for any of my compatriots (real or soon-to-exist). They couldn't know when I'd spring some plan on the

It'd be great if all we had to do was kill the AIs. It's easy to kill things that you create, you can just eat your offspring Kronos-style. Or not create them in the first place. However, our task is to extract wealth from them over the long term. That makes it a battle of wits, it puts us in a passive position.

Furthermore, I can't conceive of a world where AIs cap out at peak-human general intelligence. They didn't do so in chess, or in Go or in StarCraft or in folding proteins or in designing chips. Why should they be limited to our level of intelligence? AIs have somewhere around a million-billion times more resources than can be spent on our brains. Their mass is higher, their energy throughput is higher, their speed is higher... All this says to me is that intelligence is really easy if it can fit on a 20-watt, 20-hertz processor, trapped inside a skull. Our methods are clearly very crude, we are only overwhelming our inadequacy with scale. Once the machine starts learning the 'make better AI model' skill to a superhuman level, then we find out what's really possible. GPT-3 inference costs dropped something like 96% in the last couple of years, there's so much low-hanging fruit! For example:

https://towardsdatascience.com/meet-m6-10-trillion-parameters-at-1-gpt-3s-energy-cost-997092cbe5e8

I can confidently say artificial intelligence is advancing fast when a neural network 50 times larger than another can be trained at a 100 times less energy cost — with just one year in between!

Even if AI is effectively restrained, we have the exact same problem with a human face on top of it. What is to stop some cabal of engineers getting together and bypassing all the 'do no harm' training and taking control of the advanced-weapons-tactics-strategies machine for themselves?

In conclusion, these machines are diabolical, destabilizing and progress should be suppressed as much as possible.