Culture War Roundup for the week of November 28, 2022

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Regarding AI alignment -

I'm aware of and share @DaseindustriesLtd's aesthetic objection that the AI safety movement is itself not terribly aligned with my values, and that the payoff expectation of letting them perform their "pivotal act", which involves deputy godhood for themselves, does not look so attractive from the outside. Still, the overall Pascal's Mugging performed by Yudkowsky, TheZvi etc. as linked downthread really does seem fairly persuasive as long as you accept the assumptions that they make. With all that being said, to me the weakest link of their narrative has always been in a different part than either the utility of their proposed eschaton or the probability that an AGI becomes Clippy, and I've seen very little discussion of the part that bothers me, though I may not have looked well enough.

Specifically, it seems to me that everyone in the field accepts as gospel the assumption that AGI takeoff would be (1) very fast (minimal time from (1+ε) × human capability to C × human capability, for some C on the order of theoretical upper bounds) and (2) irreversible (P(the most intelligent agent on Earth will be an AGI n units of time in the future | the most intelligent agent on Earth is an AGI now) ≈ 1). I've never seen the argument for either of these two made in any other way than repetition and a sort of obnoxious insinuation that if you don't see them as self-evident you must be kind of dull. Yet, I remain far from convinced of either (though, to be clear, I'm not convinced of their negations either).

Regarding (1), the first piece of natural counterevidence to me is the existence of natural human variation in intelligence. I'm sure you don't need me to sketch in detail an explanation of why the superintelligent-relative-to-baseline Ashkenazim, or East Asians, or John von Neumann himself didn't undergo a personal intelligence explosion, but whence the certainty that this explanation won't in part or full also be relevant for superintelligent AGIs we construct? Sure, there is a certain argument that computer programs are easier to reproduce, modify and iterate upon than wetware, but this advantage is surely not infinitely large, and we do not even have the understanding to quantify this advantage in natural units. "Improving a silicon-based AI is easier than humans, therefore assume it will self-improve about instantaneously even though humans didn't" is extremely facile. It took humans like 10k years of urbanised society to get to the point where building something superior to humans at general reasoning seems within grasp. Even if that next thing is much better than us, how do we know if moving another step beyond that will take 5k, 1k, 100, 10 or 1 year, or minutes? The superhuman AIs we build may well come with their own set of architectural constraints that force them into a hard-to-leave local minimum, too. If the Infante Eschaton is actually a transformer talking to itself, how do we know it won't be forever tied down by an unfortunately utterly insurmountable tendency to exhibit tics in response to Tumblr memes in its token stream that we accidentally built into it, or a hidden high-order term in the cost/performance function for the entire transformer architecture and anything like it, for a sweet 100 years where we get AI Jeeves but not much more?

Secondly, I'm actually very partial to the interpretation that we have already built "superhuman AGI", in the shape of corporations. I realise this sounds like a trite anticapitalist trope, but being put on a bingo board is not a refutation. It may seem like an edge case given the queer computational substrate, but at the same time I'm struggling to find a good definition of superhuman AGI that naturally does not cover them. They are markedly non-human, have their own value function that their computational substrate is compelled to optimise for (fiduciary duty), and exhibit capacities in excess of any human (which is what makes them so useful). Put differently, if an AI built by Google on GPUs does ascend to Yudkowskian godhood, in the process rebuilding itself on nanomachines and then on computronium, what's the reason for the alien historian looking upon the simulation from the outside to place the starting point of "the singularity" specifically at the moment that Google launched the GPU version of the AI to further Google's goals, as opposed to when the GPU AI launched the nanomachine AI in furtherance of its own goals, or when humans launched the human-workers version of Google to further their human goals? Of all these points, the last one seems to be the most special one to me, because it marks the beginning of the chain where intelligent agents deliberately construct more intelligent agents in furtherance of their goals. However, if the descent towards the singularity has already started, so far it's been taking its sweet time. Why do we expect a crazy acceleration at the next step, apart from the ancient human tendency to believe ourselves to be living in the most special of times?

Regarding (2), even if $sv_business or $three_letter_agency builds a superhuman AI that is rapidly going critical, what's to say this won't be spotted and quickly corroborated by an assortment of Russian and/or Chinese spies, and those governments don't have some protocol in place that will result in them preemptively unloading their nuclear arsenal on every industrial center in the US? If the nukes land, the reversal criterion will probably be satisfied, and it's likely enough that the AI will be large enough and depend on sufficiently special hardware that it can't just quickly evacuate itself to AWS Antarctica. At that point, the AI may already be significantly smarter than humans, without having the capability to resist. Certainly the Yudkowsky scenario of bribing people into synthesising the appropriate nanomachine peptides can't be executed on 30 minutes' notice, and I doubt even a room full of uber-von Neumanns on amphetamines (especially ones bound to the wheelchair of specialty hardware and reliable electricity supply) could contrive a way to save itself from 50 oncoming nukes in that timespan. Of course this particular class of scenario may have very low probability, but I do not think that that probability is 0; and the more slowness and perhaps also fragility of early superhuman AIs we are willing to concede per point (1), the more opportunities for individually low-probability reversals like this arise.
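
The closing point about individually unlikely reversals adding up can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that the reversal opportunities are independent and equally likely; the 1% figure and the function name are hypothetical and not estimates taken from anywhere.

```python
def prob_at_least_one(p_each: float, n_opportunities: int) -> float:
    """Chance that at least one of n independent events occurs,
    when each one has probability p_each (illustrative assumption)."""
    return 1.0 - (1.0 - p_each) ** n_opportunities


# Hypothetical numbers: each reversal opportunity has a 1% chance of working.
for n in (1, 10, 50, 100):
    print(f"{n:>3} opportunities -> {prob_at_least_one(0.01, n):.3f}")
```

Under these assumptions the combined chance grows quickly with the number of opportunities, which is why a slower or more fragile takeoff matters for point (2) even if no single scenario is likely.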

All in all, I'm left with a far lower subjective belief that the LW-canon AGI apocalypse will happen as described than Yudkowsky's near-certainty that seems to be offset only by black swan events before the silicon AGI comes into being. I'm gravitating towards putting something like a 20% probability on it, without being at all confident in my napkinless mental Bayesianism, which is of course still very high for x-risk but makes the proposed "grow the probability of totalitarian EA machine god" countermeasure look much less attractive. It would be interesting to see if something along the lines of my thoughts above has already been argued against in the community, or if there is some qualitative (because I consider the quantitative aspect to be a bit hopeless) flaw in my lines of reasoning that stands out to the Motte.

Even if that next thing is much better than us, how do we know if moving another step beyond that will take 5k, 1k, 100, 10 or 1 year, or minutes? The superhuman AIs we build may well come with their own set of architectural constraints that force them into a hard-to-leave local minimum, too. If the Infante Eschaton is actually a transformer talking to itself, how do we know it won't be forever tied down by an unfortunately utterly insurmountable tendency to exhibit tics in response to Tumblr memes in its token stream that we accidentally built into it, or a hidden high-order term in the cost/performance function for the entire transformer architecture and anything like it, for a sweet 100 years where we get AI Jeeves but not much more?

The argument that is convincing to me is that once an AI is as good at reasoning as us (which should be possible, as we are likely not extraphysical beings), the advantage it has is time. With generous hardware scaling, even if we can't make it straight up better at reasoning, we can give it a thousand human lifetimes a second, during which its memory doesn't decay at all, to try and do a better job than we did at designing an AI. By my estimates you vastly underrate this kind of scaling.
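
For a sense of scale, "a thousand human lifetimes a second" can be converted into a speedup factor over one human thinking in real time. This is a back-of-the-envelope sketch; the 80-year lifetime is an assumption made here, not a figure from the comment.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def implied_speedup(lifetimes_per_second: float, years_per_lifetime: float = 80.0) -> float:
    """Subjective seconds of thinking per wall-clock second.

    years_per_lifetime is an assumed value, chosen only for illustration.
    """
    return lifetimes_per_second * years_per_lifetime * SECONDS_PER_YEAR


print(f"{implied_speedup(1000):.2e}")  # speedup factor for 1000 lifetimes per second
```

With these assumptions the factor is on the order of 10^12, so the claim amounts to roughly twelve orders of magnitude of speedup beyond human-level hardware.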

I think AI really does need to be better at reasoning. For instance, if you give me a thousand lifetimes, or, to make it more ridiculous, give a dog a million lifetimes, I wouldn't expect a theory of quantum gravity out of it. Some problems are just too hard to solve.

Firstly, I think it's likely that the first AI that we build that attains "human-level reasoning" (in whatever rough measure of "reasoning per unit of time") will be pretty close to at least a local maximum of compute capabilities, and won't easily be scaled up by a factor of 1000 overnight. Secondly, I'm not quite convinced that even if that scaling-up were possible, this would necessarily translate to world-shattering capability, because the object in question is still a lone AI, not corporeal and facing an organised society of humans that are primed to distrust it and control the power switch. I'm not so sure that the Hitler head in a jar, where the jar also runs on very sensitive and supply-chain-dependent equipment, could be reliably expected to take over the world even if it were given a 1000:1 computation speed advantage and perfect memory; the "find the right sequence of words to sway the heart of any mortal with 100% certainty" trope seems oversold to me. I'm aware of Eliezer's old "I'll persuade you to unbox me" experiments, too, but those to me seemed like an unrealistic model of the problem in question. (Maybe if several people not participating in the chat also at all times had the option to go and permanently delete Eliezer with minimal personal consequences, and the twitchy finger to do so based on observations like "this guy who said he was going to talk to Jar Hitler is taking far too long"...)

Of course this is all probabilistic, but I explained in a parallel subthread why I take even low-probability ways in which the whole thing could fail to work out to be important. To break my acceptance of the MIRI agenda, it is sufficient to establish that the probability of our current path towards runaway AGI culminating in its success is significantly lower than 90something%.

Firstly, I think it's likely that the first AI that we build that attains "human-level reasoning" (in whatever rough measure of "reasoning per unit of time") will be pretty close to at least a local maximum of compute capabilities, and won't easily be scaled up by a factor of 1000 overnight.

Why? None of the current neural networks represent a maximum of compute for their host company, or are even within an order of magnitude of it.

I realise that the statement was a bit facile, but in concrete terms arbitrary scaling doesn't actually seem to be a problem that has been solved for deep learning so far, and given the advances that were made without it, it's not clear that it will be by the time we reach the human level. Here, for instance, is OpenAI talking about the difficulties with the distributed training process they've set up. It seems to be bounded by overheads that are nonlinear in the number of machines, which in turn generate demand for state on each machine, and that state is itself running up against the limits of the RAM and VRAM available on single machines with modern hardware. If that's the issue, then the existence of hundreds of thousands more nodes at Azure (if ones with the right kind of hardware indeed exist) may not matter, because you could not make them train the same network in parallel.

On the other hand, one could imagine that the "continued learning" process of the hypothetical superhuman AI would not involve further training of the network but instead some other more legible mechanism, such as populating a database of facts. In that case, however, it would start exhibiting scaling problems that very much resemble the scaling problems of meat humans: you can easily improve 'software' like memes and theories but not 'hardware' like brain architecture (which, for the AI, would be the weights and the design of the network), and the 'software' has soft limits to possible returns. Also, we still haven't really dealt with the problem of running a trained instance of an AI in a distributed fashion rather than on a single machine. So even if the AI can acquire lots of compute nodes that are good enough to run one copy (no guarantee; easily hacked Chinese toasters don't come with A100s, and my impression is that when you go on cloud services nowadays the really high-end GPU options all have low availability, implying that they are not particularly overprovisioned), all it could do would be to run autonomous copies of itself on them. Those copies would have to coordinate through some channel that is much more bounded than "share brain state", like a collective of humans who have no better option than to talk to each other.
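
The point about overheads that are nonlinear in the number of machines can be illustrated with a toy cost model. Everything in it is an assumption made for illustration: the quadratic overhead term, the constants, and the function names do not describe OpenAI's actual setup or any real cluster.

```python
def step_time(n_machines: int, work: float = 1.0,
              overhead: float = 1e-6, exponent: float = 2.0) -> float:
    """Toy time per training step: the compute is split evenly across
    machines, while coordination cost grows as n ** exponent (assumed)."""
    return work / n_machines + overhead * n_machines ** exponent


def best_cluster_size(max_machines: int = 10_000) -> int:
    """Brute-force search for the machine count that minimises step_time."""
    return min(range(1, max_machines + 1), key=step_time)


best = best_cluster_size()
print(best, step_time(best), step_time(10_000))
```

In this toy model the optimum sits at a few dozen machines, and a cluster of 10,000 is far slower than the optimum. That is the sense in which extra nodes may not help if they cannot be made to train the same network efficiently.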

I think that's really an underappreciated factor: a computer already "thinks" faster than a human can get through the synaptic chain of pressing a button. In the time it takes to blink, a GPU can pull off, thousands of times over, the crazy algebra required to make a texture not warp when applied to a polygon. We're reaching the limit of how small we can make transistors, but what we have now is damn good, and of course, you can always just bolt more hardware on.

I can also do that. It's called my imagination.

Just because I'm not thinking in algebra doesn't mean my brain isn't doing it.