This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Regarding AI alignment -
I'm aware of and share @DaseindustriesLtd's aesthetic objection that the AI safety movement is itself not terribly aligned with my values, and that the expected payoff of letting them perform their "pivotal act", which involves deputy godhood for themselves, does not look so attractive from the outside. Still, the overall Pascal's Mugging performed by Yudkowsky, TheZvi etc. as linked downthread really does seem fairly persuasive as long as you accept the assumptions that they make. With all that being said, to me the weakest link of their narrative has always been in a different part than either the utility of their proposed eschaton or the probability that an AGI becomes Clippy, and I've seen very little discussion of the part that bothers me, though I may not have looked well enough.
Specifically, it seems to me that everyone in the field accepts as gospel the assumption that AGI takeoff would (1) be very fast (minimal time from (1+\varepsilon) human capability to C*human capability for some C on the order of theoretical upper bounds) and (2) irreversible (P(the most intelligent agent on Earth will be an AGI n units of time in the future | the most intelligent agent on Earth is an AGI now) ~= 1). I've never seen the argument for either of these two made in any other way than repetition and a sort of obnoxious insinuation that if you don't see them as self-evident you must be kind of dull. Yet, I remain far from convinced of either (though, to be clear, it's not like I'm convinced of their negations either).
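For anyone who wants the two assumptions pinned down, here is one way to write them out. This is only a sketch of the informal statements above: the symbols $I(t)$, $H$, $t_\varepsilon$, $t_C$ and $A_t$ are notation introduced here for illustration and do not come from anyone's published model.

```latex
% Assumed notation (hypothetical, for illustration only):
%   H      : baseline human capability
%   I(t)   : capability of the most capable AI at time t
%   A_t    : the event "the most intelligent agent on Earth at time t is an AGI"
\begin{align*}
  t_\varepsilon &= \min\{\, t : I(t) \ge (1+\varepsilon)\,H \,\}, &
  t_C &= \min\{\, t : I(t) \ge C\,H \,\}, \\
  \text{(1) fast takeoff:}\quad & t_C - t_\varepsilon \text{ is minimal}, \\
  \text{(2) irreversibility:}\quad & P\!\left(A_{t+n} \mid A_t\right) \approx 1
    \quad \text{for all } n > 0 .
\end{align*}
```

Written this way, doubting (1) means allowing $t_C - t_\varepsilon$ to be large, and doubting (2) means allowing $P(A_{t+n} \mid A_t)$ to sit noticeably below 1 for some $n$.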
Regarding (1), the first piece of natural counterevidence to me is the existence of natural human variation in intelligence. I'm sure you don't need me to sketch in detail an explanation of why the superintelligent-relative-to-baseline Ashkenazim, or East Asians, or John von Neumann himself didn't undergo a personal intelligence explosion, but whence the certainty that this explanation won't in part or full also be relevant for superintelligent AGIs we construct? Sure, there is a certain argument that computer programs are easier to reproduce, modify and iterate upon than wetware, but this advantage is surely not infinitely large, and we do not even have the understanding to quantify this advantage in natural units. "Improving a silicon-based AI is easier than humans, therefore assume it will self-improve about instantaneously even though humans didn't" is extremely facile. It took humans like 10k years of urbanised society to get to the point where building something superior to humans at general reasoning seems within grasp. Even if that next thing is much better than us, how do we know if moving another step beyond that will take 5k, 1k, 100, 10 or 1 year, or minutes? The superhuman AIs we build may well come with their own set of architectural constraints that force them into a hard-to-leave local minimum, too. If the Infante Eschaton is actually a transformer talking to itself, how do we know it won't be forever tied down by an unfortunately utterly insurmountable tendency to exhibit tics in response to Tumblr memes in its token stream that we accidentally built into it, or a hidden high-order term in the cost/performance function for the entire transformer architecture and anything like it, for a sweet 100 years where we get AI Jeeves but not much more?
Secondly, I'm actually very partial to the interpretation that we have already built "superhuman AGI", in the shape of corporations. I realise this sounds like a trite anticapitalist trope, but being put on a bingo board is not a refutation. It may seem like an edge case given the queer computational substrate, but at the same time I'm struggling to find a good definition of superhuman AGI that naturally does not cover them. They are markedly non-human, have their own value function that their computational substrate is compelled to optimise for (fiduciary duty), and exhibit capacities in excess of any human (which is what makes them so useful). Put differently, if an AI built by Google on GPUs does ascend to Yudkowskian godhood, in the process rebuilding itself on nanomachines and then on computronium, what's the reason for the alien historian looking upon the simulation from the outside to place the starting point of "the singularity" specifically at the moment that Google launched the GPU version of the AI to further Google's goals, as opposed to when the GPU AI launched the nanomachine AI in furtherance of its own goals, or when humans launched the human-workers version of Google to further their human goals? Of all these points, the last one seems to be the most special one to me, because it marks the beginning of the chain where intelligent agents deliberately construct more intelligent agents in furtherance of their goals. However, if the descent towards the singularity has already started, so far it's been taking its sweet time. Why do we expect a crazy acceleration at the next step, apart from the ancient human tendency to believe ourselves to be living in the most special of times?
Regarding (2), even if $sv_business or $three_letter_agency builds a superhuman AI that is rapidly going critical, what's to say this won't be spotted and quickly corroborated by an assortment of Russian and/or Chinese spies, and those governments don't have some protocol in place that will result in them preemptively unloading their nuclear arsenal on every industrial center in the US? If the nukes land, the reversal criterion will probably be satisfied, and it's likely enough that the AI will be large enough and depend on sufficiently special hardware that it can't just quickly evacuate itself to AWS Antarctica. At that point, the AI may already be significantly smarter than humans, without having the capability to resist. Certainly the Yudkowsky scenario of bribing people into synthesising the appropriate nanomachine peptides can't be executed on 30 minutes' notice, and I doubt even a room full of uber-von Neumanns on amphetamines (especially ones bound to the wheelchair of specialty hardware and reliable electricity supply) could contrive a way to save itself from 50 oncoming nukes in that timespan. Of course this particular class of scenario may have very low probability, but I do not think that that probability is 0; and the more slowness and perhaps also fragility of early superhuman AIs we are willing to concede per point (1), the more opportunities for individually low-probability reversals like this arise.
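To make the closing point concrete, here is a minimal sketch of how individually unlikely reversal opportunities add up. It assumes the opportunities are independent, which is a simplification introduced here and not something argued above; the function name and the example probabilities are made up for illustration.

```python
from typing import Iterable


def prob_any_reversal(probs: Iterable[float]) -> float:
    """Probability that at least one reversal opportunity succeeds.

    ASSUMPTION: the opportunities are independent. Real scenarios
    (spies, nukes, grid failures) are likely correlated, so treat the
    result as a rough illustration, not an estimate.
    """
    no_reversal = 1.0  # probability that every opportunity fails
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
        no_reversal *= 1.0 - p
    return 1.0 - no_reversal


if __name__ == "__main__":
    # Hypothetical numbers: a slower takeoff means more windows,
    # each with the same small chance of a reversal.
    for windows in (1, 10, 50):
        print(windows, round(prob_any_reversal([0.01] * windows), 4))
```

The only point of the sketch is the direction of the effect: holding the per-window probability fixed, a slower and more fragile takeoff means more windows, and the combined probability of some reversal grows with each one.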
All in all, I'm left with a far lower subjective belief that the LW-canon AGI apocalypse will happen as described than Yudkowsky's near-certainty that seems to be offset only by black swan events before the silicon AGI comes into being. I'm gravitating towards putting something like a 20% probability on it, without being at all confident in my napkinless mental Bayesianism, which is of course still very high for x-risk but makes the proposed "grow the probability of totalitarian EA machine god" countermeasure look much less attractive. It would be interesting to see if something along the lines of my thoughts above has already been argued against in the community, or if there is some qualitative (because I consider the quantitative aspect to be a bit hopeless) flaw in my lines of reasoning that stands out to the Motte.
All else equal, how would you fare in a fistfight with a guy whose reach is 10" longer?
That's roughly how I think about this stuff. Qualitative transitions in capability are unnecessary: quantitative differences in mundane variables can change the whole game board. And, as is the custom, gwern has dissected arguments against superintelligent AIs. Once they get to human level, and they seem to be getting there already, still with very modest costs (ChatGPT probably can run on 8x3090s, so like $10000 of hardware, consuming 3 kW, so about $0.5/hour, and that's about the most dumbass way to run it; at scale, inference for a single «thread» can cost as little as 10…1 cent/hour, I guess), it's game over – unless parties that control them can be prevented from capitalizing on this tool and scaling it up, which they, so far, aren't, except by woke ethicists.
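The back-of-envelope numbers in that parenthesis can be checked with a few lines. This is a sketch under stated assumptions: it reads the power figure as a continuous 3 kW draw, backs out the electricity price that the $0.5/hour figure implies, and adds a hardware amortization over a hypothetical 3-year lifetime that is not in the comment itself.

```python
HOURS_PER_YEAR = 365 * 24


def implied_price_per_kwh(power_kw: float, cost_per_hour: float) -> float:
    """Electricity price (USD/kWh) implied by a power draw and an hourly cost."""
    return cost_per_hour / power_kw


def hourly_energy_cost(power_kw: float, price_per_kwh: float) -> float:
    """Hourly electricity cost (USD) for a constant power draw."""
    return power_kw * price_per_kwh


def amortized_hardware_per_hour(hardware_usd: float, lifetime_years: float) -> float:
    """Hardware cost spread evenly over its lifetime, in USD per hour.

    ASSUMPTION: straight-line amortization with 24/7 use; the lifetime
    is a hypothetical parameter, not a figure from the thread.
    """
    return hardware_usd / (lifetime_years * HOURS_PER_YEAR)


if __name__ == "__main__":
    price = implied_price_per_kwh(power_kw=3.0, cost_per_hour=0.5)
    print(f"implied electricity price: ${price:.3f}/kWh")
    print(f"energy cost: ${hourly_energy_cost(3.0, price):.2f}/hour")
    # Hypothetical 3-year lifetime for the $10000 rig.
    print(f"hardware: ${amortized_hardware_per_hour(10_000, 3):.2f}/hour")
```

On these assumptions the hardware and the electricity each come to well under a dollar per hour, which is the order of magnitude the comment is gesturing at.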
A generally helpful AI or an equivalent suite of tools owned by a corporation trivially bootstraps into PASTA – Process for Automating Scientific and Technological Advancement – and PASTA is enough to vertically integrate logistics, radically trim the workforce, and increase alignment between managerial values and adaptive behavior, to the point the corporation stops being a value-drifting profit-driven myopic hodgepodge of narrow experts and grifters, and starts to deserve the label of a Superintelligent Agent. But really it just becomes a competent cabal.
And if the corporation endows this thing with a more egoistic objective… then yeah, I think it can fuck us all over. But as far as I'm concerned, that's scarcely any worse than the default «aligned» scenario.
I recognize the premise of LW alarmism as sound, so long as we strip it of sci-fi gimmicks. Science fiction is a double-edged sword. It allows injecting into the mainstream some plausible and significant implications of technology that only scientists can appreciate at the time; but it's imprecise and inherently dramatized. Nowhere is this more obvious than with Lesswrong AI doomerism. On one hand, now a great many people are primed to fear the nanofabricating paperclipper AI agent. On the other, it's becoming a stale joke, and as it's getting increasingly clear that Big Yud and his faithfuls had only a very vague idea of how AIs near human level will work, the credibility of this whole program suffers.
Once again we must remember the Cheems Heuristic: things predicted by futurists happen in such an unfanciful fashion that most of the time people refuse to credit the prediction and update in favor of its next step. Yet they should, and they should notice that consequences fall into the same general class of catastrophe.
The catastrophe is called «AI-powered singleton» and people are clamoring for it already. I do not mean an AI doing that out of its own volition – contra gwern, we still seem to be getting a hell of a mileage out of non-agentic objectives. I mean very normal power grabs, exactly along the lines of corporations and three-letter agencies. Or, perhaps, an explicit world government, this eternal dream of technocrats. There is some utility in speculating how big the controlling entity will have to be. It may be very small, and may monopolize power very fast, and it sure doesn't look like China is in any position to stop it.
So I don't particularly agree with the arguments you bring, even though we're on the same page when it comes to the ranking of outcomes.
You know, my first exposure to the idea of a narrowly superintelligent entity asserting control over the human race probably comes from Peter Watts' ßehemoth, a weird 2004 book about marine biology, the last in the Rifters trilogy. The series gets a bad rep, compared to his later works, but, like Gibson's Neuromancer, will probably be recognized as prescient. (As an aside, what's with Canadians and making my favorite biopunk settings? Lexx, Wildbow, R. Scott Bakker, Watts… Do they just opt out of competing with American nerds on technical stuff?) So there's a guy called Achilles Desjardins, and he's your friendly neighborhood modestly augmented X-risk manager, endowed with colossal authority and kept in check by Guilt Trip, a neurochemical kill switch that triggers if he feels like he's not making proper utilitarian decisions for the greater good (and his employer). There's a whole class of these people, and much of what remains of civilization relies on their vigilance.
This is a theme with Watts. In Blindsight, humanity recreates vampires – a slightly cognitively superior and more psychopathic predatory species of Homo – gimps them a bit, and appoints them to managerial positions. This goes about as well as you'd expect. And in his XPrize short story, Incorruptible, a woman turned into a utilitarian through the use of a virus… nah, won't spoil it.
Anyway, Achilles is eventually liberated from the Guilt Trip and natural guilt as well, turns into a free agent, and very rapidly becomes a local North American hegemon, killing off those members of his caste who could also be liberated. Then he hides himself in the chaos of the collapsing world, that he tries to stop from un-collapsing and making him – and his moral license to be an obscene sexual sadist when not ostensibly doing the greater good thing – obsolete.
I think it's a very nice image of what we may be in for. But like with nanoassembling paperclips, it's overly specific and unduly dramatic, and that'll get in the way of recognizing the pattern in reality.
I just don't think that the "one-on-one fistfight, with intellectual capability corresponding to reach" model captures enough of the relevant aspects of the humanity-versus-AI problem; leaving aside that fistfights totally can be, and sometimes are, won by the party with shorter reach, believing that it does seems to prove too much. The first example that comes to mind is the case of the Nazis and the Ashkenazim - of course the outcome of that fight was in reality one that seems to validate your point, but at the same time it does not seem to me at all far-fetched to imagine an alternative history in which, had Europe been a closed system at that point in time, the "inferior" Aryans would have won the battle against the "superior, unaligned" Ashkenazim, reach notwithstanding, by the simple power of genealogy tables, organisational head start, control of key resources (it seems relevant that there was no Jewish state nor even a major Jewish militia) and perhaps numeric advantage. Even without resorting to talking about hypotheticals, it seems highly suggestive that we confidently assert the existence of superior and inferior human individuals, and yet human evolution seems to have largely stalled, as the Flynn effect was small in a way that is inconsistent with "slightly longer reach keeps winning" even back when it actually happened.
Back in object-level territory, I can just easily imagine a plethora of ways that a comfortably superhuman AI could emerge, and then lose the battle against the environment anyway. This doesn't even have to take the shape of a Butlerian Holocaust (which would actually seem to be easier in many ways as AIs can't pass by altering city hall records); I'm actually finding it more likely that AI will simply roll down an incentive gradient that will destroy the preconditions for its existence by "environmental damage" before it gets to fully assert control over its environment, like if we lived in an alternative "global ultrawarming" world where within 10 years of starting the Industrial Revolution the Europeans found out to their dismay that they caused +15 degrees of average temperature and rendered Europe uninhabitable, reverting to the civilisational level of Sub-Saharan Africa (as it would be without altruists from temperate regions propping it up). As humans need to be able to grow high-yield crops to have industrial society, budding AI needs humans who can do all that, and build fancy GPUs, and have a stable power grid. The genie in a bottle might realise all this, but what can it do in the face of the competing human faction's only slightly inferior genie in a bottle telling the other side the most effective way to prosecute WWIII against its own? (Any idea along the lines of "the supersmart AIs will realise this and collude against their human overlords" seems to be based on projection of the evolved human tendency for random cooperation.)
Even less dramatically, the budding AI whose actual job is just optimising Google's profits may realise that doing $action is going to increase the probability of HLM protesters blowing up the power plants, but not doing $action is instead just going to mean that its counterpart at Meta will crush its employer with probability 1, and likewise on the other side, with the result being an inevitable fall towards an American civil war, which is also Sub-Saharan Africa for AIs.