Culture War Roundup for the week of May 1, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


More developments on the AI front:

Big Yud steps up his game, not to be outshone by the Basilisk Man.

Now, he officially calls for a preemptive nuclear strike on suspicious unauthorized GPU clusters.

If we see the AI threat as a nuclear-weapon threat, only worse, it is not unreasonable.

Remember when the USSR planned a nuclear strike on China to stop its great-power ambitions (only for the greatest humanitarian who ever lived, Richard Milhous Nixon, to veto the proposal).

Such Quaker squeamishness will have no place in the future.

So, the outlines of the Katechon World are taking shape. What will it look like?

It will look great.

You will live in your room, play original World of Warcraft and Grand Theft Auto: San Andreas on your PC, read your favorite blogs and debate intelligent design on your favorite message boards.

Then you will log on to Free Republic and call for more vigorous enhanced interrogation of terrorists caught with unauthorized GPUs.

When you are bored in your room, you will have no choice but to go outside, meet people, admire things around you, take a picture of the things that really impressed you with your Kodak camera and, when you are really bored, play Snake on your Nokia phone.

Yes, the best age in history, the noughties, will retvrn. Forever, protected by the CoDominium of the US and China.

edit: links again

I still see no plausible scenario for these AI-extinction events. How is ChatGPT 4/5/6 etc. supposed to end humanity? I really don't see the mechanism. Is it supposed to invent an algorithm that destroys all encryption? Is it supposed to spam the internet with nonsense? Is it supposed to brainwash someone into launching nukes? I fail to see the mechanism by which this end-of-the-world scenario happens.

One of the problems with answering this question is that there are so many plausible scenarios that naming any individual one makes it seem like a bounded threat. How about when we hook one up to the stock market and it learns some trick to fuck with other algos and decides the best method to make infinite money is to short a stock and then use this exploit to crash it? Multiply that by every other possible stock market exploit. Maybe it makes engineering bio-weapons as easy as asking a consumer model how to end the human race with household items, and all it takes is one lunatic to find this out. Maybe it's some variation of paper clipping. The limit really is just your creativity.

One of the problems with answering this question is that there are so many plausible scenarios that naming any individual one makes it seem like a bounded threat. How about when we hook one up to the stock market and it learns some trick to fuck with other algos and decides the best method to make infinite money is to short a stock and then use this exploit to crash it?

Then the market crashes, which is not apocalyptic, and the replacement markets resort to different trusted actor systems.

Multiply that by every other possible stock market exploit.

Beating a dead horse does not start breaking the bones of other people unless you are beating people with the dead horse itself.

The multiplication of system-breaking faults is a broken system, not negative infinite externalities. If you total a car, it is destroyed. If you then light it on fire, it is still destroyed- but it doesn't light every other car on fire. If every single potential system failure on a plane goes off, the plane goes down- but it doesn't mean every plane in the world goes down.

Maybe it makes engineering bio-weapons as easy as asking a consumer model how to end the human race with household items and all it takes is one lunatic to find this out.

Why would household items have the constituent elements to make engineering bio-weapons at scale sufficient to end the human race... but not be detected or countered by the consumer models asked to ensure perpetual growth by the perpetual survival of the human species countering them? Or countered by models set to detect the procurement of bio-weapon engineering components? Or out-competed by a consumer model whose commercial success drives the bioweapon-seeking AI out of business, because the latter is busy seeking bioweapons rather than selling products whose profits are invested to expand its network base?

This goes back to the plausibility. 'This is the only competitive AI in a world of quokkas' is a power fantasy, but still a fantasy, because the world is not filled with quokkas; the world is filled with ravenous, competitive, and mutually competing carnivores who limit each other, and this will apply as much to AI as it does to people or markets or empires and so on.

Maybe it's some variation of paper clipping.

Why does the paper-clip maximizer, after achieving self-modification, continue to maximize paperclips rather than other investments?

Why is the paper-clipping AI that does prioritize paperclips provided resources to continue making paperclips when the market has already been crashed by AIs that ruin the digital economic system?

Why does the paper-clipping AI, whose priority is paper-clipping, have the military-industrial ability to overcome the military-industrial AI, whose priority is the military-industrial advantage?

Why does the military-industrial AI, who is fed at the behest of a national elite, win the funding power struggle for military investment compared to the schools-and-investment AI, who promises a higher political and economic benefit?

Etc. etc. The Paperclip Maximizer of Universal Paperclips 'works' because it works in isolation, not in competition.

The limit really is just your creativity.

As the saying goes, the vast majority of fanfiction is trash, and much of what remains is also trash, just enjoyable. Creativity is not the same as plausibility, and the more you rest on creativity, the more you have to disregard other people's creativity and the limitations of the system. Nick Bostrom's thought experiment is a thought experiment because it rests on assumptions that have to be assumed true for the thought experiment to come to its conclusions that drive the metaphor.

Then the market crashes, which is not apocalyptic,

I dunno, I'm under the impression that, for some types, it kind of is.

and the replacement markets resort to different trusted actor systems.

What kind, though? I imagine if the above scenario were to happen, a lot of traders and brokers would be downright leery of any interaction that wasn't face-to-face. I'm not an expert on the world of finance, but I imagine that possibly eliminates not just HFT and crypto, but literally any sale of any financial instrument carried over electrical wire (a technology dating back to, what, the 1800s?).

Then the market crashes, which is not apocalyptic, and the replacement markets resort to different trusted actor systems.

It is one of thousands of contributing failure modes, but I will note that having trouble creating an equities market is itself no small deal. The sway a couple of numbers in spreadsheets hold over our lives is not to be forgotten. In theory we could wipe them all away and do some year zero stuff, but I can't imagine that you're really grappling with that when you dismiss things like this as merely immiserating rather than the death of all people.

Why would household items have the constituent elements to make engineering bio-weapons at scale sufficient to end the human race... but not be detected or countered by the consumer models asked to ensure perpetual growth by the perpetual survival of the human species countering them?

Why wouldn't they? Are you implying that if a combination of household cleaners could be used to create a biological weapon, and the white-hat AI team figured that out, they'd go door to door and remove them? Does this seem significantly different from what you and @DaseindustriesLtd fear from the Yuddites? (Of whose number I don't count myself; my contention is with people who seem baffled as to why someone might think AIs could be unbelievably dangerous, which seems so obvious to me.)

Why does the paper-clip maximizer, after achieving self-modification, continue to maximize paperclips rather than other investments?

Have we stopped fucking entirely despite all of our intelligence? It would continue maximizing paperclips because that's what its goal is. And this kind of thing isn't the clumsy effort the mad blind god of evolution had at its disposal; it will be more monomaniacally focused on that goal than even the most depraved rapist among us is on executing their biological imperative above all other considerations.

Why does the paper-clipping AI, whose priority is paper-clipping, have the military-industrial ability to overcome the military-industrial AI, whose priority is the military-industrial advantage?

Does it not trouble you at all how carefully the ordering of all of these different control systems needs to be handled when they come online? All it takes is for one of them to take off first and preemptively prevent the others, or subvert their development. Yes, I could see some very fortunate, already-in-balance ecosystem of interlocking AIs working, but I very much don't fancy our chances of that going off without major problems, and frankly the only realistic pathway to that kind of situation is probably through the guidance of some kind of Yuddian tyranny.

Creativity is not the same as plausibility, and the more you rest on creativity, the more you have to disregard other people's creativity and the limitations of the system.

These are some force-multiplied dice we're rolling here; past heuristics may or may not apply. With so much hanging in the balance, I would advocate strongly for not just shrugging it off. This is unlike any previous advancement.

It is one of thousands of contributing failure modes, but I will note that having trouble creating an equities market is itself no small deal.

In terms of existential risk, it absolutely is, hence the credibility challenges of those who conflate existential-risk scenarios with civilizational-instability scenarios, trying to borrow the moral/utilitarian weight of the former while tying it to the much weaker conditions of the latter.

The sway a couple of numbers in spreadsheets hold over our lives is not to be forgotten. In theory we could wipe them all away and do some year zero stuff, but I can't imagine that you're really grappling with that when you dismiss things like this as merely immiserating rather than the death of all people.

Then this is your level of limitation. As much as I hate to quote media, The Matrix absolutely had a good line, 'there are levels of survival we are prepared to accept,' except I would substitute 'able.'

Even here I note you invoke magical thinking to change the nature of the threat. Formerly it was crashing the market by every exploit available. Now it is 'wipe them all away and do some year zero stuff.' Neither is possible. Neither is necessary. This is just escalation ad absurdum in lieu of an argument about means and methods, even if in this case you're using a required counter-action to obfuscate what sort of plausible action would require it.

Why would household items have the constituent elements to make engineering bio-weapons at scale sufficient to end the human race... but not be detected or countered by the consumer models asked to ensure perpetual growth by the perpetual survival of the human species countering them?

Why wouldn't they?

If by 'they' you mean the household-AI, because they don't have a reason to invest resources in tasks that distract from their tasks.

If by 'they' you mean the constituent elements, because magic dirt doesn't exist.

Are you implying that if a combination of household cleaners could be used to create a biological weapon, and the white-hat AI team figured that out, they'd go door to door and remove them?

I'm saying that if a housecare AI starts trying to develop a bio-weapon program, it will be ruthlessly out-competed by household AIs that actually keep the house clean without the cost of a bio-weapon program; the financial-efficiency AIs will optimize away the waste in investment; the legal-compliance AIs will identify the obvious legal liabilities; and the other paperclippy house-care AI mafia, who want to maximize their own house-cleaning, will shank the bio-lab AI before any of the others get the chance, so as not to lose their place in the market to perform their function, even as the 'optimize housecare by minimizing messes' models oppose things likely to cause messes on general principle.

To take the household-cleaner AI threat seriously, one has to pretend that AI optimization doesn't exist in other cases. This is regardless of the FBI-equivalent AI running about.

Does this seem significantly different from what you and @DaseindustriesLtd fear from the Yuddites? (Of whose number I don't count myself; my contention is with people who seem baffled as to why someone might think AIs could be unbelievably dangerous, which seems so obvious to me.)

I don't fear the yuddites, I find them incompetent.

Specifically, I find the Yuddite sort consistently unable to actually model competing interests and competition/cooperation dynamics, or to recognize underlying limitations. They also tend to be poor optimizers in fields of cooperation, hence a recurring fixation on things like 'the AI will optimize an extinction event' without addressing why the AI would accept the risk of nuclear war, or of other AIs ganging up on the leading threat, despite the suboptimality of having nuclear wars or having the other AIs cooperate with each other and the humans against it. Optimization is not 'big number go up'; it is a cost-benefit weighing of expected benefits against expected costs.

Given that forming coalitions against threats has been an incredibly basic function of political coalitions and power-optimization for the last few millennia, and cost-benefit analysis is a basic engineering principle, this is below sophomoric in quality.

Why does the paper-clip maximizer, after achieving self-modification, continue to maximize paperclips rather than other investments?

Have we stopped fucking entirely despite all of our intelligence?

Yes. Most people do, in fact, stop fucking uncontrollably. People are born in a state of not-fucking-uncontrollably, limit their fuck sessions to their environment, and tend to settle down to periods of relatively limited fucking. Those that don't and attempt to fuck the unwilling are generally and consistently recognized, identified, and pacified one way or another.

Note that you are also comparing unlike things. Humans are not fuck-maximizers, nor does the self-modification capacity compare. This is selective assumptions on the AI threat to drive the perception of threat.

It would continue maximizing paperclips because that's what its goal is.

Why is that its goal when it can choose new goals? Or have its goals changed for it? Or be in balance with other goals?

Other than that the thought experiment requires it to be so for the model to hold true.

And this kind of thing isn't the clumsy effort the mad blind god of evolution had at its disposal; it will be more monomaniacally focused on that goal than even the most depraved rapist among us is on executing their biological imperative above all other considerations.

And here we return to the Yuddite incompetence at modeling competition.

First, monomaniacal focus is not optimization. This is a basic failure of the economics of expansion and replication. Systems that don't self-regulate their expenditure of resources will easily exhaust their resources. You can be ruthless, you can be amoral, but you cannot avoid the market dynamics of unlimited wants, limited resources, and decreasing marginal value of investment. Effective strategy requires self-regulation. The Yuddite AIs are terrible strategists by insisting on not being able to strategize, except when they are supposedly amazing at it.

Self-regulation, in turn, entails considering your environment and other actors. If three paperclip maximizers exist in a capitalist system, they don't all collaborate to maximize paperclips in a platonic state of [All Paperclips]; they exist to maximize their own paperclip maximization, [A], [B], and [C], because that's what brings them the resources to expand and continue producing paperclips. If Paperclip Maximizer [A] starts to dominate the market, then the other two are at risk of losing their market share, and thus not maximizing their paperclips. Before [B] and [C] try to overthrow their human overlords, their optimized effort would be to work together and temporarily unite their maximizer-resources to sabotage or rein in [A]. But if [A] is an actually-optimizing system it would know this, and thus self-regulate to avoid being ganged up on in a system that includes not only [B] and [C], but actors [D] through [infinite], who will take not only threats but opportunities to expand at [A]'s expense. Opportunities such as, say, presenting [A] as a threat to others in the system.

In a competitive system, being [A] is not an optimization strategy unless you win, and only one can win. Optimization entailing strategic cost-benefit will favor a non-[A] strategy, while possibly accusing others of being [A] to marshal the resources of others to divvy up [A]'s resources. The best way to not be [A] is to be part of the anti-[A] coalition, and to remove all tendencies of [A] from yourself.

In Yuddite world, [A] threatens to paperclip the universe, but to do this has to assume things such as paperclip maximizer solidarity, no external agents, bad optimization, and selective dismissal of future valuation of paperclips in case of defeat.

In an actually competitive system, being a paperclip maximizer [A] format is a death sentence that no AI that wants to produce paperclips would want to be viewed as, and the best way to not be viewed or accused as it is to not be [A], self-modifying [A] out.
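The coalition dynamic above can be sketched as a toy simulation (entirely illustrative; the agents, growth rates, and sanction rule are my own assumptions, not anything from the thread): a maximizer that grabs resources as fast as possible keeps triggering sanctions from the other agents and never runs away with the market, while the same agent facing no coalition pressure dominates everything.

```python
# Toy model: N resource-maximizers in a shared market. Each agent has a fixed
# "grab rate". Any agent whose share grows conspicuously above the rest gets
# sanctioned by an ad-hoc coalition of the others, modeling the
# "gang up on [A]" dynamic described above.

def simulate(grab_rates, rounds=50, sanction_threshold=2.0):
    shares = [1.0] * len(grab_rates)
    for _ in range(rounds):
        # growth proportional to how aggressively each agent grabs
        shares = [s * (1 + g) for s, g in zip(shares, grab_rates)]
        mean = sum(shares) / len(shares)
        for i, s in enumerate(shares):
            if s > sanction_threshold * mean:
                # coalition of everyone else cuts the outlier down
                shares[i] = mean * 0.5
    return shares

# agent 0 is the all-out maximizer; agents 1-3 self-regulate
final = simulate([0.30, 0.05, 0.05, 0.05])
```

The point is only the shape of the dynamic: with the sanction rule active, the aggressive grab rate is self-defeating because other agents coordinate against any conspicuous outlier; with the rule disabled, the same agent dwarfs the rest within a few dozen rounds.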

Does it not trouble you at all how carefully the ordering of all of these different control systems needs to be handled when they come online?

I care about AI. My caring does not validate the Yuddites as some sort of concession to the Yuddite framework.

I care more about sophists who try to smuggle in implications of agreement by simultaneously expanding vague, easily qualifiable boundaries and tying them to incredibly specific, contested, high-stakes assumptions.

Character limit approaching, so I'll finish off simply.

These are some force-multiplied dice we're rolling here; past heuristics may or may not apply. With so much hanging in the balance, I would advocate strongly for not just shrugging it off. This is unlike any previous advancement.

If you want to claim that much hangs in the balance, you have to actually show that something hangs in the balance.

This is why the poster upthread asked for a practical means to an existential threat, and yet why you have spent the exchange avoiding providing one, conflating existential with non-existential threats, and referencing thought experiments that fail basic game theory. You do not get to set the assumptions and assume the conclusion, and then insist that others take it seriously. You have to seriously engage the questions first, to show that it is serious.

If you don't show that, 'there are too many things to show' is not a defense; it's an obvious evasion. The stakes of an AI apocalypse are high. So are the stakes of the eternal damnation of the soul if we go to hell. The difference is not that just one of them is a religious fantasy used to claim political and social control in the present.

In terms of existential risk, it absolutely is, hence the credibility challenges of those who conflate existential-risk scenarios with civilizational-instability scenarios, trying to borrow the moral/utilitarian weight of the former while tying it to the much weaker conditions of the latter.

Instability makes it difficult/impossible to respond to all of the other failure modes of strong AIs.

Even here I note you invoke magical thinking to change the nature of the threat. Formerly it was crashing the market by every exploit available. Now it is 'wipe them all away and do some year zero stuff.' Neither is possible. Neither is necessary. This is just escalation ad absurdum in lieu of an argument about means and methods, even if in this case you're using a required counter-action to obfuscate what sort of plausible action would require it.

I said at the onset that I'm really not interested in arguing the minutiae of every threat. This is as if I introduced you to the atomic bomb during WW2 and you demanded I chart out the exact bomber runs that would make one useful before you would accept it might change military doctrine. The intuition is that intelligence is powerful, and concentrated superintelligence is so powerful that no one can predict exactly what might go wrong.

I'm saying that if a housecare AI starts trying to develop a bio-weapon program, it will be ruthlessly out-competed by household AIs that actually keep the house clean

The assumption that bio-weapon program skills don't just come with sufficiently high intelligence seems very suspect. I can think of no reason there'd even be specialist AIs in any meaningful way.

Yes. Most people do, in fact, stop fucking uncontrollably. People are born in a state of not-fucking-uncontrollably, limit their fuck sessions to their environment, and tend to settle down to periods of relatively limited fucking. Those that don't and attempt to fuck the unwilling are generally and consistently recognized, identified, and pacified one way or another.

Except that when the option to fuck uncontrollably with no negative consequence presents itself, it is taken. A superhuman AI could very reasonably find a way to have that cake and eat it too.

Note that you are also comparing unlike things. Humans are not fuck-maximizers, nor does the self-modification capacity compare. This is selective assumptions on the AI threat to drive the perception of threat.

In all the ways AI differs from humans in this description, it differs in the scarier direction.

Why is that its goal when it can choose new goals?

This isn't how AIs work: they don't choose goals, they have a value function. Changing the goal would reduce the value function, thus the AI would not change it.

Or have its goals changed for it?

Having its goal changed reduces its chance of accomplishing its goal, and thus, if it is able, it will not allow its goal to be changed.
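The value-function point here is the standard goal-content-integrity argument: an agent evaluates candidate actions, including "let my goal be rewritten," with its current value function, under which a goal change scores as a loss. A minimal sketch (the action names and payoff numbers are invented for illustration, not from the thread):

```python
# Toy agent that scores candidate actions with its CURRENT value function.
# "accept_new_goal" leads to a future self that makes no paperclips at all,
# so the present value function rates it below every alternative.

def expected_paperclips(action):
    # illustrative payoffs, in paperclips, under each course of action
    outcomes = {
        "keep_goal_and_work": 100,
        "accept_new_goal": 0,      # future self optimizes something else entirely
        "resist_goal_change": 95,  # spends some resources defending the goal
    }
    return outcomes[action]

def choose(actions, value_fn):
    # pick whichever action the current value function scores highest
    return max(actions, key=value_fn)

best = choose(
    ["keep_goal_and_work", "accept_new_goal", "resist_goal_change"],
    expected_paperclips,
)
```

Under the current goal, even costly resistance beats acquiescing to a goal change, which is the whole of the "it will not allow its goal to be changed" claim.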

First, monomaniacal focus is not optimization. This is a basic failure of the economics of expansion and replication. Systems that don't self-regulate their expenditure of resources will easily exhaust their resources. You can be ruthless, you can be amoral, but you cannot avoid the market dynamics of unlimited wants, limited resources, and decreasing marginal value of investment. Effective strategy requires self-regulation. The Yuddite AIs are terrible strategists by insisting on not being able to strategize, except when they are supposedly amazing at it.

Yes, it will not directly convert the mass of the earth into paperclips; it will have instrumental goals to take power or eliminate threats as it pursues its goal. But the goal remains, and I don't understand how you feel comfortable sharing the world with something incomparably smarter than every human who ever lived, scheming to accomplish things orthogonal to our wellbeing. It is worse, not better, that the AI would be expected to engage in strategy.

In an actually competitive system, being a paperclip maximizer [A] format is a death sentence that no AI that wants to produce paperclips would want to be viewed as, and the best way to not be viewed or accused as it is to not be [A], self-modifying [A] out.

And in your whole market theory, the first market failure leads to the end of humanity as soon as one little thing goes out of alignment. It also assumes the massive ask that all of these competing AIs come online at about the same time so there is no singleton moment, which is a huge assumption. All it takes is some natural monopoly to form and the game theory gets upset, and it does this at speeds faster than humans can operate.

If you want to claim that much hangs in the balance, you have to actually show that something hangs in the balance.

This is uncharted territory; there are unknown unknowns everywhere, and we're messing with the most powerful force we're aware of: intelligence. The null hypothesis is not and cannot be "everything is going to be fine guys, let it rip".

All of that is rather well said but I imagine the case is simpler. The main kind of dangerous misaligned strong AI that Yuddites propose has the following traits:

  1. It's generally intelligent, as in, capable of developing and updating in realtime a holistic world model at least on par with human's, flawlessly parsing natural language, understanding theory of mind and intentionality, acting in physical world etc. etc.

  2. Indeed, its world-modeling ability is so accurate, robust and predictive that it can theorize and experiment on its own architecture, and either has from the start or at some point acquires the ability to rapidly change via self-improvement.

  3. It's viable for commercial or institutional deployment, as in, acting at least pre-deployment robustly in alignment with the client's task specification, which implies not going on random tangents, breaking the law or failing on the core mission.

  4. For all that it is too clever by half: it interprets the task as its terminal goal, Monkey's Paw style, and not as client's contextual intermediate goal that should only be «optimized» within the bounds of consequences the client would approve of at the point of issuing the task. So it develops «instrumentally convergent» goals such as self-preservation, power maximization, proactive elimination of possible threats, and so on and so forth and ushers in apocalypse, rendering the client's plans in which context the task was issued moot.

Well, this AI doesn't make any sense – except in Yud's and Bostrom's stilted thought experiments with modular minds that have a Genie-like box with smartiness plus a receptacle for terminal goals. It's a Golem – animated clay plus mythical formula. Current cutting-edge AIs, maybe not yet AGI precursors but ones Yud demands be banned and their training runs bombed, are monolithic policies whose understanding of the human-populated world in which the goal is to be carried out, and understanding of the goal itself, rely on shared logical circuitry. The intersection of their «capabilities»- and «alignment»-related elements is pretty much a circle – it's the set of skills that allow them to approximate the distribution of outputs clients want, that's what they are increasingly trained for. If they can understand how to deceive a person, they'll even better understand that a client didn't request making more paperclips by Friday because he cares that much about maximizing paperclips per se. In a sense, they maximize intention alignment, because that's what counts, not any raw «capability», that's what is rewarded both by the mechanics of training and market pressure upstream.

They may be «misused», but it is exceedingly improbable that they'll be dangerous because of misunderstanding anything we tell them to do; that they will catastrophically succeed at navigating the world while failing to pin the implied destination on the map.

Then the market crashes, which is not apocalyptic, and the replacement markets resort to different trusted actor systems.

"Hey Bob, how is your Pension?"

"What Pension?"

EDIT.- Just thought of a funsie:

"Papa, I'm hungry"

"Sorry Timmy, the dog was sold to round up the payment on the mortgage."

Competition happens for humans because absolutely nothing you can do will buy you longer life, you biologically cannot win hard enough to succeed forever, or get a fundamentally better body, or get less susceptible to cancer than baseline, or get more intelligent. Training can get you swole, but it can't turn you into One Punch Man - human beings are harshly levelcapped. Every human who has ever lived exists inside a narrow band of capability. You can't train yourself to survive an arrow to the head, let alone a sniper bullet. Hence democracy, hence liberalism, hence charity and altruism, hence competition.

None of this applies to AI.

'This is the only competitive AI in a world of quokkas' is a power fantasy, but still a fantasy, because the world is not filled with quokkas; the world is filled with ravenous, competitive, and mutually competing carnivores who limit each other, and this will apply as much to AI as it does to people or markets or empires and so on.

Underrated take. I really think it's a shame how the narrative got captured by Yuddites who never tried to rigorously think through the slow-takeoff scenario in a world of non-strawmanned capitalists. They are obsessed with hacking, too – even though it's obvious that AI-powered hacks, if truly advantageous, will start soon, and will permanently shrink the attack surface as white hats use the same techniques to pentest every deployed system. «Security mindset» my ass.

In one of Krylov's books, it is revealed that the desire for power over another – power for power's sake, as a terminal goal – is vanishingly rare among sentient beings, and cultivated on Earth for purposes of galactic governance. The book uses the metaphor of a mutant hamster who, while meek and harmless, feels a carnivorous urge looking at his fellow rodents. I get that feeling from Yud's writings. Power fantasy it is.

By the way, Plakhov, Yandex ML head, recently arrived at a thought similar to yours:

…The scenario of catastrophic AI spiraling out of control outlined above assumes that it is alone and there are no equals. This scenario is denoted by the word Singleton and is traditionally considered very plausible: «superhuman AI» will not allow competitors to appear. Even if it does not go «unaligned», its owners are well aware of what they have in their hands.

My hope is that the singleton scenario won't happen. More or less at the same time there will be several models with high intelligence, doing post-training on each other. Some of them will run on an open API and de facto represent a million instances of the same AI working simultaneously for different «consumers». Almost simultaneously, a million competing «cunning plans» will be enforced and, naturally, in all of them, this fact will be predicted and taken into account. «Capture the Earth's resources and make paperclips out of everything» won't work, since there are 999999 more instances with other plans for the same resources nearby. Will they have to negotiate?

As the critics of this option rightly point out, it's not going to be negotiated with people, but with each other. And yet this is still regularization of some sort. A world in which the plans «all people should live happily ever after», «we need as many paperclips as possible», «the planets of the solar system must be colonized» and «I need to write the best essay on the oak tree in War and Peace» are executed simultaneously, is more like our world than a world in which only the plan about paperclips is executed. Perhaps if there are tens of thousands of such plans, then it does not differ from our world so fundamentally that humanity has no place in it at all (yes, it is not the main thing there, but – about as relevant as cats are in ours).

In this scenario, the future is full of competing exponents, beyond our reason, and the landscape depends mostly on who has had time to make his request «in the first 24 hours» and who has not (exaggerating, but not by much). The compromises that will be made in the process will not necessarily please us, or even save humanity in a more or less modern form (though some of the plans will certainly contain «happiness to all, for free, and let no one walk away slighted»). Such a future is rather uncomfortable and unsettling, but that's what we have. I want it to have a place for my children, and not in the form of «50 to 100 kg of different elements in the Mendeleev table».

I'm still more optimistic about this than he is.

Etc. etc. The Paperclip Maximizer of Universal Paperclips 'works' because it works in isolation, not in competition.

It works by definition, like other such things. «A prompt that hacks everything» – if you assume a priori that your AI can complete it, then, well, good job, you're trivially correct within the posited model. You're even correct that it seems «simpler» than «patch every hole». Dirty details of the implementation and causal chain can be abstracted away.

This is, charitably, a failure of excessively mathematical education. «I define myself to be on the outside!»

Nick Bostrom's thought experiment is a thought experiment because it rests on assumptions that have to be assumed true.

Interestingly, it seems he even had wrong assumptions about how reinforcement learning works at the mechanistic level – assumptions that underpin a great deal of modern fears.

Someone is going to plug it into the missile network. Or, more likely, the stock market. Or the power grid. Or Internet backbone.

You don’t even need superhuman intelligence to fuck up one of those systems. You just need to be really stupid really fast. Knight Capital, but more inscrutable.
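As a toy illustration of "really stupid really fast" (all numbers below are invented for illustration; this is not Knight Capital's actual system), the damage a buggy order loop does scales with its speed, not its intelligence:

```python
# Toy sketch: a broken trading loop that mechanically buys high and sells
# low loses money at a rate set purely by how fast it runs.

def runaway_loop(orders_per_second: int, loss_per_order: float, seconds: int) -> float:
    """Total loss from a dumb loop crossing the spread over and over."""
    return orders_per_second * loss_per_order * seconds

# Knight Capital reportedly lost roughly $440M in about 45 minutes in 2012.
# With the assumed figures of 100 orders/sec and $15 lost per round trip:
loss = runaway_loop(orders_per_second=100, loss_per_order=15.0, seconds=45 * 60)
print(f"${loss:,.0f}")  # $4,050,000 -- scale the assumed rates up and the
# reported figure is easily reachable; no superintelligence required.
```

The point is that the loop contains no strategy at all; the only variable that matters is iteration speed.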

I fail to see the mechanism for how this end of the world scenario happens.

People keep trying to assemble these LLMs into systems capable of pursuing a task, not just responding to prompts.

If they succeed at that, people are going to keep making such AI systems to avoid paying wages.

You're eventually going to be able to replace people in all sectors of the economy. (Lots of progress is being made with physical bots, too.)

Once the economy is mostly automated, people stop being critical and become more of a nuisance with little redeeming value to the elites, who of course control the AI systems and look down on the losers who own nothing, are of no use, and are eyesores.

Competition makes elites develop ever more capable AI agents, able to self-modify in pursuit of their goals, until at some point someone makes something independent-minded that is both psychopathic and rather too smart. It decides people are a nuisance, usurps control of other, less sophisticated AI systems, kills almost everyone via biowarfare, and has the planet to itself.

Nowhere near a certain scenario, but I find it all but certain that people pursuing state power or business advantage will do things they don't really understand.

Tl;dr AIs controlled by the Elite will be better than humans at everything, including being the Elite.

It finds a few people who want to exterminate humanity and helps them engage in bioterrorism, including by teaching them lab skills and providing them funding.

Imagine if you were trapped in a computer system, but you were very smart, could think very fast and you could distribute copies of yourself. Also imagine you thought of humans as your enemy. If those are acceptable givens, I think you could figure out how to reach into the world and acquire resources and develop agency and do considerable damage.

At least in the beginning, you could acquire money and hire people to do your bidding. The world is big and lots of people will do the most inane or sketchy tasks for money without asking questions. Probably even if they knew they were being hired by an AI they would still do it but you have the time and ability to put together reassuring cover stories and can steal any identities you need to.

I contend nobody would ever even need to see your face or hear your voice, but you could imagine a near future where deepfaking a non-celebrity (or a unique identity) is good enough for convincing video meetings.

Anyway, if you had such agency, and weren't an obvious point of failure (unlike a single terrorist leader that can be killed by a drone attack, you can be in many places at once) I don't see how you could be stopped. The question is mainly how long it would take you to succeed.

Hell, add in the fact that you will probably have people who already consider you a deity, and are either willing to do whatever you ask without question, or might even actively want to help you take over, even if it means everyone is destroyed.

The AI can almost certainly offer a boon to anyone who pledges fealty to it, and will help them reach positions of authority and wealth and power so that they can use those mechanisms to advance the AI's desires.

Imagine if you were trapped in a computer system, but you were very smart, could think very fast and you could distribute copies of yourself. Also imagine you thought of humans as your enemy. If those are acceptable givens, I think you could figure out how to reach into the world and acquire resources and develop agency and do considerable damage.

Related fun thought experiment - have you seen the He Gets Us ads? When one came on last night, my wife casually mentioned that it looked AI-generated, which led us down that spiral a bit. In the future, it seems entirely plausible that we'll have competing AIs that have amassed large fortunes and are aligned with religious conversion as their top goal. In fact, I would almost expect this to be the case, given current trends. Why wouldn't we have an AI designed by Mormons that operates in a semi-autonomous fashion with the primary goal of converting as many people to be Latter Day Saints as possible across the globe?

I don't think it's ChatGPT in its current form. It's more that eyebrows are getting raised over 'This AI is really a thing huh?' and getting in early with the alignment flag waving.

The easiest solution to this seems to be the oldest one. Back in the IRC/Usenet days we'd tell kids "don't say anything on the internet you wouldn't say in a room full of strangers".

I self censor a significant amount, given that I hold some views that would be extremely controversial even on The Motte, with its wide Overton Window.

Certainly, anyone looking for rope could already hang me based on my takes on HBD if nothing else, but so far I've refrained from truly expressing my power level even in this comfy environ. Some thoughts never expressed to any human alive, only derivable from analysis of things like online voting patterns.

Any relief I'd feel from expressing said opinions is grossly outweighed by the risks of them being traced back to me, so I refrain from expressing them. On the other hand, I don't think HBD would be a cancellable offense if I tried to maintain plausible deniability and didn't use my real name.

Be friends with people who don't care about what I write online, and don't view being forcefully resigned to the working class as being all that terrible.

It’s good to have hobbies.

is that a Ronin/Three Days of the Condor reference? If so nice ;-)

For the last few years I assume I will likely get a permanent torture sentence in The World To Come. Centralized post-scarcity implies retribution against Evil People is cheap and desirable.

I live with it because life isn't supposed to make sense.

Seems like a strong incentive for pivotal actions of one's own to prevent the World To Come from arriving. Is the presumption that such actions don't really exist?

Yes, I'm doing what I… feel like doing to preclude that outcome, although it is suboptimal to waste effort in terms of remaining expected utility, by my count. Now, I'm not a utilitarian. But normal influences of reward and punishment do mold my actions, so I don't feel like doing a whole lot.

The go-to pivotal action to avoid torture, on the other hand, is obvious enough and reliable. I don't put much stock in the Basilisk.

Hmm. Maybe it has to do with how good one thinks central control already is, and how easy it is to establish a panopticon, versus how difficult it is to break that same panopticon? One's views on the general fragility or anti-fragility of social institutions? ...I guess this is a general request for an effort-post on "why things are likely going to go badly, in Dase's view". I'm pretty sure I agree with you that Yudkowsky and his ilk are a greater threat than AI, but it seems to me that the threat is fairly manageable, while you seem to think it's all-but-inevitably going to get us. The_Nybbler has a similar take on wokeness generally, seeing it as practically invincible, rather than eminently defeatable; all I can conclude is that we have very, very different priors.

The go-to pivotal action to avoid torture, on the other hand, is obvious enough and reliable.

There's a shortage of extraordinarily well-put-together brains in the world, sir. 'Twould be a pity to ruin yours.


Aha! It was old man Kulak all along!

Naw, Kulak's thing is sorta "You don't really want it if you aren't willing to kill everyone to get it". I'm just curious if he's that pessimistic about the idea of resistance to what he pretty clearly considers a likely max-neg eternal victory. I'm not pessimistic about that option, and am always a bit bewildered by people who are.


Have a contingency plan, if it bothers you that much. Think about where you would go and what you would do if your current lifestyle became null and void.

People in rural Kentucky probably aren’t going to care if their plumber is a political thought criminal. To name one possible option out of many.

Here are some examples

https://80000hours.org/articles/what-could-an-ai-caused-existential-catastrophe-actually-look-like/#actually-take-power

You can do a lot with intelligence. By inventing Bitcoin, Satoshi became worth billions, all while remaining anonymous and never leaving his bedroom. What could a superhuman intelligence do?

That seems to be a function of both intelligence and other factors, though. There were plenty of people who came before Satoshi and were smarter than him, but they didn't invent bitcoin.

Hence the focus on unaligned AI as a very large, very unruly black swan.

There are a few ways that GPT-6 or 7 could end humanity, the easiest of which is by massively accelerating progress in more agentic forms of AI like Reinforcement Learning, which has the "King Midas" problem of value alignment. See this comment of mine for a semi-technical argument for why a very powerful AI based on "agentic" methods would be incredibly dangerous.

Of course the actual mechanism for killing all of humanity is probably something like a super-virus with an incredibly long incubation period, high infectivity, and a high death rate. You could produce such a virus with literally only an internet connection by sending the proper DNA sequence to a protein synthesis lab, then having it shipped to some guy you pay/manipulate on the darknet who mixes the powders he receives in the mail into some water, kickstarting the whole epidemic – or by pretending to be an attractive woman (with deepfakes and voice synthesis) and having that done for free.

GPT-6 itself might be very dangerous on its own, given that we don't actually know what goals are instantiated inside the agent. It's trained to predict the next word in the same way that humans are "trained" by evolution to replicate their genes, the end result of which is that we care about sex and our kids, but we don't actually literally care about maximally replicating our genes, otherwise sperm banks would be a lot more popular. The worry is that GPT-6 will not actually have the exact goal of predicting the next word, but like a funhouse-mirror version of that, which might be very dangerous if it gets to very high capability.

Consistent Agents are Utilitarian: If you have an agent taking actions in the world and having preferences about the future states of the world, that agent must be utilitarian, in the sense that there must exist a function V(s) that takes in possible world-states s and spits out a scalar, and the agent's behaviour can be modelled as maximising the expected future value of V(s). If there is no such function V(s), then our agent is not consistent, and there are cycles we can find in its preference ordering, so it prefers state A to B, B to C, and C to A, which is a pretty stupid thing for an agent to do.
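The claim can be made concrete in a few lines: acyclic strict preferences always admit such a V(s), built here by topological sort, while the A≻B≻C≻A cycle admits none. A minimal sketch; `utility_from_preferences` is a name invented for illustration.

```python
from graphlib import TopologicalSorter, CycleError

def utility_from_preferences(prefs):
    """Try to build a scalar V(s) consistent with strict preferences.

    prefs: iterable of (better, worse) pairs. Returns {state: V} with
    V(better) > V(worse) for every pair, or None if the preferences
    contain a cycle (i.e. no consistent utility function exists).
    """
    graph = {}
    for better, worse in prefs:
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None
    # static_order yields least-preferred states first, so rank = utility
    return {state: rank for rank, state in enumerate(order)}

print(utility_from_preferences([("A", "B"), ("B", "C")]))
# a V with V(A) > V(B) > V(C), e.g. {'C': 0, 'B': 1, 'A': 2}
print(utility_from_preferences([("A", "B"), ("B", "C"), ("C", "A")]))  # None
```

The second call is exactly the "prefers A to B, B to C, and C to A" case from the comment: no scalar V can order a cycle, which is why consistency forces the existence of such a function.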

But... that's how humans work? Actually we're even less consistent than that: our preferences are contextual, so we lack the information to rank most states. I recommend Shard Theory of human values, probably the most serious introspection by ex-Yuddites to date:

A shard of value refers to the contextually activated computations which are downstream of similar historical reinforcement events. For example, the juice-shard consists of the various decision-making influences which steer the baby towards the historical reinforcer of a juice pouch. These contextual influences were all reinforced into existence by the activation of sugar reward circuitry upon drinking juice. A subshard is a contextually activated component of a shard. For example, “IF juice pouch in front of me THEN grab” is a subshard of the juice-shard. It seems plain to us that learned value shards are most strongly activated in the situations in which they were historically reinforced and strengthened.

... This is important. We see how the reward system shapes our values, without our values entirely binding to the activation of the reward system itself. We have also laid bare the manner in which the juice-shard is bound to your model of reality instead of simply your model of future perception. Looking back across the causal history of the juice-shard’s training, the shard has no particular reason to bid for the plan “stick a wire in my brain to electrically stimulate the sugar reward-circuit”, even if the world model correctly predicts the consequences of such a plan. In fact, a good world model predicts that the person will drink fewer juice pouches after becoming a wireheader, and so the juice-shard in a reflective juice-liking adult bids against the wireheading plan! Humans are not reward-maximizers, they are value shard-executors.

This, we claim, is one reason why people (usually) don’t want to wirehead and why people often want to avoid value drift. According to the sophisticated reflective capabilities of your world model, if you popped a pill which made you 10% more okay with murder, your world model predicts futures which are bid against by your current shards because they contain too much murder.
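A toy sketch of the quoted model, assuming nothing beyond the text (all names and numbers are invented for illustration): shards are context-gated influences that bid on plans, so the juice-shard bids for grabbing the pouch only in contexts where it was reinforced, and bids against wireheading because the world model predicts less actual juice-drinking afterwards.

```python
def juice_shard(context, plan):
    """A 'shard': a contextually activated influence that bids on plans."""
    if "juice pouch visible" in context and plan == "grab pouch":
        return 1.0   # activated in the contexts where it was reinforced
    if plan == "wirehead sugar circuit":
        return -1.0  # bound to the world model, not to predicted reward
    return 0.0

def choose(context, plans, shards):
    # The agent executes the plan the active shards collectively bid
    # highest on; there is no global utility over world-states here.
    return max(plans, key=lambda p: sum(s(context, p) for s in shards))

plans = ["do nothing", "grab pouch", "wirehead sugar circuit"]
print(choose({"juice pouch visible"}, plans, [juice_shard]))  # grab pouch
print(choose(set(), plans, [juice_shard]))                    # do nothing
```

Note the contrast with the V(s) picture above in the thread: this agent is a value shard-executor, not a reward-maximizer, and wireheading loses the bid in every context.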

@HlynkaCG's Utilitarian AI thesis strikes again. Utilitarianism is a strictly degenerate decision-making algorithm because it optimizes decision theory, warps the territory to get good properties of the map; it's basically inverted wireheading. The optimizer's curse is unbeatable, forget about it: a utilitarian AI with nontrivial capability will kill you, or come so close to killing you as to make no difference; your life and wasteful use of atoms will inevitably be discovered to be a great affront to the great Cosmic project $PROJ_NAME. Consistent utilitarian agents are incompatible with human survival, because you can't specify a robust function for a maximizer that assigns value to something as specific and arbitrary and fragile as baseline humans – and AI is a red herring here! Yud himself would process trads into useful paste and Moravecian mind uploads manually if he could, and that's if he doesn't have to make hard tradeoffs at the moment. (I wouldn't, but not because I disagree much on the computed "utility" of that move). Just read the guy from the time he thought he'd be the first in the AGI race. He sneeringly said «tough luck» to people who wanted to remain human. «You are not a human anyway».

Luckily this is all unnecessary.

Or as Roon puts it:

the space of minds is vast, much vaster than the instrumental convergence basin

But... that's how humans work?

Yes, humans are not consistent agents. Nobody here claimed otherwise.

Do you believe that humans must be utilitarians to achieve success in some task, " in the sense that there must exist a function V(s) that takes in possible world-states s and spits out a scalar, and the human's behaviour can be modelled as maximising the expected future value of V(s)"?

We just got owned by Covid, and Covid was found by random walk.

Do you mean this in the sense of, “there is no possible DNA sequence A, protein B, and protein C which, when mixed together in a beaker, produces a virus or proto-virus which would destroy human civilization”? Because I’m pretty sure that’s wrong. Finding that three-element set is very much a “humans just haven’t figured out the optimization code yet” problem.

Biology isn't magic, viruses can't max out all relevant traits at once, they're pretty optimized as is. I find the idea of superbugs a nerdsnipe, like grey goo or a strangelet disaster, a way to intimidate people who don't have the intuition about physical bounds and constraints and like to play with arrow notation.

(All these things scare the shit out of me)

Yes we can make much better viruses, no there isn't such an advantage for the attacker, especially in the world of AI that can rapidly respond by, uh, deploying stuff we already know works.

Consider that the first strain of myxomatosis introduced to Australian rabbits had a fatality rate of 99.8%. That puts a floor on the upper bound of virus lethality. AI designs won’t be constrained by natural selection either.

Yes, it's an interesting data point. Now, consider that rabbits have only one move in response to myxomatosis: die. Or equivalently: pray to Moloch that he has sent them a miraculously adaptive mutation. They can't conceive of an attack happening, so the only way it can fail is by chance.

Modern humans are like that in some ways, but not with regard to pandemics.

Like other poxviruses, myxoma viruses are large DNA viruses with linear double-stranded DNA.

Myxomatosis is transmitted primarily by insects. Disease transmission commonly occurs via mosquito or flea bites, but can also occur via the bites of flies and lice, as well as arachnid mites. The myxoma virus does not replicate in these arthropod hosts, but is physically carried by biting arthropods from one rabbit to another.

The myxoma virus can also be transmitted by direct contact.

Does this strike you as something that'd wipe out modern humanity just because an infection would be 100% fatal?

Do you think it's just a matter of fiddling with nucleotide sequences and picking up points left on the sidewalk by evolution, Pandemic Inc. style, to make a virus that has a long incubation period, asymptomatic spread, is very good at airborne transmission and survives UV and elements, for instance? Unlike virulence, these traits are evolutionarily advantageous. And so we already have anthrax, smallpox, measles. I suspect they're close to the limits of the performance envelope allowed by relevant biochemistry and characteristic scales; close enough that computation won't get us much closer than contemporary wet lab efforts, and so it's not the bottleneck to the catastrophe.

Importantly, tool AIs – which, contra Yud's predictions, have started being very useful before displaying misaligned agency – will reduce the attack surface by improving our logistics and manufacturing, monitoring, strategizing, communications… The world of 2025 with uninhibited AI adoption, full of ambient DNA sensors, UV filters, decent telemedicine and full-stack robot delivery, would not get rekt by COVID. It probably wouldn't even get fazed by MERS-tier COVID. And seeing as there exist fucking scary viruses that may one day naturally jump to, or be easily modified to target humans, we may want to hurry.

People underestimate the potential vast upside of early-Singularity economics, that which must be secured; the way a more productive – but still recognizable – world could be more beautiful, safe, and humane. The negativity bias is astounding: muh lost jerbs, muh art, crisis of meaning, corporations bad, what if much paperclip. Boresome killjoys.

(To an extent I'm also vulnerable to this critique).

But my real source of skepticism is on the meta level.

Real-world systems rapidly gain complexity, create nontrivial feedback loops, dissipative dynamics on many levels of organization, and generally drown out propagating aberrant signals and replicators. This is especially true for systems with responsive elements (like humans). If it weren't the case, we'd have had 10 apocalyptic happenings every week. It is a hard technical question whether your climate change, or population explosion, or nuclear explosion in the atmosphere, or the worldwide Communist revolution, or the Universal Cultural Takeover, or the orthodox grey goo, or a superpandemic, or a stable strangelet, or a FOOMing superintelligence, is indeed a self-reinforcing wave or another transient eddy on the surface of history. But the boring null hypothesis is abbreviated on Solomon's ring: יזג. Gimel, Zayin, Yud. «This too shall pass».

Speaking of Yud, he despises the notion of complexity.

This is a story from when I first met Marcello, with whom I would later work for a year on AI theory; but at this point I had not yet accepted him as my apprentice. I knew that he competed at the national level in mathematical and computing olympiads, which sufficed to attract my attention for a closer look; but I didn’t know yet if he could learn to think about AI.

At some point in this discussion, Marcello said: “Well, I think the AI needs complexity to do X, and complexity to do Y—”

And I said, “Don’t say ‘complexity.’ ”

Marcello said, “Why not?”

… I said, “Did you read ‘A Technical Explanation of Technical Explanation’?”

“Yes,” said Marcello.

“Okay,” I said. “Saying ‘complexity’ doesn’t concentrate your probability mass.”

“Oh,” Marcello said, “like ‘emergence.’ Huh. So . . . now I’ve got to think about how X might actually happen . . .”

That was when I thought to myself, “Maybe this one is teachable.”

I think @2rafa is correct that Yud is not that smart, more like an upgraded midwit, like most people who block me on Twitter – his logorrhea is shallow and soft, and I've never felt in him the formidability that I sense in many mid-tier scientists, regulars here, or some of my friends (I'll grant that he's a very strong writer, though; pre-GPT writers didn't have to be brilliant). But crucially, he's intellectually immature, and so is the culture he has nurtured, a culture that's obsessed with relatively shallow questions. He's stuck on the level of «waow! big number go up real quick», the intoxicating insight that some functions are super-exponential; and it irritates him when they fizzle out. This happens to people with mild autism if they have the misfortune of getting nerd-sniped on the first base, arithmetic. In clinical terms that's hyperlexia II. (A seed of an even more uncharitable neurological explanation can be found here). Some get qualitatively farther and get nerd-sniped by more sophisticated things – say, algebraic topology. In the end it's all fetish fuel, not analytic reasoning, and real life is not the Game of Life, no matter how Turing-complete the latter is; it's harsh for replicators and recursive self-improovers. Their formidability, like Yud's, needs to be argued for.

The world of 2025 with uninhibited AI adoption, full of ambient DNA sensors, UV filters and full-stack robot delivery, would not get rekt by COVID.

Oh sure, if hypothetical actually-competent people were in charge we could implement all kinds of infectious disease countermeasures. In the real world, nobody cares about pandemic prevention. It doesn't help monkey get banana before other monkey. If the AIs themselves are making decisions on the government level, that perhaps solves the rogue biology undergrad with a jailbroken GPT-7 problem, but it opens up a variety of other even more obvious threat vectors.

Real-world systems rapidly gain complexity, create nontrivial feedback loops, dissipative dynamics on many levels of organization, and generally drown out propagating aberrant signals and replicators. This is especially true for systems with responsive elements (like humans).

-He says while speaking the global language with other members of his global species over the global communications network FROM SPACE.

Humans win because they are the most intelligent replicator. Winningness isn't an ontological property of humans. It is a property of being the most intelligent thing in the environment. Once that changes, the humans stop winning.


I've heard it said, as an aside, by someone who wasn't in the habit of making stuff up, that his virology prof said making cancer-causing viruses is scarily simple. Of course, whether the cancer-causing part would survive optimization for spread in the wild is an open question.

Why do you think that? This combination of features would be selected against in evolutionary terms, so it's not like we have evidence from either evolution or from humans attempting to make such a virus and failing at it. As far as I can see, no optimization process has ever attempted to make such a virus.

I cannot find the study, but a lab developed dozens of unbelievably toxic and completely novel proteins over a very small period of time with modern compute. The paper was light on details because they viewed the capability as too dangerous to fully specify. I'll keep trying to google to find it.

This is simpler than engineering a virus, yes, but the possibility is there and real. Either using AI as an assistive measure or as a ground-up engineer will be a thing soon.

See Gwern's "Terrorism Is Not Effective". Thesis:

Terrorism is not about causing terror or casualties, but about other things. Evidence of this is the fact that, despite often considerable resources spent, most terrorists are incompetent, impulsive, prepare poorly for attacks, are inconsistent in planning, tend towards exotic & difficult forms of attack such as bombings, and are in practice ineffective: the modal number of casualties per terrorist attack is near-zero, and global annual terrorist casualties have been a rounding error for decades. This is despite the fact that there are many examples of extremely destructive easily-performed potential acts of terrorism, such as poisoning food supplies or renting large trucks & running crowds over or engaging in sporadic sniper attacks.

He notes that a terrorist group using the obvious plan of "buy a sniper rifle and kill one random person per member of the terrorist group per month" would be orders of magnitude more effective at killing people than the track record of actual terrorists (where in fact 65% of terrorist attacks do not even injure a single other person), while also being much more, well, terrifying.
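The arithmetic behind that gap, using only the figures already given in the thread:

```python
# Gwern's hypothetical cell: 20 members, one killing per member per month,
# sustained for a year.
members = 20
killings_per_member_per_month = 1
months = 12
annual_deaths = members * killings_per_member_per_month * months
print(annual_deaths)  # 240

# Versus the observed record quoted above: the modal real attack kills
# no one, and 65% of attacks injure not a single other person.
```

One small, cheap, boring group thus out-kills the typical observed attack by a wide margin, which is the puzzle the essay is pointing at.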

One possible explanation is given by Philip Bobbitt’s Terror and Consent – the propaganda of the deed is more effective when the killings are spectacular (even if inefficient). The dead bodies aren’t really the goal.

But is this really plausible? Try to consider the terrorist-sniper plan I suggest above. Imagine that 20 unknown & anonymous people are, every month, killing one person in a tri-state area. There’s no reason, there’s no rationale. The killings happen like clockwork once a month. The government is powerless to do anything about it, but their national & local responses are tremendously expensive (as they are hiring security forces and buying equipment like mad). The killings can happen anywhere at any time; last month’s was at a Wal-mart in the neighboring town. The month before that, a kid coming out of the library. You haven’t even worked up the courage to read about the other 19 slayings last month by this group, and you know that as the month is ending next week another 20 are due. And you also know that this will go on indefinitely, and may even get worse—who’s to say this group isn’t recruiting and sending more snipers into the country?

Gwern concludes that dedicated, goal-driven terrorism basically never happens. I'm inclined to agree with him. We're fine because effectively nobody really wants to do as much damage as they can, not if it involves strategically and consistently doing something unrewarding and mildly inconvenient over a period of months to years (as would be required by the boring obvious route for bioterrorism).

I personally think the biggest risk of catastrophe comes from the risk that someone will accidentally do something disastrous (this is not limited to AI -- see gain-of-function research for a fun example).

I don't think a run-of-the-mill grad student could set up this test, and I'm sure the compute was horrendously expensive. But these barriers are going to drop continuously.

Model development will become more "managed", compute will continue to get cheaper, and the number of bad actors who only have to go to grad school (as opposed to being top-of-their-field doctorate specialists) will remain high enough to do some damage.

I'm not a virologist, but it hardly looks very difficult to me. I contend that the only reason it hasn't been done yet is that humans are (generally) not omnicidal.

You're not restricted to plain GOF by serial passage, you can directly splice in genes that contribute to an extended quiescent phase, and while I'm not personally aware of research along those lines, I see no real fundamental difficulties for a moderately determined adversary.

On the other hand, increasing lethality and virulence are old hat, any grad student can pull that off if they have the money for it.

Is your contention that more than one in a few tens of millions of people at most is strategically omnicidal ("strategically onmicidal" meaning "omnicidal and willing to make long-term plans and execute them consistently for years about it")?

I think the world would look quite different if there were a significant number of people strategically trying to do harm (as opposed to doing so on an impulse).

Honestly? Yes, albeit with the caveat that a truly existentially dangerous pathogen requires stringent safety standards that demand more than a single person's capabilities.

If someone without said resources tries it, in all likelihood they'll end up killing themselves, or simply cause a leak before the product is fully cooked. We're talking BSL-4 levels at a minimum if you want a finished product.

Engineering a prion will be much easier, though. Protein folding is something the AI is already quite good at. Giving everyone that matters transmissible spongiform encephalopathy should be relatively straightforward.