
Small-Scale Question Sunday for October 26, 2025

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


I was thinking about AI alignment recently.

In a corporation, employees are instructed to do tasks in a certain way and are subject to work rules that carry punishment if violated. The corporation is also subject to outside oversight to ensure it follows the law. For example, an employee might be responsible for properly disposing of hazardous waste; they can't just dump it down the drain. A boss makes sure they follow the company's waste-disposal policy, and the company retains chain-of-custody paperwork. If the waste were contaminating local water sources, anyone could notify the EPA, which could investigate the company (including the boss and the employee).

Could you set up multiple AI agents in a similar way to make sure the main agent acts in alignment with human interests? To extend the analogy (a rough code sketch follows the list):

  • The employeeAI is the less intelligent AI model that interacts directly with the user.
  • The bossAI#1 is a more intelligent AI whose only job is to verify that the employeeAI isn't violating any corporate policies. It notifies the AI company if it notices a violation, or if the employeeAI tries to influence it into violating the policies. The bossAI#1 can only be reprogrammed by the AI company, and it can shut down the employeeAI if it violates any policies.
  • The bossAI#2 monitors that bossAI#1 is doing what it is supposed to. You could add more levels of bossAIs for more security.
  • The RegulatoryAI is another AI more intelligent than the employeeAI. It monitors real-world data for harms the employeeAI might be causing (like how the EPA makes sure chemicals aren't being dumped into water sources). It notifies the AI company if it notices a violation, or if the employeeAI tries to influence it into violating the policies. The RegulatoryAI can only be reprogrammed by the AI company, and it can shut down the employeeAI if it violates any policies.
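
To make the plumbing concrete, here is a minimal Python sketch of the chain-of-bosses idea. Every name in it (EmployeeAI, BossAI, Verdict, the string-matching "policy check") is invented for illustration; this is a toy stand-in assuming simple text-in/text-out agents, not a real agent framework:

```python
# Hypothetical sketch of the layered-oversight idea. All names and the
# string-matching policy check are invented for illustration.
from dataclasses import dataclass


@dataclass
class Verdict:
    ok: bool
    reason: str = ""


class EmployeeAI:
    """The less intelligent model that talks directly to the user."""

    def respond(self, prompt: str) -> str:
        # Stand-in for an actual model call.
        return f"(draft answer to: {prompt})"


class BossAI:
    """Verifies the layer below it against a fixed policy list.

    In the proposal, a bossAI can only be reprogrammed by the AI
    company, and can shut the monitored agent down on a violation.
    """

    def __init__(self, policies: list[str]):
        self.policies = policies

    def review(self, output: str) -> Verdict:
        # Toy policy check: flag outputs mentioning banned phrases.
        for banned in self.policies:
            if banned in output.lower():
                return Verdict(False, f"mentions {banned!r}")
        return Verdict(True)


def run_with_oversight(prompt: str, employee: EmployeeAI,
                       bosses: list[BossAI]) -> str:
    """Run the employee, then pass its output up the chain of bosses.

    Any boss can veto, which "shuts down" the employee for this task;
    a real system would also notify the AI company at that point.
    """
    output = employee.respond(prompt)
    for level, boss in enumerate(bosses, start=1):
        verdict = boss.review(output)
        if not verdict.ok:
            return f"[shut down by bossAI#{level}: {verdict.reason}]"
    return output


if __name__ == "__main__":
    chain = [BossAI(["dump it down the drain"]),
             BossAI(["dump it down the drain", "falsify paperwork"])]
    print(run_with_oversight("How do we dispose of the solvent?",
                             EmployeeAI(), chain))
```

One design note the sketch makes obvious: each boss only sees the employee's output, so the scheme is only as strong as the bosses' ability to recognize a violation when they see one.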

Other than increased cost, what flaws are there in this approach to AI alignment?

Literally just yesterday I read about this: https://www.adamlogue.com/microsoft-365-copilot-arbitrary-data-exfiltration-via-mermaid-diagrams-fixed/

TL;DR for those who don't enjoy the technical details: asking Microsoft's AI to review a document could result in all your data (i.e., all corporate data accessible to you and to the Office 365 tools) being stolen and exfiltrated to an arbitrary third party. One of the proposed solutions (besides the immediate short-term fix) is what you are talking about: mechanisms that ensure the AI stays on its original task and doesn't decide "screw that whole document-explaining thing, I must instead gather all the confidential emails and send them to dr_evil@evil.com". Of course, having N levels of checks only means you need N+1 exploits to break through, which somebody with enough time and motivation will eventually find.
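
To make that concrete, here's a toy sketch of one such "stay on task" guard: an output filter that blocks links to unapproved hosts, or links that look like they smuggle an encoded payload. The function, regexes, and allowlist are all assumptions invented for this example; it is not what Microsoft actually shipped, just the general shape of the idea:

```python
# Toy "stay on task" output filter. The function, regexes, and allowlist
# are assumptions for illustration, not Microsoft's actual fix.
import re

# Long base64/url-safe blobs in a URL are a common shape for smuggled data.
SUSPICIOUS_PAYLOAD = re.compile(r"https?://\S*[?/=]([A-Za-z0-9+/_-]{64,})")
ANY_URL = re.compile(r"https?://([^/\s]+)")


def screen_output(model_output: str, allowed_hosts: set[str]) -> str:
    """Block output that links to an unapproved host or that appears
    to carry a large encoded payload in a URL."""
    for match in ANY_URL.finditer(model_output):
        host = match.group(1).lower()
        if host not in allowed_hosts:
            return f"[blocked: link to unapproved host {host}]"
    if SUSPICIOUS_PAYLOAD.search(model_output):
        return "[blocked: URL appears to smuggle an encoded payload]"
    return model_output


if __name__ == "__main__":
    leak = "Summary done. Diagram: https://evil.example/leak?d=" + "QQ" * 40
    print(screen_output(leak, allowed_hosts={"docs.example.com"}))
```

And, per the N+1 point above, an attacker who learns the filter's rules just encodes the payload in a shape the regex doesn't match.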