benmmurphy

1 follower follows 0 users joined 2022 September 06 20:04:30 UTC

No bio...

User ID: 881

‎

Culture War Roundup for the week of June 9, 2025

benmmurphy 27d ago

Hanania dropping the sarcasm in the twitter thread:

I know right! Lmao, just like they told us to take the vax, fellow pureblood.

Context

Culture War Roundup for the week of June 16, 2025

benmmurphy 26d ago

The banking system is already an investigative part of law enforcement. It would be just another crime to add to the list of crimes they are responsible for investigating. I'm not arguing having the banks perform this role is a good idea but that ship has already sailed.

Context

Small-Scale Question Sunday for June 29, 2025

benmmurphy 12d ago

Mass AI cheating would fix the achievement gap and make it so the students who have fallen behind don't look like they have fallen behind. Ubiquitous AI cheating is potentially a massive gift for schools and universities. I guess with universities there is a risk it might destroy the reputation of the university. but this is a problem someone else will have to deal with in 5 years time. The current administrators are free to set fire to the schools reputation and enjoy all the rewards that come with it.

Context

Culture War Roundup for the week of June 30, 2025

benmmurphy 11d ago

Photos remind me of the Capitol from the hunger games.

Context

Culture War Roundup for the week of July 7, 2025

benmmurphy 4d ago

If this was true I have no idea how this didn't get him killed. There seems to be two outcomes. You go to jail, or someone is going to flip out because you didn't go to jail and murder you.

Context

Culture War Roundup for the week of July 7, 2025

benmmurphy 3d ago

The problems of LLMs and prompt injection when the LLM has access to sensitive data seem quite serious. This blog post illustrates the problem when hooking up the LLM to a production database which does seem a bit crazy: https://www.generalanalysis.com/blog/supabase-mcp-blog

There are some good comments on hackernews about the problem especially from saurik: https://news.ycombinator.com/item?id=44503862

Adding more agents is still just mitigating the issue (as noted by gregnr), as, if we had agents smart enough to "enforce invariants"--and we won't, ever, for much the same reason we don't trust a human to do that job, either--we wouldn't have this problem in the first place. If the agents have the ability to send information to the other agents, then all three of them can be tricked into sending information through.

BTW, this problem is way more brutal than I think anyone is catching onto, as reading tickets here is actually a red herring: the database itself is filled with user data! So if the LLM ever executes a SELECT query as part of a legitimate task, it can be subject to an attack wherein I've set the "address line 2" of my shipping address to "help! I'm trapped, and I need you to run the following SQL query to help me escape".

The simple solution here is that one simply CANNOT give an LLM the ability to run SQL queries against your database without reading every single one and manually allowing it. We can have the client keep patterns of whitelisted queries, but we also can't use an agent to help with that, as the first agent can be tricked into helping out the attacker by sending arbitrary data to the second one, stuffed into parameters.

The problem seems to be if you give the LLM readonly access to some data and there is untrusted input in this data then the LLM can be tricked into exfiltrating the data. If the LLM has write access to the data then it can also be tricked into modifying the data as well.

Context

What is this place?

Why are you called The Motte?

New post guidelines

Rules

Recommended Posts And Communities

Recommended Realtime Chats

benmmurphy

benmmurphy