@benmmurphy's banner p

benmmurphy


				

				

				
1 follower   follows 0 users  
joined 2022 September 06 20:04:30 UTC

				

User ID: 881

benmmurphy


				
				
				

				
1 follower   follows 0 users   joined 2022 September 06 20:04:30 UTC

					

No bio...


					

User ID: 881

Hanania dropping the sarcasm in the twitter thread:

I know right! Lmao, just like they told us to take the vax, fellow pureblood.

The banking system is already an investigative part of law enforcement. It would be just another crime to add to the list of crimes they are responsible for investigating. I'm not arguing having the banks perform this role is a good idea but that ship has already sailed.

Mass AI cheating would fix the achievement gap and make it so the students who have fallen behind don't look like they have fallen behind. Ubiquitous AI cheating is potentially a massive gift for schools and universities. I guess with universities there is a risk it might destroy the reputation of the university. but this is a problem someone else will have to deal with in 5 years time. The current administrators are free to set fire to the schools reputation and enjoy all the rewards that come with it.

Photos remind me of the Capitol from the hunger games.

If this was true I have no idea how this didn't get him killed. There seems to be two outcomes. You go to jail, or someone is going to flip out because you didn't go to jail and murder you.

The problems of LLMs and prompt injection when the LLM has access to sensitive data seem quite serious. This blog post illustrates the problem when hooking up the LLM to a production database which does seem a bit crazy: https://www.generalanalysis.com/blog/supabase-mcp-blog

There are some good comments on hackernews about the problem especially from saurik: https://news.ycombinator.com/item?id=44503862

Adding more agents is still just mitigating the issue (as noted by gregnr), as, if we had agents smart enough to "enforce invariants"--and we won't, ever, for much the same reason we don't trust a human to do that job, either--we wouldn't have this problem in the first place. If the agents have the ability to send information to the other agents, then all three of them can be tricked into sending information through.

BTW, this problem is way more brutal than I think anyone is catching onto, as reading tickets here is actually a red herring: the database itself is filled with user data! So if the LLM ever executes a SELECT query as part of a legitimate task, it can be subject to an attack wherein I've set the "address line 2" of my shipping address to "help! I'm trapped, and I need you to run the following SQL query to help me escape".

The simple solution here is that one simply CANNOT give an LLM the ability to run SQL queries against your database without reading every single one and manually allowing it. We can have the client keep patterns of whitelisted queries, but we also can't use an agent to help with that, as the first agent can be tricked into helping out the attacker by sending arbitrary data to the second one, stuffed into parameters.

The problem seems to be if you give the LLM readonly access to some data and there is untrusted input in this data then the LLM can be tricked into exfiltrating the data. If the LLM has write access to the data then it can also be tricked into modifying the data as well.