
faul_sname

Fuck around once, find out once. Do it again, now it's science.

1 follower   follows 3 users  
joined 2022 September 06 20:44:12 UTC
Verified Email


User ID: 884

As usual, Gwern has the canonical post on this topic.

A lot of people getting put on the street
It's getting harder to stay off the street
He comes home every night past tents on his street
And wonders how much longer till he's out on the street

AHA. Turns out I was not wrong a year ago when I said people would freak out a bit once they realized that LLMs could do this, just early.

The easiest way to say this without being canceled is "if you have a heritable propensity to abuse others, your family members probably share that propensity"

The reason I said "if you instruct Claude to use the programming env" is that Claude will generally do things similar to those that were evaluated well in the past, and most chess-like evals would have forbidden tool use or anything else that human players wouldn't consider "fair play". I expect that putting "always consider what tools you have available and make use of them where it makes sense unless explicitly told not to" in your user instructions will mean you just never run into this in practice.

Bluntly, I don't think it matters how the board state is represented, as long as the answer isn't "Claude is trying to reconstruct the entire board state from the move sequence".
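To make the distinction concrete, here's a minimal sketch (my own illustration, not anything from the game above) contrasting the two representations: replaying the whole move sequence to recover the position versus just rendering an explicit snapshot. It uses plain coordinate moves and deliberately ignores castling, en passant, and promotion.

```python
FILES = "abcdefgh"
BACK = "RNBQKBNR"  # white back rank, uppercase = white, lowercase = black

def start_position():
    """Explicit board state: a dict mapping squares like 'e4' to pieces."""
    board = {}
    for i, f in enumerate(FILES):
        board[f + "1"] = BACK[i]          # white back rank
        board[f + "2"] = "P"              # white pawns
        board[f + "7"] = "p"              # black pawns
        board[f + "8"] = BACK[i].lower()  # black back rank
    return board

def replay(moves):
    """Reconstruct the position by walking the entire move history.
    Simplified (no castling / en passant / promotion) -- the point is
    that every past move must be reapplied to know where things stand."""
    board = start_position()
    for mv in moves:
        src, dst = mv[:2], mv[2:4]
        board[dst] = board.pop(src)  # a capture just overwrites the target
    return board

def to_ascii(board):
    """Snapshot view: render the current position directly, no history needed."""
    rows = []
    for rank in "87654321":
        rows.append(" ".join(board.get(f + rank, ".") for f in FILES))
    return "\n".join(rows)
```

A model that keeps (or is given) the snapshot only has to read it; a model that works from the move list has to get every step of `replay` right, which is exactly the failure mode being ruled out.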

FWIW I tried the prompt

Play good chess.

d4

and Opus 4.7, at various points in the opening, dumped a snapshot of the board state into the chat.

I'm not playing out the full game: on the first attempt, Claude blundered, then spent 20 minutes thinking and writing janky minmax code before hitting the compaction limit and erroring out; on the second attempt it spent another 15 minutes thinking and nearly hit the compaction limit again. But you can see that it does in fact use tools.
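(Claude's actual janky code isn't shown here. For reference, the shape of thing it was reaching for is a minimax/negamax search; a minimal correct version, shown on the take-1-to-3 Nim game rather than chess to keep it self-contained, looks like this.)

```python
def negamax(pile, alpha=-1, beta=1):
    """Negamax with alpha-beta pruning on take-1-to-3 Nim:
    players alternately remove 1-3 stones; taking the last stone wins.
    Returns +1 if the side to move wins with best play, -1 otherwise."""
    if pile == 0:
        return -1  # previous player took the last stone, so we've lost
    best = -1
    for take in (1, 2, 3):
        if take > pile:
            break
        # Opponent's score from their perspective, negated for ours.
        score = -negamax(pile - take, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # opponent won't allow this line; prune
    return best
```

In this game the losing positions are exactly the multiples of 4, which makes the search easy to sanity-check.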

Anyway, to answer the question:

Firstly, why should I as the user have to prompt the AI to make a program to ensure that it doesn’t go off the rails? Why can’t it figure that out for itself?

The AI has no memory. Every conversation is a fresh new world. As a rule of thumb, I expect AI to significantly outperform me at anything I've never done before, but that for any task that hasn't been the subject of absurd amounts of RL (and some tasks that have), I'll very quickly be able to identify the places that AI is likely to fail and steer it around those pitfalls. Because I can learn, and the AI can't.