Small-Scale Question Sunday for July 16, 2023

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


You never specified that the AI in question had a "maximum" reward value beyond which it is indifferent.

Isn't that kind of implied if it can't store beyond a certain number? Like I said, acquiring more compute to store bigger values of reward is functionally the same as decreasing its value of reward.

If it hits a predetermined max beyond which it doesn't care, further behavior depends entirely on the specific architecture of the AI. It might plausibly seek more resources to help it minimize the probability of the existing reward being destroyed, be it by Nature, or other agents, or it might just shut itself off or go insane since it becomes indifferent to all further actions.

Yes, that's my central question. My argument is that it need not do anything close to apocalyptic for preservation. I am interested in the other possibilities, like "going insane", since I'm not sure what would happen in that case.

You ought to pick an easier goal than solving chess.

Ah, it's just a cliché example. Still, I think you could realistically weakly solve it, though you're right that it would take an enormous amount of resources. My point is that it was a closed-ended goal, but if you can't even measure the fitness properly for solving chess due to the complexity, and it would potentially realise the futility, I'm not sure how ultimately relevant it is?

Isn't that kind of implied if it can't store beyond a certain number? Like I said, acquiring more compute to store bigger values of reward is functionally the same as decreasing its value of reward.

I struggle to think of any AI architecture that works the way you envision, using fractional ratios of reward to available room for reward instead of plain absolute magnitude of reward. I could be wrong, but I still doubt that's ever done.

Yes, that's my central question. My argument is that it need not do anything close to apocalyptic for preservation. I am interested in the other possibilities, like "going insane", since I'm not sure what would happen in that case.

It's impossible to answer that without digging into the exact specifications of the AI in question, and what tie-breaker mechanism it has to adjudicate between options when all of them have the same (zero) reward. Maybe it picks the first option, maybe it chooses randomly.
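
To make the tie-breaker point concrete, here is a minimal Python sketch. It is not any real agent's architecture: the function names `pick_first` and `pick_random` are hypothetical, and it assumes the agent simply takes an argmax over a list of per-option rewards.

```python
import random


def pick_first(rewards):
    """Return the index of the first option with the highest reward."""
    best = max(rewards)
    return rewards.index(best)


def pick_random(rewards, rng):
    """Return a uniformly random index among options tied for the highest reward."""
    best = max(rewards)
    tied = [i for i, r in enumerate(rewards) if r == best]
    return rng.choice(tied)


# Every option yields the same (zero) reward, as for an agent that has hit its cap.
rewards = [0, 0, 0, 0]
first_choice = pick_first(rewards)
random_choice = pick_random(rewards, random.Random(0))
```

With every option at zero reward, `pick_first` always returns index 0, while `pick_random` can return any index, so behavior past the cap comes down entirely to this one design choice.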

However, I am under the impression that in the majority of cases, a reward maximizing agent will simply try to minimize the risk of losing its accrued reward if it's maxed out, which will likely result in large scale behavior indistinguishable from attempting to increase the reward itself (turning the universe into computronium).

My point is that it was a closed-ended goal, but if you can't even measure the fitness properly for solving chess due to the complexity, and it would potentially realise the futility, I'm not sure how ultimately relevant it is?

Why could you not measure the fitness? Even if we can't evaluate each decision chain in chess, we know how many there are, so a reward that increases linearly for each tree solved should work.

using fractional ratios of reward to available room for reward instead of plain absolute magnitude of reward.

How does it follow that it's a fractional ratio? The only relevant fact is whether the maximum value has been reached. How could it even compare the absolute magnitude, if it can't store a larger number?
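
As a rough illustration of the saturation argument, here is a Python sketch assuming a hypothetical 8-bit reward register that saturates instead of overflowing. `MAX_REWARD`, `add_reward` and `marginal_value` are made-up names for illustration only.

```python
# Assumption: the reward is stored in an 8-bit unsigned register.
MAX_REWARD = 2**8 - 1


def add_reward(current, gain, cap=MAX_REWARD):
    """Add gain to the stored reward, clamping at the largest storable value."""
    return min(current + gain, cap)


def marginal_value(current, gain, cap=MAX_REWARD):
    """Return how much the stored reward actually changes for a given gain."""
    return add_reward(current, gain, cap) - current
```

Once the register is full, every further gain has a marginal value of zero. No ratio is involved anywhere: the only comparison is against the cap.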

However, I am under the impression that in the majority of cases, a reward maximizing agent will simply try to minimize the risk of losing its accrued reward if it's maxed out,

I agree with this, but based on my knowledge of speculative ways to survive until the end of the Universe, few involve turning it into computronium. Presumably, AI would still factor in risk.

Why could you not measure the fitness?

I mean that, in practice, it could never be realised, for the reasons you mentioned: achievement beyond a certain value would be impossible, since you can't strongly solve chess within current physical limits.