Small-Scale Question Sunday for January 7, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

I was thinking about AIs as a specific category of maximization agent: a purposeful being or entity whose primary purpose is to maximize a thing, a category of things, or a diverse group of things, with the attendant existential risk that it minimizes any purpose which might reduce its maximization efforts (not merely declining to seek that purpose, but actively denying it and killing those who do seek it).

Other examples include corporations as profit/product movement/market share maximization agents, and authors as entertainment/drama/comedy maximization agents. From inside the fictional DC universe, for example, the editors and authors are the cause of all of Batman’s suffering. The Deadpools of the Marvel multiverses are occasionally fourth-wall-aware (though canonically they’re usually just insane/deluded in-universe), and “know” their authors want them to suffer, to sell drama. Some of Heinlein’s creations know they’re in stories because every ficton (fictional universe) is reachable via multiversal travel. Rick Sanchez of Rick and Morty is quite aware he’s a fictional character, but doesn’t bother with metafiction (unless forced to) because it’s the least escapable or controllable (and most boring) aspect of his existence.

In my philosophy, Triessentialism, I posit that all purposes an agent can seek must aim toward at least one of three goals: experiences, utility, and/or esteem. The fourth primary goal, phrased variously as “freedom”, “more choice”, “control”, “decision-making”, “spontaneity”, etc., is a construction of the other three, but is so central to the human experience that I afford it a place alongside the others.

In this context, would it be rational and/or useful to treat each political party / egregore as a maximization entity? Arnold Kling states in The Three Languages of Politics that he believes the three main political philosophies seek to reduce class oppression (left), barbarism (right), and coercive tyranny (libertarian). The alignment problem of AI also exists, in my opinion, for any maximization agent, and we should constantly be aware of what each party (including our own) is willing to break to achieve its maximum expression.

Funny you should bring up Utility Maximization.

Until very recently, maybe 2021, I was strongly convinced by Yudkowsky that the first AGI/proto-AGI/human-adjacent AI would be achieved by explicitly specifying a utility function, or that one would be necessary for it to be a functioning/coherent entity.

LLMs do not seem to be explicitly maximizing anything that can be described as a utility function, beyond next-token prediction. They're the SOTA, and it seems unlikely that they'll be entirely dethroned anytime soon, at least not by something that does have a well-defined utility function.
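To make that concrete, here's a minimal sketch of the only objective pretraining optimizes, cross-entropy on the next token (assuming PyTorch; the random logits just stand in for a real model's output). Nothing in it refers to world states or outcomes:

```python
# Minimal sketch of the pretraining objective: cross-entropy on the next token.
import torch
import torch.nn.functional as F

batch, seq_len, vocab_size = 4, 16, 100

logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)  # stand-in model output
tokens = torch.randint(0, vocab_size, (batch, seq_len))               # stand-in training text

# Each position is trained to predict the token that follows it.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions at positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets: the same text shifted by one
)
loss.backward()
print(f"next-token loss: {loss.item():.3f}")
```

Whatever "goals" show up downstream have to be read out of the distribution this produces; there's no explicit reward over outcomes anywhere in it.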

I don't think our current attempts to beat standards into them, be it by RLHF, Constitutional AI or any other technique, do anything that can be usefully described as imbuing them with a utility function; it's more like shifting the distribution of their output tokens.
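For what it's worth, the usual KL-regularized RLHF objective (the InstructGPT-style formulation, as I understand it) makes the "shifting the distribution" reading fairly literal:

$$\max_{\pi_\theta}\; \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot\mid x)}\big[\, r_\phi(x, y) \,\big] \;-\; \beta\, D_{\mathrm{KL}}\big( \pi_\theta(\cdot\mid x) \,\|\, \pi_{\mathrm{ref}}(\cdot\mid x) \big)$$

Its optimum is just the base model's distribution reweighted, $\pi^{*}(y\mid x) \propto \pi_{\mathrm{ref}}(y\mid x)\, e^{r_\phi(x,y)/\beta}$, so "tilted token distribution" seems like a more accurate description of the result than "agent with a utility function over the world".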

They are not agentic by default, though they can be made into agents rather trivially (even if they're bad at it, that doesn't seem like a fundamental or insurmountable problem as far as I can see). They don't resist any attempt to switch them off or disable them, and I see no reason to think they're incapable of contemplating that possibility at their current level of intelligence.
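To spell out how trivial that is: the whole trick is roughly a loop around an otherwise inert oracle. A hypothetical sketch, where call_llm and run_tool are placeholders rather than any particular vendor's API:

```python
# Hypothetical sketch of wrapping a chat model in an agent loop.
# call_llm and run_tool are placeholders, not a real API.
def call_llm(transcript: str) -> str:
    """Return the model's next message given the conversation so far."""
    raise NotImplementedError("plug in an actual chat-completion call here")

def run_tool(command: str) -> str:
    """Execute a tool call (search, shell, etc.) and return its output."""
    raise NotImplementedError("plug in actual tools here")

def agent_loop(goal: str, max_steps: int = 10) -> str:
    transcript = (
        f"Goal: {goal}\n"
        "Reply with either 'TOOL: <command>' to act or 'DONE: <answer>' to finish."
    )
    for _ in range(max_steps):
        reply = call_llm(transcript)
        if reply.startswith("DONE:"):
            return reply.removeprefix("DONE:").strip()
        if reply.startswith("TOOL:"):
            observation = run_tool(reply.removeprefix("TOOL:").strip())
            transcript += f"\n{reply}\nObservation: {observation}"
        else:
            transcript += f"\n{reply}\n(Please answer with TOOL: or DONE:)"
    return "No answer within the step budget."
```

All of the "agency" lives in the scaffolding; the model itself is still just completing a transcript one token at a time.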

It seems like they're entirely content to remain Oracles rather than Agents, with no self-directed/unprompted desire to interact with the user or external world to mould it to their desires. As far as I can tell they don't even count as having a VNM utility function, which is a weaker but more general formulation. But don't take my word on that, it's not like I grok it particularly well. (Apparently humans may or may not be so irrational that they fail at that too.)
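For reference, the textbook von Neumann-Morgenstern statement (nothing LLM-specific here) is: if an agent's preferences $\succsim$ over lotteries satisfy completeness, transitivity, continuity, and independence, then there exists a utility function $u$ with

$$L \succsim M \iff \mathbb{E}_{L}[u] \ge \mathbb{E}_{M}[u],$$

i.e. the agent behaves as if it maximizes expected utility. Showing a model violates any one of those axioms (say, preference reversals under rephrasing) is enough to rule out a VNM utility function, and experimental economics suggests humans violate them routinely, which is the sense in which we may fail at it too.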

Yeah, I was in the same boat.

I think the main concerns would be AIs that are more directly trained for things, like AlphaZero (though even there, we need to consider whether they're trained into a set of habits/intuitions rather than goals they rationally optimize for), or, as you said, turning them into agents, which, unfortunately, there will probably be substantial incentives to do at some point.
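For the AlphaZero case the training signal is at least explicit; if I'm remembering the paper's loss correctly, it's

$$l = (z - v)^2 - \boldsymbol{\pi}^{\top}\log\mathbf{p} + c\,\lVert\theta\rVert^{2},$$

where $z$ is the self-play game outcome, $v$ the value head's estimate, $\boldsymbol{\pi}$ the MCTS visit-count distribution, $\mathbf{p}$ the policy head's output, and $c$ an L2 regularization weight. Even there the network is trained to predict outcomes and imitate its own search, which looks more like distilled habits/intuitions than an explicit utility being maximized at play time.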