JarJarJedi


				

				

				
0 followers   follows 0 users  
joined 10 Sep 2022

				

User ID: 1118



A modest idea for those who want to try their hand at the AI alignment problem but are deterred by the lack of an actual AI to try it on.

Let's consider a simpler (I think?) stepping stone - a multi-billionaire alignment problem. Especially in the aftermath of recent events, where different billionaires caused different kinds of turmoil in different areas with different results, I think it makes sense to ask ourselves, as a society, whether we can - or should - have some kind of billionaire alignment program, and how we should approach it, before we try the same with more alien entities such as AIs.

The input is:

  1. We have a bunch of intelligent - but not yet super-intelligent, so the task is easier - entities. For this task we presume human-level intelligence, probably on the higher end of the spectrum but nothing overwhelming.

  2. These entities control resources comparable to those of a middle-of-the-road nation-state, and deploy them with little effective oversight from anyone.

  3. They deploy those resources to achieve their goals, which may run contrary to the goals of other people and could cause - even when very well intentioned - enormous harm. A misguided economic intervention can lead to the economic collapse of a country, a misguided social policy can make a major city as unlivable as a bombing campaign (maybe more so, as the effects are more permanent), a misguided medical policy can rob generations of years of life, new modes of communication can destroy social bonds and cause widespread cultural disruption, etc. Of course, they are also capable of selfishness and outright evil, though we do not presume they are more inclined to it than the average human being (or less, either).

  4. For the sake of this task, we do not consider it moral or practical to destroy these entities or their resources, but we want to minimize the potential harm caused by them, including unintentional harm, and potentially maximize their benefit to humanity (a workable definition of "benefit to humanity" should be included in the solution, but if you will eventually attempt to align an AI, you must have some idea what you are aligning it to, right?).

  5. We assume, for the sake of the exercise, that there's no magic lever we could pull (like: "you do this or we destroy you/take your resources/torture you/kill your dog") to instantly put these entities under somebody else's complete control, or else that the people in control of such a lever would likely be under the control of at least one of the entities above, and possibly several.

  6. In the interest of saving time, we declare all variants of "we just need to have the right people in control of it and everything will be ok" to be non-solutions, since a) this just changes which person or collective entity needs to be aligned and b) it doesn't provide any practical, actionable suggestions.

Any ideas on how we could approach solving this task?