The perfect rolling sphere of alignment

Alignment as Pareto-efficiency

Say agent α has action set Xα and utility function Uα(xα, xβ); likewise for agent β. We consider these agents "aligned" if they can be modeled as a single super-agent. What does this mean? There are two ways to express it:

  1. The agents choose actions (xα*, xβ*) such that no other joint action they could reach through co-ordination is better for both of them.
  2. The agents choose actions (xα*,xβ*) which maximize an "aggregate utility function" (a utility function which satisfies some sensible properties, like being increasing in both agents' utilities).

… in other words the agents act in a way which is Pareto-efficient.
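As a minimal sketch of how the two formulations relate, the snippet below checks both on a small finite game. The payoff table (a prisoner's-dilemma-style game), the function names, and the weights are illustrative assumptions, not part of the setup above. It also uses the standard notion of Pareto dominance (at least as good for both, strictly better for one), which is slightly stronger than "better for both".

```python
def pareto_efficient(outcomes):
    """Return the joint actions whose utility pairs are not Pareto-dominated.

    outcomes maps a joint action to a pair (u_alpha, u_beta).
    """
    def dominates(v, u):
        # v is at least as good for both agents and differs from u,
        # so it is strictly better for at least one of them.
        return v[0] >= u[0] and v[1] >= u[1] and v != u

    return {a for a, u in outcomes.items()
            if not any(dominates(v, u) for v in outcomes.values())}


def argmax_weighted(outcomes, w_alpha, w_beta):
    """Joint action maximizing one possible aggregate utility:
    a weighted sum with positive weights (increasing in both utilities)."""
    return max(outcomes,
               key=lambda a: w_alpha * outcomes[a][0] + w_beta * outcomes[a][1])


# Hypothetical payoffs, chosen only for illustration.
outcomes = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 4),
    ("D", "C"): (4, 0),
    ("D", "D"): (1, 1),
}

efficient = pareto_efficient(outcomes)
print(sorted(efficient))
print(argmax_weighted(outcomes, 1.0, 1.0))
print(argmax_weighted(outcomes, 1.0, 0.1))
```

Every maximizer of a positively weighted sum lands in the Pareto-efficient set, while the mutually-defecting outcome (the Nash equilibrium of this particular table) does not.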

In particular, alignment can be achieved by trade.

Consider the following toy example.

Suppose Xα = ℝ^G (restricted to non-negative quantities), representing the "quantities of each good in G = {g1, …, gn} that α supplies", and Xβ = {0}. β, who receives these goods, has preferences over them, e.g. Uβ(xα) = ∑i λi log(xα_gi), where xα_gi is the quantity of good gi that α supplies, whereas α only sees the cost of producing them: Uα(xα) = −p_in ⋅ xα, where p_in are the (positive) input costs. Then of course the only Nash equilibrium is xα = (0, …, 0).

(To be clear, this is also Pareto-efficient. Indeed, the case where the wishes of one agent, β, are totally ignored is also one possible alignment: the "degenerate case" of alignment, if you will. The point will be that … TODO)

But now suppose β can bid for the goods.

scratch

Suppose we have a good G, with Xα0 = [0, Q) the "quantity of G that α tries to consume", and likewise Xβ0 = [0, Q) for β. Each agent gets log(1 + x) utility from its respective consumption; if xα + xβ > Q, the two agents go to war, which costs them 10^99 utility.
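A rough sketch of this game on a discretised action set, to see where its Nash equilibria fall. The value Q = 4, the integer grid standing in for [0, Q), and the reading that the war cost is charged to each agent separately are all assumptions made for illustration.

```python
import math

Q = 4              # total quantity of G (illustrative value)
WAR_COST = 1e99    # assumed to be charged to each agent if claims exceed Q
GRID = range(Q)    # integer stand-in for [0, Q): 0, 1, ..., Q-1


def utility(own, other):
    """log(1 + own consumption), minus the war cost if total claims exceed Q."""
    u = math.log(1 + own)
    if own + other > Q:
        u -= WAR_COST
    return u


def best_responses(other):
    """All grid actions that maximize utility against the other agent's action."""
    best = max(utility(x, other) for x in GRID)
    return {x for x in GRID if utility(x, other) == best}


# A pair is a Nash equilibrium if each action is a best response to the other.
nash = [(a, b) for a in GRID for b in GRID
        if a in best_responses(b) and b in best_responses(a)]
print(nash)
```

On this grid the equilibria are exactly the splits that use up all of Q with both agents consuming something, so the threat of war selects the efficient frontier but not a particular point on it.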

For instance, suppose we have a good G, with Xα = [0, q] the "quantity of G consumed" by α (with the quantity consumed by β always set to q − xα, so Xβ = ⌀), and Uα(xα) = log(xα); Uβ(xα) = log(q − xα), where q is the total quantity of G. In the absence of trade, where α is the only actor, the Nash equilibrium is xα = q, which gives α a utility of log(q) and β a utility of −∞.

However we might consider the following "expanded game", where:

  1. There is a third agent μ whose action is represented by "the price of G". It acts by charging agents a price p for the amount of G they consume. Its goal is to minimize disequilibrium, i.e. Uμ = −|q − xα − xβ|.
  2. Then Uα = log (xα) − pxα and Uβ = log (xβ) − pxβ.

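Then the Nash equilibrium occurs at (xα, xβ, p) = (q/2, q/2, 2/q): each consumer's first-order condition 1/x = p gives x = 1/p, and μ's utility is maximized when 2/p = q. This allocation also maximizes the sum of the consumption utilities, log(xα) + log(xβ), subject to xα + xβ = q.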
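The equilibrium (q/2, q/2, 2/q) of the expanded game can be checked numerically. The sketch below assumes each consumer best-responds to the price by maximizing log(x) − p·x, and finds the market-clearing price by bisection; the function names, the bracketing interval, and the sample value q = 3 are illustrative choices.

```python
def demand(p):
    """Maximizer of log(x) - p*x over x > 0; the first-order condition is 1/x = p."""
    return 1.0 / p


def clearing_price(q, lo=1e-9, hi=1e9, iters=200):
    """Bisect (on a log scale) for the price at which total demand equals q.

    Excess demand 2/p - q is strictly decreasing in p, so there is one root.
    """
    for _ in range(iters):
        mid = (lo * hi) ** 0.5      # geometric midpoint of the bracket
        if 2 * demand(mid) > q:     # demand exceeds supply: price is too low
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5


q = 3.0
p_star = clearing_price(q)
x_star = demand(p_star)
print(p_star, x_star)   # compare with 2/q and q/2
```

At the price found, μ's disequilibrium |q − xα − xβ| is zero and neither consumer wants to change its consumption, which is the fixed point described above.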

It is also possible to think of all of the above as a constrained optimization problem with prices as Lagrange multipliers, etc.
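To spell out the Lagrangian reading under one natural assumption (that the "planner" maximizes the sum of log-utilities subject to the resource constraint), the multiplier on the constraint plays the role of the price p:

```latex
\begin{aligned}
&\max_{x_\alpha,\, x_\beta > 0} \; \log x_\alpha + \log x_\beta
  \quad \text{subject to} \quad x_\alpha + x_\beta = q, \\
&\mathcal{L}(x_\alpha, x_\beta, p)
  = \log x_\alpha + \log x_\beta - p\,(x_\alpha + x_\beta - q), \\
&\frac{\partial \mathcal{L}}{\partial x_\alpha} = \frac{1}{x_\alpha} - p = 0,
  \qquad
  \frac{\partial \mathcal{L}}{\partial x_\beta} = \frac{1}{x_\beta} - p = 0, \\
&\Rightarrow\; x_\alpha = x_\beta = \frac{1}{p},
  \qquad \frac{2}{p} = q
  \;\Rightarrow\; p = \frac{2}{q}, \quad x_\alpha = x_\beta = \frac{q}{2}.
\end{aligned}
```

The stationarity conditions are the same first-order conditions each consumer faces in the expanded game, and the constraint is the condition under which μ's utility attains its maximum of zero.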