Back
 AJCM  Vol.11 No.1 , March 2021
Polynomial Time Method for Solving Nash Equilibria of Zero-Sum Games
Abstract: There are a few studies that focus on solution methods for finding a Nash equilibrium of zero-sum games. We discuss the use of Karmarkar’s interior point method to solve the Nash equilibrium problems of a zero-sum game, and prove that it is theoretically a polynomial time algorithm. We implement the Karmarkar method, and a preliminary computational result shows that it performs well for zero-sum games. We also mention an affine scaling method that would help us compute Nash equilibria of general zero-sum games effectively.

1. Introduction

It is well known that John von Neumann the existence of Nash equilibria of zero-sum games in the late 1920s [1].

Dantzig and Thapa [2] mentioned the linear programming formulation of a zero-sum game, and considered reducing linear primal and dual problems to a zero-sum game.

Khachian [3] and Karmarkar’s [4] interior-point methods for solving linear programming problems in polynomial time are significant discoveries.

Khachian’s ellipsoid method [3] is important in that it is a first polynomial time algorithm, however, computational results with it have been disappointing.

Karmarkar’s projective scaling method [4] is considered to be a practical polynomial algorithm. Numerous attempts have been made to refine a variety of interior point methods in the late 1980s [5] [6]. There are few studies concerning computational methods (e.g., [7] ) to solve a Nash equilibrium of a zero-sum game, in spite of those developments in linear programming.

We discuss the use of the Karmarkar’s method to solve a Nash equilibrium of a zero-sum game, which is expressed as linear programming problems derived from the original zero-sum game, and prove that it is without doubt a polynomial time algorithm. We implement the Karmarkar method, and apply it for solving Rock-Paper-Scissors, of which result seems to be promising.

Finally, we also mention an affine scaling method that would help us compute a Nash equilibrium effectively.

2. Formulation

Dantzig and Thapa [2] mentioned that maximin strategy of Player 1 in a zero-sum game can be formulated as the following linear programming problem:

maximize x , ξ ξ subjectto A T x ξ e i = 1 n x i = 1 , x i 0 (1)

where ξ , e = ( 1 , , 1 ) T + n , A n × n is a payoff matrix, and x + n is a mixed strategy vector of Player 1.

Likewise, the minimax strategy of Player 2 can be formulated as the following linear programming problem:

minimize y , η η subjectto A y η e i = 1 n y i = 1 , y i 0 (2)

where y + n is a mixed strategy of Player 2, which is, in fact, the linear programming dual problem of (1).

The following result which clarifies the relationship between (1) and (2).

Theorem 1. There exist solutions { x ¯ , ξ ¯ } and { y ¯ , η ¯ } for (1) and (2). Moreover,

ξ ¯ = η ¯ .

Proof We let ξ = ξ 1 ξ 2 , ξ 1 0 , ξ 2 0 , and rewrite (1) as

minimize x , ξ 1 , ξ 2 ( ξ 1 ξ 2 ) subjectto A T x ( ξ 1 ξ 2 ) e 0 e T x 1 e T x 1 x i 0 , i = 1 , , n , ξ 1 , ξ 2 0 (3)

The dual linear program of (1) is

maximize y , η 1 , η 2 η 1 η 2 subjectto A y + ( η 1 η 2 ) e 0 e T y 1 e T y 1 y i 0 , i = 1 , , n , η 1 , η 2 0 (4)

Note that (4) is equivalent to (2) by letting η = ( η 1 η 2 ) .

There exists a feasible point for (3), such as x = 1 n e , ξ 1 = min j i a i j , ξ 2 = 0 . We also observe that (3) has a finite solution, as ( ξ 1 ξ 2 ) should be bounded because x Δ n 1 , where Δ n 1 is an n 1 simplex. Therefore, ξ ¯ = ( ξ ¯ 1 ξ ¯ 2 ) = η ¯ 1 η ¯ 2 = η ¯ because the strong duality theorem [8] holds. □

The above result shows that (2) is the linear programming dual problem of (1).

Next, we establish the validity of the above formulation. Let S x = { s 1 , , s n } and S y = { s 1 , , s n } be sets of pure strategies of Players 1 and 2, respectively. Let E ( x , y ) denote the expected payoff value for Player 1 when Player 1 takes x and Player 2 takes y as a mixed strategy, that is,

E ( x , y ) = i = 1 n x i E ( s i , y ) = i = 1 n x i ( j = 1 n y j E ( s i , s j ) ) = i = 1 n x i ( j = 1 n y j a i j ) = x T A y (5)

Theorem 2.

E ( x , y ¯ ) E ( x ¯ , y ¯ ) E ( x ¯ , y ) . (6)

Proof Because x ¯ and y ¯ are the solutions for (1) and (2), respectively,

A y ¯ α e A T x ¯ ,

where α = ξ ¯ = η ¯ from Theorem 1.

Therefore,

x T A y ¯ α and α y T A T x ¯ = x ¯ T A y

namely,

E ( x , y ¯ ) E ( x ¯ , y ¯ ) E ( x ¯ , y ) . □

The result from Theorem 2 is known as the minimax theorem [1]; however, the direct derivation above would be simple and easy to comprehend.

3. Karmarkar Method

We can use the simplex method to find a Nash equilibrium from the results of Section 2.

However, the curse of dimensionality may occur if the running time required to solve a linear problem using the simplex method may increase rapidly as the number of variables increases.

Therefore, we employ a variation of Karmarkar’s method [9] which belongs to an interior point method and assures the polynomial time convergence property.

The Karmarkar method [9] we adopt deals with the following canonical form:

minimize z = c T x subjectto A x = 0 a T x = 1 x 0 (7)

where c , x n , a = ( 1 , , 1 m , 0 , , 0 ) T + n , and A n × n .

If c T x ¯ = 0 , where x ¯ is the solution of (7), we refer to (7) as the canonical form.

It should be noted that the original Karmarkar’s method [4] utilizes e = ( 1 , , 1 ) T + n instead of a. In this study, we utilize a because we can formulate the part of components of x for mixed strategies as a T x = 1 in (14) and (15) later.

The key projective transformation sends x to

x ˜ = X 1 x e T X 1 x

where X = diag ( x k ) . We notice that x ˜ Δ n 1 , where Δ n 1 is ( n 1 ) -simplex. The corresponding inverse transformation is

x = X x ˜ a T X x ˜

As seen in [9], the problem in (7) is transformed as

minimize c ˜ T x ˜ subjectto A ˜ x ˜ = 0 e T x ˜ = 1 x ˜ 0 (8)

where c ˜ = X c and A ˜ = A X . We denote the constraint matrix using

B = ( A ˜ e T ) ,

and the corresponding orthogonal projection matrix is P B = I B T ( B B T ) 1 B .

The projected steepest-descent direction is

Δ x ˜ = P B c ˜ = P B X c = ( P ˜ 1 n e e T ) X c = P ˜ X c + c T x k n e (9)

At each step, the next point x ˜ k + 1 is given by

x ˜ k + 1 = e n + α Δ x ˜ , α > 0 ,

more precisely,

x ˜ k + 1 = e n + 1 n ( n + 1 ) Δ x ˜ Δ x ˜ , (10)

so that x ˜ k + 1 will be an inner point in Δ n 1 .

The next point x k + 1 in x-space is obtained by

x k + 1 = X x ˜ k + 1 a T X x ˜ k + 1 . (11)

Finally, we define the optimality criterion as

c T x k < 2 q c T x 0 , (12)

where 2 q is a prescribed precision.

Next, we describe the Karmarkar method as follows:

Algorithm 1.

Input: A n × n , c n , a feasible point x 0 for (7)

Output: x * such that c T x * = 0 , A x * = 0 , a T x * = 1 , x * 0 .

1: Compute Δ x ˜ by (9).

2: Determine x ˜ k + 1 by (10).

3: Obtain x k + 1 using (11). Check the optimality criterion using (12).

4: k k + 1 . Go to Step 1.

Karmarkar defined a potential function as follows:

f ( x ) = n ln c T x j = 1 n ln x j = j = 1 n ln ( c T x x j ) . (13)

He proved that Algorithm 1 generates, { x k } , which reduces f ( x k ) by γ > 0 at each iteration.

In fact, the following result follows for original Karmarkar’s algorithm [4].

Theorem 3. Let { x k } be the sequence generated by Karmarkar’s algorithm [4] applied to (3). Then, (3) can be solved in polynomial time.

Proof We can transform (3) to original Karmarkar’s original canonical form [4] in pol-ynomial time in the manner of [4]. Then the result follows since the original canonical form can be solved in at most (𝑛(𝑞 + log2 𝑛)) iterations from Theorem 1 [4]. □

We can establish the following result.

Theorem 4. A Nash equilibrium of zero-sum games can be found in polynomial time.

Proof Nash equilibria of zero-sum games can be formulated as (1) and (2), even if the value of the game is not necessarily 0. We can rewrite (1) and (2) as (3) and (4) respectively. (3) (and (4)) can be solved in polynomial time from Theorem 3. □

We should observe that the transformation in the above proof is used only for the proof itself. When ξ ¯ 0 , we can employ the dual problem scheme [9, Chapter 17.5] to update an estimated lower bound to, z * , which is used to transform (3) into the canonical form (7).

4. Computational Results

In this section, we employ the Karmarkar method described in the previous section.

The zero-sum game (1) and (2) can be rewritten as:

minimize x , ξ , s ( ξ 1 ξ 2 ) subjectto A T x + ( ξ 1 ξ 2 ) e + ( s 1 , , s n ) T = 0 e T x = 1 x i 0 , i = 1 , , n , ξ 1 , ξ 2 0 , s i 0 , i = 1 , , n (14)

and

maximize y , η , t η 1 η 2 subjectto A y + ( η 1 η 2 ) e + ( t 1 , , t n ) T = 0 e T y = 1 y i 0 , i = 1 , , n , η 1 , η 2 0 , t i 0 , i = 1 , , n (15)

where x n , A n × n , e = ( 1 , , 1 ) T n , y n .

These problems belong to the canonical form (7), which can be solved by Algorithm 1.

Rock-Paper-Scissors is used as a test problem of zero-sum games. Its payoff matrix for Player 1 (gain) and Player 2 (loss) is shown in Table 1.

The Nash equilibrium of Rock-Paper-Scissors is

{ x ¯ , y ¯ } = { ( 1 3 , 1 3 , 1 3 ) , ( 1 3 , 1 3 , 1 3 ) }

and the value of the game is 0.

It should be noted that (14) and (15) are exactly the same because A T = A in this case. Thus, we have only the results of (14) to demonstrate, namely, the problem is:

minimize x , ξ , s ( ξ 1 ξ 2 ) subjectto x 2 + x 3 + ξ 1 ξ 2 + s 1 = 0 x 1 x 3 + ξ 1 ξ 2 + s 2 = 0 x 1 + x 2 + ξ 1 ξ 2 + s 3 = 0 x 1 + x 2 + x 3 = 1 x 1 , x 2 , x 3 0 , ξ 1 , ξ 2 0 , s 1 , s 2 , s 3 0 (16)

The Karmarkar method was coded in Python, and run on a Windows PC (Core i7 8550U, RAM 16GB).

Table 2 summarizes the sequence generated by the Karmarkar method applied to (16) from x 0 = ( 0.8 , 0.1 , 0.1 ) , ξ 1 0 = 0.1 , ξ 2 0 = 0.9 , s 0 = ( 0.8 , 0.1 , 1.5 ) . We can see that the objective value decreases by half in each iteration, which with (12) validates Theorem 3.

Figure 1 shows the sequences generated by the Karmarkar method from three different starting points

Figure 1. Sequences for different starting points.

Table 1. Payoff matrix of rock-paper-scissors for player 1 (P1) and player 2 (P2).

Table 2. Details for x0 = (0.8, 0.1, 0.1).

x 0 = ( 0.8 , 0.1 , 0.1 ) , ( 0.1 , 0.8 , 0.1 ) and ( 0.1 , 0.1 , 0.8 ) .

It seems that each sequence tends linearly to the Nash equilibrium from an arbitrary interior point.

Table 2 and Figure 1 show that the Karmarkar method is effective and efficient even for small-size problems.

5. Concluding Remarks

Linear programming formulations can represent the minimax principle for zero-sum games.

We have proved that the Karmarkar’s algorithm solves the linear programming problems in polynomial time.

The Karmarkar method performs well when the value of the game is 0 . If the value of the game is not 0 , we can utilize the dual problem scheme ( [9], Chapter 17.5).

We can resort to an affine scaling algorithm [10] because, either (1) or (2) is the linear programming standard form generated by introducing slack variables, regardless of whether or not the value of the game is 0 . The affine scaling algorithm for solving equilibria of zero-sum games would be practically effective; however, there is no proof that it has a polynomial time convergence property [11].

Cite this paper: Tanaka, Y. and Togashi, M. (2021) Polynomial Time Method for Solving Nash Equilibria of Zero-Sum Games. American Journal of Computational Mathematics, 11, 23-30. doi: 10.4236/ajcm.2021.111002.
References

[1]   von Neumann, J. (1928) Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100, 295-320.
https://doi.org/10.1007/BF01448847

[2]   Dantzig, G.B. and Thapa, M.N. (1997) Linear Programming 1: Introduction. Springer-Verlag, New York, 166.

[3]   Khachiyan, L.G. (1979) A Polynomial Algorithm in Linear Programming. Doklady Akademii Nauk, 244, 1093-1096.

[4]   Karmarkar, N. (1984) A New Polynomial Time Algorithm for Linear Programming. Combinatorica, 4, 373-395.
https://doi.org/10.1007/BF02579150

[5]   Renegar, J. (1988) A Polynomial-Time Algorithm, Based on Newton’s Method, for Linear Programming. Mathematical Programming, 40, 59-93.
https://doi.org/10.1007/BF01580724

[6]   Vanderbei, R.J., Meketon, M.S. and Freedman, B.A. (1986) A Modification of Karmarkar’s Linear Programming Algorithm. Algorithmica, 1, 395-407.
https://doi.org/10.1007/BF01840454

[7]   Daskalakis, C., Deckelbaum, A. and Kim, A. (2015) Near-Optimal No-Regret Algorithms for Zero-Sum Games. Games and Economic Behavior, 100, 327-348.
https://doi.org/10.1016/j.geb.2014.01.003

[8]   Nocedal, J. and Wright, S.J. (2006) Numerical Optimization. 2nd Edition, Springer, Berlin.

[9]   Nash, S.G. and Sofer, A. (1996) Linear and Nonlinear Programming. McGraw-Hill, New York.

[10]   Dikin, I.I. (1967) Iterative Solution of the Problems of Linear and Quadratic Programming. Soviet Mathematics Doklady, 8, 674-675.

[11]   Tsuchiya, T. and Muramatsu, M. (1995) Global Convergence of a Long-Step Affine Scaling Algorithm for Degenerate Linear Programming Problems. SIAM Journal on Optimization, 5, 525-551.
https://doi.org/10.1137/0805027

 
 
Top