In dynamic games of perfect information, subgame perfect equilibrium is the concept most commonly used to predict players’ behavior. In a generic game with finitely many moves, a unique subgame perfect equilibrium always exists. While the equilibrium concept is easily understood and its characterization is usually straightforward, challenges to its ability to predict players’ behavior have been growing in the literature, on both the theoretical and the experimental fronts.
Rosenthal constructed a game (later dubbed the “Centipede Game”) that consists of a sequence of one hundred moves. In this game, the two players move in alternating periods, and the mover chooses either to pass (to the next period) or to end the game right away. Passing the game to the next period yields a larger total pile of money, but it strictly reduces the payoff the passing player receives if the opponent ends the game at her subsequent turn. The unique subgame perfect equilibrium (SPE) is that the first player ends the game at the first node and each player gets a small sum. Rosenthal argued that it is highly unlikely that, in practice, players will actually choose the SPE strategies when they play that game.
Various centipede game experiments have been conducted to test the predictive power of the concept of SPE. McKelvey and Palfrey reported that only 15% of the players ended the game at the first node (the outcome predicted by SPE) in a high-payoff version, and that this share fell to as low as 0.7% in other versions of the centipede game. In a much simplified two-move extensive form game, Goeree and Holt documented that players usually did not trust their opponents to be rational. In contrast, Palacios-Huerta and Volij conducted experiments involving expert chess players, who are known for their high degree of rationality and their ability to find optimal strategies using backward induction reasoning. The outcome of their experiments is very close to the SPE prediction. Overall, these experimental studies suggest that common knowledge of rationality of all players is the key requirement of SPE, and so it is not surprising that players do not follow SPE strategies if they do not believe their opponents are rational.
In an attempt to reconcile the differences between the theory and the experimental outcomes, various modifications to the assumptions of the games used in the experiments have been proposed. McKelvey and Palfrey, for example, propose that a player believes that the opponent is an altruist with some positive probability. They find that even a very small such probability can induce players to adopt mixed strategies in the early rounds of the game, mimicking the observed behaviors in their experiment1. A few years later, McKelvey and Palfrey use a quantal choice model to re-examine the same experimental results. They show that if one assumes that the probability of implementing a particular strategy is increasing in the equilibrium payoff of the strategy, then the observed behavior more or less coincides with the predicted behavior. Zauner proposes an alternative explanation of McKelvey and Palfrey’s experimental results by assuming a random perturbation of each player’s payoffs. He considers different types of perturbations and selects two best-fit models.
In the theoretical literature, game theorists have proposed alternatives to some key assumptions that lead to SPE, including the common knowledge of rationality and backward induction. Aumann  formalizes the idea of higher order mutual knowledge2. Caplan  treats irrationality as a standard good, and players need to pay to get closer to some (irrational) “bliss belief.” Basu  argues that each history of moves reveals certain characteristics of players to one another, and therefore the outcomes of a game depend on these revealed characteristics (instead of depending on rationality alone). Halpern and Pass  propose the “iterated regret minimization” as a solution concept for strategic games. They apply it to the centipede games and find that, with linear payoffs, players will cooperate for a number of rounds. With exponential payoffs, they will cooperate all the way up to the end of the game. Meanwhile, Rand and Nowak  model the stochastic evolution of strategies in the centipede game and find that the players’ cooperative behavior may in fact be the favored outcome of natural selection.
Advances in psychology also help explain why players in experiments may behave differently than SPE predicts. Epstein et al. conduct studies that test the cognitive-experiential self-theory. They confirm that two conceptual systems, an experiential system and a rational system, operate by their own rules of inference inside the same individual. To some extent, an individual may switch from one system to another. Tirole builds on similar psychological findings and proposes a model of rational irrationality that can explain why people rehearse good news and selectively forget bad news, a universal behavior.
In this paper, we argue along the lines of the above psychological findings and propose another theoretical explanation of the failure of the SPE as a predictor of behavior. We emphasize the observation that even if all players fully understand the concept of subgame perfect equilibrium and even if no players believe that other players are altruists, they still do not follow the SPE strategies when playing the centipede game. We assume that a player can choose to play SPE, i.e., be “rational”, or else may choose to be “behavioral”. If being “behavioral” yields a better expected outcome than being “rational”, then a player would choose to be “behavioral” (or, in standard game theory terminology, “irrational”). Our intuition is as follows. SPE strategies are optimal for a player only when other players follow them. If players do not believe that other players will follow SPE strategies, then their own SPE strategies are not, in general, optimal. In the model, we specify an alternative belief for each player regarding the behavior of other players. Each player then selects his belief (either the SPE belief or the alternative one) at the beginning of the game and optimizes given the selected belief. A “behavioral equilibrium” is formed if each player is better off in the actual outcomes by selecting the alternative belief. These outcomes of the game are determined by the strategies the players actually use in the game.
The basic idea behind the “behavioral equilibrium” concept is that players can choose to believe that their counterparts are either fully rational (such that SPE strategies are the best response) or somewhat irrational (so that SPE strategies are no longer best responses). Given any belief, the players still optimize by choosing the best strategy. This is the same as in a subgame perfect equilibrium. However, the difference between a behavioral equilibrium and a subgame perfect equilibrium is that the alternative beliefs in a behavioral equilibrium do not usually coincide with the players’ actual strategies. If the two are the same, a subgame perfect equilibrium is formed. Therefore, these alternative beliefs are somewhat irrational. Still, these irrational beliefs generate better payoffs than the SPE beliefs. Thus, players will choose these irrational beliefs rationally.
The origin of irrational beliefs is an interesting and open question. Epstein et al.  find that there are an experiential and a rational system in each individual and that an individual can switch from one system to another. We conjecture that irrational beliefs may come from the experiential system, while rational beliefs may come from the rational system. As we observed in the above-mentioned experiments, players are better off using the irrational beliefs than the rational beliefs. These irrational beliefs may not translate into the players’ “maximum” payoffs. But the payoffs are usually very good, and are much better than the payoffs implied by SPE strategies. Therefore, players may reinforce these irrational beliefs and move away from their rational beliefs. In some sense, these irrational beliefs are the “rules of thumb” for the players.
One real-life example related to the centipede games that we examine in this paper is the rotating savings and credit association (Rosca), commonly found in many developing countries. (See Besley et al. and Anderson and Baland, for example.) In these associations, a predetermined group of individuals gets together and contributes a predetermined amount into a “pool”, which is then given to one member (the winner). These gatherings repeat themselves, with previous winners excluded from receiving the “pool” while still being obliged to contribute. The gatherings may stop after each member has received the “pool”, but often the same group continues the Rosca with a new “pool”. These Roscas run the risk of earlier winners defaulting on later contributions, a strategy resembling “stopping early” in the centipede game. Still, defaults are very infrequent. Our model of “irrational beliefs” or “rules of thumb” may shed some light on these phenomena.
The rest of this paper is organized as follows. In Section 2, we analyze a few centipede games using the concept of “behavioral equilibria”. In Section 3, we analyze some of the experiments in centipede games in the literature. In Section 4, we conclude.
2. Centipede Games and Behavioral Equilibria
We begin with a general description of the centipede games.
There are two players, 1 and 2, playing the centipede game of n moves in Figure 1. To simplify notation, we assume that n is even.
In this game, , , , , , , and, , , , ,. It is straight-forward to check that the unique subgame perfect equilibrium strategy for each player is to play T whenever it is his turn to move. Given this strategy, the equilibrium outcome of the game is that player 1 plays T at the very beginning and ends the game with payoffs.
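The backward-induction argument behind this SPE can be sketched in code. The following is a minimal illustration under stated assumptions, not the paper's own procedure: the function name `backward_induction`, the tie-breaking rule (play T when indifferent), and the four-move payoffs are hypothetical and chosen only so that the pile grows at every node.

```python
from typing import List, Tuple

Payoff = Tuple[float, float]  # (player 1's payoff, player 2's payoff)


def backward_induction(take: List[Payoff], final_pass: Payoff) -> Tuple[List[str], Payoff]:
    """Solve a centipede game by backward induction.

    take[k] is the payoff pair if the mover at node k+1 plays T.
    final_pass is the payoff pair if P is played at the last node.
    Player 1 moves at odd nodes, player 2 at even nodes.
    Assumption: an indifferent mover plays T.
    """
    continuation = final_pass
    actions = [""] * len(take)
    for k in range(len(take) - 1, -1, -1):
        mover = k % 2  # 0 for player 1, 1 for player 2
        if take[k][mover] >= continuation[mover]:
            actions[k] = "T"
            continuation = take[k]
        else:
            actions[k] = "P"
    return actions, continuation


# Hypothetical four-move game (illustrative numbers only).
TAKE = [(0.4, 0.1), (0.2, 0.8), (1.6, 0.4), (0.8, 3.2)]
FINAL_PASS = (6.4, 1.6)

spe_actions, spe_outcome = backward_induction(TAKE, FINAL_PASS)
print(spe_actions, spe_outcome)
```

With these illustrative payoffs, the mover at every node prefers T to the continuation value, so the game unravels to the first node.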
Figure 1. A general n-move centipede game.
Now suppose that before the start of the game, the two players choose a belief secretly and simultaneously. Player 1 chooses a belief from; at the same time, player 2 chooses a belief from. Here, represents player i’s subgame perfect equilibrium belief on his opponent j’s behavior; i.e., player j will play T whenever it is his move. On the other hand, denotes player i’s alternative belief. Let be player 1’s belief, where is the probability that player 2 will play T at node 2k conditional on node 2k being reached. For SPE belief,. Similarly, we define, and.
The subgame perfect equilibrium belief is the only belief that satisfies the properties of common knowledge of rationality and backward induction in the centipede game. Therefore, any other belief would violate at least one of these properties. An alternative belief may be derived from a player’s past game-play experience against other players and/or from “rules of thumb” guesses that the player has formed. Since players in general do not always behave rationally, these “rules of thumb” guesses do not always coincide with the other players’ SPE strategies.
In summary, the game we are examining is as follows. Both players simultaneously select their beliefs before the start of the game. Once the belief is selected, it remains the same throughout the game. Given these beliefs regarding an opponent’s behavior, players play the above centipede game. Each player’s goal is to maximize his expected payoff given his chosen belief.
To simplify our analysis, we assume that the beliefs are not updated during the game. (Even if we allow for belief updating, we will not get back the SPE beliefs as long as the initial belief is somewhat incorrect.)
To analyze the modified centipede game, first note the following. If is such that playing T at node 1 is the optimal action for player 1, then the game is over at node 1 no matter what belief player 1 has selected. The more interesting case is when playing T at node 1 is not the optimal action.
If player 1 chooses belief and thus plays T at the first node, the game ends at the first node, with payoffs. If player 1 chooses belief, player 1 maximizes his expected payoff by choosing the node he plans to play T:
Let denote an i that maximizes the above. (Note that there could be many such i’s that maximize the above.) Consider player 2 at node 2. The optimal action with the belief of is to end the game right away. In this case, the payoffs are. If belief is chosen, player 2 maximizes his expected payoff by choosing the node he plans to play T:
Let denote a j that maximizes the above. (Again, there could be many such j’s that maximize the above.)
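The two maximization problems above can be illustrated with a short sketch. It assumes the reading given in the text: a belief assigns to each of the opponent's nodes the probability that the opponent plays T there, conditional on the node being reached, and a player picks the node at which he plans to play T. The names `expected_payoff` and `best_plans`, the encoding of "always pass" as plan n + 1, and the payoffs and belief values are all hypothetical.

```python
from typing import Dict, List, Tuple

Payoff = Tuple[float, float]  # (player 1's payoff, player 2's payoff)

# Hypothetical four-move game (illustrative numbers only).
TAKE: List[Payoff] = [(0.4, 0.1), (0.2, 0.8), (1.6, 0.4), (0.8, 3.2)]
FINAL_PASS: Payoff = (6.4, 1.6)


def expected_payoff(player: int, plan: int, belief: Dict[int, float],
                    take: List[Payoff], final_pass: Payoff) -> float:
    """Expected payoff to `player` (0 or 1) from planning T at node `plan`.

    belief[node] is the believed probability that the opponent plays T at
    `node`, conditional on that node being reached.
    plan == len(take) + 1 encodes "always pass".
    """
    reach = 1.0   # believed probability that the current node is reached
    total = 0.0
    for node in range(1, len(take) + 1):
        if node == plan:
            return total + reach * take[node - 1][player]
        if (node - 1) % 2 != player:  # opponent's node
            p_take = belief[node]
            total += reach * p_take * take[node - 1][player]
            reach *= 1.0 - p_take
    return total + reach * final_pass[player]


def best_plans(player: int, belief: Dict[int, float], take: List[Payoff],
               final_pass: Payoff, tol: float = 1e-9) -> List[int]:
    """Return every plan that maximizes expected payoff (ties are possible)."""
    n = len(take)
    own_nodes = [k for k in range(1, n + 1) if (k - 1) % 2 == player]
    candidates = own_nodes + [n + 1]
    values = {c: expected_payoff(player, c, belief, take, final_pass)
              for c in candidates}
    best = max(values.values())
    return [c for c in candidates if values[c] >= best - tol]


spe_belief = {2: 1.0, 4: 1.0}          # player 2 always plays T
alternative_belief = {2: 0.2, 4: 0.5}  # hypothetical non-SPE belief

print(best_plans(0, spe_belief, TAKE, FINAL_PASS))
print(best_plans(0, alternative_belief, TAKE, FINAL_PASS))
```

Returning a list rather than a single node reflects the remark that several nodes may maximize the expected payoff.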
The proposed pure strategy for player 1 is to select and plan to play T at node. The proposed pure strategy for player 2 is to select and play P at node 2 (if player 1 played P at node 1), and plan to play T at node. The game ends at node.
Definition 1 and form a pure strategy “behavioral equilibrium” if player 1’s payoff is higher by selecting than selecting given player 2’s strategy of playing T at node, and player 2’s payoff is higher by selecting than selecting given player 1’s strategy of playing T at node. That is,
In this behavioral equilibrium, players are better off selecting these non-SPE beliefs than selecting the SPE beliefs. These beliefs are reinforced if the players play these games again later.
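The pure-strategy condition can be checked mechanically once the planned stopping nodes are known. The sketch below is one reading of Definition 1, under assumptions that should be checked against the original inequalities: a player who selects the SPE belief plays T at his first node, the comparison uses realized payoffs, and weak inequalities are used. The function names and payoffs are hypothetical.

```python
from typing import List, Tuple

Payoff = Tuple[float, float]  # (player 1's payoff, player 2's payoff)

# Hypothetical four-move game (illustrative numbers only).
TAKE: List[Payoff] = [(0.4, 0.1), (0.2, 0.8), (1.6, 0.4), (0.8, 3.2)]
FINAL_PASS: Payoff = (6.4, 1.6)


def outcome(i_plan: int, j_plan: int, take: List[Payoff], final_pass: Payoff) -> Payoff:
    """Realized payoffs when player 1 plans T at i_plan and player 2 at j_plan.

    The game ends at the earlier of the two planned nodes; a plan of
    len(take) + 1 encodes "always pass".
    """
    end = min(i_plan, j_plan)
    return take[end - 1] if end <= len(take) else final_pass


def is_pure_behavioral_equilibrium(i_plan: int, j_plan: int,
                                   take: List[Payoff], final_pass: Payoff) -> bool:
    """Check that neither player gains by switching to the SPE belief.

    Assumption: with the SPE belief, player 1 plays T at node 1 and
    player 2 plays T at node 2, while the opponent's plan stays fixed.
    """
    actual = outcome(i_plan, j_plan, take, final_pass)
    p1_with_spe_belief = outcome(1, j_plan, take, final_pass)[0]
    p2_with_spe_belief = outcome(i_plan, 2, take, final_pass)[1]
    return actual[0] >= p1_with_spe_belief and actual[1] >= p2_with_spe_belief


# Player 1 plans to always pass (plan 5); player 2 plans T at node 4.
print(is_pure_behavioral_equilibrium(5, 4, TAKE, FINAL_PASS))
# Player 1 plans T at node 3: player 2 would have done better taking at node 2.
print(is_pure_behavioral_equilibrium(3, 4, TAKE, FINAL_PASS))
```

In the first case the game ends at node 4 and both players earn more than they would by taking at their first node; in the second, player 2 earns less than the node-2 payoff, so the condition fails.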
Now consider mixed-strategy “behavioral equilibria”. If more than one j maximizes (2), or more than one i maximizes (1), the players could use mixed strategies. Let denote any of player 1’s optimal mixed strategies, where are all of the numbers that maximize (1). Similarly, let denote any of player 2’s optimal mixed strategies, where are all of the numbers that maximize (2). Then the outcomes of the game are determined by and.
Definition 2 and form a mixed-strategy “behavioral equilibrium” if player 1’s payoff is higher by selecting (compared to) given player 2’s strategy, and player 2’s payoff is higher by selecting (compared to) given player 1’s strategy.
Again, in this behavioral equilibrium, players are better off selecting these non-SPE beliefs than selecting those SPE beliefs. We can generalize the concept of behavioral equilibria to any general game G with n players and normal-form payoff ,
Definition 3 Suppose that is a subgame perfect equilibrium strategy profile in G. Let be player i’s subgame perfect equilibrium belief about other players’ strategies. Suppose that is another belief of player i about other players’ strategies and that is player i’s best response to. Then form a “behavioral equilibrium” if,
Note that in the above definition, a player’s belief may not be correct; that is, is not necessarily the same as. However, the optimal responses to these “incorrect” beliefs generate higher payoffs to each player than the subgame perfect equilibrium payoffs. Therefore, these “incorrect” beliefs are reinforced.
Note also that the subgame perfect equilibrium strategy profile together with the corresponding correct beliefs always forms a behavioral equilibrium. In fact, according to the definition, there could be many behavioral equilibria in a game. However, in games with dominant strategies, such as the Prisoner’s Dilemma, the profile in which players use their dominant strategies is the unique behavioral equilibrium, since these strategies are optimal independent of the players’ beliefs.
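The point about dominant strategies can be illustrated with a small sketch: in a Prisoner's Dilemma, the best response does not change with the belief about the opponent. The payoff numbers and the function name `best_response` are hypothetical and serve only as an example of a game with a strictly dominant strategy.

```python
# Row player's payoffs: PAYOFF[own_action][opponent_action].
# Hypothetical Prisoner's Dilemma numbers (C = cooperate, D = defect).
PAYOFF = {
    "C": {"C": 3.0, "D": 0.0},
    "D": {"C": 5.0, "D": 1.0},
}


def best_response(prob_opponent_cooperates: float) -> str:
    """Best response to the belief that the opponent plays C with this probability."""
    q = prob_opponent_cooperates
    expected = {
        action: q * PAYOFF[action]["C"] + (1.0 - q) * PAYOFF[action]["D"]
        for action in PAYOFF
    }
    return max(expected, key=expected.get)


# Sweep over a grid of beliefs, from "certainly defects" to "certainly cooperates".
responses = {best_response(k / 10) for k in range(11)}
print(responses)
```

Because D is optimal for every belief on the grid, choosing a different belief cannot lead to a different action, which is why no alternative belief changes the outcome in such games.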
Below, we focus on centipede games to illustrate our equilibrium concept.
Example 1 Consider the eight-move centipede game in Figure 2.
Suppose that and. Then it is straight-forward to obtain, and. That is, player 1 playing T at node 7 is optimal given, while player 2 playing T at node 6 is optimal given. The minimum of and, , is 6; that is, the game ends at node 6, with payoffs (2,5).
It is easy to see that and form a behavioral equilibrium because, and.
Example 2 Consider the six-move centipede game in Figure 3.
In this game, we can construct pure-strategy behavioral equilibria similarly to the last example. Let, and. Then we have, and. Therefore,; that is, the game ends at node 3. This constitutes a behavioral equilibrium as the final outcome is (3,0), which is weakly better for both players than the SPE outcome of (1,0).
Now consider a mixed-strategy behavioral equilibrium. Suppose that and, with and. Given these beliefs, denote player 1’s expected payoff of planning to play T at node i by. We have, , and. For player 1 to randomize between playing T at node 3 and playing T at node 5, we should set;
Similarly, for player 2, , , and.
Figure 2. An eight-move centipede game.
Figure 3. A six-move centipede game.
Suppose that. Then.
To construct a behavioral equilibrium, player 1’s mixed strategy must satisfy the following two conditions regarding each player’s actual payoffs. First, for player 1, is at least 1, which is player 1’s payoff by following SPE strategy and playing T at node 1. This gives us. Second, for player 2, must be at least 2, which is player 2’s payoff by following SPE strategy and playing T at node 2. This gives us. Therefore, any would satisfy these two conditions.
To summarize, , , , , where, and form a mixed-strategy behavioral equilibrium.
3. Analyzing Previous Centipede Game Experiments
McKelvey and Palfrey report the results of seven different centipede game experiments. Sessions 1 to 3 are four-move centipede games with the following payoffs:, , , , and.3 Session 4 is a high-payoff four-move centipede game where the payoffs are quadrupled. Sessions 5 to 7 are six-move centipede games with the following payoffs:, , , , , , and .
3Payoffs are obtained if player 2 chooses to pass at move 4.
Table IIA in McKelvey and Palfrey reports the proportion of observations at each terminal node. In that table, is used to denote the proportion of games that end at node i. From these proportions, we can calculate the players’ strategies as follows. For the four-move game, let and be the proportions of player 1 subjects who plan to choose TAKE at node 1 and at node 3, respectively. (Therefore, the proportion of player 1 subjects choosing Pass at node 3 is equal to.) Similarly, let and be the proportions of player 2 subjects who plan to choose TAKE at node 2 and at node 4, respectively, so that the proportion of player 2 subjects choosing Pass at node 4 is equal to. Then, , , and. We define similarly in the six-move game. Then we have, , , , , and. The results are reported in Table 1.
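The step from terminal-node proportions to planned strategies can be sketched as follows. This is a reconstruction under assumptions, since the formulas themselves are not reproduced here: it supposes that the two players' plans are independent, so that the share of games ending at a node equals the mover's share planning TAKE there times the share of opponents who have not planned TAKE at an earlier node. The function name `infer_plans` and the input proportions are hypothetical and are not the values reported by McKelvey and Palfrey.

```python
from typing import Dict, List


def infer_plans(terminal_shares: List[float]) -> Dict[int, float]:
    """Recover planned-TAKE shares from the shares of games ending at each node.

    terminal_shares[k] is the share of games ending at node k+1; the last
    entry is the share of games in which every node is passed.
    Returns {node: share of that node's mover population planning TAKE there}.
    Assumption: the two populations' plans are independent.
    """
    n = len(terminal_shares) - 1
    plans: Dict[int, float] = {}
    planned_so_far = [0.0, 0.0]  # cumulative planned-TAKE share per player
    for node in range(1, n + 1):
        mover = (node - 1) % 2
        opponent_still_in = 1.0 - planned_so_far[1 - mover]
        plans[node] = terminal_shares[node - 1] / opponent_still_in
        planned_so_far[mover] += plans[node]
    return plans


# Hypothetical four-move proportions (illustrative, sum to one).
shares = [0.1, 0.3, 0.3, 0.2, 0.1]
plans = infer_plans(shares)

# Consistency check: the implied share of games passing every node.
always_pass_1 = 1.0 - plans[1] - plans[3]
always_pass_2 = 1.0 - plans[2] - plans[4]
implied_last_share = always_pass_1 * always_pass_2
print(plans, implied_last_share)
```

The final product should reproduce the last entry of the input when the independence assumption holds exactly; a gap between the two in real data would indicate that the assumption is only approximate.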
Table 1. Players’ strategies and optimal actions.
We cannot infer a player’s belief in playing these games from the data since many different beliefs could lead to the same observed strategy. Therefore, in each session, we assume that a player’s belief corresponds exactly to his rival’s revealed strategy and calculate the player’s optimal action according to that belief. In the calculations, we assign the players a utility function with a constant degree of absolute risk aversion of 0.5 so that the players are modestly risk averse. That is, for player i, where x is the amount of money earned in one game. The results are reported in Table 1 as well. The percentage number after each optimal action is the percentage of players actually choosing the implied optimal action in that session. As we can see from the table, the majority of the players chose the implied optimal action in all but session 3. We interpret these findings cautiously as our assumption that a player’s belief corresponds exactly to his rival’s revealed strategy is only one possible specification of beliefs consistent with the behavioral equilibrium. Nevertheless, and in contrast with the predictions of SPE, the behavior of the majority of the players can be explained by our theory.
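The optimal-action calculation described above can be sketched as follows. Several elements are assumptions: the utility function is taken to be the CARA form 1 - exp(-0.5x), which has absolute risk aversion 0.5 but may differ from the paper's normalization; the rival's revealed strategy is represented as shares of rivals planning TAKE at each node; and the payoffs and shares are hypothetical rather than the experimental values.

```python
import math
from typing import Dict, List, Tuple

Payoff = Tuple[float, float]  # (player 1's payoff, player 2's payoff)

# Hypothetical four-move game (illustrative numbers only).
TAKE: List[Payoff] = [(0.4, 0.1), (0.2, 0.8), (1.6, 0.4), (0.8, 3.2)]
FINAL_PASS: Payoff = (6.4, 1.6)


def cara_utility(x: float, risk_aversion: float = 0.5) -> float:
    """CARA utility; the normalization 1 - exp(-a x) is an assumption."""
    return 1.0 - math.exp(-risk_aversion * x)


def outcome(i_plan: int, j_plan: int, take: List[Payoff], final_pass: Payoff) -> Payoff:
    """Payoffs when player 1 plans T at i_plan and player 2 at j_plan."""
    end = min(i_plan, j_plan)
    return take[end - 1] if end <= len(take) else final_pass


def expected_utility_player1(i_plan: int, rival_plans: Dict[int, float],
                             take: List[Payoff], final_pass: Payoff) -> float:
    """Player 1's expected CARA utility against a distribution of rival plans."""
    return sum(share * cara_utility(outcome(i_plan, j_plan, take, final_pass)[0])
               for j_plan, share in rival_plans.items())


def optimal_plan_player1(rival_plans: Dict[int, float],
                         take: List[Payoff], final_pass: Payoff) -> int:
    """Plan (odd node, or len(take) + 1 for 'always pass') with highest expected utility."""
    n = len(take)
    candidates = list(range(1, n + 1, 2)) + [n + 1]
    return max(candidates,
               key=lambda i: expected_utility_player1(i, rival_plans, take, final_pass))


# Hypothetical revealed strategy of player 2: shares planning T at node 2,
# at node 4, and never (encoded as plan 5).
rival = {2: 0.3, 4: 0.6, 5: 0.1}
print(optimal_plan_player1(rival, TAKE, FINAL_PASS))
```

Under these illustrative numbers, planning T at node 3 gives the highest expected utility; the comparison with the share of subjects actually choosing that plan is what the percentages in Table 1 report.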
4. Conclusion
In this paper, we propose a concept of behavioral equilibrium to explain the observed behavior of players in centipede games. Experimental evidence suggests that players’ behavior is inconsistent with game-theoretic predictions. We allow players to abandon the “logic” of subgame perfect equilibrium and to choose an alternative belief about opponents’ expected behavior, formed from previous experience in similar situations. We show that, under certain conditions, players are better off abandoning the “logic” of subgame perfect equilibrium and choosing the alternative belief instead. We argue that this reinforces the players’ subjective belief that subgame perfect equilibrium may not work well in these games and, by extension, that the alternative belief becomes the belief of choice. We support our theory by re-examining the results of centipede game experiments conducted by other researchers.
We thank the referees, Jim Bergin, Lester Kwong and Jasmina Arifovic for helpful comments. Ruqu Wang’s research is supported by the Social Sciences and Humanities Research Council of Canada. Xiaoting Wang acknowledges support from the National Natural Science Foundation of China (#71571038).
2Samet  labels the material rationality in a centipede game as common belief instead of common knowledge.
 Rand, D. and Nowak, M. (2012) Evolutionary Dynamics in Finite Populations Can Explain the Full Range of Cooperative Behaviors Observed in the Centipede Game. Journal of Theoretical Biology, 300, 212-221.
 Epstein, S., Lipson, A., Holstein, C. and Huh, E. (1992) Irrational Reactions to Negative Outcomes: Evidence for Two Conceptual Systems. Journal of Personality and Social Psychology, 62, 328-339.