What Is the Prisoner's Dilemma?
The prisoner's dilemma is a paradox in decision analysis in which two individuals acting in their own self-interests do not produce the optimal outcome.
A classic example in game theory, the prisoner's dilemma was developed in 1950 by RAND Corporation mathematicians Merrill Flood and Melvin Dresher during the Cold War (and later given its name by the mathematician Albert Tucker). Some have speculated that the prisoner's dilemma was crafted to simulate strategic thinking between the U.S.A. and U.S.S.R. during the Cold War.
Today, the prisoner's dilemma is a paradigmatic example of how strategic thinking between individuals can lead to suboptimal outcomes for both players.
- A prisoner's dilemma is a situation where individual decision-makers always have an incentive to choose in a way that creates a less than optimal outcome for the individuals as a group.
- Prisoner's dilemmas occur in many aspects of the economy.
- In the classic prisoner's dilemma, individuals receive the greatest payoffs if they betray the group rather than cooperate.
- If games are repeated, it is possible for each player to devise a strategy that rewards cooperation.
- People have developed many methods of overcoming prisoner's dilemmas to choose better collective results despite apparently unfavorable individual incentives.
Understanding the Prisoner's Dilemma
The typical prisoner's dilemma is set up in such a way that both parties choose to protect themselves at the expense of the other participant. As a result, both participants find themselves in a worse state than if they had cooperated with each other in the decision-making process. The prisoner's dilemma is one of the most well-known concepts in modern game theory.
The prisoner's dilemma presents a situation where two parties, separated and unable to communicate, must each choose between cooperating with the other or not. The best combined outcome occurs when both parties choose to cooperate, yet each party can do even better individually by defecting while the other cooperates.
The classic prisoner’s dilemma goes like this:
- Two bank robbers, Elizabeth and Henry, have been arrested and are being interrogated in separate rooms.
- The authorities have no other witnesses, and can only prove the case against them if they can convince at least one of the robbers to betray their accomplice and testify to the crime.
- Each bank robber is faced with the choice to cooperate with their accomplice and remain silent or to defect from the gang and testify for the prosecution.
- If they both cooperate and remain silent, then the authorities will only be able to convict them on a lesser charge, resulting in one year in jail for each (1 year for Elizabeth + 1 year for Henry = 2 years total jail time).
- If one testifies and the other does not, then the one who testifies will go free and the other will get five years (0 years for the one who defects + 5 for the one convicted = 5 years total).
- However, if both testify against the other, each will get three years in jail for being partly responsible for the robbery (3 years for Elizabeth + 3 years for Henry = 6 years total jail time).
The respective penalties can be expressed visually as follows:
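|Possible Outcomes of Prisoner's Dilemma (years in jail: Elizabeth, Henry)|
|Outcome||Henry Cooperates||Henry Defects|
|Elizabeth Cooperates||1, 1||5, 0|
|Elizabeth Defects||0, 5||3, 3|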
In this case, each robber always has an incentive to defect, regardless of the choice the other makes. From Elizabeth's point of view, if Henry remains silent, then Elizabeth can either cooperate with Henry and do a year in jail, or defect and go free. She would be better off betraying Henry in this case. On the other hand, if Henry defects and testifies against Elizabeth, then her choice becomes either to remain silent and do five years or to talk and do three years in jail. Again, she would prefer three years over five.
In both cases, whether Henry cooperates with Elizabeth or defects to the prosecution, Elizabeth will be better off if she defects and testifies. Since Henry faces exactly the same set of choices, he will also always be better off defecting.
The paradox of the prisoner's dilemma is this: the robbers minimize their combined jail time (two years total) only if both cooperate and stay silent, but the incentives that each faces separately drive both to defect, and they end up with the maximum combined jail time of six years.
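The reasoning above can be checked mechanically. The following is a minimal Python sketch, not part of the original example: the jail terms come from the scenario, while the names `YEARS`, `best_reply_for_elizabeth`, and `total_years`, and the labels `"C"` (cooperate, stay silent) and `"D"` (defect, testify), are assumptions made for illustration.

```python
# Jail terms in years, keyed by (Elizabeth's move, Henry's move).
# Each value is (Elizabeth's years, Henry's years).
# "C" = cooperate (stay silent), "D" = defect (testify). Labels are illustrative.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (5, 0),
    ("D", "C"): (0, 5),
    ("D", "D"): (3, 3),
}


def best_reply_for_elizabeth(henry_move):
    """Return the move that minimizes Elizabeth's own jail time, given Henry's move."""
    return min("CD", key=lambda move: YEARS[(move, henry_move)][0])


def total_years(elizabeth_move, henry_move):
    """Return the combined jail time for the pair."""
    return sum(YEARS[(elizabeth_move, henry_move)])


if __name__ == "__main__":
    for henry_move in "CD":
        print("Henry plays", henry_move, "-> Elizabeth's best reply:",
              best_reply_for_elizabeth(henry_move))
    print("Combined years if both cooperate:", total_years("C", "C"))
    print("Combined years if both defect:", total_years("D", "D"))
```

Under these assumptions, defecting is Elizabeth's best reply to either of Henry's moves, while mutual defection costs the pair six years in total against two for mutual cooperation. By symmetry the same holds for Henry.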
The prisoner's dilemma is frequently used in economics or business situations to explain why individual incentives might lead actors to choose a sub-optimal outcome.
Examples of the Prisoner's Dilemma
The economy is replete with prisoner's dilemmas, whose outcomes can be either beneficial or harmful to the economy and society as a whole. The common thread is this: the incentives that each decision-maker faces lead them to behave in a way that leaves everyone collectively worse off, and steer them away from the choices that would leave everyone better off if they could somehow coordinate.
One such example is the tragedy of the commons. It may be to everyone's collective advantage to conserve and reinvest in a common pool of natural resources in order to be able to continue consuming them, but each individual always has an incentive to instead consume as much as possible as quickly as possible, which then depletes the resources. Finding some way to cooperate would clearly make everyone better off here.
On the other hand, the behavior of cartels can also be considered a prisoner’s dilemma. All members of a cartel can collectively enrich themselves by restricting output to keep the price that each receives high enough to capture economic rents from consumers, but each cartel member individually has an incentive to cheat on the cartel and increase output to also capture rents away from the other cartel members. In terms of the welfare of the overall society that the cartel operates in, this is an example of how individual incentives can sometimes actually make society better off as a whole.
Escape from the Prisoner's Dilemma
Over time, people have worked out a variety of solutions to prisoner’s dilemmas in order to overcome individual incentives in favor of the common good.
First, in the real world, most economic and other human interactions are repeated. Strictly speaking, a prisoner's dilemma is played only once; when the same game is played repeatedly, it is called an iterated prisoner's dilemma. In an iterated prisoner's dilemma, the players can choose strategies that reward cooperation or punish defection over time. By choosing to interact repeatedly with the same individuals, people can even deliberately turn a one-time prisoner's dilemma into a repeated one.
Second, people have developed formal institutional strategies to alter the incentives that individual decision-makers face. Collective action to enforce cooperative behavior through reputation, rules, laws, democratic or other collective decision-making procedures, and explicit social punishment for defections transforms many prisoner’s dilemmas toward the more collectively beneficial cooperative outcomes.
Last, some people and groups of people have developed psychological and behavioral biases over time such as higher trust in one another, long-term future orientation in repeated interactions, and inclinations toward positive reciprocity of cooperative behavior or negative reciprocity of defecting behaviors. These tendencies may evolve through a kind of natural selection within a society over time or group selection across different competing societies. In effect, they lead groups of individuals to “irrationally” choose outcomes that are actually the most beneficial to all of them together.
Put together, these three factors (repeated prisoner's dilemmas, formal institutions that break down prisoner's dilemmas, and behavioral biases that undermine "rational" individual choice in prisoner's dilemmas) help resolve the many prisoner's dilemmas we would all otherwise face.
In the iterated prisoner's dilemma, it is possible for both players to devise a strategy that punishes betrayal and rewards cooperation. The "tit for tat" strategy is one of the most successful known approaches to the iterated game, although no single strategy is optimal against every possible opponent. Tit for tat was introduced by Anatol Rapoport: the player cooperates on the first move and thereafter mirrors the opponent's previous move. If provoked by a defection, the player retaliates on the next turn; if unprovoked, the player cooperates.
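To make the repeated game concrete, here is a minimal Python sketch that plays tit for tat against itself and against an opponent who always defects. It reuses the jail terms from the bank robber example as per-round penalties (lower is better). The function names, the `"C"`/`"D"` labels, the ten-round length, and the `always_defect` opponent are all assumptions chosen for illustration, not part of the original scenario.

```python
# Per-round penalties in years, keyed by (player A's move, player B's move).
# "C" = cooperate, "D" = defect. Values are (A's years, B's years).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (5, 0),
    ("D", "C"): (0, 5),
    ("D", "D"): (3, 3),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Hypothetical opponent that defects no matter what has happened."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play the repeated game and return total penalty years as (A, B)."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each strategy sees only the other player's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        years_a, years_b = YEARS[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print("tit for tat vs tit for tat:", play(tit_for_tat, tit_for_tat))
    print("tit for tat vs always defect:", play(tit_for_tat, always_defect))
    print("always defect vs always defect:", play(always_defect, always_defect))
```

In this sketch, two tit-for-tat players cooperate in every round, while tit for tat is exploited only once by a constant defector before it starts retaliating. The comparison depends on the assumed penalties and round count, so it illustrates the mechanism rather than proving any general result.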
What Is the Likely Outcome of a Prisoner's Dilemma?
The likely outcome of a prisoner's dilemma is that both players defect (i.e., behave selfishly), leading to a suboptimal outcome for both. This is also the Nash equilibrium, a game theory concept describing a set of strategies in which no player can improve their own outcome by unilaterally changing strategy. In this example, the Nash equilibrium is for both players to betray one another, even though mutual cooperation leads to a better outcome for both. However, if one prisoner chooses to cooperate and the other does not, the cooperating prisoner's outcome is worse.
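A Nash equilibrium can be found here by brute force: test every pair of moves and keep those where neither player can shorten their own sentence by switching alone. The Python sketch below does this for the bank robber example. The jail terms come from the example; the names `YEARS`, `MOVES`, and `is_nash_equilibrium` and the `"C"`/`"D"` labels are assumptions for illustration.

```python
from itertools import product

# Jail terms in years, keyed by (Elizabeth's move, Henry's move).
# "C" = cooperate (stay silent), "D" = defect (testify).
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (5, 0),
    ("D", "C"): (0, 5),
    ("D", "D"): (3, 3),
}
MOVES = ("C", "D")


def is_nash_equilibrium(elizabeth_move, henry_move):
    """True if neither player can cut their own jail time by switching alone."""
    e_years, h_years = YEARS[(elizabeth_move, henry_move)]
    elizabeth_can_improve = any(
        YEARS[(alt, henry_move)][0] < e_years for alt in MOVES
    )
    henry_can_improve = any(
        YEARS[(elizabeth_move, alt)][1] < h_years for alt in MOVES
    )
    return not (elizabeth_can_improve or henry_can_improve)


equilibria = [pair for pair in product(MOVES, repeat=2) if is_nash_equilibrium(*pair)]

if __name__ == "__main__":
    print(equilibria)
```

With these jail terms, the only pair that survives the check is mutual defection, even though mutual cooperation gives both players a shorter sentence.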
What Are Some Ways to Combat the Prisoner's Dilemma?
Solutions to prisoner’s dilemmas focus on overcoming individual incentives in favor of the common good. In the real world, most economic and other human interactions are repeated more than once. This allows parties to choose strategies that reward cooperation or punish defection over time.
Another solution relies on developing formal institutional strategies to alter the incentives that individual decision-makers face. Finally, behavioral biases will likely develop over time that undermine “rational” individual choice in prisoner’s dilemmas and lead groups of individuals to “irrationally” choose outcomes that are actually the most beneficial to all of them together.
Can the Prisoner's Dilemma Be Useful to Society?
Prisoner's dilemma problems can sometimes actually make society better off as a whole. A prime example is the behavior of an oil cartel. All cartel members can collectively enrich themselves by restricting output to keep the price of oil at a level where each maximizes revenue received from consumers, but each cartel member individually has an incentive to cheat on the cartel and increase output to capture revenue away from the other cartel members. The end result is not the optimal outcome that the cartel desires but, rather, an outcome that benefits consumers through lower oil prices.
What Is the Tragedy of the Commons?
The tragedy of the commons is a theoretical problem in economics in which every individual has an incentive to consume a shared resource at the expense of every other individual, with no way to exclude anyone from consuming it. Generally, the resource of interest is easily available to all individuals without barriers (i.e., the "commons"). This hypothetically leads to over-consumption and ultimately depletion of the common resource, to everybody's detriment. In short, it highlights how individuals can neglect the well-being of society in the pursuit of personal gain. Its accuracy and application are debated.
The Bottom Line
The prisoner's dilemma is a well-known parable for the difficulty of solving collective action problems. By acting in their own self-interests, the metaphorical prisoners find themselves with a greater penalty than they would face if they had worked together. However, when the experiment is repeated over the long term, it is possible for the players to devise incentives for cooperation.
Correction—June 30, 2022: The example of the prisoner's dilemma was edited to demonstrate how following individual interests can lead to the worst possible outcome.