1
LaPorte P, Nowak MA. A geometric process of evolutionary game dynamics. J R Soc Interface 2023; 20:20230460. [PMID: 38016638 PMCID: PMC10684345 DOI: 10.1098/rsif.2023.0460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 11/02/2023] [Indexed: 11/30/2023] Open
Abstract
Many evolutionary processes occur in phenotype spaces which are continuous. It is therefore of interest to explore how selection operates in continuous spaces. One approach is adaptive dynamics, which assumes that mutants are local. Here we study a different process which also allows non-local mutants. We assume that a resident population is challenged by an invader who uses a strategy chosen from a random distribution on the space of all strategies. We study the repeated donation game of direct reciprocity. We consider reactive strategies given by two probabilities, denoting respectively the probability to cooperate after the co-player has cooperated or defected. The strategy space is the unit square. We derive analytic formulae for the stationary distribution of evolutionary dynamics and for the average cooperation rate as a function of the cost-to-benefit ratio. For positive reactive strategies, we prove that cooperation is more abundant than defection if the area of the cooperative region is greater than 1/2, which is equivalent to benefit, b, divided by cost, c, exceeding [Formula: see text]. We introduce the concept of strategies that are stable with probability one. We also study an extended process and discuss other games.
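As an illustration of the reactive strategies described in this abstract, the following minimal sketch computes long-run cooperation rates and donation-game payoffs for two reactive players. It assumes the standard infinitely repeated setting without discounting or errors, which the abstract does not spell out; the function names `cooperation_rates` and `payoff` are hypothetical and are not taken from the paper. Each strategy is a pair (p, q): the probability to cooperate after the co-player cooperated or defected.

```python
def cooperation_rates(strategy_1, strategy_2):
    """Long-run cooperation rates of two reactive strategies (p, q).

    Assumes an infinitely repeated game without discounting. The rates
    solve x1 = q1 + r1 * x2 and x2 = q2 + r2 * x1, where r = p - q.
    """
    p1, q1 = strategy_1
    p2, q2 = strategy_2
    r1, r2 = p1 - q1, p2 - q2
    denom = 1.0 - r1 * r2
    if abs(denom) < 1e-12:
        # Happens e.g. for two Tit-for-Tat players: the long-run rate
        # then depends on the first move and is not unique.
        raise ValueError("cooperation rate is not unique when r1 * r2 = 1")
    x1 = (q1 + r1 * q2) / denom
    x2 = (q2 + r2 * q1) / denom
    return x1, x2


def payoff(strategy_1, strategy_2, b, c):
    """Average donation-game payoff of player 1: receive b, pay c."""
    x1, x2 = cooperation_rates(strategy_1, strategy_2)
    return b * x2 - c * x1


# Example: an unconditional defector (0, 0) meets an unconditional cooperator (1, 1).
defector_payoff = payoff((0.0, 0.0), (1.0, 1.0), b=3.0, c=1.0)
```

The sketch covers only the pairwise payoff calculation; the stationary distribution of the evolutionary process studied in the paper is not reproduced here.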
Affiliation(s)
- Philip LaPorte
- Department of Mathematics, University of California, Berkeley, CA 94720, USA
- Martin A. Nowak
- Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
2
Kleshnina M, Hilbe C, Šimsa Š, Chatterjee K, Nowak MA. The effect of environmental information on evolution of cooperation in stochastic games. Nat Commun 2023; 14:4153. [PMID: 37438341 DOI: 10.1038/s41467-023-39625-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 06/22/2023] [Indexed: 07/14/2023] Open
Abstract
Many human interactions feature the characteristics of social dilemmas where individual actions have consequences for the group and the environment. The feedback between behavior and environment can be studied with the framework of stochastic games. In stochastic games, the state of the environment can change, depending on the choices made by group members. Past work suggests that such feedback can reinforce cooperative behaviors. In particular, cooperation can evolve in stochastic games even if it is infeasible in each separate repeated game. In stochastic games, participants have an interest in conditioning their strategies on the state of the environment. Yet in many applications, precise information about the state could be scarce. Here, we study how the availability of information (or lack thereof) shapes the evolution of cooperation. Already for simple examples of two-state games we find surprising effects. In some cases, cooperation is only possible if there is precise information about the state of the environment. In other cases, cooperation is most abundant when there is no information about the state of the environment. We systematically analyze all stochastic games of a given complexity class to determine when receiving information about the environment is better, neutral, or worse for the evolution of cooperation.
Affiliation(s)
- Christian Hilbe
- Max Planck Research Group Dynamics of Social Behavior, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Štěpán Šimsa
- IST Austria, Klosterneuburg, Austria
- Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- Martin A Nowak
- Department of Mathematics, Harvard University, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
3
Murase Y, Baek SK. Grouping promotes both partnership and rivalry with long memory in direct reciprocity. PLoS Comput Biol 2023; 19:e1011228. [PMID: 37339134 DOI: 10.1371/journal.pcbi.1011228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Accepted: 05/30/2023] [Indexed: 06/22/2023] Open
Abstract
Biological and social scientists have long been interested in understanding how to reconcile individual and collective interests in the iterated Prisoner's Dilemma. Many effective strategies have been proposed, and they are often categorized into one of two classes, 'partners' and 'rivals.' More recently, another class, 'friendly rivals,' has been identified in longer-memory strategy spaces. Friendly rivals qualify as both partners and rivals: They fully cooperate with themselves, like partners, but never allow their co-players to earn higher payoffs, like rivals. Although they have appealing theoretical properties, it is unclear whether they would emerge in an evolving population because most previous works focus on the memory-one strategy space, where no friendly rival strategy exists. To investigate this issue, we have conducted evolutionary simulations in well-mixed and group-structured populations and compared the evolutionary dynamics between memory-one and longer-memory strategy spaces. In a well-mixed population, the memory length does not make a major difference, and the key factors are the population size and the benefit of cooperation. Friendly rivals play a minor role because being a partner or a rival is often good enough in a given environment. It is in a group-structured population that memory length makes a stark difference: When longer-memory strategies are available, friendly rivals become dominant, and the cooperation level nearly reaches a maximum, even when the benefit of cooperation is so low that cooperation would not be achieved in a well-mixed population. This result highlights the important interaction between group structure and memory lengths that drive the evolution of cooperation.
Affiliation(s)
- Yohsuke Murase
- RIKEN Center for Computational Science, Kobe, Japan
- Max Planck Research Group 'Dynamics of Social Behavior,' Max Planck Institute for Evolutionary Biology, Plön, Germany
- Seung Ki Baek
- Department of Scientific Computing, Pukyong National University, Busan, Korea
4
Li J, Zhao X, Li B, Rossetti CSL, Hilbe C, Xia H. Evolution of cooperation through cumulative reciprocity. Nat Comput Sci 2022; 2:677-686. [PMID: 38177263 DOI: 10.1038/s43588-022-00334-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/06/2021] [Accepted: 09/14/2022] [Indexed: 01/06/2024]
Abstract
Reciprocity is a simple principle for cooperation that explains many of the patterns of how humans seek and receive help from each other. To capture reciprocity, traditional models often assume that individuals use simple strategies with restricted memory. These memory-1 strategies are mathematically convenient, but they miss important aspects of human reciprocity, where defections can have lasting effects. Here we instead propose a strategy of cumulative reciprocity. Cumulative reciprocators count the imbalance of cooperation across their previous interactions with their opponent. They cooperate as long as this imbalance is sufficiently small. Using analytical and computational methods, we show that this strategy can sustain cooperation in the presence of errors, that it enforces fair outcomes and that it evolves in hostile environments. Using an economic experiment, we confirm that cumulative reciprocity is more predictive of human behaviour than several classical strategies. The basic principle of cumulative reciprocity is versatile and can be extended to a range of social dilemmas.
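The counting rule described in this abstract can be sketched in a few lines. This is an illustrative reading only: it assumes the imbalance is the co-player's number of defections minus one's own, and that cooperation continues while the imbalance does not exceed a threshold. The names `cumulative_reciprocator`, `tolerance`, `always_defect` and `play` are hypothetical, and errors are left out.

```python
def cumulative_reciprocator(tolerance):
    """Return a strategy that cooperates while the defection imbalance is small.

    Assumption: imbalance = co-player's defections minus own defections,
    counted over all previous rounds.
    """
    def decide(my_history, their_history):
        imbalance = their_history.count("D") - my_history.count("D")
        return "C" if imbalance <= tolerance else "D"
    return decide


def always_defect(my_history, their_history):
    """Unconditional defector, used as a hostile co-player."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play a noise-free repeated game and return both action sequences."""
    history_a, history_b = [], []
    for _ in range(rounds):
        action_a = strategy_a(history_a, history_b)
        action_b = strategy_b(history_b, history_a)
        history_a.append(action_a)
        history_b.append(action_b)
    return "".join(history_a), "".join(history_b)


# Two cumulative reciprocators keep cooperating; against a defector,
# cooperation stops once the imbalance exceeds the tolerance.
mutual = play(cumulative_reciprocator(1), cumulative_reciprocator(1), rounds=5)
exploited = play(cumulative_reciprocator(1), always_defect, rounds=5)
```

Because the count spans the whole history rather than only the last round, a single defection keeps influencing later decisions, which is the contrast with memory-1 strategies that the abstract draws.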
Affiliation(s)
- Juan Li
- Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Center for Big Data and Intelligent Decision-Making, Dalian University of Technology, Dalian, China
- Xiaowei Zhao
- Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- School of Software Technology, Dalian University of Technology, Dalian, China
- Bing Li
- Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Christian Hilbe
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Haoxiang Xia
- Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Center for Big Data and Intelligent Decision-Making, Dalian University of Technology, Dalian, China
5
Schmid L, Hilbe C, Chatterjee K, Nowak MA. Direct reciprocity between individuals that use different strategy spaces. PLoS Comput Biol 2022; 18:e1010149. [PMID: 35700167 PMCID: PMC9197081 DOI: 10.1371/journal.pcbi.1010149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Accepted: 04/28/2022] [Indexed: 12/04/2022] Open
Abstract
In repeated interactions, players can use strategies that respond to the outcome of previous rounds. Much of the existing literature on direct reciprocity assumes that all competing individuals use the same strategy space. Here, we study both learning and evolutionary dynamics of players that differ in the strategy space they explore. We focus on the infinitely repeated donation game and compare three natural strategy spaces: memory-1 strategies, which consider the last moves of both players, reactive strategies, which respond to the last move of the co-player, and unconditional strategies. These three strategy spaces differ in the memory capacity that is needed. We compute the long term average payoff that is achieved in a pairwise learning process. We find that smaller strategy spaces can dominate larger ones. For weak selection, unconditional players dominate both reactive and memory-1 players. For intermediate selection, reactive players dominate memory-1 players. Only for strong selection and low cost-to-benefit ratio, memory-1 players dominate the others. We observe that the supergame between strategy spaces can be a social dilemma: maximum payoff is achieved if both players explore a larger strategy space, but smaller strategy spaces dominate.
Affiliation(s)
- Christian Hilbe
- Max Planck Research Group Dynamics of Social Behavior, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Martin A. Nowak
- Department of Mathematics, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
6
Cheng Z, Chen G, Hong Y. Misperception influence on zero-determinant strategies in iterated Prisoner's Dilemma. Sci Rep 2022; 12:5174. [PMID: 35338188 PMCID: PMC8956668 DOI: 10.1038/s41598-022-08750-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 02/17/2022] [Indexed: 11/09/2022] Open
Abstract
Zero-determinant (ZD) strategies have attracted wide attention in Iterated Prisoner's Dilemma (IPD) games, since a player equipped with a ZD strategy can unilaterally enforce a linear relation between the two players' expected utilities. On the other hand, uncertainties, which may be caused by misperception, inevitably occur in IPD in practical circumstances. To better understand the situation, we consider the influence of misperception on ZD strategies in IPD, where the two players, player X and player Y, have different cognitions, but player X detects the misperception and is believed by player Y to use ZD strategies. We provide a necessary and sufficient condition for ZD strategies in IPD with misperception, under which there is also a linear relationship between the players' utilities in player X's cognition. We then explore bounds on the deviation of the players' expected utilities from a linear relationship in player X's cognition, while also improving player X's own utility.
Affiliation(s)
- Zhaoyang Cheng
- Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100190, China
- Guanpu Chen
- Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
- JD Explore Academy, Beijing, 100176, China
- Yiguang Hong
- Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Beijing, 100190, China
- Department of Control Science and Engineering, Tongji University, Shanghai, 201804, China
7
Murase Y, Kim M, Baek SK. Social norms in indirect reciprocity with ternary reputations. Sci Rep 2022; 12:455. [PMID: 35013393 PMCID: PMC8748885 DOI: 10.1038/s41598-021-04033-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/21/2021] [Accepted: 12/07/2021] [Indexed: 11/23/2022] Open
Abstract
Indirect reciprocity is a key mechanism that promotes cooperation in social dilemmas by means of reputation. Although it has been a common practice to represent reputations by binary values, either ‘good’ or ‘bad’, such a dichotomy is a crude approximation considering the complexity of reality. In this work, we studied norms with three different reputations, i.e., ‘good’, ‘neutral’, and ‘bad’. Through massive supercomputing for handling more than thirty billion possibilities, we fully identified which norms achieve cooperation and possess evolutionary stability against behavioural mutants. By systematically categorizing all these norms according to their behaviours, we found similarities and dissimilarities to their binary-reputation counterpart, the leading eight. We obtained four rules that should be satisfied by the successful norms, and the behaviour of the leading eight can be understood as a special case of these rules. A couple of norms that show counter-intuitive behaviours are also presented. We believe the findings are also useful for designing successful norms with more general reputation systems.
Affiliation(s)
- Yohsuke Murase
- RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
- Minjae Kim
- Department of Physics, Pukyong National University, Busan, 48513, Korea
- Seung Ki Baek
- Department of Physics, Pukyong National University, Busan, 48513, Korea
8
Lee S, Murase Y, Baek SK. Local stability of cooperation in a continuous model of indirect reciprocity. Sci Rep 2021; 11:14225. [PMID: 34244552 PMCID: PMC8270921 DOI: 10.1038/s41598-021-93598-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/09/2021] [Accepted: 06/28/2021] [Indexed: 02/06/2023] Open
Abstract
Reputation is a powerful mechanism to enforce cooperation among unrelated individuals through indirect reciprocity, but it suffers from disagreement originating from private assessment, noise, and incomplete information. In this work, we investigate stability of cooperation in the donation game by regarding each player's reputation and behaviour as continuous variables. Through perturbative calculation, we derive a condition that a social norm should satisfy to give penalties to its close variants, provided that everyone initially cooperates with a good reputation, and this result is supported by numerical simulation. A crucial factor of the condition is whether a well-reputed player's donation to an ill-reputed co-player is appreciated by other members of the society, and the condition can be reduced to a threshold for the benefit-cost ratio of cooperation which depends on the reputational sensitivity to a donor's behaviour as well as on the behavioural sensitivity to a recipient's reputation. Our continuum formulation suggests how indirect reciprocity can work beyond the dichotomy between good and bad even in the presence of inhomogeneity, noise, and incomplete information.
Affiliation(s)
- Sanghun Lee
- Department of Physics, Pukyong National University, Busan, 48513, Korea
- Yohsuke Murase
- RIKEN Center for Computational Science, Kobe, Hyogo, 650-0047, Japan
- Seung Ki Baek
- Department of Physics, Pukyong National University, Busan, 48513, Korea
9
Conditions for the existence of zero-determinant strategies under observation errors in repeated games. J Theor Biol 2021; 526:110810. [PMID: 34119498 DOI: 10.1016/j.jtbi.2021.110810] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/09/2021] [Revised: 06/04/2021] [Accepted: 06/07/2021] [Indexed: 11/24/2022]
Abstract
Repeated games are useful models to analyze long-term interactions of living species and complex social phenomena. Zero-determinant (ZD) strategies in repeated games, discovered by Press and Dyson in 2012, enforce a linear payoff relationship between a focal player and the opponent. This linear relationship can be set arbitrarily by a ZD player. Hence, one subclass of ZD strategies can fix the opponent's expected payoff, and another subclass can obtain an expected payoff that exceeds the opponent's. Since this discovery, theories for ZD strategies have been extended to cope with various natural situations. It is especially important to consider the theory of ZD strategies for repeated games with a discount factor and observation errors because it allows the theory to be applicable in the real world. Recent studies revealed the existence of ZD strategies even in repeated games with both factors. However, the conditions for their existence have not been sufficiently analyzed. Here, we mathematically analyzed the conditions in repeated games with both factors. First, we derived the thresholds of a discount factor and observation errors which ensure the existence of Equalizer and positively correlated ZD (pcZD) strategies, which are well-known subclasses of ZD strategies. We found that ZD strategies exist only when the discount factor remains high as the error rates increase. Next, we derived the conditions for the expected payoff of the opponent enforced by Equalizer as well as the conditions for the slope and baseline payoff of the linear relations enforced by pcZD. As a result, we found that, as error rates increase or the discount factor decreases, the conditions for the linear relation that Equalizer or pcZD can enforce become stricter.
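The linear payoff relationship mentioned in this abstract can be checked numerically in the simplest setting. The sketch below assumes an undiscounted repeated Prisoner's Dilemma without observation errors and the conventional payoffs (T, R, P, S) = (5, 3, 1, 0), so it does not cover the discounted, error-prone case analyzed in the paper. It uses the extortionate memory-one strategy (11/13, 1/2, 7/26, 0) given as an example by Press and Dyson, which enforces s_X - P = 3 (s_Y - P). The function names and the opponent strategy are hypothetical, and the stationary distribution is approximated by plain power iteration.

```python
def stationary_distribution(p, q, iterations=5000):
    """Approximate the stationary distribution over (CC, CD, DC, DD).

    States are written from player X's point of view (X's move first).
    p and q are memory-one strategies, each listed from its owner's view.
    """
    # Player Y sees state CD as DC and vice versa, so swap q's middle entries.
    q_from_x_view = (q[0], q[2], q[1], q[3])
    dist = [0.25, 0.25, 0.25, 0.25]
    for _ in range(iterations):
        nxt = [0.0, 0.0, 0.0, 0.0]
        for state, weight in enumerate(dist):
            px, py = p[state], q_from_x_view[state]
            nxt[0] += weight * px * py
            nxt[1] += weight * px * (1.0 - py)
            nxt[2] += weight * (1.0 - px) * py
            nxt[3] += weight * (1.0 - px) * (1.0 - py)
        dist = nxt
    return dist


def average_payoffs(p, q, T=5.0, R=3.0, P=1.0, S=0.0):
    """Long-run average payoffs (s_X, s_Y) without discounting or errors."""
    v = stationary_distribution(p, q)
    s_x = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    s_y = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return s_x, s_y


extortioner = (11 / 13, 1 / 2, 7 / 26, 0.0)  # Press and Dyson's example, factor 3
opponent = (0.7, 0.2, 0.6, 0.3)              # arbitrary hypothetical co-player
s_x, s_y = average_payoffs(extortioner, opponent)
surplus_ratio = (s_x - 1.0) / (s_y - 1.0)    # should be close to 3
```

Replacing `opponent` with another stochastic memory-one strategy should leave the ratio unchanged, which is what "unilaterally enforce" means in this setting.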
10
Ueda M. Memory-two zero-determinant strategies in repeated games. R Soc Open Sci 2021; 8:202186. [PMID: 34084544 PMCID: PMC8150048 DOI: 10.1098/rsos.202186] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/01/2020] [Accepted: 05/04/2021] [Indexed: 06/12/2023]
Abstract
Repeated games have provided an explanation of how mutual cooperation can be achieved even if defection is more favourable in a one-shot game in the Prisoner's Dilemma situation. Recently found zero-determinant (ZD) strategies have been investigated extensively in evolutionary game theory. The original memory-one ZD strategies unilaterally enforce linear relationships between average pay-offs of players. Here, we extend the concept of ZD strategies to memory-two strategies in repeated games. Memory-two ZD strategies unilaterally enforce linear relationships between correlation functions of pay-offs and pay-offs of the previous round. Examples of memory-two ZD strategies in the repeated Prisoner's Dilemma game are provided, some of which generalize the tit-for-tat strategy to the memory-two case. Extension of ZD strategies to the memory-n case with n ≥ 2 is also straightforward.
Affiliation(s)
- Masahiko Ueda
- Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi 753-8511, Japan
11
Murase Y, Baek SK. Friendly-rivalry solution to the iterated n-person public-goods game. PLoS Comput Biol 2021; 17:e1008217. [PMID: 33476337 PMCID: PMC7853487 DOI: 10.1371/journal.pcbi.1008217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/30/2020] [Revised: 02/02/2021] [Accepted: 12/12/2020] [Indexed: 11/19/2022] Open
Abstract
Repeated interaction promotes cooperation among rational individuals under the shadow of the future, but it is hard to maintain cooperation when a large number of error-prone individuals are involved. One way to construct a cooperative Nash equilibrium is to find a ‘friendly-rivalry’ strategy, which aims at full cooperation but never allows the co-players to be better off. Recently it has been shown that for the iterated Prisoner’s Dilemma in the presence of error, a friendly rival can be designed with the following five rules: Cooperate if everyone did, accept punishment for your own mistake, punish defection, recover cooperation if you find a chance, and defect in all the other circumstances. In this work, we construct such a friendly-rivalry strategy for the iterated n-person public-goods game by generalizing those five rules. The resulting strategy makes a decision by referring to the previous m = 2n − 1 rounds. A friendly-rivalry strategy for n = 2 inherently has evolutionary robustness in the sense that no mutant strategy has higher fixation probability in this population than that of a neutral mutant. Our evolutionary simulation indeed shows excellent performance of the proposed strategy in a broad range of environmental conditions when n = 2 and 3.
How to maintain cooperation among a number of self-interested individuals is a difficult problem, especially if they can sometimes commit errors. In this work, we propose a strategy for the iterated n-person public-goods game based on the following five rules: Cooperate if everyone did, accept punishment for your own mistake, punish others’ defection, recover cooperation if you find a chance, and defect in all the other circumstances. These rules are not far from actual human behavior, and the resulting strategy guarantees three advantages: First, if everyone uses it, full cooperation is recovered even if error occurs with small probability. Second, the player of this strategy never obtains a lower long-term payoff than any of the co-players. Third, if the co-players are unconditional cooperators, it obtains a strictly higher long-term payoff than theirs. Therefore, if everyone uses this strategy, no one has a reason to change it. Furthermore, our simulation shows that this strategy will become highly abundant over long time scales due to its robustness against the invasion of other strategies. In this sense, the repeated social dilemma is solved for an arbitrary number of players.
Affiliation(s)
- Seung Ki Baek
- Department of Physics, Pukyong National University, Busan, Korea