1
Zhao C, Zheng G, Zhang C, Zhang J, Chen L. Emergence of cooperation under punishment: A reinforcement learning perspective. CHAOS (WOODBURY, N.Y.) 2024; 34:073123. [PMID: 38985966 DOI: 10.1063/5.0215702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Accepted: 06/26/2024] [Indexed: 07/12/2024]
Abstract
Punishment is a common tactic to sustain cooperation and has been studied extensively. While most previous game-theoretic work adopts the imitation learning framework, where players imitate the strategies of those who are better off, the learning logic in the real world is often much more complex. In this work, we turn to the reinforcement learning paradigm, where individuals make their decisions based upon their experience and long-term returns. Specifically, we investigate the prisoner's dilemma game with a Q-learning algorithm, in which cooperators probabilistically impose punishment on defectors in their neighborhood. Unexpectedly, we find that punishment could lead to either continuous or discontinuous cooperation phase transitions, and the nucleation process of cooperation clusters is reminiscent of the liquid-gas transition. The analysis of a Q-table reveals the evolution of the underlying "psychologic" changes, which explains the nucleation process and different levels of cooperation. The uncovered first-order phase transition indicates that great care needs to be taken when implementing the punishment compared to the continuous scenario.
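As a rough illustration of the reinforcement learning paradigm described in this abstract, the sketch below applies the standard tabular Q-learning update to a player with two actions (cooperate or defect). The state encoding (the player's own previous action), the payoff values, the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon`, and the way the fine is subtracted from a punished defector are all assumptions made for illustration; the abstract does not specify them.

```python
import random

ACTIONS = ("C", "D")  # cooperate, defect


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state].values())
    q[state][action] = (1 - alpha) * q[state][action] + alpha * (reward + gamma * best_next)
    return q[state][action]


def choose_action(q, state, epsilon=0.01, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def pd_payoff(me, other, b=1.2, punished=False, fine=0.5):
    """Hypothetical weak prisoner's dilemma payoff; a punished defector pays `fine`."""
    if me == "C":
        return 1.0 if other == "C" else 0.0
    payoff = b if other == "C" else 0.0
    return payoff - fine if punished else payoff


# Q-table indexed by state (assumed here: the player's own previous action) and action.
q_table = {s: {a: 0.0 for a in ACTIONS} for s in ACTIONS}

# One learning step: the player defected against a cooperator and was punished.
reward = pd_payoff("D", "C", punished=True)
new_value = q_update(q_table, "C", "D", reward, "D")
```

In a lattice simulation, each player would hold its own Q-table and repeat the choose, play, update cycle against its neighbors; that outer loop is omitted here.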
Affiliation(s)
- Chenyang Zhao
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710061, People's Republic of China
- Guozhong Zheng
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710061, People's Republic of China
- Chun Zhang
  - School of Science, Xi'an Shiyou University, Xi'an 710065, People's Republic of China
- Jiqiang Zhang
  - School of Physics, Ningxia University, Yinchuan 750021, People's Republic of China
- Li Chen
  - School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710061, People's Republic of China

2
Kozitsina TS, Kozitsin IV, Menshikov IS. Quantal response equilibrium for the Prisoner's Dilemma game in Markov strategies. Sci Rep 2022; 12:4482. [PMID: 35296729 PMCID: PMC8927616 DOI: 10.1038/s41598-022-08426-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 03/07/2022] [Indexed: 11/18/2022] Open
Abstract
Within the studies of human cooperation, there are gaps that require further investigation. One possible area for growth is developing theoretical concepts which describe high levels of cooperation. In this paper, we present a symmetrical quantal response equilibrium (QRE) in the Prisoner's Dilemma (PD) game constructed in Markov strategies (tolerance to defection and mutual cooperation). To prove the adequacy of the resulting equilibrium, we compare it with the previously found Nash equilibrium in PD in Markov strategies: the QRE converges to the Nash equilibrium, which is consistent with the theory. Next, we investigate the properties of QRE in PD in Markov strategies by testing it against experimental data. For low levels of rationality, the found equilibrium manages to describe high cooperation. We derive the levels of rationality under which the intersection between Nash and QRE occurs. Lastly, our experimental data suggest that QRE serves as a dividing line between behavior with low and high cooperation.
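To make the notion of a quantal response equilibrium concrete, the sketch below computes a symmetric QRE for the simpler one-shot Prisoner's Dilemma using the common logit response form. This is not the Markov-strategy construction of the paper: the logit form, the payoff values `T`, `R`, `P`, `S`, and the rationality parameter name `lam` are assumptions chosen for illustration.

```python
import math


def logit_choice(u_coop, u_defect, lam):
    """Probability of cooperating under a logit quantal response with rationality lam."""
    # Logistic function of the utility difference between cooperating and defecting.
    return 1.0 / (1.0 + math.exp(-lam * (u_coop - u_defect)))


def symmetric_logit_qre(lam, T=5.0, R=3.0, P=1.0, S=0.0, iterations=200):
    """Fixed-point iteration for the symmetric logit QRE of a one-shot PD.

    p is the probability that each player cooperates; at equilibrium, p equals
    the logit response to an opponent who also cooperates with probability p.
    """
    p = 0.5
    for _ in range(iterations):
        u_coop = R * p + S * (1.0 - p)
        u_defect = T * p + P * (1.0 - p)
        p = logit_choice(u_coop, u_defect, lam)
    return p
```

With `lam = 0` players choose uniformly at random, so the cooperation probability is 0.5. As `lam` grows, the equilibrium approaches the Nash equilibrium of the one-shot game, which is mutual defection.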
Affiliation(s)
- T S Kozitsina
  - Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow region, 141700, Russian Federation
  - Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Vavilova street 44/2, Moscow, 119333, Russian Federation
- I V Kozitsin
  - Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow region, 141700, Russian Federation
  - V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Profsoyuznaya street 65, Moscow, 117997, Russian Federation
- I S Menshikov
  - Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow region, 141700, Russian Federation
  - Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Vavilova street 44/2, Moscow, 119333, Russian Federation

3
Sun W, Liu L, Chen X, Szolnoki A, Vasconcelos VV. Combination of institutional incentives for cooperative governance of risky commons. iScience 2021; 24:102844. [PMID: 34381969 PMCID: PMC8334382 DOI: 10.1016/j.isci.2021.102844] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/10/2021] [Revised: 06/23/2021] [Accepted: 07/08/2021] [Indexed: 11/03/2022] Open
Abstract
Finding appropriate incentives to enforce collaborative efforts for governing the commons in risky situations is a long-lasting challenge. Previous works have demonstrated that both punishing free-riders and rewarding cooperators could be potential tools to reach this goal. Despite weak theoretical foundations, policy makers frequently impose a punishment-reward combination. Here, we consider the emergence of positive and negative incentives and analyze their simultaneous impact on sustaining risky commons. Importantly, we consider institutions with fixed and flexible incentives. We find that a local sanctioning scheme with pure reward is the optimal incentive strategy. It can drive the entire population toward a highly cooperative state in a broad range of parameters, independently of the type of institutions. We show that our finding is also valid for flexible incentives in the global sanctioning scheme, although the local arrangement works more effectively.
- Pure reward in a local scheme is more effective both for fixed and flexible incentives
- It can drive the entire population toward a highly cooperative state
- Increasing the efficiency of the institution can induce the success of pure reward
- A local scheme promotes group success more effectively than a global scheme
Affiliation(s)
- Weiwei Sun
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
- Linjie Liu
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiaojie Chen
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
- Attila Szolnoki
  - Institute of Technical Physics and Materials Science, Centre for Energy Research, P.O. Box 49, Budapest 1525, Hungary
- Vítor V Vasconcelos
  - Informatics Institute, University of Amsterdam, Science Park 904, 1098XH Amsterdam, The Netherlands
  - Institute for Advanced Study, University of Amsterdam, 1012 GC Amsterdam, The Netherlands

4
Quan J, Li X, Wang X. The evolution of cooperation in spatial public goods game with conditional peer exclusion. CHAOS (WOODBURY, N.Y.) 2019; 29:103137. [PMID: 31675844 DOI: 10.1063/1.5119395] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/10/2019] [Accepted: 10/08/2019] [Indexed: 06/10/2023]
Abstract
Social exclusion can prevent free riders from participating in social activities and deprive them of sharing cooperative benefits, which is an effective mechanism for the evolution of cooperation. However, traditional peer-exclusion strategies are unconditional: as long as there are defectors in the group, excluders pay a cost to exclude them. In reality, one of the reasons for the complexity of these strategies is that individuals may react differently depending on the environment in which they are located. Based on this consideration, we introduce a kind of conditional peer-exclusion strategy in the spatial public goods game model. Specifically, the behavior of conditional exclusion depends on the number of defectors in the group and can be adjusted by a tolerance parameter. Conditional exclusion is triggered only if the number of defectors in the group exceeds the tolerance threshold. We explore the effects of parameters such as tolerance, exclusion cost, and probability of exclusion success on the evolution of cooperation. Simulation results confirm that conditional exclusion can greatly reduce the threshold values of the synergy factor above which cooperation can emerge. In particular, when the tolerance is low, very small synergy factors can drive the population to a high level of cooperation. Moreover, even if the probability of exclusion success is low, or the unit exclusion cost is relatively high, conditional exclusion is effective in promoting cooperation. These results allow us to better understand the role of exclusion strategies in the emergence of cooperation.
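The trigger rule described in this abstract can be sketched for a single public goods group as follows. Only the rule "exclude when the number of defectors exceeds the tolerance" comes from the abstract; the strategy labels, the unit contribution, the per-defector exclusion cost, and the way the pool is shared among non-excluded members are hypothetical choices made to keep the example small.

```python
import random


def exclusion_triggered(num_defectors, tolerance):
    """Conditional excluders act only when defectors exceed the tolerance threshold."""
    return num_defectors > tolerance


def group_payoffs(strategies, r, tolerance, success_prob, unit_cost, rng=random):
    """Payoffs for one group; strategies are 'C', 'D', or 'E' (conditional excluder)."""
    defectors = [i for i, s in enumerate(strategies) if s == "D"]
    excluders = [i for i, s in enumerate(strategies) if s == "E"]
    excluded = set()
    payoffs = [0.0] * len(strategies)

    if excluders and exclusion_triggered(len(defectors), tolerance):
        for i in excluders:
            payoffs[i] -= unit_cost * len(defectors)  # assumed: cost per defector
        # Each defector is excluded independently with probability success_prob.
        excluded = {i for i in defectors if rng.random() < success_prob}

    contributors = [i for i, s in enumerate(strategies) if s in ("C", "E")]
    sharers = [i for i in range(len(strategies)) if i not in excluded]
    share = r * len(contributors) / len(sharers)  # r is the synergy factor
    for i in sharers:
        payoffs[i] += share
    for i in contributors:
        payoffs[i] -= 1.0  # each contributor pays a unit contribution
    return payoffs
```

For example, `group_payoffs(["E", "C", "D", "D", "D"], r=3.0, tolerance=2, success_prob=1.0, unit_cost=0.1)` triggers exclusion because three defectors exceed the tolerance of two, whereas `tolerance=3` leaves the defectors in the group to share the pool.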
Affiliation(s)
- Ji Quan
  - School of Management, Wuhan University of Technology, Wuhan 430070, China
- Xia Li
  - School of Management, Wuhan University of Technology, Wuhan 430070, China
- Xianjia Wang
  - School of Economics and Management, Wuhan University, Wuhan 430072, China

5
The public goods game with shared punishment cost in well-mixed and structured populations. J Theor Biol 2019; 476:36-43. [PMID: 31150664 DOI: 10.1016/j.jtbi.2019.05.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/14/2018] [Revised: 05/24/2019] [Accepted: 05/28/2019] [Indexed: 11/23/2022]
Abstract
Both experimental and theoretical studies have shown that punishment plays an important role in promoting cooperation. Various forms of punishment have been proposed to explain why costly punishment could be maintained in the population and stabilize cooperation. Here we consider an altruistic behavior in which cooperators cooperate and punish simultaneously and share the punishment cost. We investigate the role of punishment cost shared among cooperators in the evolution of cooperation in the public goods game. We show that punishment can promote and stabilize cooperation in well-mixed populations when the penalty imposed on defectors is large enough compared to the punishment cost incurred by cooperators. In structured populations, cooperation could emerge under a lower fine threshold and coexist with defection. However, as the penalty increases, cooperation will have a larger basin of attraction in the well-mixed population than in the structured population. Our analytical and simulated results indicate that punishment can indeed effectively promote the evolution of cooperation. We also find that population structure can promote the coexistence of cooperation and defection but is not always beneficial to cooperation.

6
Szolnoki A, Chen X. Reciprocity-based cooperative phalanx maintained by overconfident players. Phys Rev E 2018; 98:022309. [PMID: 30253608 DOI: 10.1103/physreve.98.022309] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/05/2018] [Indexed: 11/07/2022]
Abstract
According to the evolutionary game theory principle, a strategy representing a higher payoff can spread among competitors. But there are cases when a player consistently overestimates or underestimates her own payoff, which undermines proper comparison. Interestingly, both underconfident and overconfident individuals are capable of elevating the cooperation level significantly. While the former stimulate a local coordination of strategies, the presence of overconfident individuals enhances the spatial reciprocity mechanism. In both cases the propagation of competing strategies is influenced in a biased way, resulting in a cooperation-supporting environment. These effects are strongly related to the nonlinear character of invasion probabilities, which is a common and frequently observed feature of microscopic dynamics.
Affiliation(s)
- Attila Szolnoki
  - Institute of Technical Physics and Materials Science, Centre for Energy Research, Hungarian Academy of Sciences, P.O. Box 49, H-1525 Budapest, Hungary
- Xiaojie Chen
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China

7
Bravetti A, Padilla P. An optimal strategy to solve the Prisoner's Dilemma. Sci Rep 2018; 8:1948. [PMID: 29386635 PMCID: PMC5792647 DOI: 10.1038/s41598-018-20426-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/04/2017] [Accepted: 11/06/2017] [Indexed: 11/25/2022] Open
Abstract
Cooperation is a central mechanism for evolution. It consists of an individual paying a cost in order to benefit another individual. However, natural selection describes individuals as being selfish and in competition among themselves. Therefore explaining the origin of cooperation within the context of natural selection is a problem that has been puzzling researchers for a long time. In the paradigmatic case of the Prisoner's Dilemma (PD), several schemes for the evolution of cooperation have been proposed. Here we introduce an extension of the Replicator Equation (RE), called the Optimal Replicator Equation (ORE), motivated by the fact that evolution acts not only at the level of individuals of a population, but also among competing populations, and we show that this new model for natural selection directly leads to a simple and natural rule for the emergence of cooperation in the most basic version of the PD. Contrary to common belief, our results reveal that cooperation can emerge among selfish individuals because of selfishness itself: if the final reward for being part of a society is sufficiently appealing, players spontaneously decide to cooperate.
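For context, the standard two-strategy Replicator Equation that this abstract takes as its starting point can be sketched as below. The Optimal Replicator Equation itself is not reproduced, since the abstract does not define it. The payoff matrix `PD`, the explicit Euler integration, and the step size `dt` are illustrative assumptions.

```python
def replicator_step(x, payoff, dt=0.01):
    """One explicit Euler step of the two-strategy replicator equation.

    x is the cooperator fraction; payoff[i][j] is the payoff of strategy i
    against strategy j, with index 0 = cooperate and 1 = defect.
    """
    f_coop = payoff[0][0] * x + payoff[0][1] * (1.0 - x)
    f_defect = payoff[1][0] * x + payoff[1][1] * (1.0 - x)
    mean_fitness = x * f_coop + (1.0 - x) * f_defect
    # dx/dt = x * (f_coop - mean_fitness)
    return x + dt * x * (f_coop - mean_fitness)


def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step and return the trajectory of cooperator fractions."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(replicator_step(trajectory[-1], payoff, dt))
    return trajectory


# Hypothetical Prisoner's Dilemma payoffs: R=3, S=0, T=5, P=1.
PD = [[3.0, 0.0], [5.0, 1.0]]
trajectory = simulate(0.9, PD)
```

Under these payoffs defection always earns more than cooperation, so the cooperator fraction declines toward zero from any interior starting point. This is the baseline outcome that extensions of the replicator dynamics aim to overcome.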
Affiliation(s)
- Alessandro Bravetti
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, México City, 04510, Mexico
- Pablo Padilla
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, México City, 04510, Mexico
  - Fitzwilliam College, University of Cambridge, Storey's Way, CB3 0DG, UK

8
Huang F, Chen X, Wang L. Conditional punishment is a double-edged sword in promoting cooperation. Sci Rep 2018; 8:528. [PMID: 29323286 PMCID: PMC5764993 DOI: 10.1038/s41598-017-18727-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/13/2017] [Accepted: 12/16/2017] [Indexed: 12/03/2022] Open
Abstract
Punishment is widely recognized as an effective approach for averting exploitation by free-riders in human society. However, punishment is costly, and thus rational individuals are unwilling to take the punishing action, resulting in the second-order free-rider problem. A recent experimental study shows that individuals prefer conditional punishment, and that their punishing decision depends on other members' punishing decisions. In this work, we thus propose a theoretical model for conditional punishment and investigate how such conditional punishment influences cooperation in the public goods game. Considering that conditional punishers only take the punishing action when the number of unconditional punishers exceeds a threshold number, we demonstrate that such conditional punishment acts as a double-edged sword on the evolution of cooperation both in well-mixed and structured populations. Specifically, when the threshold value is low, so that it is relatively easy for conditional punishers to engage in punishment, cooperation is promoted in comparison with the case without conditional punishment. When the threshold value is high, so that it is relatively difficult for conditional punishers to engage in punishment, cooperation is inhibited in comparison with the case without conditional punishment. Moreover, we verify that this double-edged sword effect exists in a wide range of model parameters and can still be observed in other punishment regimes.
Affiliation(s)
- Feng Huang
  - Center for Systems and Control, College of Engineering, Peking University, Beijing, 100871, China
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Xiaojie Chen
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Long Wang
  - Center for Systems and Control, College of Engineering, Peking University, Beijing, 100871, China

9
Wang J, Zhang Y, Guan J, Zhou S. Divide-and-conquer Tournament on Social Networks. Sci Rep 2017; 7:15484. [PMID: 29138411 PMCID: PMC5686164 DOI: 10.1038/s41598-017-15616-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2016] [Accepted: 10/30/2017] [Indexed: 12/05/2022] Open
Abstract
In social gaming networks, previous studies extensively investigated the influence of a variety of strategies on reciprocal behaviors in the prisoner's dilemma game. The studied frameworks range from the case in which an individual uniformly cooperates or defects with all social contacts, to the recently reported divide-and-conquer games, where an individual can choose a particular move to play with each neighbor. In this paper, we investigate a divide-and-conquer tournament among 14 well-known strategies on social gaming networks. In the tournament, an individual's fitness is measured by accumulated and average payoff aggregated over a certain number of rounds. On the basis of their fitness, the evolution of the population follows a local learning mechanism. Our observation indicates that the distribution of individuals adopting a strategy in degree ranking fundamentally changes the frequency of the strategy. In the divide-and-conquer gaming networks, our result suggests that the connectivity in social networks and strategy are two key factors that govern the evolution of the population.
Affiliation(s)
- Jiasheng Wang
  - Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai, 201804, China
  - Key Laboratory of Embedded System and Service Computing (Tongji University), Ministry of Education, Shanghai, 200092, China
- Yichao Zhang
  - Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai, 201804, China
  - Key Laboratory of Embedded System and Service Computing (Tongji University), Ministry of Education, Shanghai, 200092, China
- Jihong Guan
  - Department of Computer Science and Technology, Tongji University, 4800 Cao'an Road, Shanghai, 201804, China
  - Key Laboratory of Embedded System and Service Computing (Tongji University), Ministry of Education, Shanghai, 200092, China
- Shuigeng Zhou
  - School of Computer Science, Fudan University, 220 Handan Road, Shanghai, 200433, China
  - Shanghai Key Laboratory of Intelligent Information Processing, Shanghai, 200433, China

10
Hu H. Competing opinion diffusion on social networks. ROYAL SOCIETY OPEN SCIENCE 2017; 4:171160. [PMID: 29291101 PMCID: PMC5717675 DOI: 10.1098/rsos.171160] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/16/2017] [Accepted: 10/02/2017] [Indexed: 05/14/2023]
Abstract
Opinion competition is a common phenomenon in real life, such as with opinions on controversial issues or political candidates; however, modelling this competition remains largely unexplored. To bridge this gap, we propose a model of competing opinion diffusion on social networks taking into account degree-dependent fitness or persuasiveness. We study the combined influence of social networks, individual fitnesses and attributes, as well as mass media on people's opinions, and find that both social networks and mass media act as amplifiers in opinion diffusion, the amplifying effect of which can be quantitatively characterized. We analytically obtain the probability that each opinion will ultimately pervade the whole society when there are no committed people in networks, and the final proportion of each opinion at the steady state when there are committed people in networks. The results of numerical simulations show good agreement with those obtained through an analytical approach. This study provides insight into the collective influence of individual attributes, local social networks and global media on opinion diffusion, and contributes to a comprehensive understanding of competing diffusion behaviours in the real world.
Affiliation(s)
- Haibo Hu
  - Department of Management Science and Engineering, East China University of Science and Technology, Shanghai, People’s Republic of China

11
Szolnoki A, Chen X. Cooperation driven by success-driven group formation. Phys Rev E 2016; 94:042311. [PMID: 27841629 DOI: 10.1103/physreve.94.042311] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 08/08/2016] [Indexed: 11/07/2022]
Abstract
In the traditional setup of the public goods game all players are involved in every available group and the mutual benefit is shared among competing cooperator and defector strategies. However, in real life situations the group formation of players could be more sophisticated because not all players are attractive enough for others to participate in a joint venture. What if only those players who are successful enough to the neighbors can initiate a group formation and establish a game? To elaborate this idea we employ a modified protocol and demonstrate that a carefully chosen threshold to establish a joint venture could efficiently improve the cooperation level even if the synergy factor would suggest a full defector state otherwise. The microscopic mechanism that is responsible for this effect is based on the asymmetric consequences of competing strategies: while the success of a cooperator provides a long-time well-being for the neighborhood, the temporary advantage of defection cannot be maintained if the protocol is based on the success of leaders.
Affiliation(s)
- Attila Szolnoki
  - Institute of Technical Physics and Materials Science, Centre for Energy Research, Hungarian Academy of Sciences, P.O. Box 49, H-1525 Budapest, Hungary
- Xiaojie Chen
  - School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China

12
Stivala A, Kashima Y, Kirley M. Culture and cooperation in a spatial public goods game. Phys Rev E 2016; 94:032303. [PMID: 27739708 DOI: 10.1103/physreve.94.032303] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 05/19/2016] [Indexed: 11/07/2022]
Abstract
We study the coevolution of culture and cooperation by combining the Axelrod model of cultural dissemination with a spatial public goods game, incorporating both noise and social influence. Both participation and cooperation in public goods games are conditional on cultural similarity. We find that a larger "scope of cultural possibilities" in the model leads to the survival of cooperation, when noise is not present, and a higher probability of a multicultural state evolving, for low noise rates. High noise rates, however, lead to both rapid extinction of cooperation and collapse into cultural "anomie," in which stable cultural regions fail to form. These results suggest that cultural diversity can actually be beneficial for the evolution of cooperation, but that cultural information needs to be transmitted accurately in order to maintain both coherent cultural groups and cooperation.
Affiliation(s)
- Alex Stivala
  - Melbourne School of Psychological Sciences, The University of Melbourne, VIC 3010, Australia
- Yoshihisa Kashima
  - Melbourne School of Psychological Sciences, The University of Melbourne, VIC 3010, Australia
- Michael Kirley
  - Department of Computing and Information Systems, The University of Melbourne, VIC 3010, Australia