1
Du C, Lu Y, Meng H, Park J. Evolution of cooperation on reinforcement-learning driven-adaptive networks. Chaos (Woodbury, N.Y.) 2024; 34:041101. [PMID: 38558043 DOI: 10.1063/5.0201968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 03/12/2024] [Indexed: 04/04/2024]
Abstract
Complex networks are widespread in real-world environments across diverse domains. Real-world networks tend to form spontaneously through interactions between individual agents. Inspired by this, we design an evolutionary game model in which agents participate in a prisoner's dilemma game (PDG) with their neighboring agents. Agents can autonomously modify their connections with neighbors using reinforcement learning to avoid unfavorable environments. We find that reinforcement learning-based adaptive networks improve cooperation compared with existing PDGs performed on homogeneous networks. At the same time, the network's topology evolves from a homogeneous to a heterogeneous state. This change occurs as players gain experience from past games and become more astute in deciding whether to join PDGs with their current neighbors or to disconnect from the least profitable neighbors and seek out more favorable environments by establishing connections with second-order neighbors with higher rewards. By calculating the degree distribution and modularity of the adaptive network in a steady state, we confirm that the degree distribution follows a power law and that the network has a clear community structure, indicating that the adaptive network is similar to networks in the real world. Our study reports a new phenomenon in evolutionary game theory on networks. It proposes a new way to generate scale-free networks: through the evolution of homogeneous networks rather than the typical mechanisms of network growth and preferential attachment. Our results provide new perspectives for understanding network structure, the emergence of cooperation, and the behavior of actors in nature and society.
Affiliation(s)
- Chunpeng Du
  - School of Mathematics, Kunming University, Kunming 650214, China
- Yikang Lu
  - School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan 650221, China
- Haoran Meng
  - Technical Center, Shanghai Tobacco Group Co. Ltd., Shanghai 200120, China
- Junpyo Park
  - Department of Applied Mathematics, College of Applied Sciences, Kyung Hee University, Yongin 17104, Republic of Korea
2
Aguilar-Janita M, Khalil N, Leyva I, Sendiña-Nadal I. Cooperation transitions in social games induced by aspiration-driven players. Phys Rev E 2024; 109:024107. [PMID: 38491644 DOI: 10.1103/physreve.109.024107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 01/16/2024] [Indexed: 03/18/2024]
Abstract
Cooperation and defection are social traits whose evolutionary origin is still unresolved. Recent behavioral experiments with humans suggested that strategy changes are driven mainly by the individuals' expectations and not by imitation. This work theoretically analyzes and numerically explores an aspiration-driven strategy updating in a well-mixed population playing games. The payoffs of the game matrix and the aspiration are condensed into just two parameters that allow a comprehensive description of the dynamics. We find continuous and abrupt transitions in the cooperation density with excellent agreement between theory and the Gillespie simulations. Under strong selection, the system can display several levels of steady cooperation or get trapped into absorbing states. Because of their prolonged relaxation times, these states remain relevant for experiments even when irrational choices are made. Finally, we show that for the particular case of the prisoner's dilemma, where defection is the dominant strategy under imitation mechanisms, the self-evaluation update instead favors cooperation nonlinearly with the level of aspiration. Thus, our work provides insights into the distinct roles of imitation and self-evaluation with no learning dynamics.
Affiliation(s)
- M Aguilar-Janita
  - Complex Systems Group & GISC, Universidad Rey Juan Carlos, 28933 Móstoles, Spain
- N Khalil
  - Complex Systems Group & GISC, Universidad Rey Juan Carlos, 28933 Móstoles, Spain
- I Leyva
  - Complex Systems Group & GISC, Universidad Rey Juan Carlos, 28933 Móstoles, Spain
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223 Madrid, Spain
- I Sendiña-Nadal
  - Complex Systems Group & GISC, Universidad Rey Juan Carlos, 28933 Móstoles, Spain
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223 Madrid, Spain
3
A reinforcement learning approach to explore the role of social expectations in altruistic behavior. Sci Rep 2023; 13:1717. [PMID: 36720949 PMCID: PMC9889354 DOI: 10.1038/s41598-023-28659-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2022] [Accepted: 01/23/2023] [Indexed: 02/02/2023] Open
Abstract
While altruism has been studied from a variety of standpoints, none of them has proven sufficient to explain the richness of nuances detected in experimentally observed altruistic behavior. On the other hand, the recent success of behavioral economics in linking expectation formation to key behaviors in complex societies hints at social expectations having a key role in the emergence of altruism. This paper proposes an agent-based model based upon the Bush-Mosteller reinforcement learning algorithm in which agents, subject to stimuli derived from empirical and normative expectations, update their aspirations (and, consequently, their future cooperative behavior) after playing successive rounds of the Dictator Game. The results of the model are compared with experimental results. Such comparison suggests that a stimuli model based on empirical and normative expectations, such as the one presented in this work, has considerable potential for capturing the cognitive-behavioral processes that shape decision-making in contexts where cooperative behavior is relevant.
4
Barfuss W, Meylahn JM. Intrinsic fluctuations of reinforcement learning promote cooperation. Sci Rep 2023; 13:1309. [PMID: 36693872 PMCID: PMC9873645 DOI: 10.1038/s41598-023-27672-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 01/05/2023] [Indexed: 01/26/2023] Open
Abstract
In this work, we ask, and answer, what makes classical temporal-difference reinforcement learning with ε-greedy strategies cooperative. Cooperating in social dilemma situations is vital for animals, humans, and machines. While evolutionary theory revealed a range of mechanisms promoting cooperation, the conditions under which agents learn to cooperate are contested. Here, we demonstrate which individual elements of the multi-agent learning setting lead to cooperation, and how. We use the iterated Prisoner's dilemma with one-period memory as a testbed. Each of the two learning agents learns a strategy that conditions the following action choices on both agents' action choices of the last round. We find that next to a high caring for future rewards, a low exploration rate, and a small learning rate, it is primarily intrinsic stochastic fluctuations of the reinforcement learning process which double the final rate of cooperation to up to 80%. Thus, inherent noise is not a necessary evil of the iterative learning process. It is a critical asset for the learning of cooperation. However, we also point out the trade-off between a high likelihood of cooperative behavior and achieving this in a reasonable amount of time. Our findings are relevant for purposefully designing cooperative algorithms and regulating undesired collusive effects.
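The learning setting described in this abstract can be illustrated with a minimal sketch: two tabular temporal-difference (Q-learning) agents with ε-greedy exploration play the iterated prisoner's dilemma, each conditioning on the joint action of the previous round. The payoff values, the default parameters, and the names `PAYOFF`, `choose`, and `run` are assumptions made for illustration only; they are not taken from the paper, and this is not the authors' implementation.

```python
import random

# Payoffs for (my_action, other_action); 0 = cooperate, 1 = defect.
# T=5, R=3, P=1, S=0 is a conventional choice, assumed here for illustration.
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}


def choose(q_row, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return 0 if q_row[0] >= q_row[1] else 1


def run(rounds=10000, alpha=0.05, gamma=0.9, epsilon=0.05, seed=0):
    """Return the fraction of cooperative actions over all rounds and both agents."""
    rng = random.Random(seed)
    # One-period memory: a state is (own last action, other's last action).
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    q = [{s: [0.0, 0.0] for s in states} for _ in range(2)]
    state = [(0, 0), (0, 0)]  # assumed start: both agents remember mutual cooperation
    coop = 0
    for _ in range(rounds):
        acts = [choose(q[i][state[i]], epsilon, rng) for i in range(2)]
        for i in range(2):
            me, other = acts[i], acts[1 - i]
            reward = PAYOFF[(me, other)]
            next_state = (me, other)
            # Temporal-difference update toward reward plus discounted best next value.
            td_target = reward + gamma * max(q[i][next_state])
            q[i][state[i]][me] += alpha * (td_target - q[i][state[i]][me])
        state = [(acts[0], acts[1]), (acts[1], acts[0])]
        coop += acts.count(0)
    return coop / (2 * rounds)


if __name__ == "__main__":
    print(run())
```

Here `gamma` plays the role of the "caring for future rewards", `epsilon` the exploration rate, and `alpha` the learning rate mentioned in the abstract; varying them in this toy version is only a way to build intuition, not a reproduction of the reported results.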
Affiliation(s)
- Wolfram Barfuss
  - Tübingen AI Center, University of Tübingen, Tübingen, Germany
- Janusz M Meylahn
  - Department of Applied Mathematics, University of Twente, Enschede, The Netherlands
  - Dutch Institute of Emergent Phenomena, University of Amsterdam, Amsterdam, The Netherlands
5
Song Z, Guo H, Jia D, Perc M, Li X, Wang Z. Reinforcement learning facilitates an optimal interaction intensity for cooperation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
6
McCauley TG, McCullough ME. Retrospective Self-Reported Childhood Experiences in Enriched Environments Uniquely Predict Prosocial Behavior and Personality Traits in Adulthood. Evolutionary Psychology 2022; 20:14747049221110603. [PMID: 35791506 PMCID: PMC10303491 DOI: 10.1177/14747049221110603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/18/2021] [Revised: 04/26/2022] [Accepted: 06/14/2022] [Indexed: 08/19/2023] Open
Abstract
What features of people's childhood environments go on to shape their prosocial behavior during adulthood? Past studies linking childhood environment to adult prosocial behavior have focused primarily on adverse features, thereby neglecting the possible influence of exposure to enriched environments (e.g., access to material resources, experiences with rich cooperative relationships, and interactions with morally exemplary role models). Here, we expand the investigation of childhood environmental quality to include consideration of enriching childhood experiences and their relation to adult prosociality. In two cross-sectional studies, we found promising evidence that enriched childhood environments are associated with adult moral behavior. In Study 1 (N = 1,084 MTurk workers), we adapted an existing measure of enriched childhood environmental quality for retrospective recall of childhood experiences and found that subjects' recollections of their enriched childhood experiences are distinct from their recollections of adverse childhood experiences. In Study 2 (N = 2,208 MTurk workers), we found that a formative composite of subjects' recollections of enriched childhood experiences is positively associated with a variety of morally relevant traits in adulthood, including agreeableness, honesty-humility, altruism, endorsement of the principle of care, empathic responding to the plights of needy others, and charitable donations in an experimental setting, and that these associations held after controlling for childhood environmental adversity, childhood socioeconomic status, sex, and age. We also found evidence suggesting that some, but not all, of the relationship between enrichment and adult prosociality can be explained by a shared genetic correlation. We include a new seven-item measure as an appendix.
Affiliation(s)
- Thomas G. McCauley
  - Department of Psychology, University of California San Diego, La Jolla, CA, USA
  - Department of Psychology, University of Miami, Coral Gables, FL, USA
- Michael E. McCullough
  - Department of Psychology, University of California San Diego, La Jolla, CA, USA
  - Department of Psychology, University of Miami, Coral Gables, FL, USA
7
Lindig-León C, Schmid G, Braun DA. Nash equilibria in human sensorimotor interactions explained by Q-learning with intrinsic costs. Sci Rep 2021; 11:20779. [PMID: 34675336 PMCID: PMC8531365 DOI: 10.1038/s41598-021-99428-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/07/2021] [Accepted: 09/01/2021] [Indexed: 11/09/2022] Open
Abstract
The Nash equilibrium concept has previously been shown to be an important tool to understand human sensorimotor interactions, where different actors vie for minimizing their respective effort while engaging in a multi-agent motor task. However, it is not clear how such equilibria are reached. Here, we compare different reinforcement learning models to human behavior engaged in sensorimotor interactions with haptic feedback based on three classic games, including the prisoner's dilemma, and the symmetric and asymmetric matching pennies games. We find that a discrete analysis that reduces the continuous sensorimotor interaction to binary choices as in classical matrix games does not allow us to distinguish between the different learning algorithms, but that a more detailed continuous analysis with continuous formulations of the learning algorithms and the game-theoretic solutions affords different predictions. In particular, we find that Q-learning with intrinsic costs that disfavor deviations from average behavior explains the observed data best, even though all learning algorithms equally converge to admissible Nash equilibrium solutions. We therefore conclude that it is important to study different learning algorithms for understanding sensorimotor interactions: such behavior cannot be inferred from a game-theoretic analysis that focuses on the Nash equilibrium concept alone, because different learning algorithms impose preferences on the set of possible equilibrium solutions due to the inherent learning dynamics.
Affiliation(s)
- Cecilia Lindig-León
  - Institute of Neural Information Processing, Faculty of Engineering, Computer Science and Psychology, Ulm University, Ulm, Germany
- Gerrit Schmid
  - Institute of Neural Information Processing, Faculty of Engineering, Computer Science and Psychology, Ulm University, Ulm, Germany
- Daniel A Braun
  - Institute of Neural Information Processing, Faculty of Engineering, Computer Science and Psychology, Ulm University, Ulm, Germany
8
Aspiration dynamics generate robust predictions in heterogeneous populations. Nat Commun 2021; 12:3250. [PMID: 34059670 PMCID: PMC8166829 DOI: 10.1038/s41467-021-23548-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/07/2020] [Accepted: 05/05/2021] [Indexed: 12/03/2022] Open
Abstract
Update rules, which describe how individuals adjust their behavior over time, affect the outcome of social interactions. Theoretical studies have shown that evolutionary outcomes are sensitive to model details when update rules are imitation-based but are robust when update rules are self-evaluation based. However, studies of self-evaluation based rules have focused on homogeneous population structures where each individual has the same number of neighbors. Here, we consider heterogeneous population structures represented by weighted networks. Under weak selection, we analytically derive the condition for strategy success, which coincides with the classical condition of risk-dominance. This condition holds for all weighted networks and distributions of aspiration levels, and for individualized ways of self-evaluation. Our findings recover previous results as special cases and demonstrate the universality of the robustness property under self-evaluation based rules. Our work thus sheds light on the intrinsic difference between evolutionary dynamics under self-evaluation based and imitation-based update rules. Social interaction outcomes can depend on the type of information individuals possess and how it is used in decision-making. Here, Zhou et al. find that self-evaluation based decision-making rules lead to evolutionary outcomes that are robust to different population structures and ways of self-evaluation.
9
Pereda M, Ozaita J, Stavrakakis I, Sánchez A. Competing for congestible goods: experimental evidence on parking choice. Sci Rep 2020; 10:20803. [PMID: 33257701 PMCID: PMC7705686 DOI: 10.1038/s41598-020-77711-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/09/2020] [Accepted: 11/02/2020] [Indexed: 11/24/2022] Open
Abstract
Congestible goods describe situations in which a group of people share or use a public good that becomes congested or overexploited when demand is high. We study experimentally a congestible goods problem of relevance for parking design, namely how people choose between a convenient parking lot with few spots and a less convenient one with unlimited space. We find that the Nash equilibrium predicts reasonably well the competition for the convenient parking when it has few spots, but not when it has more availability. We then show that the Rosenthal equilibrium, a bounded-rational approach, is a better description of the experimental results accounting for the randomness in the decision process. We introduce a dynamical model that shows how Rosenthal equilibria can be approached in a few rounds of the game. Our results give insights on how to deal with parking problems such as the design of parking lots in central locations in cities and open the way to better understand similar congestible goods problems in other contexts.
Affiliation(s)
- María Pereda
  - Grupo de Investigación Ingeniería de Organización y Logística (IOL), Departamento Ingeniería de Organización, Administración de empresas y Estadística, Escuela Técnica Superior de Ingenieros Industriales, Universidad Politécnica de Madrid, C/ José Gutiérrez Abascal, 2, 28006, Madrid, Spain
  - Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS) UC3M-UV-UZ, 28911, Leganés, Madrid, Spain
- Juan Ozaita
  - Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS) UC3M-UV-UZ, 28911, Leganés, Madrid, Spain
  - Grupo Interdisciplinar de Sistemas Complejos, Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911, Leganés, Madrid, Spain
- Ioannis Stavrakakis
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece
- Angel Sánchez
  - Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UMICCS) UC3M-UV-UZ, 28911, Leganés, Madrid, Spain
  - Grupo Interdisciplinar de Sistemas Complejos, Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911, Leganés, Madrid, Spain
  - Institute UC3M-Santander for Big Data (IBiDat), Universidad Carlos III de Madrid, 28903, Getafe, Madrid, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, 50018, Zaragoza, Spain
10
Evolution of cooperation in malicious social networks with differential privacy mechanisms. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05243-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
11
Mazzolini A, Celani A. Generosity, selfishness and exploitation as optimal greedy strategies for resource sharing. J Theor Biol 2020; 485:110041. [DOI: 10.1016/j.jtbi.2019.110041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/03/2019] [Revised: 09/30/2019] [Accepted: 10/08/2019] [Indexed: 11/15/2022]
12
Shuler RL. Wealth-relative effects in cooperation games. Heliyon 2019; 5:e02958. [PMID: 31872125 PMCID: PMC6909069 DOI: 10.1016/j.heliyon.2019.e02958] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/03/2018] [Revised: 09/23/2019] [Accepted: 11/27/2019] [Indexed: 11/02/2022] Open
Abstract
This paper investigates cooperation games in which poor agents do not benefit from cooperation with wealthy agents. They instead benefit from considering wealth relative to decision payoffs of fitness or wealth. Of concern is the effect of cooperation on participants, their rational self-interest and choices, and not the evolution of cooperation directly. The accumulation of fitness or wealth has been shown in the literature to lead to different optimal strategies for wealthy and poor players in Chicken games. The effect could have important explanatory power if it were more broadly applicable. First we empirically compare two published results, one involving the temptation parameter vs. degree of cooperation in Prisoner's Dilemma, and the other a surprising result from a public goods game with participants from different cultures, networks and wealth in which a fixed rather than relative payoff scheme was used. Using the temptation data to calibrate the public goods behavior suggests wealth factors can provide an explanation for the results. Second we show using simulation that adding a survival threshold to a wealth or fitness accumulating Iterated Prisoner's Dilemma produces a wealth relative effect. We clarify previous results to show the poor must avoid survival risk, regardless of whether this is associated with cooperation or defection. We do this by introducing the Farmer's Game, a simulation of Iterated Prisoner's Dilemma with wealth accumulation and a survival threshold. This is used to evaluate the Tit-for-Tat strategy and four variants. Equilibrium payoffs keep the game scaled to social relevance, with a fraction of all payoffs externalized as a turn cost parameter. 
Findings include poor performance of Tit-for-Tat near the survival threshold, superior performance of low risk strategies for both poor and wealthy players, dependence of survival of the poor near the threshold on Tit-for-Tat forgiveness, unexpected optimization of forgiveness without encountering a social dilemma, improved performance of a diverse mix of strategies, and a more abrupt threshold of social catastrophe for the better performing mix. Lastly we compare cooperating and non-cooperating societies using the simulation and discover disturbing connections between cooperation and familiar non-egalitarian wealth distribution patterns.
Affiliation(s)
- Robert L Shuler
  - NASA Johnson Space Center, 2101 NASA Parkway, Houston, TX, 77058, USA
13
Kim HR, Toyokawa W, Kameda T. How do we decide when (not) to free-ride? Risk tolerance predicts behavioral plasticity in cooperation. Evol Hum Behav 2019. [DOI: 10.1016/j.evolhumbehav.2018.08.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/28/2022]
14
Realpe-Gómez J, Andrighetto G, Nardin LG, Montoya JA. Balancing selfishness and norm conformity can explain human behavior in large-scale prisoner's dilemma games and can poise human groups near criticality. Phys Rev E 2018; 97:042321. [PMID: 29758626 DOI: 10.1103/physreve.97.042321] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/18/2017] [Indexed: 01/16/2023]
Abstract
Cooperation is central to the success of human societies as it is crucial for overcoming some of the most pressing social challenges of our time; still, how human cooperation is achieved and may persist is a main puzzle in the social and biological sciences. Recently, scholars have recognized the importance of social norms as solutions to major local and large-scale collective action problems, from the management of water resources to the reduction of smoking in public places to the change in fertility practices. Yet a well-founded model of the effect of social norms on human cooperation is still lacking. Using statistical-physics techniques and integrating findings from cognitive and behavioral sciences, we present an analytically tractable model in which individuals base their decisions to cooperate both on the economic rewards they obtain and on the degree to which their action complies with social norms. Results from this parsimonious model are in agreement with observations in recent large-scale experiments with humans. We also find the phase diagram of the model and show that the experimental human group is poised near a critical point, a regime where recent work suggests living systems respond to changing external conditions in an efficient and coordinated manner.
Affiliation(s)
- John Realpe-Gómez
  - Quantum Artificial Intelligence Laboratory, NASA Ames Research Center, Moffett Field, California 94035, USA
  - Instituto de Matemáticas Aplicadas, Universidad de Cartagena, Cartagena de Indias, Bolívar 13001, Colombia
  - SGT Inc., 7701 Greenbelt Road, Greenbelt, Maryland 20770, USA
- Giulia Andrighetto
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome 00185, Italy
  - Mälardalen University, Högskoleplan 1, 721 23 Västerås, Sweden
  - Institute for Futures Studies, Holländargatan 13, 101 31 Stockholm, Sweden
- Luis Gustavo Nardin
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome 00185, Italy
  - Brandenburg University of Technology, 03046 Cottbus, Brandenburg, Germany
- Javier Antonio Montoya
  - Grupo de Modelado Computacional, Universidad de Cartagena, Cartagena de Indias, Bolívar 13001, Colombia
  - The Abdus Salam International Centre for Theoretical Physics, Strada Costiera 11, 34151 Trieste, Italy
15
Han X, Cao S, Bao JZ, Wang WX, Zhang B, Gao ZY, Sánchez A. Equal status in Ultimatum Games promotes rational sharing. Sci Rep 2018; 8:1222. [PMID: 29352130 PMCID: PMC5775192 DOI: 10.1038/s41598-018-19503-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/31/2017] [Accepted: 12/18/2017] [Indexed: 11/09/2022] Open
Abstract
Experiments on the Ultimatum Game (UG) repeatedly show that people's behaviour is far from rational. In UG experiments, a subject proposes how to divide a pot and the other can accept or reject the proposal, in which case both lose everything. While rational people would offer and accept the minimum possible amount, in experiments low offers are often rejected and offers are typically larger than the minimum, and even fair. Several theoretical works have proposed that these results may arise evolutionarily when subjects act in both roles and there is a fixed interaction structure in the population specifying who plays with whom. We report the first experiments on structured UG with subjects playing simultaneously both roles. We observe that acceptance levels of responders approach rationality and proposers accommodate their offers to their environment. More precisely, subjects keep low acceptance levels all the time, but as proposers they follow a best-response-like approach to choose their offers. We thus find that status equality promotes rational sharing while the influence of structure leads to fairer offers compared to well-mixed populations. Our results are far from what is observed in single-role UG experiments and largely different from available predictions based on evolutionary game theory.
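The one-shot rule stated in this abstract (the proposer offers a split; if the responder rejects, both get nothing) can be written as a small payoff function. Modelling the responder by a single acceptance threshold is an assumption made here for illustration, and the name `ultimatum_payoffs` is hypothetical; the paper's experimental design is not reproduced.

```python
def ultimatum_payoffs(pot, offer, acceptance_threshold):
    """Return (proposer_payoff, responder_payoff) for one Ultimatum Game round.

    The responder is assumed to accept any offer at or above a fixed threshold.
    """
    if not 0 <= offer <= pot:
        raise ValueError("offer must lie between 0 and the pot")
    if offer >= acceptance_threshold:
        # Accepted: the pot is divided as proposed.
        return pot - offer, offer
    # Rejected: both players lose everything.
    return 0, 0
```

Under this sketch, a payoff-maximizing responder corresponds to a threshold near the minimum possible offer, while the rejections of low offers reported in single-role experiments correspond to higher thresholds.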
Affiliation(s)
- Xiao Han
  - MOE Key Laboratory for Urban Transportation Complex Systems Theory and Technology, Beijing Jiaotong University, Beijing, 100044, P. R. China
  - School of Systems Science, Beijing Normal University, Beijing, 100875, P. R. China
- Shinan Cao
  - School of Finance, University of International Business and Economics, Beijing, 100029, P. R. China
- Jian-Zhang Bao
  - School of Systems Science, Beijing Normal University, Beijing, 100875, P. R. China
- Wen-Xu Wang
  - School of Systems Science, Beijing Normal University, Beijing, 100875, P. R. China
- Boyu Zhang
  - Laboratory of Mathematics and Complex Systems, Ministry of Education, School of Mathematical Sciences, Beijing Normal University, Beijing, 100875, P. R. China
- Zi-You Gao
  - MOE Key Laboratory for Urban Transportation Complex Systems Theory and Technology, Beijing Jiaotong University, Beijing, 100044, P. R. China
- Angel Sánchez
  - Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911 Leganés, Spain
  - UC3M-BS Institute of Financial Big Data (IFIBID), Universidad Carlos III de Madrid, 28911 Leganés, Spain
  - Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza, Zaragoza, Spain
  - Unidad Mixta Interdisciplinar de Comportamiento y Complejidad Social (UICCS) UC3M-UV-UZ, Universidad Carlos III de Madrid, 28911 Leganés, Spain
16
Abstract
Evolutionary game theory predicts that cooperation in social dilemma games is promoted when agents are connected as a network. However, when networks are fixed over time, humans do not necessarily show enhanced mutual cooperation. Here we show that reinforcement learning (specifically, the so-called Bush-Mosteller model) approximately explains the experimentally observed network reciprocity and the lack thereof in a parameter region spanned by the benefit-to-cost ratio and the node’s degree. Thus, we significantly extend previously obtained numerical results.
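For readers unfamiliar with the Bush-Mosteller model named in this abstract, the following is a minimal sketch of one common formulation: the probability of cooperating is nudged up or down depending on whether the last payoff exceeded an aspiration level. The tanh stimulus, the parameter `beta`, and the function name `bm_update` are assumptions chosen for illustration; the paper's exact variant and parameter values may differ.

```python
import math


def bm_update(p, cooperated, payoff, aspiration, beta=1.0):
    """Return the updated cooperation probability after one round.

    p           -- current probability of cooperating, in [0, 1]
    cooperated  -- True if the agent cooperated in the last round
    payoff      -- payoff received in the last round
    aspiration  -- payoff level the agent regards as satisfactory
    beta        -- sensitivity of the stimulus (assumed parameter)
    """
    # Stimulus in (-1, 1): positive if the payoff exceeded the aspiration.
    s = math.tanh(beta * (payoff - aspiration))
    if cooperated:
        # Satisfied cooperators cooperate more; dissatisfied ones cooperate less.
        return p + (1 - p) * s if s >= 0 else p + p * s
    # Satisfied defectors cooperate less; dissatisfied ones cooperate more.
    return p - p * s if s >= 0 else p - (1 - p) * s
```

Because each branch moves `p` only a fraction of the way toward 0 or 1, the updated value stays a valid probability without clipping.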
17
Pereda M, Brañas-Garza P, Rodríguez-Lara I, Sánchez A. The emergence of altruism as a social norm. Sci Rep 2017; 7:9684. [PMID: 28851876 PMCID: PMC5575094 DOI: 10.1038/s41598-017-07712-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/11/2017] [Accepted: 06/23/2017] [Indexed: 12/02/2022] Open
Abstract
Expectations, exerting influence through social norms, are a very strong candidate to explain how complex societies function. In the Dictator game (DG), people expect generous behavior from others even if they cannot enforce any sharing of the pie. Here we assume that people donate following their expectations, and that they update their expectations by reinforcement learning after playing a DG, to construct a model that explains the main experimental results in the DG. Full agreement with the experimental results is reached when some degree of mismatch between expectations and donations is added into the model. These results are robust against the presence of envious agents, but are affected if we introduce selfish agents that do not update their expectations. Our results point to social norms being at the basis of the generous behavior observed in the DG and also to the wide applicability of reinforcement learning to explain many strategic interactions.
Affiliation(s)
- María Pereda
  - Grupo Interdisciplinar de Sistemas Complejos, Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911, Leganés, Madrid, Spain
- Pablo Brañas-Garza
  - Middlesex University London, Department of Economics, Business School, Hendon Campus, The Burroughs, London, NW4 4BT, United Kingdom
- Ismael Rodríguez-Lara
  - Middlesex University London, Department of Economics, Business School, Hendon Campus, The Burroughs, London, NW4 4BT, United Kingdom
- Angel Sánchez
  - Grupo Interdisciplinar de Sistemas Complejos, Departamento de Matemáticas, Universidad Carlos III de Madrid, 28911, Leganés, Madrid, Spain
  - Institute UC3M-BS of Financial Big Data, Universidad Carlos III de Madrid, 28903, Getafe, Spain
  - Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, 50018, Zaragoza, Spain
18
Reinforcement learning accounts for moody conditional cooperation behavior: experimental results. Sci Rep 2017; 7:39275. [PMID: 28071646 PMCID: PMC5223288 DOI: 10.1038/srep39275] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/12/2016] [Accepted: 11/21/2016] [Indexed: 11/08/2022] Open
Abstract
In social dilemma games, human participants often show conditional cooperation (CC) behavior or its variant called moody conditional cooperation (MCC), with which they basically tend to cooperate when many other peers have previously cooperated. Recent computational studies showed that CC and MCC behavioral patterns could be explained by reinforcement learning. In the present study, we use a repeated multiplayer prisoner's dilemma game and the repeated public goods game played by human participants to examine whether MCC is observed across different types of game and whether reinforcement learning can explain the observed behavior. We observed MCC behavior in both games, but the MCC that we observed was different from that observed in past experiments. In the present study, whether or not a focal participant cooperated previously affected the overall level of cooperation, instead of changing the tendency of cooperation in response to cooperation of other participants in the previous time step. We found that, across different conditions, reinforcement learning models were approximately as accurate as an MCC model in describing the experimental results. Consistent with the previous computational studies, the present results suggest that reinforcement learning may be a major proximate mechanism governing MCC behavior.