1
Boccardo F, Pierre-Louis O. Reinforcement learning with thermal fluctuations at the nanoscale. Phys Rev E 2024; 110:L023301. [PMID: 39294981 DOI: 10.1103/physreve.110.l023301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 08/06/2024] [Indexed: 09/21/2024]
Abstract
Reinforcement Learning offers a framework to learn to choose actions in order to control a system. However, at small scales Brownian fluctuations limit the control of nanomachine actuation or nanonavigation and of the molecular machinery of life. We analyze this regime using the general framework of Markov decision processes. We show that at the nanoscale, while optimal control actions should bring an improvement proportional to the small ratio of the applied force times a length scale over the temperature, the learned improvement is smaller and proportional to the square of this small ratio. Consequently, the efficiency of learning, which compares the learning improvement to the theoretical optimal improvement, drops to zero. Nevertheless, these limitations can be circumvented by using actions learned at a lower temperature. These results are illustrated with simulations of the control of the shape of small particle clusters.
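The scaling argument in this abstract can be written out as a rough sketch. The symbols are assumed here for illustration and are not taken from the paper: $F$ is the applied force, $a$ a length scale, $k_B T$ the thermal energy, $\Delta_{\mathrm{opt}}$ and $\Delta_{\mathrm{learn}}$ the optimal and learned improvements, and $\eta$ the learning efficiency.

```latex
\epsilon = \frac{F a}{k_B T} \ll 1, \qquad
\Delta_{\mathrm{opt}} \propto \epsilon, \qquad
\Delta_{\mathrm{learn}} \propto \epsilon^{2}, \qquad
\eta = \frac{\Delta_{\mathrm{learn}}}{\Delta_{\mathrm{opt}}} \propto \epsilon \to 0
\quad \text{as } \epsilon \to 0 .
```

Read this way, the efficiency vanishes because the learned improvement is one order higher in the small ratio than the optimal one. Learning at a lower temperature raises $\epsilon$ during training, which is consistent with the workaround the abstract mentions.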
2
Kozielska M, Weissing FJ. A neural network model for the evolution of learning in changing environments. PLoS Comput Biol 2024; 20:e1011840. [PMID: 38289971 PMCID: PMC10857588 DOI: 10.1371/journal.pcbi.1011840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 02/09/2024] [Accepted: 01/18/2024] [Indexed: 02/01/2024] Open
Abstract
Learning from past experience is an important adaptation and theoretical models may help to understand its evolution. Many of the existing models study simple phenotypes and do not consider the mechanisms underlying learning while the more complex neural network models often make biologically unrealistic assumptions and rarely consider evolutionary questions. Here, we present a novel way of modelling learning using small neural networks and a simple, biology-inspired learning algorithm. Learning affects only part of the network, and it is governed by the difference between expectations and reality. We use this model to study the evolution of learning under various environmental conditions and different scenarios for the trade-off between exploration (learning) and exploitation (foraging). Efficient learning readily evolves in our individual-based simulations. However, in line with previous studies, the evolution of learning is less likely in relatively constant environments, where genetic adaptation alone can lead to efficient foraging, or in short-lived organisms that cannot afford to spend much of their lifetime on exploration. Once learning does evolve, the characteristics of the learning strategy (i.e. the duration of the learning period and the learning rate) and the average performance after learning are surprisingly little affected by the frequency and/or magnitude of environmental change. In contrast, an organism's lifespan and the distribution of resources in the environment have a clear effect on the evolved learning strategy: a shorter lifespan or a broader resource distribution lead to fewer learning episodes and larger learning rates. Interestingly, a longer learning period does not always lead to better performance, indicating that the evolved neural networks differ in the effectiveness of learning. 
Overall, however, we show that a biologically inspired, yet relatively simple, learning mechanism can evolve to lead to an efficient adaptation in a changing environment.
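As a rough illustration of learning "governed by the difference between expectations and reality", the following is a minimal prediction-error (delta rule) sketch. It is not the authors' neural network model: the function name `learn_expected_reward`, its parameters, and the scalar estimate are all hypothetical and chosen only to show the idea.

```python
import random


def learn_expected_reward(rewards, learning_rate=0.1, initial_estimate=0.0):
    """Move a scalar expectation toward each observed reward.

    Hypothetical helper for illustration only: the update is proportional
    to the prediction error (observed reward minus current expectation).
    """
    estimate = initial_estimate
    for reward in rewards:
        prediction_error = reward - estimate  # reality minus expectation
        estimate += learning_rate * prediction_error
    return estimate


if __name__ == "__main__":
    # Noisy rewards around 1.0; the estimate should settle near that value.
    rng = random.Random(0)
    samples = [rng.gauss(1.0, 0.2) for _ in range(500)]
    print(learn_expected_reward(samples))
```

A larger `learning_rate` tracks environmental change faster but is noisier, which is one way to picture the trade-off between learning rate and learning period that the abstract discusses.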
Affiliation(s)
- Magdalena Kozielska
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, The Netherlands
- Franz J. Weissing
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, The Netherlands
3
Hamilton's rule, the evolution of behavior rules and the wizardry of control theory. J Theor Biol 2022; 555:111282. [PMID: 36179799 DOI: 10.1016/j.jtbi.2022.111282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2022] [Revised: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 01/14/2023]
Abstract
This paper formalizes selection on a quantitative trait affecting the evolution of behavior (or development) rules through which individuals act and react with their surroundings. Combining Hamilton's marginal rule for selection on scalar traits and concepts from optimal control theory, a necessary first-order condition for the evolutionary stability of the trait in a group-structured population is derived. The model, which is of intermediate level of complexity, fills a gap between the formalization of selection on evolving traits that are directly conceived as actions (no phenotypic plasticity) and selection on evolving traits that are conceived as strategies or function valued actions (complete phenotypic plasticity). By conceptualizing individuals as open deterministic dynamical systems expressing incomplete phenotypic plasticity, the model captures selection on a large class of phenotypic expression mechanisms, including developmental pathways and learning under life-history trade-offs. As an illustration of the results, a first-order condition for the evolutionary stability of behavior response rules from the social evolution literature is re-derived, strengthened, and generalized. All results of the paper also generalize directly to selection on multidimensional quantitative traits affecting behavior rule evolution, thereby covering neural and gene network evolution.
4
5
Sol D, Lapiedra O, González-Lagos C, de Cáceres M. Resource preferences and the emergence of individual niche specialization within populations. Behav Ecol 2021. [DOI: 10.1093/beheco/arab086] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
Growing evidence that individuals of many generalist animals behave as resource specialists has attracted substantial research interest for its ecological and evolutionary implications. Variation in resource preferences is considered to be critical for developing a general theory of individual specialization. However, it remains to be shown whether diverging preferences can arise among individuals sharing a similar environment, and whether these preferences are sufficiently stable over time to be ecologically relevant. We addressed these issues by means of common garden experiments in feral pigeons (Columba livia), a species known to exhibit among-individual resource specialization in the wild. Food-choice experiments on wild-caught pigeons and their captive-bred cross-fostered descendants showed that short-term variation in food preferences can easily arise within a population, and that this variation may represent a substantial fraction of the population foraging niche. However, the experiments also showed that, rather than being limited by genetic or vertical cultural inheritance, food preferences exhibited high plasticity and tended to converge in the long term. Although our results challenge the notion that variation in food preferences is a major driver of resource specialization, early differences in preferences could pave the way to specializations when combined with neophobic responses and/or positive feedbacks that reinforce niche conservation.
Affiliation(s)
- Daniel Sol
  - CREAF, Centre for Ecological Research and Applied Forestries, Cerdanyola del Vallès, Catalonia 08193, Spain
  - CSIC, Spanish National Research Council, CREAF-UAB, Cerdanyola del Vallès, Catalonia 08193, Spain
- Oriol Lapiedra
  - CREAF, Centre for Ecological Research and Applied Forestries, Cerdanyola del Vallès, Catalonia 08193, Spain
- Cesar González-Lagos
  - Centro de Investigación en Recursos Naturales y Sustentabilidad (CIRENYS), Universidad Bernardo O’Higgins, Av. Viel 1497, Santiago 8370993, Chile
  - Centre of Applied Ecology and Sustainability (CAPES), Av. Libertador Bernardo O’Higgins 340, Santiago 8331150, Chile
- Miquel de Cáceres
  - CREAF, Centre for Ecological Research and Applied Forestries, Cerdanyola del Vallès, Catalonia 08193, Spain
6
Frankenhuis WE, Walasek N. Modeling the evolution of sensitive periods. Dev Cogn Neurosci 2020; 41:100715. [PMID: 31999568 PMCID: PMC6994616 DOI: 10.1016/j.dcn.2019.100715] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 04/29/2019] [Revised: 08/09/2019] [Accepted: 10/01/2019] [Indexed: 11/28/2022] Open
Abstract
In the past decade, there has been monumental progress in our understanding of the neurobiological basis of sensitive periods. Little is known, however, about the evolution of sensitive periods. Recent studies have started to address this gap. Biologists have built mathematical models exploring the environmental conditions in which sensitive periods are likely to evolve. These models investigate how mechanisms of plasticity can respond optimally to experience during an individual's lifetime. This paper discusses the central tenets, insights, and predictions of these models, in relation to empirical work on humans and other animals. We also discuss which future models are needed to improve the bridge between theory and data, advancing their synergy. Our paper is written in an accessible manner and for a broad audience. We hope our work will contribute to recently emerging connections between the fields of developmental neuroscience and evolutionary biology.
Affiliation(s)
- Nicole Walasek
  - Behavioural Science Institute, Radboud University, the Netherlands
7
Akçay E. Deconstructing Evolutionary Game Theory: Coevolution of Social Behaviors with Their Evolutionary Setting. Am Nat 2019; 195:315-330. [PMID: 32017621 DOI: 10.1086/706811] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/03/2022]
Abstract
Evolution of social behaviors is one of the most fascinating and active fields of evolutionary biology. During the past half century, social evolution theory developed into a mature field with powerful tools to understand the dynamics of social traits such as cooperation under a wide range of conditions. In this article, I argue that the next stage in the development of social evolution theory should consider the evolution of the setting in which social behaviors evolve. To that end, I propose a conceptual map of the components that make up the evolutionary setting of social behaviors, review existing work that considers the evolution of each component, and discuss potential future directions. The theoretical work reviewed here illustrates how unexpected dynamics can happen when the setting of social evolution itself is evolving, such as cooperation sometimes being self-limiting. I argue that a theory of how the setting of social evolution itself evolves will lead to a deeper understanding of when cooperation and other social behaviors evolve and diversify.
8
9
Wubs M, Bshary R, Lehmann L. A reinforcement learning model for grooming up the hierarchy in primates. Anim Behav 2018. [DOI: 10.1016/j.anbehav.2018.02.014] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/17/2022]
10
Frankenhuis WE, Panchanathan K, Barto AG. Enriching behavioral ecology with reinforcement learning methods. Behav Processes 2018; 161:94-100. [PMID: 29412143 DOI: 10.1016/j.beproc.2018.01.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/13/2017] [Revised: 01/05/2018] [Accepted: 01/10/2018] [Indexed: 01/13/2023]
Abstract
This article focuses on the division of labor between evolution and development in solving sequential, state-dependent decision problems. Currently, behavioral ecologists tend to use dynamic programming methods to study such problems. These methods are successful at predicting animal behavior in a variety of contexts. However, they depend on a distinct set of assumptions. Here, we argue that behavioral ecology will benefit from drawing more than it currently does on a complementary collection of tools, called reinforcement learning methods. These methods allow for the study of behavior in highly complex environments, which conventional dynamic programming methods do not feasibly address. In addition, reinforcement learning methods are well-suited to studying how biological mechanisms solve developmental and learning problems. For instance, we can use them to study simple rules that perform well in complex environments. Or to investigate under what conditions natural selection favors fixed, non-plastic traits (which do not vary across individuals), cue-driven-switch plasticity (innate instructions for adaptive behavioral development based on experience), or developmental selection (the incremental acquisition of adaptive behavior based on experience). If natural selection favors developmental selection, which includes learning from environmental feedback, we can also make predictions about the design of reward systems. Our paper is written in an accessible manner and for a broad audience, though we believe some novel insights can be drawn from our discussion. We hope our paper will help advance the emerging bridge connecting the fields of behavioral ecology and reinforcement learning.
Affiliation(s)
- Willem E Frankenhuis
  - Behavioural Science Institute, Radboud University, Montessorilaan 3, PO Box 9104, 6500 HE Nijmegen, The Netherlands
- Andrew G Barto
  - College of Information and Computer Sciences, University of Massachusetts Amherst, United States
11
Bshary R, Raihani NJ. Helping in humans and other animals: a fruitful interdisciplinary dialogue. Proc Biol Sci 2018; 284:rspb.2017.0929. [PMID: 28954904 PMCID: PMC5627196 DOI: 10.1098/rspb.2017.0929] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/29/2017] [Accepted: 08/29/2017] [Indexed: 11/12/2022] Open
Abstract
Humans are arguably unique in the extent and scale of cooperation with unrelated individuals. While pairwise interactions among non-relatives occur in some non-human species, there is scant evidence of the large-scale, often unconditional prosociality that characterizes human social behaviour. Consequently, one may ask whether research on cooperation in humans can offer general insights to researchers working on similar questions in non-human species, and whether research on humans should be published in biology journals. We contend that the answer to both of these questions is yes. Most importantly, social behaviour in humans and other species operates under the same evolutionary framework. Moreover, we highlight how an open dialogue between different fields can inspire studies on humans and non-human species, leading to novel approaches and insights. Biology journals should encourage these discussions rather than drawing artificial boundaries between disciplines. Shared current and future challenges are to study helping in ecologically relevant contexts in order to correctly interpret how payoff matrices translate into inclusive fitness, and to integrate mechanisms into the hitherto largely functional theory. We can and should study human cooperation within a comparative framework in order to gain a full understanding of the evolution of helping.
Affiliation(s)
- Redouan Bshary
  - Institute of Biology, University of Neuchâtel, Emile-Argand 11, 2000 Neuchâtel, Switzerland
- Nichola J Raihani
  - Department of Experimental Psychology, University College London, 26 Bedford Way, London WC1H 0AP, UK
12
Dridi S, Akçay E. Learning to Cooperate: The Evolution of Social Rewards in Repeated Interactions. Am Nat 2018; 191:58-73. [DOI: 10.1086/694822] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/03/2022]
13
Bshary R, Zuberbühler K, van Schaik CP. Why mutual helping in most natural systems is neither conflict-free nor based on maximal conflict. Philos Trans R Soc Lond B Biol Sci 2016; 371:20150091. [PMID: 26729931 PMCID: PMC4760193 DOI: 10.1098/rstb.2015.0091] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Accepted: 11/20/2015] [Indexed: 11/12/2022] Open
Abstract
Mutual helping for direct benefits can be explained by various game theoretical models, which differ mainly in terms of the underlying conflict of interest between two partners. Conflict is minimal if helping is self-serving and the partner benefits as a by-product. In contrast, conflict is maximal if partners are in a prisoner's dilemma with both having the pay-off-dominant option of not returning the other's investment. Here, we provide evolutionary and ecological arguments for why these two extremes are often unstable under natural conditions and propose that interactions with intermediate levels of conflict are frequent evolutionary endpoints. We argue that by-product helping is prone to becoming an asymmetric investment game since even small variation in by-product benefits will lead to the evolution of partner choice, leading to investments by the chosen class. Second, iterated prisoner's dilemmas tend to take place in stable social groups where the fitness of partners is interdependent, with the effect that a certain level of helping is self-serving. In sum, intermediate levels of mutual helping are expected in nature, while efficient partner monitoring may allow reaching higher levels.
Affiliation(s)
- Redouan Bshary
  - Institute of Biology, University of Neuchâtel, Emile-Argand 11, Neuchâtel 2000, Switzerland
- Klaus Zuberbühler
  - Institute of Biology, University of Neuchâtel, Emile-Argand 11, Neuchâtel 2000, Switzerland
- Carel P van Schaik
  - Anthropological Institute and Museum, University of Zürich, Winterthurerstrasse 190, Zürich 8057, Switzerland