Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Frankenhuis WE, Panchanathan K, Barto AG. Enriching behavioral ecology with reinforcement learning methods. Behav Processes 2018;161:94-100. [PMID: 29412143 DOI: 10.1016/j.beproc.2018.01.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/13/2017] [Revised: 01/05/2018] [Accepted: 01/10/2018] [Indexed: 01/13/2023]

Cited by Other Article(s)

Agboka KM, Peter E, Bwambale E, Sokame BM. Biological reinforcement learning simulation for natural enemy-host behavior: Exploring deep learning algorithms for population dynamics. MethodsX 2024;13:102845. [PMID: 39092273 PMCID: PMC11292350 DOI: 10.1016/j.mex.2024.102845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Accepted: 07/02/2024] [Indexed: 08/04/2024] Open

Borgstede M. Behavioral selection in structured populations. Theory Biosci 2024;143:97-105. [PMID: 38441745 PMCID: PMC11127832 DOI: 10.1007/s12064-024-00413-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 01/31/2024] [Indexed: 05/27/2024]

Brown RD, Pepper GV. The Uncontrollable Mortality Risk Hypothesis: Theoretical foundations and implications for public health. Evol Med Public Health 2024;12:86-96. [PMID: 38807860 PMCID: PMC11132133 DOI: 10.1093/emph/eoae009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 04/26/2024] [Indexed: 05/30/2024] Open

Farkas BC, Baptista A, Speranza M, Wyart V, Jacquet PO. Specifying the timescale of early life unpredictability helps explain the development of internalising and externalising behaviours. Sci Rep 2024;14:3563. [PMID: 38347055 PMCID: PMC10861493 DOI: 10.1038/s41598-024-54093-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 02/08/2024] [Indexed: 02/15/2024] Open

Segovia-Martin J, Creutzig F, Winters J. Efficiency traps beyond the climate crisis: exploration-exploitation trade-offs and rebound effects. Philos Trans R Soc Lond B Biol Sci 2023;378:20220405. [PMID: 37718604 PMCID: PMC10505854 DOI: 10.1098/rstb.2022.0405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 06/16/2023] [Indexed: 09/19/2023] Open

Lapeyrolerie M, Chapman MS, Norman KEA, Boettiger C. Deep reinforcement learning for conservation decisions. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

Borowiec ML, Dikow RB, Frandsen PB, McKeeken A, Valentini G, White AE. Deep learning as a tool for ecology and evolution. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13901] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]

Frankenhuis WE, Amir D. What is the expected human childhood? Insights from evolutionary anthropology. Dev Psychopathol 2022;34:473-497. [PMID: 34924077 DOI: 10.1017/s0954579421001401] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 01/10/2023]

Railsback SF. Suboptimal foraging theory: How inaccurate predictions and approximations can make better models of adaptive behavior. Ecology 2022;103:e3721. [PMID: 35394652 DOI: 10.1002/ecy.3721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/25/2021] [Revised: 02/10/2022] [Accepted: 02/16/2022] [Indexed: 11/12/2022]

Abstract

Optimal foraging theory (OFT) is based on the ecological concept that organisms select behaviors that convey future fitness, and on the mathematical concept of optimization: finding the alternative that provides the best value of a fitness measure. As implemented in, e.g., state-based dynamic modeling, OFT is powerful for one key problem of modern ecology: modeling behavior as a tradeoff among competing fitness elements such as growth, risk avoidance, and reproductive output. However, OFT is not useful for other modern problems such as representing feedbacks within systems of interacting, unique individuals: when we need to model foraging by each of many individuals that interact competitively or synergistically, optimization is impractical or impossible; there are no optimal behaviors. For such problems we can, however, still use the concept of future fitness to model behavior, by replacing optimization with less precise (but perhaps more realistic) techniques for ranking alternatives. Instead of simplifying the systems we model until we can find "optimal" behavior, we can use theory based on inaccurate predictions, coarse approximations, and updating to produce good behavior in more complex and realistic contexts. This "state- and prediction-based theory" (SPT) can, for example, produce realistic foraging decisions by each of many unique, interacting individuals when growth rates and predation risks vary over space and time. Because SPT lets us address more natural complexity and more realistic problems, it is more easily tested against more kinds of observation and more useful in management ecology. A simple foraging model illustrates how SPT readily accommodates complexities that make optimization intractable.
Other models use SPT to represent contingent decisions (whether to feed or hide, in what patch) that are tradeoffs between growth and predation risk, when both growth and risk vary among hundreds of patches, vary unpredictably over time, depend on characteristics of the individuals, are subject to feedbacks from competition, and change over the daily light cycle. Modern ecology demands theory for tradeoff behaviors in complex contexts that produce feedbacks; when optimization is infeasible, we should not be afraid to use approximate fitness-seeking methods instead.

Walasek N, Frankenhuis WE, Panchanathan K. Sensitive periods, but not critical periods, evolve in a fluctuating environment: a model of incremental development. Proc Biol Sci 2022;289:20212623. [PMID: 35168396 PMCID: PMC8848242 DOI: 10.1098/rspb.2021.2623] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022] Open

Walasek N, Frankenhuis WE, Panchanathan K. An evolutionary model of sensitive periods when the reliability of cues varies across ontogeny. Behav Ecol 2022;33:101-114. [PMID: 35197808 PMCID: PMC8857937 DOI: 10.1093/beheco/arab113] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/17/2021] [Revised: 07/22/2021] [Accepted: 09/17/2021] [Indexed: 11/13/2022] Open

Borgstede M. Why Do Individuals Seek Information? A Selectionist Perspective. Front Psychol 2021;12:684544. [PMID: 34867580 PMCID: PMC8639505 DOI: 10.3389/fpsyg.2021.684544] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/23/2021] [Accepted: 10/19/2021] [Indexed: 11/16/2022] Open

Abstract

Several authors have proposed that mechanisms of adaptive behavior, and reinforcement learning in particular, can be explained by an innate tendency of individuals to seek information about the local environment. In this article, I argue that these approaches adhere to an essentialist view of learning that avoids the question why information seeking should be favorable in the first place. I propose a selectionist account of adaptive behavior that explains why individuals behave as if they had a tendency to seek information without resorting to essentialist explanations. I develop my argument using a formal selectionist framework for adaptive behavior, the multilevel model of behavioral selection (MLBS). The MLBS has been introduced recently as a formal theory of behavioral selection that links reinforcement learning to natural selection within a single unified model. I show that the MLBS implies an average gain in information about the availability of reinforcement. Formally, this means that behavior reaches an equilibrium state, if and only if the Fisher information of the conditional probability of reinforcement is maximized. This coincides with a reduction in the randomness of the expected environmental feedback as captured by the information theoretic concept of expected surprise (i.e., entropy). The main result is that behavioral selection maximizes the information about the expected fitness consequences of behavior, which, in turn, minimizes average surprise. In contrast to existing attempts to link adaptive behavior to information theoretic concepts (e.g., the free energy principle), neither information gain nor surprise minimization is treated as a first principle. Instead, the result is formally deduced from the MLBS and therefore constitutes a mathematical property of the more general principle of behavioral selection. 
Thus, if reinforcement learning is understood as a selection process, there is no need to assume an active agent with an innate tendency to seek information or minimize surprise. Instead, information gain and surprise minimization emerge naturally because it lies in the very nature of selection to produce order from randomness.

Towards learning behavior modeling of military logistics agent utilizing profit sharing reinforcement learning algorithm. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107784] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]

Ciranka S, van den Bos W. Adolescent risk-taking in the context of exploration and social influence. Dev Rev 2021. [DOI: 10.1016/j.dr.2021.100979] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022]

Tessereau C, O’Dea R, Coombes S, Bast T. Reinforcement learning approaches to hippocampus-dependent flexible spatial navigation. Brain Neurosci Adv 2021;5:2398212820975634. [PMID: 33954259 PMCID: PMC8042550 DOI: 10.1177/2398212820975634] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/31/2020] [Accepted: 10/21/2020] [Indexed: 11/17/2022] Open

Abstract

Humans and non-human animals show great flexibility in spatial navigation, including the ability to return to specific locations based on as few as one single experience. To study spatial navigation in the laboratory, watermaze tasks, in which rats have to find a hidden platform in a pool of cloudy water surrounded by spatial cues, have long been used. Analogous tasks have been developed for human participants using virtual environments. Spatial learning in the watermaze is facilitated by the hippocampus. In particular, rapid, one-trial, allocentric place learning, as measured in the delayed-matching-to-place variant of the watermaze task, which requires rodents to learn repeatedly new locations in a familiar environment, is hippocampal dependent. In this article, we review some computational principles, embedded within a reinforcement learning framework, that utilise hippocampal spatial representations for navigation in watermaze tasks. We consider which key elements underlie their efficacy, and discuss their limitations in accounting for hippocampus-dependent navigation, both in terms of behavioural performance (i.e. how well do they reproduce behavioural measures of rapid place learning) and neurobiological realism (i.e. how well do they map to neurobiological substrates involved in rapid place learning). We discuss how an actor-critic architecture, enabling simultaneous assessment of the value of the current location and of the optimal direction to follow, can reproduce one-trial place learning performance as shown on watermaze and virtual delayed-matching-to-place tasks by rats and humans, respectively, if complemented with map-like place representations. The contribution of actor-critic mechanisms to delayed-matching-to-place performance is consistent with neurobiological findings implicating the striatum and hippocampo-striatal interaction in delayed-matching-to-place performance, given that the striatum has been associated with actor-critic mechanisms. 
Moreover, we illustrate that hierarchical computations embedded within an actor-critic architecture may help to account for aspects of flexible spatial navigation. The hierarchical reinforcement learning approach separates trajectory control via a temporal-difference error from goal selection via a goal prediction error and may account for flexible, trial-specific, navigation to familiar goal locations, as required in some arm-maze place memory tasks, although it does not capture one-trial learning of new goal locations, as observed in open field, including watermaze and virtual, delayed-matching-to-place tasks. Future models of one-shot learning of new goal locations, as observed on delayed-matching-to-place tasks, should incorporate hippocampal plasticity mechanisms that integrate new goal information with allocentric place representation, as such mechanisms are supported by substantial empirical evidence.

Frankenhuis WE, Nettle D. Integration of plasticity research across disciplines. Curr Opin Behav Sci 2020. [DOI: 10.1016/j.cobeha.2020.10.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]

Borgstede M. An evolutionary model of reinforcer value. Behav Processes 2020;175:104109. [DOI: 10.1016/j.beproc.2020.104109] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/08/2019] [Revised: 02/23/2020] [Accepted: 03/19/2020] [Indexed: 12/14/2022]

Han S, Chen H, Harris J, Long R. Who Reports Low Interactive Psychology Status? An Investigation Based on Chinese Coal Miners. Int J Environ Res Public Health 2020;17:ijerph17103446. [PMID: 32429127 PMCID: PMC7277538 DOI: 10.3390/ijerph17103446] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/30/2020] [Revised: 05/10/2020] [Accepted: 05/11/2020] [Indexed: 01/29/2023]

Frankenhuis WE, Nettle D, Dall SRX. A case for environmental statistics of early-life effects. Philos Trans R Soc Lond B Biol Sci 2020;374:20180110. [PMID: 30966883 PMCID: PMC6460088 DOI: 10.1098/rstb.2018.0110] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/24/2022] Open

Frankenhuis WE, Walasek N. Modeling the evolution of sensitive periods. Dev Cogn Neurosci 2020;41:100715. [PMID: 31999568 PMCID: PMC6994616 DOI: 10.1016/j.dcn.2019.100715] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 04/29/2019] [Revised: 08/09/2019] [Accepted: 10/01/2019] [Indexed: 11/28/2022] Open

Reimer JR, Mangel M, Derocher AE, Lewis MA. Matrix methods for stochastic dynamic programming in ecology and evolutionary biology. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]

Cognitive gadgets: A provocative but flawed manifesto. Behav Brain Sci 2019;42:e174. [DOI: 10.1017/s0140525x19001134] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022]

Wang X, Cheng J, Wang L. Deep-Reinforcement Learning-Based Co-Evolution in a Predator-Prey System. Entropy 2019;21:e21080773. [PMID: 33267487 PMCID: PMC7515302 DOI: 10.3390/e21080773] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/26/2019] [Revised: 07/27/2019] [Accepted: 08/06/2019] [Indexed: 11/16/2022]

Budaev S, Jørgensen C, Mangel M, Eliassen S, Giske J. Decision-Making From the Animal Perspective: Bridging Ecology and Subjective Cognition. Front Ecol Evol 2019. [DOI: 10.3389/fevo.2019.00164] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 11/13/2022] Open

Hall-McMaster S, Luyckx F. Revisiting foraging approaches in neuroscience. Cogn Affect Behav Neurosci 2019;19:225-230. [PMID: 30607832 PMCID: PMC6420423 DOI: 10.3758/s13415-018-00682-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/18/2023]

Chumbley J, Steinhoff A. A computational perspective on social attachment. Infant Behav Dev 2019;54:85-98. [PMID: 30641469 DOI: 10.1016/j.infbeh.2018.12.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/06/2018] [Revised: 12/07/2018] [Accepted: 12/09/2018] [Indexed: 11/28/2022]

Frankenhuis WE, Nettle D, McNamara JM. Echoes of Early Life: Recent Insights From Mathematical Modeling. Child Dev 2018;89:1504-1518. [PMID: 29947096 PMCID: PMC6175464 DOI: 10.1111/cdev.13108] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022]