1
Kang P, Tobler PN, Dayan P. Bayesian reinforcement learning: A basic overview. Neurobiol Learn Mem 2024; 211:107924. [PMID: 38579896 DOI: 10.1016/j.nlm.2024.107924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/07/2024]
Abstract
We and other animals learn because there is some aspect of the world about which we are uncertain. This uncertainty arises from initial ignorance, and from changes in the world that we do not perfectly know; the uncertainty often becomes evident when our predictions about the world are found to be erroneous. The Rescorla-Wagner learning rule, which specifies one way that prediction errors can occasion learning, has been hugely influential as a characterization of Pavlovian conditioning and, through its equivalence to the delta rule in engineering, in a much wider class of learning problems. Here, we review the embedding of the Rescorla-Wagner rule in a Bayesian context that is precise about the link between uncertainty and learning, and thereby discuss extensions to such suggestions as the Kalman filter, structure learning, and beyond, that collectively encompass a wider range of uncertainties and accommodate a wider assortment of phenomena in conditioning.
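As a rough illustration of the ideas named in this abstract, the sketch below contrasts a fixed-learning-rate Rescorla-Wagner (delta-rule) update with a scalar Kalman filter, in which the learning rate is set by the learner's current uncertainty. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the single-cue setup, and all parameter values (`alpha`, `drift_var`, `noise_var`) are hypothetical choices made for illustration.

```python
def rescorla_wagner(outcomes, alpha=0.1, v0=0.0):
    """Delta rule with a fixed learning rate: V <- V + alpha * (r - V)."""
    v = v0
    values = []
    for r in outcomes:
        delta = r - v            # prediction error
        v += alpha * delta       # fixed-size step toward the outcome
        values.append(v)
    return values


def kalman_filter(outcomes, v0=0.0, var0=1.0, drift_var=0.01, noise_var=0.5):
    """Scalar Kalman filter: the same error-driven update, but the step size
    (Kalman gain) depends on the current uncertainty about V.
    All default parameter values are illustrative assumptions."""
    v, var = v0, var0
    values, gains = [], []
    for r in outcomes:
        var += drift_var                  # the world may have drifted since last trial
        gain = var / (var + noise_var)    # uncertainty-weighted learning rate
        v += gain * (r - v)               # prediction-error update
        var *= (1.0 - gain)               # observing r reduces uncertainty
        values.append(v)
        gains.append(gain)
    return values, gains


if __name__ == "__main__":
    outcomes = [1.0] * 5
    print(rescorla_wagner(outcomes, alpha=0.5))
    print(kalman_filter(outcomes, drift_var=0.0, noise_var=1.0)[1])
```

With `drift_var=0` the Kalman gain shrinks toward zero as evidence accumulates, whereas a positive `drift_var` keeps the gain bounded away from zero so that the estimate can keep tracking a changing world.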
Affiliation(s)
- Pyungwon Kang
  - University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland.
- Philippe N Tobler
  - University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland.
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany; University of Tübingen, Tübingen, Germany.
2
Sandhu TR, Xiao B, Lawson RP. Transdiagnostic computations of uncertainty: towards a new lens on intolerance of uncertainty. Neurosci Biobehav Rev 2023; 148:105123. [PMID: 36914079 DOI: 10.1016/j.neubiorev.2023.105123] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/11/2022] [Revised: 02/21/2023] [Accepted: 03/08/2023] [Indexed: 03/13/2023]
Abstract
People radically differ in how they cope with uncertainty. Clinical researchers describe a dispositional characteristic known as "intolerance of uncertainty", a tendency to find uncertainty aversive, reported to be elevated across psychiatric and neurodevelopmental conditions. Concurrently, recent research in computational psychiatry has leveraged theoretical work to characterise individual differences in uncertainty processing. Under this framework, differences in how people estimate different forms of uncertainty can contribute to mental health difficulties. In this review, we briefly outline the concept of intolerance of uncertainty within its clinical context, and we argue that the mechanisms underlying this construct may be further elucidated through modelling how individuals make inferences about uncertainty. We will review the evidence linking psychopathology to different computationally specified forms of uncertainty and consider how these findings might suggest distinct mechanistic routes towards intolerance of uncertainty. We also discuss the implications of this computational approach for behavioural and pharmacological interventions, as well as the importance of different cognitive domains and subjective experiences in studying uncertainty processing.
Affiliation(s)
- Timothy R Sandhu
  - Department of Psychology, Downing Place, University of Cambridge, CB2 3EB, UK; MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, CB2 7EF, UK.
- Bowen Xiao
  - Department of Psychology, Downing Place, University of Cambridge, CB2 3EB, UK
- Rebecca P Lawson
  - Department of Psychology, Downing Place, University of Cambridge, CB2 3EB, UK; MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, CB2 7EF, UK
3
Fields C, Friston K, Glazebrook JF, Levin M. A free energy principle for generic quantum systems. Prog Biophys Mol Biol 2022; 173:36-59. [PMID: 35618044 DOI: 10.1016/j.pbiomolbio.2022.05.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/26/2022] [Revised: 05/04/2022] [Accepted: 05/18/2022] [Indexed: 01/17/2023]
Abstract
The Free Energy Principle (FEP) states that under suitable conditions of weak coupling, random dynamical systems with sufficient degrees of freedom will behave so as to minimize an upper bound, formalized as a variational free energy, on surprisal (a.k.a., self-information). This upper bound can be read as a Bayesian prediction error. Equivalently, its negative is a lower bound on Bayesian model evidence (a.k.a., marginal likelihood). In short, certain random dynamical systems evince a kind of self-evidencing. Here, we reformulate the FEP in the formal setting of spacetime-background free, scale-free quantum information theory. We show how generic quantum systems can be regarded as observers, which with the standard freedom of choice assumption become agents capable of assigning semantics to observational outcomes. We show how such agents minimize Bayesian prediction error in environments characterized by uncertainty, insufficient learning, and quantum contextuality. We show that in its quantum-theoretic formulation, the FEP is asymptotically equivalent to the Principle of Unitarity. Based on these results, we suggest that biological systems employ quantum coherence as a computational resource and - implicitly - as a communication resource. We summarize a number of problems for future research, particularly involving the resources required for classical communication and for detecting and responding to quantum context switches.
Affiliation(s)
- Chris Fields
  - 23 Rue des Lavandières, 11160, Caunes Minervois, France.
- Karl Friston
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3AR, UK
- James F Glazebrook
  - Department of Mathematics and Computer Science, Eastern Illinois University, Charleston, IL, 61920, USA; Adjunct Faculty, Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Michael Levin
  - Allen Discovery Center at Tufts University, Medford, MA, 02155, USA
4
Yu LQ, Wilson RC, Nassar MR. Adaptive learning is structure learning in time. Neurosci Biobehav Rev 2021; 128:270-281. [PMID: 34144114 PMCID: PMC8422504 DOI: 10.1016/j.neubiorev.2021.06.024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/01/2020] [Revised: 04/19/2021] [Accepted: 06/11/2021] [Indexed: 10/21/2022]
Abstract
People use information flexibly. They often combine multiple sources of relevant information over time in order to inform decisions with little or no interference from intervening irrelevant sources. They adjust the degree to which they use new information over time rationally in accordance with environmental statistics and their own uncertainty. They can even use information gained in one situation to solve a problem in a very different one. Learning flexibly rests on the ability to infer the context at a given time, and therefore knowing which pieces of information to combine and which to separate. We review the psychological and neural mechanisms behind adaptive learning and structure learning to outline how people pool together relevant information, demarcate contexts, prevent interference between information collected in different contexts, and transfer information from one context to another. By examining all of these processes through the lens of optimal inference we bridge concepts from multiple fields to provide a unified multi-system view of how the brain exploits structure in time to optimize learning.
Affiliation(s)
- Linda Q Yu
  - Carney Institute for Brain Sciences, Brown University, 164 Angell Street, Providence, RI, 02912, USA.
- Robert C Wilson
  - Department of Psychology, University of Arizona, Tucson, AZ, 85721, USA
- Matthew R Nassar
  - Carney Institute for Brain Sciences, Brown University, 164 Angell Street, Providence, RI, 02912, USA
5
Abstract
Active inference is a first principle account of how autonomous agents operate in dynamic, nonstationary environments. This problem is also considered in reinforcement learning, but limited work exists on comparing the two approaches on the same discrete-state environments. In this letter, we provide (1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in reinforcement learning, and (2) an explicit discrete-state comparison between active inference and reinforcement learning on an OpenAI Gym baseline. We begin by providing a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of reinforcement learning. We show that by operating in a pure belief-based setting, active inference agents can carry out epistemic exploration (and account for uncertainty about their environment) in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in reinforcement learning is removed in active inference, where reward can simply be treated as another observation we have a preference over; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit by showing two scenarios in which active inference agents can infer behaviors in reward-free environments compared to both Q-learning and Bayesian model-based reinforcement learning agents and by placing zero prior preferences over rewards and learning the prior preferences over the observations corresponding to reward. We conclude by noting that this formalism can be applied to more complex settings (e.g., robotic arm movement, Atari games) if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation and demonstrate these behaviors in an OpenAI Gym environment, alongside reinforcement learning agents.
Affiliation(s)
- Noor Sajid
  - Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, U.K.
- Philip J Ball
  - Machine Learning Research Group, Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, U.K.
- Thomas Parr
  - Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, U.K.
- Karl J Friston
  - Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology, London, WC1N 3AR, U.K.
6
Hämäläinen L, Thorogood R. The signal detection problem of aposematic prey revisited: integrating prior social and personal experience. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190473. [PMID: 32420858 PMCID: PMC7331014 DOI: 10.1098/rstb.2019.0473] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 01/28/2020] [Indexed: 11/12/2022]
Abstract
Ever since Alfred R. Wallace suggested brightly coloured, toxic insects warn predators about their unprofitability, evolutionary biologists have searched for an explanation of how these aposematic prey evolve and are maintained in natural populations. Understanding how predators learn about this widespread prey defence is fundamental to addressing the problem, yet individuals differ in their foraging decisions and the predominant application of associative learning theory largely ignores predators' foraging context. Here we revisit the suggestion made 15 years ago that signal detection theory provides a useful framework to model predator learning by emphasizing the integration of prior information into predation decisions. Using multiple experiments where we modified the availability of social information using video playback, we show that personal information (sampling aposematic prey) improves how predators (great tits, Parus major) discriminate between novel aposematic and cryptic prey. However, this relationship was not linear and beyond a certain point personal encounters with aposematic prey were no longer informative for prey discrimination. Social information about prey unpalatability reduced attacks on aposematic prey across learning trials, but it did not influence the relationship between personal sampling and discrimination. Our results suggest therefore that acquiring social information does not influence the value of personal information, but more experiments are needed to manipulate pay-offs and disentangle whether information sources affect response thresholds or change discrimination. This article is part of the theme issue 'Signal detection theory in recognition systems: from evolving models to experimental tests'.
Affiliation(s)
- Liisa Hämäläinen
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
  - Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, 40014, Finland
  - Department of Biological Sciences, Macquarie University, NSW 2109, Australia
- Rose Thorogood
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
  - HiLIFE Helsinki Institute of Life Sciences, University of Helsinki, Helsinki 00011, Finland
  - Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki 00011, Finland
7
Schulz E, Franklin NT, Gershman SJ. Finding structure in multi-armed bandits. Cogn Psychol 2020; 119:101261. [PMID: 32059133 DOI: 10.1016/j.cogpsych.2019.101261] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 12/29/2018] [Revised: 11/10/2019] [Accepted: 12/02/2019] [Indexed: 12/24/2022]
Abstract
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, which require participants to trade off exploration and exploitation. Standard multi-armed bandits assume that each option has an independent reward distribution. However, learning about options independently is unrealistic, since in the real world options often share an underlying structure. We study a class of structured bandit tasks, which we use to probe how generalization guides exploration. In a structured multi-armed bandit, options have a correlation structure dictated by a latent function. We focus on bandits in which rewards are linear functions of an option's spatial position. Across 5 experiments, we find evidence that participants utilize functional structure to guide their exploration, and also exhibit a learning-to-learn effect across rounds, becoming progressively faster at identifying the latent function. Our experiments rule out several heuristic explanations and show that the same findings obtain with non-linear functions. Comparing several models of learning and decision making, we find that the best model of human behavior in our tasks combines three computational mechanisms: (1) function learning, (2) clustering of reward distributions across rounds, and (3) uncertainty-guided exploration. Our results suggest that human reinforcement learning can utilize latent structure in sophisticated ways to improve efficiency.
8
Li Q, Tu Y, Chen J, Shan J, Yung PSH, Ling SKK, Hua Y. Reverse anterolateral drawer test is more sensitive and accurate for diagnosing chronic anterior talofibular ligament injury. Knee Surg Sports Traumatol Arthrosc 2020; 28:55-62. [PMID: 31559464 DOI: 10.1007/s00167-019-05705-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/24/2018] [Accepted: 09/11/2019] [Indexed: 12/26/2022]
Abstract
PURPOSE: To diagnose chronic anterior talofibular ligament (ATFL) injury, three different physical examinations were compared: the anterior drawer test (ADT), the anterolateral drawer test (ALDT), and the reverse anterolateral drawer test (RALDT).
METHODS: A total of 72 ankles from potential ATFL-injured patients and the normal population were included and examined using the ADT, ALDT, and RALDT by two examiners without knowing the injury histories of any of the participants. Ultrasound examination was then applied as the gold standard to divide the ankles into the ATFL-injured group and the control group. The sensitivity (Se), specificity (Sp), false negative rate (FNR), false positive rate (FPR), accuracy, κ value, and p value of the two examiners' diagnoses were calculated to assess the diagnostic ability of each examination.
RESULTS: There were 38 ankles in the injured group and 34 ankles in the control group. No significant difference was found between the two groups in terms of gender, age, body mass index (BMI), and included ankles. In the ADT and ALDT groups, the specificity reached one, while the sensitivity was relatively low (0.053 and 0.477 for the junior examiner and 0.395 and 0.500 for the senior examiner). In the RALDT, both the sensitivity and specificity were greater than 85% (0.868 and 0.912 for the senior examiner and 0.921 and 0.882 for the junior examiner). The κ value of the RALDT (0.639) was higher than that of the ALDT (0.528) and the ADT (0.196), whereas all the p values were less than 0.05.
CONCLUSION: The ADT and ALDT are valuable physical tests to assess ATFL injuries. Compared with the traditional ADT and ALDT, however, the RALDT is more sensitive and accurate in diagnosing chronic ATFL injuries.
LEVEL OF EVIDENCE: II (diagnostic).
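The rates named in this abstract follow from a 2x2 table of test result against the ultrasound reference standard. The following is a minimal sketch, assuming hypothetical counts: the abstract gives only rates and group sizes (38 injured, 34 control), so the counts used below (33 true positives, 31 true negatives) are inferred from the senior examiner's reported RALDT sensitivity and specificity and are not taken from the paper. The function name `diagnostic_metrics` is likewise an illustrative assumption, and the κ statistic is left out because the abstract does not fully specify how it was computed.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Basic accuracy measures for a binary test against a reference standard.

    tp/fn: reference-positive cases the test called positive/negative.
    tn/fp: reference-negative cases the test called negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1.0 - sensitivity,
        "false_positive_rate": 1.0 - specificity,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Counts inferred (assumption) from 38 injured and 34 control ankles.
    metrics = diagnostic_metrics(tp=33, fn=5, tn=31, fp=3)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```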
Affiliation(s)
- Qianru Li
  - Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, China
- Yingchun Tu
  - Department of Orthopedics, Jinhua Municipal Central Hospital, Jinhua, Zhejiang, China
- Jun Chen
  - Department of Orthopedics, Dongyang People's Hospital, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Jieling Shan
  - Department of Ultrasound, Huashan Hospital, Fudan University, Shanghai, China
- Patrick Shu-Hang Yung
  - Department of Orthopaedics and Traumatology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Samuel Ka-Kin Ling
  - Department of Orthopaedics and Traumatology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
- Yinghui Hua
  - Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, China.
9
Parr T, Friston KJ. Generalised free energy and active inference. Biol Cybern 2019; 113:495-513. [PMID: 31562544 PMCID: PMC6848054 DOI: 10.1007/s00422-019-00805-w] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 07/04/2017] [Accepted: 09/13/2019] [Indexed: 05/30/2023]
Abstract
Active inference is an approach to understanding behaviour that rests upon the idea that the brain uses an internal generative model to predict incoming sensory data. The fit between this model and data may be improved in two ways. The brain could optimise probabilistic beliefs about the variables in the generative model (i.e. perceptual inference). Alternatively, by acting on the world, it could change the sensory data, such that they are more consistent with the model. This implies a common objective function (variational free energy) for action and perception that scores the fit between an internal model and the world. We compare two free energy functionals for active inference in the framework of Markov decision processes. One of these is a functional of beliefs (i.e. probability distributions) about states and policies, but a function of observations, while the second is a functional of beliefs about all three. In the former (expected free energy), prior beliefs about outcomes are not part of the generative model (because they are absorbed into the prior over policies). Conversely, in the second (generalised free energy), priors over outcomes become an explicit component of the generative model. When using the free energy function, which is blind to future observations, we equip the generative model with a prior over policies that ensure preferred (i.e. priors over) outcomes are realised. In other words, if we expect to encounter a particular kind of outcome, this lends plausibility to those policies for which this outcome is a consequence. In addition, this formulation ensures that selected policies minimise uncertainty about future outcomes by minimising the free energy expected in the future. When using the free energy functional (which effectively treats future observations as hidden states), we show that policies are inferred or selected that realise prior preferences by minimising the free energy of future expectations. Interestingly, the form of posterior beliefs about policies (and associated belief updating) turns out to be identical under both formulations, but the quantities used to compute them are not.
Affiliation(s)
- Thomas Parr
  - Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3BG UK
- Karl J. Friston
  - Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3BG UK
10
Stamps JA, Krishnan V. Age-dependent changes in behavioural plasticity: insights from Bayesian models of development. Anim Behav 2017. [DOI: 10.1016/j.anbehav.2017.01.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 10/20/2022]
11
Qian T, Jaeger TF, Aslin RN. Incremental implicit learning of bundles of statistical patterns. Cognition 2016; 157:156-173. [PMID: 27639552 PMCID: PMC5181648 DOI: 10.1016/j.cognition.2016.09.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 11/24/2014] [Revised: 09/02/2016] [Accepted: 09/08/2016] [Indexed: 11/26/2022]
Abstract
Forming an accurate representation of a task environment often takes place incrementally as the information relevant to learning the representation only unfolds over time. This incremental nature of learning poses an important problem: it is usually unclear whether a sequence of stimuli consists of only a single pattern, or multiple patterns that are spliced together. In the former case, the learner can directly use each observed stimulus to continuously revise its representation of the task environment. In the latter case, however, the learner must first parse the sequence of stimuli into different bundles, so as to not conflate the multiple patterns. We created a video-game statistical learning paradigm and investigated (1) whether learners without prior knowledge of the existence of multiple "stimulus bundles" - subsequences of stimuli that define locally coherent statistical patterns - could detect their presence in the input and (2) whether learners are capable of constructing a rich representation that encodes the various statistical patterns associated with bundles. By comparing human learning behavior to the predictions of three computational models, we find evidence that learners can handle both tasks successfully. In addition, we discuss the underlying reasons for why the learning of stimulus bundles occurs even when such behavior may seem irrational.
Affiliation(s)
- Ting Qian
  - Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, United States
- T Florian Jaeger
  - Department of Brain and Cognitive Sciences, University of Rochester, United States; Department of Computer Science, University of Rochester, United States; Department of Linguistics, University of Rochester, United States
- Richard N Aslin
  - Department of Brain and Cognitive Sciences, University of Rochester, United States
12
Li Y, Nakae K, Ishii S, Naoki H. Uncertainty-Dependent Extinction of Fear Memory in an Amygdala-mPFC Neural Circuit Model. PLoS Comput Biol 2016; 12:e1005099. [PMID: 27617747 PMCID: PMC5019407 DOI: 10.1371/journal.pcbi.1005099] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 01/28/2016] [Accepted: 08/11/2016] [Indexed: 11/29/2022]
Abstract
Uncertainty of fear conditioning is crucial for the acquisition and extinction of fear memory. Fear memory acquired through partial pairings of a conditioned stimulus (CS) and an unconditioned stimulus (US) is more resistant to extinction than that acquired through full pairings; this effect is known as the partial reinforcement extinction effect (PREE). Although the PREE has been explained by psychological theories, the neural mechanisms underlying the PREE remain largely unclear. Here, we developed a neural circuit model based on three distinct types of neurons (fear, persistent and extinction neurons) in the amygdala and medial prefrontal cortex (mPFC). In the model, the fear, persistent and extinction neurons encode predictions of net severity, of unconditioned stimulus (US) intensity, and of net safety, respectively. Our simulation successfully reproduces the PREE. We revealed that unpredictability of the US during extinction was represented by the combined responses of the three types of neurons, which are critical for the PREE. In addition, we extended the model to include amygdala subregions and the mPFC to address a recent finding that the ventral mPFC (vmPFC) is required for consolidating extinction memory but not for memory retrieval. Furthermore, model simulations led us to propose a novel procedure to enhance extinction learning through re-conditioning with a stronger US; strengthened fear memory up-regulates the extinction neuron, which, in turn, further inhibits the fear neuron during re-extinction. Thus, our models increased the understanding of the functional roles of the amygdala and vmPFC in the processing of uncertainty in fear conditioning and extinction.
Animals live in environments that contain uncertainty. To adapt to uncertain situations, they flexibly learn to associate environmental cues with rewards and punishments. Understanding how the brain processes uncertainty has remained an important issue in neuroscience. To address this question, we focused on neural processing in the amygdala and mPFC during fear conditioning and extinction. We developed a neural circuit model that incorporates distinct neural populations in the amygdala and mPFC. Our model first successfully reproduced uncertainty-dependent resistance to the extinction of fear memory. An extension of the model provided a possible explanation for observations made during optogenetic manipulation of the ventral mPFC. Finally, we proposed a procedure to accelerate the efficacy of subsequent extinction based on our model.
Affiliation(s)
- Yuzhe Li
  - Graduate School of Biostudies, Kyoto University, Kyoto, Japan
- Ken Nakae
  - Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Shin Ishii
  - Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Honda Naoki
  - Imaging Platform of Spatio-temporal Information, Graduate School of Medicine, Kyoto University, Kyoto, Japan
13
14
Stamps JA, Frankenhuis WE. Bayesian Models of Development. Trends Ecol Evol 2016; 31:260-268. [PMID: 26896042 DOI: 10.1016/j.tree.2016.01.012] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 12/10/2015] [Revised: 01/19/2016] [Accepted: 01/20/2016] [Indexed: 10/22/2022]
Abstract
Until recently, biology lacked a framework for studying how information from genes, parental effects, and different personal experiences is combined across the lifetime to affect phenotypic development. Over the past few years, researchers have begun to build such a framework, using models that incorporate Bayesian updating to study the evolution of developmental plasticity and developmental trajectories. Here, we describe the merits of a Bayesian approach to development, review the main findings and implications of the current set of models, and describe predictions that can be tested using protocols already used by empiricists. We suggest that a Bayesian perspective affords a simple and tractable way to conceptualize, explain, and predict how information combines across the lifetime to affect development.
Affiliation(s)
- Judy A Stamps
  - Section of Evolution and Ecology, Division of Biological Sciences, University of California at Davis, Davis, CA 95616, USA.
- Willem E Frankenhuis
  - Behavioural Science Institute, Radboud University, Nijmegen, Montessorilaan 3, PO Box 9104, 6500 HE Nijmegen, The Netherlands
15
Iigaya K. Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system. eLife 2016; 5:e18073. [PMID: 27504806 PMCID: PMC5008908 DOI: 10.7554/elife.18073] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 05/23/2016] [Accepted: 08/08/2016] [Indexed: 01/27/2023]
Abstract
Recent experiments have shown that animals and humans have a remarkable ability to adapt their learning rate according to the volatility of the environment. Yet the neural mechanism responsible for such adaptive learning has remained unclear. To fill this gap, we investigated a biophysically inspired, metaplastic synaptic model within the context of a well-studied decision-making network, in which synapses can change their rate of plasticity in addition to their efficacy according to a reward-based learning rule. We found that our model, which assumes that synaptic plasticity is guided by a novel surprise detection system, captures a wide range of key experimental findings and performs as well as a Bayes optimal model, with remarkably little parameter tuning. Our results further demonstrate the computational power of synaptic plasticity, and provide insights into the circuit-level computation which underlies adaptive decision-making.
Affiliation(s)
- Kiyohito Iigaya
  - Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom; Center for Theoretical Neuroscience, College of Physicians and Surgeons, Columbia University, New York, United States; Department of Physics, Columbia University, New York, United States
16
17
|
Stamps JA. Individual differences in behavioural plasticities. Biol Rev Camb Philos Soc 2015; 91:534-67. [PMID: 25865135 DOI: 10.1111/brv.12186] [Citation(s) in RCA: 154] [Impact Index Per Article: 17.1] [Received: 09/15/2014] [Revised: 03/14/2015] [Accepted: 03/18/2015] [Indexed: 01/06/2023]
Abstract
Interest in individual differences in animal behavioural plasticities has surged in recent years, but research in this area has been hampered by semantic confusion as different investigators use the same terms (e.g. plasticity, flexibility, responsiveness) to refer to different phenomena. The first goal of this review is to suggest a framework for categorizing the many different types of behavioural plasticities, describe examples of each, and indicate why using reversibility as a criterion for categorizing behavioural plasticities is problematic. This framework is then used to address a number of timely questions about individual differences in behavioural plasticities. One set of questions concerns the experimental designs that can be used to study individual differences in various types of behavioural plasticities. Although within-individual designs are the default option for empirical studies of many types of behavioural plasticities, in some situations (e.g. when experience at an early age affects the behaviour expressed at subsequent ages), 'replicate individual' designs can provide useful insights into individual differences in behavioural plasticities. To date, researchers using within-individual and replicate individual designs have documented individual differences in all of the major categories of behavioural plasticities described herein. Another important question is whether and how different types of behavioural plasticities are related to one another. Currently there is empirical evidence that many behavioural plasticities [e.g. contextual plasticity, learning rates, IIV (intra-individual variability), endogenous plasticities, ontogenetic plasticities] can themselves vary as a function of experiences earlier in life, that is, many types of behavioural plasticity are themselves developmentally plastic. These findings support the assumption that differences among individuals in prior experiences may contribute to individual differences in behavioural plasticities observed at a given age. Several authors have predicted correlations across individuals between different types of behavioural plasticities, i.e. that some individuals will be generally more plastic than others. However, empirical support for most of these predictions, including indirect evidence from studies of relationships between personality traits and plasticities, is currently sparse and equivocal. The final section of this review suggests how an appreciation of the similarities and differences between different types of behavioural plasticities may help theoreticians formulate testable models to explain the evolution of individual differences in behavioural plasticities and the evolutionary and ecological consequences of individual differences in behavioural plasticities.
Affiliation(s)
- Judy A Stamps
- Department of Ecology and Evolution, University of California Davis, Davis, CA 95616, U.S.A
| |
|
18
|
Stamps JA, Krishnan VV. Individual differences in the potential and realized developmental plasticity of personality traits. Front Ecol Evol 2014. [DOI: 10.3389/fevo.2014.00069] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/22/2023] Open
|
19
|
Learning bundles of stimuli renders stimulus order as a cue, not a confound. Proc Natl Acad Sci U S A 2014; 111:14400-5. [PMID: 25246587 DOI: 10.1073/pnas.1416109111] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Abstract
The order in which stimuli are presented in an experiment has long been recognized to influence behavior. Previous accounts have often attributed the effect of stimulus order to the mechanisms with which people process information. We propose that stimulus order influences cognition because it is an important cue for learning the underlying structure of a task environment. In particular, stimulus order can be used to infer a "stimulus bundle"--a sequence of consecutive stimuli that share the same underlying latent cluster. We describe a clustering model that successfully explains the perception of streak shooting in basketball games, along with two other cognitive phenomena, as the outcome of finding the statistically optimal bundle representation. We argue that the perspective of viewing stimulus order as a cue may hold the key to explaining behaviors that seemingly deviate from normative theories of cognition and that in task domains where the assumption of stimulus bundles is intuitively appropriate, it can improve the explanatory power of existing models.
|