101
Lu Q, Hasson U, Norman KA. A neural network model of when to retrieve and encode episodic memories. eLife 2022; 11:e74445. PMID: 35142289; PMCID: PMC9000961; DOI: 10.7554/elife.74445.
Abstract
Recent human behavioral and neuroimaging results suggest that people are selective in when they encode and retrieve episodic memories. To explain these findings, we trained a memory-augmented neural network to use its episodic memory to support prediction of upcoming states in an environment where past situations sometimes reoccur. We found that the network learned to retrieve selectively as a function of several factors, including its uncertainty about the upcoming state. Additionally, we found that selectively encoding episodic memories at the end of an event (but not mid-event) led to better subsequent prediction performance. In all of these cases, the benefits of selective retrieval and encoding can be explained in terms of reducing the risk of retrieving irrelevant memories. Overall, these modeling results provide a resource-rational account of why episodic retrieval and encoding should be selective and lead to several testable predictions.
Affiliation(s)
- Qihong Lu
- Department of Psychology, Princeton University, Princeton, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Uri Hasson
- Department of Psychology, Princeton University, Princeton, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Kenneth A Norman
- Department of Psychology, Princeton University, Princeton, United States
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
102
Grossman CD, Bari BA, Cohen JY. Serotonin neurons modulate learning rate through uncertainty. Curr Biol 2022; 32:586-599.e7. PMID: 34936883; PMCID: PMC8825708; DOI: 10.1016/j.cub.2021.12.006.
Abstract
Regulating how fast to learn is critical for flexible behavior. Learning about the consequences of actions should be slow in stable environments, but accelerate when that environment changes. Recognizing stability and detecting change are difficult in environments with noisy relationships between actions and outcomes. Under these conditions, theories propose that uncertainty can be used to modulate learning rates ("meta-learning"). We show that mice behaving in a dynamic foraging task exhibit choice behavior that varies as a function of two forms of uncertainty estimated from a meta-learning model. The activity of dorsal raphe serotonin neurons tracked both types of uncertainty in the foraging task as well as in a dynamic Pavlovian task. Reversible inhibition of serotonin neurons in the foraging task reproduced changes in learning predicted by a simulated lesion of meta-learning in the model. We thus provide a quantitative link between serotonin neuron activity, learning, and decision making.
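The core computational idea in the abstract above, an uncertainty-modulated learning rate, can be sketched in a few lines. This is an illustrative delta-rule learner, not the authors' meta-learning model; the gain rule, prior, and decay constant below are assumptions for the demo.

```python
def meta_learning_demo(outcomes, base_lr=0.1, decay=0.95):
    """Delta-rule learner whose learning rate is boosted when surprise
    (absolute prediction error) exceeds a running estimate of the typical
    noise level: a crude 'unexpected uncertainty' signal."""
    value = 0.5           # current reward estimate
    expected_unc = 0.25   # running estimate of typical |error| (assumed prior)
    history = []
    for r in outcomes:
        error = r - value
        surprise = abs(error)
        # learning rate grows only when surprise exceeds expected noise
        lr = base_lr * (1.0 + max(0.0, surprise - expected_unc))
        value += lr * error
        expected_unc = decay * expected_unc + (1 - decay) * surprise
        history.append((value, lr))
    return history

# Stable rewards followed by a sudden reversal: the effective learning
# rate should jump at the change point, then settle again.
hist = meta_learning_demo([1.0] * 20 + [0.0] * 20)
```

Tracking the second element of each history entry shows the learning rate spiking right after the reversal, which is the qualitative signature the meta-learning account predicts.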
Affiliation(s)
- Cooper D Grossman
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Bilal A Bari
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
- Jeremiah Y Cohen
- The Solomon H. Snyder Department of Neuroscience, Brain Science Institute, Kavli Neuroscience Discovery Institute, The Johns Hopkins University School of Medicine, 725 N. Wolfe Street, Baltimore, MD 21205, USA
103
Hattori R, Komiyama T. Context-dependent persistency as a coding mechanism for robust and widely distributed value coding. Neuron 2022; 110:502-515.e11. PMID: 34818514; PMCID: PMC8813889; DOI: 10.1016/j.neuron.2021.11.001.
Abstract
Task-related information is widely distributed across the brain with different coding properties, such as persistency. We found in mice that coding persistency of action history and value was variable across areas, learning phases, and task context, with the highest persistency in the retrosplenial cortex of expert mice performing value-based decisions where history needs to be maintained across trials. Persistent coding also emerged in artificial networks trained to perform mouse-like reinforcement learning. Persistency allows temporally untangled value representations in neuronal manifolds where population activity exhibits cyclic trajectories that transition along the value axis after action outcomes, collectively forming cylindrical dynamics. Simulations indicated that untangled persistency facilitates robust value retrieval by downstream networks. Even leakage of persistently maintained value through non-specific connectivity could contribute to brain-wide distributed value coding with different levels of persistency. These results reveal that context-dependent, untangled persistency facilitates reliable signal coding and its distribution across the brain.
Affiliation(s)
- Ryoma Hattori
- Neurobiology Section, Center for Neural Circuits and Behavior, Department of Neurosciences, and Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA 92093, USA
- Takaki Komiyama
- Neurobiology Section, Center for Neural Circuits and Behavior, Department of Neurosciences, and Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA 92093, USA
104
Kumar MG, Tan C, Libedinsky C, Yen SC, Tan AYY. A Nonlinear Hidden Layer Enables Actor-Critic Agents to Learn Multiple Paired Association Navigation. Cereb Cortex 2022; 32:3917-3936. PMID: 35034127; DOI: 10.1093/cercor/bhab456.
Abstract
Navigation to multiple cued reward locations has been increasingly used to study rodent learning. Though deep reinforcement learning agents can learn the task, they are not biologically plausible. Biologically plausible classic actor-critic agents can learn to navigate to single reward locations, but it has remained unclear which biologically plausible agents can learn multiple cue-reward location tasks. In this computational study, we show that versions of classic agents learn to navigate to a single reward location and adapt to reward location displacement, but are unable to learn multiple paired association navigation. The limitation is overcome by an agent in which place-cell and cue information are first processed by a feedforward nonlinear hidden layer whose synapses to the actor and critic are subject to temporal difference error-modulated plasticity. Faster learning is obtained when the feedforward layer is replaced by a recurrent reservoir network.
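The agent architecture described above, a fixed nonlinear hidden expansion whose actor and critic synapses are updated by a temporal difference (TD) error, can be sketched on a toy paired-association bandit. The task, layer sizes, and learning rate below are assumptions for illustration; the paper's agent navigates a 2D arena with place-cell inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired-association bandit (illustrative): two cues, each paired
# with a different correct action; reward 1 for the right pairing.
n_in, n_hid, n_act = 2, 32, 2
W_in = rng.normal(0.0, 1.0, (n_hid, n_in))  # fixed random nonlinear expansion
W_actor = np.zeros((n_act, n_hid))          # plastic actor synapses
w_critic = np.zeros(n_hid)                  # plastic critic synapses
lr, n_trials = 0.02, 3000

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

correct = 0.0
for t in range(n_trials):
    cue = rng.integers(n_in)
    x = np.eye(n_in)[cue]
    h = np.tanh(W_in @ x)                   # nonlinear hidden layer
    p = softmax(W_actor @ h)
    a = rng.choice(n_act, p=p)
    r = 1.0 if a == cue else 0.0
    delta = r - w_critic @ h                # TD error for a one-step episode
    grad = -p.copy()
    grad[a] += 1.0                          # d log p(a) / d logits
    # TD-error-modulated plasticity on hidden-to-actor/critic synapses
    W_actor += lr * delta * np.outer(grad, h)
    w_critic += lr * delta * h
    if t >= n_trials - 300:
        correct += r

accuracy = correct / 300
```

The key design point mirrors the abstract: only the readout synapses from the nonlinear layer are plastic, and every weight change is gated by the same scalar TD error.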
Affiliation(s)
- M Ganesh Kumar
- Integrative Sciences and Engineering Programme, NUS Graduate School, National University of Singapore, Singapore 119077, Singapore
- The N.1 Institute for Health, National University of Singapore, Singapore 117456, Singapore
- Innovation and Design Programme, Faculty of Engineering, National University of Singapore, Singapore 117579, Singapore
- Cheston Tan
- Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore 138632, Singapore
- Camilo Libedinsky
- Integrative Sciences and Engineering Programme, NUS Graduate School, National University of Singapore, Singapore 119077, Singapore
- The N.1 Institute for Health, National University of Singapore, Singapore 117456, Singapore
- Department of Psychology, National University of Singapore, Singapore 117570, Singapore
- Institute of Molecular and Cell Biology, Agency for Science, Technology and Research, Singapore 138673, Singapore
- Shih-Cheng Yen
- Integrative Sciences and Engineering Programme, NUS Graduate School, National University of Singapore, Singapore 119077, Singapore
- The N.1 Institute for Health, National University of Singapore, Singapore 117456, Singapore
- Innovation and Design Programme, Faculty of Engineering, National University of Singapore, Singapore 117579, Singapore
- Andrew Y Y Tan
- Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117593, Singapore
- Healthy Longevity Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
- Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
- Neurobiology Programme, Life Sciences Institute, National University of Singapore, Singapore 119077, Singapore
105
Abstract
Recent breakthroughs in artificial intelligence (AI) have enabled machines to plan in tasks previously thought to be uniquely human. Meanwhile, the planning algorithms implemented by the brain itself remain largely unknown. Here, we review neural and behavioral data in sequential decision-making tasks that elucidate the ways in which the brain does, and does not, plan. To systematically review available biological data, we create a taxonomy of planning algorithms by summarizing the relevant design choices for such algorithms in AI. Across species, recording techniques, and task paradigms, we find converging evidence that the brain represents future states consistent with a class of planning algorithms within our taxonomy: focused, depth-limited, and serial. However, we argue that current data are insufficient for addressing more detailed algorithmic questions. We propose a new approach leveraging AI advances to drive experiments that can adjudicate between competing candidate algorithms.
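The class of algorithms identified above, focused, depth-limited, and serial, can be illustrated with a minimal search routine. Everything here (integer states, the `next_states` and `value` callables) is a hypothetical stand-in for a task definition, not taken from the review.

```python
def plan(state, depth, next_states, value):
    """Serial, depth-limited tree search: expand one future at a time,
    stop at a fixed horizon, and back up the best achievable value.
    `next_states` and `value` are hypothetical task-defining callables."""
    if depth == 0:
        return value(state)
    successors = next_states(state)
    if not successors:
        return value(state)
    return max(plan(s, depth - 1, next_states, value) for s in successors)

# Tiny deterministic example: integer states, two actions (+1, +2),
# and terminal value equal to the state reached at the horizon.
best = plan(0, 3, lambda s: [s + 1, s + 2], lambda s: s)   # best == 6
```

A "focused" planner would additionally expand only a subset of successors at each step; the fixed `depth` argument is what makes the search depth-limited, and the one-branch-at-a-time recursion is what makes it serial.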
106
Dombrovski AY, Hallquist MN. Search for solutions, learning, simulation, and choice processes in suicidal behavior. Wiley Interdiscip Rev Cogn Sci 2022; 13:e1561. PMID: 34008338; PMCID: PMC9285563; DOI: 10.1002/wcs.1561.
Abstract
Suicide may be viewed as an unfortunate outcome of failures in decision processes. Such failures occur when the demands of a crisis exceed a person's capacity to (i) search for options, (ii) learn and simulate possible futures, and (iii) make advantageous value-based choices. Can individual-level decision deficits and biases drive the progression of the suicidal crisis? Our overview of the evidence on this question is informed by clinical theory and grounded in reinforcement learning and behavioral economics. Cohort and case-control studies provide strong evidence that limited cognitive capacity and particularly impaired cognitive control are associated with suicidal behavior, imposing cognitive constraints on decision-making. We conceptualize suicidal ideation as an element of impoverished consideration sets resulting from a search for solutions under cognitive constraints and mood-congruent Pavlovian influences, a view supported by mostly indirect evidence. More compelling is the evidence of impaired learning in people with a history of suicidal behavior. We speculate that an inability to simulate alternative futures using one's model of the world may undermine alternative solutions in a suicidal crisis. The hypothesis supported by the strongest evidence is that the selection of suicide over alternatives is facilitated by a choice process undermined by randomness. Case-control studies using gambling tasks, armed bandits, and delay discounting support this claim. Future experimental studies will need to uncover real-time dynamics of choice processes in suicidal people. In summary, the decision process framework sheds light on neurocognitive mechanisms that facilitate the progression of the suicidal crisis. This article is categorized under: Economics > Individual Decision-Making; Psychology > Emotion and Motivation; Psychology > Learning; Neuroscience > Behavior.
Affiliation(s)
- Michael N. Hallquist
- Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill, North Carolina, USA
107
Kenwood MM, Kalin NH, Barbas H. The prefrontal cortex, pathological anxiety, and anxiety disorders. Neuropsychopharmacology 2022; 47:260-275. PMID: 34400783; PMCID: PMC8617307; DOI: 10.1038/s41386-021-01109-z.
Abstract
Anxiety is experienced in response to threats that are distal or uncertain, involving changes in one's subjective state, autonomic responses, and behavior. Defensive and physiologic responses to threats that involve the amygdala and brainstem are conserved across species. While anxiety responses typically serve an adaptive purpose, when excessive, unregulated, and generalized, they can become maladaptive, leading to distress and avoidance of potentially threatening situations. In primates, anxiety can be regulated by the prefrontal cortex (PFC), which has expanded in evolution. This prefrontal expansion is thought to underlie primates' increased capacity to engage high-level regulatory strategies aimed at coping with and modifying the experience of anxiety. The specialized primate lateral, medial, and orbital PFC sectors are connected with association and limbic cortices, the latter of which are connected with the amygdala and brainstem autonomic structures that underlie emotional and physiological arousal. PFC pathways that interface with distinct inhibitory systems within the cortex, the amygdala, or the thalamus can regulate responses by modulating neuronal output. Within the PFC, pathways connecting cortical regions are poised to reduce noise and enhance signals for cognitive operations that regulate anxiety processing and autonomic drive. Specialized PFC pathways to the inhibitory thalamic reticular nucleus suggest a mechanism to allow passage of relevant signals from thalamus to cortex, and in the amygdala to modulate the output to autonomic structures. Disruption of specific nodes within the PFC that interface with inhibitory systems can affect the negative bias, failure to regulate autonomic arousal, and avoidance that characterize anxiety disorders.
Affiliation(s)
- Margaux M Kenwood
- Department of Psychiatry, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Neuroscience Training Program at University of Wisconsin-Madison, Madison, USA
- Ned H Kalin
- Department of Psychiatry, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Neuroscience Training Program at University of Wisconsin-Madison, Madison, USA
- Wisconsin National Primate Center, Madison, WI, USA
- Helen Barbas
- Neural Systems Laboratory, Department of Health Sciences, Boston University, Boston, MA, USA
- Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, MA, USA
108
Collins AGE, Shenhav A. Advances in modeling learning and decision-making in neuroscience. Neuropsychopharmacology 2022; 47:104-118. PMID: 34453117; PMCID: PMC8617262; DOI: 10.1038/s41386-021-01126-y.
Abstract
An organism's survival depends on its ability to learn about its environment and to make adaptive decisions in the service of achieving the best possible outcomes in that environment. To study the neural circuits that support these functions, researchers have increasingly relied on models that formalize the computations required to carry them out. Here, we review the recent history of computational modeling of learning and decision-making, and how these models have been used to advance understanding of prefrontal cortex function. We discuss how such models have advanced from their origins in basic algorithms of updating and action selection to increasingly account for complexities in the cognitive processes required for learning and decision-making, and the representations over which they operate. We further discuss how a deeper understanding of the real-world complexities in these computations has shed light on the fundamental constraints on optimal behavior, and on the complex interactions between corticostriatal pathways to determine such behavior. The continuing and rapid development of these models holds great promise for understanding the mechanisms by which animals adapt to their environments, and what leads to maladaptive forms of learning and decision-making within clinical populations.
Affiliation(s)
- Anne G E Collins
- Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA
- Amitai Shenhav
- Department of Cognitive, Linguistic, & Psychological Sciences and Carney Institute for Brain Science, Brown University, Providence, RI, USA
109
Zhang Y, Pan X, Wang Y. Category learning in a recurrent neural network with reinforcement learning. Front Psychiatry 2022; 13:1008011. PMID: 36387007; PMCID: PMC9640766; DOI: 10.3389/fpsyt.2022.1008011.
Abstract
Humans and animals can learn and use category information quickly and efficiently to adapt to changing environments, and several brain areas are involved in learning and encoding category information. However, it remains unclear how the brain learns and forms categorical representations at the level of neural circuits. To investigate this issue at the network level, we combined a recurrent neural network with reinforcement learning to construct a deep reinforcement learning model that demonstrates how categories are learned and represented in the network. The model consists of a policy network and a value network. The policy network is responsible for updating the policy to choose actions, while the value network is responsible for evaluating actions to predict rewards. The agent learns dynamically through the information interaction between the policy network and the value network. The model was trained to learn six stimulus-stimulus associative chains in a sequential paired-association task that had been learned by a monkey. The simulation results demonstrate that the model learned the stimulus-stimulus associative chains and reproduced behavior similar to that of the monkey performing the same task. Two types of neurons were found in the model: one type primarily encoded identity information about individual stimuli; the other mainly encoded category information about the associated stimuli in one chain. The two types of activity patterns were also observed in the primate prefrontal cortex after the monkey learned the same task. Furthermore, the ability of these two types of neurons to encode stimulus or category information was enhanced while the model was learning the task. Our results suggest that neurons in a recurrent neural network can form categorical representations through deep reinforcement learning while learning stimulus-stimulus associations, which may provide a new approach for understanding the neuronal mechanisms by which the prefrontal cortex learns and encodes category information.
Affiliation(s)
- Ying Zhang
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai, China
- Xiaochuan Pan
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai, China
- Yihong Wang
- Institute for Cognitive Neurodynamics, East China University of Science and Technology, Shanghai, China
Mei J, Muller E, Ramaswamy S. Informing deep neural networks by multiscale principles of neuromodulatory systems. Trends Neurosci 2022; 45:237-250. DOI: 10.1016/j.tins.2021.12.008.
111
Pitti A, Quoy M, Lavandier C, Boucenna S, Swaileh W, Weidmann C. In Search of a Neural Model for Serial Order: a Brain Theory for Memory Development and Higher-Level Cognition. IEEE Trans Cogn Dev Syst 2022. DOI: 10.1109/tcds.2022.3168046.
112
Yoo AH, Collins AGE. How Working Memory and Reinforcement Learning Are Intertwined: A Cognitive, Neural, and Computational Perspective. J Cogn Neurosci 2021; 34:551-568. PMID: 34942642; DOI: 10.1162/jocn_a_01808.
Abstract
Reinforcement learning and working memory are two core processes of human cognition and are often considered cognitively, neuroscientifically, and algorithmically distinct. Here, we show that the brain networks that support them actually overlap significantly and that they are less distinct cognitive processes than often assumed. We review literature demonstrating the benefits of considering each process to explain properties of the other and highlight recent work investigating their more complex interactions. We discuss how future research in both computational and cognitive sciences can benefit from one another, suggesting that a key missing piece for artificial agents to learn to behave with more human-like efficiency is taking working memory's role in learning seriously. This review highlights the risks of neglecting the interplay between different processes when studying human behavior (in particular when considering individual differences). We emphasize the importance of investigating these dynamics to build a comprehensive understanding of human cognition.
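One way to make the RL-WM interplay concrete is a mixture agent in the spirit of the RLWM family of models: a fast, one-shot, decaying working-memory store combined with a slow incremental learner. All parameters and the stimulus-action contingency below are assumptions for the demo, not a fitted model from the review.

```python
import random

random.seed(1)

# Mixture of a fast, decaying working-memory store and slow RL
# (parameters and the stimulus-action map are assumptions for the demo).
stimuli, actions = range(3), range(3)
q = {(s, a): 1 / 3 for s in stimuli for a in actions}    # slow RL values
wm = {(s, a): 1 / 3 for s in stimuli for a in actions}   # fast WM store
alpha, decay, mix = 0.1, 0.8, 0.7                        # mix = WM weight

def act(s):
    vals = [mix * wm[(s, a)] + (1 - mix) * q[(s, a)] for a in actions]
    best_val = max(vals)
    return random.choice([a for a in actions if vals[a] == best_val])

correct_map = {0: 2, 1: 0, 2: 1}   # hypothetical contingency to be learned
hits = 0.0
for t in range(300):
    s = random.choice(list(stimuli))
    a = act(s)
    r = 1.0 if a == correct_map[s] else 0.0
    q[(s, a)] += alpha * (r - q[(s, a)])             # incremental RL update
    for key in wm:                                   # WM decays toward baseline
        wm[key] += (1 - decay) * (1 / 3 - wm[key])
    wm[(s, a)] = r                                   # one-shot WM encoding
    if t >= 200:
        hits += r
```

Early performance is carried almost entirely by the one-shot WM store, while the slowly accumulating `q` values take over as WM traces decay; ignoring this split is exactly the modeling risk the abstract warns about.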
113
Farashahi S, Soltani A. Computational mechanisms of distributed value representations and mixed learning strategies. Nat Commun 2021; 12:7191. PMID: 34893597; PMCID: PMC8664930; DOI: 10.1038/s41467-021-27413-2.
Abstract
Learning appropriate representations of the reward environment is challenging in the real world, where there are many options, each with multiple attributes or features. Despite the existence of alternative solutions for this challenge, the neural mechanisms underlying the emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and train recurrent neural networks (RNNs) to capture our experimental observations. We find that human participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature, followed by those of informative conjunctions. Through analyzing representations, connectivity, and lesioning of the RNNs, we demonstrate that this mixed learning strategy relies on a distributed neural code and on opponency between excitatory and inhibitory neurons through value-dependent disinhibition. Together, our results suggest computational and neural mechanisms underlying the emergence of complex learning strategies in naturalistic settings.
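The mixed strategy described above, learning reward probabilities for an informative feature alongside conjunction-level estimates, can be sketched with two parallel delta-rule learners. The reward probabilities and learning rate are assumed values for illustration; the actual task has more dimensions.

```python
import random

random.seed(0)

# Two parallel delta-rule learners: a feature-level estimate that pools
# across stimuli sharing the informative feature (fast, approximate) and
# a conjunction-level estimate per stimulus (slower, exact).
# Reward probabilities below are assumed values for the demo.
p_reward = {('A', 'x'): 0.9, ('A', 'y'): 0.7, ('B', 'x'): 0.3, ('B', 'y'): 0.1}
alpha = 0.05
feat = {'A': 0.5, 'B': 0.5}              # informative-feature estimates
conj = {k: 0.5 for k in p_reward}        # conjunction estimates

for t in range(5000):
    stim = random.choice(list(p_reward))
    r = 1.0 if random.random() < p_reward[stim] else 0.0
    feat[stim[0]] += alpha * (r - feat[stim[0]])   # pooled feature update
    conj[stim] += alpha * (r - conj[stim])         # per-stimulus update

# feat['A'] approaches the feature-average reward probability (0.8),
# while conj[('A', 'x')] approaches the exact per-stimulus value (0.9).
```

Because the feature learner receives twice as many updates per estimate, it is faster but biased toward the feature average; combining the two, feature first, conjunctions later, reproduces the qualitative strategy the abstract describes.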
Affiliation(s)
- Shiva Farashahi
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Center for Computational Neuroscience, Flatiron Institute, Simons Foundation, New York, NY, USA
- Alireza Soltani
- Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
114
Namboodiri VMK, Hobbs T, Trujillo-Pisanty I, Simon RC, Gray MM, Stuber GD. Relative salience signaling within a thalamo-orbitofrontal circuit governs learning rate. Curr Biol 2021; 31:5176-5191.e5. PMID: 34637750; PMCID: PMC8849135; DOI: 10.1016/j.cub.2021.09.037.
Abstract
Learning to predict rewards is essential for the sustained fitness of animals. Contemporary views suggest that such learning is driven by a reward prediction error (RPE), the difference between received and predicted rewards. The magnitude of learning induced by an RPE is proportional to the product of the RPE and a learning rate. Here we demonstrate using two-photon calcium imaging and optogenetics in mice that certain functionally distinct subpopulations of ventral/medial orbitofrontal cortex (vmOFC) neurons signal learning rate control. Consistent with learning rate control, trial-by-trial fluctuations in vmOFC activity correlate positively with behavioral updating when the RPE is positive, and negatively when the RPE is negative. Learning rate is affected by many variables, including the salience of a reward. We found that the average reward response of these neurons signals the relative salience of a reward, because it decreases after reward prediction learning or the introduction of another highly salient aversive stimulus. The relative salience signaling in vmOFC is sculpted by medial thalamic inputs. These results support emerging theoretical views that the prefrontal cortex encodes and controls learning parameters.
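The learning rule referenced above, value change proportional to the product of an RPE and a learning rate, is the standard delta rule; a salience-dependent learning rate then simply rescales each update. A minimal sketch (the alpha values are arbitrary):

```python
def learn(outcomes, alpha):
    """Delta-rule (Rescorla-Wagner) learning: value change = alpha * RPE."""
    v = 0.0
    for r in outcomes:
        v += alpha * (r - v)   # RPE = received minus predicted reward
    return v

# Hypothetical salience-dependent learning rates: a more salient reward
# drives a larger effective alpha, so the value estimate converges faster.
rewards = [1.0] * 10
v_high_salience = learn(rewards, alpha=0.3)    # fast learning
v_low_salience = learn(rewards, alpha=0.05)    # slow learning
```

For a constant reward of 1, the estimate after n trials is 1 - (1 - alpha)^n, so modulating alpha (the quantity the vmOFC signal is proposed to control) directly sets how quickly predictions catch up with outcomes.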
Affiliation(s)
- Vijay Mohan K Namboodiri
- The Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, University of Washington, Seattle, WA 98195, USA
- Taylor Hobbs
- The Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, University of Washington, Seattle, WA 98195, USA
- Ivan Trujillo-Pisanty
- The Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, University of Washington, Seattle, WA 98195, USA
- Rhiana C Simon
- Graduate Program in Neuroscience, University of Washington, Seattle, WA 98195, USA
- Madelyn M Gray
- Graduate Program in Neuroscience, University of Washington, Seattle, WA 98195, USA
- Garret D Stuber
- The Center for the Neurobiology of Addiction, Pain, and Emotion, Department of Anesthesiology and Pain Medicine, Department of Pharmacology, University of Washington, Seattle, WA 98195, USA
- Graduate Program in Neuroscience, University of Washington, Seattle, WA 98195, USA
115
Foucault C, Meyniel F. Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments. eLife 2021; 10:e71801. PMID: 34854377; PMCID: PMC8735865; DOI: 10.7554/elife.71801.
Abstract
From decision making to perception to language, predicting what is coming next is crucial. It is also challenging in stochastic, changing, and structured environments; yet the brain makes accurate predictions in many situations. What computational architecture could enable this feat? Bayesian inference makes optimal predictions but is prohibitively difficult to compute. Here, we show that a specific recurrent neural network architecture enables simple and accurate solutions in several environments. This architecture relies on three mechanisms: gating, lateral connections, and recurrent weight training. Like the optimal solution and the human brain, such networks develop internal representations of their changing environment (including estimates of the environment’s latent variables and the precision of these estimates), leverage multiple levels of latent structure, and adapt their effective learning rate to changes without changing their connection weights. Being ubiquitous in the brain, gated recurrence could therefore serve as a generic building block to predict in real-life environments.
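The gating mechanism highlighted above can be illustrated with a single fixed-weight unit: the gate acts as an effective learning rate, so a surprise-driven gate adapts to change points without any weight updates. The gate's specific form below is an assumption for illustration, not the trained network's.

```python
def gated_prediction(seq, gain=4.0):
    """One gated unit with fixed weights: the estimate moves toward each
    observation by a gate g, so g acts as an effective learning rate.
    The surprise-driven gate below is an assumed form for illustration."""
    est, gates = 0.5, []
    for x in seq:
        surprise = abs(x - est)            # crude stand-in for inferred change
        g = min(1.0, 0.05 + gain * surprise ** 2)
        est = (1.0 - g) * est + g * x      # gated (leaky) update
        gates.append(g)
    return est, gates

# A sequence with a change point: the gate spikes right after the
# switch (fast relearning), then relaxes to its small baseline.
est, gates = gated_prediction([1.0] * 30 + [0.0] * 30)
```

This captures the abstract's point that gated recurrence lets the effective learning rate change on the fly, through activity in the gate, while all connection weights stay fixed.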
Affiliation(s)
- Cédric Foucault
- INSERM, CEA, Université Paris-Saclay, Gif sur Yvette, France
116
Thompson JAF. Forms of explanation and understanding for neuroscience and artificial intelligence. J Neurophysiol 2021; 126:1860-1874. PMID: 34644128; DOI: 10.1152/jn.00195.2021.
Abstract
Much of the controversy evoked by the use of deep neural networks as models of biological neural systems amounts to debates over what constitutes scientific progress in neuroscience. To discuss what constitutes scientific progress, one must have a goal in mind (progress toward what?). One such long-term goal is to produce scientific explanations of intelligent capacities (e.g., object recognition, relational reasoning). I argue that the most pressing philosophical questions at the intersection of neuroscience and artificial intelligence are ultimately concerned with defining the phenomena to be explained and with what constitute valid explanations of such phenomena. I propose that a foundation in the philosophy of scientific explanation and understanding can scaffold future discussions about how an integrated science of intelligence might progress. Toward this vision, I review relevant theories of scientific explanation and discuss strategies for unifying the scientific goals of neuroscience and AI.
Collapse
Affiliation(s)
- Jessica A F Thompson
- Human Information Processing Lab, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
117
Hennig JA, Oby ER, Losey DM, Batista AP, Yu BM, Chase SM. How learning unfolds in the brain: toward an optimization view. Neuron 2021; 109:3720-3735. PMID: 34648749. PMCID: PMC8639641. DOI: 10.1016/j.neuron.2021.09.005.
Abstract
How do changes in the brain lead to learning? To answer this question, consider an artificial neural network (ANN), where learning proceeds by optimizing a given objective or cost function. This "optimization framework" may provide new insights into how the brain learns, as many idiosyncratic features of neural activity can be recapitulated by an ANN trained to perform the same task. Nevertheless, there are key features of how neural population activity changes throughout learning that cannot be readily explained in terms of optimization and are not typically features of ANNs. Here we detail three of these features: (1) the inflexibility of neural variability throughout learning, (2) the use of multiple learning processes even during simple tasks, and (3) the presence of large task-nonspecific activity changes. We propose that understanding the role of these features in the brain will be key to describing biological learning using an optimization framework.
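The "optimization framework" described here can be made concrete with a toy gradient-descent loop. The sketch below uses illustrative values that are not from the paper: a single linear readout trained on a mean-squared-error objective. The framework's claim is only that learning corresponds to descending some such objective.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "neural readout": one linear unit trained by gradient descent on
# a mean-squared-error objective (hypothetical data and weights).
X = rng.normal(size=(200, 4))              # inputs (trials x features)
w_true = np.array([1.0, -2.0, 0.5, 0.0])   # ground-truth mapping
y = X @ w_true + 0.1 * rng.normal(size=200)

w = np.zeros(4)
lr = 0.05
losses = []
for epoch in range(100):
    err = X @ w - y
    losses.append(float(np.mean(err ** 2)))  # objective being optimized
    w -= lr * (2.0 / len(y)) * (X.T @ err)   # gradient step

print(losses[0], losses[-1])  # loss decreases over learning
```

The features the authors highlight (constrained variability, multiple learning processes, task-nonspecific changes) are precisely what this bare loop does not capture, which is the article's point.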
Affiliation(s)
- Jay A Hennig
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA.
- Emily R Oby
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Darby M Losey
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Aaron P Batista
- Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Byron M Yu
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Steven M Chase
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
118
Towards the next generation of recurrent network models for cognitive neuroscience. Curr Opin Neurobiol 2021; 70:182-192. PMID: 34844122. DOI: 10.1016/j.conb.2021.10.015.
Abstract
Recurrent neural networks (RNNs) trained with machine learning techniques on cognitive tasks have become a widely accepted tool for neuroscientists. In this short opinion piece, we discuss fundamental challenges faced by the early work of this approach and recent steps to overcome such challenges and build next-generation RNN models for cognition. We propose several essential questions that practitioners of this approach should address to continue to build future generations of RNN models.
119
Subramanian A, Chitlangia S, Baths V. Reinforcement learning and its connections with neuroscience and psychology. Neural Netw 2021; 145:271-287. PMID: 34781215. DOI: 10.1016/j.neunet.2021.10.003.
Abstract
Reinforcement learning methods have recently been very successful at performing complex sequential tasks such as playing Atari games, Go, and poker. These algorithms have outperformed humans in several tasks by learning from scratch, using only scalar rewards obtained through interaction with their environment. While there has certainly been considerable independent innovation behind these results, many core ideas in reinforcement learning are inspired by phenomena in animal learning, psychology, and neuroscience. In this paper, we comprehensively review findings from both neuroscience and psychology that point to reinforcement learning as a promising candidate for modeling learning and decision making in the brain. In doing so, we construct a mapping between various classes of modern RL algorithms and specific findings in both the neurophysiological and behavioral literatures. We then discuss the implications of this relationship between RL, neuroscience, and psychology and its role in advancing research in both AI and brain science.
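As a concrete instance of the kind of mapping reviewed here, the sketch below runs a minimal tabular Q-learner on a two-armed bandit. The reward probabilities are hypothetical illustration values, not an example from the paper; the reward prediction error `delta` is the quantity classically compared to phasic dopamine responses.

```python
import numpy as np

rng = np.random.default_rng(2)

# Tabular Q-learning on a two-armed bandit (illustrative parameters).
p_reward = [0.2, 0.8]      # arm 1 pays off more often
Q = np.zeros(2)            # value estimate per arm
alpha, eps = 0.1, 0.1      # learning rate, exploration rate

for trial in range(2000):
    # epsilon-greedy action selection
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q))
    r = float(rng.random() < p_reward[a])
    delta = r - Q[a]       # reward prediction error ("dopamine-like" signal)
    Q[a] += alpha * delta  # value update

print(Q)  # learned values track the reward probabilities
```

The same update, generalized with function approximation and temporal-difference targets, underlies the deep RL systems mentioned in the abstract.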
Affiliation(s)
- Ajay Subramanian
- Department of Psychology, New York University, New York, New York, 10003, USA; Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
- Sharad Chitlangia
- Amazon; Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
- Veeky Baths
- Cognitive Neuroscience Lab, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India; Department of Biological Sciences, BITS Pilani K K Birla Goa Campus, NH-17B, Zuarinagar, Goa, 403726, India.
120
Do Q, Hasselmo ME. Neural circuits and symbolic processing. Neurobiol Learn Mem 2021; 186:107552. PMID: 34763073. PMCID: PMC10121157. DOI: 10.1016/j.nlm.2021.107552.
Abstract
The ability to use symbols is a defining feature of human intelligence. However, neuroscience has yet to explain the fundamental neural circuit mechanisms for flexibly representing and manipulating abstract concepts. This article will review the research on neural models for symbolic processing. The review first focuses on the question of how symbols could possibly be represented in neural circuits. The review then addresses how neural symbolic representations could be flexibly combined to meet a wide range of reasoning demands. Finally, the review assesses the research on program synthesis and proposes that the most flexible neural representation of symbolic processing would involve the capacity to rapidly synthesize neural operations analogous to lambda calculus to solve complex cognitive tasks.
Affiliation(s)
- Quan Do
- Center for Systems Neuroscience, Boston University, 610 Commonwealth Ave, Boston, MA 02215, United States.
- Michael E Hasselmo
- Center for Systems Neuroscience, Boston University, 610 Commonwealth Ave, Boston, MA 02215, United States.
121
Stetter M, Lang EW. Learning intuitive physics and one-shot imitation using state-action-prediction self-organizing maps. Comput Intell Neurosci 2021; 2021:5590445. PMID: 34804145. PMCID: PMC8604601. DOI: 10.1155/2021/5590445.
Abstract
Human learning and intelligence work differently from the supervised pattern recognition approach adopted in most deep learning architectures. Humans seem to learn rich representations by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks. We suggest a simple but effective unsupervised model which develops such characteristics. The agent learns to represent the dynamical physical properties of its environment by intrinsically motivated exploration and performs inference on this representation to reach goals. For this, a set of self-organizing maps which represent state-action pairs is combined with a causal model for sequence prediction. The proposed system is evaluated in the cartpole environment. After an initial phase of playful exploration, the agent can execute kinematic simulations of the environment's future and use those for action planning. We demonstrate its performance on a set of several related, but different one-shot imitation tasks, which the agent flexibly solves in an active inference style.
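A minimal self-organizing map illustrates the core mechanism described here: units compete for each input, and the winner and its grid neighbors move toward it, so the map comes to tile the distribution of inputs. The grid size, schedules, and random stand-in "state-action" data below are assumptions of this sketch, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(3)

grid = np.array([(i, j) for i in range(6) for j in range(6)])  # 6x6 map
W = rng.uniform(0.0, 1.0, (36, 4))   # one weight vector per map unit

def quant_error(W, data):
    """Mean squared distance from each sample to its best-matching unit."""
    return float(np.mean([np.min(np.sum((W - x) ** 2, axis=1)) for x in data]))

data = rng.uniform(0.0, 1.0, (200, 4))   # stand-in state-action samples
err_before = quant_error(W, data)

for epoch in range(20):
    lr = 0.5 * (1.0 - epoch / 20)             # decaying learning rate
    sigma = 2.0 * (1.0 - epoch / 20) + 0.5    # shrinking neighborhood
    for x in data:
        bmu = int(np.argmin(np.sum((W - x) ** 2, axis=1)))  # winner unit
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)        # grid distance
        h = np.exp(-d2 / (2.0 * sigma ** 2))                # neighborhood
        W += lr * h[:, None] * (x - W)                      # pull toward x

err_after = quant_error(W, data)
print(err_before, err_after)  # quantization error shrinks with training
```

In the paper's setting, each unit represents a state-action pair, and a sequence-prediction model over the resulting discrete codes supports the "kinematic simulations" used for planning.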
Affiliation(s)
- Martin Stetter
- Department of Bioengineering Sciences, Weihenstephan-Triesdorf University of Applied Sciences, Freising D-85354, Germany
- Elmar W. Lang
- Computational Intelligence and Machine Learning Group, Department of Biophysics, University of Regensburg, Regensburg D-93053, Germany
122
Froudist-Walsh S, Bliss DP, Ding X, Rapan L, Niu M, Knoblauch K, Zilles K, Kennedy H, Palomero-Gallagher N, Wang XJ. A dopamine gradient controls access to distributed working memory in the large-scale monkey cortex. Neuron 2021; 109:3500-3520.e13. PMID: 34536352. PMCID: PMC8571070. DOI: 10.1016/j.neuron.2021.08.024.
Abstract
Dopamine is required for working memory, but how it modulates the large-scale cortex is unknown. Here, we report that dopamine receptor density per neuron, measured by autoradiography, displays a macroscopic gradient along the macaque cortical hierarchy. This gradient is incorporated in a connectome-based large-scale cortex model endowed with multiple neuron types. The model captures an inverted U-shaped dependence of working memory on dopamine and spatial patterns of persistent activity observed in over 90 experimental studies. Moreover, we show that dopamine is crucial for filtering out irrelevant stimuli by enhancing inhibition from dendrite-targeting interneurons. Our model revealed that an activity-silent memory trace can be realized by facilitation of inter-areal connections and that adjusting cortical dopamine induces a switch from this internal memory state to distributed persistent activity. Our work represents a cross-level understanding from molecules and cell types to recurrent circuit dynamics underlying a core cognitive function distributed across the primate cortex.
Affiliation(s)
- Daniel P Bliss
- Center for Neural Science, New York University, New York, NY 10003, USA
- Xingyu Ding
- Center for Neural Science, New York University, New York, NY 10003, USA
- Meiqi Niu
- Research Centre Jülich, INM-1, Jülich, Germany
- Kenneth Knoblauch
- INSERM U846, Stem Cell & Brain Research Institute, 69500 Bron, France; Université de Lyon, Université Lyon I, 69003 Lyon, France
- Karl Zilles
- Research Centre Jülich, INM-1, Jülich, Germany
- Henry Kennedy
- INSERM U846, Stem Cell & Brain Research Institute, 69500 Bron, France; Université de Lyon, Université Lyon I, 69003 Lyon, France; Institute of Neuroscience, State Key Laboratory of Neuroscience, Chinese Academy of Sciences (CAS), Key Laboratory of Primate Neurobiology CAS, Shanghai, China
- Nicola Palomero-Gallagher
- Research Centre Jülich, INM-1, Jülich, Germany; C. & O. Vogt Institute for Brain Research, Heinrich-Heine-University, 40225 Düsseldorf, Germany
- Xiao-Jing Wang
- Center for Neural Science, New York University, New York, NY 10003, USA.
123
Na S, Chung D, Hula A, Perl O, Jung J, Heflin M, Blackmore S, Fiore VG, Dayan P, Gu X. Humans use forward thinking to exploit social controllability. eLife 2021; 10:e64983. PMID: 34711304. PMCID: PMC8555988. DOI: 10.7554/elife.64983.
Abstract
The controllability of our social environment has a profound impact on our behavior and mental health. Nevertheless, the neurocomputational mechanisms underlying social controllability remain elusive. Here, 48 participants performed a task in which their current choices either did (Controllable) or did not (Uncontrollable) influence their partners' future proposals. Computational modeling revealed that people engaged a mental model of forward thinking (FT; i.e., calculating the downstream effects of current actions) to estimate social controllability in both the Controllable and Uncontrollable conditions. A large-scale online replication study (n=1342) supported this finding. Using functional magnetic resonance imaging (n=48), we further demonstrated that the ventromedial prefrontal cortex (vmPFC) computed the projected total values of current actions during forward planning, supporting the neural realization of the forward-thinking model. These findings demonstrate that humans use vmPFC-dependent FT to estimate and exploit social controllability, expanding the role of this neurocomputational mechanism beyond spatial and cognitive contexts.
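The forward-thinking computation described here, valuing an action by its projected downstream consequences rather than only its immediate payoff, can be sketched in a few lines. The offer size, the per-rejection increment `delta`, and the discount `gamma` below are hypothetical illustration values, not the study's fitted model.

```python
# Toy version of the forward-thinking (FT) idea: in a controllable
# setting, rejecting a low offer pushes the partner's future proposals
# upward. A myopic agent values only the current offer; an FT agent
# adds the projected downstream effect. All numbers are illustrative.
def ft_value(offer, accept, horizon, delta=2.0, gamma=0.9):
    """Projected total value of accepting/rejecting the current offer,
    assuming future offers are accepted as they come."""
    value = offer if accept else 0.0
    future = offer
    for step in range(1, horizon + 1):
        future += 0.0 if accept else delta   # rejection raises later offers
        value += (gamma ** step) * future
    return value

# With no lookahead (horizon 0), accepting a low offer always wins.
assert ft_value(4, True, 0) > ft_value(4, False, 0)
# With forward thinking, rejecting can win because proposals improve.
print(ft_value(4, True, 3), ft_value(4, False, 3))
```

The vmPFC signal reported in the study corresponds to the "projected total value" that this kind of rollout computes for the chosen action.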
Affiliation(s)
- Soojung Na
- The Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, United States; Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, United States; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States
- Dongil Chung
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea
- Andreas Hula
- Austrian Institute of Technology, Seibersdorf, Austria
- Ofer Perl
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States
- Jennifer Jung
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, United States
- Matthew Heflin
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States
- Sylvia Blackmore
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States; Queen Square Institute of Neurology, University College London, London, United Kingdom
- Vincenzo G Fiore
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States
- Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany; University of Tübingen, Tübingen, Germany
- Xiaosi Gu
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, United States; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, United States
124
Langdon A, Botvinick M, Nakahara H, Tanaka K, Matsumoto M, Kanai R. Meta-learning, social cognition and consciousness in brains and machines. Neural Netw 2021; 145:80-89. PMID: 34735893. DOI: 10.1016/j.neunet.2021.10.004.
Abstract
The intersection between neuroscience and artificial intelligence (AI) research has created synergistic effects in both fields. While neuroscientific discoveries have inspired the development of AI architectures, new ideas and algorithms from AI research have produced new ways to study brain mechanisms. A well-known example is reinforcement learning (RL), which has stimulated neuroscience research on how animals learn to adjust their behavior to maximize reward. In this review article, we cover recent collaborative work between the two fields in the context of meta-learning and its extension to social cognition and consciousness. Meta-learning refers to the ability to learn how to learn, for example by adjusting the hyperparameters of existing learning algorithms or by reusing existing models and knowledge to solve new tasks efficiently. This capability would make AI systems more adaptive and flexible, and because it is one of the areas with the largest remaining gap between human performance and current AI systems, successful collaboration here should produce new ideas and progress. Starting from the role of RL algorithms in driving neuroscience, we discuss recent developments in deep RL applied to modeling prefrontal cortex functions. From a broader perspective, we discuss the similarities and differences between social cognition and meta-learning, and we conclude with speculations on the potential links between the intelligence endowed by model-based RL and consciousness. For future work, we highlight data efficiency, autonomy, and intrinsic motivation as key research areas for advancing both fields.
Affiliation(s)
- Angela Langdon
- Princeton Neuroscience Institute, Princeton University, USA
- Matthew Botvinick
- DeepMind, London, UK; Gatsby Computational Neuroscience Unit, University College London, London, UK
- Keiji Tanaka
- RIKEN Center for Brain Science, Wako, Saitama, Japan
- Masayuki Matsumoto
- Division of Biomedical Science, Faculty of Medicine, University of Tsukuba, Ibaraki, Japan; Graduate School of Comprehensive Human Sciences, University of Tsukuba, Ibaraki, Japan; Transborder Medical Research Center, University of Tsukuba, Ibaraki, Japan
125
Morningstar MD, Barnett WH, Goodlett CR, Kuznetsov A, Lapish CC. Understanding ethanol's acute effects on medial prefrontal cortex neural activity using state-space approaches. Neuropharmacology 2021; 198:108780. PMID: 34480911. PMCID: PMC8488975. DOI: 10.1016/j.neuropharm.2021.108780.
Abstract
Acute ethanol (EtOH) intoxication results in several maladaptive behaviors that may be attributable, in part, to the effects of EtOH on neural activity in medial prefrontal cortex (mPFC). The acute effects of EtOH on mPFC function have been largely described as inhibitory. However, translating these observations on function into a mechanism capable of delineating acute EtOH's effects on behavior has proven difficult. This review highlights the role of acute EtOH on electrophysiological measurements of mPFC function and proposes that interpreting these changes through the lens of dynamical systems theory is critical to understanding the mechanisms that mediate the effects of EtOH intoxication on behavior. Specifically, the present review posits that the effects of EtOH on mPFC N-methyl-d-aspartate (NMDA) receptors are critical for the expression of impaired behavior following EtOH consumption. This hypothesis is based on the observation that recurrent activity in cortical networks is supported by NMDA receptors and, when disrupted, may lead to impairments in cognitive function. To evaluate this hypothesis, we discuss the representation of mPFC neural activity in low-dimensional, dynamic state spaces. This approach has proven useful for identifying the underlying computations necessary for the production of behavior. Ultimately, we hypothesize that EtOH-related alterations to NMDA receptor function can be effectively conceptualized as impairments in attractor dynamics, providing insight into how acute EtOH disrupts forms of cognition that rely on mPFC function. This article is part of the special issue on 'Neurocircuitry Modulating Drug and Alcohol Abuse'.
Affiliation(s)
- William H Barnett
- Indiana University-Purdue University Indianapolis, Department of Psychology, USA
- Charles R Goodlett
- Indiana University-Purdue University Indianapolis, Department of Psychology, USA; Indiana University School of Medicine, Stark Neurosciences, USA
- Alexey Kuznetsov
- Indiana University-Purdue University Indianapolis, Department of Mathematics, USA; Indiana University School of Medicine, Stark Neurosciences, USA
- Christopher C Lapish
- Indiana University-Purdue University Indianapolis, Department of Psychology, USA; Indiana University School of Medicine, Stark Neurosciences, USA
126
Dissociable mechanisms of information sampling in prefrontal cortex and the dopaminergic system. Curr Opin Behav Sci 2021. DOI: 10.1016/j.cobeha.2021.04.005.
127
Eckstein MK, Wilbrecht L, Collins AGE. What do reinforcement learning models measure? Interpreting model parameters in cognition and neuroscience. Curr Opin Behav Sci 2021; 41:128-137. PMID: 34984213. PMCID: PMC8722372. DOI: 10.1016/j.cobeha.2021.06.004.
Abstract
Reinforcement learning (RL) is a concept that has been invaluable to fields including machine learning, neuroscience, and cognitive science. However, what RL entails differs between fields, leading to difficulties when interpreting and translating findings. After laying out these differences, this paper focuses on cognitive (neuro)science to discuss how we as a field might over-interpret RL modeling results. We too often assume, implicitly, that modeling results generalize between tasks, models, and participant populations, despite negative empirical evidence for this assumption. We also often assume that parameters measure specific, unique (neuro)cognitive processes, a property we call interpretability, when evidence suggests that they capture different functions across studies and tasks. We conclude that future computational research needs to pay increased attention to these implicit assumptions when using RL models, and we suggest that a more systematic understanding of contextual factors will help address these issues and improve the ability of RL to explain brain and behavior.
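One way to see the interpretability concern is to fit an RL "learning rate" and ask what it measures. The sketch below is hypothetical throughout (a two-armed bandit, a softmax Q-learner, grid-search maximum likelihood; none of it is from the paper): it recovers the generating `alpha`, but the recovered number is only defined relative to this task and this model, which is exactly the caution the authors raise.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate(alpha, beta=5.0, n=500):
    """Generate choices/rewards from a softmax Q-learner on a bandit."""
    p = [0.3, 0.7]
    Q = np.zeros(2)
    choices, rewards = [], []
    for _ in range(n):
        pa = np.exp(beta * Q) / np.sum(np.exp(beta * Q))  # softmax policy
        a = int(rng.choice(2, p=pa))
        r = float(rng.random() < p[a])
        Q[a] += alpha * (r - Q[a])
        choices.append(a)
        rewards.append(r)
    return choices, rewards

def neg_log_lik(alpha, choices, rewards, beta=5.0):
    """Replay the data under a candidate alpha and score the choices."""
    Q = np.zeros(2)
    nll = 0.0
    for a, r in zip(choices, rewards):
        pa = np.exp(beta * Q) / np.sum(np.exp(beta * Q))
        nll -= np.log(pa[a] + 1e-12)
        Q[a] += alpha * (r - Q[a])
    return nll

choices, rewards = simulate(alpha=0.3)
grid_a = np.linspace(0.05, 0.95, 19)
fit = grid_a[int(np.argmin([neg_log_lik(a, choices, rewards) for a in grid_a]))]
print(fit)  # recovered learning rate
```

Change the task statistics or the assumed model and the best-fitting `alpha` changes too, even though the participant (here, the simulator) is the same.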
Affiliation(s)
- Maria K Eckstein
- Department of Psychology, UC Berkeley, 2121 Berkeley Way West, Berkeley, 94720, CA, USA
- Linda Wilbrecht
- Department of Psychology, UC Berkeley, 2121 Berkeley Way West, Berkeley, 94720, CA, USA
- Helen Wills Neuroscience Institute, UC Berkeley, 175 Li Ka Shing Center, Berkeley, 94720, CA, USA
- Anne G E Collins
- Department of Psychology, UC Berkeley, 2121 Berkeley Way West, Berkeley, 94720, CA, USA
- Helen Wills Neuroscience Institute, UC Berkeley, 175 Li Ka Shing Center, Berkeley, 94720, CA, USA
128
Güntürkün O, von Eugen K, Packheiser J, Pusch R. Avian pallial circuits and cognition: a comparison to mammals. Curr Opin Neurobiol 2021; 71:29-36. PMID: 34562800. DOI: 10.1016/j.conb.2021.08.007.
Abstract
Cognitive functions are similar in birds and mammals. Are pallial cellular circuits and neuronal computations therefore also alike? In search of answers, we move inward from birds' pallial connectomes, to cortex-like sensory canonical circuits and connections, to forebrain micro-circuitries, and finally to the avian "prefrontal" area. This voyage from macro- to micro-scale networks and areas reveals that birds and mammals evolved similar neural and computational properties, in either a convergent or a parallel manner, based upon circuitries inherited from common ancestry. Thus, over roughly 315 million years of separate evolution, these two vertebrate classes arrived at highly similar pallial architectures that produce comparable cognitive functions.
Affiliation(s)
- Onur Güntürkün
- Department of Biopsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, 44801, Bochum, Germany.
- Kaya von Eugen
- Department of Biopsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, 44801, Bochum, Germany
- Julian Packheiser
- Department of Biopsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, 44801, Bochum, Germany
- Roland Pusch
- Department of Biopsychology, Institute of Cognitive Neuroscience, Faculty of Psychology, Ruhr University Bochum, Universitätsstraße 150, 44801, Bochum, Germany
129
Schmidgall S, Ashkanazy J, Lawson W, Hays J. SpikePropamine: differentiable plasticity in spiking neural networks. Front Neurorobot 2021; 15:629210. PMID: 34630063. PMCID: PMC8493296. DOI: 10.3389/fnbot.2021.629210.
Abstract
The adaptive changes in synaptic efficacy that occur between spiking neurons have been demonstrated to play a critical role in learning for biological neural networks. Despite this source of inspiration, many learning-focused applications using Spiking Neural Networks (SNNs) retain static synaptic connections, preventing additional learning after the initial training period. Here, we introduce a framework for simultaneously learning, through gradient descent, the underlying fixed weights and the rules governing the dynamics of synaptic plasticity and neuromodulated synaptic plasticity in SNNs. We further demonstrate the capabilities of this framework on a series of challenging benchmarks, learning the parameters of several plasticity rules, including BCM, Oja's rule, and their respective neuromodulatory variants. The experimental results show that SNNs augmented with differentiable plasticity can solve a set of challenging temporal learning tasks that a traditional SNN fails to solve, even in the presence of significant noise. These networks are also shown to be capable of producing locomotion on a high-dimensional robotic learning task, with near-minimal degradation in performance under novel conditions not seen during the initial training period.
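The central idea, a synapse whose effective weight keeps changing after training, can be sketched without the spiking machinery. The rate-based toy below is an assumption-laden simplification of the paper's approach: it only shows the forward dynamics of a plastic synapse (fixed weight plus a learned gain times a running Hebbian trace), not the gradient-descent training of the plasticity rules themselves.

```python
import numpy as np

rng = np.random.default_rng(5)

# Plastic synapse sketch: effective weight = w + alpha * hebb, where the
# Hebbian trace `hebb` keeps updating at inference time. In the paper,
# w and the plasticity parameters are themselves trained by gradient
# descent on SNNs; here all values are illustrative.
n_in, n_out = 4, 3
w = rng.normal(scale=0.5, size=(n_out, n_in))      # fixed weights
alpha = rng.normal(scale=0.1, size=(n_out, n_in))  # per-synapse plasticity gains
hebb = np.zeros((n_out, n_in))                     # Hebbian trace
eta = 0.05                                         # trace update rate

for t in range(50):
    x = rng.normal(size=n_in)                 # presynaptic activity
    y = np.tanh((w + alpha * hebb) @ x)       # forward pass with plastic weights
    hebb = np.clip(hebb + eta * np.outer(y, x), -1.0, 1.0)  # bounded trace

print(np.abs(hebb).max())  # trace stays bounded by the clip
```

Because `hebb` evolves with every input, the network continues to adapt after the training period ends, which is precisely what static-weight SNNs cannot do.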
Affiliation(s)
- Julia Ashkanazy
- U.S. Naval Research Laboratory, Washington, DC, United States
- Wallace Lawson
- U.S. Naval Research Laboratory, Washington, DC, United States
- Joe Hays
- U.S. Naval Research Laboratory, Washington, DC, United States
130
Slow manifolds within network dynamics encode working memory efficiently and robustly. PLoS Comput Biol 2021; 17:e1009366. PMID: 34525089. PMCID: PMC8475983. DOI: 10.1371/journal.pcbi.1009366.
Abstract
Working memory is a cognitive function involving the storage and manipulation of latent information over brief intervals of time, thus making it crucial for context-dependent computation. Here, we use a top-down modeling approach to examine network-level mechanisms of working memory, an enigmatic issue and central topic of study in neuroscience. We optimize thousands of recurrent rate-based neural networks on a working memory task and then perform dynamical systems analysis on the ensuing optimized networks, wherein we find that four distinct dynamical mechanisms can emerge. In particular, we show the prevalence of a mechanism in which memories are encoded along slow stable manifolds in the network state space, leading to a phasic neuronal activation profile during memory periods. In contrast to mechanisms in which memories are directly encoded at stable attractors, these networks naturally forget stimuli over time. Despite this seeming functional disadvantage, they are more efficient in terms of how they leverage their attractor landscape and paradoxically, are considerably more robust to noise. Our results provide new hypotheses regarding how working memory function may be encoded within the dynamics of neural circuits.
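The contrast drawn here between attractor coding and slow-manifold coding reduces to a one-line difference in linear dynamics. The eigenvalues below are illustrative, not fitted to the paper's networks: a mode with eigenvalue 1 holds the stimulus indefinitely, while a slow mode with eigenvalue just below 1 retains a gradually decaying trace, i.e., the network naturally forgets.

```python
# One-dimensional caricature of the two memory mechanisms:
# a perfect attractor (eigenvalue exactly 1.0) vs. a slow manifold
# (eigenvalue slightly below 1.0). Values are illustrative.
s = 1.0                        # stimulus encoded at t = 0
lam_attractor, lam_slow = 1.0, 0.98
m_att, m_slow = s, s
for t in range(100):           # delay period of 100 steps
    m_att *= lam_attractor     # persistent activity: no decay
    m_slow *= lam_slow         # slow manifold: gradual forgetting

print(m_att, m_slow)           # 1.0 vs. roughly 0.13
```

The paper's finding is that this decaying variant, despite forgetting, uses the attractor landscape more efficiently and is markedly more robust to noise.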
131
Cortese A. Metacognitive resources for adaptive learning. Neurosci Res 2021; 178:10-19. PMID: 34534617. DOI: 10.1016/j.neures.2021.09.003.
Abstract
Biological organisms display remarkably flexible behaviours. This is an area of active investigation, in particular in the fields of artificial intelligence and computational and cognitive neuroscience. While inductive biases and broader cognitive functions are undoubtedly important, the ability to monitor and evaluate one's own performance (metacognition) stands out as a powerful resource for efficient learning. Often measured as decision confidence in neuroscience and psychology experiments, metacognition appears to span a broad range of abstraction levels and downstream behavioural effects. Within this context, the formal investigation of how metacognition interacts with learning processes is a recent endeavour. Of special interest are the neural and computational underpinnings of confidence and reinforcement learning modules. This review discusses a general hierarchy of confidence functions and their neuro-computational relevance for adaptive behaviours. It then introduces novel ways to study the formation and use of meta-representations and nonconscious mental representations related to learning and confidence, and concludes with a discussion of outstanding questions and wider perspectives.
Affiliation(s)
- Aurelio Cortese
- Computational Neuroscience Labs, ATR Institute International, 619-0288 Kyoto, Japan.
132
Caligiore D, Silvetti M, D'Amelio M, Puglisi-Allegra S, Baldassarre G. Computational modeling of catecholamines dysfunction in Alzheimer's disease at pre-plaque stage. J Alzheimers Dis 2021; 77:275-290. PMID: 32741822. PMCID: PMC7592658. DOI: 10.3233/jad-200276.
Abstract
Background: Alzheimer’s disease (AD) etiopathogenesis remains partially unexplained. The main conceptual framework used to study AD is the Amyloid Cascade Hypothesis, although the failure of recent clinical experimentation seems to reduce its potential in AD research. Objective: A possible explanation for the failure of clinical trials is that they are set too late in AD progression. Recent studies suggest that the ventral tegmental area (VTA) degeneration could be one of the first events occurring in AD progression (pre-plaque stage). Methods: Here we investigate this hypothesis through a computational model and computer simulations validated with behavioral and neural data from patients. Results: We show that VTA degeneration might lead to system-level adjustments of catecholamine release, triggering a sequence of events leading to relevant clinical and pathological signs of AD. These changes consist first in a midfrontal-driven compensatory hyperactivation of both VTA and locus coeruleus (norepinephrine) followed, with the progression of the VTA impairment, by a downregulation of catecholamine release. These processes could then trigger the neural degeneration at the cortical and hippocampal levels, due to the chronic loss of the neuroprotective role of norepinephrine. Conclusion: Our novel hypothesis might contribute to the formulation of a wider system-level view of AD which might help to devise early diagnostic and therapeutic interventions.
Collapse
Affiliation(s)
- Daniele Caligiore
- Computational and Translational Neuroscience Laboratory (CTNLab), Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
| | - Massimo Silvetti
- Computational and Translational Neuroscience Laboratory (CTNLab), Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
| | - Marcello D'Amelio
- Unit of Molecular Neurosciences, Department of Medicine, University Campus-Biomedico, Rome, Italy; IRCCS Santa Lucia Foundation, Rome, Italy
| | | | - Gianluca Baldassarre
- Laboratory of Computational Embodied Neuroscience (LOCEN), Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
| |
Collapse
|
133
|
Roscow EL, Chua R, Costa RP, Jones MW, Lepora N. Learning offline: memory replay in biological and artificial reinforcement learning. Trends Neurosci 2021; 44:808-821. [PMID: 34481635 DOI: 10.1016/j.tins.2021.07.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2021] [Revised: 07/13/2021] [Accepted: 07/21/2021] [Indexed: 10/20/2022]
Abstract
Learning to act in an environment to maximise rewards is among the brain's key functions. This process has often been conceptualised within the framework of reinforcement learning, which has also gained prominence in machine learning and artificial intelligence (AI) as a way to optimise decision making. A common aspect of both biological and machine reinforcement learning is the reactivation of previously experienced episodes, referred to as replay. Replay is important for memory consolidation in biological neural networks and is key to stabilising learning in deep neural networks. Here, we review recent developments concerning the functional roles of replay in the fields of neuroscience and AI. Complementary progress suggests how replay might support learning processes, including generalisation and continual learning, affording opportunities to transfer knowledge across the two fields to advance the understanding of biological and artificial learning and memory.
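On the machine side, the replay reviewed above is typically realized as an experience-replay buffer: transitions are stored and later revisited in random minibatches to decorrelate updates and stabilize learning. A minimal illustrative sketch (class and method names are hypothetical, not taken from the review):

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal experience-replay buffer: store transitions, then
    sample random minibatches to decorrelate learning updates."""

    def __init__(self, capacity):
        # Bounded memory: the oldest experiences are evicted first.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random replay; prioritised variants instead weight
        # transitions by their prediction error.
        return random.sample(list(self.buffer), batch_size)


buf = ReplayBuffer(capacity=100)
for t in range(5):
    buf.add(t, 0, 1.0, t + 1)
batch = buf.sample(3)
```

Uniform sampling is the simplest policy; the biological replay discussed in the review is closer to the prioritised variants, which preferentially reactivate surprising or rewarded experiences.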
Collapse
Affiliation(s)
| | | | - Rui Ponte Costa
- Bristol Computational Neuroscience Unit, Intelligent Systems Lab, Department of Computer Science, University of Bristol, Bristol, UK
| | - Matt W Jones
- School of Physiology, Pharmacology and Neuroscience, University of Bristol, Bristol, UK
| | - Nathan Lepora
- Department of Engineering Mathematics and Bristol Robotics Laboratory, University of Bristol, Bristol, UK
| |
Collapse
|
134
|
Park SA, Miller DS, Boorman ED. Inferences on a multidimensional social hierarchy use a grid-like code. Nat Neurosci 2021; 24:1292-1301. [PMID: 34465915 PMCID: PMC8759596 DOI: 10.1038/s41593-021-00916-3] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2020] [Accepted: 07/21/2021] [Indexed: 02/06/2023]
Abstract
Generalizing experiences to guide decision-making in novel situations is a hallmark of flexible behavior. Cognitive maps of an environment or task can theoretically afford such flexibility, but direct evidence has proven elusive. In this study, we found that discretely sampled abstract relationships between entities in an unseen two-dimensional social hierarchy are reconstructed into a unitary two-dimensional cognitive map in the hippocampus and entorhinal cortex. We further show that humans use a grid-like code in entorhinal cortex and medial prefrontal cortex for inferred direct trajectories between entities in the reconstructed abstract space during discrete decisions. These grid-like representations in the entorhinal cortex are associated with decision value computations in the medial prefrontal cortex and temporoparietal junction. Collectively, these findings show that grid-like representations are used by the human brain to infer novel solutions, even in abstract and discrete problems, and suggest a general mechanism underpinning flexible decision-making and generalization.
Collapse
Affiliation(s)
| | - Douglas S. Miller
- Center for Mind and Brain, University of California, Davis, USA; Center for Neuroscience, University of California, Davis, USA
| | - Erie D. Boorman
- Center for Mind and Brain, University of California, Davis, USA; Department of Psychology, University of California, Davis, USA
| |
Collapse
|
135
|
Jiang L, Litwin-Kumar A. Models of heterogeneous dopamine signaling in an insect learning and memory center. PLoS Comput Biol 2021; 17:e1009205. [PMID: 34375329 PMCID: PMC8354444 DOI: 10.1371/journal.pcbi.1009205] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2020] [Accepted: 06/22/2021] [Indexed: 11/25/2022] Open
Abstract
The Drosophila mushroom body exhibits dopamine dependent synaptic plasticity that underlies the acquisition of associative memories. Recordings of dopamine neurons in this system have identified signals related to external reinforcement such as reward and punishment. However, other factors including locomotion, novelty, reward expectation, and internal state have also recently been shown to modulate dopamine neurons. This heterogeneity is at odds with typical modeling approaches in which these neurons are assumed to encode a global, scalar error signal. How is dopamine dependent plasticity coordinated in the presence of such heterogeneity? We develop a modeling approach that infers a pattern of dopamine activity sufficient to solve defined behavioral tasks, given architectural constraints informed by knowledge of mushroom body circuitry. Model dopamine neurons exhibit diverse tuning to task parameters while nonetheless producing coherent learned behaviors. Notably, reward prediction error emerges as a mode of population activity distributed across these neurons. Our results provide a mechanistic framework that accounts for the heterogeneity of dopamine activity during learning and behavior. Dopamine neurons across the animal kingdom are involved in the formation of associative memories. While numerous studies have recorded activity in these neurons related to external and predicted rewards, the diversity of these neurons’ activity and their tuning to non-reward-related quantities such as novelty, movement, and internal state have proved challenging to account for in traditional modeling approaches. Using a well-characterized model system for learning and memory, the mushroom body of Drosophila fruit flies, Jiang and Litwin-Kumar provide an account of the diversity of signals across dopamine neurons. 
They show that models optimized to solve tasks like those encountered by flies exhibit heterogeneous activity across dopamine neurons, but nonetheless this activity is sufficient for the system to solve the tasks. The models will be useful to generate testable hypotheses about dopamine neuron activity across different experimental conditions.
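The scalar signal that dopamine neurons are classically assumed to encode, and which the model instead recovers as one mode of population activity, is the temporal-difference reward prediction error. A textbook sketch of that quantity (not the authors' mushroom-body model):

```python
def td_error(reward, v_next, v_current, gamma=0.9):
    """Temporal-difference reward prediction error:
    delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_current


# An unexpected reward yields a positive error; a fully predicted
# reward yields an error of zero.
surprise = td_error(reward=1.0, v_next=0.0, v_current=0.0)
predicted = td_error(reward=1.0, v_next=0.0, v_current=1.0)
```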
Collapse
Affiliation(s)
- Linnie Jiang
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, New York, United States of America
- Neurosciences Program, Stanford University, Stanford, California, United States of America
| | - Ashok Litwin-Kumar
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Department of Neuroscience, Columbia University, New York, New York, United States of America
| |
Collapse
|
136
|
Hunt LT, Daw ND, Kaanders P, MacIver MA, Mugan U, Procyk E, Redish AD, Russo E, Scholl J, Stachenfeld K, Wilson CRE, Kolling N. Formalizing planning and information search in naturalistic decision-making. Nat Neurosci 2021; 24:1051-1064. [PMID: 34155400 DOI: 10.1038/s41593-021-00866-w] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Accepted: 03/23/2021] [Indexed: 02/05/2023]
Abstract
Decisions made by mammals and birds are often temporally extended. They require planning and sampling of decision-relevant information. Our understanding of such decision-making remains in its infancy compared with simpler, forced-choice paradigms. However, recent advances in algorithms supporting planning and information search provide a lens through which we can explain neural and behavioral data in these tasks. We review these advances to obtain a clearer understanding of why planning and curiosity originated in certain species but not others; how activity in the medial temporal lobe, prefrontal and cingulate cortices may support these behaviors; and how planning and information search may complement each other as means to improve future action selection.
Collapse
Affiliation(s)
- L T Hunt
- Department of Psychiatry, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK.
| | - N D Daw
- Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA
| | - P Kaanders
- Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
| | - M A MacIver
- Center for Robotics and Biosystems, Department of Neurobiology, Department of Biomedical Engineering, Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
| | - U Mugan
- Center for Robotics and Biosystems, Department of Neurobiology, Department of Biomedical Engineering, Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
| | - E Procyk
- Univ Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and Brain Research Institute U1208, Bron, France
| | - A D Redish
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
| | - E Russo
- Department of Theoretical Neuroscience, Central Institute of Mental Health, Mannheim, Germany; Department of Psychiatry and Psychotherapy, University Medical Center, Johannes Gutenberg University, Mainz, Germany
| | - J Scholl
- Department of Experimental Psychology, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK
| | | | - C R E Wilson
- Univ Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and Brain Research Institute U1208, Bron, France
| | - N Kolling
- Department of Psychiatry, Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK.
| |
Collapse
|
137
|
Multitask learning over shared subspaces. PLoS Comput Biol 2021; 17:e1009092. [PMID: 34228719 PMCID: PMC8284664 DOI: 10.1371/journal.pcbi.1009092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2020] [Revised: 07/16/2021] [Accepted: 05/18/2021] [Indexed: 11/19/2022] Open
Abstract
This paper uses constructs from machine learning to define pairs of learning tasks that either shared or did not share a common subspace. Human subjects then learnt these tasks using a feedback-based approach and we hypothesised that learning would be boosted for shared subspaces. Our findings broadly supported this hypothesis with either better performance on the second task if it shared the same subspace as the first, or positive correlations over task performance for shared subspaces. These empirical findings were compared to the behaviour of a Neural Network model trained using sequential Bayesian learning and human performance was found to be consistent with a minimal capacity variant of this model. Networks with an increased representational capacity, and networks without Bayesian learning, did not show these transfer effects. We propose that the concept of shared subspaces provides a useful framework for the experimental study of human multitask and transfer learning. How does knowledge gained from previous experience affect learning of new tasks? This question of “Transfer Learning” has been addressed by teachers, psychologists, and more recently by researchers in the fields of neural networks and machine learning. Leveraging constructs from machine learning, we designed pairs of learning tasks that either shared or did not share a common subspace. We compared the dynamics of transfer learning in humans with those of a multitask neural network model, finding that human performance was consistent with a minimal capacity variant of the model. Learning was boosted in the second task if the same subspace was shared between tasks. Additionally, accuracy between tasks was positively correlated but only when they shared the same subspace. Our results highlight the roles of subspaces, showing how they could act as a learning boost if shared, and be detrimental if not.
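The sequential Bayesian learning used to train the network model can be illustrated, for a single Gaussian-distributed weight, by the standard conjugate update in which precisions add and the mean moves toward each observation. This is an illustrative sketch of the general principle, not the paper's network:

```python
def bayes_update_gaussian(prior_mean, prior_var, obs, obs_var):
    """Sequential Bayesian update of a Gaussian belief about one
    parameter: precisions add, the mean shifts toward the observation."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


# Each task tightens the belief; a later task starts from the
# posterior left by the earlier one, which is what enables transfer.
mean, var = 0.0, 1.0
for obs in [1.0, 1.0, 1.0]:
    mean, var = bayes_update_gaussian(mean, var, obs, obs_var=1.0)
```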
Collapse
|
138
|
van der Maas HL, Snoek L, Stevenson CE. How much intelligence is there in artificial intelligence? A 2020 update. INTELLIGENCE 2021. [DOI: 10.1016/j.intell.2021.101548] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
|
139
|
Yang CS, Cowan NJ, Haith AM. De novo learning versus adaptation of continuous control in a manual tracking task. eLife 2021; 10:e62578. [PMID: 34169838 PMCID: PMC8266385 DOI: 10.7554/elife.62578] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2020] [Accepted: 06/22/2021] [Indexed: 12/20/2022] Open
Abstract
How do people learn to perform tasks that require continuous adjustments of motor output, like riding a bicycle? People rely heavily on cognitive strategies when learning discrete movement tasks, but such time-consuming strategies are infeasible in continuous control tasks that demand rapid responses to ongoing sensory feedback. To understand how people can learn to perform such tasks without the benefit of cognitive strategies, we imposed a rotation/mirror reversal of visual feedback while participants performed a continuous tracking task. We analyzed behavior using a system identification approach, which revealed two qualitatively different components of learning: adaptation of a baseline controller and formation of a new, task-specific continuous controller. These components exhibited different signatures in the frequency domain and were differentially engaged under the rotation/mirror reversal. Our results demonstrate that people can rapidly build a new continuous controller de novo and can simultaneously deploy this process with adaptation of an existing controller.
Collapse
Affiliation(s)
- Christopher S Yang
- Department of Neuroscience, Johns Hopkins UniversityBaltimoreUnited States
| | - Noah J Cowan
- Department of Mechanical Engineering, Laboratory for Computational Sensing and Robotics, Johns Hopkins UniversityBaltimoreUnited States
| | - Adrian M Haith
- Department of Neurology, Johns Hopkins UniversityBaltimoreUnited States
| |
Collapse
|
140
|
Safron A. The Radically Embodied Conscious Cybernetic Bayesian Brain: From Free Energy to Free Will and Back Again. ENTROPY (BASEL, SWITZERLAND) 2021; 23:783. [PMID: 34202965 PMCID: PMC8234656 DOI: 10.3390/e23060783] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 05/12/2021] [Accepted: 05/27/2021] [Indexed: 11/24/2022]
Abstract
Drawing from both enactivist and cognitivist perspectives on mind, I propose that explaining teleological phenomena may require reappraising both "Cartesian theaters" and mental homunculi in terms of embodied self-models (ESMs), understood as body maps with agentic properties, functioning as predictive-memory systems and cybernetic controllers. Quasi-homuncular ESMs are suggested to constitute a major organizing principle for neural architectures due to their initial and ongoing significance for solutions to inference problems in cognitive (and affective) development. Embodied experiences provide foundational lessons in learning curriculums in which agents explore increasingly challenging problem spaces, so answering an unresolved question in Bayesian cognitive science: what are biologically plausible mechanisms for equipping learners with sufficiently powerful inductive biases to adequately constrain inference spaces? Drawing on models from neurophysiology, psychology, and developmental robotics, I describe how embodiment provides fundamental sources of empirical priors (as reliably learnable posterior expectations). If ESMs play this kind of foundational role in cognitive development, then bidirectional linkages will be found between all sensory modalities and frontal-parietal control hierarchies, so infusing all senses with somatic-motoric properties, thereby structuring all perception by relevant affordances, so solving frame problems for embodied agents. Drawing upon the Free Energy Principle and Active Inference framework, I describe a particular mechanism for intentional action selection via consciously imagined (and explicitly represented) goal realization, where contrasts between desired and present states influence ongoing policy selection via predictive coding mechanisms and backward-chained imaginings (as self-realizing predictions). 
This embodied developmental legacy suggests a mechanism by which imaginings can be intentionally shaped by (internalized) partially-expressed motor acts, so providing means of agentic control for attention, working memory, imagination, and behavior. I further describe the nature(s) of mental causation and self-control, and also provide an account of readiness potentials in Libet paradigms wherein conscious intentions shape causal streams leading to enaction. Finally, I provide neurophenomenological handlings of prototypical qualia including pleasure, pain, and desire in terms of self-annihilating free energy gradients via quasi-synesthetic interoceptive active inference. In brief, this manuscript is intended to illustrate how radically embodied minds may create foundations for intelligence (as capacity for learning and inference), consciousness (as somatically-grounded self-world modeling), and will (as deployment of predictive models for enacting valued goals).
Collapse
Affiliation(s)
- Adam Safron
- Center for Psychedelic and Consciousness Research, Johns Hopkins University School of Medicine, Baltimore, MD 21218, USA;
- Kinsey Institute, Indiana University, Bloomington, IN 47405, USA
- Cognitive Science Program, Indiana University, Bloomington, IN 47405, USA
| |
Collapse
|
141
|
Trends of Human-Robot Collaboration in Industry Contexts: Handover, Learning, and Metrics. SENSORS 2021; 21:s21124113. [PMID: 34203766 PMCID: PMC8232712 DOI: 10.3390/s21124113] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Revised: 06/03/2021] [Accepted: 06/08/2021] [Indexed: 12/03/2022]
Abstract
Repetitive industrial tasks can be easily performed by traditional robotic systems. However, many other tasks require cognitive knowledge that only humans can provide. Human-Robot Collaboration (HRC) emerges as an ideal concept of co-working between a human operator and a robot, representing one of the most significant subjects for human-life improvement. The ultimate goal is to achieve physical interaction, where handing over an object plays a crucial role in effective task accomplishment. Considerable research has been carried out in this field in recent years, and several solutions have already been proposed. Nonetheless, some particular issues regarding Human-Robot Collaboration still leave an open path to truly important research improvements. This paper provides a literature overview, defining the HRC concept, enumerating the distinct human-robot communication channels, and discussing the physical interaction that this collaboration entails. Moreover, future challenges for a natural and intuitive collaboration are exposed: the machine must behave like a human, especially in the pre-grasping/grasping phases, and the handover procedure should be fluent and bidirectional for an articulated function development. These are the focus of near-future investigation aiming to shed light on the complex combination of predictive and reactive control mechanisms promoting coordination and understanding. Following recent progress in artificial intelligence, the exploration of learning stands as the key element allowing the generation of coordinated actions and their shaping by experience.
Collapse
|
142
|
Noel JP, Caziot B, Bruni S, Fitzgerald NE, Avila E, Angelaki DE. Supporting generalization in non-human primate behavior by tapping into structural knowledge: Examples from sensorimotor mappings, inference, and decision-making. Prog Neurobiol 2021; 201:101996. [PMID: 33454361 PMCID: PMC8096669 DOI: 10.1016/j.pneurobio.2021.101996] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2020] [Revised: 12/15/2020] [Accepted: 01/12/2021] [Indexed: 02/05/2023]
Abstract
The complex behaviors we ultimately wish to understand are far from those currently used in systems neuroscience laboratories. A salient difference is the closed loop between action and perception that is prominently present in natural but not laboratory behaviors. The framework of reinforcement learning and control naturally spans action and perception, and thus is poised to inform the neurosciences of tomorrow, not only as a data-analysis and modeling framework, but also in guiding experimental design. We argue that this theoretical framework emphasizes active sensing, dynamical planning, and the leveraging of structural regularities as key operations for intelligent behavior within uncertain, time-varying environments. Similarly, we argue that we may study natural task strategies and their neural circuits without over-training animals when the tasks we use tap into our animals' structural knowledge. As proof-of-principle, we teach animals to navigate through a virtual environment - i.e., explore a well-defined and repetitive structure governed by the laws of physics - using a joystick. Once these animals have learned to 'drive', without further training they naturally (i) show zero- or one-shot learning of novel sensorimotor contingencies, (ii) infer the evolving path of dynamically changing latent variables, and (iii) make decisions consistent with maximizing reward rate. Such task designs allow for the study of flexible and generalizable, yet controlled, behaviors. In turn, they allow for the exploitation of pillars of intelligence (flexibility, prediction, and generalization), properties whose neural underpinnings have remained elusive.
Collapse
Affiliation(s)
- Jean-Paul Noel
- Center for Neural Science, New York University, New York, USA
| | - Baptiste Caziot
- Center for Neural Science, New York University, New York, USA
| | - Stefania Bruni
- Center for Neural Science, New York University, New York, USA
| | | | - Eric Avila
- Center for Neural Science, New York University, New York, USA
| | - Dora E Angelaki
- Center for Neural Science, New York University, New York, USA; Tandon School of Engineering, New York University, New York, USA.
| |
Collapse
|
143
|
Zhang T, Mo H. Reinforcement learning for robot research: A comprehensive review and open issues. INT J ADV ROBOT SYST 2021. [DOI: 10.1177/17298814211007305] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Applying the learning mechanisms of natural living beings to endow intelligent robots with humanoid perception and decision-making wisdom has become an important force in promoting the revolution of science and technology in robot domains. Advances in reinforcement learning (RL) over the past decades have made robotics highly automated and intelligent, ensuring safe operation instead of manual work and enabling more intelligence for many challenging tasks. As an important branch of machine learning, RL can realize sequential decision-making under uncertainty through end-to-end learning and has made a series of significant breakthroughs in robot applications. In this review article, we cover RL algorithms from theoretical background to advanced learning policies in different domains, accelerating the solution of practical problems in robotics. The challenges, open issues, and our thoughts on future research directions of RL are also presented, with the objective of discovering new research areas and motivating new interest.
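The sequential decision-making under uncertainty that RL formalizes can be reduced to a single tabular Q-learning update, the building block beneath most of the algorithms the review covers. A generic textbook sketch, not tied to any particular robot system:

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next]) if s_next is not None else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q


# Two states, two actions; one rewarding transition from state 0
# via action 1 raises only the corresponding Q-value.
Q = {0: [0.0, 0.0], 1: [0.0, 0.0]}
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
```

Repeating such updates along experienced trajectories propagates reward information backward through the state space, which is how end-to-end sequential decision-making emerges from purely local learning rules.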
Collapse
Affiliation(s)
- Tengteng Zhang
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| | - Hongwei Mo
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| |
Collapse
|
144
|
Palidis DJ, McGregor HR, Vo A, MacDonald PA, Gribble PL. Null effects of levodopa on reward- and error-based motor adaptation, savings, and anterograde interference. J Neurophysiol 2021; 126:47-67. [PMID: 34038228 DOI: 10.1152/jn.00696.2020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Dopamine signaling is thought to mediate reward-based learning. We tested for a role of dopamine in motor adaptation by administering the dopamine precursor levodopa to healthy participants in two experiments involving reaching movements. Levodopa has been shown to impair reward-based learning in cognitive tasks. Thus, we hypothesized that levodopa would selectively impair aspects of motor adaptation that depend on the reinforcement of rewarding actions. In the first experiment, participants performed two separate tasks in which adaptation was driven either by visual error-based feedback of the hand position or binary reward feedback. We used EEG to measure event-related potentials evoked by task feedback. We hypothesized that levodopa would specifically diminish adaptation and the neural responses to feedback in the reward learning task. However, levodopa did not affect motor adaptation in either task, nor did it diminish event-related potentials elicited by reward outcomes. In the second experiment, participants learned to compensate for mechanical force field perturbations applied to the hand during reaching. Previous exposure to a particular force field can result in savings during subsequent adaptation to the same force field or interference during adaptation to an opposite force field. We hypothesized that levodopa would diminish savings and anterograde interference, as previous work suggests that these phenomena result from a reinforcement learning process. However, we found no reliable effects of levodopa. These results suggest that reward-based motor adaptation, savings, and interference may not depend on the same dopaminergic mechanisms that have been shown to be disrupted by levodopa during various cognitive tasks. NEW & NOTEWORTHY Motor adaptation relies on multiple processes including reinforcement of successful actions. Cognitive reinforcement learning is impaired by levodopa-induced disruption of dopamine function.
We administered levodopa to healthy adults who participated in multiple motor adaptation tasks. We found no effects of levodopa on any component of motor adaptation. This suggests that motor adaptation may not depend on the same dopaminergic mechanisms as cognitive forms of reinforcement learning that have been shown to be impaired by levodopa.
Collapse
Affiliation(s)
- Dimitrios J Palidis
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Graduate Program in Neuroscience, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Heather R McGregor
- Department of Applied Physiology and Kinesiology, University of Florida, Gainesville, Florida
| | - Andrew Vo
- Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montreal, Quebec, Canada
| | - Penny A MacDonald
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Department of Clinical Neurological Sciences, University of Western Ontario, London, Ontario, Canada
| | - Paul L Gribble
- Brain and Mind Institute, Western University, London, Ontario, Canada; Department of Psychology, Western University, London, Ontario, Canada; Department of Physiology and Pharmacology, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Haskins Laboratories, New Haven, Connecticut
| |
Collapse
|
145
|
Wang Z, Zhao W, Zhai A, He P, Wang D. DQN based single-pixel imaging. OPTICS EXPRESS 2021; 29:15463-15477. [PMID: 33985246 DOI: 10.1364/oe.422636] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/15/2021] [Accepted: 04/28/2021] [Indexed: 06/12/2023]
Abstract
For orthogonal transform based single-pixel imaging (OT-SPI), the usual way to accelerate imaging while sacrificing as little imaging quality as possible is to plan the sampling path by hand, optimizing the sampling strategy based on the characteristics of the orthogonal transform. Here, we propose an optimized sampling method using a Deep Q-learning Network (DQN), which treats the sampling process as decision-making and the improvement of the reconstructed image as feedback, to obtain a relatively optimal sampling strategy for an OT-SPI. We verify the effectiveness of the method through simulations and experiments. Thanks to the DQN, the proposed single-pixel imaging technique is capable of obtaining an optimal sampling strategy directly, and therefore requires no manual planning of the sampling path, which eliminates the influence of imperfect sampling-path planning on the imaging performance.
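The sampling-as-decision-making idea rests on the epsilon-greedy action selection at the heart of DQN-style agents: each sampling step either exploits the action the network currently values most or explores a random alternative. A generic sketch of that rule (the paper's network, state encoding, and reward are not reproduced here):

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick the next sampling action: explore with probability epsilon,
    otherwise exploit the highest predicted Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# With epsilon = 0 the choice is purely greedy.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

In a DQN the `q_values` list would come from a neural network evaluated on the current measurement state; epsilon is typically annealed from near 1 toward a small constant as training proceeds.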
Collapse
|
146
|
Robot navigation as hierarchical active inference. Neural Netw 2021; 142:192-204. [PMID: 34022669 DOI: 10.1016/j.neunet.2021.05.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2021] [Revised: 03/30/2021] [Accepted: 05/06/2021] [Indexed: 12/14/2022]
Abstract
Localization and mapping has long been an area of research both in neuroscience, to understand how mammals navigate their environment, and in robotics, to enable autonomous mobile robots. In this paper, we treat navigation as inferring actions that minimize (expected) variational free energy under a hierarchical generative model. We find that familiar concepts such as perception, path integration, localization, and mapping emerge naturally from this active inference formulation. Moreover, we show that the model is consistent with models of hippocampal function and can be implemented in silico on a real-world robot. Our experiments illustrate that a robot equipped with our hierarchical model generates topologically consistent maps and infers correct navigation behaviour when given a goal location.
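The variational free energy minimized in active inference has a standard form (this is the textbook definition, not an equation reproduced from the paper): for observations $o$, latent states $s$, generative model $p(o, s)$, and approximate posterior $q(s)$,

```latex
F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  = \underbrace{D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big]}_{\ge 0} \;-\; \ln p(o)
```

Minimizing $F$ therefore simultaneously fits beliefs $q(s)$ to the true posterior (perception, localization) and maximizes model evidence $\ln p(o)$; selecting actions that minimize the *expected* free energy of future observations yields goal-directed navigation.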
Collapse
|
147
|
Liu X, Shen X, Chen S, Zhang X, Huang Y, Wang Y, Wang Y. Hierarchical Dynamical Model for Multiple Cortical Neural Decoding. Neural Comput 2021; 33:1372-1401. [PMID: 34496393 DOI: 10.1162/neco_a_01380] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Accepted: 12/14/2020] [Indexed: 11/04/2022]
Abstract
Motor brain-machine interfaces (BMIs) interpret neural activity from motor-related cortical areas into movement commands to control a prosthesis. As a subject adapts to control the neural prosthesis, the medial prefrontal cortex (mPFC), upstream of the primary motor cortex (M1), is heavily involved in reward-guided motor learning. Considering mPFC and M1 within a hierarchical structure could therefore improve the effectiveness of BMI decoding while subjects are learning. The commonly used Kalman decoder, with only a single simple state model, may not represent the multiple brain states that evolve over time and along the neural pathway. In addition, the performance of Kalman decoders degrades in heavy-tailed, non-Gaussian noise, which typically arises from the nonlinearity of the neural system or from movement-related artifacts in online neural recording. In this letter, we propose a hierarchical model that represents brain states from multiple cortical areas evolving along the neural pathway, and we introduce correntropy theory into the hierarchical structure to address the heavy-tailed noise present in neural recordings. We test the proposed algorithm on in vivo recordings from the mPFC and M1 of two rats learning a lever-pressing task. Compared with the classic Kalman filter, our results demonstrate better movement-decoding performance, owing to the hierarchical structure, which integrates past failed-trial information across multisite recordings, and to the correntropy criterion, which handles noisy, heavy-tailed neural recordings.
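The correntropy criterion mentioned above replaces squared-error terms with a Gaussian-kernel similarity, which bounds the influence of heavy-tailed outliers. A minimal sketch of the sample estimator (illustrative only, not the authors' decoder):

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample estimate of correntropy between two signals: the mean of a
    Gaussian kernel applied to their pointwise differences. Unlike squared
    error, each term is bounded in (0, 1], so a single huge outlier cannot
    dominate the criterion."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.mean(np.exp(-d**2 / (2.0 * sigma**2)))
```

Maximizing correntropy between predictions and observations (rather than minimizing squared error) is what makes a Kalman-style update robust to the heavy-tailed noise the abstract describes; the kernel width `sigma` controls how aggressively large errors are discounted.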
Collapse
Affiliation(s)
- Xi Liu
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| | - Xiang Shen
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| | - Shuhang Chen
- Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| | - Xiang Zhang
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| | - Yifan Huang
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| | - Yueming Wang
- Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou 310027, China, and Zhejiang Lab, Hangzhou 311121, China
| | - Yiwen Wang
- Department of Electronic and Computer Engineering and Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR 999077, China
| |
Collapse
|
148
|
Bermudez-Contreras E. Deep reinforcement learning to study spatial navigation, learning and memory in artificial and biological agents. BIOLOGICAL CYBERNETICS 2021; 115:131-134. [PMID: 33564968 DOI: 10.1007/s00422-021-00862-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Accepted: 01/19/2021] [Indexed: 06/12/2023]
Abstract
Despite the recent advances and popularity of deep learning driven by numerous industrial applications, artificial neural networks (ANNs) still lack crucial features of their biological counterparts that could improve their performance and their potential to advance our understanding of how the brain works. One proposed avenue for changing this is to strengthen the interaction between artificial intelligence (AI) research and neuroscience. Since their historical beginnings, ANNs and AI in general have developed in close alignment with both neuroscience and psychology. Alongside deep learning, reinforcement learning (RL) is another approach strongly linking AI and neuroscience in the effort to understand how learning is implemented in the brain. In a recently published article, Botvinick et al. (Neuron, 107:603-616, 2020) explain why deep reinforcement learning (DRL) is important for neuroscience as a framework to study learning, representations, and decision making. Here, I summarise Botvinick et al.'s main arguments and frame them in the context of the study of learning, memory, and spatial navigation. I believe that applying this approach to the study of spatial navigation can provide useful insights into how the brain builds, processes, and stores representations of the outside world to extract knowledge.
Collapse
|
149
|
Raman DV, O'Leary T. Frozen algorithms: how the brain's wiring facilitates learning. Curr Opin Neurobiol 2021; 67:207-214. [PMID: 33508698 PMCID: PMC8202511 DOI: 10.1016/j.conb.2020.12.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 12/21/2020] [Accepted: 12/30/2020] [Indexed: 12/03/2022]
Abstract
Synapses and neural connectivity are plastic and shaped by experience. But to what extent does connectivity itself influence a neural circuit's ability to learn? Insights from optimization theory and AI shed light on how learning can be implemented in neural circuits. Though abstract in nature, learning algorithms provide a principled set of hypotheses about the ingredients necessary for learning in neural circuits. These include the kinds of signals and circuit motifs that enable learning from experience, as well as an appreciation of the constraints that make learning challenging in a biological setting. Remarkably, some simple connectivity patterns can boost the efficiency of relatively crude learning rules, showing how the brain can use anatomy to compensate for the biological constraints of known synaptic plasticity mechanisms. Modern connectomics provides rich data for exploring this principle and may reveal how brain connectivity is constrained by the requirement to learn efficiently.
Collapse
Affiliation(s)
- Dhruva V Raman
- Department of Engineering, University of Cambridge, United Kingdom
| | - Timothy O'Leary
- Department of Engineering, University of Cambridge, United Kingdom
| |
Collapse
|
150
|
Starkweather CK, Uchida N. Dopamine signals as temporal difference errors: recent advances. Curr Opin Neurobiol 2021; 67:95-105. [PMID: 33186815 PMCID: PMC8107188 DOI: 10.1016/j.conb.2020.08.014] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2020] [Revised: 08/24/2020] [Accepted: 08/26/2020] [Indexed: 11/28/2022]
Abstract
In the brain, dopamine is thought to drive reward-based learning by signaling temporal difference reward prediction errors (TD errors), a 'teaching signal' used to train computers. Recent optogenetic manipulation studies have provided multiple lines of evidence that phasic dopamine signals function as TD errors. Furthermore, novel experimental results indicate that when the current state of the environment is uncertain, dopamine neurons compute TD errors using 'belief states', probability distributions over potential states. How belief states are computed remains unclear, but emerging evidence suggests involvement of the prefrontal cortex and the hippocampus. These results refine our understanding of dopamine's role in learning and of the algorithms by which dopamine functions in the brain.
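The TD error discussed in this abstract has a standard form, delta = r + gamma*V(s') - V(s), and the belief-state variant simply takes the value expectations over a distribution across possible states. A minimal sketch (standard definitions, not code from the paper):

```python
import numpy as np

def td_error(reward, value_next, value_curr, gamma=0.9):
    """Classic TD reward prediction error: delta = r + gamma*V(s') - V(s)."""
    return reward + gamma * value_next - value_curr

def td_error_belief(reward, values, belief_next, belief_curr, gamma=0.9):
    """TD error under state uncertainty: V is evaluated as an expectation
    over a belief state (a probability distribution over possible states)."""
    v_next = np.dot(belief_next, values)
    v_curr = np.dot(belief_curr, values)
    return reward + gamma * v_next - v_curr
```

When the belief distributions collapse onto single states, `td_error_belief` reduces to the classic `td_error`, which is why the belief-state account is a strict generalization rather than a competing model.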
Collapse
Affiliation(s)
- Clara Kwon Starkweather
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
| | - Naoshige Uchida
- Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
| |
Collapse
|