1. Zheng WL, Wu Z, Hummos A, Yang GR, Halassa MM. Rapid context inference in a thalamocortical model using recurrent neural networks. Nat Commun 2024; 15:8275. [PMID: 39333467 PMCID: PMC11436643 DOI: 10.1038/s41467-024-52289-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 08/29/2024] [Indexed: 09/29/2024] Open
Abstract
Cognitive flexibility is a fundamental ability that enables humans and animals to exhibit appropriate behaviors in various contexts. The thalamocortical interactions between the prefrontal cortex (PFC) and the mediodorsal thalamus (MD) have been identified as crucial for inferring temporal context, a critical component of cognitive flexibility. However, the neural mechanism responsible for context inference remains unknown. To address this issue, we propose a PFC-MD neural circuit model that utilizes a Hebbian plasticity rule to support rapid, online context inference. Specifically, the model MD thalamus can infer temporal contexts from prefrontal inputs within a few trials. This is achieved through the use of PFC-to-MD synaptic plasticity with pre-synaptic traces and adaptive thresholding, along with winner-take-all normalization in the MD. Furthermore, our model thalamus gates context-irrelevant neurons in the PFC, thus facilitating continual learning. We evaluate our model performance by having it sequentially learn various cognitive tasks. Incorporating an MD-like component alleviates catastrophic forgetting of previously learned contexts and demonstrates the transfer of knowledge to future contexts. Our work provides insight into how biological properties of thalamocortical circuits can be leveraged to achieve rapid context inference and continual learning.
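The ingredients named in this abstract (presynaptic traces, an adaptive threshold, and winner-take-all competition in the MD) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the leaky-average trace, the mean-trace threshold, the winner-only update, the clipping of weights to [0, 1], and all parameter values are assumptions chosen for illustration.

```python
def winner_take_all(drive, k=1):
    """Binary output: 1.0 for the k most strongly driven units, 0.0 elsewhere."""
    top = sorted(range(len(drive)), key=lambda i: drive[i], reverse=True)[:k]
    return [1.0 if i in top else 0.0 for i in range(len(drive))]


def hebbian_step(w, trace, pfc, decay=0.5, lr=0.1):
    """One online update of toy PFC-to-MD weights w[md_unit][pfc_unit]."""
    # Presynaptic trace: leaky running average of PFC activity (assumed form).
    trace = [decay * t + (1.0 - decay) * x for t, x in zip(trace, pfc)]
    # MD units compete on their trace-weighted input.
    drive = [sum(wij * t for wij, t in zip(row, trace)) for row in w]
    md = winner_take_all(drive)
    # Adaptive threshold: mean of the current trace (assumed form).
    theta = sum(trace) / len(trace)
    # Hebbian rule: only the winning MD unit (post = 1) changes its weights,
    # strengthening inputs above threshold and weakening those below it.
    new_w = [
        [min(1.0, max(0.0, wij + lr * post * (t - theta)))
         for wij, t in zip(row, trace)]
        for row, post in zip(w, md)
    ]
    return new_w, trace, md


def run_context(w, trace, pfc, steps):
    """Present the same PFC pattern for several steps; return the final state."""
    md = None
    for _ in range(steps):
        w, trace, md = hebbian_step(w, trace, pfc)
    return w, trace, md


# Two hypothetical contexts that activate disjoint PFC populations.
w = [[0.5] * 4 for _ in range(2)]
trace = [0.0] * 4
w, trace, md_a = run_context(w, trace, [1, 1, 0, 0], 20)
w, trace, md_b = run_context(w, trace, [0, 0, 1, 1], 20)
print(md_a, md_b)  # prints [1.0, 0.0] [0.0, 1.0]
```

In this toy run a different MD unit ends up winning in each context, which is the qualitative behavior the abstract describes; the number of steps needed to switch depends entirely on the assumed parameters.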
Affiliation(s)
- Wei-Long Zheng
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
  - Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China
  - Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Zhongxuan Wu
  - Department of Neuroscience, The University of Texas at Austin, Austin, TX, USA
- Ali Hummos
  - Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Guangyu Robert Yang
  - Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Altera.AL, Inc., Menlo Park, CA, USA
- Michael M Halassa
  - Department of Neuroscience, Tufts University School of Medicine, Boston, MA, USA
  - Department of Psychiatry, Tufts University School of Medicine, Boston, MA, USA
2. Lu Q, Nguyen TT, Zhang Q, Hasson U, Griffiths TL, Zacks JM, Gershman SJ, Norman KA. Reconciling shared versus context-specific information in a neural network model of latent causes. Sci Rep 2024; 14:16782. [PMID: 39039131 PMCID: PMC11263346 DOI: 10.1038/s41598-024-64272-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 06/06/2024] [Indexed: 07/24/2024] Open
Abstract
It has been proposed that, when processing a stream of events, humans divide their experiences in terms of inferred latent causes (LCs) to support context-dependent learning. However, when shared structure is present across contexts, it is still unclear how the "splitting" of LCs and learning of shared structure can be simultaneously achieved. Here, we present the Latent Cause Network (LCNet), a neural network model of LC inference. Through learning, it naturally stores structure that is shared across tasks in the network weights. Additionally, it represents context-specific structure using a context module, controlled by a Bayesian nonparametric inference algorithm, which assigns a unique context vector for each inferred LC. Across three simulations, we found that LCNet could (1) extract shared structure across LCs in a function learning task while avoiding catastrophic interference, (2) capture human data on curriculum effects in schema learning, and (3) infer the underlying event structure when processing naturalistic videos of daily events. Overall, these results demonstrate a computationally feasible approach to reconciling shared structure and context-specific structure in a model of LCs that is scalable from laboratory experiment settings to naturalistic settings.
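The abstract does not say which nonparametric prior drives the context module. One prior commonly used in latent-cause models is the Chinese restaurant process (CRP), so the sketch below uses it as an assumption to show how an event can be assigned either to an existing latent cause or to a new one. The function names, the concentration parameter `alpha`, and the externally supplied likelihoods are all hypothetical; this is not LCNet's inference procedure.

```python
def crp_prior(counts, alpha=1.0):
    """CRP prior over the next assignment.

    counts[k] is the number of past events assigned to latent cause k.
    The returned list has one extra final entry: the probability of
    opening a new latent cause.
    """
    total = sum(counts) + alpha
    return [n / total for n in counts] + [alpha / total]


def assign(counts, likelihoods, alpha=1.0):
    """Pick the most probable latent cause for one event.

    likelihoods must have len(counts) + 1 entries: how well each existing
    cause, and a brand-new cause, explains the current event.
    """
    prior = crp_prior(counts, alpha)
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    posterior = [u / z for u in unnorm]
    best = max(range(len(posterior)), key=lambda i: posterior[i])
    new_counts = list(counts)
    if best == len(counts):
        new_counts.append(1)      # a new latent cause is created
    else:
        new_counts[best] += 1     # an existing latent cause is reused
    return best, new_counts, posterior


# An event that existing causes explain poorly gets a new latent cause.
best, counts, _ = assign([3, 1], [0.1, 0.1, 0.9])
print(best, counts)  # prints 2 [3, 1, 1]
```

In a model like the one described, each index returned by such a procedure would select its own context vector, while the shared structure stays in the network weights.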
Affiliation(s)
- Qihong Lu
  - Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, USA
- Tan T Nguyen
  - Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, USA
- Qiong Zhang
  - Department of Psychology and Department of Computer Science, Rutgers University, New Brunswick, USA
- Uri Hasson
  - Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, USA
- Thomas L Griffiths
  - Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, USA
  - Department of Computer Science, Princeton University, Princeton, USA
- Jeffrey M Zacks
  - Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, USA
- Samuel J Gershman
  - Department of Psychology and Center for Brain Science, Harvard University, Cambridge, USA
- Kenneth A Norman
  - Department of Psychology and Princeton Neuroscience Institute, Princeton University, Princeton, USA
3. Driscoll LN, Shenoy K, Sussillo D. Flexible multitask computation in recurrent networks utilizes shared dynamical motifs. Nat Neurosci 2024; 27:1349-1363. [PMID: 38982201 PMCID: PMC11239504 DOI: 10.1038/s41593-024-01668-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 04/26/2024] [Indexed: 07/11/2024]
Abstract
Flexible computation is a hallmark of intelligent behavior. However, little is known about how neural networks contextually reconfigure for different computations. In the present work, we identified an algorithmic neural substrate for modular computation through the study of multitasking artificial recurrent neural networks. Dynamical systems analyses revealed learned computational strategies mirroring the modular subtask structure of the training task set. Dynamical motifs, which are recurring patterns of neural activity that implement specific computations through dynamics, such as attractors, decision boundaries and rotations, were reused across tasks. For example, tasks requiring memory of a continuous circular variable repurposed the same ring attractor. We showed that dynamical motifs were implemented by clusters of units when the unit activation function was restricted to be positive. Cluster lesions caused modular performance deficits. Motifs were reconfigured for fast transfer learning after an initial phase of learning. This work establishes dynamical motifs as a fundamental unit of compositional computation, intermediate between neuron and network. As whole-brain studies simultaneously record activity from multiple specialized systems, the dynamical motif framework will guide questions about specialization and generalization.
Affiliation(s)
- Laura N Driscoll
  - Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Krishna Shenoy
  - Department of Electrical Engineering, Stanford University, Stanford, CA, USA
  - Department of Neurosurgery, Stanford University, Stanford, CA, USA
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - Department of Neurobiology, Stanford University, Stanford, CA, USA
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
  - Bio-X Institute, Stanford University, Stanford, CA, USA
  - Howard Hughes Medical Institute at Stanford University, Stanford, CA, USA
- David Sussillo
  - Department of Electrical Engineering, Stanford University, Stanford, CA, USA
  - Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
4. Saxena R, McNaughton BL. Bridging Neuroscience and AI: Environmental Enrichment as a Model for Forward Knowledge Transfer. arXiv 2024:arXiv:2405.07295v2. [PMID: 38947919 PMCID: PMC11213130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Continual learning (CL) refers to an agent's capability to learn from a continuous stream of data and transfer knowledge without forgetting old information. One crucial aspect of CL is forward transfer, i.e., improved and faster learning on a new task by leveraging information from prior knowledge. While this ability comes naturally to biological brains, it poses a significant challenge for artificial intelligence (AI). Here, we suggest that environmental enrichment (EE) can be used as a biological model for studying forward transfer, inspiring human-like AI development. EE refers to animal studies that enhance cognitive, social, motor, and sensory stimulation and is a model for what, in humans, is referred to as 'cognitive reserve'. Enriched animals show significant improvement in learning speed and performance on new tasks, typically exhibiting forward transfer. We explore anatomical, molecular, and neuronal changes post-EE and discuss how artificial neural networks (ANNs) can be used to predict neural computation changes after enriched experiences. Finally, we provide a synergistic way of combining neuroscience and AI research that paves the path toward developing AI capable of rapid and efficient new task learning.
Affiliation(s)
- Rajat Saxena
  - Department of Neurobiology and Behavior, University of California, Irvine, Irvine, CA 92697, USA
- Bruce L McNaughton
  - Department of Neurobiology and Behavior, University of California, Irvine, Irvine, CA 92697, USA
  - Canadian Centre for Behavioural Neuroscience, University of Lethbridge, Lethbridge, AB, T1K 3M4 Canada
5. Losey DM, Hennig JA, Oby ER, Golub MD, Sadtler PT, Quick KM, Ryu SI, Tyler-Kabara EC, Batista AP, Yu BM, Chase SM. Learning leaves a memory trace in motor cortex. Curr Biol 2024; 34:1519-1531.e4. [PMID: 38531360 PMCID: PMC11097210 DOI: 10.1016/j.cub.2024.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Revised: 12/06/2023] [Accepted: 03/04/2024] [Indexed: 03/28/2024]
Abstract
How are we able to learn new behaviors without disrupting previously learned ones? To understand how the brain achieves this, we used a brain-computer interface (BCI) learning paradigm, which enables us to detect the presence of a memory of one behavior while performing another. We found that learning to use a new BCI map altered the neural activity that monkeys produced when they returned to using a familiar BCI map in a way that was specific to the learning experience. That is, learning left a "memory trace" in the primary motor cortex. This memory trace coexisted with proficient performance under the familiar map, primarily by altering neural activity in dimensions that did not impact behavior. Forming memory traces might be how the brain is able to provide for the joint learning of multiple behaviors without interference.
Affiliation(s)
- Darby M Losey
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Jay A Hennig
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Emily R Oby
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Matthew D Golub
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
- Patrick T Sadtler
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Kristin M Quick
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Stephen I Ryu
  - Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Department of Neurosurgery, Palo Alto Medical Foundation, Palo Alto, CA 94301, USA
- Elizabeth C Tyler-Kabara
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Physical Medicine and Rehabilitation, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA; Department of Neurosurgery, Dell Medical School, University of Texas at Austin, Austin, TX 78712, USA
- Aaron P Batista
  - Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Byron M Yu
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Steven M Chase
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Center for the Neural Basis of Cognition, Pittsburgh, PA 15213, USA; Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
6. Zhu D, Bu Q, Zhu Z, Zhang Y, Wang Z. Advancing autonomy through lifelong learning: a survey of autonomous intelligent systems. Front Neurorobot 2024; 18:1385778. [PMID: 38644905 PMCID: PMC11027131 DOI: 10.3389/fnbot.2024.1385778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 03/25/2024] [Indexed: 04/23/2024] Open
Abstract
The combination of lifelong learning algorithms with autonomous intelligent systems (AIS) is gaining popularity due to its ability to enhance AIS performance, but the existing summaries in related fields are insufficient. Therefore, it is necessary to systematically analyze the research on lifelong learning algorithms with autonomous intelligent systems, aiming to gain a better understanding of the current progress in this field. This paper presents a thorough review and analysis of the relevant work on the integration of lifelong learning algorithms and autonomous intelligent systems. Specifically, we investigate the diverse applications of lifelong learning algorithms in AIS's domains such as autonomous driving, anomaly detection, robots, and emergency management, while assessing their impact on enhancing AIS performance and reliability. The challenging problems encountered in lifelong learning for AIS are summarized based on a profound understanding in literature review. The advanced and innovative development of lifelong learning algorithms for autonomous intelligent systems are discussed for offering valuable insights and guidance to researchers in this rapidly evolving field.
Affiliation(s)
- Dekang Zhu
  - College of Electronic and Information Engineering, Tongji University, Shanghai, China
- Qianyi Bu
  - College of Science and Engineering, University of Glasgow, Glasgow, United Kingdom
- Zhongpan Zhu
  - College of Electronic and Information Engineering, Tongji University, Shanghai, China
  - College of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Yujie Zhang
  - College of Electronic and Information Engineering, Tongji University, Shanghai, China
- Zhipeng Wang
  - College of Electronic and Information Engineering, Tongji University, Shanghai, China
7. Jiang LP, Rao RPN. Dynamic predictive coding: A model of hierarchical sequence learning and prediction in the neocortex. PLoS Comput Biol 2024; 20:e1011801. [PMID: 38330098 PMCID: PMC10880975 DOI: 10.1371/journal.pcbi.1011801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 02/21/2024] [Accepted: 01/04/2024] [Indexed: 02/10/2024] Open
Abstract
We introduce dynamic predictive coding, a hierarchical model of spatiotemporal prediction and sequence learning in the neocortex. The model assumes that higher cortical levels modulate the temporal dynamics of lower levels, correcting their predictions of dynamics using prediction errors. As a result, lower levels form representations that encode sequences at shorter timescales (e.g., a single step) while higher levels form representations that encode sequences at longer timescales (e.g., an entire sequence). We tested this model using a two-level neural network, where the top-down modulation creates low-dimensional combinations of a set of learned temporal dynamics to explain input sequences. When trained on natural videos, the lower-level model neurons developed space-time receptive fields similar to those of simple cells in the primary visual cortex while the higher-level responses spanned longer timescales, mimicking temporal response hierarchies in the cortex. Additionally, the network's hierarchical sequence representation exhibited both predictive and postdictive effects resembling those observed in visual motion processing in humans (e.g., in the flash-lag illusion). When coupled with an associative memory emulating the role of the hippocampus, the model allowed episodic memories to be stored and retrieved, supporting cue-triggered recall of an input sequence similar to activity recall in the visual cortex. When extended to three hierarchical levels, the model learned progressively more abstract temporal representations along the hierarchy. Taken together, our results suggest that cortical processing and learning of sequences can be interpreted as dynamic predictive coding based on a hierarchical spatiotemporal generative model of the visual world.
Affiliation(s)
- Linxing Preston Jiang
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
  - Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
  - Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
- Rajesh P. N. Rao
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
  - Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
  - Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
8. Baronig M, Legenstein R. Context association in pyramidal neurons through local synaptic plasticity in apical dendrites. Front Neurosci 2024; 17:1276706. [PMID: 38357522 PMCID: PMC10864492 DOI: 10.3389/fnins.2023.1276706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 12/26/2023] [Indexed: 02/16/2024] Open
Abstract
The unique characteristics of neocortical pyramidal neurons are thought to be crucial for many aspects of information processing and learning in the brain. Experimental data suggests that their segregation into two distinct compartments, the basal dendrites close to the soma and the apical dendrites branching out from the thick apical dendritic tuft, plays an essential role in cortical organization. A recent hypothesis states that layer 5 pyramidal cells associate top-down contextual information arriving at their apical tuft with features of the sensory input that predominantly arrives at their basal dendrites. It has however remained unclear whether such context association could be established by synaptic plasticity processes. In this work, we formalize the objective of such context association learning through a mathematical loss function and derive a plasticity rule for apical synapses that optimizes this loss. The resulting plasticity rule utilizes information that is available either locally at the synapse, through branch-local NMDA spikes, or through global Ca²⁺ events, both of which have been observed experimentally in layer 5 pyramidal cells. We show in computer simulations that the plasticity rule enables pyramidal cells to associate top-down contextual input patterns with high somatic activity. Furthermore, it enables networks of pyramidal neuron models to perform context-dependent tasks and enables continual learning by allocating new dendritic branches to novel contexts.
Affiliation(s)
- Robert Legenstein
  - Institute of Theoretical Computer Science, Graz University of Technology, Graz, Austria
9. Voina D, Shea-Brown E, Mihalas S. A biologically inspired architecture with switching units can learn to generalize across backgrounds. Neural Netw 2023; 168:615-630. [PMID: 37839332 PMCID: PMC10843013 DOI: 10.1016/j.neunet.2023.09.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Revised: 08/24/2023] [Accepted: 09/07/2023] [Indexed: 10/17/2023]
Abstract
Humans and other animals navigate different environments effortlessly, their brains rapidly and accurately generalizing across contexts. Despite recent progress in deep learning, this flexibility remains a challenge for many artificial systems. Here, we show how a bio-inspired network motif can explicitly address this issue. We do this using a dataset of MNIST digits of varying transparency, set on one of two backgrounds of different statistics that define two contexts: a pixel-wise noise or a more naturalistic background from the CIFAR-10 dataset. After learning digit classification when both contexts are shown sequentially, we find that both shallow and deep networks have sharply decreased performance when returning to the first background - an instance of the catastrophic forgetting phenomenon known from continual learning. To overcome this, we propose the bottleneck-switching network or switching network for short. This is a bio-inspired architecture analogous to a well-studied network motif in the visual cortex, with additional "switching" units that are activated in the presence of a new background, assuming a priori a contextual signal to turn these units on or off. Intriguingly, only a few of these switching units are sufficient to enable the network to learn the new context without catastrophic forgetting through inhibition of redundant background features. Further, the bottleneck-switching network can generalize to novel contexts similar to contexts it has learned. Importantly, we find that - again as in the underlying biological network motif, recurrently connecting the switching units to network layers is advantageous for context generalization.
Affiliation(s)
- Doris Voina
  - Department of Applied Mathematics, Computational Neuroscience Center, University of Washington, Seattle, WA 98195, USA
- Eric Shea-Brown
  - Department of Applied Mathematics, Computational Neuroscience Center, University of Washington, Seattle, WA 98195, USA; Allen Institute for Brain Science, 615 Westlake Ave N, Seattle, WA 98109, USA
- Stefan Mihalas
  - Department of Applied Mathematics, Computational Neuroscience Center, University of Washington, Seattle, WA 98195, USA; Allen Institute for Brain Science, 615 Westlake Ave N, Seattle, WA 98109, USA
10. Lässig F, Aceituno PV, Sorbaro M, Grewe BF. Bio-inspired, task-free continual learning through activity regularization. Biol Cybern 2023; 117:345-361. [PMID: 37589728 PMCID: PMC10600047 DOI: 10.1007/s00422-023-00973-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 08/06/2023] [Indexed: 08/18/2023]
Abstract
The ability to sequentially learn multiple tasks without forgetting is a key skill of biological brains, whereas it represents a major challenge to the field of deep learning. To avoid catastrophic forgetting, various continual learning (CL) approaches have been devised. However, these usually require discrete task boundaries. This requirement seems biologically implausible and often limits the application of CL methods in the real world where tasks are not always well defined. Here, we take inspiration from neuroscience, where sparse, non-overlapping neuronal representations have been suggested to prevent catastrophic forgetting. As in the brain, we argue that these sparse representations should be chosen on the basis of feed forward (stimulus-specific) as well as top-down (context-specific) information. To implement such selective sparsity, we use a bio-plausible form of hierarchical credit assignment known as Deep Feedback Control (DFC) and combine it with a winner-take-all sparsity mechanism. In addition to sparsity, we introduce lateral recurrent connections within each layer to further protect previously learned representations. We evaluate the new sparse-recurrent version of DFC on the split-MNIST computer vision benchmark and show that only the combination of sparsity and intra-layer recurrent connections improves CL performance with respect to standard backpropagation. Our method achieves similar performance to well-known CL methods, such as Elastic Weight Consolidation and Synaptic Intelligence, without requiring information about task boundaries. Overall, we showcase the idea of adopting computational principles from the brain to derive new, task-free learning algorithms for CL.
Affiliation(s)
- Francesco Lässig
  - Institute of Neuroinformatics, University of Zürich and ETH, Zürich, Switzerland
- Martino Sorbaro
  - Institute of Neuroinformatics, University of Zürich and ETH, Zürich, Switzerland
  - AI Center, ETH, Zürich, Switzerland
- Benjamin F. Grewe
  - Institute of Neuroinformatics, University of Zürich and ETH, Zürich, Switzerland
  - AI Center, ETH, Zürich, Switzerland
11. Ma G, Jiang R, Wang L, Tang H. Dual memory model for experience-once task-incremental lifelong learning. Neural Netw 2023; 166:174-187. [PMID: 37494763 DOI: 10.1016/j.neunet.2023.07.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 06/21/2023] [Accepted: 07/07/2023] [Indexed: 07/28/2023]
Abstract
Experience replay (ER) is a widely-adopted neuroscience-inspired method to perform lifelong learning. Nonetheless, existing ER-based approaches consider very coarse memory modules with simple memory and rehearsal mechanisms that cannot fully exploit the potential of memory replay. Evidence from neuroscience has provided fine-grained memory and rehearsal mechanisms, such as the dual-store memory system consisting of PFC-HC circuits. However, the computational abstraction of these processes is still very challenging. To address these problems, we introduce the Dual-Memory (Dual-MEM) model emulating the memorization, consolidation, and rehearsal process in the PFC-HC dual-store memory circuit. Dual-MEM maintains an incrementally updated short-term memory to benefit current-task learning. At the end of the current task, short-term memories will be consolidated into long-term ones for future rehearsal to alleviate forgetting. For the Dual-MEM optimization, we propose two learning policies that emulate different memory retrieval strategies: Direct Retrieval Learning and Mixup Retrieval Learning. Extensive evaluations on eight benchmarks demonstrate that Dual-MEM delivers compelling performance while maintaining high learning and memory utilization efficiencies under the challenging experience-once setting.
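The short-term store, end-of-task consolidation, and the two retrieval styles mentioned in this abstract can be sketched with a toy buffer. This is an illustration under stated assumptions, not the Dual-MEM model: the class name, the capacities, the random eviction policy, and the use of standard mixup (a convex combination of two stored samples) for "mixup retrieval" are all hypothetical choices.

```python
import random
from collections import deque


class DualMemory:
    """Toy two-store replay buffer (hypothetical, for illustration only)."""

    def __init__(self, short_capacity=4, long_capacity=8, seed=0):
        # Short-term store: bounded, keeps only the most recent samples.
        self.short_term = deque(maxlen=short_capacity)
        # Long-term store: filled at task boundaries, used for rehearsal.
        self.long_term = []
        self.long_capacity = long_capacity
        self.rng = random.Random(seed)

    def observe(self, x, y):
        """Store one (features, label) sample seen during the current task."""
        self.short_term.append((list(x), y))

    def consolidate(self):
        """At the end of a task, move short-term samples into long-term memory."""
        self.long_term.extend(self.short_term)
        self.short_term.clear()
        # Assumed eviction policy: drop random items when over capacity.
        while len(self.long_term) > self.long_capacity:
            self.long_term.pop(self.rng.randrange(len(self.long_term)))

    def direct_retrieve(self):
        """Return one stored sample unchanged."""
        return self.rng.choice(self.long_term)

    def mixup_retrieve(self, lam=0.5):
        """Return a convex combination of two stored samples (standard mixup)."""
        (x1, y1) = self.rng.choice(self.long_term)
        (x2, y2) = self.rng.choice(self.long_term)
        x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
        # Soft label: each source label with its mixing weight.
        return x, [(y1, lam), (y2, 1.0 - lam)]


mem = DualMemory(short_capacity=2, long_capacity=3)
for i in range(3):
    mem.observe([float(i), 0.0], "task1")
mem.consolidate()
print(len(mem.short_term), len(mem.long_term))  # prints 0 2
```

Keeping the two stores separate means rehearsal draws only on consolidated samples, while the short-term store stays small and recent.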
Affiliation(s)
- Gehua Ma
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Runhao Jiang
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- Lang Wang
  - Department of Neurology of the First Affiliated Hospital, Interdisciplinary Institute of Neuroscience and Technology, Zhejiang University School of Medicine, Hangzhou, China
- Huajin Tang
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, China
12. Zhang T, Cheng X, Jia S, Li CT, Poo MM, Xu B. A brain-inspired algorithm that mitigates catastrophic forgetting of artificial and spiking neural networks with low computational cost. Sci Adv 2023; 9:eadi2947. [PMID: 37624895 PMCID: PMC10456855 DOI: 10.1126/sciadv.adi2947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Accepted: 07/27/2023] [Indexed: 08/27/2023]
Abstract
Neuromodulators in the brain act globally at many forms of synaptic plasticity, represented as metaplasticity, which is rarely considered by existing spiking (SNNs) and nonspiking artificial neural networks (ANNs). Here, we report an efficient brain-inspired computing algorithm for SNNs and ANNs, referred to here as neuromodulation-assisted credit assignment (NACA), which uses expectation signals to induce defined levels of neuromodulators to selective synapses, whereby the long-term synaptic potentiation and depression are modified in a nonlinear manner depending on the neuromodulator level. The NACA algorithm achieved high recognition accuracy with substantially reduced computational cost in learning spatial and temporal classification tasks. Notably, NACA was also verified as efficient for learning five different class continuous learning tasks with varying degrees of complexity, exhibiting a markedly mitigated catastrophic forgetting at low computational cost. Mapping synaptic weight changes showed that these benefits could be explained by the sparse and targeted synaptic modifications attributed to expectation-based global neuromodulation.
Affiliation(s)
- Tielin Zhang
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  - Shanghai Center for Brain Science and Brain-inspired Technology, Lingang Laboratory, Shanghai 200031, China
- Xiang Cheng
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
- Shuncheng Jia
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
- Chengyu T Li
  - Shanghai Center for Brain Science and Brain-inspired Technology, Lingang Laboratory, Shanghai 200031, China
  - Center for Excellence in Brain Science and Intelligence Technology, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai 200031, China
- Mu-ming Poo
  - Shanghai Center for Brain Science and Brain-inspired Technology, Lingang Laboratory, Shanghai 200031, China
  - Center for Excellence in Brain Science and Intelligence Technology, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai 200031, China
- Bo Xu
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
13
|
Wybo WAM, Tsai MC, Tran VAK, Illing B, Jordan J, Morrison A, Senn W. NMDA-driven dendritic modulation enables multitask representation learning in hierarchical sensory processing pathways. Proc Natl Acad Sci U S A 2023; 120:e2300558120. [PMID: 37523562 PMCID: PMC10410730 DOI: 10.1073/pnas.2300558120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/13/2023] [Accepted: 06/14/2023] [Indexed: 08/02/2023] Open
Abstract
While sensory representations in the brain depend on context, it remains unclear how such modulations are implemented at the biophysical level, and how processing layers further in the hierarchy can extract useful features for each possible contextual state. Here, we demonstrate that dendritic N-Methyl-D-Aspartate spikes can, within physiological constraints, implement contextual modulation of feedforward processing. Such neuron-specific modulations exploit prior knowledge, encoded in stable feedforward weights, to achieve transfer learning across contexts. In a network of biophysically realistic neuron models with context-independent feedforward weights, we show that modulatory inputs to dendritic branches can solve linearly nonseparable learning problems with a Hebbian, error-modulated learning rule. We also demonstrate that local prediction of whether representations originate either from different inputs, or from different contextual modulations of the same input, results in representation learning of hierarchical feedforward weights across processing layers that accommodate a multitude of contexts.
Affiliation(s)
- Willem A. M. Wybo
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
- Matthias C. Tsai
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
- Viet Anh Khoa Tran
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
  - Department of Computer Science - 3, Faculty 1, RWTH Aachen University, DE-52074 Aachen, Germany
- Bernd Illing
  - Laboratory of Computational Neuroscience, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Jakob Jordan
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
- Abigail Morrison
  - Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure–Function Relationships (INM-10), Jülich Research Center, DE-52428 Jülich, Germany
  - Department of Computer Science - 3, Faculty 1, RWTH Aachen University, DE-52074 Aachen, Germany
- Walter Senn
  - Department of Physiology, University of Bern, CH-3012 Bern, Switzerland
14
Heald JB, Wolpert DM, Lengyel M. The Computational and Neural Bases of Context-Dependent Learning. Annu Rev Neurosci 2023; 46:233-258. [PMID: 36972611 PMCID: PMC10348919 DOI: 10.1146/annurev-neuro-092322-100402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Flexible behavior requires the creation, updating, and expression of memories to depend on context. While the neural underpinnings of each of these processes have been intensively studied, recent advances in computational modeling revealed a key challenge in context-dependent learning that had been largely ignored previously: Under naturalistic conditions, context is typically uncertain, necessitating contextual inference. We review a theoretical approach to formalizing context-dependent learning in the face of contextual uncertainty and the core computations it requires. We show how this approach begins to organize a large body of disparate experimental observations, from multiple levels of brain organization (including circuits, systems, and behavior) and multiple brain regions (most prominently the prefrontal cortex, the hippocampus, and motor cortices), into a coherent framework. We argue that contextual inference may also be key to understanding continual learning in the brain. This theory-driven perspective places contextual inference as a core component of learning.
Affiliation(s)
- James B Heald
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Daniel M Wolpert
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Máté Lengyel
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
  - Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
15
Li C, Huang Z, Zou W, Huang H. Statistical mechanics of continual learning: Variational principle and mean-field potential. Phys Rev E 2023; 108:014309. [PMID: 37583230 DOI: 10.1103/physreve.108.014309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 06/30/2023] [Indexed: 08/17/2023]
Abstract
An obstacle to artificial general intelligence is set by continual learning of multiple tasks of a different nature. Recently, various heuristic tricks, both from machine learning and from neuroscience angles, were proposed, but they lack a unified theory foundation. Here, we focus on continual learning in single-layered and multilayered neural networks of binary weights. A variational Bayesian learning setting is thus proposed in which the neural networks are trained in a field-space, rather than a gradient-ill-defined discrete-weight space, and furthermore, weight uncertainty is naturally incorporated, and it modulates synaptic resources among tasks. From a physics perspective, we translate variational continual learning into a Franz-Parisi thermodynamic potential framework, where previous task knowledge serves as a prior probability and a reference as well. We thus interpret the continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be analytically studied with mean-field order parameters, whose predictions coincide with numerical experiments using stochastic gradient descent methods. Based on the variational principle and Gaussian field approximation of internal preactivations in hidden layers, we also derive the learning algorithm considering weight uncertainty, which solves the continual learning with binary weights using multilayered neural networks, and performs better than the currently available metaplasticity algorithm in which binary synapses bear hidden continuous states and the synaptic plasticity is modulated by a heuristic regularization function. Our proposed principled frameworks also connect to elastic weight consolidation, weight-uncertainty modulated learning, and neuroscience-inspired metaplasticity, providing a theoretically grounded method for real-world multitask learning with deep networks.
Affiliation(s)
- Chan Li
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
- Zhenye Huang
  - CAS Key Laboratory for Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Wenxuan Zou
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
- Haiping Huang
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
  - Guangdong Provincial Key Laboratory of Magnetoelectric Physics and Devices, Sun Yat-sen University, Guangzhou 510275, China
16
Pagkalos M, Makarov R, Poirazi P. Leveraging dendritic properties to advance machine learning and neuro-inspired computing. ARXIV 2023:arXiv:2306.08007v1. [PMID: 37396619 PMCID: PMC10312913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
The brain is a remarkably capable and efficient system. It can process and store huge amounts of noisy and unstructured information using minimal energy. In contrast, current artificial intelligence (AI) systems require vast resources for training while still struggling to compete in tasks that are trivial for biological agents. Thus, brain-inspired engineering has emerged as a promising new avenue for designing sustainable, next-generation AI systems. Here, we describe how dendritic mechanisms of biological neurons have inspired innovative solutions for significant AI problems, including credit assignment in multilayer networks, catastrophic forgetting, and high energy consumption. These findings provide exciting alternatives to existing architectures, showing how dendritic research can pave the way for building more powerful and energy-efficient artificial learning systems.
Affiliation(s)
- Michalis Pagkalos
  - Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology Hellas (FORTH), Heraklion, 70013, Greece
  - Department of Biology, University of Crete, Heraklion, 70013, Greece
- Roman Makarov
  - Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology Hellas (FORTH), Heraklion, 70013, Greece
  - Department of Biology, University of Crete, Heraklion, 70013, Greece
- Panayiota Poirazi
  - Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology Hellas (FORTH), Heraklion, 70013, Greece
17
Geva N, Deitch D, Rubin A, Ziv Y. Time and experience differentially affect distinct aspects of hippocampal representational drift. Neuron 2023:S0896-6273(23)00378-1. [PMID: 37315556 DOI: 10.1016/j.neuron.2023.05.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 09/22/2022] [Revised: 02/22/2023] [Accepted: 05/08/2023] [Indexed: 06/16/2023]
Abstract
Hippocampal activity is critical for spatial memory. Within a fixed, familiar environment, hippocampal codes gradually change over timescales of days to weeks, a phenomenon known as representational drift. The passage of time and the amount of experience are two factors that profoundly affect memory. However, thus far, it has remained unclear to what extent these factors drive hippocampal representational drift. Here, we longitudinally recorded large populations of hippocampal neurons in mice while they repeatedly explored two different familiar environments that they visited at different time intervals over weeks. We found that time and experience differentially affected distinct aspects of representational drift: the passage of time drove changes in neuronal activity rates, whereas experience drove changes in the cells' spatial tuning. Changes in spatial tuning were context specific and largely independent of changes in activity rates. Thus, our results suggest that representational drift is a multi-faceted process governed by distinct neuronal mechanisms.
Affiliation(s)
- Nitzan Geva
  - Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel
- Daniel Deitch
  - Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel
- Alon Rubin
  - Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel.
- Yaniv Ziv
  - Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel.
18
Levy WB, Baxter RA. Growing dendrites enhance a neuron's computational power and memory capacity. Neural Netw 2023; 164:275-309. [PMID: 37163846 DOI: 10.1016/j.neunet.2023.04.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 04/13/2023] [Accepted: 04/18/2023] [Indexed: 05/12/2023]
Abstract
Neocortical pyramidal neurons have many dendrites, and such dendrites are capable of, in isolation from one another, generating a neuronal spike. It is also now understood that there is a large amount of dendritic growth during the first years of a human's life, arguably a period of prodigious learning. These observations inspire the construction of a local, stochastic algorithm based on an earlier stochastic, homeostatic, Hebbian developmental theory. Here we investigate the neurocomputational advantages and limits of this novel algorithm that combines dendritogenesis with supervised adaptive synaptogenesis. Neurons created with this algorithm have enhanced memory capacity, can avoid catastrophic interference (forgetting), and have the ability to unmix mixture distributions. In particular, individual dendrites develop within each class, in an unsupervised manner, to become feature-clusters that correspond to the mixing elements of class-conditional mixture distribution. Error-free classification is demonstrated with input perturbations up to 40%. Although discriminative problems are used to understand the capabilities of the stochastic algorithm and the neuronal connectivity it produces, the algorithm is in the generative class; it thus seems ideal for decisions that require generalization, i.e., extrapolation beyond previous learning.
Affiliation(s)
- William B Levy
  - Department of Neurosurgery, University of Virginia School of Medicine, Charlottesville, VA 22908, United States of America; Informed Simplifications, Earlysville, VA 22936, United States of America.
- Robert A Baxter
  - Department of Neurosurgery, University of Virginia School of Medicine, Charlottesville, VA 22908, United States of America; Baxter Adaptive Systems, Bedford, MA 01730, United States of America
19
Tang S, Yu X, Cheang CF, Ji X, Yu HH, Choi IC. CLELNet: A continual learning network for esophageal lesion analysis on endoscopic images. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 231:107399. [PMID: 36780717 DOI: 10.1016/j.cmpb.2023.107399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 01/03/2023] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE: A deep learning-based intelligent diagnosis system can significantly reduce the burden of endoscopists in the daily analysis of esophageal lesions. Considering the need to add new tasks in the diagnosis system, a deep learning model that can train a series of tasks incrementally using endoscopic images is essential for identifying the types and regions of esophageal lesions. METHOD: In this paper, we proposed a continual learning-based esophageal lesion network (CLELNet), in which a convolutional autoencoder was designed to extract representation features of endoscopic images among different esophageal lesions. The proposed CLELNet consists of shared layers and task-specific layers. Shared layers are used to extract common features among different lesions while task-specific layers can complete different tasks. The first two tasks trained by the CLELNet are the classification (task 1) and the segmentation (task 2). We collected a dataset of esophageal endoscopic images from Macau Kiang Wu Hospital for training and testing the CLELNet. RESULTS: The experimental results showed that the classification accuracy of task 1 was 95.96%, and the Intersection Over Union and the Dice Similarity Coefficient of task 2 were 65.66% and 78.08%, respectively. CONCLUSIONS: The proposed CLELNet can realize task-incremental learning without forgetting the previous tasks and thus become a useful computer-aided diagnosis system in esophageal lesions analysis.
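The two segmentation metrics reported in this abstract have standard definitions that a short sketch can make concrete. The Python below is illustrative only and is not taken from the CLELNet paper: the function name `iou_and_dice`, the nested-list mask format, and the empty-mask convention are all assumptions.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equally sized binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]

    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    iou = intersection / union                           # |A ∩ B| / |A ∪ B|
    dice = 2 * intersection / (pred_area + target_area)  # 2|A ∩ B| / (|A| + |B|)
    return iou, dice


# Toy 2x3 masks (hypothetical data, not from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, dice = iou_and_dice(pred, target)
print(iou, dice)
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU). Scores averaged over a test set need not satisfy that identity exactly, which is consistent with the reported pair of 65.66% and 78.08%.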
Affiliation(s)
- Suigu Tang
  - Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Xiaoyuan Yu
  - Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Chak Fong Cheang
  - Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR.
- Xiaoyu Ji
  - Faculty of Innovation Engineering-School of Computer Science and Engineering, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau SAR
- Hon Ho Yu
  - Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
- I Cheong Choi
  - Kiang Wu Hospital, Rua de Coelho do Amaral, Macau SAR
20
Zhang W, Li D, Ma C, Zhai G, Yang X, Ma K. Continual Learning for Blind Image Quality Assessment. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2864-2878. [PMID: 35635807 DOI: 10.1109/tpami.2022.3178874] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
The explosive growth of image data facilitates the fast development of image processing and computer vision methods for emerging visual applications, meanwhile introducing novel distortions to processed images. This poses a grand challenge to existing blind image quality assessment (BIQA) models, which are weak at adapting to subpopulation shift. Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets. However, this type of approach is not scalable to a large number of datasets and is cumbersome to incorporate a newly created dataset as well. In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data. We first identify five desiderata in the continual setting with three criteria to quantify the prediction accuracy, plasticity, and stability, respectively. We then propose a simple yet effective continual learning method for BIQA. Specifically, based on a shared backbone network, we add a prediction head for a new dataset and enforce a regularizer to allow all prediction heads to evolve with new data while being resistant to catastrophic forgetting of old data. We compute the overall quality score by a weighted summation of predictions from all heads. Extensive experiments demonstrate the promise of the proposed continual learning method in comparison to standard training techniques for BIQA, with and without experience replay. We made the code publicly available at https://github.com/zwx8981/BIQA_CL.
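The final step described in this abstract, combining per-dataset prediction heads by a weighted summation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear heads, the names `head_prediction` and `overall_quality`, and the normalization of weights by their sum are all hypothetical, and the abstract does not say how the weights are obtained.

```python
def head_prediction(features, head):
    """Score from one prediction head, modeled here as a linear layer (assumption)."""
    weights, bias = head
    return sum(w * f for w, f in zip(weights, features)) + bias


def overall_quality(features, heads, head_weights):
    """Weighted sum of all head predictions on shared backbone features.

    head_weights are assumed non-negative and are normalized to sum to 1.
    """
    total = sum(head_weights)
    if total <= 0:
        raise ValueError("head_weights must have a positive sum")
    predictions = [head_prediction(features, head) for head in heads]
    return sum(w * p for w, p in zip(head_weights, predictions)) / total


# Hypothetical shared features and two heads, one per previously seen dataset.
features = [0.5, 1.0]
heads = [([1.0, 2.0], 0.0),   # head for dataset 1
         ([0.0, 1.0], 1.0)]   # head for dataset 2
score = overall_quality(features, heads, head_weights=[3.0, 1.0])
print(score)
```

Adding a new dataset in this sketch means appending one more head and one more weight, leaving the shared feature extractor in place.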
21
Flesch T, Saxe A, Summerfield C. Continual task learning in natural and artificial agents. Trends Neurosci 2023; 46:199-210. [PMID: 36682991 PMCID: PMC10914671 DOI: 10.1016/j.tins.2022.12.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/10/2022] [Revised: 12/07/2022] [Accepted: 12/15/2022] [Indexed: 01/21/2023]
Abstract
How do humans and other animals learn new tasks? A wave of brain recording studies has investigated how neural representations change during task learning, with a focus on how tasks can be acquired and coded in ways that minimise mutual interference. We review recent work that has explored the geometry and dimensionality of neural task representations in neocortex, and computational models that have exploited these findings to understand how the brain may partition knowledge between tasks. We discuss how ideas from machine learning, including those that combine supervised and unsupervised learning, are helping neuroscientists understand how natural tasks are learned and coded in biological brains.
Affiliation(s)
- Timo Flesch
  - Department of Experimental Psychology, University of Oxford, Oxford, UK
- Andrew Saxe
  - Gatsby Computational Neuroscience Unit & Sainsbury Wellcome Centre, UCL, London, UK.
22
Riquelme JL, Hemberger M, Laurent G, Gjorgjieva J. Single spikes drive sequential propagation and routing of activity in a cortical network. eLife 2023; 12:e79928. [PMID: 36780217 PMCID: PMC9925052 DOI: 10.7554/elife.79928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 12/19/2022] [Indexed: 02/14/2023] Open
Abstract
Single spikes can trigger repeatable firing sequences in cortical networks. The mechanisms that support reliable propagation of activity from such small events and their functional consequences remain unclear. By constraining a recurrent network model with experimental statistics from turtle cortex, we generate reliable and temporally precise sequences from single spike triggers. We find that rare strong connections support sequence propagation, while dense weak connections modulate propagation reliability. We identify sections of sequences corresponding to divergent branches of strongly connected neurons which can be selectively gated. Applying external inputs to specific neurons in the sparse backbone of strong connections can effectively control propagation and route activity within the network. Finally, we demonstrate that concurrent sequences interact reliably, generating a highly combinatorial space of sequence activations. Our results reveal the impact of individual spikes in cortical circuits, detailing how repeatable sequences of activity can be triggered, sustained, and controlled during cortical computations.
Affiliation(s)
- Juan Luis Riquelme
  - Max Planck Institute for Brain Research, Frankfurt am Main, Germany
  - School of Life Sciences, Technical University of Munich, Freising, Germany
- Mike Hemberger
  - Max Planck Institute for Brain Research, Frankfurt am Main, Germany
- Gilles Laurent
  - Max Planck Institute for Brain Research, Frankfurt am Main, Germany
- Julijana Gjorgjieva
  - Max Planck Institute for Brain Research, Frankfurt am Main, Germany
  - School of Life Sciences, Technical University of Munich, Freising, Germany
23
Loeb GE. Remembrance of things perceived: Adding thalamocortical function to artificial neural networks. Front Integr Neurosci 2023; 17:1108271. [PMID: 36959924 PMCID: PMC10027940 DOI: 10.3389/fnint.2023.1108271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 02/13/2023] [Indexed: 03/09/2023] Open
Abstract
Recent research has illuminated the complexity and importance of the thalamocortical system but it has been difficult to identify what computational functions it performs. Meanwhile, deep-learning artificial neural networks (ANNs) based on bio-inspired models of purely cortical circuits have achieved surprising success solving sophisticated cognitive problems associated historically with human intelligence. Nevertheless, the limitations and shortcomings of artificial intelligence (AI) based on such ANNs are becoming increasingly clear. This review considers how the addition of thalamocortical connectivity and its putative functions related to cortical attention might address some of those shortcomings. Such bio-inspired models are now providing both testable theories of biological cognition and improved AI technology, much of which is happening outside the usual academic venues.
24
Flesch T, Nagy DG, Saxe A, Summerfield C. Modelling continual learning in humans with Hebbian context gating and exponentially decaying task signals. PLoS Comput Biol 2023; 19:e1010808. [PMID: 36656823 PMCID: PMC9851563 DOI: 10.1371/journal.pcbi.1010808] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 03/21/2022] [Accepted: 12/11/2022] [Indexed: 01/20/2023] Open
Abstract
Humans can learn several tasks in succession with minimal mutual interference but perform more poorly when trained on multiple tasks at once. The opposite is true for standard deep neural networks. Here, we propose novel computational constraints for artificial neural networks, inspired by earlier work on gating in the primate prefrontal cortex, that capture the cost of interleaved training and allow the network to learn two tasks in sequence without forgetting. We augment standard stochastic gradient descent with two algorithmic motifs, so-called "sluggish" task units and a Hebbian training step that strengthens connections between task units and hidden units that encode task-relevant information. We found that the "sluggish" units introduce a switch-cost during training, which biases representations under interleaved training towards a joint representation that ignores the contextual cue, while the Hebbian step promotes the formation of a gating scheme from task units to the hidden layer that produces orthogonal representations which are perfectly guarded against interference. Validating the model on previously published human behavioural data revealed that it matches performance of participants who had been trained on blocked or interleaved curricula, and that these performance differences were driven by misestimation of the true category boundary.
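The two motifs named in this abstract can be illustrated with a small sketch. It rests on assumptions beyond the abstract: the "sluggish" task units are modeled as an exponential moving average of the one-hot task cue with a hypothetical decay parameter `alpha`, and the Hebbian step is a plain product rule (pre-synaptic times post-synaptic activity), which may differ from the exact rule used in the paper.

```python
def sluggish_task_signal(cues, alpha):
    """Exponentially filter one-hot task cues across trials.

    state_t = alpha * state_{t-1} + (1 - alpha) * cue_t, with 0 <= alpha < 1.
    Larger alpha makes the task units more sluggish.
    """
    state = [0.0] * len(cues[0])
    history = []
    for cue in cues:
        state = [alpha * s + (1 - alpha) * c for s, c in zip(state, cue)]
        history.append(state)
    return history


def hebbian_step(weights, task_units, hidden, lr):
    """Strengthen task-unit-to-hidden weights in proportion to co-activity.

    weights[i][j] connects task unit i to hidden unit j (plain Hebbian rule, assumed).
    """
    return [[w + lr * t * h for w, h in zip(row, hidden)]
            for row, t in zip(weights, task_units)]


# Blocked versus interleaved curricula for two tasks (toy example).
blocked = sluggish_task_signal([[1, 0], [1, 0], [1, 0]], alpha=0.5)
interleaved = sluggish_task_signal([[1, 0], [0, 1], [1, 0]], alpha=0.5)
print(blocked[-1], interleaved[-1])

# One Hebbian update using the blocked task signal and a hidden pattern.
new_weights = hebbian_step([[0.0, 0.0], [0.0, 0.0]], blocked[-1], [1.0, 0.0], lr=0.1)
print(new_weights)
```

In this toy run the blocked curriculum leaves the task signal concentrated on one task, while the interleaved curriculum leaves it mixed across both, which is the intuition behind the switch cost the abstract describes.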
Affiliation(s)
- Timo Flesch
  - Department of Experimental Psychology, University of Oxford; Oxford, United Kingdom
- David G Nagy
  - Department of Computational Sciences, Wigner Research Centre for Physics; Budapest, Hungary
- Andrew Saxe
  - Gatsby Computational Neuroscience Unit & Sainsbury Wellcome Centre, University College London; London, United Kingdom
  - CIFAR Azrieli Global Scholars program, CIFAR; Toronto, Canada
25
Fukai T. Computational models of Idling brain activity for memory processing. Neurosci Res 2022; 189:75-82. [PMID: 36592825 DOI: 10.1016/j.neures.2022.12.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Accepted: 12/29/2022] [Indexed: 01/01/2023]
Abstract
Studying the underlying neural mechanisms of cognitive functions of the brain is one of the central questions in modern biology. Moreover, it has significantly impacted the development of novel technologies in artificial intelligence. Spontaneous activity is a unique feature of the brain and is currently lacking in many artificially constructed intelligent machines. Spontaneous activity may represent the brain's idling states, which are internally driven by neuronal networks and possibly participate in offline processing during awake, sleep, and resting states. Evidence is accumulating that the brain's spontaneous activity is not mere noise but part of the mechanisms to process information about previous experiences. A bunch of literature has shown how previous sensory and behavioral experiences influence the subsequent patterns of brain activity with various methods in various animals. It seems, however, that the patterns of neural activity and their computational roles differ significantly from area to area and from function to function. In this article, I review the various forms of the brain's spontaneous activity, especially those observed during memory processing, and some attempts to model the generation mechanisms and computational roles of such activities.
Affiliation(s)
- Tomoki Fukai
  - Okinawa Institute of Science and Technology, Tancha 1919-1, Onna-son, Okinawa 904-0495, Japan.
26
Connectivity concepts in neuronal network modeling. PLoS Comput Biol 2022; 18:e1010086. [PMID: 36074778 PMCID: PMC9455883 DOI: 10.1371/journal.pcbi.1010086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2021] [Accepted: 04/07/2022] [Indexed: 11/19/2022] Open
Abstract
Sustainable research on computational models of neuronal networks requires published models to be understandable, reproducible, and extendable. Missing details or ambiguities about mathematical concepts and assumptions, algorithmic implementations, or parameterizations hinder progress. Such flaws are unfortunately frequent and one reason is a lack of readily applicable standards and tools for model description. Our work aims to advance complete and concise descriptions of network connectivity but also to guide the implementation of connection routines in simulation software and neuromorphic hardware systems. We first review models made available by the computational neuroscience community in the repositories ModelDB and Open Source Brain, and investigate the corresponding connectivity structures and their descriptions in both manuscript and code. The review comprises the connectivity of networks with diverse levels of neuroanatomical detail and exposes how connectivity is abstracted in existing description languages and simulator interfaces. We find that a substantial proportion of the published descriptions of connectivity is ambiguous. Based on this review, we derive a set of connectivity concepts for deterministically and probabilistically connected networks and also address networks embedded in metric space. Beside these mathematical and textual guidelines, we propose a unified graphical notation for network diagrams to facilitate an intuitive understanding of network properties. Examples of representative network models demonstrate the practical use of the ideas. We hope that the proposed standardizations will contribute to unambiguous descriptions and reproducible implementations of neuronal network connectivity in computational neuroscience.
27
Peng J, Tang B, Jiang H, Li Z, Lei Y, Lin T, Li H. Overcoming Long-Term Catastrophic Forgetting Through Adversarial Neural Pruning and Synaptic Consolidation. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4243-4256. [PMID: 33577459 DOI: 10.1109/tnnls.2021.3056201] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/12/2023]
Abstract
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks face the well-known problem of catastrophic forgetting. What is worse, the degradation of previously learned skills becomes more severe as the task sequence increases, known as the long-term catastrophic forgetting. It is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspace satisfying for these tasks becomes smaller or even does not exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configuration of previous tasks from interference. Inspired by the memory consolidation mechanism in mammalian brains with synaptic plasticity, we propose a confrontation mechanism in which Adversarial Neural Pruning and synaptic Consolidation (ANPyC) is used to overcome the long-term catastrophic forgetting issue. The neural pruning acts as long-term depression to prune task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation to strengthen task-relevant parameters. During the training, this confrontation achieves a balance in that only crucial parameters remain, and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and makes the model efficient to learn a large number of tasks. Specifically, the neural pruning iteratively relaxes the current task's parameter conditions to expand the common parameter subspace of the task; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measurement and an element-wise parameter updating strategy, decreases the cumulative error when learning new tasks. Our approach encourages the synapse to be sparse and polarized, which enables long-term learning and memory. 
ANPyC exhibits effectiveness and generalization on both image classification and generation tasks with multilayer perceptrons, convolutional neural networks, generative adversarial networks, and variational autoencoders. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
28
Jedlicka P, Tomko M, Robins A, Abraham WC. Contributions by metaplasticity to solving the Catastrophic Forgetting Problem. Trends Neurosci 2022; 45:656-666. [PMID: 35798611 DOI: 10.1016/j.tins.2022.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/31/2022] [Revised: 06/06/2022] [Accepted: 06/09/2022] [Indexed: 10/17/2022]
Abstract
Catastrophic forgetting (CF) refers to the sudden and severe loss of prior information in learning systems when acquiring new information. CF has been an Achilles heel of standard artificial neural networks (ANNs) when learning multiple tasks sequentially. The brain, by contrast, has solved this problem during evolution. Modellers now use a variety of strategies to overcome CF, many of which have parallels to cellular and circuit functions in the brain. One common strategy, based on metaplasticity phenomena, controls the future rate of change at key connections to help retain previously learned information. However, the metaplasticity properties so far used are only a subset of those existing in neurobiology. We propose that as models become more sophisticated, there could be value in drawing on a richer set of metaplasticity rules, especially when promoting continual learning in agents moving about the environment.
Affiliation(s)
- Peter Jedlicka
  - ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Clinical Neuroanatomy, Neuroscience Center, Goethe University Frankfurt, Frankfurt/Main, Germany; Frankfurt Institute for Advanced Studies, Frankfurt 60438, Germany.
- Matus Tomko
  - ICAR3R - Interdisciplinary Centre for 3Rs in Animal Research, Faculty of Medicine, Justus Liebig University, Giessen, Germany; Institute of Molecular Physiology and Genetics, Centre of Biosciences, Slovak Academy of Sciences, Bratislava, Slovakia
- Anthony Robins
  - Department of Computer Science, University of Otago, Dunedin 9016, New Zealand
- Wickliffe C Abraham
  - Department of Psychology, Brain Health Research Centre, University of Otago, Dunedin 9054, New Zealand.
29
Driscoll LN, Duncker L, Harvey CD. Representational drift: Emerging theories for continual learning and experimental future directions. Curr Opin Neurobiol 2022; 76:102609. [PMID: 35939861 DOI: 10.1016/j.conb.2022.102609] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 04/04/2022] [Revised: 06/08/2022] [Accepted: 06/23/2022] [Indexed: 11/03/2022]
Abstract
Recent work has revealed that the neural activity patterns correlated with sensation, cognition, and action often are not stable and instead undergo large-scale changes over days and weeks, a phenomenon called representational drift. Here, we highlight recent observations of drift, how drift is unlikely to be explained by experimental confounds, and how the brain can likely compensate for drift to allow stable computation. We propose that drift might have important roles in neural computation to allow continual learning, both for separating and relating memories that occur at distinct times. Finally, we present an outlook on future experimental directions that are needed to further characterize drift and to test emerging theories for drift's role in computation.
Affiliation(s)
- Laura N Driscoll
  - Department of Electrical Engineering, Stanford University, Stanford, CA, USA.
- Lea Duncker
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA.
30
Wen S, Rios A, Ge Y, Itti L. Beneficial Perturbation Network for Designing General Adaptive Artificial Intelligence Systems. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:3778-3791. [PMID: 33596177 DOI: 10.1109/tnnls.2021.3054423] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks only learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where the input to output mapping may change with different contexts. A salient example is continual learning: learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby a previously learned mapping of an old task is erased when learning new mappings for new tasks. Herein, we propose a new biologically plausible type of deep neural network with extra, out-of-network, task-dependent biasing units to accommodate these dynamic situations. This allows, for the first time, a single network to learn potentially unlimited parallel input to output mappings, and to switch on the fly between them at runtime. Biasing units are programed by leveraging beneficial perturbations (opposite to well-known adversarial perturbations) for each task. Beneficial perturbations for a given task bias the network toward that task, essentially switching the network into a different mode to process that task. This largely eliminates catastrophic interference between tasks. Our approach is memory-efficient and parameter-efficient, can accommodate many tasks, and achieves the state-of-the-art performance across different tasks and domains.
31
Gryshchuk V, Weber C, Loo CK, Wermter S. Go ahead and do not forget: Modular lifelong learning from event-based data. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
32
Stoianov I, Maisto D, Pezzulo G. The hippocampal formation as a hierarchical generative model supporting generative replay and continual learning. Prog Neurobiol 2022; 217:102329. [PMID: 35870678 DOI: 10.1016/j.pneurobio.2022.102329] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/09/2021] [Revised: 07/15/2022] [Accepted: 07/19/2022] [Indexed: 11/28/2022]
Abstract
We advance a novel computational theory of the hippocampal formation as a hierarchical generative model that organizes sequential experiences, such as rodent trajectories during spatial navigation, into coherent spatiotemporal contexts. We propose that the hippocampal generative model is endowed with inductive biases to identify individual items of experience (first hierarchical layer), organize them into sequences (second layer) and cluster them into maps (third layer). This theory entails a novel characterization of hippocampal reactivations as generative replay: the offline resampling of fictive sequences from the generative model, which supports the continual learning of multiple sequential experiences. We show that the model learns and efficiently retains multiple spatial navigation trajectories, by organizing them into spatial maps. Furthermore, the model reproduces flexible and prospective aspects of hippocampal dynamics that are challenging to explain within existing frameworks. This theory reconciles multiple roles of the hippocampal formation in map-based navigation, episodic memory and imagination.
Affiliation(s)
- Ivilin Stoianov
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Domenico Maisto
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
- Giovanni Pezzulo
  - Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy.
33
Kumar Sah R, Mirzadeh SI, Ghasemzadeh H. Continual Learning for Activity Recognition. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2022; 2022:2416-2420. [PMID: 36085745 DOI: 10.1109/embc48229.2022.9871690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The recent success of deep neural networks in prediction tasks on wearable sensor data is evident. However, in more practical online learning scenarios, where new data arrive sequentially, neural networks suffer severely from the "catastrophic forgetting" problem. In real-world settings, given a pre-trained model on the old data, when we collect new data, it is practically infeasible to re-train the model on both old and new data because the computational costs will increase dramatically as more and more data arrive in time. However, if we fine-tune the model only with the new data because the new data might be different from the old data, the neural network parameters will change to fit the new data. As a result, the new parameters are no longer suitable for the old data. This phenomenon is known as catastrophic forgetting, and continual learning research aims to overcome this problem with minimal computational costs. While most of the continual learning research focuses on computer vision tasks, implications of catastrophic forgetting in wearable computing research and potential avenues to address this problem have remained unexplored. To address this knowledge gap, we study continual learning for activity recognition using wearable sensor data. We show that the catastrophic forgetting problem is a critical challenge for real-world deployment of machine learning models for wearables. Moreover, we show that the catastrophic forgetting problem can be alleviated by employing various training techniques.
34
A framework for the general design and computation of hybrid neural networks. Nat Commun 2022; 13:3427. [PMID: 35701391 PMCID: PMC9198039 DOI: 10.1038/s41467-022-30964-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/31/2021] [Accepted: 05/25/2022] [Indexed: 12/11/2022]
Abstract
There is a growing trend to design hybrid neural networks (HNNs) by combining spiking neural networks and artificial neural networks to leverage the strengths of both. Here, we propose a framework for general design and computation of HNNs by introducing hybrid units (HUs) as a linkage interface. The framework not only integrates key features of these computing paradigms but also decouples them to improve flexibility and efficiency. HUs are designable and learnable to promote transmission and modulation of hybrid information flows in HNNs. Through three cases, we demonstrate that the framework can facilitate hybrid model design. The hybrid sensing network implements multi-pathway sensing, achieving high tracking accuracy and energy efficiency. The hybrid modulation network implements hierarchical information abstraction, enabling meta-continual learning of multiple tasks. The hybrid reasoning network performs multimodal reasoning in an interpretable, robust and parallel manner. This study advances cross-paradigm modeling for a broad range of intelligent tasks. Hybrid neural networks combine advantages of spiking and artificial neural networks in the context of computing and biological motivation. The authors propose a design framework with hybrid units for improved flexibility and efficiency of hybrid neural networks, and modulation of hybrid information flows.
35
Csorba BA, Krause MR, Zanos TP, Pack CC. Long-range cortical synchronization supports abrupt visual learning. Curr Biol 2022; 32:2467-2479.e4. [PMID: 35523181 DOI: 10.1016/j.cub.2022.04.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/24/2021] [Revised: 03/08/2022] [Accepted: 04/12/2022] [Indexed: 11/29/2022]
Abstract
Visual plasticity declines sharply after the critical period, yet we easily learn to recognize new faces and places, even as adults. Such learning is often characterized by a "moment of insight," an abrupt and dramatic improvement in recognition. The mechanisms that support abrupt learning are unknown, but one hypothesis is that they involve changes in synchronization between brain regions. To test this hypothesis, we used a behavioral task in which non-human primates rapidly learned to recognize novel images and to associate them with specific responses. Simultaneous recordings from inferotemporal and prefrontal cortices revealed a transient synchronization of neural activity between these areas that peaked around the moment of insight. Synchronization was strongest between inferotemporal sites that encoded images and reward-sensitive prefrontal sites. Moreover, its magnitude intensified gradually over image exposures, suggesting that abrupt learning is the culmination of a search for informative signals within a circuit linking sensory information to task demands.
Affiliation(s)
- Bennett A Csorba
  - Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada.
- Matthew R Krause
  - Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada
- Christopher C Pack
  - Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada
36
Dubreuil A, Valente A, Beiran M, Mastrogiuseppe F, Ostojic S. The role of population structure in computations through neural dynamics. Nat Neurosci 2022; 25:783-794. [PMID: 35668174 PMCID: PMC9284159 DOI: 10.1038/s41593-022-01088-4] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Received: 07/03/2020] [Accepted: 04/28/2022] [Indexed: 11/09/2022]
Abstract
Neural computations are currently investigated using two separate approaches: sorting neurons into functional subpopulations or examining the low-dimensional dynamics of collective activity. Whether and how these two aspects interact to shape computations is currently unclear. Using a novel approach to extract computational mechanisms from networks trained on neuroscience tasks, here we show that the dimensionality of the dynamics and subpopulation structure play fundamentally complementary roles. Although various tasks can be implemented by increasing the dimensionality in networks with fully random population structure, flexible input-output mappings instead require a non-random population structure that can be described in terms of multiple subpopulations. Our analyses revealed that such a subpopulation structure enables flexible computations through a mechanism based on gain-controlled modulations that flexibly shape the collective dynamics. Our results lead to task-specific predictions for the structure of neural selectivity, for inactivation experiments and for the implication of different neurons in multi-tasking.
Affiliation(s)
- Alexis Dubreuil
  - Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Superieure - PSL Research University, Paris, France.
  - Université de Bordeaux, CNRS, IMN, UMR, Bordeaux, France.
- Adrian Valente
  - Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Superieure - PSL Research University, Paris, France.
- Manuel Beiran
  - Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Superieure - PSL Research University, Paris, France
  - Center for Theoretical Neuroscience, Zuckerman Institute, Columbia University, New York, NY, USA
- Francesca Mastrogiuseppe
  - Gatsby Computational Neuroscience Unit, University College London, London, UK
  - Champalimaud Research, Lisbon, Portugal
- Srdjan Ostojic
  - Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Superieure - PSL Research University, Paris, France.
37
Barragán-Montero A, Bibal A, Dastarac MH, Draguet C, Valdés G, Nguyen D, Willems S, Vandewinckele L, Holmström M, Löfman F, Souris K, Sterpin E, Lee JA. Towards a safe and efficient clinical implementation of machine learning in radiation oncology by exploring model interpretability, explainability and data-model dependency. Phys Med Biol 2022; 67:10.1088/1361-6560/ac678a. [PMID: 35421855 PMCID: PMC9870296 DOI: 10.1088/1361-6560/ac678a] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/30/2021] [Accepted: 04/14/2022] [Indexed: 01/26/2023]
Abstract
The interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap that occurred with new techniques of deep learning, convolutional neural networks for images, increased computational power, and wider availability of large datasets. Most fields of medicine follow that popular trend and, notably, radiation oncology is one of those that are at the forefront, with already a long tradition in using digital images and fully computerized workflows. ML models are driven by data, and in contrast with many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two questions, namely, the tight dependence between the models and the datasets that feed them, and the interpretability of the models, which scales with its complexity. Any problems in the data used to train the model will be later reflected in their performance. This, together with the low interpretability of ML models, makes their implementation into the clinical workflow particularly difficult. Building tools for risk assessment and quality assurance of ML models must involve then two main points: interpretability and data-model dependency. After a joint introduction of both radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows in the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Afterwards, a broad discussion goes through key applications of ML in workflows of radiation oncology as well as vendors' perspectives for the clinical implementation of ML.
Affiliation(s)
- Ana Barragán-Montero
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Adrien Bibal
  - PReCISE, NaDI Institute, Faculty of Computer Science, UNamur and CENTAL, ILC, UCLouvain, Belgium
- Margerie Huet Dastarac
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Camille Draguet
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
  - Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- Gilmer Valdés
  - Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, United States of America
- Dan Nguyen
  - Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, United States of America
- Siri Willems
  - ESAT/PSI, KU Leuven Belgium & MIRC, UZ Leuven, Belgium
- Kevin Souris
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
- Edmond Sterpin
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
  - Department of Oncology, Laboratory of Experimental Radiotherapy, KU Leuven, Belgium
- John A Lee
  - Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, Institut de Recherche Expérimentale et Clinique (IREC), UCLouvain, Belgium
38
Iyer A, Grewal K, Velu A, Souza LO, Forest J, Ahmad S. Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments. Front Neurorobot 2022; 16:846219. [PMID: 35574225 PMCID: PMC9100780 DOI: 10.3389/fnbot.2022.846219] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/30/2021] [Accepted: 03/31/2022] [Indexed: 11/13/2022]
Abstract
A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state of the art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this article we investigate biologically inspired architectures as solutions to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows: first, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment where a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping but distinct and sparse subnetworks, allowing the system to fluidly learn multiple tasks with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results in both multi-task and continual learning settings. Our research sheds light on how biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve.
Affiliation(s)
- Abhiram Iyer
  - Numenta, Redwood City, CA, United States
  - Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, United States
- Akash Velu
  - Department of Computer Science, Stanford University, Stanford, CA, United States
- Jeremy Forest
  - Department of Psychology, Cornell University, Ithaca, NY, United States
39
Kudithipudi D, Aguilar-Simon M, Babb J, Bazhenov M, Blackiston D, Bongard J, Brna AP, Chakravarthi Raja S, Cheney N, Clune J, Daram A, Fusi S, Helfer P, Kay L, Ketz N, Kira Z, Kolouri S, Krichmar JL, Kriegman S, Levin M, Madireddy S, Manicka S, Marjaninejad A, McNaughton B, Miikkulainen R, Navratilova Z, Pandit T, Parker A, Pilly PK, Risi S, Sejnowski TJ, Soltoggio A, Soures N, Tolias AS, Urbina-Meléndez D, Valero-Cuevas FJ, van de Ven GM, Vogelstein JT, Wang F, Weiss R, Yanguas-Gil A, Zou X, Siegelmann H. Biological underpinnings for lifelong learning machines. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00452-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022]
40
Wu Y, Zhao R, Zhu J, Chen F, Xu M, Li G, Song S, Deng L, Wang G, Zheng H, Ma S, Pei J, Zhang Y, Zhao M, Shi L. Brain-inspired global-local learning incorporated with neuromorphic computing. Nat Commun 2022; 13:65. [PMID: 35013198 PMCID: PMC8748814 DOI: 10.1038/s41467-021-27653-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/17/2021] [Accepted: 11/30/2021] [Indexed: 12/18/2022]
Abstract
There are two principal approaches for learning in artificial intelligence: error-driven global learning and neuroscience-oriented local learning. Integrating them into one network may provide complementary learning capabilities for versatile learning scenarios. At the same time, neuromorphic computing holds great promise, but still needs plenty of useful algorithms and algorithm-hardware co-designs to fully exploit its advantages. Here, we present a neuromorphic global-local synergic learning model by introducing a brain-inspired meta-learning paradigm and a differentiable spiking model incorporating neuronal dynamics and synaptic plasticity. It can meta-learn local plasticity and receive top-down supervision information for multiscale learning. We demonstrate the advantages of this model in multiple different tasks, including few-shot learning, continual learning, and fault-tolerance learning in neuromorphic vision sensors. It achieves significantly higher performance than single-learning methods. We further implement the model in the Tianjic neuromorphic platform by exploiting algorithm-hardware co-designs and prove that the model can fully utilize neuromorphic many-core architecture to develop a hybrid computation paradigm.
Collapse
Affiliation(s)
- Yujie Wu
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Rong Zhao
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Jun Zhu
- Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
| | - Feng Chen
- Department of Automation, Tsinghua University, Beijing, 100084, China
| | - Mingkun Xu
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Guoqi Li
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Sen Song
- Laboratory of Brain and Intelligence, Department of Biomedical Engineering, IDG/ McGovern Institute for Brain Research, CBICR, Tsinghua University, Beijing, China
| | - Lei Deng
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Guanrui Wang
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
- Lynxi Technologies Co., Ltd, Beijing, China
| | - Hao Zheng
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Songchen Ma
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Jing Pei
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China
| | - Youhui Zhang
- Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
| | - Mingguo Zhao
- Department of Automation, Tsinghua University, Beijing, 100084, China
| | - Luping Shi
- Department of Precision Instrument, Center for Brain-Inspired Computing Research (CBICR), Beijing Innovation Center for Future Chip, Optical Memory National Engineering Research Center, Tsinghua University, Beijing, China.
| |
|
41
|
Abstract
Incrementally learning new information from a non-stationary stream of data, referred to as 'continual learning', is a key feature of natural intelligence, but a challenging problem for deep neural networks. In recent years, numerous deep learning methods for continual learning have been proposed, but comparing their performances is difficult due to the lack of a common framework. To help address this, we describe three fundamental types, or 'scenarios', of continual learning: task-incremental, domain-incremental and class-incremental learning. Each of these scenarios has its own set of challenges. To illustrate this, we provide a comprehensive empirical comparison of currently used continual learning strategies, by performing the Split MNIST and Split CIFAR-100 protocols according to each scenario. We demonstrate substantial differences between the three scenarios in terms of difficulty and in terms of the effectiveness of different strategies. The proposed categorization aims to structure the continual learning field, by forming a key foundation for clearly defining benchmark problems.
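To make the distinction between the three scenarios concrete, the following minimal Python sketch shows how the training target for a single Split MNIST digit could differ per scenario. The two-digits-per-task split, the `Target` tuple and the `make_target` helper are illustrative assumptions and are not taken from the abstract.

```python
from typing import NamedTuple, Optional

CLASSES_PER_TASK = 2  # assumption: the 10 digits are split into 5 two-class tasks
N_DIGITS = 10


class Target(NamedTuple):
    task_id: Optional[int]  # task identity given to the model, or None if hidden
    label: int              # the value the model has to predict
    n_outputs: int          # size of the label space in this scenario


def make_target(digit: int, scenario: str) -> Target:
    """Build the (hypothetical) training target for one digit."""
    task_id, within_task = divmod(digit, CLASSES_PER_TASK)
    if scenario == "task":
        # Task-incremental: task identity is provided, predict within-task class.
        return Target(task_id, within_task, CLASSES_PER_TASK)
    if scenario == "domain":
        # Domain-incremental: same label space, but task identity is hidden.
        return Target(None, within_task, CLASSES_PER_TASK)
    if scenario == "class":
        # Class-incremental: task identity is hidden, predict the global class.
        return Target(None, digit, N_DIGITS)
    raise ValueError(f"unknown scenario: {scenario}")


for name in ("task", "domain", "class"):
    print(name, make_target(7, name))
```

Under these assumptions the same input image yields three different learning problems, which is one way to see why the scenarios can differ in difficulty.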
|
42
|
Verbeke P, Verguts T. Using top-down modulation to optimally balance shared versus separated task representations. Neural Netw 2021; 146:256-271. [PMID: 34915411 DOI: 10.1016/j.neunet.2021.11.030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/13/2021] [Revised: 11/19/2021] [Accepted: 11/26/2021] [Indexed: 01/20/2023]
Abstract
Human adaptive behavior requires continually learning and performing a wide variety of tasks, often with very little practice. To accomplish this, it is crucial to separate neural representations of different tasks in order to avoid interference. At the same time, sharing neural representations supports generalization and allows faster learning. Therefore, a crucial challenge is to find an optimal balance between shared versus separated representations. Typically, models of human cognition employ top-down modulatory signals to separate task representations, but there are surprisingly few systematic computational investigations of how such modulation is best implemented. We identify and systematically evaluate two crucial features of modulatory signals. First, top-down input can be processed in an additive or multiplicative manner. Second, the modulatory signals can be adaptive (learned) or non-adaptive (random). We cross these two features, resulting in four modulation networks which are tested on a variety of input datasets and tasks with different degrees of stimulus-action mapping overlap. The multiplicative adaptive modulation network outperforms all other networks in terms of accuracy. Moreover, this network develops hidden units that optimally share representations between tasks. Specifically, unlike the binary approach of currently popular latent state models, it exploits partial overlap between tasks.
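The difference between additive and multiplicative top-down modulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ReLU nonlinearity, the point at which the context signal `c` enters, and the `hidden_activity` helper are hypothetical and may differ from the networks evaluated in the paper.

```python
import numpy as np


def hidden_activity(x, W, c, mode):
    """Hidden-layer activity under top-down modulation (illustrative only).

    x: input vector, W: input-to-hidden weights,
    c: top-down modulation signal, one value per hidden unit.
    """
    drive = W @ x  # bottom-up drive to each hidden unit
    if mode == "additive":
        pre = drive + c   # context shifts each unit's input
    elif mode == "multiplicative":
        pre = drive * c   # context scales (gates) each unit's input
    else:
        raise ValueError(f"unknown mode: {mode}")
    return np.maximum(pre, 0.0)  # assumed ReLU nonlinearity


x = np.array([1.0, 2.0])
W = np.eye(2)
c = np.array([0.0, 1.0])  # the first hidden unit receives zero modulation

print(hidden_activity(x, W, c, "additive"))        # first unit stays active
print(hidden_activity(x, W, c, "multiplicative"))  # first unit is silenced
```

In this toy setting a multiplicative signal of zero removes a unit from a task entirely, while intermediate values would allow partial sharing. In the abstract's terms, an adaptive network would learn `c` per task, whereas a non-adaptive one would draw it at random and keep it fixed.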
Affiliation(s)
- Pieter Verbeke
- Department of experimental psychology, Ghent University, Belgium.
| | - Tom Verguts
- Department of experimental psychology, Ghent University, Belgium
| |
|
43
|
Silent Synapses in Cocaine-Associated Memory and Beyond. J Neurosci 2021; 41:9275-9285. [PMID: 34759051 DOI: 10.1523/jneurosci.1559-21.2021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/30/2021] [Revised: 09/22/2021] [Accepted: 09/27/2021] [Indexed: 11/21/2022] Open
Abstract
Glutamatergic synapses are key cellular sites where cocaine experience creates memory traces that subsequently promote cocaine craving and seeking. In addition to making across-the-board synaptic adaptations, cocaine experience also generates a discrete population of new synapses that selectively encode cocaine memories. These new synapses are glutamatergic synapses that lack functionally stable AMPARs, often referred to as AMPAR-silent synapses or, simply, silent synapses. They are generated de novo in the NAc by cocaine experience. After drug withdrawal, some of these synapses mature by recruiting AMPARs, contributing to the consolidation of cocaine-associated memory. After cue-induced retrieval of cocaine memories, matured silent synapses alternate between two dynamic states (AMPAR-absent vs AMPAR-containing) that correspond with the behavioral manifestations of destabilization and reconsolidation of these memories. Here, we review the molecular mechanisms underlying silent synapse dynamics during behavior, discuss their contributions to circuit remodeling, and analyze their role in cocaine-memory-driven behaviors. We also propose several mechanisms through which silent synapses can form neuronal ensembles as well as cross-region circuit engrams for cocaine-specific behaviors. These perspectives lead to our hypothesis that cocaine-generated silent synapses stand as a distinct set of synaptic substrates encoding key aspects of cocaine memory that drive cocaine relapse.
|
44
|
Wildenberg GA, Rosen MR, Lundell J, Paukner D, Freedman DJ, Kasthuri N. Primate neuronal connections are sparse in cortex as compared to mouse. Cell Rep 2021; 36:109709. [PMID: 34525373 DOI: 10.1016/j.celrep.2021.109709] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/14/2021] [Revised: 07/30/2021] [Accepted: 08/20/2021] [Indexed: 12/29/2022] Open
Abstract
Detailing how primate and mouse neurons differ is critical for creating generalized models of how neurons process information. We reconstruct 15,748 synapses in adult Rhesus macaques and mice and ask how connectivity differs on identified cell types in layer 2/3 of primary visual cortex. Primate excitatory and inhibitory neurons receive 2-5 times fewer excitatory and inhibitory synapses than similar mouse neurons. Primate excitatory neurons have lower excitatory-to-inhibitory (E/I) ratios than mouse but similar E/I ratios in inhibitory neurons. In both species, properties of inhibitory axons such as synapse size and frequency are unchanged, and inhibitory innervation of excitatory neurons is local and specific. Using artificial recurrent neural networks (RNNs) optimized for different cognitive tasks, we find that penalizing networks for creating and maintaining synapses, as opposed to neuronal firing, reduces the number of connections per node as the number of nodes increases, similar to primate neurons compared with mice.
Affiliation(s)
- Gregg A Wildenberg
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA; Argonne National Laboratory, Lemont, IL 60439, USA.
| | - Matt R Rosen
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA
| | - Jack Lundell
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA
| | - Dawn Paukner
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA
| | - David J Freedman
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA
| | - Narayanan Kasthuri
- Department of Neurobiology, University of Chicago, Chicago, IL 60637, USA; Argonne National Laboratory, Lemont, IL 60439, USA.
| |
|
45
|
Roscow EL, Chua R, Costa RP, Jones MW, Lepora N. Learning offline: memory replay in biological and artificial reinforcement learning. Trends Neurosci 2021; 44:808-821. [PMID: 34481635 DOI: 10.1016/j.tins.2021.07.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/12/2021] [Revised: 07/13/2021] [Accepted: 07/21/2021] [Indexed: 10/20/2022]
Abstract
Learning to act in an environment to maximise rewards is among the brain's key functions. This process has often been conceptualised within the framework of reinforcement learning, which has also gained prominence in machine learning and artificial intelligence (AI) as a way to optimise decision making. A common aspect of both biological and machine reinforcement learning is the reactivation of previously experienced episodes, referred to as replay. Replay is important for memory consolidation in biological neural networks and is key to stabilising learning in deep neural networks. Here, we review recent developments concerning the functional roles of replay in the fields of neuroscience and AI. Complementary progress suggests how replay might support learning processes, including generalisation and continual learning, affording opportunities to transfer knowledge across the two fields to advance the understanding of biological and artificial learning and memory.
Affiliation(s)
| | | | - Rui Ponte Costa
- Bristol Computational Neuroscience Unit, Intelligent Systems Lab, Department of Computer Science, University of Bristol, Bristol, UK
| | - Matt W Jones
- School of Physiology, Pharmacology and Neuroscience, University of Bristol, Bristol, UK
| | - Nathan Lepora
- Department of Engineering Mathematics and Bristol Robotics Laboratory, University of Bristol, Bristol, UK
| |
|
46
|
Harkin EF, Shen PR, Goel A, Richards BA, Naud R. Parallel and Recurrent Cascade Models as a Unifying Force for Understanding Sub-cellular Computation. Neuroscience 2021; 489:200-215. [PMID: 34358629 DOI: 10.1016/j.neuroscience.2021.07.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/23/2021] [Revised: 07/06/2021] [Accepted: 07/25/2021] [Indexed: 11/15/2022]
Abstract
Neurons are very complicated computational devices, incorporating numerous non-linear processes, particularly in their dendrites. Biophysical models capture these processes directly by explicitly modelling physiological variables, such as ion channels, current flow, membrane capacitance, etc. However, another option for capturing the complexities of real neural computation is to use cascade models, which treat individual neurons as a cascade of linear and non-linear operations, akin to a multi-layer artificial neural network. Recent research has shown that cascade models can capture single-cell computation well, but there are still a number of sub-cellular, regenerative dendritic phenomena that they cannot capture, such as the interaction between sodium, calcium, and NMDA spikes in different compartments. Here, we propose that it is possible to capture these additional phenomena using parallel, recurrent cascade models, wherein an individual neuron is modelled as a cascade of parallel linear and non-linear operations that can be connected recurrently, akin to a multi-layer, recurrent, artificial neural network. Given their tractable mathematical structure, we show that neuron models expressed in terms of parallel recurrent cascades can themselves be integrated into multi-layered artificial neural networks and trained to perform complex tasks. We go on to discuss potential implications and uses of these models for artificial intelligence. Overall, we argue that parallel, recurrent cascade models provide an important, unifying tool for capturing single-cell computation and exploring the algorithmic implications of physiological phenomena.
Affiliation(s)
- Emerson F Harkin
- uOttawa Brain and Mind Institute, Centre for Neural Dynamics, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Peter R Shen
- Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
| | - Anish Goel
- Lisgar Collegiate Institute, Ottawa, ON, Canada
| | - Blake A Richards
- Mila, Montréal, QC, Canada; Montreal Neurological Institute, Montréal, QC, Canada; Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada; School of Computer Science, McGill University, Montréal, QC, Canada.
| | - Richard Naud
- uOttawa Brain and Mind Institute, Centre for Neural Dynamics, Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON, Canada; Department of Physics, University of Ottawa, Ottawa, ON, Canada.
| |
|
47
|
Multitask learning over shared subspaces. PLoS Comput Biol 2021; 17:e1009092. [PMID: 34228719 PMCID: PMC8284664 DOI: 10.1371/journal.pcbi.1009092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2020] [Revised: 07/16/2021] [Accepted: 05/18/2021] [Indexed: 11/19/2022] Open
Abstract
This paper uses constructs from machine learning to define pairs of learning tasks that either shared or did not share a common subspace. Human subjects then learnt these tasks using a feedback-based approach and we hypothesised that learning would be boosted for shared subspaces. Our findings broadly supported this hypothesis with either better performance on the second task if it shared the same subspace as the first, or positive correlations over task performance for shared subspaces. These empirical findings were compared to the behaviour of a Neural Network model trained using sequential Bayesian learning and human performance was found to be consistent with a minimal capacity variant of this model. Networks with an increased representational capacity, and networks without Bayesian learning, did not show these transfer effects. We propose that the concept of shared subspaces provides a useful framework for the experimental study of human multitask and transfer learning.
How does knowledge gained from previous experience affect learning of new tasks? This question of “Transfer Learning” has been addressed by teachers, psychologists, and more recently by researchers in the fields of neural networks and machine learning. Leveraging constructs from machine learning, we designed pairs of learning tasks that either shared or did not share a common subspace. We compared the dynamics of transfer learning in humans with those of a multitask neural network model, finding that human performance was consistent with a minimal capacity variant of the model. Learning was boosted in the second task if the same subspace was shared between tasks. Additionally, accuracy between tasks was positively correlated but only when they shared the same subspace. Our results highlight the roles of subspaces, showing how they could act as a learning boost if shared, and be detrimental if not.
|
49
|
Davis GP, Katz GE, Gentili RJ, Reggia JA. Compositional memory in attractor neural networks with one-step learning. Neural Netw 2021; 138:78-97. [PMID: 33631609 DOI: 10.1016/j.neunet.2021.01.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/14/2020] [Revised: 12/06/2020] [Accepted: 01/28/2021] [Indexed: 10/22/2022]
Abstract
Compositionality refers to the ability of an intelligent system to construct models out of reusable parts. This is critical for the productivity and generalization of human reasoning, and is considered a necessary ingredient for human-level artificial intelligence. While traditional symbolic methods have proven effective for modeling compositionality, artificial neural networks struggle to learn systematic rules for encoding generalizable structured models. We suggest that this is due in part to short-term memory that is based on persistent maintenance of activity patterns without fast weight changes. We present a recurrent neural network that encodes structured representations as systems of contextually-gated dynamical attractors called attractor graphs. This network implements a functionally compositional working memory that is manipulated using top-down gating and fast local learning. We evaluate this approach with empirical experiments on storage and retrieval of graph-based data structures, as well as an automated hierarchical planning task. Our results demonstrate that compositional structures can be stored in and retrieved from neural working memory without persistent maintenance of multiple activity patterns. Further, memory capacity is improved by the use of a fast store-erase learning rule that permits controlled erasure and mutation of previously learned associations. We conclude that the combination of top-down gating and fast associative learning provides recurrent neural networks with a robust functional mechanism for compositional working memory.
Affiliation(s)
- Gregory P Davis
- Department of Computer Science, University of Maryland, College Park, MD, USA.
| | - Garrett E Katz
- Department of Elec. Engr. and Comp. Sci., Syracuse University, Syracuse, NY, USA.
| | - Rodolphe J Gentili
- Department of Kinesiology, University of Maryland, College Park, MD, USA.
| | - James A Reggia
- Department of Computer Science, University of Maryland, College Park, MD, USA.
| |
|
50
|
Chen H, Xie L, Wang Y, Zhang H. Memory retention in pyramidal neurons: a unified model of energy-based homo and heterosynaptic plasticity with homeostasis. Cogn Neurodyn 2020; 15:675-692. [PMID: 34367368 DOI: 10.1007/s11571-020-09652-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/30/2019] [Revised: 10/27/2020] [Accepted: 11/09/2020] [Indexed: 01/07/2023] Open
Abstract
The brain can learn new tasks without forgetting old ones. This memory retention is closely associated with the long-term stability of synaptic strength. To understand the capacity of pyramidal neurons to preserve memory under different tasks, we established a plasticity model based on the postsynaptic membrane energy state, in which the change in synaptic strength depends on the difference between the energy state after stimulation and the resting energy state. If the post-stimulation energy state is higher than the resting energy state, then synaptic depression occurs. Conversely, if it is lower, the synapse is strengthened. Our model unifies homo- and heterosynaptic plasticity and can reproduce synaptic plasticity observed in multiple experiments, such as spike-timing-dependent plasticity and cooperative plasticity, with a small, shared set of parameters. Based on the proposed plasticity model, we conducted a simulation study on how the activation patterns of dendritic branches by different tasks affect the synaptic connection strength of pyramidal neurons. We further investigate the formation mechanism by which different tasks activate different dendritic branches. Simulation results show that, compared with the classic plasticity model, the plasticity model we proposed can achieve a better spatial separation of different branches activated by different tasks in pyramidal neurons, which deepens our insight into the memory retention mechanism of brains.
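The sign convention of the energy-based rule described above can be illustrated with a short Python sketch. The linear dependence on the energy difference, the learning rate `lr` and the `weight_change` helper are assumptions made only for illustration; the abstract states the direction of the change, not its functional form.

```python
def weight_change(e_post: float, e_rest: float, lr: float = 0.1) -> float:
    """Hypothetical synaptic update driven by the membrane energy state.

    e_post: energy state after stimulation
    e_rest: resting energy state
    Returns a negative value (depression) when e_post > e_rest and a
    positive value (potentiation) when e_post < e_rest.
    """
    # Assumed linear form; only the sign follows the abstract.
    return -lr * (e_post - e_rest)


print(weight_change(e_post=1.5, e_rest=1.0))  # negative: depression
print(weight_change(e_post=0.5, e_rest=1.0))  # positive: potentiation
```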
Affiliation(s)
- Huanwen Chen
- The School of Automation, Central South University, Changsha, 410083 Hunan China
| | - Lijuan Xie
- The Institute of Physiology and Psychology, Changsha University of Science and Technology, Changsha, 410076 Hunan China
| | - Yijun Wang
- The School of Automation, Central South University, Changsha, 410083 Hunan China
| | - Hang Zhang
- The School of Automation, Central South University, Changsha, 410083 Hunan China
| |
|