1
Fernández JG, Keemink S, van Gerven M. Gradient-free training of recurrent neural networks using random perturbations. Front Neurosci 2024; 18:1439155. [PMID: 39050673 PMCID: PMC11267880 DOI: 10.3389/fnins.2024.1439155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Accepted: 06/25/2024] [Indexed: 07/27/2024] Open
Abstract
Recurrent neural networks (RNNs) hold immense potential for computations due to their Turing completeness and sequential processing capabilities, yet existing methods for their training encounter efficiency challenges. Backpropagation through time (BPTT), the prevailing method, extends the backpropagation (BP) algorithm by unrolling the RNN over time. However, this approach suffers from significant drawbacks, including the need to interleave forward and backward phases and store exact gradient information. Furthermore, BPTT has been shown to struggle to propagate gradient information for long sequences, leading to vanishing gradients. An alternative strategy to using gradient-based methods like BPTT involves stochastically approximating gradients through perturbation-based methods. This learning approach is exceptionally simple, necessitating only forward passes in the network and a global reinforcement signal as feedback. Despite its simplicity, the random nature of its updates typically leads to inefficient optimization, limiting its effectiveness in training neural networks. In this study, we present a new approach to perturbation-based learning in RNNs whose performance is competitive with BPTT, while maintaining the inherent advantages over gradient-based learning. To this end, we extend the recently introduced activity-based node perturbation (ANP) method to operate in the time domain, leading to more efficient learning and generalization. We subsequently conduct a range of experiments to validate our approach. Our results show similar performance, convergence time and scalability when compared to BPTT, strongly outperforming standard node perturbation and weight perturbation methods. These findings suggest that perturbation-based learning methods offer a versatile alternative to gradient-based methods for training RNNs which can be ideally suited for neuromorphic computing applications.
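The forward-only learning loop that the abstract contrasts with BPTT can be illustrated with a minimal sketch of plain weight perturbation, the baseline the abstract mentions, not the ANP method the paper proposes. The linear model, the toy data, and all names (`mse_loss`, `train_weight_perturbation`, `DATA`) are assumptions made for illustration only.

```python
import random


def mse_loss(w, data):
    """Mean squared error of the linear model y = w . x over (x, y) pairs."""
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
    return total / len(data)


def train_weight_perturbation(data, n_weights, steps=500, lr=0.05, sigma=0.01, seed=0):
    """Train with weight perturbation: two forward passes per step, no backward pass."""
    rng = random.Random(seed)
    w = [0.0] * n_weights
    initial_loss = mse_loss(w, data)
    for _ in range(steps):
        # Sample independent Gaussian noise for every weight.
        noise = [rng.gauss(0.0, sigma) for _ in range(n_weights)]
        w_perturbed = [wi + ni for wi, ni in zip(w, noise)]
        # Global scalar signal: change in loss caused by the perturbation.
        delta = mse_loss(w_perturbed, data) - mse_loss(w, data)
        # Stochastic gradient estimate delta * noise / sigma^2; step against it.
        w = [wi - lr * delta * ni / sigma ** 2 for wi, ni in zip(w, noise)]
    return w, initial_loss, mse_loss(w, data)


# Toy regression data generated by the hypothetical target weights [1.5, -0.5].
DATA = [((1.0, 0.0), 1.5), ((0.0, 1.0), -0.5), ((1.0, 1.0), 1.0), ((-1.0, 1.0), -2.0)]

if __name__ == "__main__":
    w, before, after = train_weight_perturbation(DATA, n_weights=2)
    print(f"loss before: {before:.4f}, after: {after:.6f}, weights: {w}")
```

Each update uses only the scalar loss difference and the noise that produced it, which is why no gradients need to be stored. The estimate is noisy, so its variance grows with the number of weights; this is the inefficiency the abstract attributes to standard perturbation methods.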
Affiliation(s)
- Jesús García Fernández
  - Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
2
Bredenberg C, Savin C. Desiderata for Normative Models of Synaptic Plasticity. Neural Comput 2024; 36:1245-1285. [PMID: 38776950 DOI: 10.1162/neco_a_01671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 02/06/2024] [Indexed: 05/25/2024]
Abstract
Normative models of synaptic plasticity use computational rationales to arrive at predictions of behavioral and network-level adaptive phenomena. In recent years, there has been an explosion of theoretical work in this realm, but experimental confirmation remains limited. In this review, we organize work on normative plasticity models in terms of a set of desiderata that, when satisfied, are designed to ensure that a given model demonstrates a clear link between plasticity and adaptive behavior, is consistent with known biological evidence about neural plasticity and yields specific testable predictions. As a prototype, we include a detailed analysis of the REINFORCE algorithm. We also discuss how new models have begun to improve on the identified criteria and suggest avenues for further development. Overall, we provide a conceptual guide to help develop neural learning theories that are precise, powerful, and experimentally testable.
Affiliation(s)
- Colin Bredenberg
  - Center for Neural Science, New York University, New York, NY 10003, U.S.A.
  - Mila-Quebec AI Institute, Montréal, QC H2S 3H1, Canada
- Cristina Savin
  - Center for Neural Science, New York University, New York, NY 10003, U.S.A.
  - Center for Data Science, New York University, New York, NY 10011, U.S.A.
3
Mastrovito D, Liu YH, Kusmierz L, Shea-Brown E, Koch C, Mihalas S. Transition to chaos separates learning regimes and relates to measure of consciousness in recurrent neural networks. bioRxiv 2024:2024.05.15.594236. [PMID: 38798582 PMCID: PMC11118502 DOI: 10.1101/2024.05.15.594236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Recurrent neural networks exhibit chaotic dynamics when the variance in their connection strengths exceeds a critical value. Recent work indicates connection variance also modulates learning strategies; networks learn "rich" representations when initialized with low coupling and "lazier" solutions with larger variance. Using Watts-Strogatz networks of varying sparsity, structure, and hidden weight variance, we find that the critical coupling strength dividing chaotic from ordered dynamics also differentiates rich and lazy learning strategies. Training moves both stable and chaotic networks closer to the edge of chaos, with networks learning richer representations before the transition to chaos. In contrast, biologically realistic connectivity structures foster stability over a wide range of variances. The transition to chaos is also reflected in a measure that clinically discriminates levels of consciousness, the perturbational complexity index (PCIst). Networks with high values of PCIst exhibit stable dynamics and rich learning, suggesting a consciousness prior may promote rich learning. The results suggest a clear relationship between critical dynamics, learning regimes and complexity-based measures of consciousness.
4
Zhou S, Buonomano DV. Unified control of temporal and spatial scales of sensorimotor behavior through neuromodulation of short-term synaptic plasticity. Sci Adv 2024; 10:eadk7257. [PMID: 38701208 DOI: 10.1126/sciadv.adk7257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 04/03/2024] [Indexed: 05/05/2024]
Abstract
Neuromodulators have been shown to alter the temporal profile of short-term synaptic plasticity (STP); however, the computational function of this neuromodulation remains unexplored. Here, we propose that the neuromodulation of STP provides a general mechanism to scale neural dynamics and motor outputs in time and space. We trained recurrent neural networks that incorporated STP to produce complex motor trajectories (handwritten digits) with different temporal (speed) and spatial (size) scales. Neuromodulation of STP produced temporal and spatial scaling of the learned dynamics and enhanced temporal or spatial generalization compared to standard training of the synaptic weights in the absence of STP. The model also accounted for the results of two experimental studies involving flexible sensorimotor timing. Neuromodulation of STP provides a unified and biologically plausible mechanism to control the temporal and spatial scales of neural dynamics and sensorimotor behaviors.
Affiliation(s)
- Shanglin Zhou
  - Institute for Translational Brain Research, Fudan University, Shanghai, China
  - State Key Laboratory of Medical Neurobiology, Fudan University, Shanghai, China
  - MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
  - Zhongshan Hospital, Fudan University, Shanghai, China
- Dean V Buonomano
  - Department of Neurobiology, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
5
Terada Y, Toyoizumi T. Chaotic neural dynamics facilitate probabilistic computations through sampling. Proc Natl Acad Sci U S A 2024; 121:e2312992121. [PMID: 38648479 PMCID: PMC11067032 DOI: 10.1073/pnas.2312992121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 02/13/2024] [Indexed: 04/25/2024] Open
Abstract
Cortical neurons exhibit highly variable responses over trials and time. Theoretical works posit that this variability arises potentially from chaotic network dynamics of recurrently connected neurons. Here, we demonstrate that chaotic neural dynamics, formed through synaptic learning, allow networks to perform sensory cue integration in a sampling-based implementation. We show that the emergent chaotic dynamics provide neural substrates for generating samples not only of a static variable but also of a dynamical trajectory, where generic recurrent networks acquire these abilities with a biologically plausible learning rule through trial and error. Furthermore, the networks generalize their experience in the stimulus-evoked samples to the inference without partial or all sensory information, which suggests a computational role of spontaneous activity as a representation of the priors as well as a tractable biological computation for marginal distributions. These findings suggest that chaotic neural dynamics may serve for the brain function as a Bayesian generative model.
Affiliation(s)
- Yu Terada
  - Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Saitama 351-0198, Japan
  - Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093
  - The Institute for Physics of Intelligence, The University of Tokyo, Tokyo 113-0033, Japan
- Taro Toyoizumi
  - Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Saitama 351-0198, Japan
  - Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan
6
Lakshminarasimhan KJ, Xie M, Cohen JD, Sauerbrei BA, Hantman AW, Litwin-Kumar A, Escola S. Specific connectivity optimizes learning in thalamocortical loops. Cell Rep 2024; 43:114059. [PMID: 38602873 PMCID: PMC11104520 DOI: 10.1016/j.celrep.2024.114059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 01/04/2024] [Accepted: 03/20/2024] [Indexed: 04/13/2024] Open
Abstract
Thalamocortical loops have a central role in cognition and motor control, but precisely how they contribute to these processes is unclear. Recent studies showing evidence of plasticity in thalamocortical synapses indicate a role for the thalamus in shaping cortical dynamics through learning. Since signals undergo a compression from the cortex to the thalamus, we hypothesized that the computational role of the thalamus depends critically on the structure of corticothalamic connectivity. To test this, we identified the optimal corticothalamic structure that promotes biologically plausible learning in thalamocortical synapses. We found that corticothalamic projections specialized to communicate an efference copy of the cortical output benefit motor control, while communicating the modes of highest variance is optimal for working memory tasks. We analyzed neural recordings from mice performing grasping and delayed discrimination tasks and found corticothalamic communication consistent with these predictions. These results suggest that the thalamus orchestrates cortical dynamics in a functionally precise manner through structured connectivity.
Affiliation(s)
- Marjorie Xie
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Jeremy D Cohen
  - Neuroscience Center, University of North Carolina, Chapel Hill, NC 27559, USA
- Britton A Sauerbrei
  - Department of Neurosciences, Case Western Reserve University, Cleveland, OH 44106, USA
- Adam W Hantman
  - Neuroscience Center, University of North Carolina, Chapel Hill, NC 27559, USA
- Ashok Litwin-Kumar
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Sean Escola
  - Department of Psychiatry, Columbia University, New York, NY 10032, USA
7
Bredenberg C, Savin C, Kiani R. Recurrent Neural Circuits Overcome Partial Inactivation by Compensation and Re-learning. J Neurosci 2024; 44:e1635232024. [PMID: 38413233 PMCID: PMC11026338 DOI: 10.1523/jneurosci.1635-23.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 01/14/2024] [Accepted: 01/20/2024] [Indexed: 02/29/2024] Open
Abstract
Technical advances in artificial manipulation of neural activity have precipitated a surge in studying the causal contribution of brain circuits to cognition and behavior. However, complexities of neural circuits challenge interpretation of experimental results, necessitating new theoretical frameworks for reasoning about causal effects. Here, we take a step in this direction, through the lens of recurrent neural networks trained to perform perceptual decisions. We show that understanding the dynamical system structure that underlies network solutions provides a precise account for the magnitude of behavioral effects due to perturbations. Our framework explains past empirical observations by clarifying the most sensitive features of behavior, and how complex circuits compensate and adapt to perturbations. In the process, we also identify strategies that can improve the interpretability of inactivation experiments.
Affiliation(s)
- Colin Bredenberg
  - Center for Neural Science, New York University, New York, NY 10003
- Cristina Savin
  - Center for Neural Science, New York University, New York, NY 10003
  - Center for Data Science, New York University, New York, NY 10011
- Roozbeh Kiani
  - Center for Neural Science, New York University, New York, NY 10003
  - Department of Psychology, New York University, New York, NY 10003
8
Liu YH, Baratin A, Cornford J, Mihalas S, Shea-Brown E, Lajoie G. How connectivity structure shapes rich and lazy learning in neural circuits. arXiv 2024:arXiv:2310.08513v2. [PMID: 37873007 PMCID: PMC10593070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
In theoretical neuroscience, recent work leverages deep learning tools to explore how some network attributes critically influence its learning dynamics. Notably, initial weight distributions with small (resp. large) variance may yield a rich (resp. lazy) regime, where significant (resp. minor) changes to network states and representation are observed over the course of learning. However, in biology, neural circuit connectivity could exhibit a low-rank structure and therefore differs markedly from the random initializations generally used for these studies. As such, here we investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime. Through both empirical and theoretical analyses, we discover that high-rank initializations typically yield smaller network changes indicative of lazier learning, a finding we also confirm with experimentally-driven initial connectivity in recurrent neural networks. Conversely, low-rank initialization biases learning towards richer learning. Importantly, however, as an exception to this rule, we find lazier learning can still occur with a low-rank initialization that aligns with task and data statistics. Our research highlights the pivotal role of initial weight structures in shaping learning regimes, with implications for metabolic costs of plasticity and risks of catastrophic forgetting.
9
Bohnstingl T, Wozniak S, Pantazi A, Eleftheriou E. Online Spatio-Temporal Learning in Deep Neural Networks. IEEE Trans Neural Netw Learn Syst 2023; 34:8894-8908. [PMID: 35294357 DOI: 10.1109/tnnls.2022.3153985] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/14/2023]
Abstract
Biological neural networks are equipped with an inherent capability to continuously adapt through online learning. This aspect remains in stark contrast to learning with error backpropagation through time (BPTT) that involves offline computation of the gradients due to the need to unroll the network through time. Here, we present an alternative online learning algorithmic framework for deep recurrent neural networks (RNNs) and spiking neural networks (SNNs), called online spatio-temporal learning (OSTL). It is based on insights from biology and proposes the clear separation of spatial and temporal gradient components. For shallow SNNs, OSTL is gradient equivalent to BPTT, enabling for the first time online training of SNNs with BPTT-equivalent gradients. In addition, the proposed formulation unveils a class of SNN architectures trainable online at low time complexity. Moreover, we extend OSTL to a generic form, applicable to a wide range of network architectures, including networks comprising long short-term memory (LSTM) and gated recurrent units (GRUs). We demonstrate the operation of our algorithmic framework on various tasks from language modeling to speech recognition and obtain results on par with the BPTT baselines.
10
Soo WWM, Goudar V, Wang XJ. Training biologically plausible recurrent neural networks on cognitive tasks with long-term dependencies. bioRxiv 2023:2023.10.10.561588. [PMID: 37873445 PMCID: PMC10592728 DOI: 10.1101/2023.10.10.561588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Training recurrent neural networks (RNNs) has become a go-to approach for generating and evaluating mechanistic neural hypotheses for cognition. The ease and efficiency of training RNNs with backpropagation through time and the availability of robustly supported deep learning libraries have made RNN modeling more approachable and accessible to neuroscience. Yet, a major technical hindrance remains. Cognitive processes such as working memory and decision making involve neural population dynamics over a long period of time within a behavioral trial and across trials. It is difficult to train RNNs to accomplish tasks where neural representations and dynamics have long temporal dependencies without gating mechanisms such as LSTMs or GRUs, which currently lack experimental support and prohibit direct comparison between RNNs and biological neural circuits. We tackled this problem based on the idea of specialized skip-connections through time to support the emergence of task-relevant dynamics, and subsequently reinstitute biological plausibility by reverting to the original architecture. We show that this approach enables RNNs to successfully learn cognitive tasks that prove impractical if not impossible to learn using conventional methods. Over numerous tasks considered here, we achieve fewer training steps and shorter wall-clock times, particularly in tasks that require learning long-term dependencies via temporal integration over long timescales or maintaining a memory of past events in hidden-states. Our methods expand the range of experimental tasks that biologically plausible RNN models can learn, thereby supporting the development of theory for the emergent neural mechanisms of computations involving long-term dependencies.
11
Bredenberg C, Savin C. Desiderata for normative models of synaptic plasticity. arXiv 2023:arXiv:2308.04988v1. [PMID: 37608931 PMCID: PMC10441445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Normative models of synaptic plasticity use a combination of mathematics and computational simulations to arrive at predictions of behavioral and network-level adaptive phenomena. In recent years, there has been an explosion of theoretical work on these models, but experimental confirmation is relatively limited. In this review, we organize work on normative plasticity models in terms of a set of desiderata which, when satisfied, are designed to guarantee that a model has a clear link between plasticity and adaptive behavior, consistency with known biological evidence about neural plasticity, and specific testable predictions. We then discuss how new models have begun to improve on these criteria and suggest avenues for further development. As prototypes, we provide detailed analyses of two specific models - REINFORCE and the Wake-Sleep algorithm. We provide a conceptual guide to help develop neural learning theories that are precise, powerful, and experimentally testable.
Affiliation(s)
- Colin Bredenberg
  - Center for Neural Science, New York University, New York, NY 10003, USA
  - Mila-Quebec AI Institute, 6666 Rue Saint-Urbain, Montréal, QC H2S 3H1
- Cristina Savin
  - Center for Neural Science, New York University, New York, NY 10003, USA
  - Center for Data Science, New York University, New York, NY 10011, USA
12
Wärnberg E, Kumar A. Feasibility of dopamine as a vector-valued feedback signal in the basal ganglia. Proc Natl Acad Sci U S A 2023; 120:e2221994120. [PMID: 37527344 PMCID: PMC10410740 DOI: 10.1073/pnas.2221994120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Accepted: 06/08/2023] [Indexed: 08/03/2023] Open
Abstract
It is well established that midbrain dopaminergic neurons support reinforcement learning (RL) in the basal ganglia by transmitting a reward prediction error (RPE) to the striatum. In particular, different computational models and experiments have shown that a striatum-wide RPE signal can support RL over a small discrete set of actions (e.g., go/no-go, choose left/right). However, there is accumulating evidence that the basal ganglia functions not as a selector between predefined actions but rather as a dynamical system with graded, continuous outputs. To reconcile this view with RL, there is a need to explain how dopamine could support learning of continuous outputs, rather than discrete action values. Inspired by the recent observations that besides RPE, the firing rates of midbrain dopaminergic neurons correlate with motor and cognitive variables, we propose a model in which the dopamine signal in the striatum carries a vector-valued error feedback signal (a loss gradient) instead of a homogeneous scalar error (a loss). We implement a local, "three-factor" corticostriatal plasticity rule involving the presynaptic firing rate, a postsynaptic factor, and the unique dopamine concentration perceived by each striatal neuron. With this learning rule, we show that such a vector-valued feedback signal results in an increased capacity to learn a multidimensional series of real-valued outputs. Crucially, we demonstrate that this plasticity rule does not require precise nigrostriatal synapses but remains compatible with experimental observations of random placement of varicosities and diffuse volume transmission of dopamine.
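The form of a three-factor update with per-neuron feedback can be sketched in the simplest possible setting. This is not the authors' model: it assumes linear units whose postsynaptic factor is fixed to 1, an identity readout, and a feedback vector equal to the negative gradient of a squared loss, in which case the rule reduces to the classic delta rule. All names (`three_factor_step`, `train_vector_feedback`, `INPUTS`, `TARGETS`) and the toy data are hypothetical.

```python
def forward(weights, pre):
    """Linear output of each neuron: sum_j weights[i][j] * pre[j]."""
    return [sum(wij * xj for wij, xj in zip(row, pre)) for row in weights]


def three_factor_step(weights, pre, feedback, lr=0.1):
    """Apply dW[i][j] = lr * pre[j] * post[i] * feedback[i] in place."""
    for i, row in enumerate(weights):
        post = 1.0  # Assumption: linear units, so the postsynaptic factor is constant.
        for j in range(len(row)):
            row[j] += lr * pre[j] * post * feedback[i]


def train_vector_feedback(inputs, targets, n_out, epochs=500, lr=0.1):
    """Train with vector-valued feedback; return weights and loss before/after."""
    n_in = len(inputs[0])
    weights = [[0.0] * n_in for _ in range(n_out)]

    def total_loss():
        return sum(
            (t - y) ** 2
            for x, tgt in zip(inputs, targets)
            for t, y in zip(tgt, forward(weights, x))
        )

    initial = total_loss()
    for _ in range(epochs):
        for x, tgt in zip(inputs, targets):
            out = forward(weights, x)
            # Vector-valued feedback: neuron i receives only its own error component.
            feedback = [t - y for t, y in zip(tgt, out)]
            three_factor_step(weights, x, feedback, lr)
    return weights, initial, total_loss()


# Hypothetical toy task: three presynaptic rates mapped to two real-valued outputs.
INPUTS = [(1.0, 0.0, 0.5), (0.0, 1.0, 0.5), (1.0, 1.0, 0.0)]
TARGETS = [(1.0, -1.0), (0.5, 2.0), (1.5, 0.0)]

if __name__ == "__main__":
    _, before, after = train_vector_feedback(INPUTS, TARGETS, n_out=2)
    print(f"loss before: {before:.4f}, after: {after:.3e}")
```

Every update uses only quantities available at the synapse (presynaptic rate, postsynaptic factor) plus the feedback value seen by that neuron. A scalar signal would instead broadcast the same number to every neuron, which is the contrast the abstract draws.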
Affiliation(s)
- Emil Wärnberg
  - Department of Neuroscience, Karolinska Institutet, 171 77 Stockholm, Sweden
  - Division of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 114 28 Stockholm, Sweden
- Arvind Kumar
  - Division of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 114 28 Stockholm, Sweden
13
Zou W, Li C, Huang H. Ensemble perspective for understanding temporal credit assignment. Phys Rev E 2023; 107:024307. [PMID: 36932505 DOI: 10.1103/physreve.107.024307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2021] [Accepted: 01/24/2023] [Indexed: 06/18/2023]
Abstract
Recurrent neural networks are widely used for modeling spatiotemporal sequences in both natural language processing and neural population dynamics. However, understanding the temporal credit assignment is hard. Here, we propose that each individual connection in the recurrent computation is modeled by a spike and slab distribution, rather than a precise weight value. We then derive the mean-field algorithm to train the network at the ensemble level. The method is then applied to classify handwritten digits when pixels are read in sequence, and to the multisensory integration task that is a fundamental cognitive function of animals. Our model reveals important connections that determine the overall performance of the network. The model also shows how spatiotemporal information is processed through the hyperparameters of the distribution, and moreover reveals distinct types of emergent neural selectivity. To provide a mechanistic analysis of the ensemble learning, we first derive an analytic solution of the learning at the infinitely large network limit. We then carry out a low-dimensional projection of both neural and synaptic dynamics, analyze symmetry breaking in the parameter space, and finally demonstrate the role of stochastic plasticity in the recurrent computation. Therefore, our study sheds light on mechanisms of how weight uncertainty impacts the temporal credit assignment in recurrent neural networks from the ensemble perspective.
Affiliation(s)
- Wenxuan Zou
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
- Chan Li
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
- Haiping Huang
  - PMI Lab, School of Physics, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
  - Guangdong Provincial Key Laboratory of Magnetoelectric Physics and Devices, Sun Yat-sen University, Guangzhou 510275, People's Republic of China
14
Nakajima M, Inoue K, Tanaka K, Kuniyoshi Y, Hashimoto T, Nakajima K. Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware. Nat Commun 2022; 13:7847. [PMID: 36572696 PMCID: PMC9792515 DOI: 10.1038/s41467-022-35216-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Accepted: 11/23/2022] [Indexed: 12/28/2022] Open
Abstract
Ever-growing demand for artificial intelligence has motivated research on unconventional computation based on physical devices. While such computation devices mimic brain-inspired analog information processing, the learning procedures still rely on methods optimized for digital processing such as backpropagation, which is not suitable for physical implementation. Here, we present physical deep learning by extending a biologically inspired training algorithm called direct feedback alignment. Unlike the original algorithm, the proposed method is based on random projection with alternative nonlinear activation. Thus, we can train a physical neural network without knowledge about the physical system and its gradient. In addition, we can emulate the computation for this training on scalable physical hardware. We demonstrate the proof-of-concept using an optoelectronic recurrent neural network called deep reservoir computer. We confirmed the potential for accelerated computation with competitive performance on benchmarks. Our results provide practical solutions for the training and acceleration of neuromorphic computation.
Affiliation(s)
- Mitsumasa Nakajima
  - NTT Device Technology Labs., 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198 Japan
- Katsuma Inoue
  - Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
- Kenji Tanaka
  - NTT Device Technology Labs., 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198 Japan
- Yasuo Kuniyoshi
  - Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
  - Next Generation Artificial Intelligence Research Center, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
- Toshikazu Hashimoto
  - NTT Device Technology Labs., 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198 Japan
- Kohei Nakajima
  - Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
  - Next Generation Artificial Intelligence Research Center, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
15
Rostami A, Vogginger B, Yan Y, Mayr CG. E-prop on SpiNNaker 2: Exploring online learning in spiking RNNs on neuromorphic hardware. Front Neurosci 2022; 16:1018006. [DOI: 10.3389/fnins.2022.1018006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 10/19/2022] [Indexed: 11/29/2022] Open
Abstract
Introduction: In recent years, the application of deep learning models at the edge has gained attention. Typically, artificial neural networks (ANNs) are trained on graphics processing units (GPUs) and optimized for efficient execution on edge devices. Training ANNs directly at the edge is the next step with many applications such as the adaptation of models to specific situations like changes in environmental settings or optimization for individuals, e.g., optimization for speakers for speech processing. Also, local training can preserve privacy. Over the last few years, many algorithms have been developed to reduce memory footprint and computation. Methods: A specific challenge to train recurrent neural networks (RNNs) for processing sequential data is the need for the Back Propagation Through Time (BPTT) algorithm to store the network state of all time steps. This limitation is resolved by the biologically-inspired E-prop approach for training Spiking Recurrent Neural Networks (SRNNs). We implement the E-prop algorithm on a prototype of the SpiNNaker 2 neuromorphic system. A parallelization strategy is developed to split and train networks on the ARM cores of SpiNNaker 2 to make efficient use of both memory and compute resources. We trained an SRNN from scratch on SpiNNaker 2 in real-time on the Google Speech Command dataset for keyword spotting. Results: We achieved an accuracy of 91.12% while requiring only 680 KB of memory for training the network with 25 K weights. Compared to other spiking neural networks with equal or better accuracy, our work is significantly more memory-efficient. Discussion: In addition, we performed a memory and time profiling of the E-prop algorithm. This is used on the one hand to discuss whether E-prop or BPTT is better suited for training a model at the edge and on the other hand to explore architecture modifications to SpiNNaker 2 to speed up online learning. Finally, energy estimations predict that the SRNN can be trained on SpiNNaker 2 with 12 times less energy than using an NVIDIA V100 GPU.
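The memory argument in this abstract can be illustrated with a rough accounting sketch. It is not the authors' profiling code: the function names, the 4-byte value size, and the simplification of one eligibility-trace value per weight are assumptions made for illustration, and the per-weight figure assumes that "680 KB" means 680,000 bytes and "25 K" means 25,000 weights.

```python
def bptt_state_bytes(n_neurons: int, n_steps: int, bytes_per_value: int = 4) -> int:
    """Memory to keep one state value per neuron for every time step (BPTT)."""
    return n_neurons * n_steps * bytes_per_value


def online_trace_bytes(n_weights: int, bytes_per_value: int = 4) -> int:
    """Memory for one trace value per weight, independent of sequence length.

    Assumption: a single eligibility-trace value per synapse.
    """
    return n_weights * bytes_per_value


def bytes_per_weight(total_bytes: float, n_weights: int) -> float:
    """Average training memory per weight."""
    return total_bytes / n_weights


# Hypothetical network size, chosen only to show the scaling behaviour.
for steps in (100, 1000):
    print(steps, bptt_state_bytes(n_neurons=200, n_steps=steps))

# Figures reported in the abstract, under the unit assumptions stated above.
print(bytes_per_weight(680_000, 25_000))
```

In this simplified view, the stored-state cost of BPTT grows linearly with the number of time steps, while the trace memory depends only on the number of weights.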
16
Pohl M, Uesaka M, Takahashi H, Demachi K, Bhusal Chhatkuli R. Prediction of the position of external markers using a recurrent neural network trained with unbiased online recurrent optimization for safe lung cancer radiotherapy. Computer Methods and Programs in Biomedicine 2022; 222:106908. [PMID: 35716534 DOI: 10.1016/j.cmpb.2022.106908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2021] [Revised: 03/24/2022] [Accepted: 05/23/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE: During lung cancer radiotherapy, the position of infrared reflective objects on the chest can be recorded to estimate the tumor location. However, radiotherapy systems have a latency inherent to robot control limitations that impedes the radiation delivery precision. Prediction with online learning of recurrent neural networks (RNN) allows for adaptation to non-stationary respiratory signals, but classical methods such as real-time recurrent learning (RTRL) and truncated backpropagation through time are respectively slow and biased. This study investigates the capabilities of unbiased online recurrent optimization (UORO) to forecast respiratory motion and enhance safety in lung radiotherapy. METHODS: We used nine observation records of the three-dimensional (3D) position of three external markers on the chest and abdomen of healthy individuals breathing during intervals from 73 s to 222 s. The sampling frequency was 10 Hz, and the amplitudes of the recorded trajectories range from 6 mm to 40 mm in the superior-inferior direction. We forecast the 3D location of each marker simultaneously with a horizon value (the time interval in advance for which the prediction is made) between 0.1 s and 2.0 s, using an RNN trained with UORO. We compare its performance with an RNN trained with RTRL, least mean squares (LMS), and offline linear regression. We provide closed-form expressions for quantities involved in the gradient loss calculation in UORO, thereby making its implementation efficient. Training and cross-validation were performed during the first minute of each sequence. RESULTS: On average over the horizon values considered and the nine sequences, UORO achieves the lowest root-mean-square (RMS) error and maximum error among the compared algorithms. These errors are respectively equal to 1.3 mm and 8.8 mm, and the prediction time per time step was lower than 2.8 ms (Dell Intel core i9-9900K 3.60 GHz). Linear regression has the lowest RMS error for the horizon values 0.1 s and 0.2 s, followed by LMS for horizon values between 0.3 s and 0.5 s, and UORO for horizon values greater than 0.6 s. CONCLUSIONS: UORO can accurately predict the 3D position of external markers for intermediate to high response times with an acceptable time performance. This will help limit unwanted damage to healthy tissues caused by radiotherapy.
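The two error metrics and the horizon values named in this abstract can be made concrete with a short sketch. It is illustrative only: the function names are hypothetical, and treating the per-step error as the Euclidean distance between predicted and true 3D positions is an assumption, since the abstract does not spell out the exact definition used in the paper.

```python
import math
from typing import Sequence, Tuple

Point = Sequence[float]


def horizon_in_steps(horizon_s: float, sampling_hz: float = 10.0) -> int:
    """Convert a prediction horizon in seconds to a number of samples."""
    return int(round(horizon_s * sampling_hz))


def rms_and_max_error(pred: Sequence[Point], true: Sequence[Point]) -> Tuple[float, float]:
    """Return (RMS error, maximum error) over a sequence of 3D positions.

    Assumption: the per-step error is the Euclidean distance between the
    predicted and the true position.
    """
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("pred and true must be non-empty and of equal length")
    dists = [math.dist(p, t) for p, t in zip(pred, true)]
    rms = math.sqrt(sum(d * d for d in dists) / len(dists))
    return rms, max(dists)


# At the 10 Hz sampling rate quoted above, the horizons 0.1 s and 2.0 s
# correspond to 1 and 20 samples ahead.
print(horizon_in_steps(0.1), horizon_in_steps(2.0))
```

The maximum error is reported alongside the RMS error because a single large miss matters for safety even when the average error is small.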
Affiliation(s)
- Michel Pohl
  - The University of Tokyo, 113-8654 Tokyo, Japan
- Ritu Bhusal Chhatkuli
  - National Institutes for Quantum and Radiological Science and Technology, 263-8555 Chiba, Japan
17
Calderon CB, Verguts T, Frank MJ. Thunderstruck: The ACDC model of flexible sequences and rhythms in recurrent neural circuits. PLoS Comput Biol 2022; 18:e1009854. [PMID: 35108283 PMCID: PMC8843237 DOI: 10.1371/journal.pcbi.1009854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/19/2021] [Revised: 02/14/2022] [Accepted: 01/21/2022] [Indexed: 11/18/2022] Open
Abstract
Adaptive sequential behavior is a hallmark of human cognition. In particular, humans can learn to produce precise spatiotemporal sequences given a certain context. For instance, musicians can not only reproduce learned action sequences in a context-dependent manner, they can also quickly and flexibly reapply them in any desired tempo or rhythm without overwriting previous learning. Existing neural network models fail to account for these properties. We argue that this limitation emerges from the fact that sequence information (i.e., the position of the action) and timing (i.e., the moment of response execution) are typically stored in the same neural network weights. Here, we augment a biologically plausible recurrent neural network of cortical dynamics to include a basal ganglia-thalamic module which uses reinforcement learning to dynamically modulate action. This “associative cluster-dependent chain” (ACDC) model modularly stores sequence and timing information in distinct loci of the network. This feature increases computational power and allows ACDC to display a wide range of temporal properties (e.g., multiple sequences, temporal shifting, rescaling, and compositionality), while still accounting for several behavioral and neurophysiological empirical observations. Finally, we apply this ACDC network to show how it can learn the famous “Thunderstruck” song intro and then flexibly play it in a “bossa nova” rhythm without further training. How do humans flexibly adapt action sequences? For instance, musicians can learn a song and quickly speed up or slow down the tempo, or even play the song following a completely different rhythm (e.g., a rock song using a bossa nova rhythm). In this work, we build a biologically plausible network of cortico-basal ganglia interactions that explains how this temporal flexibility may emerge in the brain. 
Crucially, our model factorizes sequence order and action timing, respectively represented in cortical and basal ganglia dynamics. This factorization allows full temporal flexibility, i.e., the timing of a learned action sequence can be recomposed without interfering with the order of the sequence. As such, our model is capable of learning asynchronous action sequences and of flexibly shifting, rescaling, and recomposing them, while accounting for biological data.
Affiliation(s)
- Cristian Buc Calderon
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
  - Carney Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
- Tom Verguts
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Michael J. Frank
  - Department of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, Rhode Island, United States of America
  - Carney Institute for Brain Science, Brown University, Providence, Rhode Island, United States of America
18
Cell-type-specific neuromodulation guides synaptic credit assignment in a spiking neural network. Proc Natl Acad Sci U S A 2021; 118:2111821118. [PMID: 34916291 PMCID: PMC8713766 DOI: 10.1073/pnas.2111821118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 10/28/2021] [Indexed: 12/27/2022] Open
Abstract
Synaptic connectivity provides the foundation for our present understanding of neuronal network function, but static connectivity cannot explain learning and memory. We propose a computational role for the diversity of cortical neuronal types and their associated cell-type–specific neuromodulators in improving the efficiency of synaptic weight adjustments for task learning in neuronal networks. Brains learn tasks via experience-driven differential adjustment of their myriad individual synaptic connections, but the mechanisms that target appropriate adjustment to particular connections remain deeply enigmatic. While Hebbian synaptic plasticity, synaptic eligibility traces, and top-down feedback signals surely contribute to solving this synaptic credit-assignment problem, alone, they appear to be insufficient. Inspired by new genetic perspectives on neuronal signaling architectures, here, we present a normative theory for synaptic learning, where we predict that neurons communicate their contribution to the learning outcome to nearby neurons via cell-type–specific local neuromodulation. Computational tests suggest that neuron-type diversity and neuron-type–specific local neuromodulation may be critical pieces of the biological credit-assignment puzzle. They also suggest algorithms for improved artificial neural network learning efficiency.
19
Abstract
Recurrent neural networks can solve a variety of computational tasks and produce patterns of activity that capture key properties of brain circuits. However, learning rules designed to train these models are time-consuming and prone to inaccuracies when tuning connection weights located deep within the network. Here, we describe a rapid one-shot learning rule to train recurrent networks composed of biologically-grounded neurons. First, inputs to the model are compressed onto a smaller number of recurrent neurons. Then, a non-iterative rule adjusts the output weights of these neurons based on a target signal. The model learned to reproduce natural images, sequential patterns, as well as a high-resolution movie scene. Together, results provide a novel avenue for one-shot learning in biologically realistic recurrent networks and open a path to solving complex tasks by merging brain-inspired models with rapid optimization rules.
20
Zambrano D, Roelfsema PR, Bohte S. Learning continuous-time working memory tasks with on-policy neural reinforcement learning. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.11.072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
21
Raman DV, O'Leary T. Optimal plasticity for memory maintenance during ongoing synaptic change. eLife 2021; 10:62912. [PMID: 34519270 PMCID: PMC8504970 DOI: 10.7554/elife.62912] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/08/2020] [Accepted: 09/13/2021] [Indexed: 11/13/2022] Open
Abstract
Synaptic connections in many brain circuits fluctuate, exhibiting substantial turnover and remodelling over hours to days. Surprisingly, experiments show that most of this flux in connectivity persists in the absence of learning or known plasticity signals. How can neural circuits retain learned information despite a large proportion of ongoing and potentially disruptive synaptic changes? We address this question from first principles by analysing how much compensatory plasticity would be required to optimally counteract ongoing fluctuations, regardless of whether fluctuations are random or systematic. Remarkably, we find that the answer is largely independent of plasticity mechanisms and circuit architectures: compensatory plasticity should be at most equal in magnitude to fluctuations, and often less, in direct agreement with previously unexplained experimental observations. Moreover, our analysis shows that a high proportion of learning-independent synaptic change is consistent with plasticity mechanisms that accurately compute error gradients.
Affiliation(s)
- Dhruva V Raman
  - Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Timothy O'Leary
  - Department of Engineering, University of Cambridge, Cambridge, United Kingdom
22
Aljadeff J, Gillett M, Pereira Obilinovic U, Brunel N. From synapse to network: models of information storage and retrieval in neural circuits. Curr Opin Neurobiol 2021; 70:24-33. [PMID: 34175521 DOI: 10.1016/j.conb.2021.05.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/03/2021] [Revised: 05/06/2021] [Accepted: 05/25/2021] [Indexed: 10/21/2022]
Abstract
The mechanisms of information storage and retrieval in brain circuits are still the subject of debate. It is widely believed that information is stored at least in part through changes in synaptic connectivity in networks that encode this information and that these changes lead in turn to modifications of network dynamics, such that the stored information can be retrieved at a later time. Here, we review recent progress in deriving synaptic plasticity rules from experimental data and in understanding how plasticity rules affect the dynamics of recurrent networks. We show that the dynamics generated by such networks exhibit a large degree of diversity, depending on parameters, similar to experimental observations in vivo during delayed response tasks.
Affiliation(s)
- Johnatan Aljadeff
  - Neurobiology Section, Division of Biological Sciences, UC San Diego, USA
- Nicolas Brunel
  - Department of Neurobiology, Duke University, USA
  - Department of Physics, Duke University, USA
23
Raman DV, O'Leary T. Frozen algorithms: how the brain's wiring facilitates learning. Curr Opin Neurobiol 2021; 67:207-214. [PMID: 33508698 PMCID: PMC8202511 DOI: 10.1016/j.conb.2020.12.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/30/2020] [Revised: 12/21/2020] [Accepted: 12/30/2020] [Indexed: 12/03/2022]
Abstract
Synapses and neural connectivity are plastic and shaped by experience. But to what extent does connectivity itself influence the ability of a neural circuit to learn? Insights from optimization theory and AI shed light on how learning can be implemented in neural circuits. Though abstract in their nature, learning algorithms provide a principled set of hypotheses on the necessary ingredients for learning in neural circuits. These include the kinds of signals and circuit motifs that enable learning from experience, as well as an appreciation of the constraints that make learning challenging in a biological setting. Remarkably, some simple connectivity patterns can boost the efficiency of relatively crude learning rules, showing how the brain can use anatomy to compensate for the biological constraints of known synaptic plasticity mechanisms. Modern connectomics provides rich data for exploring this principle, and may reveal how brain connectivity is constrained by the requirement to learn efficiently.
Affiliation(s)
- Dhruva V Raman
  - Department of Engineering, University of Cambridge, United Kingdom
- Timothy O'Leary
  - Department of Engineering, University of Cambridge, United Kingdom
24
Cone I, Shouval HZ. Learning precise spatiotemporal sequences via biophysically realistic learning rules in a modular, spiking network. eLife 2021; 10:63751. [PMID: 33734085 PMCID: PMC7972481 DOI: 10.7554/elife.63751] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/06/2020] [Accepted: 02/16/2021] [Indexed: 11/13/2022] Open
Abstract
Multiple brain regions are able to learn and express temporal sequences, and this functionality is an essential component of learning and memory. We propose a substrate for such representations via a network model that learns and recalls discrete sequences of variable order and duration. The model consists of a network of spiking neurons placed in a modular microcolumn based architecture. Learning is performed via a biophysically realistic learning rule that depends on synaptic 'eligibility traces'. Before training, the network contains no memory of any particular sequence. After training, presentation of only the first element in that sequence is sufficient for the network to recall an entire learned representation of the sequence. An extended version of the model also demonstrates the ability to successfully learn and recall non-Markovian sequences. This model provides a possible framework for biologically plausible sequence learning and memory, in agreement with recent experimental results.
Affiliation(s)
- Ian Cone
  - Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, United States
  - Applied Physics, Rice University, Houston, TX, United States
- Harel Z Shouval
  - Neurobiology and Anatomy, University of Texas Medical School at Houston, Houston, TX, United States
25
Muratore P, Capone C, Paolucci PS. Target spike patterns enable efficient and biologically plausible learning for complex temporal tasks. PLoS One 2021; 16:e0247014. [PMID: 33592040 PMCID: PMC7886200 DOI: 10.1371/journal.pone.0247014] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/25/2020] [Accepted: 01/31/2021] [Indexed: 11/28/2022] Open
Abstract
Recurrent spiking neural networks (RSNN) in the brain learn to perform a wide range of perceptual, cognitive and motor tasks very efficiently in terms of energy consumption and their training requires very few examples. This motivates the search for biologically inspired learning rules for RSNNs, aiming to improve our understanding of brain computation and the efficiency of artificial intelligence. Several spiking models and learning rules have been proposed, but it remains a challenge to design RSNNs whose learning relies on biologically plausible mechanisms and that are capable of solving complex temporal tasks. In this paper, we derive a learning rule, local to the synapse, from a simple mathematical principle, the maximization of the likelihood for the network to solve a specific task. We propose a novel target-based learning scheme in which the learning rule derived from likelihood maximization is used to mimic a specific spatio-temporal spike pattern that encodes the solution to complex temporal tasks. This method makes the learning extremely rapid and precise, outperforming state-of-the-art algorithms for RSNNs. While error-based approaches (e.g., e-prop) optimize the internal sequence of spikes trial after trial in order to progressively minimize the MSE, we assume that a signal randomly projected from an external origin (e.g., from other brain areas) directly defines the target sequence. This facilitates the learning procedure since the network is trained from the beginning to reproduce the desired internal sequence. We propose two versions of our learning rule: spike-dependent and voltage-dependent. We find that the latter provides remarkable benefits in terms of learning speed and robustness to noise. We demonstrate the capacity of our model to tackle several problems like learning multidimensional trajectories and solving the classical temporal XOR benchmark.
Finally, we show that an online approximation of the gradient ascent, in addition to guaranteeing complete locality in time and space, allows learning after very few presentations of the target output. Our model can be applied to different types of biological neurons. The analytically derived plasticity learning rule is specific to each neuron model and can produce a theoretical prediction for experimental validation.
Affiliation(s)
- Paolo Muratore
  - SISSA—International School for Advanced Studies, Trieste, Italy
26
Feulner B, Clopath C. Neural manifold under plasticity in a goal driven learning behaviour. PLoS Comput Biol 2021; 17:e1008621. [PMID: 33544700 PMCID: PMC7864452 DOI: 10.1371/journal.pcbi.1008621] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/10/2020] [Accepted: 12/08/2020] [Indexed: 11/19/2022] Open
Abstract
Neural activity is often low dimensional and dominated by only a few prominent neural covariation patterns. It has been hypothesised that these covariation patterns could form the building blocks used for fast and flexible motor control. Supporting this idea, recent experiments have shown that monkeys can learn to adapt their neural activity in motor cortex on a timescale of minutes, given that the change lies within the original low-dimensional subspace, also called neural manifold. However, the neural mechanism underlying this within-manifold adaptation remains unknown. Here, we show in a computational model that modification of recurrent weights, driven by a learned feedback signal, can account for the observed behavioural difference between within- and outside-manifold learning. Our findings give a new perspective, showing that recurrent weight changes do not necessarily lead to change in the neural manifold. On the contrary, successful learning is naturally constrained to a common subspace.
Affiliation(s)
- Barbara Feulner
  - Department of Bioengineering, Imperial College London, London, United Kingdom
- Claudia Clopath
  - Department of Bioengineering, Imperial College London, London, United Kingdom
27
Artificial Neural Networks for Neuroscientists: A Primer. Neuron 2020; 107:1048-1070. [DOI: 10.1016/j.neuron.2020.09.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 05/29/2020] [Revised: 08/24/2020] [Accepted: 09/01/2020] [Indexed: 12/25/2022]
28
A solution to the learning dilemma for recurrent networks of spiking neurons. Nat Commun 2020; 11:3625. [PMID: 32681001 PMCID: PMC7367848 DOI: 10.1038/s41467-020-17236-y] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 12/09/2019] [Accepted: 06/16/2020] [Indexed: 11/09/2022] Open
Abstract
Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. Yet in spite of extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.
|