1. Qin L, Wang Z, Yan R, Tang H. Attention-Based Deep Spiking Neural Networks for Temporal Credit Assignment Problems. IEEE Trans Neural Netw Learn Syst 2024; 35:10301-10311. PMID: 37022405. DOI: 10.1109/tnnls.2023.3240176.
Abstract
The temporal credit assignment (TCA) problem, which aims to detect predictive features hidden in distracting background streams, remains a core challenge in biological and machine learning. Aggregate-label (AL) learning has been proposed to resolve this problem by matching spikes with delayed feedback. However, existing AL learning algorithms only consider the information of a single timestep, which is inconsistent with the real situation, and there has been no quantitative evaluation method for TCA problems. To address these limitations, we propose a novel attention-based TCA (ATCA) algorithm and a minimum editing distance (MED)-based quantitative evaluation method. Specifically, we define a loss function based on the attention mechanism to deal with the information contained within spike clusters, and we use MED to evaluate the similarity between the spike train and the target clue flow. Experimental results on musical instrument recognition (MedleyDB), speech recognition (TIDIGITS), and gesture recognition (DVS128-Gesture) show that the ATCA algorithm reaches state-of-the-art (SOTA) performance compared with other AL learning algorithms.
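The MED named above is the standard minimum edit (Levenshtein) distance; a minimal sketch, assuming spike clusters have already been decoded into symbol sequences (the decoding step is specific to the paper and not reproduced here):

```python
# Minimum edit distance between two symbol sequences, e.g., a clue
# sequence decoded from spike clusters vs. the target clue flow.
def min_edit_distance(pred, target):
    m, n = len(pred), len(target)
    # dp[i][j] = edits needed to turn pred[:i] into target[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # i deletions
    for j in range(n + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if pred[i - 1] == target[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match/substitution
    return dp[m][n]

print(min_edit_distance("ABCA", "ABA"))  # -> 1 (one deletion)
```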
2. Pan W, Zhao F, Han B, Dong Y, Zeng Y. Emergence of brain-inspired small-world spiking neural network through neuroevolution. iScience 2024; 27:108845. PMID: 38327781. PMCID: PMC10847652. DOI: 10.1016/j.isci.2024.108845.
Abstract
Studies suggest that the brain's high efficiency and low energy consumption may be closely related to its small-world topology and critical dynamics. However, existing efforts on the performance-oriented structural evolution of spiking neural networks (SNNs) are time-consuming and ignore the core structural properties of the brain. Here, we introduce a multi-objective Evolutionary Liquid State Machine (ELSM), which blends the small-world coefficient and criticality to evolve models and guide the emergence of brain-inspired, efficient structures. Experiments reveal ELSM's consistent and competitive performance: it achieves 97.23% on NMNIST and outperforms LSM models on MNIST and Fashion-MNIST with 98.12% and 88.81% accuracy, respectively. Further analysis shows its versatility and the spontaneous evolution of topological features such as hub nodes, short paths, long-tailed degree distributions, and numerous communities. This study evolves recurrent spiking neural networks into brain-inspired, energy-efficient structures, showcasing versatility across multiple tasks and potential for adaptive general artificial intelligence.
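A common way to quantify the small-world property used as an objective above is the coefficient sigma = (C/C_rand)/(L/L_rand), comparing clustering C and characteristic path length L against size-matched random graphs. How ELSM computes its objective internally is not specified here, so the following is an illustrative sketch using networkx:

```python
import networkx as nx

def avg_path_len(g):
    # restrict to the largest connected component so path length is defined
    comp = max(nx.connected_components(g), key=len)
    return nx.average_shortest_path_length(g.subgraph(comp))

def small_world_sigma(g, n_rand=10, seed=0):
    c, l = nx.average_clustering(g), avg_path_len(g)
    rands = [nx.gnm_random_graph(g.number_of_nodes(), g.number_of_edges(),
                                 seed=seed + i) for i in range(n_rand)]
    c_r = sum(nx.average_clustering(r) for r in rands) / n_rand
    l_r = sum(avg_path_len(r) for r in rands) / n_rand
    return (c / c_r) / (l / l_r)  # sigma >> 1 suggests small-world structure

print(small_world_sigma(nx.watts_strogatz_graph(100, 6, 0.1, seed=1)))
```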
Affiliation(s)
- Wenxuan Pan: Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China
- Feifei Zhao: Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Bing Han: Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China
- Yiting Dong: Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Future Technology, University of Chinese Academy of Sciences, Beijing 101408, China
- Yi Zeng: Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence and School of Future Technology, University of Chinese Academy of Sciences, Beijing 101408, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
3. Rathi N, Roy K. DIET-SNN: A Low-Latency Spiking Neural Network With Direct Input Encoding and Leakage and Threshold Optimization. IEEE Trans Neural Netw Learn Syst 2023; 34:3174-3182. PMID: 34596559. DOI: 10.1109/tnnls.2021.3111897.
Abstract
Bioinspired spiking neural networks (SNNs), operating with asynchronous binary signals (or spikes) distributed over time, can potentially lead to greater computational efficiency on event-driven hardware. State-of-the-art SNNs, however, suffer from high inference latency, resulting from inefficient input encoding and suboptimal settings of the neuron parameters (firing threshold and membrane leak). We propose DIET-SNN, a low-latency deep spiking network trained with gradient descent to optimize the membrane leak and the firing threshold along with the other network parameters (weights). The membrane leak and threshold of each layer are optimized with end-to-end backpropagation to achieve competitive accuracy at reduced latency. The input layer directly processes the analog pixel values of an image without converting them to spike trains. The first convolutional layer converts analog inputs into spikes, where leaky integrate-and-fire (LIF) neurons integrate the weighted inputs and generate an output spike when the membrane potential crosses the trained firing threshold. The trained membrane leak selectively attenuates the membrane potential, which increases activation sparsity in the network. The reduced latency combined with high activation sparsity provides massive improvements in computational efficiency. We evaluate DIET-SNN on image classification tasks from the CIFAR and ImageNet datasets on VGG and ResNet architectures. We achieve top-1 accuracy of 69% with five timesteps (inference latency) on ImageNet with 12× less compute energy than an equivalent standard artificial neural network (ANN). In addition, DIET-SNN performs 20-500× faster inference compared with other state-of-the-art SNN models.
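The neuron model described above reduces to a leaky integrate-and-fire update in which leak and threshold are ordinary parameters that gradient descent could tune; a minimal forward-pass sketch with illustrative values (the actual DIET-SNN training code is not reproduced here):

```python
import numpy as np

def lif_forward(x, w, leak, v_th, t_steps=5):
    """LIF layer driven by analog inputs over a few timesteps.

    x: (t_steps, n_in) analog inputs (e.g., pixel-derived currents);
    w: (n_in, n_out) weights. In DIET-SNN, leak and v_th are per-layer
    parameters trained jointly with the weights; here they are fixed.
    """
    v = np.zeros(w.shape[1])
    spikes = np.zeros((t_steps, w.shape[1]))
    for t in range(t_steps):
        v = leak * v + x[t] @ w            # leaky integration of weighted input
        fired = v >= v_th                  # spike when potential crosses threshold
        spikes[t] = fired
        v = np.where(fired, v - v_th, v)   # soft reset by subtraction
    return spikes

rng = np.random.default_rng(0)
s = lif_forward(rng.random((5, 16)), rng.normal(0, 0.3, (16, 8)),
                leak=0.9, v_th=1.0)
print(s.sum(axis=0))  # spike counts per output neuron over 5 timesteps
```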
4. Ma C, Yan R, Yu Z, Yu Q. Deep Spike Learning With Local Classifiers. IEEE Trans Cybern 2023; 53:3363-3375. PMID: 35867374. DOI: 10.1109/tcyb.2022.3188015.
Abstract
Backpropagation has been successfully generalized to optimize deep spiking neural networks (SNNs); nevertheless, gradients need to be propagated back through all layers, resulting in massive consumption of computing resources and an obstacle to parallelizing training. Biologically motivated local learning provides an alternative for efficiently training deep networks but often suffers from low accuracy on practical tasks. How to train deep SNNs with a local learning scheme that achieves both efficient and accurate performance thus remains an important challenge. In this study, we focus on a supervised local learning scheme where each layer is independently optimized with an auxiliary classifier. Accordingly, we first propose an efficient spike-based local learning rule that considers only the direct dependencies at the current time step. We then propose two variants that additionally incorporate temporal dependencies through a backward and a forward process, respectively. The effectiveness and performance of our proposed methods are extensively evaluated on six mainstream datasets. Experimental results show that our methods scale up to large networks and substantially outperform spike-based local learning baselines on all studied benchmarks. Our results also reveal that gradients with temporal dependencies are essential for high performance on temporal tasks, while they have negligible effects on rate-based tasks. Our work is significant as it brings the performance of spike-based local learning to a new level while retaining the computational benefits.
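The layer-local scheme can be illustrated without spiking dynamics: each layer gets its own auxiliary classifier and is updated only from that classifier's error, so no gradient crosses layer boundaries. A rate-based toy sketch (the paper's spike-based rules and temporal variants are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((32, 20))                       # a batch of inputs
y = np.eye(4)[rng.integers(0, 4, 32)]          # one-hot labels
sizes = [20, 64, 64]
ws = [rng.normal(0, 0.1, (sizes[i], sizes[i + 1])) for i in range(2)]
aux = [rng.normal(0, 0.1, (sizes[i + 1], 4)) for i in range(2)]  # per-layer classifiers

h = x
for w, a in zip(ws, aux):
    z = h @ w
    h_next = np.maximum(z, 0.0)                # ReLU stands in for spiking activation
    err = h_next @ a - y                       # local classification error
    # each layer is optimized only through its own auxiliary classifier:
    a -= 0.01 * h_next.T @ err / len(x)
    w -= 0.01 * h.T @ ((err @ a.T) * (z > 0)) / len(x)
    h = h_next                                 # forward pass continues; no global BP
```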
5. Xiao M, Meng Q, Zhang Z, Wang Y, Lin Z. SPIDE: A purely spike-based method for training feedback spiking neural networks. Neural Netw 2023; 161:9-24. PMID: 36736003. DOI: 10.1016/j.neunet.2023.01.026.
Abstract
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware. However, most supervised SNN training methods, such as conversion from artificial neural networks or direct training with surrogate gradients, require complex computation rather than the spike-based operations of spiking neurons during training. In this paper, we study spike-based implicit differentiation on the equilibrium state (SPIDE), which extends the recently proposed training method of implicit differentiation on the equilibrium state (IDE) to supervised learning with purely spike-based computation, demonstrating the potential for energy-efficient training of SNNs. Specifically, we introduce ternary spiking neuron couples and prove that implicit differentiation can be solved by spikes based on this design, so the whole training procedure, including both forward and backward passes, is carried out as event-driven spike computation, and weights are updated locally with two-stage average firing rates. We then propose to modify the reset membrane potential to reduce the approximation error of spikes. With these key components, we can train SNNs with flexible structures in a small number of time steps and with firing sparsity during training, and the theoretical estimation of energy costs demonstrates the potential for high efficiency. Meanwhile, experiments show that even with these constraints, our trained models can still achieve competitive results on MNIST, CIFAR-10, CIFAR-100, and CIFAR10-DVS.
Affiliation(s)
- Mingqing Xiao: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China
- Qingyan Meng: The Chinese University of Hong Kong, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen 518115, China
- Zongpeng Zhang: Center for Data Science, Academy for Advanced Interdisciplinary Studies, Peking University, China
- Yisen Wang: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China; Institute for Artificial Intelligence, Peking University, China
- Zhouchen Lin: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China; Institute for Artificial Intelligence, Peking University, China; Peng Cheng Laboratory, China
6. Lin X, Zhang Z, Zheng D. Supervised Learning Algorithm Based on Spike Train Inner Product for Deep Spiking Neural Networks. Brain Sci 2023; 13:168. PMID: 36831711. PMCID: PMC9954578. DOI: 10.3390/brainsci13020168.
Abstract
By mimicking the hierarchical structure of the human brain, deep spiking neural networks (DSNNs) can extract features gradually from lower to higher levels and improve performance in processing spatio-temporal information. Due to the complex hierarchical structure and implicit nonlinear mechanisms, the formulation of spike-train-level supervised learning methods for DSNNs remains an important problem in this research area. Based on the definition of kernel functions and the spike train inner product (STIP), together with the idea of error backpropagation (BP), this paper first proposes a deep supervised learning algorithm for DSNNs named BP-STIP. Furthermore, to alleviate the intrinsic weight-transport problem of the BP mechanism, feedback alignment (FA) and broadcast alignment (BA) mechanisms are utilized to optimize the error feedback mode of BP-STIP, yielding two further deep supervised learning algorithms, FA-STIP and BA-STIP. In the experiments, the effectiveness of the three proposed DSNN algorithms is verified on the MNIST digit image benchmark, and the influence of different kernel functions on the learning performance of DSNNs with different network scales is analyzed. Experimental results show that the FA-STIP and BA-STIP algorithms achieve 94.73% and 95.65% classification accuracy, respectively, exhibiting better learning performance and stability than the baseline BP-STIP algorithm.
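A spike train inner product is typically defined by smoothing spike times with a kernel, so that <s, s'> = sum over i, j of kappa(t_i, t'_j). The kernel choice below (Laplacian) and the time constant are illustrative, not necessarily those used in BP-STIP:

```python
import numpy as np

def stip(s1, s2, tau=5.0):
    """Spike-train inner product with a Laplacian kernel:
    <s1, s2> = sum_i sum_j exp(-|t_i - t'_j| / tau)."""
    t1, t2 = np.asarray(s1)[:, None], np.asarray(s2)[None, :]
    return np.exp(-np.abs(t1 - t2) / tau).sum()

a, b = [5.0, 12.0, 30.0], [6.0, 29.0]
# the induced spike-train distance, usable as a supervised loss:
dist = stip(a, a) + stip(b, b) - 2 * stip(a, b)
print(dist)
```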
7. Nakajima M, Inoue K, Tanaka K, Kuniyoshi Y, Hashimoto T, Nakajima K. Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware. Nat Commun 2022; 13:7847. PMID: 36572696. PMCID: PMC9792515. DOI: 10.1038/s41467-022-35216-2.
Abstract
Ever-growing demand for artificial intelligence has motivated research on unconventional computation based on physical devices. While such devices mimic brain-inspired analog information processing, the learning procedures still rely on methods optimized for digital processing, such as backpropagation, which is not suitable for physical implementation. Here, we present physical deep learning by extending a biologically inspired training algorithm called direct feedback alignment. Unlike the original algorithm, the proposed method is based on random projection with an alternative nonlinear activation, so we can train a physical neural network without knowledge of the physical system or its gradient. In addition, we can emulate the computation for this training on scalable physical hardware. We demonstrate a proof of concept using an optoelectronic recurrent neural network called a deep reservoir computer and confirm the potential for accelerated computation with competitive performance on benchmarks. Our results provide practical solutions for the training and acceleration of neuromorphic computation.
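The underlying direct feedback alignment (DFA) idea is to replace the backpropagated error at each hidden layer with a fixed random projection of the output error. A minimal numpy sketch of standard DFA on a one-hidden-layer network (the paper's augmented variant swaps the activation derivative for an alternative nonlinearity, omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((64, 30))
y = np.eye(10)[rng.integers(0, 10, 64)]
w1, w2 = rng.normal(0, 0.1, (30, 50)), rng.normal(0, 0.1, (50, 10))
b1 = rng.normal(0, 0.1, (10, 50))   # fixed random feedback matrix, never trained

for _ in range(200):
    h = np.tanh(x @ w1)
    e = h @ w2 - y                   # output error
    # DFA: project the error directly to the hidden layer through b1,
    # requiring no knowledge of the forward weights w2 or their transpose
    w2 -= 0.05 * h.T @ e / len(x)
    w1 -= 0.05 * x.T @ ((e @ b1) * (1 - h**2)) / len(x)
```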
Affiliation(s)
- Mitsumasa Nakajima: NTT Device Technology Labs, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan
- Katsuma Inoue: Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Kenji Tanaka: NTT Device Technology Labs, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan
- Yasuo Kuniyoshi: Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan; Next Generation Artificial Intelligence Research Center, The University of Tokyo, Tokyo, Japan
- Toshikazu Hashimoto: NTT Device Technology Labs, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan
- Kohei Nakajima: Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan; Next Generation Artificial Intelligence Research Center, The University of Tokyo, Tokyo, Japan
8. Zarkeshian P, Kergan T, Ghobadi R, Nicola W, Simon C. Photons guided by axons may enable backpropagation-based learning in the brain. Sci Rep 2022; 12:20720. PMID: 36456619. PMCID: PMC9715721. DOI: 10.1038/s41598-022-24871-6.
Abstract
Despite great advances in explaining synaptic plasticity and neuron function, a complete understanding of the brain's learning algorithms is still missing. Artificial neural networks provide a powerful learning paradigm through the backpropagation algorithm, which modifies synaptic weights using feedback connections. Backpropagation requires extensive communication of information back through the layers of a network; this has been argued to be biologically implausible, and it is not clear whether backpropagation can be realized in the brain. Here we suggest that biophotons guided by axons provide a potential channel for backward transmission of information in the brain. Biophotons have been experimentally shown to be produced in the brain, yet their purpose is not understood. We propose that biophotons can propagate from each postsynaptic neuron to its presynaptic one to carry the required information backward. To reflect the stochastic character of biophoton emissions, our model includes stochastic backward transmission of teaching signals. We demonstrate that a three-layered network of neurons can learn the MNIST handwritten-digit classification task using our proposed backpropagation-like algorithm with stochastic photonic feedback. We model realistic restrictions and show that our system still learns the task for low rates of biophoton emission, information-limited (one bit per photon) backward transmission, and in the presence of noise photons. Our results suggest a new functionality for biophotons and provide an alternative mechanism for backward transmission in the brain.
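The two constraints emphasized above, stochastic emission and one bit per photon, can be captured by a feedback channel that transmits only the sign of an error component and only when a photon is emitted. A schematic sketch, with emission rate and encoding as illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def photonic_feedback(error, rate=0.2):
    """One-bit stochastic backward channel: each error component is
    transmitted only if a 'photon' is emitted (probability `rate`),
    and then only as its sign (one bit per photon)."""
    emitted = rng.random(error.shape) < rate
    return np.where(emitted, np.sign(error), 0.0)

err = rng.normal(size=5)
print(err)
print(photonic_feedback(err))  # sparse, sign-only teaching signal
```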
Affiliation(s)
- Parisa Zarkeshian: Department of Physics & Astronomy, University of Calgary, Calgary, AB, Canada; Institute for Quantum Science and Technology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; 1QB Information Technologies (1QBit), Vancouver, BC, Canada
- Taylor Kergan: Department of Physics & Astronomy, University of Calgary, Calgary, AB, Canada
- Roohollah Ghobadi: Department of Physics & Astronomy, University of Calgary, Calgary, AB, Canada; Institute for Quantum Science and Technology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
- Wilten Nicola: Department of Physics & Astronomy, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada; Department of Cell Biology and Anatomy, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Christoph Simon: Department of Physics & Astronomy, University of Calgary, Calgary, AB, Canada; Institute for Quantum Science and Technology, University of Calgary, Calgary, AB, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
9. Duan S, Principe JC. Training Deep Architectures Without End-to-End Backpropagation: A Survey on the Provably Optimal Methods. IEEE Comput Intell Mag 2022. DOI: 10.1109/mci.2022.3199624.
10. Sun M, Wang L. Effect of Bodybuilding and Fitness Exercise on Physical Fitness Based on Deep Learning. Emerg Med Int 2022; 2022:3891109. PMID: 35774151. PMCID: PMC9239833. DOI: 10.1155/2022/3891109.
Abstract
With the rapid development of society and the economy, living standards are improving day by day, and increasing attention is being paid to physical health, which has set off a fitness upsurge. The purpose of this paper is to analyze the impact of bodybuilding and fitness exercise on physical fitness based on deep learning, providing a reference for fitness enthusiasts choosing scientific, targeted exercise methods and a theoretical basis for the promotion of bodybuilding and fitness. The paper first gives a general introduction to deep learning and adds image segmentation technology to design experiments for bodybuilding and fitness. The experiment was divided into exercise groups A and B and control group C. Recurrent neural networks and gated recurrent neural networks are introduced to analyze the data, and the stability of data processing under different activation functions is compared. The results show that, under scientifically and reasonably arranged exercise conditions, bodybuilding and fitness exercise has a positive effect on the body shape and posture of the subjects, and a combination of aerobic and anaerobic exercise is the most practical choice. Compared with the plain recurrent neural network, the gated recurrent neural network is more suitable for processing sequence problems. Comparing the experimental results under the sigmoid and tanh activation functions, the tanh activation function with the gated recurrent neural network is the most stable for data processing; the highest AUC value of the traditional recurrent neural network differs by 0.78 from that of the gated recurrent neural network. The data analysis results are in line with the actual situation.
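The gated recurrent network referred to above is, in its standard form, the GRU cell; the sketch below shows the textbook update equations rather than the authors' exact model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, p):
    """Standard GRU update: update gate z, reset gate r, candidate state."""
    z = sigmoid(x @ p["wz"] + h @ p["uz"])          # update gate
    r = sigmoid(x @ p["wr"] + h @ p["ur"])          # reset gate
    h_tilde = np.tanh(x @ p["wh"] + (r * h) @ p["uh"])
    return (1 - z) * h + z * h_tilde                # gated state blend

rng = np.random.default_rng(0)
p = {k: rng.normal(0, 0.1, (8, 8)) for k in ["wz", "uz", "wr", "ur", "wh", "uh"]}
h = np.zeros(8)
for t in range(10):                                  # run over a short sequence
    h = gru_cell(rng.random(8), h, p)
print(h)
```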
Affiliation(s)
- Manman Sun: College of Sports and Leisure, Xi'an Physical Education University, Xi'an 710000, Shaanxi, China
- Lijun Wang: College of Physical Education, Shaanxi Normal University, Xi'an 710000, Shaanxi, China
11.
Abstract
Recurrent neural networks can solve a variety of computational tasks and produce activity patterns that capture key properties of brain circuits. However, learning rules designed to train these models are time-consuming and prone to inaccuracies when tuning connection weights located deep within the network. Here, we describe a rapid one-shot learning rule to train recurrent networks composed of biologically grounded neurons. First, inputs to the model are compressed onto a smaller number of recurrent neurons. Then, a non-iterative rule adjusts the output weights of these neurons based on a target signal. The model learned to reproduce natural images, sequential patterns, and a high-resolution movie scene. Together, the results provide a novel avenue for one-shot learning in biologically realistic recurrent networks and open a path to solving complex tasks by merging brain-inspired models with rapid optimization rules.
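The non-iterative output-weight adjustment is not detailed in this abstract; one standard one-shot realization is a ridge-regression (regularized pseudoinverse) solve of the readout against the target signal, sketched below with random activities standing in for the compressed recurrent units:

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.random((200, 500))          # 200 recurrent-unit rates over 500 timesteps
target = np.sin(np.linspace(0, 8 * np.pi, 500))

# one-shot readout solve: w_out = target r^T (r r^T + lam I)^-1
lam = 1e-3
w_out = target @ r.T @ np.linalg.inv(r @ r.T + lam * np.eye(200))
print(np.mean((w_out @ r - target) ** 2))  # reconstruction error after one shot
```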
12. Biologically motivated learning method for deep neural networks using hierarchical competitive learning. Neural Netw 2021; 144:271-278. PMID: 34520937. DOI: 10.1016/j.neunet.2021.08.027.
Abstract
This study proposes a novel biologically motivated learning method for deep convolutional neural networks (CNNs). The combination of CNNs and backpropagation learning is the most powerful method in recent machine learning regimes. However, it requires a large amount of labeled data for training, and this requirement can become a barrier for real-world applications. To address this problem and make use of unlabeled data, we introduce unsupervised competitive learning, which requires only forward-propagating signals in CNNs. The method was evaluated on image discrimination tasks using the MNIST, CIFAR-10, and ImageNet datasets, and it achieved state-of-the-art performance among biologically motivated methods on the ImageNet benchmark. The results suggest that the method enables higher-level learning representations based solely on forward-propagating signals, without the need for a backward error signal to train the convolutional layers. The proposed method could be useful for a variety of poorly labeled data, for example, time-series or medical data.
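Competitive learning in its simplest winner-take-all form updates only the unit whose weights best match the input, using nothing but the forward signal; the hierarchical, convolutional version in the paper builds on this basic rule, which the sketch below illustrates:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1000, 16))                 # unlabeled inputs (forward signals only)
w = rng.random((8, 16))                    # 8 competing units

for xi in x:
    winner = np.argmax(w @ xi)             # competition: best-matching unit wins
    w[winner] += 0.05 * (xi - w[winner])   # only the winner moves toward the input
print(w.round(2))                          # units converge to input clusters
```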
13. Jiang R, Zhang J, Yan R, Tang H. Few-Shot Learning in Spiking Neural Networks by Multi-Timescale Optimization. Neural Comput 2021; 33:2439-2472. PMID: 34280263. DOI: 10.1162/neco_a_01423.
Abstract
Learning new concepts rapidly from a few examples is an open issue in spike-based machine learning. Such few-shot learning imposes substantial challenges on current learning methodologies for spiking neural networks (SNNs) due to the lack of task-related prior knowledge. The recent learning-to-learn (L2L) approach allows SNNs to acquire prior knowledge through example-level learning and task-level optimization. However, the existing L2L-based framework does not target neural dynamics (i.e., neuronal and synaptic parameter changes) on different timescales. This diversity of temporal dynamics is an important attribute of spike-based learning, enabling networks to rapidly acquire knowledge from very few examples and gradually integrate it. In this work, we consider the neural dynamics on various timescales and provide a multi-timescale optimization (MTSO) framework for SNNs. The framework introduces an adaptive-gated LSTM to accommodate two different timescales of neural dynamics: short-term learning and long-term evolution. Short-term learning is a fast knowledge-acquisition process achieved by a novel surrogate gradient online learning (SGOL) algorithm, where the LSTM guides gradient updating of the SNN on a short timescale through adaptive learning-rate and weight-decay gating. The long-term evolution aims to slowly integrate and consolidate the acquired knowledge, which can be achieved by optimizing the LSTM guidance process to tune SNN parameters on a long timescale. Experimental results demonstrate that the collaborative optimization of multi-timescale neural dynamics enables SNNs to achieve promising performance on few-shot learning tasks.
Affiliation(s)
- Runhao Jiang: College of Computer Science, Sichuan University, Chengdu 610065, China
- Jie Zhang: College of Computer Science, Sichuan University, Chengdu 610065, China
- Rui Yan: College of Computer Science, Zhejiang University of Technology, Hangzhou 310014, China
- Huajin Tang: College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; Zhejiang Lab, Hangzhou 311121, China
14. Payeur A, Guerguiev J, Zenke F, Richards BA, Naud R. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat Neurosci 2021; 24:1010-1019. PMID: 33986551. DOI: 10.1038/s41593-021-00857-x.
Abstract
Synaptic plasticity is believed to be a key physiological mechanism for learning. It is well established that it depends on pre- and postsynaptic activity. However, models that rely solely on pre- and postsynaptic activity for synaptic changes have, so far, not been able to account for learning complex tasks that demand credit assignment in hierarchical networks. Here we show that if synaptic plasticity is regulated by high-frequency bursts of spikes, then pyramidal neurons higher in a hierarchical circuit can coordinate the plasticity of lower-level connections. Using simulations and mathematical analyses, we demonstrate that, when paired with short-term synaptic dynamics, regenerative activity in the apical dendrites and synaptic plasticity in feedback pathways, a burst-dependent learning rule can solve challenging tasks that require deep network architectures. Our results demonstrate that well-known properties of dendrites, synapses and synaptic plasticity are sufficient to enable sophisticated learning in hierarchical circuits.
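The rule's key ingredient is that bursts and single-spike events drive plasticity with opposite signs, balanced around a running estimate of the burst probability; a schematic scalar sketch of such a burst-dependent update (all constants are illustrative, not the published parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
T, lr = 1000, 0.01
w, p_bar = 0.5, 0.2                          # weight; running burst probability
pre_trace = 0.0
for t in range(T):
    pre = rng.random() < 0.05                # presynaptic spike
    event = rng.random() < 0.05              # postsynaptic event (spike or burst)
    burst = event and (rng.random() < 0.3)   # a fraction of events are bursts
    pre_trace = 0.9 * pre_trace + pre        # low-pass presynaptic activity
    if event:
        p_bar = 0.99 * p_bar + 0.01 * burst  # moving average of burst fraction
        # potentiate on bursts, depress on single-spike events,
        # balanced around the running burst probability:
        w += lr * (burst - p_bar) * pre_trace
print(w)
```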
Affiliation(s)
- Alexandre Payeur: Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON, Canada; Ottawa Brain and Mind Institute, University of Ottawa, Ottawa, ON, Canada; Centre for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada; University of Montréal and Mila, Montréal, QC, Canada
- Jordan Guerguiev: Department of Biological Sciences, University of Toronto Scarborough, Toronto, ON, Canada; Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
- Friedemann Zenke: Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Blake A Richards: Mila, Montréal, QC, Canada; Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada; School of Computer Science, McGill University, Montréal, QC, Canada; Learning in Machines and Brains Program, Canadian Institute for Advanced Research, Toronto, ON, Canada
- Richard Naud: Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON, Canada; Ottawa Brain and Mind Institute, University of Ottawa, Ottawa, ON, Canada; Centre for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada; Department of Physics, University of Ottawa, Ottawa, ON, Canada
15. Covi E, Donati E, Liang X, Kappel D, Heidari H, Payvand M, Wang W. Adaptive Extreme Edge Computing for Wearable Devices. Front Neurosci 2021; 15:611300. PMID: 34045939. PMCID: PMC8144334. DOI: 10.3389/fnins.2021.611300.
Abstract
Wearable devices are a fast-growing technology with impact on personal healthcare for both society and the economy. With sensors becoming widespread in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital for future smart wearable devices. Efforts to bring computation to the edge in smart sensors have already begun, with the aspiration of providing adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions for smart wearable devices that can guide research in this pervasive computing era. We propose solutions based on biologically plausible models for continual learning in neuromorphic computing technologies for wearable sensors. To envision this concept, we provide a systematic outline of prospective low-power, low-latency scenarios for wearable sensors on neuromorphic platforms. We then describe the potential landscape of neuromorphic processors exploiting complementary metal-oxide-semiconductor (CMOS) and emerging memory technologies (e.g., memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size, and we investigate the challenges beyond neuromorphic computing hardware, algorithms, and devices that could impede the advance of adaptive edge computing in smart wearable devices.
Affiliation(s)
- Elisa Donati: Institute of Neuroinformatics, University of Zurich and Eidgenössische Technische Hochschule Zürich (ETHZ), Zurich, Switzerland
- Xiangpeng Liang: Microelectronics Lab, James Watt School of Engineering, University of Glasgow, Glasgow, United Kingdom
- David Kappel: Bernstein Center for Computational Neuroscience, III Physikalisches Institut-Biophysik, Georg-August-Universität, Göttingen, Germany
- Hadi Heidari: Microelectronics Lab, James Watt School of Engineering, University of Glasgow, Glasgow, United Kingdom
- Melika Payvand: Institute of Neuroinformatics, University of Zurich and Eidgenössische Technische Hochschule Zürich (ETHZ), Zurich, Switzerland
- Wei Wang: The Andrew and Erna Viterbi Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel
16. Zhao D, Zeng Y, Zhang T, Shi M, Zhao F. GLSNN: A Multi-Layer Spiking Neural Network Based on Global Feedback Alignment and Local STDP Plasticity. Front Comput Neurosci 2020; 14:576841. PMID: 33281591. PMCID: PMC7689090. DOI: 10.3389/fncom.2020.576841.
Abstract
Spiking neural networks (SNNs) are considered the third generation of artificial neural networks and are more closely aligned with information processing in biological brains. However, training non-differentiable SNNs efficiently and robustly in the form of spikes remains a challenge. Here we give an alternative method to train SNNs using biologically plausible structural and functional inspirations from the brain. First, inspired by the brain's significant top-down connections, a global random feedback alignment is designed to help the SNN propagate the error target from the output layer directly to the preceding layers. Then, inspired by the local plasticity of biological systems, in which synapses are tuned mainly by neighboring neurons, a differential STDP rule is used to optimize local plasticity. Extensive experimental results on the benchmarks MNIST (98.62%) and Fashion-MNIST (89.05%) show that the proposed algorithm performs favorably against several state-of-the-art SNNs trained with backpropagation.
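The two ingredients, global random feedback that delivers layer targets and a local plasticity step toward those targets, can be shown in a rate-based single-update sketch (the paper's differential STDP operates on spikes; here the local step is its rate-level analogue):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((32, 20))
y = np.eye(4)[rng.integers(0, 4, 32)]
w1, w2 = rng.normal(0, 0.1, (20, 40)), rng.normal(0, 0.1, (40, 4))
g = rng.normal(0, 0.1, (4, 40))      # fixed random top-down feedback

# one update step shown:
h = np.maximum(x @ w1, 0)            # rate stand-in for spiking activity
e = y - h @ w2                       # output error
h_target = h + e @ g                 # error sent directly to the hidden layer
# local, differential update: nudge each layer toward its target
w2 += 0.05 * h.T @ e / len(x)
w1 += 0.05 * x.T @ ((h_target - h) * (h > 0)) / len(x)
```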
Affiliation(s)
- Dongcheng Zhao: Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Yi Zeng: Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Tielin Zhang: Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Mengting Shi: Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Feifei Zhao: Research Center for Brain-Inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing, China
17. A solution to the learning dilemma for recurrent networks of spiking neurons. Nat Commun 2020; 11:3625. PMID: 32681001. PMCID: PMC7367848. DOI: 10.1038/s41467-020-17236-y.
Abstract
Recurrently connected networks of spiking neurons underlie the astounding information-processing capabilities of the brain. Yet, in spite of extensive research, it remains unclear how they can learn through synaptic plasticity to carry out complex network computations. We argue that two pieces of this puzzle were provided by experimental data from neuroscience, and a mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.
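e-prop factorizes the gradient into a learning signal broadcast from the output and an eligibility trace computed locally at each synapse. A toy numpy sketch, with a rate-regularization term standing in for the task-derived learning signal and illustrative constants throughout:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_rec, T = 10, 20, 100
w_in = rng.normal(0, 0.3, (n_in, n_rec))
b_out = rng.normal(0, 0.3, n_rec)        # random feedback weights for the signal
v = np.zeros(n_rec)
elig = np.zeros((n_in, n_rec))
dw = np.zeros_like(w_in)

for t in range(T):
    x = (rng.random(n_in) < 0.1).astype(float)    # Poisson-like input spikes
    v = 0.9 * v + x @ w_in                         # leaky membrane integration
    z = (v > 1.0).astype(float)                    # spikes at threshold 1.0
    psi = np.maximum(0.0, 1.0 - np.abs(v - 1.0))   # surrogate spike derivative
    v -= z                                         # soft reset by subtraction
    elig = 0.9 * elig + np.outer(x, psi)           # local eligibility traces
    L = (z.mean() - 0.05) * b_out                  # toy learning signal (rate target)
    dw -= 0.01 * L * elig                          # e-prop: signal x eligibility
w_in += dw
```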
18. Ott J, Linstead E, LaHaye N, Baldi P. Learning in the machine: To share or not to share? Neural Netw 2020; 126:235-249. DOI: 10.1016/j.neunet.2020.03.016.
20. Ju X, Fang B, Yan R, Xu X, Tang H. An FPGA Implementation of Deep Spiking Neural Networks for Low-Power and Fast Classification. Neural Comput 2019; 32:182-204. PMID: 31703174. DOI: 10.1162/neco_a_01245.
Abstract
A spiking neural network (SNN) is a biologically plausible model that performs information processing based on spikes. Training a deep SNN effectively is challenging due to the non-differentiability of spike signals. Recent advances have shown that high-performance SNNs can be obtained by converting convolutional neural networks (CNNs). However, large-scale SNNs are poorly served by conventional architectures due to the dynamic nature of spiking neurons. In this letter, we propose a hardware architecture to enable efficient implementation of SNNs. All layers in the network are mapped onto one chip so that the computation of different time steps can be done in parallel to reduce latency. We propose a new spiking max-pooling method to reduce computational complexity. In addition, we apply approaches based on shift registers and coarse-grained parallelism to accelerate the convolution operation. We also investigate the effect of different encoding methods on SNN accuracy. Finally, we validate the hardware architecture on the Xilinx Zynq ZCU102. Experimental results on the MNIST dataset show that it achieves an accuracy of 98.94% with eight-bit quantized weights; it reaches 164 frames per second (FPS) at a 150 MHz clock frequency and obtains a 41× speedup over a CPU implementation with 22× lower power than a GPU implementation.
Affiliation(s)
- Xiping Ju: College of Computer Science, Sichuan University, Chengdu 610065, China
- Biao Fang: College of Computer Science, Sichuan University, Chengdu 610065, China
- Rui Yan: College of Computer Science, Sichuan University, Chengdu 610065, China
- Xiaoliang Xu: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Huajin Tang: College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; College of Computer Science, Sichuan University, Chengdu 610065, China
21. Richards BA, Lillicrap TP, Beaudoin P, Bengio Y, Bogacz R, Christensen A, Clopath C, Costa RP, de Berker A, Ganguli S, Gillon CJ, Hafner D, Kepecs A, Kriegeskorte N, Latham P, Lindsay GW, Miller KD, Naud R, Pack CC, Poirazi P, Roelfsema P, Sacramento J, Saxe A, Scellier B, Schapiro AC, Senn W, Wayne G, Yamins D, Zenke F, Zylberberg J, Therien D, Kording KP. A deep learning framework for neuroscience. Nat Neurosci 2019; 22:1761-1770. PMID: 31659335. PMCID: PMC7115933. DOI: 10.1038/s41593-019-0520-2.
Abstract
Systems neuroscience seeks explanations for how the brain implements a wide variety of perceptual, cognitive and motor tasks. Conversely, artificial intelligence attempts to design computational systems based on the tasks they will have to solve. In artificial neural networks, the three components specified by design are the objective functions, the learning rules and the architectures. With the growing success of deep learning, which utilizes brain-inspired architectures, these three designed components have increasingly become central to how we model, engineer and optimize complex artificial learning systems. Here we argue that a greater focus on these components would also benefit systems neuroscience. We give examples of how this optimization-based framework can drive theoretical and experimental progress in neuroscience. We contend that this principled perspective on systems neuroscience will help to generate more rapid progress.
Affiliation(s)
- Blake A Richards: Mila, Montréal, Quebec, Canada; School of Computer Science, McGill University, Montréal, Quebec, Canada; Department of Neurology & Neurosurgery, McGill University, Montréal, Quebec, Canada; Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Timothy P Lillicrap: DeepMind, Inc., London, UK; Centre for Computation, Mathematics and Physics in the Life Sciences and Experimental Biology, University College London, London, UK
- Yoshua Bengio: Mila, Montréal, Quebec, Canada; Canadian Institute for Advanced Research, Toronto, Ontario, Canada; Université de Montréal, Montréal, Quebec, Canada
- Rafal Bogacz: MRC Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Amelia Christensen: Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Claudia Clopath: Department of Bioengineering, Imperial College London, London, UK
- Rui Ponte Costa: Computational Neuroscience Unit, School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths, University of Bristol, Bristol, UK; Department of Physiology, Universität Bern, Bern, Switzerland
- Surya Ganguli: Department of Applied Physics, Stanford University, Stanford, CA, USA; Google Brain, Mountain View, CA, USA
- Colleen J Gillon: Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada; Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Danijar Hafner: Google Brain, Mountain View, CA, USA; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada
- Adam Kepecs: Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Nikolaus Kriegeskorte: Department of Psychology and Neuroscience, Columbia University, New York, NY, USA; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Peter Latham: Gatsby Computational Neuroscience Unit, University College London, London, UK
- Grace W Lindsay: Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Center for Theoretical Neuroscience, Columbia University, New York, NY, USA
- Kenneth D Miller: Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Center for Theoretical Neuroscience, Columbia University, New York, NY, USA; Department of Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Richard Naud: University of Ottawa Brain and Mind Institute, Ottawa, Ontario, Canada; Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Christopher C Pack: Department of Neurology & Neurosurgery, McGill University, Montréal, Quebec, Canada
- Panayiota Poirazi: Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Pieter Roelfsema: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- João Sacramento: Institute of Neuroinformatics, ETH Zürich and University of Zürich, Zürich, Switzerland
- Andrew Saxe: Department of Experimental Psychology, University of Oxford, Oxford, UK
- Benjamin Scellier: Mila, Montréal, Quebec, Canada; Université de Montréal, Montréal, Quebec, Canada
- Anna C Schapiro: Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Walter Senn: Department of Physiology, Universität Bern, Bern, Switzerland
- Daniel Yamins: Department of Psychology, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Friedemann Zenke: Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; Centre for Neural Circuits and Behaviour, University of Oxford, Oxford, UK
- Joel Zylberberg: Canadian Institute for Advanced Research, Toronto, Ontario, Canada; Department of Physics and Astronomy, York University, Toronto, Ontario, Canada; Center for Vision Research, York University, Toronto, Ontario, Canada
- Konrad P Kording: Canadian Institute for Advanced Research, Toronto, Ontario, Canada; Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA; Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
22. Hao Y, Huang X, Dong M, Xu B. A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule. Neural Netw 2019; 121:387-395. PMID: 31593843. DOI: 10.1016/j.neunet.2019.09.007.
Abstract
Spiking neural networks (SNNs) possess energy-efficient potential due to event-based computation. However, supervised training of SNNs remains a challenge because spike activities are non-differentiable. Previous SNN training methods can generally be categorized into two basic classes: backpropagation-like training methods and plasticity-based learning methods. The former depend on energy-inefficient real-valued computation and non-local transmission, as also required in artificial neural networks (ANNs), whereas the latter are either considered biologically implausible or exhibit poor performance. Hence, biologically plausible (bio-plausible), high-performance supervised learning (SL) methods for SNNs are still lacking. In this paper, we propose a novel bio-plausible SNN model for SL based on the symmetric spike-timing-dependent plasticity (sym-STDP) rule found in neuroscience. By combining the sym-STDP rule with bio-plausible synaptic scaling and the intrinsic plasticity of a dynamic threshold, our SNN model implements SL well and achieves good performance on the benchmark recognition task (MNIST). To reveal the underlying mechanism of our SL model, we visualized both layer-based activities and synaptic weights using the t-distributed stochastic neighbor embedding (t-SNE) method after training and found that they were well clustered, demonstrating excellent classification ability. Furthermore, to verify the robustness of our model, we trained it on a more realistic dataset (Fashion-MNIST), where it also showed good performance. As the learning rules are bio-plausible and based purely on local spike events, our model could readily be applied to neuromorphic hardware for online training and may help in understanding SL information processing at the synaptic level in biological neural systems.
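Unlike the classic antisymmetric STDP window, the symmetric rule potentiates for near-coincident pre- and postsynaptic spikes regardless of their order. A minimal window function (amplitude and time constant are illustrative, and the paper's accompanying synaptic scaling and dynamic threshold are omitted):

```python
import numpy as np

def sym_stdp(dt, a=0.01, tau=20.0):
    """Symmetric STDP window: the weight change depends only on
    |t_post - t_pre|, not on which spike came first."""
    return a * np.exp(-np.abs(dt) / tau)

for dt in (-40, -10, 0, 10, 40):   # ms; same change for -dt and +dt
    print(dt, round(sym_stdp(dt), 5))
```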
Affiliation(s)
- Yunzhe Hao: Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xuhui Huang: Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Meng Dong: Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Bo Xu: Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing 100190, China
23. Illing B, Gerstner W, Brea J. Biologically plausible deep learning - But how far can we go with shallow networks? Neural Netw 2019; 118:90-101. PMID: 31254771. DOI: 10.1016/j.neunet.2019.06.001.
Abstract
Training deep neural networks with the error backpropagation algorithm is considered implausible from a biological perspective. Numerous recent publications suggest elaborate models for biologically plausible variants of deep learning, typically defining success as reaching around 98% test accuracy on the MNIST dataset. Here, we investigate how far we can go on digit (MNIST) and object (CIFAR10) classification with biologically plausible, local learning rules in a network with one hidden layer and a single readout layer. The hidden-layer weights are either fixed (random or random Gabor filters) or trained with unsupervised methods (principal/independent component analysis or sparse coding) that can be implemented by local learning rules, while the readout layer is trained with a supervised, local learning rule. We first implement these models with rate neurons. The comparison reveals, first, that unsupervised learning does not lead to better performance than fixed random projections or Gabor filters for large hidden layers, and second, that networks with localized receptive fields perform significantly better than networks with all-to-all connectivity and can reach backpropagation performance on MNIST. We then implement two of the networks, with fixed, localized random filters or random Gabor filters in the hidden layer, using spiking leaky integrate-and-fire neurons and spike-timing-dependent plasticity to train the readout layer. These spiking models achieve >98.2% test accuracy on MNIST, close to the performance of rate networks with one hidden layer trained with backpropagation. The performance of our shallow network models is comparable to most current biologically plausible models of deep learning. Furthermore, our results with a shallow spiking network provide an important reference and suggest using datasets other than MNIST to test the performance of future models of biologically plausible deep learning.
Affiliation(s)
- Bernd Illing: School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland
- Wulfram Gerstner: School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland
- Johanni Brea: School of Computer and Communication Science & School of Life Science, EPFL, 1015 Lausanne, Switzerland
24. Murray JM. Local online learning in recurrent networks with random feedback. eLife 2019; 8:43299. PMID: 31124785. PMCID: PMC6561704. DOI: 10.7554/elife.43299.
Abstract
Recurrent neural networks (RNNs) enable the production and processing of time-dependent signals such as those involved in movement or working memory. Classic gradient-based algorithms for training RNNs have been available for decades, but are inconsistent with biological features of the brain, such as causality and locality. We derive an approximation to gradient-based learning that comports with these constraints by requiring synaptic weight updates to depend only on local information about pre- and postsynaptic activities, in addition to a random feedback projection of the RNN output error. In addition to providing mathematical arguments for the effectiveness of the new learning rule, we show through simulations that it can be used to train an RNN to perform a variety of tasks. Finally, to overcome the difficulty of training over very large numbers of timesteps, we propose an augmented circuit architecture that allows the RNN to concatenate short-duration patterns into longer sequences.
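The rule derived in the paper keeps weight updates local by combining a random feedback projection of the output error with an eligibility trace built from pre- and postsynaptic activity; a compact rate-based sketch of that structure (not the exact published update):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 30, 200
w = rng.normal(0, 1 / np.sqrt(n), (n, n))     # recurrent weights
w_out = rng.normal(0, 0.1, (1, n))            # linear readout
b = rng.normal(0, 1.0, (n, 1))                # fixed random feedback of output error
h = np.zeros(n)
p = np.zeros((n, n))                          # synaptic eligibility traces

for t in range(T):
    target = np.sin(0.1 * t)                  # a simple time-dependent target
    h_new = np.tanh(w @ h)
    # local eligibility: postsynaptic gain times presynaptic activity, low-passed
    p = 0.9 * p + np.outer(1 - h_new**2, h)
    e = (w_out @ h_new).item() - target       # scalar output error
    w -= 0.005 * (b * e) * p                  # random-feedback local online update
    w_out -= 0.005 * e * h_new[None, :]
    h = h_new
```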
Affiliation(s)
- James M Murray: Zuckerman Mind, Brain and Behavior Institute, Columbia University, New York, United States
25. Wu X, Wang Y, Tang H, Yan R. A structure-time parallel implementation of spike-based deep learning. Neural Netw 2019; 113:72-78. PMID: 30785011. DOI: 10.1016/j.neunet.2019.01.010.
Abstract
Motivated by recent progress in deep spiking neural networks (SNNs), we propose a structure-time parallel strategy, based on the layered structure and one-time computation over a time window, to speed up the prominent spike-based deep learning algorithm known as broadcast alignment. Furthermore, a well-designed deep hierarchical model based on parallel broadcast alignment is proposed for object recognition. Parallel broadcast alignment achieves a significant 137× speedup over the original implementation on the MNIST dataset. The object recognition model achieves higher accuracy than the latest spiking deep convolutional neural networks on the ETH-80 dataset. The proposed parallel strategy and object recognition model will facilitate both the simulation of deep SNNs for studying spiking neural dynamics and the application of spike-based deep learning to real-world problems.
Affiliation(s)
- Xi Wu, Neuromorphic Computing Research Center, College of Computer Science, Sichuan University, Chengdu, 610065, China
- Yixuan Wang, Neuromorphic Computing Research Center, College of Computer Science, Sichuan University, Chengdu, 610065, China
- Huajin Tang, Neuromorphic Computing Research Center, College of Computer Science, Sichuan University, Chengdu, 610065, China
- Rui Yan, Neuromorphic Computing Research Center, College of Computer Science, Sichuan University, Chengdu, 610065, China

26
Mostafa H, Ramesh V, Cauwenberghs G. Deep Supervised Learning Using Local Errors. Front Neurosci 2018; 12:608. [PMID: 30233295 PMCID: PMC6127296 DOI: 10.3389/fnins.2018.00608] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Accepted: 08/13/2018] [Indexed: 11/13/2022] Open
Abstract
Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers. Learning with delayed, non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural networks, as it requires the neurons to maintain a memory of the input long enough for the higher-layer errors to arrive. In this paper, we propose an alternative learning mechanism where errors are generated locally in each layer using fixed, random auxiliary classifiers. Lower layers can thus be trained independently of higher layers, and training can proceed either layer by layer or simultaneously in all layers using local error information. We address biological plausibility concerns such as weight symmetry requirements and show that the proposed learning mechanism, based on fixed, broad, and random tuning of each neuron to the classification categories, outperforms the biologically motivated feedback alignment technique on the CIFAR10 dataset, approaching the performance of standard backpropagation. Our approach highlights a potential biological mechanism for the supervised, or task-dependent, learning of feature hierarchies. In addition, we show that it is well suited to learning deep networks in custom hardware, where it can drastically reduce memory traffic and data communication overheads. Code used to run all learning experiments is available under https://gitlab.com/hesham-mostafa/learning-using-local-erros.git.
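A hedged sketch of the layer-local mechanism (our reading of the abstract; the dimensions, learning rate, softmax local loss, and random stand-in data are assumptions):

    import numpy as np

    rng = np.random.default_rng(3)
    dims, n_classes, lr = [784, 256, 128], 10, 1e-3
    Ws = [rng.standard_normal((m, n)) / np.sqrt(m)
          for m, n in zip(dims, dims[1:])]
    # One fixed, random, never-trained auxiliary classifier per layer.
    Cs = [rng.standard_normal((n, n_classes)) / np.sqrt(n) for n in dims[1:]]

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    for _ in range(100):
        x = rng.random(dims[0])               # stand-in for an input image
        target = np.eye(n_classes)[rng.integers(n_classes)]
        a = x
        for W, C in zip(Ws, Cs):
            pre, a = a, np.maximum(a @ W, 0.0)  # forward through one layer
            err = softmax(a @ C) - target       # layer-local error signal
            # The error reaches this layer only through the fixed matrix C;
            # nothing is backpropagated from the layers above.
            W -= lr * np.outer(pre, (err @ C.T) * (a > 0))

Because each layer's update depends only on its own activations and its own fixed classifier, the layers can also be trained one at a time or fully in parallel, as the abstract notes.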
Affiliation(s)
- Hesham Mostafa, Institute for Neural Computation, University of California, San Diego, San Diego, CA, United States
- Vishwajith Ramesh, Department of Bioengineering, University of California, San Diego, San Diego, CA, United States
- Gert Cauwenberghs, Institute for Neural Computation and Department of Bioengineering, University of California, San Diego, San Diego, CA, United States

27
Unsupervised heart-rate estimation in wearables with Liquid states and a probabilistic readout. Neural Netw 2018; 99:134-147. [PMID: 29414535 DOI: 10.1016/j.neunet.2017.12.015] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2017] [Revised: 12/08/2017] [Accepted: 12/26/2017] [Indexed: 01/28/2023]
Abstract
Heart-rate estimation is a fundamental feature of modern wearable devices. In this paper we propose a machine learning technique to estimate heart rate from electrocardiogram (ECG) data collected using wearable devices. The novelty of our approach lies in (1) encoding the spatio-temporal properties of ECG signals directly into spike trains and using these to excite recurrently connected spiking neurons in a Liquid State Machine computation model; (2) a novel learning algorithm; and (3) an intelligently designed unsupervised readout based on Fuzzy c-Means clustering of spike responses from a subset of neurons (liquid states), selected using particle swarm optimization. Our approach differs from existing work by learning directly from ECG signals (allowing personalization), without requiring costly data annotations. Additionally, our approach can be easily implemented on state-of-the-art spiking neuromorphic systems, offering high accuracy with a significantly lower energy footprint, leading to extended battery life in wearable devices. We validated our approach with CARLsim, a GPU-accelerated spiking neural network simulator modeling Izhikevich spiking neurons with Spike Timing Dependent Plasticity (STDP) and homeostatic scaling. A range of subjects from in-house clinical trials and public ECG databases is considered. Results show high accuracy and a low energy footprint in heart-rate estimation across subjects with and without cardiac irregularities, signifying the strong potential of this approach for integration into future wearable devices.
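The pipeline can be caricatured in a few lines (a toy sketch, not the paper's CARLsim/Izhikevich setup: we use simple threshold neurons, a sine wave as a stand-in for encoded ECG, plain 2-means instead of Fuzzy c-Means, and no particle-swarm neuron selection):

    import numpy as np

    rng = np.random.default_rng(4)
    n, T = 100, 400
    W = (rng.random((n, n)) < 0.1) * 0.5 * rng.standard_normal((n, n))
    v, s, trace = np.zeros(n), np.zeros(n), np.zeros(n)
    signal = np.sin(np.linspace(0, 25, T))   # stand-in for encoded ECG input
    states = []

    for t in range(T):
        inp = (rng.random(n) < 0.05 * (1.5 + signal[t])).astype(float)
        v = 0.9 * v + W @ s + 2.0 * inp      # leaky integration
        s = (v > 1.0).astype(float)          # threshold crossing -> spike
        v[s > 0] = 0.0                       # reset after a spike
        trace = 0.95 * trace + s             # low-pass filtered liquid state
        states.append(trace.copy())

    # Unsupervised readout: bare-bones 2-means clustering of liquid states.
    X = np.array(states)
    C = X[rng.choice(T, 2, replace=False)]
    for _ in range(10):
        lab = ((X[:, None] - C) ** 2).sum(-1).argmin(1)
        C = np.array([X[lab == k].mean(0) if (lab == k).any() else C[k]
                      for k in (0, 1)])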
28
Neftci EO, Augustine C, Paul S, Detorakis G. Event-Driven Random Back-Propagation: Enabling Neuromorphic Deep Learning Machines. Front Neurosci 2017; 11:324. [PMID: 28680387 PMCID: PMC5478701 DOI: 10.3389/fnins.2017.00324] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2017] [Accepted: 05/23/2017] [Indexed: 11/17/2022] Open
Abstract
An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient-based error backpropagation (BP) rule, often relies on the immediate availability of network-wide information stored in high-precision memory during learning, and on precise operations that are difficult to realize in neuromorphic hardware. Remarkably, recent work showed that exact backpropagated gradients are not essential for learning deep representations. Building on these results, we demonstrate an event-driven random BP (eRBP) rule that uses error-modulated synaptic plasticity to learn deep representations. Using a two-compartment Leaky Integrate & Fire (I&F) neuron, the rule requires only one addition and two comparisons per synaptic weight, making it very suitable for implementation in digital or mixed-signal neuromorphic hardware. Our results show that eRBP learns deep representations rapidly, achieving classification accuracies on permutation-invariant datasets comparable to those obtained in artificial neural network simulations on GPUs, while being robust to quantization of neural and synaptic state variables during learning.
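A simplified sketch of the update (our approximation: a one-layer stand-in with the boxcar gate and random feedback; the paper's two-compartment I&F dynamics and spiking error pathway are omitted):

    import numpy as np

    rng = np.random.default_rng(5)
    n_in, n_hid, n_out, lr = 100, 50, 10, 1e-3
    W = rng.standard_normal((n_in, n_hid)) / np.sqrt(n_in)
    B = rng.standard_normal((n_hid, n_out))  # fixed random feedback matrix
    u = np.zeros(n_hid)                      # hidden membrane potentials

    for t in range(1000):
        in_spikes = (rng.random(n_in) < 0.05).astype(float)
        err = 0.1 * rng.standard_normal(n_out)   # stand-in output error
        u = 0.9 * u + in_spikes @ W              # leaky integration
        gate = (u > -1.0) & (u < 1.0)            # boxcar: "two comparisons"
        # Per active synapse the update is a single gated addition, with the
        # error delivered through the fixed random matrix B.
        W -= lr * np.outer(in_spikes, (B @ err) * gate)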
Affiliation(s)
- Emre O. Neftci, Neuromorphic Machine Intelligence Laboratory, Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, United States
- Somnath Paul, Circuit Research Lab, Intel Corporation, Hillsboro, OR, United States
- Georgios Detorakis, Neuromorphic Machine Intelligence Laboratory, Department of Cognitive Sciences, University of California, Irvine, Irvine, CA, United States

29
Kim J, Mostafa F, Tweed DB. The order of complexity of visuomotor learning. BMC Neurosci 2017; 18:50. [PMID: 28606114 PMCID: PMC5469048 DOI: 10.1186/s12868-017-0368-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2016] [Accepted: 06/05/2017] [Indexed: 11/20/2022] Open
Abstract
Background: Learning algorithms come in three orders of complexity: zeroth-order (perturbation), first-order (gradient descent), and second-order (e.g., quasi-Newton). But which of these are used in the brain? We trained 12 people to shoot targets and compared them to simulated subjects that learned the same task using various algorithms. Results: Humans learned significantly faster than optimized zeroth-order algorithms, but slower than second-order ones. Conclusions: Human visuomotor learning is too fast to be explained by zeroth-order processes alone and must involve first- or second-order mechanisms. Electronic supplementary material: The online version of this article (doi:10.1186/s12868-017-0368-x) contains supplementary material, which is available to authorized users.
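The three orders of complexity are easy to compare numerically; the toy below (ours, not the paper's visuomotor simulations) minimizes a 20-dimensional quadratic with one representative algorithm of each order:

    import numpy as np

    rng = np.random.default_rng(6)
    d = 20
    A = np.diag(np.linspace(1.0, 10.0, d))   # curvature of a quadratic loss
    loss = lambda w: 0.5 * w @ A @ w
    grad = lambda w: A @ w

    def zeroth(w, lr=1e-3, sig=1e-2):        # perturbation learning
        dw = sig * rng.standard_normal(d)
        return w - lr * (loss(w + dw) - loss(w)) / sig**2 * dw

    def first(w, lr=5e-2):                   # gradient descent
        return w - lr * grad(w)

    def second(w):                           # Newton's method
        return w - np.linalg.solve(A, grad(w))

    for name, step, n_steps in [("zeroth", zeroth, 10000),
                                ("first", first, 100),
                                ("second", second, 1)]:
        w = np.ones(d)
        for _ in range(n_steps):
            w = step(w)
        print(f"{name}: loss {loss(w):.2e} after {n_steps} steps")

On such a problem the Newton step converges immediately, gradient descent needs tens of steps, and perturbation learning needs thousands, mirroring the speed ordering the study tests against human learning curves.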
Affiliation(s)
- John Kim, Department of Physiology, University of Toronto, Toronto, ON, M5S 1A8, Canada; College of Medicine, University of Manitoba, Winnipeg, MB, R3E 3P5, Canada
- Fariya Mostafa, Department of Physiology, University of Toronto, Toronto, ON, M5S 1A8, Canada
- Douglas Blair Tweed, Department of Physiology, University of Toronto, Toronto, ON, M5S 1A8, Canada; Centre for Vision Research, York University, Toronto, ON, M3J 1P3, Canada