1
Yuan M, Wu X, Yan R, Tang H. Reinforcement Learning in Spiking Neural Networks with Stochastic and Deterministic Synapses. Neural Comput 2019; 31:2368-2389. [PMID: 31614099] [DOI: 10.1162/neco_a_01238]
Abstract
Though most existing reinforcement learning (RL) models succeed in solving various learning tasks, they fail to take into account the complexity of synaptic plasticity in the neural system: models implementing reinforcement learning with spiking neurons typically involve only a single plasticity mechanism. Here, we propose a neurally realistic reinforcement learning model that coordinates the plasticities of two types of synapses: stochastic and deterministic. Plasticity at the stochastic synapse is achieved by the hedonistic rule, which modulates the release probability of synaptic neurotransmitter, while plasticity at the deterministic synapse is achieved by a variant of a reward-modulated spike-timing-dependent plasticity rule, which modulates the synaptic strengths. We evaluate the proposed learning model on two benchmark tasks: learning a logic gate function and the 19-state random walk problem. Experimental results show that coordinating diverse synaptic plasticities allows the RL model to learn rapidly and stably.
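The two rules named in the abstract can be sketched in a few lines. The following is a minimal illustration, assuming a Seung-style hedonistic update for the stochastic synapse and a generic eligibility-trace form of reward-modulated STDP for the deterministic one; function names, learning rates, and the exact update forms are illustrative, not the paper's implementation:

```python
def hedonistic_update(p, released, reward, lr=0.05):
    """Hedonistic rule sketch: make vesicle release more (or less)
    probable depending on whether releasing correlated with reward.
    If the synapse released, eligibility is (1 - p); on failure, -p."""
    eligibility = (1.0 - p) if released else -p
    p += lr * reward * eligibility
    return min(max(p, 0.0), 1.0)   # keep p a valid probability

def rstdp_update(w, eligibility, reward, lr=0.01):
    """Reward-modulated STDP sketch: an STDP-derived eligibility trace
    becomes an actual weight change only when a reward signal arrives."""
    return w + lr * reward * eligibility
```

A positive reward thus pushes the release probability toward 1 after a successful release and toward 0 after a failure, while the deterministic weight moves in the direction the recent spike timing suggested.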
Affiliation(s)
- Mengwen Yuan
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Xi Wu
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Rui Yan
- College of Computer Science, Sichuan University, Chengdu 610065, China
- Huajin Tang
- College of Computer Science, Sichuan University, Chengdu 610065, China, and College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
2
An Efficient Supervised Training Algorithm for Multilayer Spiking Neural Networks. PLoS One 2016; 11:e0150329. [PMID: 27044001] [PMCID: PMC4820126] [DOI: 10.1371/journal.pone.0150329]
Abstract
Spiking neural networks (SNNs) are the third generation of neural networks and perform remarkably well in cognitive tasks such as pattern recognition. The spike-emitting and information-processing mechanisms found in biological cognitive systems motivate the use of a hierarchical structure and temporal encoding in SNNs, which have exhibited strong computational capability. However, the hierarchical structure and temporal encoding require neurons to process information serially in space and time, respectively, which significantly reduces training efficiency. Most existing methods for training hierarchical SNNs are based on the traditional back-propagation algorithm and inherit its drawbacks of gradient diffusion and sensitivity to parameters. To keep the powerful computational capability of the hierarchical structure and temporal encoding while overcoming the low efficiency of existing algorithms, this paper proposes a new training algorithm, the Normalized Spiking Error Back Propagation (NSEBP). In the feedforward calculation, the output spike times are computed by solving a quadratic function in the spike response model instead of checking postsynaptic voltage states at every time point as traditional algorithms do. In the feedback weight modification, the computational error is propagated to previous layers by presynaptic spike jitter instead of the gradient descent rule, which realizes layer-wise training. Furthermore, the algorithm derives the mathematical relation between weight variation and voltage error change, which makes normalization in the weight modification applicable. With these strategies, the algorithm outperforms traditional multilayer SNN algorithms in learning efficiency and parameter sensitivity, as demonstrated by the comprehensive experimental results in this paper.
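The core efficiency trick in the abstract, computing the output spike time in closed form rather than scanning every time step for a threshold crossing, can be illustrated with a quadratic stand-in for the SRM voltage. The coefficients below are hypothetical placeholders for the kernel-derived terms, not the actual NSEBP formulation:

```python
import math

def first_crossing(a, b, c, theta):
    """Earliest t >= 0 with a*t**2 + b*t + c = theta, or None if the
    approximated voltage never reaches threshold theta. This stands in
    for solving the SRM voltage equation analytically per spike."""
    A, B, C = a, b, c - theta
    if A == 0:                       # degenerate: linear voltage course
        if B == 0:
            return None
        t = -C / B
        return t if t >= 0 else None
    disc = B * B - 4.0 * A * C
    if disc < 0:                     # threshold never reached
        return None
    r = math.sqrt(disc)
    roots = sorted([(-B - r) / (2.0 * A), (-B + r) / (2.0 * A)])
    for t in roots:                  # earliest non-negative crossing
        if t >= 0:
            return t
    return None
```

One algebraic solve replaces the per-time-step threshold check, which is the source of the training-efficiency gain the abstract claims.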
3
Soltoggio A. Short-term plasticity as cause-effect hypothesis testing in distal reward learning. Biol Cybern 2015; 109:75-94. [PMID: 25189158] [DOI: 10.1007/s00422-014-0628-0]
Abstract
Asynchrony, overlaps, and delays in sensory-motor signals introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause-effect relationships from coincidental occurrences. In the model proposed here, a novel plasticity rule employs short- and long-term changes to evaluate hypotheses on cause-effect relationships. Transient weights represent hypotheses that are consolidated in long-term memory only when they consistently predict or cause future rewards. The main objective of the model is to preserve existing network topologies when learning with ambiguous information flows. Learning is also improved by biasing the exploration of the stimulus-response space toward actions that in the past occurred before rewards. The model indicates under which conditions beliefs can be consolidated in long-term memory, suggests a solution to the plasticity-stability dilemma, and proposes an interpretation of the role of short-term plasticity.
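The short-term/long-term decomposition described above can be sketched as a synapse whose transient component decays unless reward repeatedly confirms it. This is a minimal sketch under assumed rates and thresholds; the class name, decay factor, and confirmation count are illustrative, not taken from the paper:

```python
class HypothesisSynapse:
    """A decaying transient weight encodes a cause-effect hypothesis;
    it is folded into the stable long-term weight only after reward
    has confirmed the hypothesis several times in a row."""
    def __init__(self, decay=0.9, confirmations_needed=3):
        self.w_long = 0.0        # consolidated belief
        self.w_short = 0.0       # current hypothesis
        self.confirmations = 0
        self.decay = decay
        self.needed = confirmations_needed

    def coactivation(self, amount=1.0):
        """Pre/post coincidence forms or strengthens a hypothesis."""
        self.w_short += amount

    def step(self, reward):
        if reward > 0 and self.w_short > 0:
            self.confirmations += 1
            if self.confirmations >= self.needed:
                self.w_long += self.w_short   # consolidate the belief
                self.w_short = 0.0
                self.confirmations = 0
        else:
            self.w_short *= self.decay        # unconfirmed hypotheses fade

    @property
    def weight(self):
        return self.w_long + self.w_short
```

Because only w_short churns during exploration, the consolidated w_long, and hence the existing network topology, is left untouched by spurious coincidences.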
Affiliation(s)
- Andrea Soltoggio
- Computer Science Department, Loughborough University, Loughborough, LE11 3TU, UK
4
O'Brien MJ, Thibeault CM, Srinivasa N. A novel analytical characterization for short-term plasticity parameters in spiking neural networks. Front Comput Neurosci 2014; 8:148. [PMID: 25477812] [PMCID: PMC4237058] [DOI: 10.3389/fncom.2014.00148]
Abstract
Short-term plasticity (STP) is a phenomenon that occurs widely in the neocortex, with implications for learning and memory. Based on a widely used STP model, we develop an analytical characterization of the STP parameter space to determine the nature of each synapse (facilitating, depressing, or both) in a spiking neural network from the presynaptic firing rate and the corresponding STP parameters. We demonstrate consistency with previous work by leveraging the power of our characterization to replicate the functional volumes that are integral to the previous network stabilization results. We then use our characterization to predict the precise transitional point from the facilitating regime to the depressing regime in a simulated synapse, suggesting in vitro experiments to verify the underlying STP model. We conclude the work by integrating our characterization into a framework for finding suitable STP parameters for self-sustaining random, asynchronous activity in a prescribed recurrent spiking neural network. The systematic process resulting from our analytical characterization improves the success rate of finding the requisite parameters for such networks by three orders of magnitude over a random search.
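The "widely used STP model" is presumably the Tsodyks-Markram model; the event-driven form below shows how its parameters (U, tau_f, tau_d) and the presynaptic rate jointly determine whether a synapse facilitates or depresses. Comparing the first two response amplitudes is an illustrative shortcut, not the paper's analytical characterization:

```python
import math

def psc_amplitudes(rate_hz, n_spikes, U, tau_f, tau_d):
    """Event-driven Tsodyks-Markram dynamics for a regular spike train.
    u: facilitation variable, x: fraction of available resources.
    Returns the relative response amplitude u*x at each spike."""
    dt = 1.0 / rate_hz
    u, x = 0.0, 1.0
    amps = []
    for _ in range(n_spikes):
        u = u + U * (1.0 - u)                  # facilitation jump at the spike
        amps.append(u * x)                      # released fraction of resources
        x = x * (1.0 - u)                       # depletion by the release
        u = u * math.exp(-dt / tau_f)           # facilitation decays...
        x = 1.0 - (1.0 - x) * math.exp(-dt / tau_d)  # ...resources recover
    return amps

def regime(rate_hz, U, tau_f, tau_d, n_spikes=20):
    """Crude classification: does the second response exceed the first?"""
    a = psc_amplitudes(rate_hz, n_spikes, U, tau_f, tau_d)
    return "facilitating" if a[1] > a[0] else "depressing"
```

A small U with slow facilitation decay (large tau_f) yields a facilitating synapse at moderate rates, whereas a large U with slow resource recovery (large tau_d) yields a depressing one, which is exactly the parameter-space structure the paper characterizes analytically.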
Affiliation(s)
- Michael J O'Brien
- Center for Neural and Emergent Systems, Information and System Sciences Department, HRL Laboratories LLC, Malibu, CA, USA
- Corey M Thibeault
- Center for Neural and Emergent Systems, Information and System Sciences Department, HRL Laboratories LLC, Malibu, CA, USA
- Narayan Srinivasa
- Center for Neural and Emergent Systems, Information and System Sciences Department, HRL Laboratories LLC, Malibu, CA, USA
5
Minkovich K, Thibeault CM, O'Brien MJ, Nogin A, Cho Y, Srinivasa N. HRLSim: a high performance spiking neural network simulator for GPGPU clusters. IEEE Trans Neural Netw Learn Syst 2014; 25:316-331. [PMID: 24807031] [DOI: 10.1109/tnnls.2013.2276056]
Abstract
Modeling of large-scale spiking neural networks is an important tool in the quest to understand brain function and subsequently create real-world applications. This paper describes a spiking neural network simulator environment called the HRL Spiking Simulator (HRLSim), which is suitable for implementation on a cluster of general-purpose graphical processing units (GPGPUs). Novel aspects of HRLSim are described, and an analysis of its performance is provided for various configurations of the cluster. With the advent of inexpensive GPGPU cards and compute power, HRLSim offers an affordable and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.
6
Beyeler M, Dutt ND, Krichmar JL. Categorization and decision-making in a neurobiologically plausible spiking network using a STDP-like learning rule. Neural Netw 2013; 48:109-124. [DOI: 10.1016/j.neunet.2013.07.012]
7
Thibeault CM, Srinivasa N. Using a hybrid neuron in physiologically inspired models of the basal ganglia. Front Comput Neurosci 2013; 7:88. [PMID: 23847524] [PMCID: PMC3701869] [DOI: 10.3389/fncom.2013.00088]
Abstract
Our current understanding of the basal ganglia (BG) has facilitated the creation of computational models that have contributed novel theories, explored new functional anatomy, and demonstrated results complementing physiological experiments. However, the utility of these models extends beyond these applications, particularly in neuromorphic engineering, where the basal ganglia's role in computation is important for applications such as power-efficient autonomous agents and model-based control strategies. The neurons used in existing computational models of the BG, however, are not amenable to many low-power hardware implementations. Motivated by a need for more hardware-accessible networks, we replicate four published models of the BG, spanning single neurons and small networks, replacing the more computationally expensive neuron models with an Izhikevich hybrid neuron. This begins with a network modeling action selection, where the basal activity levels and the ability to appropriately select the most salient input are reproduced. A Parkinson's disease model is then explored under normal conditions, Parkinsonian conditions, and subthalamic nucleus deep brain stimulation (DBS). The resulting network is capable of replicating the loss of thalamic relay capabilities in the Parkinsonian state and its return under DBS; this is also demonstrated using a network capable of action selection. Finally, a study of correlation transfer under different patterns of Parkinsonian activity is presented. These networks successfully captured the significant results of the original studies. This not only creates a foundation for neuromorphic hardware implementations but may also support the development of large-scale biophysical models, the former potentially providing a way of improving the efficacy of DBS and the latter allowing the efficient simulation of larger, more comprehensive networks.
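The Izhikevich hybrid neuron that makes these replications hardware-friendly is just two ODEs plus a discontinuous reset. Below is a forward-Euler sketch with the standard regular-spiking parameters (a, b, c, d); the 1 ms step and the input current are illustrative choices, not values from the replicated BG models:

```python
def izhikevich(I, T, dt=1.0, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Izhikevich hybrid neuron (regular-spiking parameters).
    Returns spike times (ms) for a constant input current I over T ms.
    Cheap enough per step to scale to large networks and low-power hardware."""
    v, u = c, b * c
    spikes = []
    for step in range(int(T / dt)):
        # quadratic voltage equation and slow recovery variable
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:                # spike cutoff reached
            spikes.append(step * dt)
            v, u = c, u + d          # the hybrid (discontinuous) reset
    return spikes
```

Swapping conductance-based neurons for this model trades biophysical detail for a per-neuron cost of a few multiply-adds per step, which is the trade the paper evaluates.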
Affiliation(s)
- Corey M Thibeault
- Center for Neural and Emergent Systems, Information and System Sciences Laboratory, HRL Laboratories LLC, Malibu, CA, USA; Department of Electrical and Biomedical Engineering, University of Nevada, Reno, NV, USA; Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA
8
Dockendorf K, Srinivasa N. Learning and prospective recall of noisy spike pattern episodes. Front Comput Neurosci 2013; 7:80. [PMID: 23801961] [PMCID: PMC3689221] [DOI: 10.3389/fncom.2013.00080]
Abstract
Spike patterns in vivo are often incomplete or corrupted with noise, which makes inputs to neuronal networks appear to vary although they may, in fact, be samples of a single underlying pattern or repeated presentation. Here we present a recurrent spiking neural network (SNN) model that learns noisy pattern sequences through the use of homeostasis and spike-timing-dependent plasticity (STDP). We find that the changes in the synaptic weight vector during learning of patterns of random ensembles are approximately orthogonal in a reduced-dimension space when the patterns are constructed to minimize overlap in representations. Using this model, representations of sparse patterns may be associated through co-activated firing and integrated into ensemble representations. While the model is tolerant to noise, prospective activity and pattern completion differ in their ability to adapt in the presence of noise. One version of the model is able to demonstrate the recently discovered phenomena of preplay and replay, reminiscent of hippocampal behaviors.
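The two learning ingredients the model combines, pair-based STDP and homeostasis, can each be written in a couple of lines. This is a generic sketch with assumed amplitudes and time constants, and multiplicative synaptic scaling is just one simple stand-in for the homeostasis the paper uses:

```python
def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP kernel, dt_ms = t_post - t_pre (ms).
    Pre-before-post (dt > 0) potentiates; post-before-pre depresses.
    2.718281828... is approximated via the exponential below."""
    e = 2.718281828459045
    if dt_ms > 0:
        return a_plus * e ** (-dt_ms / tau)
    return -a_minus * e ** (dt_ms / tau)

def homeostatic_scale(weights, target_sum):
    """Multiplicative synaptic scaling toward a fixed total input drive,
    keeping runaway STDP potentiation in check."""
    s = sum(weights)
    if s == 0:
        return list(weights)
    return [w * target_sum / s for w in weights]
```

STDP sculpts which synapses carry a pattern, while the scaling step keeps each neuron's total drive bounded so that learned ensembles stay approximately orthogonal rather than collapsing onto the strongest inputs.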
Affiliation(s)
- Karl Dockendorf
- Information and System Sciences Lab, Center for Neural and Emergent Systems, HRL Laboratories LLC, Malibu, CA, USA
9
Jayet Bray LC, Ferneyhough GB, Barker ER, Thibeault CM, Harris FC. Reward-based learning for virtual neurorobotics through emotional speech processing. Front Neurorobot 2013; 7:8. [PMID: 23641213] [PMCID: PMC3638126] [DOI: 10.3389/fnbot.2013.00008]
Abstract
Reward-based learning is readily applied in real life, most prominently in methods for teaching children. It also allows machines and software agents to automatically determine the ideal behavior from simple reward feedback (e.g., encouragement) so as to maximize their performance. Advancements in affective computing, especially emotional speech processing (ESP), have allowed for more natural interaction between humans and robots. Our research focuses on integrating a novel ESP system in a relevant virtual neurorobotic (VNR) application. We created an emotional speech classifier that successfully distinguished happy utterances from other speech. The accuracy of the system was 95.3% in offline mode (using an emotional speech database) and 98.7% in live mode (using live recordings). The classifier was then integrated into a neurorobotic scenario in which a virtual neurorobot had to learn a simple exercise through reward-based learning. If the correct decision was made, the robot received a spoken reward, which in turn stimulated synapses (in our simulated model) undergoing spike-timing-dependent plasticity (STDP) and reinforced the corresponding neural pathways. Together, our ESP and neurorobotic systems allowed the neurorobot to successfully and consistently learn the exercise. The integration of ESP in a real-time computational neuroscience architecture is a first step toward combining human emotions and virtual neurorobotics.
Affiliation(s)
- Laurence C. Jayet Bray
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA
- Department of Bioengineering, George Mason University, Fairfax, VA, USA
- Emily R. Barker
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA
- Frederick C. Harris
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, USA