1. Zhao Z, Zhao F, Zhao Y, Zeng Y, Sun Y. A brain-inspired theory of mind spiking neural network improves multi-agent cooperation and competition. Patterns (N Y) 2023; 4:100775. PMID: 37602221; PMCID: PMC10435963; DOI: 10.1016/j.patter.2023.100775.
Abstract
During dynamic social interaction, inferring and predicting others' behaviors through theory of mind (ToM) is crucial for obtaining benefits in cooperative and competitive tasks. Current multi-agent reinforcement learning (MARL) methods primarily rely on agent observations to select behaviors, but they lack inspiration from ToM, which limits performance. In this article, we propose a multi-agent ToM decision-making (MAToM-DM) model, which consists of a MAToM spiking neural network (MAToM-SNN) module and a decision-making module. We design two brain-inspired ToM modules (Self-MAToM and Other-MAToM) to predict others' behaviors based on self-experience and observations of others, respectively. Each agent can adjust its behavior according to the predicted actions of others. The effectiveness of the proposed model has been demonstrated through experiments conducted in cooperative and competitive tasks. The results indicate that integrating the ToM mechanism can enhance cooperation and competition efficiency and lead to higher rewards compared with traditional MARL models.
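The "Self-MAToM" idea, predicting others from one's own experience, can be sketched in a few lines. This is an illustrative toy (not the authors' MAToM-SNN); the grid-style observations, `self_policy`, and the coordination rule are all hypothetical.

```python
# Illustrative sketch (not the authors' MAToM-SNN) of the "Self-MAToM"
# idea: an agent reuses its own policy to predict a partner's next
# action from the partner's observation, then conditions its own
# choice on that prediction.

def self_policy(observation):
    # Toy policy learned from self-experience: move toward the target
    # along the axis with the larger coordinate gap (dx, dy).
    dx, dy = observation
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"

def predict_other(other_observation):
    # Self-MAToM: simulate the other agent with one's own policy.
    return self_policy(other_observation)

def choose_action(my_obs, other_obs):
    predicted = predict_other(other_obs)
    action = self_policy(my_obs)
    if action == predicted:          # avoid duplicating the partner's move
        dx, dy = my_obs              # fall back to the other axis
        action = ("up" if dy > 0 else "down") if abs(dx) >= abs(dy) \
            else ("right" if dx > 0 else "left")
    return action, predicted
```

The paper's "Other-MAToM" variant would instead fit a separate predictive model from observations of the other agent's past behavior.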
Affiliation(s)
- Zhuoya Zhao
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China
- Feifei Zhao
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Yuxuan Zhao
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Yi Zeng
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  - Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China
  - State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
- Yinqian Sun
  - Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China
2. Sleep prevents catastrophic forgetting in spiking neural networks by forming a joint synaptic weight representation. PLoS Comput Biol 2022; 18:e1010628. PMID: 36399437; PMCID: PMC9674146; DOI: 10.1371/journal.pcbi.1010628.
Abstract
Artificial neural networks overwrite previously learned tasks when trained sequentially, a phenomenon known as catastrophic forgetting. In contrast, the brain learns continuously and typically learns best when new training is interleaved with periods of sleep for memory consolidation. Here we used a spiking network to study the mechanisms behind catastrophic forgetting and the role of sleep in preventing it. The network could be trained to learn a complex foraging task but exhibited catastrophic forgetting when trained sequentially on different tasks. In synaptic weight space, new-task training moved the synaptic weight configuration away from the manifold representing the old task, leading to forgetting. Interleaving new-task training with periods of offline reactivation, mimicking biological sleep, mitigated catastrophic forgetting by constraining the network's synaptic weight state to the previously learned manifold while allowing the weight configuration to converge toward the intersection of the manifolds representing the old and new tasks. The study reveals a possible strategy of synaptic weight dynamics that the brain applies during sleep to prevent forgetting and optimize learning.
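The weight-space picture in this abstract can be caricatured in a few lines. In this sketch (not the paper's model) each "task" is idealized as a single target point standing in for its solution manifold; awake training pulls the weights toward the new task, and an interleaved sleep phase pulls them back toward the old one.

```python
# Minimal caricature of interleaving new-task training with sleep-like
# replay: sequential training drifts off the old task's "manifold",
# interleaved replay keeps the weights near both targets.

def step_toward(w, target, lr=0.2):
    return [wi + lr * (ti - wi) for wi, ti in zip(w, target)]

def train(w, new_target, old_target=None, epochs=50):
    for _ in range(epochs):
        w = step_toward(w, new_target)        # awake: new-task training
        if old_target is not None:
            w = step_toward(w, old_target)    # sleep: replay of old task
    return w

def err(w, target):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

old_task, new_task = [1.0, 0.0], [0.0, 1.0]
w0 = train([0.5, 0.5], old_task)              # learn the old task first
w_seq = train(w0, new_task)                   # sequential: forgets old task
w_sleep = train(w0, new_task, old_task)       # interleaved with sleep
```

With sleep-like replay the weights settle between the two targets instead of abandoning the old one, which is the intersection-seeking behavior the abstract describes.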
3. Zhao F, Zeng Y, Han B, Fang H, Zhao Z. Nature-inspired self-organizing collision avoidance for drone swarm based on reward-modulated spiking neural network. Patterns (N Y) 2022; 3:100611. DOI: 10.1016/j.patter.2022.100611.
4. Haşegan D, Deible M, Earl C, D’Onofrio D, Hazan H, Anwar H, Neymotin SA. Training spiking neuronal networks to perform motor control using reinforcement and evolutionary learning. Front Comput Neurosci 2022; 16:1017284. PMID: 36249482; PMCID: PMC9563231; DOI: 10.3389/fncom.2022.1017284.
Abstract
Artificial neural networks (ANNs) have been successfully trained to perform a wide range of sensory-motor behaviors. In contrast, the performance of spiking neuronal network (SNN) models trained to perform similar behaviors remains relatively suboptimal. In this work, we aimed to push the field of SNNs forward by exploring the potential of different learning mechanisms to achieve optimal performance. We trained SNNs to solve the CartPole reinforcement learning (RL) control problem using two learning mechanisms operating at different timescales: (1) spike-timing-dependent reinforcement learning (STDP-RL) and (2) evolutionary strategy (EVOL). Though the role of STDP-RL in biological systems is well established, several other mechanisms, though not fully understood, work in concert during learning in vivo. Recreating accurate models that capture the interaction of STDP-RL with these diverse learning mechanisms is extremely difficult. EVOL is an alternative method and has been successfully used in many studies to fit model neural responsiveness to electrophysiological recordings and, in some cases, for classification problems. One advantage of EVOL is that it may not need to capture all interacting components of synaptic plasticity and thus provides a better alternative to STDP-RL. Here, we compared the performance of each algorithm after training, which revealed EVOL as a powerful method for training SNNs to perform sensory-motor behaviors. Our modeling opens up new capabilities for SNNs in RL and could serve as a testbed for neurobiologists aiming to understand multi-timescale learning mechanisms and dynamics in neuronal circuits.
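The EVOL mechanism's key property, improving weights from episode reward alone with no gradients and no plasticity model, can be sketched with a simple (1+1) evolutionary strategy. This is a hedged toy: the scalar weight and quadratic `fitness` are hypothetical stand-ins for an SNN's weights and its CartPole episode return.

```python
import random

# (1+1)-ES sketch of the EVOL idea: perturb the weights with Gaussian
# noise and keep a perturbation only if it improves fitness.

random.seed(0)

def fitness(w):
    # Hypothetical stand-in for episode reward; peaks at w = 3.
    return -(w - 3.0) ** 2

def evolve(w, generations=200, sigma=0.5):
    best = fitness(w)
    for _ in range(generations):
        candidate = w + random.gauss(0.0, sigma)
        f = fitness(candidate)
        if f > best:                 # keep only improving mutations
            w, best = candidate, f
    return w

w_best = evolve(0.0)
```

Because only the reward signal is consulted, the same loop works regardless of which plasticity mechanisms operate inside the network, which is the advantage over STDP-RL the abstract points to.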
Affiliation(s)
- Daniel Haşegan
  - Vilcek Institute of Graduate Biomedical Sciences, NYU Grossman School of Medicine, New York, NY, United States
- Matt Deible
  - Department of Computer Science, University of Pittsburgh, Pittsburgh, PA, United States
- Christopher Earl
  - Department of Computer Science, University of Massachusetts Amherst, Amherst, MA, United States
- David D’Onofrio
  - Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, United States
- Hananel Hazan
  - Allen Discovery Center, Tufts University, Boston, MA, United States
- Haroon Anwar
  - Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, United States
- Samuel A. Neymotin
  - Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, United States
  - Department of Psychiatry, NYU Grossman School of Medicine, New York, NY, United States
5. Antonov DI, Sviatov KV, Sukhov S. Continuous learning of spiking networks trained with local rules. Neural Netw 2022; 155:512-522. PMID: 36166978; DOI: 10.1016/j.neunet.2022.09.003.
Abstract
Artificial neural networks (ANNs) experience catastrophic forgetting (CF) during sequential learning. In contrast, the brain can learn continuously without any signs of catastrophic forgetting. Spiking neural networks (SNNs) are the next generation of ANNs, with many features borrowed from biological neural networks. Thus, SNNs potentially promise better resilience to CF. In this paper, we study the susceptibility of SNNs to CF and test several biologically inspired methods for mitigating catastrophic forgetting. The SNNs are trained with biologically plausible local training rules based on spike-timing-dependent plasticity (STDP). Local training prohibits the direct use of CF-prevention methods based on gradients of a global loss function. We developed and tested a method that determines the importance of synapses (weights) based on stochastic Langevin dynamics, without the need for gradients. Several other methods of catastrophic forgetting prevention adapted from analog neural networks were tested as well. The experiments were performed on freely available datasets in the SpykeTorch environment.
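The locality that blocks gradient-based CF prevention is visible in a standard pair-based STDP update: the weight change depends only on the timing difference between one presynaptic and one postsynaptic spike, with no global loss in sight. This is an illustrative textbook rule, not the paper's exact one, and the parameter values are arbitrary but typical.

```python
import math

# Pair-based STDP: purely local weight update from spike timing.

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12, tau=20.0):
    dt = t_post - t_pre                      # spike-time difference (ms)
    if dt > 0:                               # pre before post: potentiate
        return a_plus * math.exp(-dt / tau)
    if dt < 0:                               # post before pre: depress
        return -a_minus * math.exp(dt / tau)
    return 0.0
```

Importance-based CF methods such as elastic weight consolidation need gradients of a global loss to score each weight, which is exactly what this local rule never computes; hence the paper's gradient-free, Langevin-based importance estimate.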
Affiliation(s)
- D I Antonov
  - Kotelnikov Institute of Radio Engineering and Electronics of Russian Academy of Sciences (Ulyanovsk branch), 48/2 Goncharov Str., Ulyanovsk 432071, Russia
- K V Sviatov
  - Ulyanovsk State Technical University, 32 Severny Venets, Ulyanovsk 432027, Russia
- S Sukhov
  - Kotelnikov Institute of Radio Engineering and Electronics of Russian Academy of Sciences (Ulyanovsk branch), 48/2 Goncharov Str., Ulyanovsk 432071, Russia
6. Anwar H, Caby S, Dura-Bernal S, D’Onofrio D, Hasegan D, Deible M, Grunblatt S, Chadderdon GL, Kerr CC, Lakatos P, Lytton WW, Hazan H, Neymotin SA. Training a spiking neuronal network model of visual-motor cortex to play a virtual racket-ball game using reinforcement learning. PLoS One 2022; 17:e0265808. PMID: 35544518; PMCID: PMC9094569; DOI: 10.1371/journal.pone.0265808.
Abstract
Recent models of spiking neuronal networks have been trained to perform behaviors in static environments using a variety of learning rules, with varying degrees of biological realism. Most of these models have not been tested in dynamic visual environments, where models must make predictions about future states and adjust their behavior accordingly. The models using these learning rules are often treated as black boxes, with little analysis of the circuit architectures and learning mechanisms supporting optimal performance. Here we developed visual/motor spiking neuronal network models and trained them to play a virtual racket-ball game using several reinforcement learning algorithms inspired by the dopaminergic reward system. We systematically investigated how different architectures and circuit motifs (feed-forward, recurrent, feedback) contributed to learning and performance. We also developed a new biologically inspired learning rule that significantly enhanced performance while reducing training time. Our models included visual areas encoding game inputs and relaying the information to motor areas, which used this information to learn to move the racket to hit the ball. Neurons in the early visual area relayed information encoding object location and motion direction across the network. Neuronal association areas encoded spatial relationships between objects in the visual scene. Motor populations received inputs from visual and association areas representing the dorsal pathway. Two populations of motor neurons generated commands to move the racket up or down. Model-generated actions updated the environment and triggered reward or punishment signals that adjusted synaptic weights so that the models could learn which actions led to reward. Here we demonstrate that our biologically plausible learning rules were effective in training spiking neuronal network models to solve problems in dynamic environments. We used our models to dissect the circuit architectures and learning rules most effective for learning. Our model shows that learning mechanisms involving different neural circuits produce similar performance in sensory-motor tasks. In biological networks, all learning mechanisms may complement one another, accelerating the learning capabilities of animals. Furthermore, this also highlights the resilience and redundancy in biological systems.
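The delayed reward/punishment mechanism this abstract describes is commonly implemented with eligibility traces. The sketch below is illustrative, not the authors' code: coincident pre/post activity tags a synapse, and a later dopamine-like reward signal converts the decaying tag into an actual weight change. All parameter values are hypothetical.

```python
# Reward-modulated plasticity sketch: eligibility trace plus a delayed
# global reward signal that gates the actual weight update.

def update(weight, trace, pre_spike, post_spike,
           reward=0.0, lr=0.05, decay=0.9):
    # Coincidence detection marks the synapse as eligible for change.
    trace = decay * trace + (1.0 if pre_spike and post_spike else 0.0)
    # Reward (+) strengthens tagged synapses; punishment (-) weakens them.
    weight += lr * reward * trace
    return weight, trace

# A coincidence now, a reward a step later: the synapse still learns.
w, tr = update(0.5, 0.0, True, True)                 # tag, no reward yet
w, tr = update(w, tr, False, False, reward=1.0)      # delayed reward
```

Because the trace decays, credit flows preferentially to synapses that were active shortly before the reward arrived, which is how actions that led to reward get reinforced.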
Affiliation(s)
- Haroon Anwar
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
- Simon Caby
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
- Salvador Dura-Bernal
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
  - Dept. Physiology & Pharmacology, State University of New York Downstate, Brooklyn, New York, United States of America
- David D’Onofrio
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
- Daniel Hasegan
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
- Matt Deible
  - University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Sara Grunblatt
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
- George L. Chadderdon
  - Dept. Physiology & Pharmacology, State University of New York Downstate, Brooklyn, New York, United States of America
- Cliff C. Kerr
  - Dept Physics, University of Sydney, Sydney, Australia
  - Institute for Disease Modeling, Global Health Division, Bill & Melinda Gates Foundation, Seattle, Washington, United States of America
- Peter Lakatos
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
  - Dept. Psychiatry, NYU Grossman School of Medicine, New York, New York, United States of America
- William W. Lytton
  - Dept. Physiology & Pharmacology, State University of New York Downstate, Brooklyn, New York, United States of America
  - Dept Neurology, Kings County Hospital Center, Brooklyn, New York, United States of America
- Hananel Hazan
  - Dept of Biology, Tufts University, Medford, Massachusetts, United States of America
- Samuel A. Neymotin
  - Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America
  - Dept. Psychiatry, NYU Grossman School of Medicine, New York, New York, United States of America
7. Demin V, Nekhaev D. Recurrent Spiking Neural Network Learning Based on a Competitive Maximization of Neuronal Activity. Front Neuroinform 2018; 12:79. PMID: 30498439; PMCID: PMC6250118; DOI: 10.3389/fninf.2018.00079.
Abstract
Spiking neural networks (SNNs) are believed to be highly computationally and energy efficient for specific neurochip hardware real-time solutions. However, there is a lack of learning algorithms for complex SNNs with recurrent connections that are comparable in efficiency with back-propagation techniques and capable of unsupervised training. Here we suppose that each neuron in a biological neural network tends to maximize its activity in competition with other neurons, and we put this principle at the basis of a new SNN learning algorithm. In this way, a spiking network with learned feed-forward, reciprocal, and intralayer inhibitory connections is applied to digit recognition on the MNIST database. We demonstrate that this SNN can be trained without a teacher after a short supervised initialization of the weights by the same algorithm. We also show that neurons are grouped into families of hierarchical structures corresponding to different digit classes and their associations. This property is expected to be useful for reducing the number of layers in deep neural networks and for modeling the formation of various functional structures in a biological nervous system. Comparison of the learning properties of the suggested algorithm with those of the Sparse Distributed Representation approach shows similarity in coding but also some advantages of the former. The basic principle of the proposed algorithm is believed to be practically applicable to the construction of much more complicated and diverse task-solving SNNs. We refer to this new approach as "Family-Engaged Execution and Learning of Induced Neuron Groups," or FEELING.
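The competition principle, each neuron maximizing its activity against the others, can be sketched with a winner-take-all step standing in for intralayer inhibition. This is a loose illustration, not the FEELING algorithm itself; the rate-based neurons and learning rule here are hypothetical simplifications of the paper's spiking formulation.

```python
# Competitive learning sketch: the most active neuron wins the
# inhibitory competition and strengthens the inputs that drove it.

def competitive_step(weights, x, lr=0.1):
    # weights: one input-weight list per neuron; x: input vector
    drive = [sum(w * xi for w, xi in zip(ws, x)) for ws in weights]
    winner = max(range(len(weights)), key=drive.__getitem__)
    # Only the winner learns, moving its weights toward the input.
    weights[winner] = [w + lr * (xi - w)
                       for w, xi in zip(weights[winner], x)]
    return winner, weights
```

Repeated over an input stream, this kind of rule makes different neurons specialize on different input patterns, which is the mechanism behind the digit-class "families" the abstract reports.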
Affiliation(s)
- Vyacheslav Demin
  - National Research Center "Kurchatov Institute", Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Dmitry Nekhaev
  - National Research Center "Kurchatov Institute", Moscow, Russia
8. Follmann R, Shaffer A, Mobille Z, Rutherford G, Rosa E. Synchronous tonic-to-bursting transitions in a neuronal hub motif. Chaos 2018; 28:106315. PMID: 30384663; DOI: 10.1063/1.5039880.
Abstract
We study a heterogeneous neuronal network motif where a central node (hub neuron) is connected via electrical synapses to other nodes (peripheral neurons). Our numerical simulations show that the networked neurons synchronize in three different states: (i) robust tonic, (ii) robust bursting, and (iii) tonic initially evolving to bursting through a period-doubling cascade and a transition to chaos. This third case displays interesting features, including the persistence of a characteristic firing rate found in the single-neuron tonic-to-bursting transition.
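The synchronizing effect of the electrical synapses can be illustrated with a toy model (not the paper's bursting neurons): each gap junction contributes a current g*(V_other - V_self), which pulls the hub and peripheral membrane potentials together. The leaky-integrator dynamics and all parameter values below are hypothetical.

```python
# Toy hub motif with gap-junction (electrical-synapse) coupling:
# the hub exchanges a current g*(V_j - V_i) with every peripheral.

def simulate(v_hub, v_peripherals, g=0.3, leak=0.05, v_rest=0.0,
             steps=200, dt=0.1):
    per = list(v_peripherals)
    for _ in range(steps):
        gap_hub = sum(g * (vp - v_hub) for vp in per)  # hub sums all inputs
        v_hub += dt * (leak * (v_rest - v_hub) + gap_hub)
        per = [vp + dt * (leak * (v_rest - vp) + g * (v_hub - vp))
               for vp in per]
    return v_hub, per

# Start the four neurons at widely spread potentials.
v_hub, per = simulate(-1.0, [1.0, 0.5, -0.5])
```

Because the coupling current is diffusive (it vanishes when potentials match), the initial spread of 2.0 collapses toward a common value, the same pull toward synchrony that, in the paper's full model, drives the networked neurons into the shared tonic or bursting states.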
Affiliation(s)
- Rosangela Follmann
  - School of Information Technology, Illinois State University, Normal, Illinois 61790, USA
- Annabelle Shaffer
  - Department of Physics, Illinois State University, Normal, Illinois 61790, USA
- Zachary Mobille
  - Department of Physics, Illinois State University, Normal, Illinois 61790, USA
- George Rutherford
  - Department of Physics, Illinois State University, Normal, Illinois 61790, USA
- Epaminondas Rosa
  - Department of Physics, Illinois State University, Normal, Illinois 61790, USA