1
Zhang W, Geng H, Li P. Composing recurrent spiking neural networks using locally-recurrent motifs and risk-mitigating architectural optimization. Front Neurosci 2024; 18:1412559. [PMID: 38966757 PMCID: PMC11222634 DOI: 10.3389/fnins.2024.1412559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 06/03/2024] [Indexed: 07/06/2024] Open
Abstract
In neural circuits, recurrent connectivity plays a crucial role in network function and stability. However, existing recurrent spiking neural networks (RSNNs) are often constructed by random connections without optimization. While RSNNs can produce rich dynamics that are critical for memory formation and learning, systematic architectural optimization of RSNNs is still an open challenge. We aim to enable systematic design of large RSNNs via a new scalable RSNN architecture and automated architectural optimization. We compose RSNNs based on a layer architecture called Sparsely-Connected Recurrent Motif Layer (SC-ML) that consists of multiple small recurrent motifs wired together by sparse lateral connections. The small size of the motifs and the sparse inter-motif connectivity lead to an RSNN architecture scalable to large network sizes. We further propose a method called Hybrid Risk-Mitigating Architectural Search (HRMAS) to systematically optimize the topology of the proposed recurrent motifs and SC-ML layer architecture. HRMAS is an alternating two-step optimization process by which we mitigate the risk of network instability and performance degradation caused by architectural change by introducing a novel biologically-inspired "self-repairing" mechanism through intrinsic plasticity. The intrinsic plasticity is introduced to the second step of each HRMAS iteration and acts as unsupervised fast self-adaptation to structural and synaptic weight modifications introduced by the first step during the RSNN architectural "evolution." We demonstrate that the proposed automatic architecture optimization leads to significant performance gains over existing manually designed RSNNs: we achieve 96.44% on TI46-Alpha, 94.66% on N-TIDIGITS, 90.28% on DVS-Gesture, and 98.72% on N-MNIST. To the best of the authors' knowledge, this is the first work to perform systematic architecture optimization on RSNNs.
Affiliation(s)
- Peng Li
- Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, United States
2
Sun P, Chua Y, Devos P, Botteldooren D. Learnable axonal delay in spiking neural networks improves spoken word recognition. Front Neurosci 2023; 17:1275944. [PMID: 38027508 PMCID: PMC10665570 DOI: 10.3389/fnins.2023.1275944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 10/23/2023] [Indexed: 12/01/2023] Open
Abstract
Spiking neural networks (SNNs), which are composed of biologically plausible spiking neurons, and combined with bio-physically realistic auditory periphery models, offer a means to explore and understand human auditory processing, especially in tasks where precise timing is essential. However, because of the inherent temporal complexity in spike sequences, the performance of SNNs has remained less competitive compared to artificial neural networks (ANNs). To tackle this challenge, a fundamental research topic is the configuration of spike-timing and the exploration of more intricate architectures. In this work, we demonstrate that a learnable axonal delay combined with local skip-connections yields state-of-the-art performance on challenging benchmarks for spoken word recognition. Additionally, we introduce an auxiliary loss term to further enhance accuracy and stability. Experiments on the neuromorphic speech benchmark datasets, N-TIDIGITS and SHD, show improvements in performance when incorporating our delay module in comparison to vanilla feedforward SNNs. Specifically, with the integration of our delay module, the performance on N-TIDIGITS and SHD improves by 14% and 18%, respectively. When paired with local skip-connections and the auxiliary loss, our approach surpasses both recurrent and convolutional neural networks, yet uses 10× fewer parameters for N-TIDIGITS and 7× fewer for SHD.
Affiliation(s)
- Pengfei Sun
- Department of Information Technology, WAVES Research Group, Ghent University, Ghent, Belgium
- Yansong Chua
- Neuromorphic Computing Laboratory, China Nanhu Academy of Electronics and Information Technology, Jiaxing, China
- Paul Devos
- Department of Information Technology, WAVES Research Group, Ghent University, Ghent, Belgium
- Dick Botteldooren
- Department of Information Technology, WAVES Research Group, Ghent University, Ghent, Belgium
3
Liu S, Leung VCH, Dragotti PL. First-spike coding promotes accurate and efficient spiking neural networks for discrete events with rich temporal structures. Front Neurosci 2023; 17:1266003. [PMID: 37849889 PMCID: PMC10577212 DOI: 10.3389/fnins.2023.1266003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 09/11/2023] [Indexed: 10/19/2023] Open
Abstract
Spiking neural networks (SNNs) are well-suited to process asynchronous event-based data. Most of the existing SNNs use rate-coding schemes that focus on firing rate (FR), and so they generally ignore the spike timing in events. In contrast, methods based on temporal coding, particularly time-to-first-spike (TTFS) coding, can be accurate and efficient but they are difficult to train. Currently, there is limited research on applying TTFS coding to real events, since traditional TTFS-based methods impose a one-spike constraint, which is not realistic for event-based data. In this study, we present a novel decision-making strategy based on first-spike (FS) coding that encodes FS timings of the output neurons to investigate the role of the first-spike timing in classifying real-world event sequences with complex temporal structures. To achieve FS coding, we propose a novel surrogate gradient learning method for discrete spike trains. In the forward pass, output spikes are encoded into discrete times to generate FS times. In the backward pass, we develop an error assignment method that propagates error from FS times to spikes through a Gaussian window, and then supervised learning for spikes is implemented through a surrogate gradient approach. Additional strategies are introduced to facilitate the training of FS timings, such as adding empty sequences and employing different parameters for different layers. We make a comprehensive comparison between FS and FR coding in the experiments. Our results show that FS coding achieves comparable accuracy to FR coding while leading to superior energy efficiency and distinct neuronal dynamics on data sequences with very rich temporal structures. Additionally, a longer time delay in the first spike leads to higher accuracy, indicating important information is encoded in the timing of the first spike.
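The contrast between first-spike (FS) and firing-rate (FR) decisions described in this abstract can be illustrated with a minimal sketch. It assumes that the predicted class is the output neuron with the earliest first spike, that silent neurons are assigned the latest possible time, and that ties go to the lowest index. None of these details are stated in the abstract, and all function names are hypothetical.

```python
def first_spike_times(spike_trains):
    """Return the time step of the first spike for each output neuron.

    Assumption: a neuron that never fires gets len(train), i.e. the latest time.
    """
    times = []
    for train in spike_trains:
        t_first = len(train)
        for t, spike in enumerate(train):
            if spike:
                t_first = t
                break
        times.append(t_first)
    return times


def decide_first_spike(spike_trains):
    """Pick the neuron with the earliest first spike (ties go to the lowest index)."""
    times = first_spike_times(spike_trains)
    return min(range(len(times)), key=lambda i: times[i])


def decide_firing_rate(spike_trains):
    """Pick the neuron with the most spikes (ties go to the lowest index)."""
    counts = [sum(train) for train in spike_trains]
    return max(range(len(counts)), key=lambda i: counts[i])


# Toy output layer with three neurons over eight discrete time steps.
outputs = [
    [0, 0, 0, 1, 1, 1, 1, 0],  # neuron 0: fires late but often
    [0, 1, 0, 0, 0, 0, 0, 0],  # neuron 1: fires first, only once
    [0, 0, 0, 0, 0, 0, 0, 0],  # neuron 2: silent
]

fs_times = first_spike_times(outputs)
fs_choice = decide_first_spike(outputs)
fr_choice = decide_firing_rate(outputs)
```

On this toy input the two rules disagree: the FS rule selects neuron 1 after a single spike, while the FR rule selects neuron 0 because it emits the most spikes.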
Affiliation(s)
- Siying Liu
- Communications and Signal Processing Group, Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom
4
Wang J, Lin S, Liu A. Bioinspired Perception and Navigation of Service Robots in Indoor Environments: A Review. Biomimetics (Basel) 2023; 8:350. [PMID: 37622955 PMCID: PMC10452487 DOI: 10.3390/biomimetics8040350] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/27/2023] [Revised: 07/27/2023] [Accepted: 08/01/2023] [Indexed: 08/26/2023] Open
Abstract
Biological principles attract attention in service robotics because robots rely on similar concepts when performing various tasks. Bioinspired perception, which draws on animals' awareness of the environment, is significant for robotic perception. This paper reviews the bioinspired perception and navigation of service robots in indoor environments, which are popular applications of civilian robotics. The navigation approaches are classified by perception type, including vision-based, remote sensing, tactile sensor, olfactory, sound-based, inertial, and multimodal navigation. The trend in state-of-the-art techniques is toward multimodal navigation, which combines several approaches. The challenges in indoor navigation center on precise localization and on dynamic, complex environments with moving objects and people.
Affiliation(s)
- Jianguo Wang
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
- Shiwei Lin
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
- Ang Liu
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
5
Kang P, Banerjee S, Chopp H, Katsaggelos A, Cossairt O. Boost event-driven tactile learning with location spiking neurons. Front Neurosci 2023; 17:1127537. [PMID: 37152590 PMCID: PMC10160479 DOI: 10.3389/fnins.2023.1127537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 03/28/2023] [Indexed: 05/09/2023] Open
Abstract
Tactile sensing is essential for a variety of daily tasks. Inspired by the event-driven nature and sparse spiking communication of the biological systems, recent advances in event-driven tactile sensors and Spiking Neural Networks (SNNs) spur the research in related fields. However, SNN-enabled event-driven tactile learning is still in its infancy due to the limited representation abilities of existing spiking neurons and high spatio-temporal complexity in the event-driven tactile data. In this paper, to improve the representation capability of existing spiking neurons, we propose a novel neuron model called "location spiking neuron," which enables us to extract features of event-based data in a novel way. Specifically, based on the classical Time Spike Response Model (TSRM), we develop the Location Spike Response Model (LSRM). In addition, based on the most commonly-used Time Leaky Integrate-and-Fire (TLIF) model, we develop the Location Leaky Integrate-and-Fire (LLIF) model. Moreover, to demonstrate the representation effectiveness of our proposed neurons and capture the complex spatio-temporal dependencies in the event-driven tactile data, we exploit the location spiking neurons to propose two hybrid models for event-driven tactile learning. Specifically, the first hybrid model combines a fully-connected SNN with TSRM neurons and a fully-connected SNN with LSRM neurons. And the second hybrid model fuses the spatial spiking graph neural network with TLIF neurons and the temporal spiking graph neural network with LLIF neurons. Extensive experiments demonstrate the significant improvements of our models over the state-of-the-art methods on event-driven tactile learning, including event-driven tactile object recognition and event-driven slip detection. 
Moreover, compared to the counterpart artificial neural networks (ANNs), our SNN models are 10× to 100× more energy-efficient, which shows the superior energy efficiency of our models and may bring new opportunities to the spike-based learning community and neuromorphic engineering. Finally, we thoroughly examine the advantages and limitations of various spiking neurons and discuss the broad applicability and potential impact of this work on other spike-based learning applications.
Affiliation(s)
- Peng Kang
- Department of Computer Science, Northwestern University, Evanston, IL, United States
- *Correspondence: Peng Kang
- Srutarshi Banerjee
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
- Henry Chopp
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
- Aggelos Katsaggelos
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
- Oliver Cossairt
- Department of Computer Science, Northwestern University, Evanston, IL, United States
6
Xu Y, Perera S, Bethi Y, Afshar S, van Schaik A. Event-driven spectrotemporal feature extraction and classification using a silicon cochlea model. Front Neurosci 2023; 17:1125210. [PMID: 37144092 PMCID: PMC10151790 DOI: 10.3389/fnins.2023.1125210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 03/27/2023] [Indexed: 05/06/2023] Open
Abstract
This paper presents a reconfigurable digital implementation of an event-based binaural cochlear system on a Field Programmable Gate Array (FPGA). It consists of a pair of the Cascade of Asymmetric Resonators with Fast Acting Compression (CAR-FAC) cochlea models and leaky integrate-and-fire (LIF) neurons. Additionally, we propose an event-driven SpectroTemporal Receptive Field (STRF) Feature Extraction using Adaptive Selection Thresholds (FEAST). It is tested on the TIDIGITS benchmark and compared with current event-based auditory signal processing approaches and neural networks.
7
Forno E, Fra V, Pignari R, Macii E, Urgese G. Spike encoding techniques for IoT time-varying signals benchmarked on a neuromorphic classification task. Front Neurosci 2022; 16:999029. [PMID: 36620463 PMCID: PMC9811205 DOI: 10.3389/fnins.2022.999029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
Spiking Neural Networks (SNNs), known for their potential to enable low energy consumption and computational cost, can bring significant advantages to the realm of embedded machine learning for edge applications. However, input coming from standard digital sensors must be encoded into spike trains before it can be elaborated with neuromorphic computing technologies. We present here a detailed comparison of available spike encoding techniques for the translation of time-varying signals into the event-based signal domain, tested on two different datasets both acquired through commercially available digital devices: the Free Spoken Digit dataset (FSD), consisting of 8-kHz audio files, and the WISDM dataset, composed of 20-Hz recordings of human activity through mobile and wearable inertial sensors. We propose a complete pipeline to benchmark these encoding techniques by performing time-dependent signal classification through a Spiking Convolutional Neural Network (sCNN), including a signal preprocessing step consisting of a bank of filters inspired by the human cochlea, feature extraction by production of a sonogram, transfer learning via an equivalent ANN, and model compression schemes aimed at resource optimization. The resulting performance comparison and analysis provides a powerful practical tool, empowering developers to select the most suitable coding method based on the type of data and the desired processing algorithms, and further expands the applicability of neuromorphic computational paradigms to embedded sensor systems widely employed in the IoT and industrial domains.
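To make the idea of translating a sampled signal into events concrete, here is a minimal sketch of one threshold-based (send-on-delta) scheme. The abstract does not name the encoding techniques it compares, so this variant is an illustrative assumption rather than one of the benchmarked methods, and `delta_encode` is a hypothetical name.

```python
def delta_encode(signal, threshold):
    """Convert a sampled signal into (time_step, polarity) events.

    An event is emitted each time the signal has moved by at least `threshold`
    away from the last reference level; the reference then follows in
    threshold-sized steps. Assumes a non-empty signal and threshold > 0.
    """
    events = []
    reference = signal[0]
    for t, value in enumerate(signal[1:], start=1):
        while value - reference >= threshold:
            reference += threshold
            events.append((t, +1))  # upward crossing
        while reference - value >= threshold:
            reference -= threshold
            events.append((t, -1))  # downward crossing
    return events


# Toy signal: rises in two steps, stays flat, then drops back to zero.
signal = [0.0, 0.5, 1.0, 1.0, 0.0]
events = delta_encode(signal, threshold=0.5)
```

Flat segments of the signal produce no events, which is the source of the sparsity that event-based processing relies on.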
8
Hu SG, Qiao GC, Liu XK, Liu YH, Zhang CM, Zuo Y, Zhou P, Liu YA, Ning N, Yu Q, Liu Y. A Co-Designed Neuromorphic Chip With Compact (17.9K F²) and Weak Neuron Number-Dependent Neuron/Synapse Modules. IEEE Transactions on Biomedical Circuits and Systems 2022; 16:1250-1260. [PMID: 36150001 DOI: 10.1109/tbcas.2022.3209073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2023]
Abstract
Many efforts have been made to improve the neuron integration efficiency on neuromorphic chips, such as using emerging memory devices and shrinking CMOS technology nodes. However, in the fully connected (FC) neuromorphic core, increasing the number of neurons will lead to a square increase in synapse & dendrite costs and a high-slope linear increase in soma costs, resulting in an explosive growth of core hardware costs. We propose a co-designed neuromorphic core (SRCcore) based on the quantized spiking neural network (SNN) technology and compact chip design methodology. The cost of the neuron/synapse module in SRCcore weakly depends on the neuron number, which effectively relieves the growth pressure of the core area caused by increasing the neuron number. In the proposed BICS chip based on SRCcore, although the neuron/synapse module implements 1∼16 times the neurons and 1∼66 times the synapses, it only costs an area of 1.79 × 10⁷ F², which is 7.9%∼38.6% of that in previous works. Based on the weight quantization strategy matched with SRCcore, quantized SNNs achieve 0.05%∼2.19% higher accuracy than previous works, thus supporting the design and application of SRCcore. Finally, a cross-modeling application is demonstrated based on the chip. We hope this work will accelerate the development of cortical-scale neuromorphic systems.
9
Deckers L, Tsang IJ, Van Leekwijck W, Latré S. Extended liquid state machines for speech recognition. Front Neurosci 2022; 16:1023470. [PMID: 36389242 PMCID: PMC9651956 DOI: 10.3389/fnins.2022.1023470] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/19/2022] [Accepted: 10/03/2022] [Indexed: 04/19/2024] Open
Abstract
A liquid state machine (LSM) is a biologically plausible model of a cortical microcircuit. It consists of a random, sparse reservoir of recurrently connected spiking neurons with fixed synapses and a trainable readout layer. The LSM exhibits low training complexity and enables backpropagation-free learning in a powerful, yet simple computing paradigm. In this work, the liquid state machine is enhanced by a set of bio-inspired extensions to create the extended liquid state machine (ELSM), which is evaluated on a set of speech data sets. Firstly, we ensure excitatory/inhibitory (E/I) balance to enable the LSM to operate in the edge-of-chaos regime. Secondly, spike-frequency adaptation (SFA) is introduced in the LSM to improve the memory capabilities. Lastly, neuronal heterogeneity, by means of a differentiation in time constants, is introduced to extract a richer dynamical LSM response. By including E/I balance, SFA, and neuronal heterogeneity, we show that the ELSM consistently improves upon the LSM while retaining the benefits of the straightforward LSM structure and training procedure. The proposed extensions led to an increase in accuracy of up to 5.2% while decreasing the number of spikes in the ELSM by up to 20.2% on benchmark speech data sets. On some benchmarks, the ELSM can even attain performance similar to the current state-of-the-art in spiking neural networks. Furthermore, we illustrate that the ELSM input-liquid and recurrent synaptic weights can be reduced to 4-bit resolution without any significant loss in classification performance. We thus show that the ELSM is a powerful, biologically plausible and hardware-friendly spiking neural network model that can attain near state-of-the-art accuracy on speech recognition benchmarks for spiking neural networks.
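The basic LSM structure named in this abstract (a random, sparse, fixed reservoir of spiking neurons whose response feeds a trainable readout) can be sketched as follows. This is a plain illustration under simple assumptions: discrete-time leaky integrate-and-fire neurons, a scalar input, and spike counts as readout features. It includes none of the ELSM extensions (E/I balance, SFA, heterogeneity), and every name and parameter value is hypothetical.

```python
import random


def make_reservoir(n, p_connect=0.2, w_scale=0.5, seed=0):
    """Random sparse recurrent weights; fixed after creation (never trained)."""
    rng = random.Random(seed)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < p_connect:
                weights[i][j] = rng.uniform(-w_scale, w_scale)
    return weights


def run_reservoir(w_rec, w_in, inputs, leak=0.9, v_th=1.0):
    """Simulate leaky integrate-and-fire neurons; return per-neuron spike counts."""
    n = len(w_rec)
    v = [0.0] * n            # membrane potentials
    prev_spikes = [0] * n    # spikes emitted at the previous time step
    counts = [0] * n
    for x in inputs:
        spikes = [0] * n
        for i in range(n):
            recurrent = sum(w_rec[i][j] for j in range(n) if prev_spikes[j])
            v[i] = leak * v[i] + w_in[i] * x + recurrent
            if v[i] >= v_th:
                spikes[i] = 1
                counts[i] += 1
                v[i] = 0.0   # reset after a spike
        prev_spikes = spikes
    return counts


n_neurons = 20
w_rec = make_reservoir(n_neurons)
w_in = [0.6] * n_neurons
features = run_reservoir(w_rec, w_in, inputs=[1.0] * 50)  # driven reservoir
silent = run_reservoir(w_rec, w_in, inputs=[0.0] * 50)    # no input, no activity
```

Only a readout trained on vectors such as `features` (for example, a linear classifier) would be learned; the reservoir weights stay fixed, which is what keeps training complexity low.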
Affiliation(s)
- Lucas Deckers
- imec IDLab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
10
Cramer B, Stradmann Y, Schemmel J, Zenke F. The Heidelberg Spiking Data Sets for the Systematic Evaluation of Spiking Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:2744-2757. [PMID: 33378266 DOI: 10.1109/tnnls.2020.3044364] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 06/12/2023]
Abstract
Spiking neural networks are the basis of versatile and power-efficient information processing in the brain. Although we currently lack a detailed understanding of how these networks compute, recently developed optimization techniques allow us to instantiate increasingly complex functional spiking neural networks in-silico. These methods hold the promise to build more efficient non-von-Neumann computing hardware and will offer new vistas in the quest of unraveling brain circuit function. To accelerate the development of such methods, objective ways to compare their performance are indispensable. Presently, however, there are no widely accepted means for comparing the computational performance of spiking neural networks. To address this issue, we introduce two spike-based classification data sets, broadly applicable to benchmark both software and neuromorphic hardware implementations of spiking neural networks. To accomplish this, we developed a general audio-to-spiking conversion procedure inspired by neurophysiology. Furthermore, we applied this conversion to an existing and a novel speech data set. The latter is the free, high-fidelity, and word-level aligned Heidelberg digit data set that we created specifically for this study. By training a range of conventional and spiking classifiers, we show that leveraging spike timing information within these data sets is essential for good classification accuracy. These results serve as the first reference for future performance comparisons of spiking neural networks.
11
Ghazi MM, Sorensen L, Ourselin S, Nielsen M. CARRNN: A Continuous Autoregressive Recurrent Neural Network for Deep Representation Learning From Sporadic Temporal Data. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:792-802. [PMID: 35666790 DOI: 10.1109/tnnls.2022.3177366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Learning temporal patterns from multivariate longitudinal data is challenging especially in cases when data is sporadic, as often seen in, e.g., healthcare applications where the data can suffer from irregularity and asynchronicity as the time between consecutive data points can vary across features and samples, hindering the application of existing deep learning models that are constructed for complete, evenly spaced data with fixed sequence lengths. In this article, a novel deep learning-based model is developed for modeling multiple temporal features in sporadic data using an integrated deep learning architecture based on a recurrent neural network (RNN) unit and a continuous-time autoregressive (CAR) model. The proposed model, called CARRNN, uses a generalized discrete-time autoregressive (AR) model that is trainable end-to-end using neural networks modulated by time lags to describe the changes caused by the irregularity and asynchronicity. It is applied to time-series regression and classification tasks for Alzheimer's disease progression modeling, intensive care unit (ICU) mortality rate prediction, human activity recognition, and event-based digit recognition, where the proposed model based on a gated recurrent unit (GRU) in all cases achieves significantly better predictive performance than the state-of-the-art methods using RNNs, GRUs, and long short-term memory (LSTM) networks.
12
Milde MB, Afshar S, Xu Y, Marcireau A, Joubert D, Ramesh B, Bethi Y, Ralph NO, El Arja S, Dennler N, van Schaik A, Cohen G. Neuromorphic Engineering Needs Closed-Loop Benchmarks. Front Neurosci 2022; 16:813555. [PMID: 35237122 PMCID: PMC8884247 DOI: 10.3389/fnins.2022.813555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 01/24/2022] [Indexed: 12/02/2022] Open
Abstract
Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms, from algae to primates, excel in sensing their environment, reacting promptly to their perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that foster sensing accuracy as the primary measure for performance. Sensing accuracy is but an arbitrary proxy for the actual system's goal: making a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and progressing neuromorphic intelligence. The shift towards dynamic real-world benchmarking tasks should usher in richer, more resilient, and robust artificially intelligent systems in the future.
13
Qiao G, Ning N, Zuo Y, Hu S, Yu Q, Liu Y. Direct training of hardware-friendly weight binarized spiking neural network with surrogate gradient learning towards spatio-temporal event-based dynamic data recognition. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.070] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
14
Tayarani-Najaran MH, Schmuker M. Event-Based Sensing and Signal Processing in the Visual, Auditory, and Olfactory Domain: A Review. Front Neural Circuits 2021; 15:610446. [PMID: 34135736 PMCID: PMC8203204 DOI: 10.3389/fncir.2021.610446] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/25/2020] [Accepted: 04/27/2021] [Indexed: 11/13/2022] Open
Abstract
The nervous system converts the physical quantities sensed by its primary receptors into trains of events that are then processed in the brain. The unmatched efficiency in information processing has long inspired engineers to seek brain-like approaches to sensing and signal processing. The key principle pursued in neuromorphic sensing is to shed the traditional approach of periodic sampling in favor of an event-driven scheme that mimics sampling as it occurs in the nervous system, where events are preferably emitted upon the change of the sensed stimulus. In this paper we highlight the advantages and challenges of event-based sensing and signal processing in the visual, auditory and olfactory domains. We also provide a survey of the literature covering neuromorphic sensing and signal processing in all three modalities. Our aim is to facilitate research in event-based sensing and signal processing by providing a comprehensive overview of the research performed previously as well as highlighting conceptual advantages, current progress and future challenges in the field.
Affiliation(s)
- Michael Schmuker
- School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, United Kingdom
15
Zhang M, Wu J, Belatreche A, Pan Z, Xie X, Chua Y, Li G, Qu H, Li H. Supervised learning in spiking neural networks with synaptic delay-weight plasticity. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.03.079] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 02/01/2023]
16
Ceolini E, Frenkel C, Shrestha SB, Taverni G, Khacef L, Payvand M, Donati E. Hand-Gesture Recognition Based on EMG and Event-Based Camera Sensor Fusion: A Benchmark in Neuromorphic Computing. Front Neurosci 2020; 14:637. [PMID: 32903824 PMCID: PMC7438887 DOI: 10.3389/fnins.2020.00637] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 12/15/2019] [Accepted: 05/22/2020] [Indexed: 12/03/2022] Open
Abstract
Hand gestures are a form of non-verbal communication used by individuals in conjunction with speech to communicate. Nowadays, with the increasing use of technology, hand-gesture recognition is considered to be an important aspect of Human-Machine Interaction (HMI), allowing the machine to capture and interpret the user's intent and to respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates complementary systems: the electromyography (EMG) signal from muscles and visual information. This multi-sensor approach, while improving accuracy and robustness, introduces the disadvantage of high computational cost, which grows exponentially with the number of sensors and the number of measurements. Furthermore, this huge amount of data to process can affect the classification latency which can be crucial in real-case scenarios, such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations since they allow real-time processing in parallel at low power consumption. In this paper, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprised of an event-based vision sensor and three different neuromorphic processors. In particular, we used the event-based camera, called DVS, and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded using traditional electrodes and then converted into spikes to be fed into the chips. We collected a dataset of five gestures from sign language where visual and electromyography signals are synchronized. We compared a fully neuromorphic approach to a baseline implemented using traditional machine learning approaches on a portable GPU system. 
According to the chip's constraints, we designed specific spiking neural networks (SNNs) for sensor fusion that showed classification accuracy comparable to the software baseline. These neuromorphic alternatives have increased inference time, between 20 and 40%, with respect to the GPU system but have a significantly smaller energy-delay product (EDP) which makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward a real-world scenario.
Affiliation(s)
- Enea Ceolini
  - Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Charlotte Frenkel
  - Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
  - ICTEAM Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
- Sumit Bam Shrestha
  - Temasek Laboratories, National University of Singapore, Singapore, Singapore
- Gemma Taverni
  - Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Lyes Khacef
  - Université Côte d'Azur, CNRS, LEAT, Nice, France
- Melika Payvand
  - Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Elisa Donati
  - Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
17
Wu J, Yılmaz E, Zhang M, Li H, Tan KC. Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition. Front Neurosci 2020; 14:199. [PMID: 32256308 PMCID: PMC7090229 DOI: 10.3389/fnins.2020.00199] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/19/2019] [Accepted: 02/24/2020] [Indexed: 11/13/2022]
Abstract
Artificial neural networks (ANNs) have become the mainstream acoustic modeling technique for large vocabulary automatic speech recognition (ASR). A conventional ANN features a multi-layer architecture that requires massive amounts of computation. Brain-inspired spiking neural networks (SNNs) closely mimic biological neural networks and can operate on low-power neuromorphic hardware with spike-based computation. Motivated by their unprecedented energy efficiency and rapid information processing capability, we explore the use of SNNs for speech recognition. In this work, we use SNNs for acoustic modeling and evaluate their performance on several large vocabulary recognition scenarios. The experimental results demonstrate ASR accuracies competitive with those of their ANN counterparts, while requiring only 10 algorithmic time steps and as little as 0.68 times the total synaptic operations to classify each audio frame. Integrating the algorithmic power of deep SNNs with energy-efficient neuromorphic hardware therefore offers an attractive solution for ASR applications running locally on mobile and embedded devices.
Affiliation(s)
- Jibin Wu
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Emre Yılmaz
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Malu Zhang
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Haizhou Li
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
  - Faculty for Computer Science and Mathematics, University of Bremen, Bremen, Germany
- Kay Chen Tan
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
18
Xu C, Zhang W, Liu Y, Li P. Boosting Throughput and Efficiency of Hardware Spiking Neural Accelerators Using Time Compression Supporting Multiple Spike Codes. Front Neurosci 2020; 14:104. [PMID: 32140093 PMCID: PMC7043203 DOI: 10.3389/fnins.2020.00104] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/17/2019] [Accepted: 01/27/2020] [Indexed: 12/01/2022]
Abstract
Spiking neural networks (SNNs) are the third generation of neural networks and can explore both rate and temporal coding for energy-efficient event-driven computation. However, the decision accuracy of existing SNN designs is contingent upon processing a large number of spikes over a long period. At the same time, the switching power of SNN hardware accelerators is proportional to the number of spikes processed, while the length of spike trains limits throughput and static power efficiency. This paper presents the first study on developing temporal compression to significantly boost throughput and reduce energy dissipation of digital hardware SNN accelerators while being applicable to multiple spike codes. The proposed compression architectures consist of low-cost input spike compression units, novel input-and-output-weighted spiking neurons, and reconfigurable time constant scaling to support large and flexible time compression ratios. Our compression architectures can be transparently applied to any given pre-designed SNN employing either rate or temporal codes while incurring minimal modification of the neural models, learning algorithms, and hardware design. Using spiking speech and image recognition datasets, we demonstrate the feasibility of supporting large time compression ratios of up to 16×, delivering up to 15.93×, 13.88×, and 86.21× improvements in throughput, energy dissipation, and the tradeoffs between hardware area, runtime, energy, and classification accuracy, respectively, based on different spike codes on a Xilinx Zynq-7000 FPGA. These results are achieved while incurring little extra hardware overhead.
Affiliation(s)
- Changqing Xu
  - School of Microelectronics, Xidian University, Xi'an, China
- Wenrui Zhang
  - Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, United States
- Yu Liu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, United States
- Peng Li
  - Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, United States
19
Pan Z, Chua Y, Wu J, Zhang M, Li H, Ambikairajah E. An Efficient and Perceptually Motivated Auditory Neural Encoding and Decoding Algorithm for Spiking Neural Networks. Front Neurosci 2020; 13:1420. [PMID: 32038132 PMCID: PMC6987407 DOI: 10.3389/fnins.2019.01420] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/23/2019] [Accepted: 12/16/2019] [Indexed: 12/11/2022]
Abstract
The auditory front-end is an integral part of a spiking neural network (SNN) when performing auditory cognitive tasks. It encodes temporally dynamic stimuli, such as speech and audio, into an efficient, effective, and reconstructable spike pattern to facilitate subsequent processing. However, most of the auditory front-ends in current studies have not made use of recent findings in psychoacoustics and physiology concerning human listening. In this paper, we propose a neural encoding and decoding scheme that is optimized for audio processing. The neural encoding scheme, which we call Biologically plausible Auditory Encoding (BAE), emulates the functions of the perceptual components of the human auditory system, which include the cochlear filter bank, the inner hair cells, auditory masking effects from psychoacoustic models, and the spike neural encoding by the auditory nerve. We evaluate the perceptual quality of the BAE scheme using PESQ, and its performance through sound classification and speech recognition experiments. Finally, we also built and published spike versions of two speech datasets, Spike-TIDIGITS and Spike-TIMIT, for researchers to use and for benchmarking future SNN research.
Affiliation(s)
- Zihan Pan
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Yansong Chua
  - Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
- Jibin Wu
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Malu Zhang
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Haizhou Li
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Eliathamby Ambikairajah
  - School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, NSW, Australia
20
Wu J, Chua Y, Zhang M, Li H, Tan KC. A Spiking Neural Network Framework for Robust Sound Classification. Front Neurosci 2018; 12:836. [PMID: 30510500 PMCID: PMC6252336 DOI: 10.3389/fnins.2018.00836] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 03/29/2018] [Accepted: 10/26/2018] [Indexed: 11/26/2022]
Abstract
Environmental sounds form part of our daily life. With the advancement of deep learning models and the abundance of training data, the performance of automatic sound classification (ASC) systems has improved significantly in recent years. However, the high computational cost, and hence high power consumption, remains a major hurdle for large-scale implementation of ASC systems on mobile and wearable devices. Motivated by the observation that humans are highly effective and consume little power whilst analyzing complex audio scenes, we propose a biologically plausible ASC framework, namely SOM-SNN. This framework uses the unsupervised self-organizing map (SOM) for representing frequency contents embedded within the acoustic signals, followed by an event-based spiking neural network (SNN) for spatiotemporal spiking pattern classification. We report experimental results on the RWCP environmental sound and TIDIGITS spoken digits datasets, which demonstrate classification accuracies competitive with other deep learning and SNN-based models. The SOM-SNN framework is also shown to be highly robust to corrupting noise after multi-condition training, whereby the model is trained with noise-corrupted sound samples. Moreover, we discover the early decision-making capability of the proposed framework: an accurate classification can be made with only a partial presentation of the input.
Affiliation(s)
- Jibin Wu
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Yansong Chua
  - Institute for Infocomm Research, A*STAR, Singapore, Singapore
- Malu Zhang
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Haizhou Li
  - Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
  - Institute for Infocomm Research, A*STAR, Singapore, Singapore
- Kay Chen Tan
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
21
Acharya J, Patil A, Li X, Chen Y, Liu SC, Basu A. A Comparison of Low-Complexity Real-Time Feature Extraction for Neuromorphic Speech Recognition. Front Neurosci 2018; 12:160. [PMID: 29643760 PMCID: PMC5882819 DOI: 10.3389/fnins.2018.00160] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/09/2017] [Accepted: 02/27/2018] [Indexed: 11/19/2022]
Abstract
This paper presents a real-time, low-complexity neuromorphic speech recognition system using a spiking silicon cochlea, a feature extraction module, and a population-encoding-based Neural Engineering Framework (NEF)/Extreme Learning Machine (ELM) classifier IC. Several feature extraction methods with varying memory and computational complexity are presented along with their corresponding classification accuracies. On the N-TIDIGITS18 dataset, we show that a fixed-bin-size feature extraction method that votes across both time and spike count features can achieve an accuracy of 95% in software, similar to previously reported methods that use a fixed number of bins per sample, while using ~3× less energy and ~25× less memory for feature extraction (~1.5× less overall). Hardware measurements for the same topology show a slightly reduced accuracy of 94%, which can be attributed to the extra correlations in hardware random weights. The hardware accuracy can be increased by further increasing the number of hidden nodes in the ELM at the cost of memory and energy.
Affiliation(s)
- Jyotibdha Acharya
  - HealthTech NTU, Interdisciplinary Graduate School, Nanyang Technological University, Singapore, Singapore
- Aakash Patil
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Xiaoya Li
  - Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Yi Chen
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
- Shih-Chii Liu
  - Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Arindam Basu
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore