1
Liu S, Wang G, Song Y, Huang J, Huang Y, Zhou Y, Wang S. SiamEFT: adaptive-time feature extraction hybrid network for RGBE multi-domain object tracking. Front Neurosci 2024; 18:1453419. [PMID: 39176387] [PMCID: PMC11338902] [DOI: 10.3389/fnins.2024.1453419]
Abstract
Integrating RGB and Event (RGBE) multi-domain information obtained by high-dynamic-range and high-temporal-resolution event cameras has been considered an effective scheme for robust object tracking. However, existing RGBE tracking methods have overlooked the unique spatio-temporal features across the different domains, leading to tracking failures and inefficiency, especially for objects against complex backgrounds. To address this problem, we propose a novel tracker based on an adaptive-time feature extraction hybrid network, namely the Siamese Event Frame Tracker (SiamEFT), which focuses on the effective representation and utilization of the diverse spatio-temporal features of RGBE data. We first design an adaptive-time attention module that aggregates event data into frames based on adaptive-time weights to enhance information representation. Subsequently, the SiamEF module and a cross-network fusion module, which combine artificial neural networks and spiking neural networks into a hybrid network, are designed to effectively extract and fuse the spatio-temporal features of RGBE data. Extensive experiments on two RGBE datasets (VisEvent and COESOT) show that SiamEFT achieves success rates of 0.456 and 0.574, outperforming state-of-the-art competing methods and exhibiting a 2.3-fold improvement in efficiency. These results validate the superior accuracy and efficiency of SiamEFT in diverse and challenging scenes.
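To make the adaptive-time aggregation idea concrete, the sketch below shows one plausible way to collapse an event stream into a frame with time-dependent weights; the exponential decay, the `tau` constant, and the function name are illustrative assumptions rather than SiamEFT's learned attention weights.

```python
import numpy as np

def events_to_frame(events, height, width, t_ref, tau=0.03):
    """Accumulate (x, y, t, polarity) events into a single frame,
    weighting each event by its temporal distance to a reference time.

    The exponential weighting and `tau` are illustrative assumptions,
    not the adaptive-time weights learned by SiamEFT.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, t, p in events:
        w = np.exp(-abs(t_ref - t) / tau)       # older events contribute less
        frame[int(y), int(x)] += w * (1.0 if p > 0 else -1.0)
    return frame

# Toy usage: three events near t_ref = 0.10 s on a 4x4 sensor
toy_events = [(1, 2, 0.095, 1), (1, 2, 0.070, 1), (3, 0, 0.099, -1)]
print(events_to_frame(toy_events, 4, 4, t_ref=0.10))
```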
Affiliation(s)
- Shuqi Liu
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Gang Wang
- Center of Brain Sciences, Beijing Institute of Basic Medical Sciences, Beijing, China
- Yong Song
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Jinxiang Huang
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Yiqian Huang
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Ya Zhou
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Shiqiang Wang
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
2
Yuan M, Zhang C, Wang Z, Liu H, Pan G, Tang H. Trainable Spiking-YOLO for low-latency and high-performance object detection. Neural Netw 2024; 172:106092. [PMID: 38211460] [DOI: 10.1016/j.neunet.2023.106092]
Abstract
Spiking neural networks (SNNs) are considered an attractive option for edge-side applications due to their sparse, asynchronous, and event-driven characteristics. However, applying SNNs to object detection faces challenges in achieving both good detection accuracy and high detection speed. To overcome these challenges, we propose an end-to-end Trainable Spiking-YOLO (Tr-Spiking-YOLO) for low-latency and high-performance object detection. We evaluate our model not only on the frame-based PASCAL VOC dataset but also on the event-based GEN1 Automotive Detection dataset, and investigate the impact of different decoding methods on detection performance. The experimental results show that our model achieves competitive or better performance in terms of accuracy, latency, and energy consumption compared to similar artificial neural network (ANN) and conversion-based SNN object detection models. Furthermore, when deployed on an edge device, our model achieves a processing speed of approximately 14 to 39 FPS while maintaining a desirable mean Average Precision (mAP), enabling real-time detection on resource-constrained platforms.
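The decoding methods investigated in SNN detectors of this kind map spike trains back to analog detection scores; the sketch below contrasts two common schemes (rate decoding and membrane-potential readout) under the assumption of a (T, N) output tensor, and is not the paper's specific decoder.

```python
import numpy as np

def rate_decode(spikes):
    """Rate decoding: average the binary spike outputs over time steps.

    `spikes` has shape (T, N), one row per simulation step; the firing
    rate of each output unit serves as its confidence/regression value.
    """
    return spikes.mean(axis=0)

def membrane_decode(membrane_potentials):
    """Membrane decoding: read out the analog membrane potential at the
    final step instead of counting spikes (a common low-latency option)."""
    return membrane_potentials[-1]

# Toy comparison over T = 4 steps and 3 output units
spikes = np.array([[1, 0, 0], [1, 0, 1], [0, 0, 1], [1, 0, 1]], dtype=np.float32)
potentials = np.array([[0.2, 0.1, 0.3], [0.5, 0.1, 0.6], [0.4, 0.0, 0.9], [0.7, 0.1, 1.1]])
print(rate_decode(spikes))          # [0.75 0.   0.75]
print(membrane_decode(potentials))  # [0.7 0.1 1.1]
```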
Affiliation(s)
- Mengwen Yuan
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Chengjun Zhang
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Ziming Wang
- College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
- Huixiang Liu
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou 311100, China
- Gang Pan
- College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 310027, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou 310027, China
- Huajin Tang
- College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou 310027, China; MOE Frontier Science Center for Brain Science and Brain-Machine Integration, Zhejiang University, Hangzhou 310027, China
3
Yu H, Qi Y, Pan G. NeuSort: an automatic adaptive spike sorting approach with neuromorphic models. J Neural Eng 2023; 20:056006. [PMID: 37659393] [DOI: 10.1088/1741-2552/acf61d]
Abstract
Objective. Spike sorting, a critical step in neural data processing, aims to classify spiking events from single-electrode recordings based on their waveforms. This study aims to develop a novel online spike sorter, NeuSort, using neuromorphic models, with the ability to adaptively adjust to changes in neural signals, including waveform deformations and the appearance of new neurons. Approach. NeuSort leverages a neuromorphic model to emulate template-matching processes. This model incorporates plasticity learning mechanisms inspired by biological neural systems, facilitating real-time adjustments of online parameters. Results. Experimental findings demonstrate NeuSort's ability to track neuron activities amidst waveform deformations and to identify new neurons in real time. NeuSort excels at handling non-stationary neural signals, significantly enhancing its applicability to long-term spike sorting tasks. Moreover, its implementation on neuromorphic chips guarantees ultra-low energy consumption during computation. Significance. NeuSort caters to the demand for real-time spike sorting in brain-machine interfaces through a neuromorphic approach. Its unsupervised, automated spike sorting process makes it a plug-and-play solution for online spike sorting.
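A minimal sketch of adaptive template matching conveys the flavor of the approach: templates drift toward recently assigned waveforms, and unmatched waveforms spawn new units. The cosine-similarity matcher, threshold, and learning rate are illustrative assumptions, not NeuSort's neuromorphic implementation.

```python
import numpy as np

class OnlineTemplateSorter:
    """Minimal sketch of adaptive template matching for spike sorting.

    Templates drift toward recently assigned waveforms (a crude stand-in
    for a plasticity rule); the similarity threshold and learning rate
    are illustrative assumptions."""

    def __init__(self, match_threshold=0.8, lr=0.05):
        self.templates = []            # one running-mean waveform per putative unit
        self.match_threshold = match_threshold
        self.lr = lr

    @staticmethod
    def _similarity(a, b):
        # cosine similarity between a waveform and a template
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def sort(self, waveform):
        """Assign a waveform to the best-matching unit and adapt that unit's
        template; spawn a new unit if nothing matches well enough."""
        if self.templates:
            sims = [self._similarity(waveform, t) for t in self.templates]
            best = int(np.argmax(sims))
            if sims[best] >= self.match_threshold:
                # plasticity-like update: move the template toward the new waveform
                self.templates[best] += self.lr * (waveform - self.templates[best])
                return best
        self.templates.append(np.asarray(waveform, dtype=float))
        return len(self.templates) - 1

# Toy usage: two similar waveforms map to unit 0, a different shape opens unit 1
sorter = OnlineTemplateSorter()
print(sorter.sort(np.array([0.0, 1.0, 0.2])))   # 0 (new unit)
print(sorter.sort(np.array([0.1, 0.9, 0.3])))   # 0 (matches, template adapts)
print(sorter.sort(np.array([1.0, 0.0, -0.8])))  # 1 (new unit)
```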
Affiliation(s)
- Hang Yu
- State Key Lab of Brain-Machine Intelligence, Hangzhou, People's Republic of China
- College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China
- Yu Qi
- State Key Lab of Brain-Machine Intelligence, Hangzhou, People's Republic of China
- Affiliated Mental Health Center & Hangzhou Seventh People's Hospital, Hangzhou, People's Republic of China
- MOE Frontier Science Center for Brain Science and Brain-machine Integration, Zhejiang University School of Medicine, Hangzhou, People's Republic of China
- Gang Pan
- State Key Lab of Brain-Machine Intelligence, Hangzhou, People's Republic of China
- College of Computer Science and Technology, Zhejiang University, Hangzhou, People's Republic of China
4
Zhang H, Fan X, Zhang Y. Energy-Efficient Spiking Segmenter for Frame and Event-Based Images. Biomimetics (Basel) 2023; 8:356. [PMID: 37622961] [PMCID: PMC10452323] [DOI: 10.3390/biomimetics8040356]
Abstract
Semantic segmentation predicts dense pixel-wise semantic labels, which is crucial for autonomous environment perception systems. For applications on mobile devices, current research focuses on energy-efficient segmenters for both frame- and event-based cameras. However, there is currently no artificial neural network (ANN) that can perform efficient segmentation on both types of images. This paper introduces the spiking neural network (SNN), a bionic model that is energy-efficient when implemented on neuromorphic hardware, and develops a Spiking Context Guided Network (Spiking CGNet) with substantially lower energy consumption and comparable performance for both frame- and event-based images. First, this paper proposes a spiking context guided block that can extract local features and context information with spike-based computation. On this basis, the directly trained SCGNet-S and SCGNet-L are established for both frame- and event-based images. Our method is verified on the frame-based Cityscapes dataset and the event-based DDD17 dataset. On the Cityscapes dataset, SCGNet-S achieves results comparable to the ANN CGNet with 4.85× higher energy efficiency. On the DDD17 dataset, Spiking CGNet outperforms other spiking segmenters by a large margin.
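Energy comparisons such as the reported 4.85× figure are usually derived by counting synaptic operations; the sketch below illustrates that style of estimate, assuming commonly cited 45 nm per-operation energies (about 4.6 pJ per multiply-accumulate and 0.9 pJ per accumulate) and a made-up firing rate, not the paper's exact accounting.

```python
def ann_energy(macs, e_mac=4.6e-12):
    """Energy of an ANN layer: every multiply-accumulate (MAC) fires."""
    return macs * e_mac

def snn_energy(macs, spike_rate, timesteps, e_ac=0.9e-12):
    """Energy of the same layer in an SNN: accumulate-only operations,
    gated by the average spike rate and repeated over simulation steps."""
    return macs * spike_rate * timesteps * e_ac

# Toy layer with 1e9 MACs, 15% average firing rate, 4 time steps
macs = 1e9
ratio = ann_energy(macs) / snn_energy(macs, spike_rate=0.15, timesteps=4)
print(f"estimated ANN/SNN energy ratio: {ratio:.2f}x")
```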
Affiliation(s)
- Hong Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
- Xiongfei Fan
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
- Yu Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
- Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou 310027, China
5
Zhang H, Li Y, He B, Fan X, Wang Y, Zhang Y. Direct training high-performance spiking neural networks for object recognition and detection. Front Neurosci 2023; 17:1229951. [PMID: 37614339] [PMCID: PMC10442545] [DOI: 10.3389/fnins.2023.1229951]
Abstract
Introduction. The spiking neural network (SNN) is a bionic model that is energy-efficient when implemented on neuromorphic hardware. The non-differentiability of spiking signals and the complicated neural dynamics make direct training of high-performance SNNs a great challenge. Numerous crucial issues remain to be explored for the deployment of directly trained SNNs, such as gradient vanishing and explosion, spiking-signal decoding, and applications to upstream tasks. Methods. To address gradient vanishing, we introduce a binary selection gate into the basic residual block and propose spiking gate (SG) ResNet to implement residual learning in SNNs. We propose two appropriate representations of the gate signal and verify, by analyzing gradient backpropagation, that SG ResNet can overcome gradient vanishing or explosion. For spiking-signal decoding, a better scheme than rate coding is achieved by our attention spike decoder (ASD), which dynamically assigns weights to spiking signals along the temporal, channel, and spatial dimensions. Results and discussion. The SG ResNet and ASD modules are evaluated on multiple object recognition datasets, including the static ImageNet, CIFAR-100, and CIFAR-10 datasets and the neuromorphic DVS-CIFAR10 dataset. Superior accuracy is demonstrated with a tiny simulation time step of four, specifically 94.52% top-1 accuracy on CIFAR-10 and 75.64% top-1 accuracy on CIFAR-100. Spiking RetinaNet is proposed as the first directly trained hybrid SNN-ANN detector for RGB images, using SG ResNet as the backbone and the ASD module for information decoding. Spiking RetinaNet with an SG ResNet34 backbone achieves an mAP of 0.296 on the MS COCO object detection dataset.
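Direct training of SNNs commonly sidesteps the non-differentiable spike with a surrogate gradient; the PyTorch sketch below shows that generic trick (a Heaviside forward pass with a rectangular surrogate in the backward pass). It is an assumption for illustration, not the paper's SG ResNet or ASD modules.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, rectangular surrogate gradient in
    the backward pass -- the standard workaround for non-differentiable spikes."""

    @staticmethod
    def forward(ctx, membrane_potential, threshold):
        ctx.save_for_backward(membrane_potential)
        ctx.threshold = threshold
        return (membrane_potential >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # pass gradient through only near the firing threshold
        surrogate = (torch.abs(v - ctx.threshold) < 0.5).float()
        return grad_output * surrogate, None

# Toy usage: gradients flow even though the forward output is binary
v = torch.tensor([0.4, 0.9, 1.3], requires_grad=True)
spikes = SurrogateSpike.apply(v, 1.0)
spikes.sum().backward()
print(spikes.detach(), v.grad)   # tensor([0., 0., 1.]) tensor([0., 1., 1.])
```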
Affiliation(s)
- Hong Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Yang Li
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Bin He
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Xiongfei Fan
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Yue Wang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Yu Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou, China
6
Zhang J, Xu B, Yin H. Depression screening using hybrid neural network. Multimed Tools Appl 2023; 82:1-16. [PMID: 37362740] [PMCID: PMC9992920] [DOI: 10.1007/s11042-023-14860-w]
Abstract
Depression is a common cause of suicide worldwide, and studies have shown that the number of patients suffering from major depressive disorder (MDD) increased several-fold during the COVID-19 pandemic, highlighting the importance of disease detection and depression management and increasing the need for effective diagnostic tools. In recent years, machine learning and deep learning methods based on electroencephalography (EEG) have achieved significant results in automatic depression detection. However, most studies have focused on a small number of EEG channels, and the experimental data require special processing by professionals. In this study, 128-channel EEG signals were simply filtered, and 24-fold leave-one-out cross-validation experiments were performed using a 2DCNN-LSTM classifier, a support vector machine, a K-nearest neighbor classifier, and a decision tree. The results show that the proposed 2DCNN-LSTM model achieves an average classification accuracy of 95.1% with an AUC of 0.98 for depression detection from 6-second participant EEG segments, far exceeding the 72.05%, 79.7%, and 79.49% achieved by the support vector machine, K-nearest neighbor, and decision tree classifiers. In addition, the model correctly classified participants with 100% probability when 300-second EEG segments were used.
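As a rough illustration of the 2DCNN-LSTM family of classifiers, the sketch below encodes each EEG window with a small 2D CNN and models the window sequence with an LSTM; all layer sizes and the windowing scheme are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class CNN2DLSTM(nn.Module):
    """Sketch of a 2D-CNN + LSTM EEG classifier: a small CNN encodes each
    time window of the (channels x samples) signal, and an LSTM models the
    sequence of window embeddings. All layer sizes are illustrative."""

    def __init__(self, n_classes=2, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.lstm = nn.LSTM(input_size=16 * 4 * 4, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, windows, channels, samples)
        b, w, c, s = x.shape
        feats = self.cnn(x.reshape(b * w, 1, c, s)).reshape(b, w, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])     # classify from the last window state

# Toy forward pass: 2 recordings, 6 one-second windows, 128 channels, 250 samples
logits = CNN2DLSTM()(torch.randn(2, 6, 128, 250))
print(logits.shape)   # torch.Size([2, 2])
```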
Affiliation(s)
- Jiao Zhang
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Baomin Xu
- School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
- Hongfeng Yin
- School of Computer and Information Technology, Cangzhou Jiaotong College, Cangzhou, Hebei, China
7
Spiking Neural Networks and Their Applications: A Review. Brain Sci 2022; 12:863. [PMID: 35884670] [PMCID: PMC9313413] [DOI: 10.3390/brainsci12070863]
Abstract
The past decade has witnessed the great success of deep neural networks in various domains. However, deep neural networks are very resource-intensive in terms of energy consumption, data requirements, and computational cost. With the recent increasing need for the autonomy of machines in the real world, e.g., self-driving vehicles, drones, and collaborative robots, the exploitation of deep neural networks in those applications has been actively investigated. In those applications, energy and computational efficiency are especially important because of the need for real-time responses and the limited energy supply. Biologically plausible spiking neural networks have recently emerged as a promising solution for these previously infeasible applications. Spiking neural networks aim to bridge the gap between neuroscience and machine learning, using biologically realistic models of neurons to carry out the computation. Due to their functional similarity to biological neural networks, spiking neural networks can embrace the sparsity found in biology and are highly compatible with temporal coding. Our contributions in this work are: (i) we give a comprehensive review of theories of biological neurons; (ii) we present various existing spike-based neuron models, which have been studied in neuroscience; (iii) we detail synapse models; (iv) we provide a review of artificial neural networks; (v) we provide detailed guidance on how to train spike-based neuron models; (vi) we review available spike-based frameworks that have been developed to support the implementation of spiking neural networks; (vii) finally, we cover existing spiking neural network applications in the computer vision and robotics domains. The paper concludes with discussions of future perspectives.
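Among the neuron models such reviews cover, the leaky integrate-and-fire (LIF) neuron is the workhorse; the sketch below gives a minimal discrete-time simulation, with the time constant, threshold, and reset chosen arbitrarily for illustration.

```python
import numpy as np

def simulate_lif(input_current, tau=20.0, v_threshold=1.0, v_reset=0.0, dt=1.0):
    """Discrete-time leaky integrate-and-fire neuron.

    The membrane potential leaks toward rest, integrates the input current,
    and emits a spike (then resets) whenever it crosses the threshold.
    """
    v = v_reset
    spikes, potentials = [], []
    for i in input_current:
        v = v + dt * (-(v - v_reset) / tau + i)   # leak + integrate
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset                            # hard reset after a spike
        else:
            spikes.append(0)
        potentials.append(v)
    return np.array(spikes), np.array(potentials)

# Toy usage: constant input current over 50 steps
spikes, _ = simulate_lif(np.full(50, 0.08))
print("spike count over 50 steps:", spikes.sum())
```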
8
Kim D, Chakraborty B, She X, Lee E, Kang B, Mukhopadhyay S. MONETA: A Processing-In-Memory-Based Hardware Platform for the Hybrid Convolutional Spiking Neural Network With Online Learning. Front Neurosci 2022; 16:775457. [PMID: 35478844] [PMCID: PMC9037635] [DOI: 10.3389/fnins.2022.775457]
Abstract
We present a processing-in-memory (PIM)-based hardware platform, referred to as MONETA, for on-chip acceleration of inference and learning in hybrid convolutional spiking neural networks. MONETA uses 8T static random-access memory (SRAM)-based PIM cores for vector-matrix multiplication (VMM), augmented with spike-timing-dependent plasticity (STDP)-based weight updates. An SNN-focused data flow is presented to minimize data movement in MONETA while ensuring learning accuracy. MONETA supports online and on-chip training on the PIM architecture. The STDP-trained convolutional neural network within the SNN (ConvSNN), with the proposed data flow, 4-bit input precision, and 8-bit weight precision, shows only 1.63% lower accuracy on CIFAR-10 than the STDP accuracy obtained in software. Further, the proposed architecture is used to accelerate a hybrid SNN architecture that couples off-chip supervised (backpropagation through time) and on-chip unsupervised (STDP) training. We also evaluate this hybrid network architecture with the proposed data flow. On the CIFAR-10 dataset, the accuracy of the hybrid network is 10.84% higher than the STDP-trained result and 1.4% higher than the result of the backpropagation-trained ConvSNN. The physical design of MONETA in 65 nm complementary metal-oxide-semiconductor (CMOS) shows power efficiencies of 18.69 tera operations per second (TOPS)/W, 7.25 TOPS/W, and 10.41 TOPS/W for the inference, learning, and hybrid learning modes, respectively.
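The STDP weight update exploited in this line of hardware can be pictured with the standard pair-based rule: potentiation when a presynaptic spike precedes a postsynaptic one, depression otherwise. The constants below are illustrative and do not reflect MONETA's on-chip circuit.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0,
                w_min=0.0, w_max=1.0):
    """Pair-based STDP: strengthen the synapse when the presynaptic spike
    precedes the postsynaptic one, weaken it otherwise. Constants are
    illustrative assumptions, not MONETA's hardware parameters."""
    dt = t_post - t_pre
    if dt > 0:
        dw = a_plus * np.exp(-dt / tau)     # pre before post -> potentiation
    else:
        dw = -a_minus * np.exp(dt / tau)    # post before pre -> depression
    return float(np.clip(w + dw, w_min, w_max))

# Toy usage on a synapse with weight 0.5
print(stdp_update(0.5, t_pre=10.0, t_post=15.0))   # potentiation
print(stdp_update(0.5, t_pre=15.0, t_post=10.0))   # depression
```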
Affiliation(s)
- Daehyun Kim
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Biswadeep Chakraborty
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Xueyuan She
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Edward Lee
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Beomseok Kang
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States
- Saibal Mukhopadhyay
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, United States