1
Zhou C, Zhang H, Yu L, Ye Y, Zhou Z, Huang L, Ma Z, Fan X, Zhou H, Tian Y. Direct training high-performance deep spiking neural networks: a review of theories and methods. Front Neurosci 2024; 18:1383844. PMID: 39145295; PMCID: PMC11322636; DOI: 10.3389/fnins.2024.1383844.
Abstract
Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks (ANNs) by virtue of their high biological plausibility, rich spatial-temporal dynamics, and event-driven computation. Direct training algorithms based on the surrogate gradient method provide sufficient flexibility to design novel SNN architectures and to explore the spatial-temporal dynamics of SNNs. According to previous studies, model performance is highly dependent on model size. Recently, directly trained deep SNNs have achieved great progress on both neuromorphic datasets and large-scale static datasets. Notably, transformer-based SNNs show performance comparable to their ANN counterparts. In this paper, we provide a new perspective for summarizing the theories and methods for training high-performance deep SNNs in a systematic and comprehensive way, covering theoretical fundamentals, spiking neuron models, advanced SNN models and residual architectures, software frameworks and neuromorphic hardware, applications, and future trends.
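To make the surrogate gradient method this review centers on concrete, here is a minimal PyTorch sketch of a leaky integrate-and-fire (LIF) layer with a Heaviside spike function in the forward pass and a smooth surrogate derivative in the backward pass; the fast-sigmoid surrogate, slope value, and soft-reset scheme are illustrative choices, not the review's prescription.

```python
# Minimal sketch of surrogate-gradient training for a LIF layer (illustrative
# hyperparameters): Heaviside step forward, fast-sigmoid derivative backward.
import torch

class SpikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v_minus_thresh):
        ctx.save_for_backward(v_minus_thresh)
        return (v_minus_thresh > 0).float()           # Heaviside: spike if v > threshold

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        slope = 10.0                                  # surrogate sharpness (assumed value)
        surrogate = 1.0 / (slope * x.abs() + 1.0) ** 2
        return grad_out * surrogate

def lif_forward(inputs, w, beta=0.9, thresh=1.0):
    """Unroll a LIF layer over time. inputs: (T, batch, n_in), w: (n_in, n_out)."""
    T = inputs.shape[0]
    v = torch.zeros(inputs.shape[1], w.shape[1])
    spikes = []
    for t in range(T):
        v = beta * v + inputs[t] @ w                  # leaky integration
        s = SpikeFn.apply(v - thresh)                 # non-differentiable fire, surrogate grad
        v = v - s * thresh                            # soft reset by subtraction
        spikes.append(s)
    return torch.stack(spikes)                        # (T, batch, n_out)
```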
Affiliation(s)
- Han Zhang: Peng Cheng Laboratory, Shenzhen, China; Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Liutao Yu: Peng Cheng Laboratory, Shenzhen, China
- Yumin Ye: Peng Cheng Laboratory, Shenzhen, China
- Zhaokun Zhou: Peng Cheng Laboratory, Shenzhen, China; School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, China
- Liwei Huang: Peng Cheng Laboratory, Shenzhen, China; National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China
- Xiaopeng Fan: Peng Cheng Laboratory, Shenzhen, China; Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yonghong Tian: Peng Cheng Laboratory, Shenzhen, China; School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, China; National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China
2
Park J, Ha S, Yu T, Neftci E, Cauwenberghs G. A 22-pJ/spike 73-Mspikes/s 130k-compartment neural array transceiver with conductance-based synaptic and membrane dynamics. Front Neurosci 2023; 17:1198306. PMID: 37700751; PMCID: PMC10493285; DOI: 10.3389/fnins.2023.1198306.
Abstract
Neuromorphic cognitive computing offers a bio-inspired means to approach the natural intelligence of biological neural systems in silicon integrated circuits. Typically, such circuits either reproduce biophysical neuronal dynamics in great detail, as tools for computational neuroscience, or abstract away the biology by simplifying the functional forms of neural computation in large-scale systems for machine intelligence with high integration density and energy efficiency. Here we report a hybrid that offers biophysical realism in the emulation of multi-compartmental neuronal network dynamics at very large scale with high implementation efficiency, yet with high flexibility in configuring the functional form and the network topology. The integrate-and-fire array transceiver (IFAT) chip emulates the continuous-time analog membrane dynamics of 65k two-compartment neurons with conductance-based synapses. Fired action potentials are registered as address-event encoded output spikes, while the four types of synapses coupling to each neuron are activated by address-event decoded input spikes for fully reconfigurable synaptic connectivity, facilitating virtual wiring implemented by routing address-event spikes externally through a synaptic routing table. The peak conductance strength of a synapse activation, specified by the address-event input, spans three decades of dynamic range, digitally controlled by pulse-width and amplitude modulation (PWAM) of the drive voltage activating the log-domain linear synapse circuit. Two nested levels of micro-pipelining in the IFAT architecture improve both the throughput and the efficiency of synaptic input. This two-tier micro-pipelining results in a measured sustained peak throughput of 73 Mspikes/s and an overall chip-level energy efficiency of 22 pJ/spike. Non-uniformity in digitally encoded synapse strength due to analog mismatch is mitigated through single-point digital offset calibration. Combined with the flexibly layered and recurrent synaptic connectivity provided by hierarchical address-event routing of registered spike events through external memory, the IFAT lends itself to efficient large-scale emulation of general biophysical spiking neural networks, as well as rate-based mapping of rectified linear unit (ReLU) neural activations.
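The "virtual wiring" mechanism described above can be sketched in a few lines: spikes leave the chip as bare addresses, and an external routing table fans each source address out to (target, synapse-type) pairs. The table contents and Python representation below are purely illustrative.

```python
# Sketch of address-event representation (AER) virtual wiring: an external
# routing table translates each output spike address into input events.
from collections import defaultdict

routing_table = defaultdict(list)   # src address -> [(dst address, synapse_type), ...]
routing_table[7] = [(12, "exc_fast"), (30, "inh_slow")]

def route_spikes(fired_addresses, routing_table):
    """Translate output spike events into input events, as an external router would."""
    input_events = []
    for src in fired_addresses:
        for dst, syn_type in routing_table[src]:
            input_events.append((dst, syn_type))      # address-event decoded at the target
    return input_events

print(route_spikes([7], routing_table))  # [(12, 'exc_fast'), (30, 'inh_slow')]
```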
Affiliation(s)
- Jongkil Park: Center for Neuromorphic Engineering, Korea Institute of Science and Technology (KIST), Seoul, Republic of Korea; Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States; Department of Electrical and Computer Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, United States
- Sohmyung Ha: Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States; Department of Bioengineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, United States; Division of Engineering, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
- Theodore Yu: Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States; Department of Electrical and Computer Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, United States
- Emre Neftci: Peter Grünberg Institute, Forschungszentrum Jülich, RWTH, Aachen, Germany
- Gert Cauwenberghs: Institute for Neural Computation, University of California, San Diego, La Jolla, CA, United States; Department of Bioengineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, United States
3
Hu SG, Qiao GC, Liu XK, Liu YH, Zhang CM, Zuo Y, Zhou P, Liu YA, Ning N, Yu Q, Liu Y. A Co-Designed Neuromorphic Chip With Compact (17.9K F²) and Weak Neuron Number-Dependent Neuron/Synapse Modules. IEEE Transactions on Biomedical Circuits and Systems 2022; 16:1250-1260. PMID: 36150001; DOI: 10.1109/tbcas.2022.3209073.
Abstract
Many efforts have been made to improve neuron integration efficiency on neuromorphic chips, such as using emerging memory devices and shrinking CMOS technology nodes. However, in a fully connected (FC) neuromorphic core, increasing the number of neurons leads to a quadratic increase in synapse and dendrite costs and a steep linear increase in soma costs, resulting in explosive growth of core hardware costs. We propose a co-designed neuromorphic core (SRCcore) based on quantized spiking neural network (SNN) technology and a compact chip design methodology. The cost of the neuron/synapse module in SRCcore depends only weakly on the neuron number, which effectively relieves the pressure on core area caused by increasing the neuron count. In the proposed BICS chip based on SRCcore, although the neuron/synapse module implements 1-16× the neurons and 1-66× the synapses of previous works, it occupies an area of only 1.79 × 10⁷ F², which is 7.9%-38.6% of that in previous works. Based on a weight quantization strategy matched to SRCcore, the quantized SNNs achieve 0.05%-2.19% higher accuracy than previous works, supporting the design and application of SRCcore. Finally, a cross-modeling application is demonstrated on the chip. We hope this work will accelerate the development of cortical-scale neuromorphic systems.
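As background for the weight quantization strategy mentioned above, the following is a generic uniform-quantization sketch of the kind used before mapping weights onto a fixed-precision core; the bit width and per-tensor scaling are assumptions, not the paper's exact scheme.

```python
# Illustrative uniform weight quantization prior to mapping onto a
# fixed-precision neuromorphic core (bit width and scaling are assumed).
import numpy as np

def quantize_weights(w, n_bits=4):
    """Map float weights to signed n-bit integers with a per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(w)) / qmax if np.any(w) else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale                                   # on-chip weight = q; dequant = q * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_weights(w)
print(np.max(np.abs(w - q * scale)))                  # quantization error is at most ~scale/2
```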
4
Zaman KS, Reaz MBI, Md Ali SH, Bakar AAA, Chowdhury MEH. Custom Hardware Architectures for Deep Learning on Portable Devices: A Review. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:6068-6088. PMID: 34086580; DOI: 10.1109/tnnls.2021.3082304.
Abstract
The staggering innovations and emergence of numerous deep learning (DL) applications have forced researchers to reconsider hardware architecture to accommodate fast and efficient application-specific computations. Applications such as object detection, image recognition, speech translation, music synthesis, and image generation can be performed with high accuracy using DL, at the expense of substantial computational resources. Furthermore, the desire to adopt Industry 4.0 and smart technologies within the Internet of Things infrastructure has initiated several studies to enable on-chip DL capabilities for resource-constrained devices. Specialized DL processors reduce dependence on cloud servers, improve privacy, lessen latency, and mitigate bandwidth congestion. As we reach the limits of shrinking transistors, researchers are exploring various application-specific hardware architectures to meet the performance and efficiency requirements for DL tasks. Over the past few years, several software optimizations and hardware innovations have been proposed to efficiently perform these computations. In this article, we review several DL accelerators, as well as technologies with emerging devices, to highlight their architectural features in application-specific integrated circuit (IC) and field-programmable gate array (FPGA) platforms. Finally, we discuss design considerations for DL hardware in portable applications, along with some deductions about future trends and potential research directions for further innovation in DL accelerator architectures. By compiling this review, we expect to help aspiring researchers widen their knowledge of custom hardware architectures for DL.
5
Payvand M, Moro F, Nomura K, Dalgaty T, Vianello E, Nishi Y, Indiveri G. Self-organization of an inhomogeneous memristive hardware for sequence learning. Nat Commun 2022; 13:5793. PMID: 36184665; PMCID: PMC9527242; DOI: 10.1038/s41467-022-33476-6.
Abstract
Learning is a fundamental component of creating intelligent machines. Biological intelligence orchestrates synaptic and neuronal learning at multiple time scales to self-organize populations of neurons for solving complex tasks. Inspired by this, we design and experimentally demonstrate an adaptive hardware architecture, the Memristive Self-organizing Spiking Recurrent Neural Network (MEMSORN). MEMSORN incorporates resistive memory (RRAM) in its synapses and neurons, which configure their state based on Hebbian and homeostatic plasticity, respectively. For the first time, we derive these plasticity rules directly from statistical measurements of our fabricated RRAM-based neurons and synapses. These "technologically plausible" learning rules exploit the intrinsic variability of the devices and improve the accuracy of the network on a sequence-learning task by 30%. Finally, we compare the performance of MEMSORN to that of a fully randomly set-up spiking recurrent network on the same task, showing that self-organization improves accuracy by more than 15%. This work demonstrates the importance of the device-circuit-algorithm co-design approach for implementing brain-inspired computing hardware. One gap between neuro-inspired computing and its applications lies in the intrinsic variability of the devices; here, Payvand et al. suggest a technologically plausible co-design of the hardware architecture that takes into account and exploits the physics of memristors.
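A toy sketch of the two plasticity levels combined here, a Hebbian synaptic update plus a homeostatic threshold adjustment toward a target rate, is shown below; MEMSORN derives its actual rules from measured RRAM statistics, which this generic version does not model.

```python
# Generic structure of the two plasticity levels (Hebbian + homeostatic);
# all rates and bounds are illustrative, not MEMSORN's fitted device rules.
import numpy as np

def hebbian_update(w, pre_spikes, post_spikes, lr=0.01, w_max=1.0):
    w = w + lr * np.outer(pre_spikes, post_spikes)     # strengthen co-active pairs
    return np.clip(w, 0.0, w_max)

def homeostatic_update(thresh, post_spikes, target_rate=0.05, eta=0.001):
    return thresh + eta * (post_spikes - target_rate)  # busy neurons get harder to fire

w = np.random.rand(5, 5) * 0.1
thr = np.ones(5)
pre = (np.random.rand(5) < 0.2).astype(float)
post = (np.random.rand(5) < 0.2).astype(float)
w = hebbian_update(w, pre, post)
thr = homeostatic_update(thr, post)
```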
Affiliation(s)
- Melika Payvand: Institute for Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Filippo Moro: Institute for Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland; Université Grenoble Alpes, CEA, Leti, F-38000, Grenoble, France
- Kumiko Nomura: Corporate Research & Development Center, Toshiba Corporation, Kawasaki, Japan
- Thomas Dalgaty: Université Grenoble Alpes, CEA, Leti, F-38000, Grenoble, France
- Elisa Vianello: Université Grenoble Alpes, CEA, Leti, F-38000, Grenoble, France
- Yoshifumi Nishi: Corporate Research & Development Center, Toshiba Corporation, Kawasaki, Japan
- Giacomo Indiveri: Institute for Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
6
Putra RVW, Hanif MA, Shafique M. EnforceSNN: Enabling resilient and energy-efficient spiking neural network inference considering approximate DRAMs for embedded systems. Front Neurosci 2022; 16:937782. PMID: 36033624; PMCID: PMC9399768; DOI: 10.3389/fnins.2022.937782.
Abstract
Spiking Neural Networks (SNNs) have shown capabilities of achieving high accuracy under unsupervised settings and low operational power/energy due to their bio-plausible computations. Previous studies identified that DRAM-based off-chip memory accesses dominate the energy consumption of SNN processing. However, state-of-the-art works do not optimize the DRAM energy-per-access, thereby hindering the SNN-based systems from achieving further energy efficiency gains. To substantially reduce the DRAM energy-per-access, an effective solution is to decrease the DRAM supply voltage, but it may lead to errors in DRAM cells (i.e., so-called approximate DRAM). Toward this, we propose EnforceSNN, a novel design framework that provides a solution for resilient and energy-efficient SNN inference using reduced-voltage DRAM for embedded systems. The key mechanisms of our EnforceSNN are: (1) employing quantized weights to reduce the DRAM access energy; (2) devising an efficient DRAM mapping policy to minimize the DRAM energy-per-access; (3) analyzing the SNN error tolerance to understand its accuracy profile considering different bit error rate (BER) values; (4) leveraging the information for developing an efficient fault-aware training (FAT) that considers different BER values and bit error locations in DRAM to improve the SNN error tolerance; and (5) developing an algorithm to select the SNN model that offers good trade-offs among accuracy, memory, and energy consumption. The experimental results show that our EnforceSNN maintains the accuracy (i.e., no accuracy loss for BER ≤ 10⁻³) as compared to the baseline SNN with accurate DRAM while achieving up to 84.9% of DRAM energy saving and up to 4.1× speed-up of DRAM data throughput across different network sizes.
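Mechanism (4), fault-aware training, can be illustrated with a short sketch that flips bits of the quantized weights at a given bit error rate before each forward pass; the int8 weight format and BER value here are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of fault-aware training (FAT): inject DRAM-like bit errors into the
# quantized weights during training so the network learns to tolerate them.
import numpy as np

rng = np.random.default_rng(0)

def inject_bit_errors(q_weights, ber=1e-3):
    """Flip each bit of each int8 weight independently with probability `ber`."""
    bits = np.unpackbits(q_weights.view(np.uint8)[:, None], axis=1)   # (N, 8)
    flips = rng.random(bits.shape) < ber
    bits = bits ^ flips.astype(np.uint8)
    return np.packbits(bits, axis=1).ravel().view(np.int8)

w = rng.integers(-128, 128, size=1000, dtype=np.int8)
w_faulty = inject_bit_errors(w)          # use the corrupted weights in the forward pass
print((w != w_faulty).mean())            # fraction of weights hit by at least one flip
```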
Affiliation(s)
- Rachmad Vidya Wicaksana Putra (corresponding author): Embedded Computing Systems, Institute of Computer Engineering, Technische Universität Wien, Vienna, Austria
- Muhammad Abdullah Hanif: eBrain Lab, Division of Engineering, New York University Abu Dhabi (NYUAD), Abu Dhabi, United Arab Emirates
- Muhammad Shafique: eBrain Lab, Division of Engineering, New York University Abu Dhabi (NYUAD), Abu Dhabi, United Arab Emirates
7
Müller E, Schmitt S, Mauch C, Billaudelle S, Grübl A, Güttler M, Husmann D, Ilmberger J, Jeltsch S, Kaiser J, Klähn J, Kleider M, Koke C, Montes J, Müller P, Partzsch J, Passenberg F, Schmidt H, Vogginger B, Weidner J, Mayr C, Schemmel J. The operating system of the neuromorphic BrainScaleS-1 system. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.05.081.
8
Wang H, He Z, Wang T, He J, Zhou X, Wang Y, Liu L, Wu N, Tian M, Shi C. TripleBrain: A Compact Neuromorphic Hardware Core With Fast On-Chip Self-Organizing and Reinforcement Spike-Timing Dependent Plasticity. IEEE Transactions on Biomedical Circuits and Systems 2022; 16:636-650. PMID: 35802542; DOI: 10.1109/tbcas.2022.3189240.
Abstract
The human brain cortex is a rich source of inspiration for constructing efficient artificial cognitive systems. In this paper, we investigate incorporating multiple brain-inspired computing paradigms into a compact, fast, and high-accuracy neuromorphic hardware implementation. We propose the TripleBrain hardware core, which tightly combines three common brain-inspired factors: spike-based processing and plasticity, the self-organizing map (SOM) mechanism, and the reinforcement learning scheme, to improve object recognition accuracy and processing throughput while keeping resource costs low. The proposed hardware core is fully event-driven to avoid unnecessary operations, and it enables various on-chip learning rules (including the proposed SOM-STDP & R-STDP rule and the R-SOM-STDP rule, regarded as two variants of our TripleBrain learning rule) with different accuracy-latency tradeoffs to satisfy user requirements. An FPGA prototype of the neuromorphic core was implemented and thoroughly tested. It realized high-speed learning (1,349 frames/s) and inference (2,698 frames/s), and obtained comparably high recognition accuracies of 95.10%, 80.89%, 100%, 94.94%, 82.32%, 100%, and 97.93% on the MNIST, ETH-80, ORL-10, Yale-10, N-MNIST, Poker-DVS, and Posture-DVS datasets, respectively, while consuming only 4,146 (7.59%) slices, 32 (3.56%) DSPs, and 131 (24.04%) Block RAMs on a Xilinx Zynq-7045 FPGA chip. Our neuromorphic core is thus very attractive for real-time, resource-limited edge intelligent systems.
9
Yang S, Gao T, Wang J, Deng B, Azghadi MR, Lei T, Linares-Barranco B. SAM: A Unified Self-Adaptive Multicompartmental Spiking Neuron Model for Learning With Working Memory. Front Neurosci 2022; 16:850945. PMID: 35527819; PMCID: PMC9074872; DOI: 10.3389/fnins.2022.850945.
Abstract
Working memory is a fundamental feature of biological brains for perception, cognition, and learning. In addition, learning with working memory, which has been shown in conventional artificial intelligence systems through recurrent neural networks, is instrumental to advanced cognitive intelligence. However, it is hard to endow a simple neuron model with working memory, and to understand the biological mechanisms that have resulted in such a powerful ability at the neuronal level. This article presents a novel self-adaptive multicompartment spiking neuron model, referred to as SAM, for spike-based learning with working memory. SAM integrates four major biological principles: sparse coding, dendritic non-linearity, intrinsic self-adaptive dynamics, and spike-driven learning. We first describe SAM's design and explore the impacts of critical parameters on its biological dynamics. We then use SAM to build spiking networks to accomplish several different tasks, including supervised learning on the MNIST dataset using sequential spatiotemporal encoding, noisy spike pattern classification, sparse coding during pattern classification, spatiotemporal feature detection, meta-learning with working memory applied to a navigation task and the MNIST classification task, and working memory for spatiotemporal learning. Our experimental results highlight the energy efficiency and robustness of SAM across this wide range of challenging tasks. We also explore the effects of SAM model variations on its working memory, hoping to offer insight into the biological mechanisms underlying working memory in the brain. The SAM model is the first attempt to integrate the capabilities of spike-driven learning and working memory in a unified single neuron with multiple timescale dynamics. The competitive performance of SAM could potentially contribute to the development of efficient adaptive neuromorphic computing systems for various applications, from robotics to edge computing.
Affiliation(s)
- Shuangming Yang: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Tian Gao: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Jiang Wang: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Bin Deng: School of Electrical and Information Engineering, Tianjin University, Tianjin, China
- Tao Lei: School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, China
10
Pehle C, Billaudelle S, Cramer B, Kaiser J, Schreiber K, Stradmann Y, Weis J, Leibfried A, Müller E, Schemmel J. The BrainScaleS-2 Accelerated Neuromorphic System With Hybrid Plasticity. Front Neurosci 2022; 16:795876. PMID: 35281488; PMCID: PMC8907969; DOI: 10.3389/fnins.2022.795876.
Abstract
Since the beginning of information processing by electronic components, the nervous system has served as a metaphor for the organization of computational primitives. Brain-inspired computing today encompasses a class of approaches ranging from using novel nano-devices for computation to research into large-scale neuromorphic architectures such as TrueNorth, SpiNNaker, BrainScaleS, Tianjic, and Loihi. While implementation details differ, spiking neural networks, sometimes referred to as the third generation of neural networks, are the common abstraction used to model computation with such systems. Here we describe the second generation of the BrainScaleS neuromorphic architecture, emphasizing the applications it enables. It combines a custom analog accelerator core supporting the accelerated physical emulation of bio-inspired spiking neural network primitives with a tightly coupled digital processor and a digital event-routing network.
Affiliation(s)
- Johannes Schemmel: Electronic Visions, Kirchhoff-Institute for Physics, Heidelberg University, Heidelberg, Germany
11
Abstract
Neuromorphic systems aim to accomplish efficient computation in electronics by mirroring neurobiological principles. Taking advantage of neuromorphic technologies requires effective learning algorithms capable of instantiating high-performing neural networks, while also dealing with inevitable manufacturing variations of individual components, such as memristors or analog neurons. We present a learning framework resulting in bioinspired spiking neural networks with high performance, low inference latency, and sparse spike-coding schemes, which also self-corrects for device mismatch. We validate our approach on the BrainScaleS-2 analog spiking neuromorphic system, demonstrating state-of-the-art accuracy, low latency, and energy efficiency. Our work sketches a path for building powerful neuromorphic processors that take advantage of emerging analog technologies.

To rapidly process temporal information at a low metabolic cost, biological neurons integrate inputs as an analog sum, but communicate with spikes, binary events in time. Analog neuromorphic hardware uses the same principles to emulate spiking neural networks with exceptional energy efficiency. However, instantiating high-performing spiking networks on such hardware remains a significant challenge due to device mismatch and the lack of efficient training algorithms. Surrogate gradient learning has emerged as a promising training strategy for spiking networks, but its applicability for analog neuromorphic systems has not been demonstrated. Here, we demonstrate surrogate gradient learning on the BrainScaleS-2 analog neuromorphic system using an in-the-loop approach. We show that learning self-corrects for device mismatch, resulting in competitive spiking network performance on both vision and speech benchmarks. Our networks display sparse spiking activity with, on average, less than one spike per hidden neuron and input, perform inference at rates of up to 85,000 frames per second, and consume less than 200 mW. In summary, our work sets several benchmarks for low-energy spiking network processing on analog neuromorphic hardware and paves the way for future on-chip learning algorithms.
12
Abstract
Stochastic computing is an emerging scientific field pushed by the need to develop high-performance artificial intelligence systems in hardware that can quickly solve complex data processing problems. This is the case for virtual screening, a computational task aimed at searching huge molecular databases for new drug leads. In this work, we show a classification framework in which molecules are described by an energy-based vector. This vector is then processed by an ultra-fast artificial neural network implemented on an FPGA using stochastic computing techniques. Compared to other previously published virtual screening methods, this proposal provides similar or higher accuracy while improving processing speed by about two to three orders of magnitude.
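The core stochastic-computing primitive such designs build on is easy to sketch: numbers in [0, 1] become Bernoulli bitstreams, and a single AND gate multiplies independent streams. The stream length below is an illustrative choice trading accuracy against latency and area.

```python
# Minimal stochastic-computing primitive: bitstream encoding + AND-gate multiply.
import numpy as np

rng = np.random.default_rng(42)

def to_stream(p, n_bits=4096):
    """Encode probability p as a Bernoulli bitstream of length n_bits."""
    return rng.random(n_bits) < p

a, b = 0.6, 0.3
stream_prod = to_stream(a) & to_stream(b)   # AND gate == multiplier for independent streams
print(stream_prod.mean())                    # ~0.18 == a * b, with O(1/sqrt(n)) error
```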
13
Chen X, Yuan X, Fu G, Luo Y, Yue T, Yan F, Wang Y, Pan H. Effective Plug-Ins for Reducing Inference-Latency of Spiking Convolutional Neural Networks During Inference Phase. Front Comput Neurosci 2021; 15:697469. PMID: 34733147; PMCID: PMC8558256; DOI: 10.3389/fncom.2021.697469.
Abstract
Convolutional Neural Networks (CNNs) are effective and mature in the field of classification, while Spiking Neural Networks (SNNs) are energy-saving thanks to their sparse data flow and event-driven working mechanism. Previous work demonstrated that CNNs can be converted into equivalent Spiking Convolutional Neural Networks (SCNNs) without obvious accuracy loss, including different functional layers such as Convolutional (Conv), Fully Connected (FC), Avg-pooling, Max-pooling, and Batch-Normalization (BN) layers. To reduce inference latency, existing research has mainly concentrated on normalizing weights to increase the firing rate of neurons; other approaches modify the training phase or the network architecture. However, little attention has been paid to the end of the inference phase. From this new perspective, this paper presents four stopping criteria that serve as low-cost plug-ins to reduce the inference latency of SCNNs. The proposed methods are validated on the MATLAB and PyTorch platforms with Spiking-AlexNet on the CIFAR-10 dataset and Spiking-LeNet-5 on the MNIST dataset. Simulation results reveal that, compared to the state-of-the-art methods, the proposed method can shorten the average inference latency of Spiking-AlexNet from 892 to 267 time steps (almost 3.34 times faster) with an accuracy decline from 87.95% to 87.72%. With our methods, four types of Spiking-LeNet-5 need only 24-70 time steps per image with an accuracy decline of no more than 0.1%, while models without our methods require 52-138 time steps, almost 1.92 to 3.21 times slower than ours.
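The abstract does not spell out the four criteria, so the sketch below shows one plausible stopping rule of this general kind, purely as an assumption: accumulate output spike counts over time steps and stop once the leading class is ahead of the runner-up by a fixed margin.

```python
# One plausible early-stopping rule for SCNN inference (margin test assumed,
# not taken from the paper): stop once the top class leads by `margin` spikes.
import numpy as np

def infer_with_early_stop(spike_counts_per_step, margin=5, max_steps=892):
    """spike_counts_per_step: iterable of per-step output spike vectors (n_classes,)."""
    total = None
    for t, step_counts in enumerate(spike_counts_per_step, start=1):
        total = step_counts if total is None else total + step_counts
        top2 = np.sort(total)[-2:]                 # accumulated counts: runner-up, leader
        if top2[1] - top2[0] >= margin or t >= max_steps:
            return int(np.argmax(total)), t        # predicted class, latency in time steps
```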
Affiliation(s)
- Xuan Chen: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Xiaopeng Yuan: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Gaoming Fu: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Yuanyong Luo: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Tao Yue: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Feng Yan: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Yuxuan Wang: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
- Hongbing Pan: The School of Electronic Science and Engineering, Nanjing University, Nanjing, China
14
Nishi Y, Nomura K, Marukame T, Mizushima K. Stochastic binary synapses having sigmoidal cumulative distribution functions for unsupervised learning with spike timing-dependent plasticity. Sci Rep 2021; 11:18282. PMID: 34521895; PMCID: PMC8440757; DOI: 10.1038/s41598-021-97583-y.
Abstract
Spike timing-dependent plasticity (STDP), which is widely studied as a fundamental synaptic update rule for neuromorphic hardware, requires precise control of continuous weights. From the viewpoint of hardware implementation, a simplified update rule is desirable. Although simplified STDP with stochastic binary synapses was proposed previously, we find that it leads to degradation of memory maintenance during learning, which is unfavourable for unsupervised online learning. In this work, we propose a stochastic binary synaptic model where the cumulative probability of the weight change evolves in a sigmoidal fashion with potentiation or depression trials, which can be implemented using a pair of switching devices consisting of serially connected multiple binary memristors. As a benchmark test we perform simulations of unsupervised learning of MNIST images with a two-layer network and show that simplified STDP in combination with this model can outperform conventional rules with continuous weights not only in memory maintenance but also in recognition accuracy. Our method achieves 97.3% in recognition accuracy, which is higher than that reported with standard STDP in the same framework. We also show that the high performance of our learning rule is robust against device-to-device variability of the memristor's probabilistic behaviour.
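The sigmoidal cumulative switching behavior can be reproduced with a toy model in which the binary weight flips only after all of its serially connected binary sub-devices have switched; the device count and per-trial switching probability below are illustrative, not the paper's fitted values.

```python
# Toy stochastic binary synapse: the weight becomes 1 only once all m serial
# binary sub-devices have switched, so P(switched after t trials) = (1-(1-p)^t)^m,
# which is S-shaped in t. Parameters m and p are illustrative.
import numpy as np

rng = np.random.default_rng(1)

class StochasticBinarySynapse:
    def __init__(self, m=4, p=0.3):
        self.sub = np.zeros(m, dtype=bool)   # internal binary memristor states
        self.p = p

    @property
    def weight(self):
        return float(self.sub.all())          # binary weight seen by the neuron

    def potentiate(self):
        self.sub |= rng.random(self.sub.size) < self.p    # each sub-device may switch on

    def depress(self):
        self.sub &= ~(rng.random(self.sub.size) < self.p)  # ...or back off

# Cumulative switching probability vs. trials is sigmoidal rather than geometric:
syn = [StochasticBinarySynapse() for _ in range(10000)]
for t in range(1, 16):
    for s in syn:
        s.potentiate()
    print(t, sum(s.weight for s in syn) / len(syn))
```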
Affiliation(s)
- Yoshifumi Nishi: Frontier Research Laboratory, Corporate R&D Center, Toshiba Corporation, 1, Komukai-Toshiba-Cho, Saiwai-ku, Kawasaki, 212-8582, Japan
- Kumiko Nomura: Frontier Research Laboratory, Corporate R&D Center, Toshiba Corporation, 1, Komukai-Toshiba-Cho, Saiwai-ku, Kawasaki, 212-8582, Japan
- Takao Marukame: Frontier Research Laboratory, Corporate R&D Center, Toshiba Corporation, 1, Komukai-Toshiba-Cho, Saiwai-ku, Kawasaki, 212-8582, Japan
- Koichi Mizushima: Frontier Research Laboratory, Corporate R&D Center, Toshiba Corporation, 1, Komukai-Toshiba-Cho, Saiwai-ku, Kawasaki, 212-8582, Japan
15
A Cost-Efficient High-Speed VLSI Architecture for Spiking Convolutional Neural Network Inference Using Time-Step Binary Spike Maps. Sensors 2021; 21(18):6006. PMID: 34577214; PMCID: PMC8471769; DOI: 10.3390/s21186006.
Abstract
Neuromorphic hardware systems have been gaining ever-increasing attention in many embedded applications because they use a brain-inspired, energy-efficient spiking neural network (SNN) model that closely mimics the mechanisms of the human cortex by communicating and processing sensory information via spatiotemporally sparse spikes. In this paper, we fully leverage the characteristics of spiking convolutional neural networks (SCNNs) and propose a scalable, cost-efficient, and high-speed VLSI architecture to accelerate deep SCNN inference for real-time, low-cost embedded scenarios. We use the snapshot of binary spike maps at each time step to decompose SCNN operations into a series of regular, simple, CNN-like time-step computations, reducing hardware resource consumption. Moreover, our hardware architecture achieves high throughput by employing a pixel-stream processing mechanism and fine-grained data pipelines. Our Zynq-7045 FPGA prototype reached a high processing speed of 1,250 frames/s and high recognition accuracies on the MNIST and Fashion-MNIST image datasets, demonstrating the plausibility of our SCNN hardware architecture for many embedded applications.
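The time-step decomposition can be sketched as follows: at each step the layer input is a binary spike map, so "convolution" reduces to summing kernel weights at spike positions, with no multiplications, followed by a LIF membrane update. Shapes, the leak, and the reset scheme are illustrative assumptions.

```python
# Sketch of one time step of a spiking convolutional layer on a binary spike map:
# weights are accumulated only where input spikes occur (multiplier-free).
import numpy as np

def scnn_layer_step(spike_map, kernels, v, thresh=1.0, leak=0.95):
    """spike_map: (H, W) binary; kernels: (C, kh, kw); v: (C, H', W') membrane."""
    C, kh, kw = kernels.shape
    H, W = spike_map.shape
    out = np.zeros((C, H - kh + 1, W - kw + 1))
    for c in range(C):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = spike_map[i:i+kh, j:j+kw]
                out[c, i, j] = kernels[c][patch.astype(bool)].sum()  # adds only at spikes
    v = leak * v + out                     # leaky membrane update
    spikes = v >= thresh
    v[spikes] -= thresh                    # reset-by-subtraction
    return spikes.astype(np.uint8), v

k = np.random.randn(4, 3, 3)
v0 = np.zeros((4, 6, 6))
spk, v1 = scnn_layer_step((np.random.rand(8, 8) < 0.2).astype(np.uint8), k, v0)
```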
16
Ben Abdallah A, Dang KN. Toward Robust Cognitive 3D Brain-Inspired Cross-Paradigm System. Front Neurosci 2021; 15:690208. PMID: 34248491; PMCID: PMC8267251; DOI: 10.3389/fnins.2021.690208.
Abstract
Spiking neuromorphic systems have been introduced as promising platforms for energy-efficient spiking neural network (SNN) execution. SNNs incorporate neuronal and synaptic states, in addition to varying time scales, into their computational model. Since each neuron in these networks is connected to many others, high bandwidth is required. Moreover, since spike times are used to encode information in SNNs, precise communication latency is also needed, although SNNs tolerate some spike delay variation when the network is seen as a whole. The two-dimensional packet-switched network-on-chip was proposed as a solution to provide a scalable interconnect fabric in large-scale spike-based neural networks. 3D ICs have also attracted much attention as a potential solution to the interconnect bottleneck. Combining these two emerging technologies opens a new horizon for IC design that satisfies the high requirements of low power and small footprint in emerging AI applications. Moreover, although fault tolerance is a natural feature of biological systems, integrating many computation and memory units into neuromorphic chips raises reliability issues, where a defective part can affect the overall system's performance. This paper presents the design and simulation of R-NASH, a reliable three-dimensional digital neuromorphic system built on 3D-IC Through-Silicon-Via technology and geared explicitly toward the biological brain's three-dimensional structure, where information in the network is represented by sparse patterns of spike timing and learning is based on the local spike-timing-dependent plasticity rule. Our platform enables high integration density and small spike delay for spiking networks and features a scalable design that facilitates spiking neural network implementation on clustered neurons based on a Network-on-Chip. We provide a memory interface with the host CPU, allowing for online training and inference of spiking neural networks. Moreover, R-NASH supports fault recovery with graceful performance degradation.
Affiliation(s)
- Abderazek Ben Abdallah: Adaptive Systems Laboratory, Graduate School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu, Japan
- Khanh N Dang: Adaptive Systems Laboratory, Graduate School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu, Japan; VNU Key Laboratory for Smart Integrated Systems (SISLAB), VNU University of Engineering and Technology, Vietnam National University, Hanoi, Vietnam
17
Covi E, Donati E, Liang X, Kappel D, Heidari H, Payvand M, Wang W. Adaptive Extreme Edge Computing for Wearable Devices. Front Neurosci 2021; 15:611300. PMID: 34045939; PMCID: PMC8144334; DOI: 10.3389/fnins.2021.611300.
Abstract
Wearable devices are a fast-growing technology with impact on personal healthcare for both society and the economy. Due to the widespread deployment of sensors in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital in future smart wearable devices. The visioning and forecasting of how to bring computation to the edge in smart sensors has already begun, with an aspiration to provide adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions toward smart wearable devices that can guide research in this pervasive computing era. We propose various solutions for biologically plausible models of continual learning in neuromorphic computing technologies for wearable sensors. To envision this concept, we provide a systematic outline of prospective low-power and low-latency scenarios for wearable sensors on neuromorphic platforms. We then describe the key potential landscapes of neuromorphic processors exploiting complementary metal-oxide semiconductor (CMOS) and emerging memory technologies (e.g., memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size. We additionally investigate the challenges beyond neuromorphic computing hardware, algorithms, and devices that could impede the enhancement of adaptive edge computing in smart wearable devices.
Affiliation(s)
- Elisa Donati: Institute of Neuroinformatics, University of Zurich, Eidgenössische Technische Hochschule Zürich (ETHZ), Zurich, Switzerland
- Xiangpeng Liang: Microelectronics Lab, James Watt School of Engineering, University of Glasgow, Glasgow, United Kingdom
- David Kappel: Bernstein Center for Computational Neuroscience, III Physikalisches Institut–Biophysik, Georg-August Universität, Göttingen, Germany
- Hadi Heidari: Microelectronics Lab, James Watt School of Engineering, University of Glasgow, Glasgow, United Kingdom
- Melika Payvand: Institute of Neuroinformatics, University of Zurich, Eidgenössische Technische Hochschule Zürich (ETHZ), Zurich, Switzerland
- Wei Wang: The Andrew and Erna Viterbi Department of Electrical Engineering, Technion–Israel Institute of Technology, Haifa, Israel
18
Frenkel C, Lefebvre M, Bol D. Learning Without Feedback: Fixed Random Learning Signals Allow for Feedforward Training of Deep Neural Networks. Front Neurosci 2021; 15:629892. PMID: 33642986; PMCID: PMC7902857; DOI: 10.3389/fnins.2021.629892.
Abstract
While the backpropagation of error algorithm enables deep neural network training, it implies (i) bidirectional synaptic weight transport and (ii) update locking until the forward and backward passes are completed. Not only do these constraints preclude biological plausibility, but they also hinder the development of low-cost adaptive smart sensors at the edge, as they severely constrain memory accesses and entail buffering overhead. In this work, we show that the one-hot-encoded labels provided in supervised classification problems, denoted as targets, can be viewed as a proxy for the error sign. Therefore, their fixed random projections enable a layerwise feedforward training of the hidden layers, thus solving the weight transport and update locking problems while relaxing the computational and memory requirements. Based on these observations, we propose the direct random target projection (DRTP) algorithm and demonstrate that it provides a tradeoff between accuracy and computational cost that is suitable for adaptive edge computing devices.
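A minimal DRTP-style sketch for one hidden layer follows: a fixed random projection of the one-hot target acts as the layer's learning signal, so the layer updates immediately after its own forward pass, with no backward pass through later layers. Layer sizes, learning rate, nonlinearity, and sign conventions are illustrative assumptions, not the paper's exact formulation.

```python
# DRTP-style sketch: hidden layer trained from a fixed random projection of the
# one-hot target (B1 is never trained); the output layer trains on the true error.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, lr = 784, 256, 10, 0.01

W1 = rng.normal(0, 0.05, (n_in, n_hid))
W2 = rng.normal(0, 0.05, (n_hid, n_out))
B1 = rng.normal(0, 0.05, (n_out, n_hid))     # fixed random target projection

def train_step(x, y_onehot):
    h = np.tanh(x @ W1)                       # forward through layer 1
    delta1 = (y_onehot @ B1) * (1 - h**2)     # projected target as a proxy learning signal
    W1[:] -= lr * np.outer(x, delta1)         # layer 1 updates immediately (no update locking)
    out = h @ W2
    e = out - y_onehot
    W2[:] -= lr * np.outer(h, e)

train_step(rng.random(n_in), np.eye(n_out)[3])    # one sample of class 3
```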
Affiliation(s)
- Charlotte Frenkel: Institute of Neuroinformatics, University of Zürich and ETH Zürich, Zurich, Switzerland; ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Martin Lefebvre: ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- David Bol: ICTEAM Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
19
Knight JC, Nowotny T. Larger GPU-accelerated brain simulations with procedural connectivity. Nature Computational Science 2021; 1:136-142. PMID: 38217218; DOI: 10.1038/s43588-020-00022-7.
Abstract
Simulations are an important tool for investigating brain function, but large models are needed to faithfully reproduce the statistics and dynamics of brain activity. Simulating large spiking neural network models has, until now, needed so much memory for storing synaptic connections that it required high-performance computing systems. Here, we present an alternative simulation method we call 'procedural connectivity', where connectivity and synaptic weights are generated 'on the fly' instead of stored and retrieved from memory. This method is particularly well suited for use on graphical processing units (GPUs), which are a common fixture in many workstations. Using procedural connectivity and an additional GPU code-generation optimization, we can simulate a recent model of the macaque visual cortex with 4.13 × 10⁶ neurons and 24.2 × 10⁹ synapses on a single GPU, a significant step forward in making large-scale brain modeling accessible to more researchers.
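The procedural connectivity idea can be sketched directly: instead of storing the synaptic matrix, each presynaptic neuron's fan-out is regenerated from a deterministic per-neuron seed whenever that neuron spikes. The connection probability, weight, and seeding scheme below are illustrative, not the paper's implementation.

```python
# Sketch of procedural connectivity: re-derive each neuron's targets from a
# deterministic seed on every spike instead of storing the synaptic matrix.
import numpy as np

N_POST, P_CONN, WEIGHT = 100_000, 0.1, 0.05

def targets_of(pre_id, base_seed=1234):
    """Deterministically regenerate the fixed fan-out of neuron `pre_id`."""
    rng = np.random.default_rng(base_seed + pre_id)   # same seed -> same synapses every call
    return np.flatnonzero(rng.random(N_POST) < P_CONN)

def deliver_spike(pre_id, input_current):
    input_current[targets_of(pre_id)] += WEIGHT       # no stored connectivity touched

current = np.zeros(N_POST)
deliver_spike(42, current)
assert np.array_equal(targets_of(42), targets_of(42))  # connectivity is stable across calls
```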
Affiliation(s)
- James C Knight: Centre for Computational Neuroscience and Robotics, School of Engineering and Informatics, University of Sussex, Brighton, UK
- Thomas Nowotny: Centre for Computational Neuroscience and Robotics, School of Engineering and Informatics, University of Sussex, Brighton, UK
20
Azghadi MR, Lammie C, Eshraghian JK, Payvand M, Donati E, Linares-Barranco B, Indiveri G. Hardware Implementation of Deep Network Accelerators Towards Healthcare and Biomedical Applications. IEEE Transactions on Biomedical Circuits and Systems 2020; 14:1138-1159. PMID: 33156792; DOI: 10.1109/tbcas.2020.3036081.
Abstract
The advent of dedicated Deep Learning (DL) accelerators and neuromorphic processors has brought on new opportunities for applying both Deep and Spiking Neural Network (SNN) algorithms to healthcare and biomedical applications at the edge. This can facilitate the advancement of medical Internet of Things (IoT) systems and Point of Care (PoC) devices. In this paper, we provide a tutorial describing how various technologies including emerging memristive devices, Field Programmable Gate Arrays (FPGAs), and Complementary Metal Oxide Semiconductor (CMOS) can be used to develop efficient DL accelerators to solve a wide variety of diagnostic, pattern recognition, and signal processing problems in healthcare. Furthermore, we explore how spiking neuromorphic processors can complement their DL counterparts for processing biomedical signals. The tutorial is augmented with case studies of the vast literature on neural network and neuromorphic hardware as applied to the healthcare domain. We benchmark various hardware platforms by performing a sensor fusion signal processing task combining electromyography (EMG) signals with computer vision. Comparisons are made between dedicated neuromorphic processors and embedded AI accelerators in terms of inference latency and energy. Finally, we provide our analysis of the field and share a perspective on the advantages, disadvantages, challenges, and opportunities that various accelerators and neuromorphic processors introduce to healthcare and biomedical domains.
21
Ceolini E, Frenkel C, Shrestha SB, Taverni G, Khacef L, Payvand M, Donati E. Hand-Gesture Recognition Based on EMG and Event-Based Camera Sensor Fusion: A Benchmark in Neuromorphic Computing. Front Neurosci 2020; 14:637. PMID: 32903824; PMCID: PMC7438887; DOI: 10.3389/fnins.2020.00637.
Abstract
Hand gestures are a form of non-verbal communication used by individuals in conjunction with speech. Nowadays, with the increasing use of technology, hand-gesture recognition is considered an important aspect of Human-Machine Interaction (HMI), allowing the machine to capture and interpret the user's intent and to respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates complementary systems: the electromyography (EMG) signal from muscles and visual information. This multi-sensor approach, while improving accuracy and robustness, introduces the disadvantage of high computational cost, which grows exponentially with the number of sensors and measurements. Furthermore, this large amount of data to process can affect classification latency, which can be crucial in real-world scenarios such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations since they allow real-time processing in parallel at low power consumption. In this paper, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprising an event-based vision sensor and three different neuromorphic processors. In particular, we used the event-based camera DVS and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded using traditional electrodes and then converted into spikes to be fed into the chips. We collected a dataset of five gestures from sign language in which visual and electromyography signals are synchronized. We compared the fully neuromorphic approach to a baseline implemented using traditional machine learning approaches on a portable GPU system. According to the chips' constraints, we designed specific spiking neural networks (SNNs) for sensor fusion that showed classification accuracy comparable to the software baseline. These neuromorphic alternatives have inference times 20-40% higher than the GPU system, but their significantly smaller energy-delay product (EDP) makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward a real-world scenario.
Affiliation(s)
- Enea Ceolini: Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Charlotte Frenkel: Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland; ICTEAM Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
- Sumit Bam Shrestha: Temasek Laboratories, National University of Singapore, Singapore, Singapore
- Gemma Taverni: Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Lyes Khacef: Université Côte d'Azur, CNRS, LEAT, Nice, France
- Melika Payvand: Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
- Elisa Donati: Institute of Neuroinformatics, University of Zurich, ETH Zurich, Zurich, Switzerland
22
Stoliar P, Schneegans O, Rozenberg MJ. Biologically Relevant Dynamical Behaviors Realized in an Ultra-Compact Neuron Model. Front Neurosci 2020; 14:421. PMID: 32595437; PMCID: PMC7247826; DOI: 10.3389/fnins.2020.00421.
Abstract
We demonstrate a variety of biologically relevant dynamical behaviors building on a recently introduced ultra-compact neuron (UCN) model. We provide the detailed circuits, which all share a common basic block that realizes the leaky-integrate-and-fire (LIF) spiking behavior. All circuits have a small number of active components, and the basic block has only three: two transistors and a silicon controlled rectifier (SCR). We also demonstrate that numerical simulations can faithfully represent the variety of spiking behavior and can be used for further exploration of dynamical behaviors. Taking Izhikevich's set of biologically relevant behaviors as a reference, our work demonstrates that a circuit implementing a LIF neuron model can be used as a basis for a large variety of relevant spiking patterns. These behaviors may be useful for constructing neural networks that capture complex brain dynamics, or for artificial intelligence applications. Our UCN model can therefore be considered the electronic circuit counterpart of Izhikevich's (2003) mathematical neuron model, sharing its two seemingly contradictory features: extreme simplicity and rich dynamical behavior.
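The LIF behavior realized by the basic block can be written as dv/dt = -(v - v_rest)/τ + i(t)/C with a threshold-and-reset rule; the sketch below integrates it with forward Euler using illustrative parameters, not the circuit's component values.

```python
# Forward-Euler integration of leaky integrate-and-fire dynamics
# (all parameter values are illustrative, not the UCN circuit's).
import numpy as np

def lif_trace(i_input, dt=0.1, tau=10.0, c=1.0, v_rest=0.0, v_thresh=1.0):
    v, spikes, trace = v_rest, [], []
    for i_t in i_input:
        v += dt * (-(v - v_rest) / tau + i_t / c)  # leak toward rest + input charge
        if v >= v_thresh:
            spikes.append(True)
            v = v_rest                              # hard reset after the spike
        else:
            spikes.append(False)
        trace.append(v)
    return np.array(trace), np.array(spikes)

trace, spikes = lif_trace(np.full(300, 0.15))       # constant drive -> tonic spiking
print(spikes.sum(), "spikes")
```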
Affiliation(s)
- Pablo Stoliar: National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
- Olivier Schneegans: CentraleSupélec, CNRS, Université Paris-Saclay, Sorbonne Université, Laboratoire de Génie Electrique et Electronique de Paris, Gif-sur-Yvette, France
- Marcelo J Rozenberg: Université Paris-Saclay, CNRS, Laboratoire de Physique des Solides, Orsay, France
23
Liu Y, Qian K, Hu S, An K, Xu S, Zhan X, Wang JJ, Guo R, Wu Y, Chen TP, Yu Q, Liu Y. Application of Deep Compression Technique in Spiking Neural Network Chip. IEEE Transactions on Biomedical Circuits and Systems 2020; 14:274-282. PMID: 31715570; DOI: 10.1109/tbcas.2019.2952714.
Abstract
In this paper, a reconfigurable and scalable spiking neural network processor, containing 192 neurons and 6,144 synapses, is developed. By applying a deep compression technique to the spiking neural network chip, the number of physical synapses can be reduced to 1/16 of that needed in the original network while accuracy is maintained. This compression technique greatly reduces the number of SRAMs inside the chip as well as the chip's power consumption. The design achieves a throughput per unit area of 1.1 GSOP/(s·mm²) at 1.2 V, and an energy per SOP of 35 pJ. A 2-layer fully connected spiking neural network is mapped to the chip, enabling handwritten digit recognition on MNIST with an accuracy of 91.2%.
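The deep-compression idea applied here can be sketched as magnitude pruning followed by sparse (index, value) storage, so physical synapses are needed only for surviving connections; the 1/16 keep-ratio matches the abstract, while the layer shape and pruning rule are illustrative assumptions.

```python
# Sketch of deep compression for SNN mapping: prune small weights, keep a
# sparse (index, value) representation, and accumulate event-driven.
import numpy as np

def compress(w, keep_ratio=1/16):
    """Keep the largest-magnitude fraction of weights in sparse form."""
    k = max(1, int(w.size * keep_ratio))
    idx = np.argsort(np.abs(w).ravel())[-k:]          # indices of surviving synapses
    return idx.astype(np.int32), w.ravel()[idx]       # what actually gets stored on-chip

def sparse_dot(spike_vec, idx, vals, shape):
    pre, post = idx // shape[1], idx % shape[1]       # recover (pre, post) from flat index
    out = np.zeros(shape[1])
    for r, c, v in zip(pre, post, vals):
        if spike_vec[r]:
            out[c] += v                               # event-driven accumulate
    return out

w = np.random.randn(192, 32)
idx, vals = compress(w)
out = sparse_dot(np.random.rand(192) < 0.05, idx, vals, w.shape)
print(len(vals), "of", w.size, "synapses kept")       # ~1/16 physical synapses
```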