1. Grose M, Schmidt JD, Hirakawa K. Convolutional neural network for improved event-based Shack-Hartmann wavefront reconstruction. Applied Optics 2024; 63:E35-E47. [PMID: 38856590] [DOI: 10.1364/ao.520652]
Abstract
Shack-Hartmann wavefront sensing is a technique for measuring wavefront aberrations, whose use in adaptive optics relies on fast position tracking of an array of spots. These sensors conventionally use frame-based cameras operating at a fixed sampling rate to report pixel intensities, even though only a fraction of the pixels carry signal. Prior in-lab experiments have shown the feasibility of event-based cameras for Shack-Hartmann wavefront sensing (SHWFS), asynchronously reporting the spot locations as log-intensity changes at a microsecond time scale. In our work, we propose a convolutional neural network (CNN) called event-based wavefront network (EBWFNet) that achieves highly accurate estimation of the spot centroid positions in real time. We developed custom Shack-Hartmann wavefront sensing hardware with a common aperture for synchronized frame- and event-based cameras, so that spot centroid locations computed from the frame-based camera may be used to train/test the event-CNN-based centroid position estimation method in an unsupervised manner. Field testing with this hardware allows us to conclude that the proposed EBWFNet achieves sub-pixel accuracy in real-world scenarios with substantial improvement over the state-of-the-art event-based SHWFS. An ablation study reveals the impact of data processing, CNN components, and the training cost function; an unoptimized MATLAB implementation runs faster than 800 Hz on a single GPU.
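As context for the improvement EBWFNet targets, a conventional event-based baseline simply accumulates recent events per lenslet cell and averages their positions. The sketch below is an illustrative assumption (grid layout, array shapes), not the authors' network:

```python
import numpy as np

def spot_centroids(events_xy, grid, cell):
    """Naive event-based spot centroiding over a lenslet grid.

    events_xy: (N, 2) pixel coordinates of recent events; grid: (rows, cols)
    of subapertures; cell: subaperture size in pixels. Each spot centroid is
    taken as the mean position of the events falling in its cell.
    """
    rows, cols = grid
    centroids = np.full((rows, cols, 2), np.nan)
    cx = (events_xy[:, 0] // cell).astype(int)
    cy = (events_xy[:, 1] // cell).astype(int)
    for r in range(rows):
        for c in range(cols):
            pts = events_xy[(cy == r) & (cx == c)]
            if len(pts):
                centroids[r, c] = pts.mean(axis=0)  # sub-pixel estimate
    return centroids
```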
2. He B, Wang Z, Zhou Y, Chen J, Singh CD, Li H, Gao Y, Shen S, Wang K, Cao Y, Xu C, Aloimonos Y, Gao F, Fermüller C. Microsaccade-inspired event camera for robotics. Sci Robot 2024; 9:eadj8124. [PMID: 38809998] [DOI: 10.1126/scirobotics.adj8124]
Abstract
Neuromorphic vision sensors, or event cameras, have made visual perception with extremely low reaction time possible, opening new avenues for high-dynamic robotics applications. The output of these event cameras depends on both motion and texture. However, the event camera fails to capture object edges that are parallel to the camera motion. This problem is intrinsic to the sensor and therefore challenging to solve algorithmically. Human vision deals with perceptual fading through the active mechanism of small involuntary eye movements, the most prominent of which are microsaccades. By moving the eyes constantly and slightly during fixation, microsaccades can substantially maintain texture stability and persistence. Inspired by microsaccades, we designed an event-based perception system capable of simultaneously maintaining low reaction time and stable texture. In this design, a rotating wedge prism was mounted in front of the aperture of an event camera to redirect light and trigger events. The geometrical optics of the rotating wedge prism allows for algorithmic compensation of the additional rotational motion, resulting in a stable texture appearance and high informational output independent of external motion. The hardware device and software solution are integrated into a system, which we call the artificial microsaccade-enhanced event camera (AMI-EV). Benchmark comparisons validated the superior data quality of AMI-EV recordings in scenarios where both standard cameras and event cameras fail to deliver. Various real-world experiments demonstrated the potential of the system to facilitate robotics perception for both low-level and high-level vision tasks.
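The compensation step lends itself to a simple geometric sketch: since the wedge prism deflects the image along a circle at a known angular rate, the induced offset can be subtracted from each event. The parameters below (angular speed, deflection radius) are hypothetical placeholders, not the paper's calibrated values:

```python
import numpy as np

def compensate_prism(events, omega, r_pix):
    """Subtract the wedge-prism-induced circular image shift from events.

    events: (N, 3) array of [x, y, t]; omega: prism angular speed in rad/s;
    r_pix: deflection radius in pixels. Both parameters are hypothetical
    placeholders; a real system also calibrates phase offset and optics.
    """
    theta = omega * events[:, 2]
    out = events.copy()
    out[:, 0] -= r_pix * np.cos(theta)  # undo known rotational deflection in x
    out[:, 1] -= r_pix * np.sin(theta)  # ... and in y
    return out
```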
Affiliation(s)
- Botao He: Department of Computer Science, University of Maryland, College Park, MD 20742, USA; College of Control Science and Engineering, Zhejiang University, Hangzhou, China
- Ze Wang: Huzhou Institute of Zhejiang University, Huzhou, China; College of Optical Science and Engineering, Zhejiang University, Hangzhou, China
- Yuan Zhou: College of Control Science and Engineering, Zhejiang University, Hangzhou, China; Huzhou Institute of Zhejiang University, Huzhou, China
- Jingxi Chen: Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- Chahat Deep Singh: Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- Haojia Li: Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Yuman Gao: College of Control Science and Engineering, Zhejiang University, Hangzhou, China; Huzhou Institute of Zhejiang University, Huzhou, China
- Shaojie Shen: Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Kaiwei Wang: College of Optical Science and Engineering, Zhejiang University, Hangzhou, China
- Yanjun Cao: Huzhou Institute of Zhejiang University, Huzhou, China
- Chao Xu: College of Control Science and Engineering, Zhejiang University, Hangzhou, China; Huzhou Institute of Zhejiang University, Huzhou, China
- Yiannis Aloimonos: Department of Computer Science, University of Maryland, College Park, MD 20742, USA; Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
- Fei Gao: College of Control Science and Engineering, Zhejiang University, Hangzhou, China; Huzhou Institute of Zhejiang University, Huzhou, China
- Cornelia Fermüller: Department of Computer Science, University of Maryland, College Park, MD 20742, USA; Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA; Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
3. Yao M, Richter O, Zhao G, Qiao N, Xing Y, Wang D, Hu T, Fang W, Demirci T, De Marchi M, Deng L, Yan T, Nielsen C, Sheik S, Wu C, Tian Y, Xu B, Li G. Spike-based dynamic computing with asynchronous sensing-computing neuromorphic chip. Nat Commun 2024; 15:4464. [PMID: 38796464] [PMCID: PMC11127998] [DOI: 10.1038/s41467-024-47811-6]
Abstract
By mimicking the neurons and synapses of the human brain and employing spiking neural networks on neuromorphic chips, neuromorphic computing offers a promising route to energy-efficient machine intelligence. How to borrow high-level dynamic mechanisms of the brain to help neuromorphic computing achieve its energy advantages is a fundamental issue. This work presents an application-oriented, algorithm-software-hardware co-designed neuromorphic system to address this issue. First, we design and fabricate an asynchronous chip called "Speck", a sensing-computing neuromorphic system on chip. With a low processor resting power of 0.42 mW, Speck satisfies the hardware requirement of dynamic computing: no input consumes no energy. Second, we uncover the "dynamic imbalance" in spiking neural networks and develop an attention-based framework to achieve the algorithmic requirement of dynamic computing: varied inputs consume energy with large variance. Together, we demonstrate a neuromorphic system with real-time power as low as 0.70 mW. This work exhibits the promising potential of neuromorphic computing with its asynchronous, event-driven, sparse, and dynamic nature.
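The "no input consumes no energy" principle can be illustrated with a toy leaky integrate-and-fire (LIF) update that skips all work when no spikes arrive; this is a software sketch of the idea, not Speck's asynchronous circuit design:

```python
import numpy as np

def lif_step(v, spikes_in, w, tau=20.0, v_th=1.0):
    """One event-driven leaky integrate-and-fire (LIF) update.

    v: membrane potentials; spikes_in: binary input spike vector; w: weights.
    The early return mirrors the dynamic-computing principle: when no input
    spikes arrive, no integration work is done and no energy is spent.
    """
    if not spikes_in.any():                      # no input -> no computation
        return v, np.zeros_like(v)
    v = v * (1.0 - 1.0 / tau) + w @ spikes_in    # leak, then integrate input
    out = (v >= v_th).astype(float)              # emit spikes at threshold
    v = np.where(out > 0, 0.0, v)                # reset fired neurons
    return v, out
```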
Affiliation(s)
- Man Yao: Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Ole Richter: SynSense AG Corporation, Zurich, Switzerland
- Guangshe Zhao: School of Automation Science and Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Ning Qiao: SynSense AG Corporation, Zurich, Switzerland; SynSense Corporation, Chengdu, Sichuan, China
- Yannan Xing: SynSense Corporation, Chengdu, Sichuan, China
- Dingheng Wang: Northwest Institute of Mechanical & Electrical Engineering, Xianyang, Shaanxi, China
- Tianxiang Hu: Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Wei Fang: School of Computer Science, Peking University, Beijing, China; Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Lei Deng: Center for Brain-Inspired Computing, Department of Precision Instrument, Tsinghua University, Beijing, China
- Tianyi Yan: School of Life Science, Beijing Institute of Technology, Beijing, China
- Carsten Nielsen: SynSense AG Corporation, Zurich, Switzerland; Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Chenxi Wu: SynSense AG Corporation, Zurich, Switzerland; Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Yonghong Tian: School of Computer Science, Peking University, Beijing, China; Peng Cheng Laboratory, Shenzhen, Guangdong, China
- Bo Xu: Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Guoqi Li: Institute of Automation, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, Beijing, China
4. Gionfrida L, Kim D, Scaramuzza D, Farina D, Howe RD. Wearable robots for the real world need vision. Sci Robot 2024; 9:eadj8812. [PMID: 38776377] [DOI: 10.1126/scirobotics.adj8812]
Abstract
Enhancing wearable robots for the real world requires novel vision approaches to understanding user intent and perceiving the environment.
Affiliation(s)
- Letizia Gionfrida: Department of Informatics, Faculty of Natural, Mathematical and Engineering Sciences, King's College London, Bush House, 30 Aldwych, London WC2B 4BG, UK; John A. Paulson School of Engineering and Applied Sciences and Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Daekyum Kim: School of Smart Mobility, Korea University, Seoul 02841, South Korea; School of Mechanical Engineering, Korea University, Seoul 02841, South Korea
- Davide Scaramuzza: Robotics and Perception Group, Department of Informatics, University of Zurich, Andreasstrasse 15, 8050 Zurich, Switzerland
- Dario Farina: Department of Bioengineering, Faculty of Engineering, Imperial College London, Exhibition Rd, South Kensington, London SW7 2BX, UK
- Robert D. Howe: John A. Paulson School of Engineering and Applied Sciences and Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
5. Yang J, Cai Y, Wang F, Li S, Zhan X, Xu K, He J, Wang Z. A Reconfigurable Bipolar Image Sensor for High-Efficiency Dynamic Vision Recognition. Nano Letters 2024; 24:5862-5869. [PMID: 38709809] [DOI: 10.1021/acs.nanolett.4c01190]
Abstract
Dynamic vision perception and processing (DVPP) is in high demand from booming edge artificial intelligence. However, existing imaging systems suffer from low efficiency or low compatibility with advanced machine vision techniques. Here, we propose a reconfigurable bipolar image sensor (RBIS) for in-sensor DVPP based on a two-dimensional WSe2/GeSe heterostructure device. Owing to the gate-tunable and reversible built-in electric field, its photoresponse shows bipolarity, being either positive or negative. High-efficiency DVPP incorporating a front-end RBIS and a back-end convolutional neural network (CNN) is then demonstrated. It shows a high recognition accuracy of over 94.9% on the derived DVS128 dataset and requires far fewer neural network parameters than a system without RBIS. Moreover, we demonstrate an optimized device with a vertically stacked structure and stable nonvolatile bipolarity, which enables more efficient DVPP hardware. Our work demonstrates the potential of fabricating DVPP devices with a simple structure, high efficiency, and outputs compatible with advanced algorithms.
Affiliation(s)
- Jia Yang: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Yuchen Cai: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Feng Wang: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Shuhui Li: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China
- Xueying Zhan: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China
- Kai Xu: Hangzhou Global Scientific and Technological Innovation Center, School of Micro-Nano Electronics, Zhejiang University, Hangzhou 310027, China
- Jun He: Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Artificial Micro- and Nano-structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Zhenxing Wang: CAS Key Laboratory of Nanosystem and Hierarchical Fabrication, National Center for Nanoscience and Technology, Beijing 100190, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
6. Yu H, Li H, Yang W, Yu L, Xia GS. Detecting Line Segments in Motion-Blurred Images With Events. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:2866-2881. [PMID: 37983154] [DOI: 10.1109/tpami.2023.3334877]
Abstract
Making line segment detectors more reliable under motion blur is one of the most important challenges for practical applications such as visual SLAM and 3D line mapping. Existing line segment detection methods suffer severe performance degradation in accurately detecting and locating line segments when motion blur occurs. Event data, in contrast, complements images with minimal blur and edge awareness at high temporal resolution, which is potentially beneficial for reliable line segment detection. To robustly detect line segments under motion blur, we propose to leverage the complementary information of images and events. Specifically, we first design a general frame-event feature fusion network to extract and fuse detailed image textures and low-latency event edges, consisting of a channel-attention-based shallow fusion module and a self-attention-based dual hourglass module. We then utilize state-of-the-art wireframe parsing networks to detect line segments on the fused feature map. Moreover, due to the lack of line segment detection datasets with paired motion-blurred images and events, we contribute two datasets, the synthetic FE-Wireframe and the realistic FE-Blurframe, for network training and evaluation. Extensive analyses of the component configurations demonstrate the effectiveness of our fusion network design. Compared to the state of the art, the proposed approach achieves the highest detection accuracy while maintaining comparable real-time performance. In addition to being robust to motion blur, our method also exhibits superior performance for line detection in high-dynamic-range scenes.
7. Zhang T, Shen Y, Zhao G, Wang L, Chen X, Bai L, Zhou Y. Swift-Eye: Towards Anti-blink Pupil Tracking for Precise and Robust High-Frequency Near-Eye Movement Analysis with Event Cameras. IEEE Transactions on Visualization and Computer Graphics 2024; 30:2077-2086. [PMID: 38437077] [DOI: 10.1109/tvcg.2024.3372039]
Abstract
Eye tracking has shown great promise in many scientific fields and daily applications, ranging from the early detection of mental health disorders to foveated rendering in virtual reality (VR). These applications all call for a robust system for high-frequency near-eye movement sensing and analysis with high precision, which cannot be guaranteed by existing eye tracking solutions based on CCD/CMOS cameras. To bridge the gap, in this paper we propose Swift-Eye, an offline, precise, and robust pupil estimation and tracking framework to support high-frequency near-eye movement analysis, especially when the pupil region is partially occluded. Swift-Eye is built upon emerging event cameras to capture the high-speed movement of eyes at high temporal resolution. A series of bespoke components is then designed to generate high-quality near-eye movement video at frame rates above one kilohertz and to deal with occlusion of the pupil caused by involuntary eye blinks. According to our extensive evaluations on EV-Eye, a large-scale public dataset for eye tracking using event cameras, Swift-Eye shows high robustness against significant occlusion. It improves the IoU and F1-score of pupil estimation by 20% and 12.5%, respectively, over the second-best competing approach when over 80% of the pupil region is occluded by the eyelid. Lastly, it provides continuous and smooth pupil traces at extremely high temporal resolution and can support high-frequency eye movement analysis and a number of potential applications, such as mental health diagnosis and behaviour-brain association. The implementation details and source code can be found at https://github.com/ztysdu/Swift-Eye.
8. Gehrig D, Scaramuzza D. Low-latency automotive vision with event cameras. Nature 2024; 629:1034-1040. [PMID: 38811712] [PMCID: PMC11136662] [DOI: 10.1038/s41586-024-07409-w]
Abstract
The computer vision algorithms currently used in advanced driver assistance systems rely on image-based RGB cameras, leading to a critical bandwidth-latency trade-off for delivering safe driving experiences. To address this, event cameras have emerged as alternative vision sensors. Event cameras measure changes in intensity asynchronously, offering high temporal resolution and sparsity, markedly reducing bandwidth and latency requirements [1]. Despite these advantages, event-camera-based algorithms are either highly efficient but lag behind image-based ones in accuracy, or sacrifice the sparsity and efficiency of events to achieve comparable results. To overcome this, here we propose a hybrid event- and frame-based object detector that preserves the advantages of each modality and thus does not suffer from this trade-off. Our method exploits the high temporal resolution and sparsity of events and the rich but low-temporal-resolution information in standard images to generate efficient, high-rate object detections, reducing perceptual and computational latency. We show that the use of a 20 frames per second (fps) RGB camera plus an event camera can achieve the same latency as a 5,000-fps camera with the bandwidth of a 45-fps camera without compromising accuracy. Our approach paves the way for efficient and robust perception in edge-case scenarios by uncovering the potential of event cameras [2].
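A back-of-the-envelope calculation makes the quoted trade-off concrete. The resolution and event-rate figures below are assumptions for illustration; only the 20/45/5,000-fps numbers come from the abstract:

```python
# Back-of-the-envelope latency/bandwidth comparison (illustrative numbers only:
# resolution and event rate are assumptions; 20/45/5,000 fps come from the text).
FRAME_W, FRAME_H, BYTES_PER_PX = 1280, 720, 3

def frame_bandwidth_mb(fps):
    """Raw RGB bandwidth in MB/s for a frame-based camera."""
    return fps * FRAME_W * FRAME_H * BYTES_PER_PX / 1e6

latency_ms = 1000 / 5000              # inter-sample time of a 5,000-fps camera
budget_mb = frame_bandwidth_mb(45)    # the 45-fps bandwidth budget

# Hybrid: 20-fps RGB plus an event stream assumed at 5 Mevents/s, 8 bytes/event.
hybrid_mb = frame_bandwidth_mb(20) + 5e6 * 8 / 1e6

print(f"target latency: {latency_ms:.2f} ms")
print(f"45-fps budget: {budget_mb:.0f} MB/s, hybrid: {hybrid_mb:.0f} MB/s")
```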
Affiliation(s)
- Daniel Gehrig: Robotics and Perception Group, University of Zurich, Zurich, Switzerland
- Davide Scaramuzza: Robotics and Perception Group, University of Zurich, Zurich, Switzerland
9. Lei T, Guan B, Liang M, Liu Z, Liu J, Shang Y, Yu Q. Motion measurements of explosive shock waves based on an event camera. Optics Express 2024; 32:15390-15409. [PMID: 38859191] [DOI: 10.1364/oe.506662]
Abstract
Shock wave measurement is vital in assessing explosive power and designing warheads. To obtain satisfactory observation data of explosive shock waves, optical sensors should possess high dynamic range and high temporal resolution. In this paper, an event camera is employed, for the first time, to observe explosive shock waves, leveraging its high dynamic range and low latency. A comprehensive procedure is devised to accurately measure the motion parameters of shock waves. First, a calibration method based on planar lines is proposed to compute the calibration parameters of the event camera, exploiting the sensor's edge-sensitive characteristic. Then, the fitted ellipse parameters of the shock wave are estimated from the concise event data, which are obtained by exploiting the characteristics of event triggering and the shock wave's morphology. Finally, the geometric relationship between the ellipse parameters and the radius of the shock wave is derived, and the motion parameters of the shock wave are estimated. To verify the performance of our method, we compare our measurements from a TNT explosion test with pressure sensor results and empirical formula predictions. The relative measurement error with respect to the pressure sensors ranges from 0.33% to 7.58%. The experimental results verify the rationality and effectiveness of our methods.
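The ellipse-fitting step is a standard least-squares conic fit, sketched below on raw event coordinates; the paper's pipeline additionally filters events using trigger characteristics and wavefront morphology before fitting:

```python
import numpy as np

def fit_ellipse(xy):
    """Least-squares conic fit a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0.

    xy: (N, 2) event coordinates on the shock front. This is the generic
    fitting step only; radius and motion parameters follow from the fitted
    conic via the geometric relations derived in the paper.
    """
    x, y = xy[:, 0], xy[:, 1]
    A = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    return vt[-1]  # conic coefficients, defined up to scale
```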
10. Gouda M, Abreu S, Bienstman P. Surrogate gradient learning in spiking networks trained on event-based cytometry dataset. Optics Express 2024; 32:16260-16272. [PMID: 38859258] [DOI: 10.1364/oe.518323]
Abstract
Spiking neural networks (SNNs) are bio-inspired neural networks that, to an extent, mimic the workings of our brains. In a similar fashion, event-based vision sensors try to replicate a biological eye as closely as possible. In this work, we integrate both technologies for the purpose of classifying microparticles in the context of label-free flow cytometry. We follow up on our previous work, in which we used simple logistic regression with binary labels. Although that model was able to achieve an accuracy of over 98%, our goal is to utilize the system for a wider variety of cells, some of which may have less noticeable morphological variations. Therefore, a more advanced machine learning model like the SNNs discussed here is required. This comes with the challenge of training such networks, since they typically suffer from vanishing gradients. We effectively apply the surrogate gradient method to overcome this issue, achieving over 99% classification accuracy on test data for a four-class problem. Finally, rather than treating the neural network as a black box, we explore the dynamics inside the network and make use of that to enhance its accuracy and sparsity.
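The surrogate gradient method replaces the non-differentiable spike with a smooth derivative during backpropagation. A minimal PyTorch sketch with a fast-sigmoid surrogate (a common choice; the paper's exact surrogate may differ) looks like this:

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, fast-sigmoid surrogate in the
    backward pass (one common choice, not necessarily the paper's)."""

    @staticmethod
    def forward(ctx, v_minus_th):
        ctx.save_for_backward(v_minus_th)
        return (v_minus_th > 0).float()          # non-differentiable spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        slope = 10.0                             # surrogate sharpness (hyperparameter)
        surrogate = 1.0 / (1.0 + slope * v.abs()) ** 2
        return grad_out * surrogate              # smooth gradient for backprop
```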
11. Milozzi A, Ricci S, Ielmini D. Memristive tonotopic mapping with volatile resistive switching memory devices. Nat Commun 2024; 15:2812. [PMID: 38561389] [PMCID: PMC10985068] [DOI: 10.1038/s41467-024-47228-1]
Abstract
To reach the energy efficiency and computing capability of biological neural networks, novel hardware systems and paradigms are required in which information is processed in both the spatial and temporal domains. Resistive switching memory (RRAM) devices appear to be key enablers for the implementation of large-scale neuromorphic computing systems with high energy efficiency and extended scalability. Demonstrating a full set of spatiotemporal primitives with RRAM-based circuits remains an open challenge. Taking inspiration from the neurobiological processes in the human auditory system, we develop neuromorphic circuits for memristive tonotopic mapping via volatile RRAM devices. Based on a generalized stochastic device-level approach, we demonstrate the main features of signal processing in the cochlea, namely logarithmic integration and tonotopic mapping of signals. We also show that our tonotopic classification is suitable for speech recognition. These results support memristive devices for the physical processing of temporal signals, paving the way for energy-efficient, high-density neuromorphic systems.
Affiliation(s)
- Alessandro Milozzi: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano and IU.NET, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
- Saverio Ricci: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano and IU.NET, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
- Daniele Ielmini: Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano and IU.NET, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
12. Miled MB, Liu W, Liu Y. Adaptive Unsupervised Learning-Based 3D Spatiotemporal Filter for Event-Driven Cameras. Research (Washington, D.C.) 2024; 7:0330. [PMID: 38562525] [PMCID: PMC10981976] [DOI: 10.34133/research.0330]
Abstract
In the evolving landscape of robotics and visual navigation, event cameras have gained considerable traction, notably for their exceptional dynamic range, efficient power consumption, and low latency. Despite these advantages, conventional processing methods oversimplify the data into two dimensions, neglecting critical temporal information. To overcome this limitation, we propose a novel method that treats events as 3D time-discrete signals. Drawing inspiration from the intricate biological filtering systems inherent to the human visual apparatus, we have developed a 3D spatiotemporal filter based on an unsupervised machine learning algorithm. This filter effectively reduces noise levels and performs data size reduction, with its parameters being dynamically adjusted based on population activity. This ensures adaptability and precision under various conditions, such as changes in motion velocity and ambient lighting. In our novel validation approach, we first identify the noise type and determine its power spectral density in the event stream. We then apply a one-dimensional discrete fast Fourier transform to assess the filtered event data in the frequency domain, ensuring that the targeted noise frequencies are adequately reduced. Our research also examines the impact of indoor lighting on event stream noise. Remarkably, our method led to a 37% reduction in the data point cloud, improving data quality in diverse outdoor settings.
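The frequency-domain validation step can be sketched as binning event timestamps into a rate signal and inspecting its spectrum; the bin width below is an assumed parameter, and this is a simplified stand-in for the paper's PSD analysis:

```python
import numpy as np

def event_rate_spectrum(t, bin_ms=1.0):
    """Amplitude spectrum of the event-rate signal.

    t: event timestamps in seconds. Binning events into a rate signal and
    taking a 1D FFT is one way to check which noise frequencies a filter
    attenuated, loosely following the validation idea in the abstract.
    """
    bins = np.arange(t.min(), t.max(), bin_ms / 1000.0)
    rate, _ = np.histogram(t, bins=bins)
    spec = np.abs(np.fft.rfft(rate - rate.mean()))     # drop the DC component
    freqs = np.fft.rfftfreq(len(rate), d=bin_ms / 1000.0)
    return freqs, spec
```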
Affiliation(s)
- Meriem Ben Miled: Department of Mechanical Engineering, University College London, London, UK
- Wenwen Liu: School of Automation, Nanjing University of Information Science and Technology, Nanjing, China
- Yuanchang Liu: Department of Mechanical Engineering, University College London, London, UK
13. Xu Y, Shidqi K, van Schaik GJ, Bilgic R, Dobrita A, Wang S, Meijer R, Nembhani P, Arjmand C, Martinello P, Gebregiorgis A, Hamdioui S, Detterer P, Traferro S, Konijnenburg M, Vadivel K, Sifalakis M, Tang G, Yousefzadeh A. Optimizing event-based neural networks on digital neuromorphic architecture: a comprehensive design space exploration. Front Neurosci 2024; 18:1335422. [PMID: 38606307] [PMCID: PMC11007209] [DOI: 10.3389/fnins.2024.1335422]
Abstract
Neuromorphic processors promise low-latency and energy-efficient processing by adopting novel brain-inspired design methodologies. Yet, current neuromorphic solutions still struggle to rival the performance and area efficiency of conventional deep learning accelerators in practical applications. Event-driven data-flow processing and near/in-memory computing are the two dominant design trends of neuromorphic processors. However, challenges remain in reducing the overhead of event-driven processing and increasing the mapping efficiency of near/in-memory computing, both of which directly impact performance and area efficiency. In this work, we discuss these challenges and present our exploration of optimizing event-based neural network inference on SENECA, a scalable and flexible neuromorphic architecture. To address the overhead of event-driven processing, we perform a comprehensive design space exploration and propose spike-grouping to reduce total energy and latency. Furthermore, we introduce event-driven depth-first convolution to improve area efficiency and latency for convolutional neural networks (CNNs) on the neuromorphic processor. We benchmarked our optimized solution on keyword spotting, sensor fusion, digit recognition, and high-resolution object detection tasks. Compared with other state-of-the-art large-scale neuromorphic processors, our proposed optimizations result in a 6× to 300× improvement in energy efficiency, a 3× to 15× improvement in latency, and a 3× to 100× improvement in area efficiency. Our optimizations for event-based neural networks can potentially be generalized to a wide range of event-based neuromorphic processors.
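The event-driven data-flow idea, where work scales with the number of events rather than the frame size, can be sketched as a scatter-style convolution; this illustrates the general principle, not SENECA's exact depth-first scheduling or spike-grouping:

```python
import numpy as np

def event_driven_conv(events, kernel, out_shape):
    """Scatter-style, event-driven convolution: only outputs touched by an
    incoming event are updated, so work scales with the event count rather
    than the frame size.

    events: iterable of (x, y, value); kernel: (k, k) correlation weights.
    """
    k = kernel.shape[0]
    r = k // 2
    H, W = out_shape
    out = np.zeros(out_shape)
    for x, y, v in events:
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    out[yy, xx] += v * kernel[dy + r, dx + r]
    return out
```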
Affiliation(s)
- Anteneh Gebregiorgis: Department of Dependable and Emerging Computer Technologies, Delft University of Technology, Delft, Netherlands
- Said Hamdioui: Department of Dependable and Emerging Computer Technologies, Delft University of Technology, Delft, Netherlands
- Amirreza Yousefzadeh: IMEC, Eindhoven, Netherlands; Department of Computer Architecture and Embedded Systems, University of Twente, Enschede, Netherlands
14. Dai Z, Fu Q, Peng J, Li H. SLoN: a spiking looming perception network exploiting neural encoding and processing in ON/OFF channels. Front Neurosci 2024; 18:1291053. [PMID: 38510466] [PMCID: PMC10950957] [DOI: 10.3389/fnins.2024.1291053]
Abstract
Looming perception, the ability to sense approaching objects, is crucial for the survival of humans and animals. After hundreds of millions of years of evolution, biological entities have developed efficient and robust looming perception visual systems. However, current artificial vision systems fall short of such capabilities. In this study, we propose a novel spiking neural network for looming perception that mimics biological vision by communicating motion information through action potentials, or spikes, providing a more realistic approach than previous artificial neural networks based on sum-then-activate operations. The proposed spiking looming perception network (SLoN) comprises three core components. A neural encoding scheme known as phase coding transforms video signals into spike trains, introducing the concept of phase delay to depict the spatial-temporal competition between phasic excitatory and inhibitory signals shaping looming selectivity. To align with biological substrates, where visual signals are bifurcated into parallel ON/OFF channels encoding brightness increments and decrements separately to achieve specific selectivity to ON/OFF-contrast stimuli, we implement eccentric down-sampling at the entrance of the ON/OFF channels, mimicking the foveal region of the mammalian receptive field with higher acuity to motion; this is computationally modeled with a leaky integrate-and-fire (LIF) neuronal network. The SLoN model is tested under various visual collision scenarios, ranging from synthetic to real-world stimuli. A notable achievement is that SLoN selectively spikes for looming features concealed in visual streams while ignoring other categories of movement, including translating, receding, grating, and near misses, demonstrating robust selectivity in line with biological principles. Additionally, the efficacy of the ON/OFF channels, the phase coding with delay, and the eccentric visual processing is further investigated to demonstrate their effectiveness in looming perception. The cornerstone of this study is showcasing a new paradigm for looming perception that is more biologically plausible in light of biological motion perception.
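The ON/OFF bifurcation at the network entrance can be sketched in a few lines: brightness increments and decrements are routed to separate non-negative channels. This shows only the encoding split, not the full SLoN model:

```python
import numpy as np

def on_off_channels(prev_frame, frame):
    """Split brightness changes into parallel ON/OFF channels.

    Mirrors the biological bifurcation described in the abstract: ON encodes
    brightness increments and OFF encodes decrements, each as a non-negative
    map that downstream LIF neurons can integrate.
    """
    diff = frame.astype(float) - prev_frame.astype(float)
    on = np.maximum(diff, 0.0)    # brightness increments
    off = np.maximum(-diff, 0.0)  # brightness decrements
    return on, off
```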
15. Cohen-Duwek H, Tsur EE. Colorful image reconstruction from neuromorphic event cameras with biologically inspired deep color fusion neural networks. Bioinspiration & Biomimetics 2024; 19:036001. [PMID: 38373337] [DOI: 10.1088/1748-3190/ad2a7c]
Abstract
Neuromorphic event-based cameras communicate transients in luminance instead of frames, providing visual information with fine temporal resolution, high dynamic range, and a high signal-to-noise ratio. Enriching event data with color information allows for the reconstruction of colorful frame-like intensity maps, supporting improved performance and visually appealing results in various computer vision tasks. In this work, we simulated a biologically inspired color fusion system featuring a three-stage convolutional neural network for reconstructing color intensity maps from event data and sparse color cues. While current approaches to color fusion use full RGB frames at high resolution, our design uses event data and quantized color cues of low spatial and tonal resolution, providing a high-performing small model for efficient colorful image reconstruction. The proposed model outperforms existing coloring schemes in terms of the SSIM, LPIPS, PSNR, and CIEDE2000 metrics. We demonstrate that auxiliary limited color information can be used in conjunction with event data to successfully reconstruct both color and intensity frames, paving the way for more efficient hardware designs.
Affiliation(s)
- Hadar Cohen-Duwek: The Neuro-Biomorphic Engineering Lab, Department of Mathematics and Computer Science, The Open University of Israel, Ra'anana, Israel
- Elishai Ezra Tsur: The Neuro-Biomorphic Engineering Lab, Department of Mathematics and Computer Science, The Open University of Israel, Ra'anana, Israel
16. Alkendi Y, Azzam R, Ayyad A, Javed S, Seneviratne L, Zweiri Y. Neuromorphic Camera Denoising Using Graph Neural Network-Driven Transformers. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:4110-4124. [PMID: 36107888] [DOI: 10.1109/tnnls.2022.3201830]
Abstract
Neuromorphic vision is a bio-inspired technology that has triggered a paradigm shift in the computer vision community and is serving as a key enabler for a wide range of applications. This technology offers significant advantages, including reduced power consumption, reduced processing needs, and communication speedups. However, neuromorphic cameras suffer from significant amounts of measurement noise, which deteriorates the performance of neuromorphic event-based perception and navigation algorithms. In this article, we propose a novel noise filtration algorithm to eliminate events that do not represent real log-intensity variations in the observed scene. We employ a graph neural network (GNN)-driven transformer algorithm, called GNN-Transformer, to classify every active event pixel in the raw stream as either a real log-intensity variation or noise. Within the GNN, a message-passing framework, referred to as EventConv, is carried out to reflect the spatiotemporal correlation among the events while preserving their asynchronous nature. We also introduce the known-object ground-truth labeling (KoGTL) approach for generating approximate ground-truth labels of event streams under various illumination conditions. KoGTL is used to generate labeled datasets from experiments recorded in challenging lighting conditions, including moonlight. These datasets are used to train and extensively test our proposed algorithm. When tested on unseen datasets, the proposed algorithm outperforms state-of-the-art methods by at least 8.8% in terms of filtration accuracy. Additional tests are also conducted on publicly available datasets (the ETH Zürich Color-DAVIS346 datasets) to demonstrate the generalization capabilities of the proposed algorithm in the presence of illumination variations and different motion dynamics. Compared to state-of-the-art solutions, qualitative results verified the superior capability of the proposed algorithm to eliminate noise while preserving meaningful events in the scene.
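For contrast with the learned GNN-Transformer, the classic spatiotemporal-correlation baseline keeps an event only if a neighboring pixel fired recently. The sketch below implements that baseline (sensor size and time window are assumed values), not the paper's method:

```python
import numpy as np

def background_activity_filter(events, dt_us=5000, radius=1, w=346, h=260):
    """Classic spatiotemporal-correlation denoiser (a common baseline): keep
    an event only if a pixel in its neighborhood fired within dt_us microseconds.

    events: (N, 3) int array of [x, y, t_us], sorted by time. The default
    346x260 size matches a DAVIS346 sensor; adjust for other cameras.
    """
    last = np.full((h, w), -np.inf)
    keep = np.zeros(len(events), dtype=bool)
    for i, (x, y, t) in enumerate(events):
        y0, y1 = max(y - radius, 0), min(y + radius + 1, h)
        x0, x1 = max(x - radius, 0), min(x + radius + 1, w)
        keep[i] = (t - last[y0:y1, x0:x1].max()) <= dt_us  # recent support nearby
        last[y, x] = t
    return events[keep]
```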
17. Bissarinova U, Rakhimzhanova T, Kenzhebalin D, Varol HA. Faces in Event Streams (FES): An Annotated Face Dataset for Event Cameras. Sensors (Basel) 2024; 24:1409. [PMID: 38474947] [DOI: 10.3390/s24051409]
Abstract
The use of event-based cameras in computer vision is a growing research direction. However, despite existing research on face detection with event cameras, a substantial gap persists in the availability of a large dataset featuring annotations for faces and facial landmarks on event streams, hampering the development of applications in this direction. In this work, we address this issue by publishing the first large and varied dataset (Faces in Event Streams), with a duration of 689 minutes, for face and facial landmark detection in direct event-based camera outputs. In addition, this article presents 12 models trained on our dataset to predict bounding box and facial landmark coordinates with an mAP50 score of more than 90%. We also demonstrated real-time detection with an event-based camera using our models.
Affiliation(s)
- Ulzhan Bissarinova: Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan
- Tomiris Rakhimzhanova: Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan
- Daulet Kenzhebalin: Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan
- Huseyin Atakan Varol: Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan
18. Xia H, Hou X, Zhang JZ. Long- and Short-Term Memory Model of Cotton Price Index Volatility Risk Based on Explainable Artificial Intelligence. Big Data 2024; 12:49-62. [PMID: 37976104] [DOI: 10.1089/big.2022.0287]
Abstract
Market uncertainty greatly interferes with the decisions and plans of market participants, increasing the risk of decision-making and compromising the interests of decision-makers. Cotton price index (hereinafter referred to as cotton price) volatility is highly noisy, nonlinear, and stochastic, and is susceptible to supply and demand, climate, substitutes, and other policy factors, which are subject to large uncertainties. To reduce decision risk and provide decision support for policymakers, this article integrates 13 factors affecting cotton price index volatility based on existing research and further divides them into transaction data and interaction data. A long short-term memory (LSTM) model is constructed, and a comparison experiment is implemented to analyze cotton price index volatility. To make the constructed model explainable, we use explainable artificial intelligence (XAI) techniques to perform statistical analysis of the input features. The experimental results show that the LSTM model can accurately analyze the cotton price index fluctuation trend but cannot accurately predict the actual price of cotton, and that transaction data combined with interaction data is more sensitive than transaction data alone in analyzing the fluctuation trend, having a positive effect on the fluctuation analysis. This study can accurately reflect the fluctuation trend of the cotton market, provide a reference for the state, enterprises, and cotton farmers in decision-making, and reduce the risk caused by frequent fluctuations in cotton prices. The analysis of the model using XAI techniques builds decision-makers' confidence in the model.
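A generic LSTM regressor of the family named in the abstract can be sketched in a few lines of PyTorch; the hidden size and the 13-feature input below are assumptions based on the text, not the authors' published configuration:

```python
import torch
import torch.nn as nn

class VolatilityLSTM(nn.Module):
    """Minimal sequence-to-one LSTM regressor for volatility-trend analysis.

    A generic sketch of the model family named in the abstract; the layer
    sizes are assumptions, and only the 13-feature input follows the text.
    """
    def __init__(self, n_features=13, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])  # predict from the last time step
```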
Affiliation(s)
- Huosong Xia: School of Management, Wuhan Textile University, Wuhan, China; Enterprise Decision Support Research Center of Key Institute of Humanities and Social Sciences, Wuhan Textile University, Wuhan, China; Institute of Management and Economics, Wuhan Textile University, Wuhan, China
- Xiaoyu Hou: School of Management, Wuhan Textile University, Wuhan, China
- Justin Zuopeng Zhang: Department of Management, Coggin College of Business, University of North Florida, Jacksonville, Florida, USA
19. Hwang GM, Simonian AL. Special Issue-Biosensors and Neuroscience: Is Biosensors Engineering Ready to Embrace Design Principles from Neuroscience? Biosensors 2024; 14:68. [PMID: 38391987] [PMCID: PMC10886788] [DOI: 10.3390/bios14020068]
Abstract
In partnership with the Air Force Office of Scientific Research (AFOSR), the National Science Foundation's (NSF) Emerging Frontiers and Multidisciplinary Activities (EFMA) office of the Directorate for Engineering (ENG) launched an Emerging Frontiers in Research and Innovation (EFRI) topic for the fiscal years FY22 and FY23 entitled "Brain-inspired Dynamics for Engineering Energy-Efficient Circuits and Artificial Intelligence" (BRAID) [...].
Affiliation(s)
- Grace M. Hwang: Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Road, Laurel, MD 20723, USA
20. Wu N, Hu W, Liu GP, Lei Z. Mathematically Improved XGBoost Algorithm for Truck Hoisting Detection in Container Unloading. Sensors (Basel) 2024; 24:839. [PMID: 38339556] [PMCID: PMC10856832] [DOI: 10.3390/s24030839]
Abstract
Truck hoisting detection constitutes a key focus in port security, for which no optimal solution has been identified. To address the high costs, susceptibility to weather conditions, and low accuracy of conventional truck hoisting detection methods, a non-intrusive detection approach is proposed in this paper. The proposed approach utilizes a mathematical model and an extreme gradient boosting (XGBoost) model. Electrical signals, including voltage and current, collected by Hall sensors are processed by the mathematical model, which augments their physical information. The dataset filtered by the mathematical model is then used to train the XGBoost model, enabling it to effectively identify abnormal hoists. These modifications improved the observed performance of the XGBoost model. Finally, experiments were conducted at several stations. The overall false positive rate did not exceed 0.7%, and no false negatives occurred in the experiments. The experimental results demonstrate the excellent performance of the proposed approach, which can reduce costs and improve detection accuracy in container hoisting.
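The back-end stage can be sketched with the standard XGBoost API; the feature set and labels below are synthetic stand-ins for the physics-informed features that the paper's mathematical model derives from Hall-sensor signals:

```python
import numpy as np
from xgboost import XGBClassifier

# Synthetic stand-in data: in the paper, features come from a mathematical
# model applied to Hall-sensor voltage and current (e.g., derived power terms).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))                   # hypothetical derived features
y = (X[:, 0] + 0.5 * X[:, 2] > 1).astype(int)    # stand-in "abnormal hoist" labels

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X, y)                                   # train the gradient-boosted trees
print("training accuracy:", model.score(X, y))
```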
Affiliation(s)
- Nian Wu: School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
- Wenshan Hu: School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
- Guo-Ping Liu: Center for Control Science and Technology, Southern University of Science and Technology, Shenzhen 518055, China
- Zhongcheng Lei: School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
21. Wu H, Li Y, Xu W, Kong F, Zhang F. Moving event detection from LiDAR point streams. Nat Commun 2024; 15:345. [PMID: 38184659] [PMCID: PMC10771495] [DOI: 10.1038/s41467-023-44554-8]
Abstract
In dynamic environments, robots require instantaneous detection of moving events with microsecond latency. This task, known as moving event detection, is typically achieved using event cameras. While light detection and ranging (LiDAR) sensors are essential for robots due to their dense and accurate depth measurements, their use in event detection has not been thoroughly explored. Current approaches involve accumulating LiDAR points into frames and detecting object-level motions, resulting in a latency of tens to hundreds of milliseconds. We present a different approach called M-detector, which determines whether a point is moving immediately upon its arrival, resulting in point-by-point detection with a latency of just several microseconds. M-detector is designed based on occlusion principles and can be used in different environments with various types of LiDAR sensors. Our experiments demonstrate the effectiveness of M-detector on various datasets and applications, showcasing its superior accuracy, computational efficiency, detection latency, and generalization ability.
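The occlusion principle admits a compact sketch: a return that lands closer than the previously observed depth along roughly the same ray implies a moving object. The angular grid and margin below are simplifying assumptions, not the M-detector design:

```python
import numpy as np

class OcclusionTest:
    """Point-wise moving-point test in the spirit of the occlusion principle:
    a new LiDAR return closer than the known background depth along (roughly)
    the same ray implies something moved into the line of sight.

    Minimal sketch with a coarse angular grid; not the paper's M-detector.
    """
    def __init__(self, az_bins=360, el_bins=64):
        self.depth = np.full((az_bins, el_bins), np.inf)
        self.az_bins, self.el_bins = az_bins, el_bins

    def update(self, point, margin=0.3):
        x, y, z = point
        r = np.sqrt(x * x + y * y + z * z)
        az = int((np.arctan2(y, x) + np.pi) / (2 * np.pi) * self.az_bins) % self.az_bins
        el = min(int((np.arcsin(z / max(r, 1e-9)) + np.pi / 2) / np.pi * self.el_bins),
                 self.el_bins - 1)
        moving = r < self.depth[az, el] - margin   # occludes the known background
        self.depth[az, el] = min(self.depth[az, el], r)
        return moving
```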
Affiliation(s)
- Huajie Wu: Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong, 999077, China
- Yihang Li: Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong, 999077, China
- Wei Xu: Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong, 999077, China
- Fanze Kong: Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong, 999077, China
- Fu Zhang: Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong, 999077, China
22. Li H, Wan B, Fang Y, Li Q, Liu JK, An L. An FPGA implementation of Bayesian inference with spiking neural networks. Front Neurosci 2024; 17:1291051. [PMID: 38249589] [PMCID: PMC10796689] [DOI: 10.3389/fnins.2023.1291051]
Abstract
Spiking neural networks (SNNs), as brain-inspired neural network models based on spikes, have the advantage of processing information with low complexity and efficient energy consumption. Currently, there is a growing trend toward designing hardware accelerators dedicated to SNNs to overcome the limitations of running under the traditional von Neumann architecture. Probabilistic sampling is an effective modeling approach for implementing SNNs that simulate the brain to achieve Bayesian inference. However, sampling consumes considerable time, so dedicated hardware implementations of SNN sampling models are in high demand to accelerate inference operations. Here, we design a hardware accelerator based on an FPGA to speed up the execution of SNN algorithms through parallelization. We use streaming pipelining and array partitioning to accelerate model operations with the least possible resource consumption, and we use the Python productivity for Zynq (PYNQ) framework to migrate the model to the FPGA while increasing its execution speed. We verify the functionality and performance of the hardware architecture on a Xilinx Zynq ZCU104. The experimental results show that the proposed hardware accelerator for the SNN sampling model can significantly improve computing speed while preserving inference accuracy. In addition, Bayesian inference for spiking neural networks through the PYNQ framework can fully exploit the high performance and low power consumption of FPGAs in embedded applications. Taken together, our proposed FPGA implementation of Bayesian inference with SNNs has great potential for a wide range of applications and is well suited to implementing complex probabilistic model inference in embedded systems.
Affiliation(s)
- Haoran Li: Guangzhou Institute of Technology, Xidian University, Guangzhou, China
- Bo Wan: School of Computer Science and Technology, Xidian University, Xi'an, China; Key Laboratory of Smart Human Computer Interaction and Wearable Technology of Shaanxi Province, Xi'an, China
- Ying Fang: College of Computer and Cyber Security, Fujian Normal University, Fuzhou, China; Digital Fujian Internet-of-Thing Laboratory of Environmental Monitoring, Fujian Normal University, Fuzhou, China
- Qifeng Li: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, National Engineering Research Center for Information Technology in Agriculture, Beijing, China
- Jian K. Liu: School of Computer Science, University of Birmingham, Birmingham, United Kingdom
- Lingling An: Guangzhou Institute of Technology, Xidian University, Guangzhou, China; School of Computer Science and Technology, Xidian University, Xi'an, China
23. Liu Y, Liu T, Hu Y, Liao W, Xing Y, Sheik S, Qiao N. Chip-In-Loop SNN Proxy Learning: a new method for efficient training of spiking neural networks. Front Neurosci 2024; 17:1323121. [PMID: 38239830] [PMCID: PMC10794440] [DOI: 10.3389/fnins.2023.1323121]
Abstract
The primary approaches used to train spiking neural networks (SNNs) involve either training artificial neural networks (ANNs) first and then transforming them into SNNs, or directly training SNNs using surrogate gradient techniques. Nevertheless, both of these methods encounter a shared challenge: they rely on frame-based methodologies, in which asynchronous events are gathered into synchronous frames for computation. This strays from the authentic asynchronous, event-driven nature of SNNs, resulting in notable performance degradation when the trained models are deployed on SNN simulators or hardware chips for real-time asynchronous computation. To eliminate this performance degradation, we propose a hardware-based SNN proxy learning method called Chip-In-Loop SNN Proxy Learning (CIL-SPL). This approach effectively eliminates the performance degradation caused by the mismatch between synchronous and asynchronous computations. To demonstrate the effectiveness of our method, we trained models on public datasets such as N-MNIST, tested them on an SNN simulator or hardware chip, and compared our results with those of classical training methods.
Affiliation(s)
- Yalun Hu: SynSense Co. Ltd., Chengdu, China
- Wei Liao: SynSense Co. Ltd., Chengdu, China
- Sadique Sheik: SynSense Co. Ltd., Chengdu, China; SynSense AG, Zurich, Switzerland
- Ning Qiao: SynSense Co. Ltd., Chengdu, China; SynSense AG, Zurich, Switzerland
24. Du Z, Gupta M, Xu F, Zhang K, Zhang J, Zhou Y, Liu Y, Wang Z, Wrachtrup J, Wong N, Li C, Chu Z. Widefield Diamond Quantum Sensing with Neuromorphic Vision Sensors. Advanced Science 2024; 11:e2304355. [PMID: 37939304] [PMCID: PMC10787069] [DOI: 10.1002/advs.202304355]
Abstract
Despite increasing interest in developing ultrasensitive widefield diamond magnetometry for various applications, achieving high temporal resolution and sensitivity simultaneously remains a key challenge. This is largely due to the transfer and processing of the massive amounts of data from the frame-based sensor needed to capture the widefield fluorescence intensity of spin defects in diamonds. In this study, a neuromorphic vision sensor is adopted to encode changes in fluorescence intensity into spikes during optically detected magnetic resonance (ODMR) measurements, closely resembling the operation of the human vision system. This leads to a highly compressed data volume and reduced latency, as well as a vast dynamic range, high temporal resolution, and an exceptional signal-to-background ratio. After a thorough theoretical evaluation, an experiment with an off-the-shelf event camera demonstrated a 13× improvement in temporal resolution, with precision in detecting ODMR resonance frequencies comparable to the state-of-the-art, highly specialized frame-based approach. We successfully deployed this technology to monitor dynamically modulated laser heating of gold nanoparticles coated on a diamond surface, a recognizably difficult task for existing approaches. This development provides new insights for high-precision, low-latency widefield quantum sensing, with possibilities for integration with emerging memory devices to realize more intelligent quantum sensors.
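Conceptually, the spike encoding reduces resonance finding to locating an extremum in per-frequency event counts. A toy sketch follows; the ON-minus-OFF counting is an assumed simplification of the paper's processing:

```python
import numpy as np

def odmr_resonance(freqs, net_event_counts):
    """Locate the ODMR resonance from event counts recorded per microwave step.

    freqs: swept microwave frequencies; net_event_counts: net ON-minus-OFF
    event counts per frequency step (an assumed stand-in for the paper's spike
    encoding). The resonance appears as the extremum of the contrast curve.
    """
    contrast = net_event_counts - np.median(net_event_counts)  # remove baseline
    return freqs[np.argmax(np.abs(contrast))]
```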
Affiliation(s)
- Zhiyuan Du: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Madhav Gupta: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Feng Xu: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Kai Zhang: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China; School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 518000, China
- Jiahua Zhang: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Yan Zhou: School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 518000, China
- Yiyao Liu: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, 510006, China
- Zhenyu Wang: Guangdong Provincial Key Laboratory of Quantum Engineering and Quantum Materials, School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, 510006, China; Frontier Research Institute for Physics, South China Normal University, Guangzhou, 510006, China
- Jörg Wrachtrup: 3rd Institute of Physics, Research Center SCoPE and IQST, University of Stuttgart, 70569, Stuttgart, Germany; Max Planck Institute for Solid State Research, 70569, Stuttgart, Germany
- Ngai Wong: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Can Li: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China
- Zhiqin Chu: Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, 999077, P. R. China; School of Biomedical Sciences, The University of Hong Kong, Hong Kong, 999077, P. R. China; Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, Hong Kong, 999077, P. R. China
Collapse
|
25
|
Fu J, Zhang Y, Li Y, Li J, Xiong Z. Fast 3D reconstruction via event-based structured light with spatio-temporal coding. OPTICS EXPRESS 2023; 31:44588-44602. [PMID: 38178526 DOI: 10.1364/oe.507688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Accepted: 11/26/2023] [Indexed: 01/06/2024]
Abstract
Event-based structured light (SL) systems leverage bio-inspired event cameras, renowned for their low latency and high dynamic range, to drive progress in high-speed structured light. However, existing event-based SL methods construct either time-domain or space-domain features independently for stereo matching, ignoring the spatio-temporal consistency of depth cues. In this work, we build an event-based SL system consisting of a laser point projector and an event camera, and we devise a spatio-temporal coding strategy that encodes depth in both domains through a single shot. To exploit this spatio-temporal synergy, we further present STEM, a novel Spatio-Temporal Enhanced Matching approach for 3D reconstruction. STEM comprises two parts: the spatio-temporal enhancing (STE) algorithm and the spatio-temporal matching (STM) algorithm. Specifically, STE integrates the dual-domain information to increase the saliency of the temporal coding, providing a more robust basis for matching. STM is a stereo matching algorithm explicitly tailored to the unique characteristics of the event data modality, computing disparity via a carefully designed hybrid cost function. Experimental results demonstrate the superior performance of our method, achieving a reconstruction rate of 16 fps and a low root mean square error of 0.56 mm at a distance of 0.72 m.
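A toy sketch of the temporal-coding half of such a system, under our own simplifying assumptions (a projector that scans columns at a constant, known rate): an event's timestamp identifies the projector column that lit it, and depth follows by triangulation. This is not the STEM algorithm itself, which additionally exploits spatial coding and a hybrid matching cost.

```python
import numpy as np

def temporal_decode(event_x, event_t, scan_start, col_period, baseline, focal):
    """event_x: camera column of each event (pixels);
    event_t: event timestamps (s);
    col_period: seconds the projector dwells on each column (assumed known)."""
    proj_col = (event_t - scan_start) / col_period      # temporal code -> column
    disparity = event_x - proj_col                      # camera vs projector
    # Triangulate; the clamp only guards the toy example against division by 0.
    depth = baseline * focal / np.maximum(disparity, 1e-6)
    return depth

# Toy usage: three events along one scanline.
depth = temporal_decode(np.array([310.0, 295.0, 280.0]),
                        np.array([1.0e-3, 1.2e-3, 1.4e-3]),
                        scan_start=0.0, col_period=5e-6,
                        baseline=0.05, focal=600.0)
```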
26
Lesage X, Tran R, Mancini S, Fesquet L. Velocity and Color Estimation Using Event-Based Clustering. SENSORS (BASEL, SWITZERLAND) 2023; 23:9768. PMID: 38139614. PMCID: PMC10747939. DOI: 10.3390/s23249768.
Abstract
Event-based clustering provides a low-power embedded solution for low-level feature extraction in a scene. The algorithm exploits the non-uniform sampling capability of event-based image sensors to measure local intensity variations within a scene, forming groups of similar events while simultaneously estimating their attributes. This work proposes taking advantage of additional event information to provide new attributes for further processing. We elaborate on estimating object velocity from the mean motion of a cluster. We then examine a novel form of events that includes an intensity measurement of the color at the concerned pixel; such events can be processed to estimate the rough color of a cluster, or the color distribution within a cluster. Lastly, this paper presents applications that utilize these features. The resulting algorithms are exercised with a custom event-based simulator that generates videos of outdoor scenes. The velocity estimation methods provide satisfactory results, with a trade-off between accuracy and convergence speed. Regarding color estimation, luminance proves challenging to estimate in the test cases, while chrominance is estimated precisely. The estimated quantities are adequate for accurately classifying objects into predefined categories.
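The following is a hedged sketch of velocity estimation from the mean motion of a cluster, using an exponential moving average of the centroid; the update rules and parameters are illustrative, not the paper's.

```python
import numpy as np

class EventCluster:
    """Toy per-cluster state: centroid and velocity, updated event by event."""

    def __init__(self, x, y, t, alpha=0.05):
        self.cx, self.cy, self.t = float(x), float(y), float(t)
        self.vx = self.vy = 0.0
        self.alpha = alpha                    # smoothing factor (assumption)

    def update(self, x, y, t):
        dt = max(t - self.t, 1e-9)            # guard against zero time step
        # Exponential moving average of the centroid position.
        new_cx = (1 - self.alpha) * self.cx + self.alpha * x
        new_cy = (1 - self.alpha) * self.cy + self.alpha * y
        # Velocity from centroid displacement, smoothed the same way.
        self.vx = (1 - self.alpha) * self.vx + self.alpha * (new_cx - self.cx) / dt
        self.vy = (1 - self.alpha) * self.vy + self.alpha * (new_cy - self.cy) / dt
        self.cx, self.cy, self.t = new_cx, new_cy, t

# Toy usage: a cluster drifting to the right at ~1000 px/s.
c = EventCluster(10.0, 20.0, 0.0)
for i in range(1, 200):
    c.update(10.0 + 0.001 * i * 1000, 20.0, 0.001 * i)
print(c.vx, c.vy)
```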
Affiliation(s)
- Xavier Lesage
- Univ. Grenoble Alpes, CNRS (National Centre for Scientific Research), Grenoble INP (Institute of Engineering), TIMA (Techniques of Informatics and Microelectronics for Integrated Systems Architecture), F-38000 Grenoble, France
- Orioma, F-38430 Moirans, France
- Rosalie Tran
- Univ. Grenoble Alpes, CNRS (National Centre for Scientific Research), Grenoble INP (Institute of Engineering), TIMA (Techniques of Informatics and Microelectronics for Integrated Systems Architecture), F-38000 Grenoble, France
- Stéphane Mancini
- Univ. Grenoble Alpes, CNRS (National Centre for Scientific Research), Grenoble INP (Institute of Engineering), TIMA (Techniques of Informatics and Microelectronics for Integrated Systems Architecture), F-38000 Grenoble, France
- Laurent Fesquet
- Univ. Grenoble Alpes, CNRS (National Centre for Scientific Research), Grenoble INP (Institute of Engineering), TIMA (Techniques of Informatics and Microelectronics for Integrated Systems Architecture), F-38000 Grenoble, France

27
Duan P, Ma Y, Zhou X, Shi X, Wang ZW, Huang T, Shi B. NeuroZoom: Denoising and Super Resolving Neuromorphic Events and Spikes. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:15219-15232. PMID: 37578915. DOI: 10.1109/tpami.2023.3304486.
Abstract
Neuromorphic cameras are an emerging imaging technology with advantages over conventional imaging sensors in several respects, including dynamic range, sensing latency, and power consumption. However, their signal-to-noise level and spatial resolution still fall behind the state of conventional imaging sensors. In this article, we address the denoising and super-resolution problem for modern neuromorphic cameras, employing a 3D U-Net as the backbone neural architecture. The networks are trained and tested on two types of neuromorphic cameras: a dynamic vision sensor, whose pixels asynchronously signal perceived light changes, and a spike camera, whose pixels asynchronously signal accumulated light intensity. To collect datasets for training such networks, we design a display-camera system that records high-frame-rate videos at multiple resolutions, providing supervision for denoising and super-resolution. The networks are trained in a noise-to-noise fashion, with unfiltered noisy data at both ends of the network. The output of the networks has been tested in downstream applications, including event-based visual object tracking and image reconstruction. Experimental results demonstrate the effectiveness of improving the quality of neuromorphic events and spikes, and the corresponding improvement to downstream applications with state-of-the-art performance.
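A minimal sketch of the noise-to-noise training loop described above, assuming two independently noisy captures of the same scene serve as input and target; the single Conv3d is a placeholder for the paper's 3D U-Net backbone.

```python
import torch
import torch.nn as nn

# `model` stands in for the 3D U-Net; a single Conv3d keeps the sketch short.
# Input and target are two independent noisy voxelizations of the same scene,
# shape (batch, 1, T, H, W) -- the noise2noise trick: no clean ground truth
# is ever needed.
model = nn.Conv3d(1, 1, kernel_size=3, padding=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(noisy_a, noisy_b):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(noisy_a), noisy_b)  # noisy target
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random stand-in data.
loss = train_step(torch.rand(2, 1, 8, 32, 32), torch.rand(2, 1, 8, 32, 32))
```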
28
Wu X, Song Y, Zhou Y, Jiang Y, Bai Y, Li X, Yang X. STCA-SNN: self-attention-based temporal-channel joint attention for spiking neural networks. Front Neurosci 2023; 17:1261543. PMID: 38027490. PMCID: PMC10667472. DOI: 10.3389/fnins.2023.1261543.
Abstract
Spiking Neural Networks (SNNs) have shown great promise in processing spatio-temporal information compared with Artificial Neural Networks (ANNs). However, a performance gap remains between SNNs and ANNs, which impedes the practical application of SNNs. With their intrinsic event-triggered properties and temporal dynamics, SNNs have the potential to effectively extract spatio-temporal features from event streams. To leverage this temporal potential, we propose a self-attention-based temporal-channel joint attention SNN (STCA-SNN) with end-to-end training, which infers attention weights along the temporal and channel dimensions concurrently. It models global temporal and channel information correlations with self-attention, enabling the network to learn 'what' and 'when' to attend simultaneously. Our experimental results show that STCA-SNNs achieve better performance on N-MNIST (99.67%), CIFAR10-DVS (81.6%), and N-Caltech 101 (80.88%) than state-of-the-art SNNs. Our ablation study further demonstrates that STCA-SNNs improve the accuracy of event stream classification tasks.
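A rough sketch of the joint temporal-channel attention idea in PyTorch, under our own assumptions (spatial pooling to one token per time step, sigmoid gating); the STCA-SNN module itself differs in detail.

```python
import torch
import torch.nn as nn

class TemporalChannelAttention(nn.Module):
    """Illustrative joint temporal-channel attention for an SNN feature
    tensor of shape (B, T, C, H, W); naming and design are assumptions."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                       # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        tokens = x.mean(dim=(3, 4))             # (B, T, C): one token per step
        mixed, _ = self.attn(tokens, tokens, tokens)  # global T-C correlations
        weights = torch.sigmoid(mixed)          # joint temporal-channel gates
        return x * weights.view(b, t, c, 1, 1)  # rescale 'when' (T), 'what' (C)

# Toy usage on random spiking feature maps.
x = torch.rand(2, 8, 32, 16, 16)
y = TemporalChannelAttention(32)(x)
```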
Affiliation(s)
- Yong Song
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China
- Ya Zhou
- School of Optics and Photonics, Beijing Institute of Technology, Beijing, China

29
Fu Y, Li M, Liu W, Wang Y, Zhang J, Yin B, Wei X, Yang X. Distractor-Aware Event-Based Tracking. IEEE TRANSACTIONS ON IMAGE PROCESSING 2023; 32:6129-6141. PMID: 37889807. DOI: 10.1109/tip.2023.3326683.
Abstract
Event cameras, or dynamic vision sensors, have recently achieved success in tasks ranging from fundamental vision to high-level vision research. Owing to their ability to asynchronously capture light intensity changes, event cameras have an inherent advantage in capturing moving objects in challenging scenarios, including objects under low light or high dynamic range, and fast-moving objects. Event cameras are therefore a natural fit for visual object tracking. However, current event-based trackers derived from RGB trackers simply convert the input images to event frames and still follow the conventional tracking pipeline, which mainly focuses on object texture for target distinction. As a result, these trackers may not be robust in challenging scenarios such as moving cameras and cluttered foregrounds. In this paper, we propose a distractor-aware event-based tracker, named DANet, that introduces transformer modules into a Siamese network architecture. Specifically, our model is composed of a motion-aware network and a target-aware network, which jointly exploit motion cues and object contours from event data so as to discover moving objects and identify the target by removing dynamic distractors. DANet can be trained end to end without any post-processing and runs at over 80 FPS on a single V100 GPU. We conduct comprehensive experiments on two large event-tracking datasets to validate the proposed model and demonstrate that our tracker outperforms state-of-the-art trackers in both accuracy and efficiency.
30
Stanojevic A, Woźniak S, Bellec G, Cherubini G, Pantazi A, Gerstner W. An exact mapping from ReLU networks to spiking neural networks. Neural Netw 2023; 168:74-88. PMID: 37742533. DOI: 10.1016/j.neunet.2023.09.011.
Abstract
Deep spiking neural networks (SNNs) offer the promise of low-power artificial intelligence. However, training deep SNNs from scratch, or converting deep artificial neural networks to SNNs without loss of performance, has been a challenge. Here we propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron. For our constructive proof, we assume that an arbitrary multi-layer ReLU network, with or without convolutional layers, batch normalization, and max pooling layers, was trained to high performance on some training set. Furthermore, we assume access to a representative example of the input data used during training and to the exact parameters (weights and biases) of the trained ReLU network. The mapping from deep ReLU networks to SNNs causes a zero percent drop in accuracy on CIFAR10, CIFAR100, and the ImageNet-like datasets Places365 and PASS. More generally, our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
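The core encoding idea can be illustrated in a few lines (our simplification of the constructive proof, not the paper's full mapping): a ReLU activation maps to a single spike time, with larger activations firing earlier, so the encoding is exactly invertible.

```python
import numpy as np

# Time-to-first-spike encoding: a nonnegative activation a is represented by
# one spike at time t = t_max - a. Earlier spike => larger value, and the
# round trip is lossless, which is what makes an exact mapping possible.
t_max = 10.0

def relu_layer(x, w, b):
    return np.maximum(w @ x + b, 0.0)

def to_spike_times(a):
    return t_max - a           # one spike per neuron

def from_spike_times(t):
    return t_max - t           # exact inverse of the encoding

a = relu_layer(np.array([0.2, 1.5]),
               np.array([[1.0, -0.5], [0.3, 0.8]]), 0.0)
assert np.allclose(from_spike_times(to_spike_times(a)), a)  # lossless
```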
Affiliation(s)
- Ana Stanojevic
- IBM Research Europe - Zurich, Rüschlikon, Switzerland; École polytechnique fédérale de Lausanne, School of Life Sciences and School of Computer and Communication Sciences, Lausanne EPFL, Switzerland
- Guillaume Bellec
- École polytechnique fédérale de Lausanne, School of Life Sciences and School of Computer and Communication Sciences, Lausanne EPFL, Switzerland
- Wulfram Gerstner
- École polytechnique fédérale de Lausanne, School of Life Sciences and School of Computer and Communication Sciences, Lausanne EPFL, Switzerland

31
Li D, Tian Y, Li J. SODFormer: Streaming Object Detection With Transformer Using Events and Frames. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:14020-14037. PMID: 37494161. DOI: 10.1109/tpami.2023.3298925.
Abstract
The DAVIS camera, which streams two complementary sensing modalities (asynchronous events and frames), has gradually been used to address major object detection challenges (e.g., fast motion blur and low light). However, effectively leveraging rich temporal cues and fusing two heterogeneous visual streams remains challenging. To address this challenge, we propose SODFormer, a novel streaming object detector with Transformer, which integrates events and frames to continuously detect objects in an asynchronous manner. Technically, we first build a large-scale multimodal neuromorphic object detection dataset (PKU-DAVIS-SOD) with over 1080.1k manual labels. We then design a spatiotemporal Transformer architecture that detects objects as an end-to-end sequence prediction problem, where a novel temporal Transformer module leverages rich temporal cues from the two visual streams to improve detection performance. Finally, we propose an asynchronous attention-based fusion module that integrates the two heterogeneous sensing modalities, takes complementary advantage of each, and can be queried at any time to locate objects, breaking through the limited output frequency of synchronized frame-based fusion strategies. The results show that SODFormer outperforms four state-of-the-art methods and our eight baselines by a significant margin. We also show that our unifying framework works well even where conventional frame-based cameras fail, e.g., under high-speed motion and low-light conditions. Our dataset and code are available at https://github.com/dianzl/SODFormer.
32
Lu Z, Chen X, Chung VYY, Cai W, Shen Y. EV-LFV: Synthesizing Light Field Event Streams from an Event Camera and Multiple RGB Cameras. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2023; 29:4546-4555. PMID: 37788211. DOI: 10.1109/tvcg.2023.3320271.
Abstract
Light field videos captured in RGB frames (RGB-LFV) can provide users with a six-degree-of-freedom immersive video experience by capturing dense multi-subview video. Despite these potential benefits, processing dense multi-subview video is extremely resource-intensive, which currently limits the frame rate of RGB-LFV to below 30 fps and results in blurred frames when capturing fast motion. To address this issue, we propose leveraging event cameras, which provide high temporal resolution for capturing fast motion. However, the cost of current event camera models makes it prohibitive to use multiple event cameras on RGB-LFV platforms. We therefore propose EV-LFV, an event synthesis framework that generates full multi-subview event-based RGB-LFV with only one event camera and multiple traditional RGB cameras. EV-LFV utilizes spatial-angular convolution, ConvLSTM, and Transformer modules to model RGB-LFV's angular features, temporal features, and long-range dependencies, respectively, effectively synthesizing event streams for RGB-LFV. To train EV-LFV, we construct the first event-to-LFV dataset, consisting of 200 RGB-LFV sequences with ground-truth event streams. Experimental results demonstrate that EV-LFV outperforms state-of-the-art event synthesis methods for generating event-based RGB-LFV, effectively alleviating motion blur in the reconstructed RGB-LFV.
33
Hou X, Zhang F, Gulati D, Tan T, Zhang W. E2VIDX: improved bridge between conventional vision and bionic vision. Front Neurorobot 2023; 17:1277160. PMID: 37954492. PMCID: PMC10639115. DOI: 10.3389/fnbot.2023.1277160.
Abstract
Common RGBD, CMOS, and CCD-based cameras produce motion blur and incorrect exposure under high-speed motion and improper lighting conditions. Built on bionic principles, event cameras offer low delay, high dynamic range, and no motion blur. However, their unique data representation poses significant obstacles in practical applications. Image reconstruction algorithms for event cameras solve this problem by converting a series of "events" into common frames so that existing vision algorithms can be applied. Owing to the rapid development of neural networks, this field has made significant breakthroughs in the past few years. Building on the popular Events-to-Video (E2VID) method, this study designs a new network called E2VIDX. The proposed network includes group convolution and sub-pixel convolution, which not only achieve better feature fusion but also reduce the network model size by 25%. Furthermore, we propose a new loss function with two parts: the first computed on high-level features and the second on low-level features of the reconstructed image. The experimental results clearly outperform the state-of-the-art method: compared with the original method, Structural Similarity (SSIM) increases by 1.3%, Learned Perceptual Image Patch Similarity (LPIPS) decreases by 1.7%, Mean Squared Error (MSE) decreases by 2.5%, and the network runs faster on both GPU and CPU. Additionally, we evaluate E2VIDX on image classification, object detection, and instance segmentation. The experiments show that conversions using our method allow event cameras to directly apply existing vision algorithms in most scenarios.
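To illustrate the two named building blocks, here is a hedged sketch of a grouped-convolution fusion step followed by sub-pixel (pixel-shuffle) upsampling; channel counts and group sizes are our guesses, not the E2VIDX configuration.

```python
import torch
import torch.nn as nn

class GroupSubPixelUp(nn.Module):
    """Illustrative block: grouped conv for cheaper feature fusion, then
    sub-pixel convolution (conv + PixelShuffle) for 2x upsampling."""

    def __init__(self, in_ch=32, out_ch=16, scale=2, groups=4):
        super().__init__()
        # Grouped convolution: each group mixes only in_ch/groups channels,
        # cutting parameters roughly by the group count.
        self.fuse = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=groups)
        self.up = nn.Sequential(
            nn.Conv2d(in_ch, out_ch * scale**2, 3, padding=1),
            nn.PixelShuffle(scale),       # rearranges channels into space
        )

    def forward(self, x):
        return self.up(torch.relu(self.fuse(x)))

# Toy usage on a random event feature map.
x = torch.rand(1, 32, 45, 60)
y = GroupSubPixelUp()(x)                  # -> shape (1, 16, 90, 120)
```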
Affiliation(s)
- Xujia Hou
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
- Feihu Zhang
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
- Tingfeng Tan
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
- Wei Zhang
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China

34
Huang PY, Jiang BY, Chen HJ, Xu JY, Wang K, Zhu CY, Hu XY, Li D, Zhen L, Zhou FC, Qin JK, Xu CY. Neuro-inspired optical sensor array for high-accuracy static image recognition and dynamic trace extraction. Nat Commun 2023; 14:6736. PMID: 37872169. PMCID: PMC10593955. DOI: 10.1038/s41467-023-42488-9.
Abstract
Neuro-inspired vision systems hold great promise for addressing the growing demands of mass data processing in edge computing, a distributed framework that brings computation and data storage closer to the sources of data. Beyond static image sensing and processing, the hardware implementation of a neuro-inspired vision system also requires the ability to detect and recognize moving targets. Here, we demonstrate a neuro-inspired optical sensor based on two-dimensional NbS2/MoS2 hybrid films, which features remarkable photo-induced conductance plasticity and low electrical energy consumption. A neuro-inspired optical sensor array with 10 × 10 NbS2/MoS2 phototransistors enables highly integrated sensing, memory, and contrast-enhancement capabilities for static images, benefiting convolutional neural network (CNN) classification with high image-recognition accuracy. More importantly, in-sensor trajectory registration of moving light spots is implemented experimentally such that post-processing can yield high restoration accuracy. Our neuro-inspired optical sensor array could provide a fascinating platform for implementing high-performance artificial vision systems.
Affiliation(s)
- Pei-Yu Huang
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- Bi-Yi Jiang
- School of Microelectronics, Southern University of Science and Technology, Shenzhen, 518055, China
- Department of Applied Physics, The Hong Kong Polytechnic University, Hong Kong, 999077, China
- Hong-Ji Chen
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- Jia-Yi Xu
- School of Microelectronics, Southern University of Science and Technology, Shenzhen, 518055, China
- Kang Wang
- Key Laboratory of MEMS of the Ministry of Education, Southeast University, Nanjing, 210096, China
- Cheng-Yi Zhu
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- Xin-Yan Hu
- School of Microelectronics, Southern University of Science and Technology, Shenzhen, 518055, China
- Dong Li
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- Liang Zhen
- MOE Key Laboratory of Micro-Systems and Micro-Structures Manufacturing, Harbin Institute of Technology, Harbin, 150080, China
- Fei-Chi Zhou
- School of Microelectronics, Southern University of Science and Technology, Shenzhen, 518055, China
- Jing-Kai Qin
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- Cheng-Yan Xu
- Sauvage Laboratory for Smart Materials, School of Materials Science and Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, China
- MOE Key Laboratory of Micro-Systems and Micro-Structures Manufacturing, Harbin Institute of Technology, Harbin, 150080, China

35
Cohen K, Hershko O, Levy H, Mendlovic D, Raviv D. Illumination-Based Color Reconstruction for the Dynamic Vision Sensor. SENSORS (BASEL, SWITZERLAND) 2023; 23:8327. PMID: 37837157. PMCID: PMC10575428. DOI: 10.3390/s23198327.
Abstract
This work demonstrates a novel, state-of-the-art method for reconstructing colored images with the dynamic vision sensor (DVS). The DVS is an image sensor that reports only binary changes in brightness, with no information about the captured wavelength (color) or intensity level. However, reconstructing the scene's color can be essential for many tasks in computer vision with the DVS. We present a novel method for reconstructing a full-spatial-resolution colored image using the DVS and an active colored light source. We analyze the DVS response and present two reconstruction algorithms: one linear and one based on a convolutional neural network. Both reconstruct the colored image with high quality and do not suffer from the spatial-resolution degradation of other methods. In addition, we demonstrate the robustness of our algorithm to changes in environmental conditions such as illumination and distance. Finally, we show how we reach state-of-the-art results compared with previous works. Our code is shared on GitHub.
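A simplified sketch of the linear variant's spirit, under our own assumptions (not the authors' algorithm): per-pixel event counts triggered by pulses of a known colored light source are treated as a linear mixture of the RGB reflectance and solved by least squares with a calibrated mixing matrix.

```python
import numpy as np

def reconstruct_rgb(counts, mixing):
    """counts: (3, H, W) per-pixel event counts under red, green, and blue
    illumination pulses; mixing: (3, 3) calibration matrix mapping RGB
    reflectance to expected counts (assumed measured beforehand)."""
    h, w = counts.shape[1:]
    flat = counts.reshape(3, -1)                        # (3, H*W)
    rgb, *_ = np.linalg.lstsq(mixing, flat, rcond=None) # solve per pixel
    return np.clip(rgb.reshape(3, h, w), 0, None)       # reflectance >= 0

# Toy usage with an identity-ish calibration.
counts = np.random.poisson(5.0, size=(3, 4, 4)).astype(float)
mixing = np.eye(3) + 0.1
img = reconstruct_rgb(counts, mixing)
```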
Affiliation(s)
- Dan Raviv
- The Faculty of Engineering, Department of Physical Electronics, Tel Aviv University, Tel Aviv 69978, Israel

36
Grimaldi A, Perrinet LU. Learning heterogeneous delays in a layer of spiking neurons for fast motion detection. BIOLOGICAL CYBERNETICS 2023; 117:373-387. PMID: 37695359. DOI: 10.1007/s00422-023-00975-8.
Abstract
The precise timing of spikes emitted by neurons plays a crucial role in shaping the response of efferent biological neurons. This temporal dimension of neural activity holds significant importance in understanding information processing in neurobiology, especially for the performance of neuromorphic hardware, such as event-based cameras. Nonetheless, many artificial neural models disregard this critical temporal dimension of neural activity. In this study, we present a model designed to efficiently detect temporal spiking motifs using a layer of spiking neurons equipped with heterogeneous synaptic delays. Our model capitalizes on the diverse synaptic delays present on the dendritic tree, enabling specific arrangements of temporally precise synaptic inputs to synchronize upon reaching the basal dendritic tree. We formalize this process as a time-invariant logistic regression, which can be trained using labeled data. To demonstrate its practical efficacy, we apply the model to naturalistic videos transformed into event streams, simulating the output of the biological retina or event-based cameras. To evaluate the robustness of the model in detecting visual motion, we conduct experiments by selectively pruning weights and demonstrate that the model remains efficient even under significantly reduced workloads. In conclusion, by providing a comprehensive, event-driven computational building block, the incorporation of heterogeneous delays has the potential to greatly improve the performance of future spiking neural network algorithms, particularly in the context of neuromorphic chips.
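A toy version of the mechanism described above: shifting each input spike train by its synaptic delay turns motif detection into a time-invariant logistic regression, as sketched below (our illustration; the paper's model and training procedure differ in detail).

```python
import numpy as np

def delayed_features(spikes, delays):
    """spikes: (n_inputs, T) binary raster; delays: (n_inputs,) in time steps.
    Each input is shifted by its heterogeneous delay so that a temporally
    precise motif lines up at a single time step."""
    out = np.zeros_like(spikes, dtype=float)
    for i, d in enumerate(delays):
        out[i, d:] = spikes[i, : spikes.shape[1] - d]
    return out

def detect(spikes, delays, weights, bias):
    # Weighted sum over delayed inputs = time-invariant logistic regression.
    z = weights @ delayed_features(spikes, delays) + bias   # shape (T,)
    p = 1.0 / (1.0 + np.exp(-z))
    return p > 0.5            # output spike wherever the motif is detected

# Toy usage: a 3-input motif with delays (2, 1, 0) aligning at t = 5.
spikes = np.zeros((3, 12))
spikes[0, 3] = spikes[1, 4] = spikes[2, 5] = 1.0
hits = detect(spikes, np.array([2, 1, 0]), np.full(3, 4.0), -10.0)
```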
Affiliation(s)
- Antoine Grimaldi
- Institut de Neurosciences de la Timone, Aix Marseille Univ, CNRS, 27 boulevard Jean Moulin, 13005, Marseille, France
- Laurent U Perrinet
- Institut de Neurosciences de la Timone, Aix Marseille Univ, CNRS, 27 boulevard Jean Moulin, 13005, Marseille, France

37
Liu S, Dragotti PL. Sensing Diversity and Sparsity Models for Event Generation and Video Reconstruction from Events. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:12444-12458. PMID: 37216257. DOI: 10.1109/tpami.2023.3278940.
Abstract
Events-to-video (E2V) reconstruction and video-to-events (V2E) simulation are two fundamental research topics in event-based vision. Current deep neural networks for E2V reconstruction are usually complex and difficult to interpret. Moreover, existing event simulators are designed to generate realistic events, but research on how to improve the event generation process has so far been limited. In this paper, we propose a lightweight, simple, model-based deep network for E2V reconstruction, explore sensing diversity across adjacent pixels in V2E generation, and finally build a video-to-events-to-video (V2E2V) architecture to validate how alternative event generation strategies improve video reconstruction. For E2V reconstruction, we model the relationship between events and intensity using sparse representation models. A convolutional ISTA network (CISTA) is then designed using the algorithm unfolding strategy, and long short-term temporal consistency (LSTC) constraints are introduced to enhance temporal coherence. In V2E generation, we introduce the idea of interleaving pixels with different contrast thresholds and lowpass bandwidths, conjecturing that this helps extract more useful information from intensity. Finally, the V2E2V architecture is used to verify the effectiveness of this strategy. Results highlight that our CISTA-LSTC network outperforms state-of-the-art methods and achieves better temporal consistency. Sensing diversity in event generation reveals finer details and leads to significantly improved reconstruction quality.
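For readers unfamiliar with algorithm unfolding, the sketch below shows a generic unfolded ISTA iteration for the sparse model y ≈ Az; in a CISTA-style network each iteration becomes a layer with learnable (convolutional) operators and thresholds. This is a textbook ISTA loop, not the paper's trained network.

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of the L1 norm: shrinks coefficients toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista_unfolded(y, A, n_layers=10, lam=0.1):
    """Solve min_z 0.5*||y - A z||^2 + lam*||z||_1 by n_layers ISTA steps.
    In an unfolded network, each step is a layer and A, lam become learnable."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of A^T A
    z = np.zeros(A.shape[1])
    for _ in range(n_layers):
        z = soft_threshold(z + A.T @ (y - A @ z) / L, lam / L)
    return z

# Toy usage: recover a sparse code from a random dictionary.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50))
z_true = np.zeros(50); z_true[[3, 17, 42]] = [1.0, -0.5, 2.0]
z_hat = ista_unfolded(A @ z_true, A, n_layers=200, lam=0.05)
```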
38
Ryoo W, Nam G, Hyun JS, Kim S. Event fusion photometric stereo network. Neural Netw 2023; 167:141-158. PMID: 37657253. DOI: 10.1016/j.neunet.2023.08.009.
Abstract
Photometric stereo methods typically rely on RGB cameras and are usually performed in a dark room to avoid ambient illumination. Ambient illumination poses a great challenge in photometric stereo due to the restricted dynamic range of the RGB cameras. To address this limitation, we present a novel method, namely Event Fusion Photometric Stereo Network (EFPS-Net), which estimates the surface normals of an object in an ambient light environment by utilizing a deep fusion of RGB and event cameras. The high dynamic range of event cameras provides a broader perspective of light representations that RGB cameras cannot provide. Specifically, we propose an event interpolation method to obtain ample light information, which enables precise estimation of the surface normals of an object. By using RGB-event fused observation maps, our EFPS-Net outperforms previous state-of-the-art methods that depend only on RGB frames, resulting in a 7.94% reduction in mean average error. In addition, we curate a novel photometric stereo dataset by capturing objects with RGB and event cameras under numerous ambient light environments.
Affiliation(s)
- Wonjeong Ryoo
- Department of Artificial Intelligence, Korea University, Seoul, 02841, South Korea
- Jae-Sang Hyun
- School of Mechanical Engineering, Yonsei University, Seoul, 03722, South Korea
- Sangpil Kim
- Department of Artificial Intelligence, Korea University, Seoul, 02841, South Korea

39
Schmid D, Jarvers C, Neumann H. Canonical circuit computations for computer vision. BIOLOGICAL CYBERNETICS 2023; 117:299-329. PMID: 37306782. PMCID: PMC10600314. DOI: 10.1007/s00422-023-00966-9.
Abstract
Advanced computer vision mechanisms have been inspired by neuroscientific findings. However, with the focus on improving benchmark performance, technical solutions have been shaped by application and engineering constraints. This includes the training of neural networks, which has led to feature detectors optimally suited to the application domain. The limitations of such approaches, however, motivate the need to identify computational principles, or motifs, in biological vision that can enable further foundational advances in machine vision. We propose to utilize structural and functional principles of neural systems that have been largely overlooked and that potentially provide new inspiration for computer vision mechanisms and models. Recurrent feedforward, lateral, and feedback interactions characterize general principles underlying processing in mammals. We derive a formal specification of core computational motifs that utilize these principles and combine them to define model mechanisms for visual shape and motion processing. We demonstrate how such a framework can be adopted to run on neuromorphic brain-inspired hardware platforms and can be extended to automatically adapt to environment statistics. We argue that the identified principles and their formalization inspire sophisticated computational mechanisms with improved explanatory scope, and that these and other elaborated, biologically inspired models can be employed to design computer vision solutions for different tasks and to advance neural network architectures of learning.
Affiliation(s)
- Daniel Schmid
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany
- Christian Jarvers
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany
- Heiko Neumann
- Institute for Neural Information Processing, Ulm University, James-Franck-Ring, Ulm, 89081 Germany

40
Pan X, Shi J, Wang P, Wang S, Pan C, Yu W, Cheng B, Liang SJ, Miao F. Parallel perception of visual motion using light-tunable memory matrix. SCIENCE ADVANCES 2023; 9:eadi4083. PMID: 37774015. PMCID: PMC10541003. DOI: 10.1126/sciadv.adi4083.
Abstract
Parallel perception of visual motion is of crucial significance to the development of intelligent machine vision systems. However, implementing in-sensor parallel visual motion perception with conventional complementary metal-oxide-semiconductor technology is challenging, because the temporal and spatial information embedded in motion cannot be simultaneously encoded and perceived at the sensory level. Here, we demonstrate parallel perception of diverse motion modes at the sensor level by exploiting a light-tunable memory matrix in a van der Waals (vdW) heterostructure array. The optoelectronic characteristics of gate-tunable photoconductivity and the light-tunable memory matrix enable devices in the array to simultaneously encode and process incoming spatiotemporal light patterns. Furthermore, we implement a visual motion perceptron with the array that is capable of deciphering multiple motion parameters in parallel, including direction, velocity, acceleration, and angular velocity. Our work opens up a promising avenue for realizing intelligent machine vision systems based on in-sensor motion perception.
Affiliation(s)
- Xuan Pan
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Jingwen Shi
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Pengfei Wang
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Shuang Wang
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Chen Pan
- Institute of Interdisciplinary Physical Sciences, School of Science, Nanjing University of Science and Technology, Nanjing 210094, China
- Wentao Yu
- Institute of Interdisciplinary Physical Sciences, School of Science, Nanjing University of Science and Technology, Nanjing 210094, China
- Bin Cheng
- Institute of Interdisciplinary Physical Sciences, School of Science, Nanjing University of Science and Technology, Nanjing 210094, China
- Shi-Jun Liang
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
- Feng Miao
- Institute of Brain-Inspired Intelligence, National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China

41
Bitar A, Rosales R, Paulitsch M. Gradient-based feature-attribution explainability methods for spiking neural networks. Front Neurosci 2023; 17:1153999. PMID: 37829721. PMCID: PMC10565802. DOI: 10.3389/fnins.2023.1153999.
Abstract
Introduction: Spiking neural networks (SNNs) are a model of computation that mimics the behavior of biological neurons. SNNs process event data (spikes) and operate more sparsely than artificial neural networks (ANNs), resulting in ultra-low latency and small power consumption. This paper aims to adapt and evaluate gradient-based explainability methods for SNNs, which were originally developed for conventional ANNs. Methods: The adapted methods create input feature attribution maps for SNNs trained through backpropagation that process either event-based spiking data or real-valued data. They address the limitations of existing work on explainability for SNNs, such as poor scalability, restriction to convolutional layers, the need to train an additional model, and maps of activation values instead of true attribution scores. The adapted methods are evaluated on classification tasks for both real-valued and spiking data, and their accuracy is confirmed through perturbation experiments at the pixel and spike levels. Results and discussion: The results reveal that gradient-based SNN attribution methods successfully identify highly contributing pixels and spikes with significantly less computation time than model-agnostic methods. Additionally, we observe that the chosen coding technique has a noticeable effect on which input features are most significant. These findings demonstrate the potential of gradient-based explainability methods for SNNs to improve our understanding of how these networks process information and to contribute to the development of more efficient and accurate SNNs.
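A minimal sketch of the ingredients involved, with our own illustrative choices (a single leaky integrate-and-fire layer and a fast-sigmoid surrogate derivative): the attribution map is the gradient of the output with respect to the input spikes.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; a smooth fast-sigmoid surrogate
    derivative in the backward pass (a common choice, assumed here)."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 1.0).float()
    @staticmethod
    def backward(ctx, g):
        (v,) = ctx.saved_tensors
        return g / (1.0 + 10.0 * (v - 1.0).abs()) ** 2

def attribution_map(x, w, beta=0.9):
    """x: (T, n_in) input spike raster; w: (n_in, n_out) weights.
    Returns the gradient of the total spike count w.r.t. each input spike."""
    x = x.clone().requires_grad_(True)
    v = torch.zeros(w.shape[1])
    out = 0.0
    for t in range(x.shape[0]):
        v = beta * v + x[t] @ w            # leaky integrate
        s = SurrogateSpike.apply(v)        # fire (surrogate gradient)
        v = v - s                          # soft reset
        out = out + s.sum()
    out.backward()
    return x.grad                          # per-spike attribution scores

x = (torch.rand(20, 8) < 0.2).float()
attr = attribution_map(x, torch.randn(8, 4) * 0.5)
```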
Affiliation(s)
- Ammar Bitar
- Intel Labs, Munich, Germany
- Department of Knowledge Engineering, Maastricht University, Maastricht, Netherlands

42
Pham MD, D’Angiulli A, Dehnavi MM, Chhabra R. From Brain Models to Robotic Embodied Cognition: How Does Biological Plausibility Inform Neuromorphic Systems? Brain Sci 2023; 13:1316. PMID: 37759917. PMCID: PMC10526461. DOI: 10.3390/brainsci13091316.
Abstract
We examine the challenging "marriage" between computational efficiency and biological plausibility, a crucial node in the domain of spiking neural networks at the intersection of neuroscience, artificial intelligence, and robotics. Through a transdisciplinary review, we retrace the historical and most recent constraining influences that these parallel fields have exerted on descriptive analysis of the brain, construction of predictive brain models, and, ultimately, the embodiment of neural networks in an enacted robotic agent. We study models of Spiking Neural Networks (SNN) as the central means enabling autonomous and intelligent behaviors in biological systems. We then provide a critical comparison of the available hardware and software for emulating SNNs, both to investigate biological entities and to apply them to artificial systems. Neuromorphics is identified as a promising tool for embodying SNNs in real physical systems, and different neuromorphic chips are compared. The concepts required for describing SNNs are dissected and contextualized in the new no man's land between cognitive neuroscience and artificial intelligence. Although there are recent reviews on the application of neuromorphic computing to modules of guidance, navigation, and control in robotic systems, the focus of this paper is on closing the cognition loop in SNN-embodied robotics. We argue that biologically viable spiking neuronal models used for electroencephalogram signals are excellent candidates for furthering our knowledge of the explainability of SNNs. We complete our survey by reviewing different robotic modules that can benefit from neuromorphic hardware, e.g., perception (with a focus on vision), localization, and cognition. We conclude that the trade-off between symbolic computational power and biological plausibility of hardware is best addressed by neuromorphics, whose presence in neurorobotics provides an accountable empirical testbench for investigating synthetic and natural embodied cognition. We argue this is where both theoretical and empirical future work should converge in multidisciplinary efforts involving neuroscience, artificial intelligence, and robotics.
Affiliation(s)
- Martin Do Pham
- Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada
- Amedeo D’Angiulli
- Department of Neuroscience, Carleton University, Ottawa, ON K1S 5B6, Canada
- Maryam Mehri Dehnavi
- Department of Computer Science, University of Toronto, Toronto, ON M5S 1A1, Canada
- Robin Chhabra
- Department of Mechanical and Aerospace Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada

43
Zhu L, Mangan M, Webb B. Neuromorphic sequence learning with an event camera on routes through vegetation. Sci Robot 2023; 8:eadg3679. PMID: 37756384. DOI: 10.1126/scirobotics.adg3679.
Abstract
For many robotics applications, it is desirable to have relatively low-power and efficient onboard solutions. We took inspiration from insects, such as ants, that are capable of learning and following routes in complex natural environments using relatively constrained sensory and neural systems. Such capabilities are particularly relevant to applications such as agricultural robotics, where visual navigation through dense vegetation remains a challenging task. In this scenario, a route is likely to have high self-similarity and be subject to changing lighting conditions and motion over uneven terrain, and the effects of wind on leaves increase the variability of the input. We used a bioinspired event camera on a terrestrial robot to collect visual sequences along routes in natural outdoor environments and applied a neural algorithm for spatiotemporal memory that is closely based on a known neural circuit in the insect brain. We show that this method can plausibly support route recognition for visual navigation and is more robust than SeqSLAM when evaluated on repeated runs along the same route or on routes with small lateral offsets. By encoding memory in a spiking neural network running on a neuromorphic computer, our model can evaluate visual familiarity in real time from event camera footage.
Affiliation(s)
- Le Zhu
- School of Informatics, University of Edinburgh, EH8 9AB Edinburgh, UK
- Michael Mangan
- Sheffield Robotics, Department of Computer Science, University of Sheffield, S1 4DP Sheffield, UK
- Barbara Webb
- School of Informatics, University of Edinburgh, EH8 9AB Edinburgh, UK

44
Xiao Z, Cheng Z, Xiong Z. Space-Time Super-Resolution for Light Field Videos. IEEE TRANSACTIONS ON IMAGE PROCESSING 2023; 32:4785-4799. PMID: 37603488. DOI: 10.1109/tip.2023.3300121.
Abstract
Light field (LF) cameras suffer from a fundamental trade-off between spatial and angular resolution. Additionally, due to the significant amount of data that must be recorded, the Lytro ILLUM, a modern LF camera, can capture only three frames per second. In this paper, we consider space-time super-resolution (SR) for LF videos, aiming to generate high-resolution, high-frame-rate LF videos from low-resolution, low-frame-rate observations. Directly extending existing space-time video SR methods to this task raises two key challenges: 1) how to re-organize sub-aperture images (SAIs) efficiently and effectively given highly redundant LF videos, and 2) how to aggregate complementary information across multiple SAIs and frames while respecting the coherence of LF videos. To address these challenges, we propose the first framework for space-time super-resolving LF videos. First, we propose a Multi-Scale Dilated SAI Re-organization strategy that re-organizes SAIs into auxiliary view stacks whose resolution decreases as the Chebyshev distance in the angular dimension increases; the auxiliary view stack at the original resolution preserves essential visual details, while the down-scaled view stacks capture long-range contextual information. Second, we propose the Multi-Scale Aggregated Feature extractor and the Angular-Assisted Feature Interpolation module to aggregate information across the spatial, angular, and temporal dimensions of LF videos. The former aggregates similar content from different SAIs and frames for subsequent reconstruction in a disparity-free manner at the feature level, whereas the latter interpolates intermediate frames temporally by implicitly aggregating geometric information. Experimental results demonstrate that the LF videos reconstructed by our framework achieve higher reconstruction quality and better preserve the LF parallax structure and temporal consistency than other potential approaches. The implementation code is available at https://github.com/zeyuxiao1997/LFSTVSR.
45
Aboumerhi K, Güemes A, Liu H, Tenore F, Etienne-Cummings R. Neuromorphic applications in medicine. J Neural Eng 2023; 20:041004. PMID: 37531951. DOI: 10.1088/1741-2552/aceca3.
Abstract
In recent years, there has been a growing demand for miniaturization, low power consumption, quick treatments, and non-invasive clinical strategies in the healthcare industry. To meet these demands, healthcare professionals are seeking new technological paradigms that can improve diagnostic accuracy while ensuring patient compliance. Neuromorphic engineering, which uses neural models in hardware and software to replicate brain-like behaviors, can help usher in a new era of medicine by delivering low-power, low-latency, small-footprint, and high-bandwidth solutions. This paper provides an overview of recent neuromorphic advancements in medicine, including medical imaging and cancer diagnosis, processing of biosignals for diagnosis, and biomedical interfaces such as motor, cognitive, and perception prostheses. For each area, we provide examples of how brain-inspired models can successfully compete with conventional artificial intelligence algorithms, demonstrating the potential of neuromorphic engineering to meet these demands and improve patient outcomes. Lastly, we discuss current challenges in fitting neuromorphic hardware to non-neuromorphic technologies and propose potential solutions for future bottlenecks in hardware compatibility.
Affiliation(s)
- Khaled Aboumerhi
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States of America
- Amparo Güemes
- Electrical Engineering Division, Department of Engineering, University of Cambridge, 9 JJ Thomson Ave, Cambridge CB3 0FA, United Kingdom
- Hongtao Liu
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States of America
- Francesco Tenore
- Research and Exploratory Development Department, The Johns Hopkins University Applied Physics Laboratory, Laurel, MD, United States of America
- Ralph Etienne-Cummings
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD, United States of America

46
Cox JC, Ashok A, Morley N. Reduced spatiotemporal bandwidth of motion stabilized event-based imaging sensors: experimental demonstration. APPLIED OPTICS 2023; 62:G128-G142. PMID: 37707071. DOI: 10.1364/ao.491884.
Abstract
This work investigates the read-out bandwidth performance of event-based sensor (EBS) imaging systems under linear motion, with and without hardware stabilization. We implement three image stabilization methods that use hardware rotation to cancel the sensor platform's linear motion and recapture lost EBS performance. We successfully demonstrate the methods, showing a bandwidth reduction of over an order of magnitude across two scenes, ten scene variations, and five EBS velocities. This work demonstrates the benefit of stabilizing EBS systems to reduce bandwidth requirements relative to unstabilized ones.
47
Wang J, Lin S, Liu A. Bioinspired Perception and Navigation of Service Robots in Indoor Environments: A Review. Biomimetics (Basel) 2023; 8:350. PMID: 37622955. PMCID: PMC10452487. DOI: 10.3390/biomimetics8040350.
Abstract
Biological principles draw attention in service robotics because robots performing various tasks face problems similar to those of living organisms. Bioinspired perception, modeled on animals' awareness of their environment, is significant for robotic perception. This paper reviews the bioinspired perception and navigation of service robots in indoor environments, a popular application of civilian robotics. The navigation approaches are classified by perception type, including vision-based, remote sensing, tactile, olfactory, sound-based, inertial, and multimodal navigation. The trend in state-of-the-art techniques is toward multimodal navigation that combines several approaches. The main challenges in indoor navigation are precise localization and coping with dynamic, complex environments containing moving objects and people.
Affiliation(s)
- Jianguo Wang
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
- Shiwei Lin
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
- Ang Liu
- Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia

48
Yao M, Zhao G, Zhang H, Hu Y, Deng L, Tian Y, Xu B, Li G. Attention Spiking Neural Networks. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:9393-9410. PMID: 37022261. DOI: 10.1109/tpami.2023.3241201.
Abstract
Brain-inspired spiking neural networks (SNNs) are becoming a promising energy-efficient alternative to traditional artificial neural networks (ANNs). However, the performance gap between SNNs and ANNs has been a significant hindrance to deploying SNNs ubiquitously. To leverage the full potential of SNNs, we study attention mechanisms, which help humans focus on important information. We present our idea of attention in SNNs with a multi-dimensional attention module that infers attention weights along the temporal, channel, and spatial dimensions, separately or simultaneously. Based on existing neuroscience theories, we exploit the attention weights to optimize membrane potentials, which in turn regulate the spiking response. Extensive experimental results on event-based action recognition and image classification datasets demonstrate that attention enables vanilla SNNs to achieve sparser spiking, better performance, and greater energy efficiency concurrently. In particular, we achieve top-1 accuracy of 75.92% and 77.08% on ImageNet-1K with single-step and 4-step Res-SNN-104, which are state-of-the-art results for SNNs. Compared with the counterpart Res-ANN-104, the performance gap becomes -0.95/+0.21 percent and the energy efficiency is 31.8×/7.4×. To analyze the effectiveness of attention SNNs, we theoretically prove that the spiking degradation and gradient vanishing that usually arise in general SNNs can be resolved by introducing block dynamical isometry theory. We also analyze the efficiency of attention SNNs with our proposed spiking-response visualization method. Our work highlights the SNN's potential as a general backbone for various applications in SNN research, with a strong balance between effectiveness and energy efficiency.
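To illustrate the principle of attention applied to membrane potentials (not the paper's Res-SNN-104 code), the sketch below gates a leaky integrate-and-fire layer's potential with squeeze-and-excitation-style channel attention before thresholding; all sizes and constants are illustrative.

```python
import torch
import torch.nn as nn

class ChannelAttentionLIF(nn.Module):
    """Toy LIF layer whose membrane potential is modulated by channel
    attention before the spike threshold, so attention regulates firing."""

    def __init__(self, channels, r=4, beta=0.9, v_th=1.0):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // r), nn.ReLU(),
            nn.Linear(channels // r, channels), nn.Sigmoid(),
        )
        self.beta, self.v_th = beta, v_th

    def forward(self, current):                  # current: (T, B, C, H, W)
        v = torch.zeros_like(current[0])
        spikes = []
        for t in range(current.shape[0]):
            v = self.beta * v + current[t]        # leaky integration
            a = self.gate(v.mean(dim=(2, 3)))     # (B, C) channel attention
            v = v * a[..., None, None]            # attention on potential
            s = (v >= self.v_th).float()          # fire
            v = v * (1 - s)                       # hard reset after firing
            spikes.append(s)
        return torch.stack(spikes)

out = ChannelAttentionLIF(16)(torch.rand(4, 2, 16, 8, 8))
```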
Collapse
|
49
|
Sajwani H, Ayyad A, Alkendi Y, Halwani M, Abdulrahman Y, Abusafieh A, Zweiri Y. TactiGraph: An Asynchronous Graph Neural Network for Contact Angle Prediction Using Neuromorphic Vision-Based Tactile Sensing. SENSORS (BASEL, SWITZERLAND) 2023; 23:6451. [PMID: 37514745 PMCID: PMC10383597 DOI: 10.3390/s23146451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/06/2023] [Revised: 06/02/2023] [Accepted: 06/06/2023] [Indexed: 07/30/2023]
Abstract
Vision-based tactile sensors (VBTSs) have become the de facto method for giving robots the ability to obtain tactile feedback from their environment. Unlike other approaches to tactile sensing, VBTSs offer high-spatial-resolution feedback without compromising on instrumentation costs or incurring additional maintenance expenses. However, the conventional cameras used in VBTSs have a fixed update rate and output redundant data, leading to computational overhead. In this work, we present a neuromorphic vision-based tactile sensor (N-VBTS) that uses observations from an event-based camera for contact angle prediction. In particular, we design and develop a novel graph neural network, dubbed TactiGraph, that operates asynchronously on graphs constructed from raw N-VBTS streams, exploiting their spatiotemporal correlations to perform predictions. Although conventional VBTSs use an internal illumination source, TactiGraph performs efficiently both with and without one, further reducing instrumentation costs. Rigorous experiments show that TactiGraph achieves a mean absolute error of 0.62° in predicting the contact angle and is faster and more efficient than both conventional VBTSs and other N-VBTSs, with lower instrumentation costs. Specifically, the N-VBTS requires only 5.5% of the computing time needed by the VBTS when both are tested on the same scenario.
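The graph-construction step can be sketched simply (our own illustration; TactiGraph's actual neighborhood rule, time normalization, and node features are not reproduced here): each event becomes a node, and edges connect events that are close in normalized (x, y, t) space:

    import numpy as np

    def events_to_knn_graph(events, k=4, time_scale_us=1000.0):
        """Build a k-nearest-neighbor graph over events in (x, y, t).

        events: (N, 3) array of [x_px, y_px, t_us].
        time_scale_us maps time to a distance comparable to pixels
        (an assumed normalization). Returns (src, dst) edge index arrays.
        """
        pts = events.astype(float)
        pts[:, 2] /= time_scale_us                     # normalize the time axis
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)                    # exclude self-loops
        dst = np.argsort(d, axis=1)[:, :k]             # k closest events per node
        src = np.repeat(np.arange(len(pts)), k)
        return src, dst.ravel()

    events = np.array([[10, 12, 0], [11, 12, 300], [40, 5, 350], [11, 13, 600]])
    print(events_to_knn_graph(events, k=2))

The dense distance matrix here is O(N²) and only suitable for a sketch; an asynchronous pipeline would instead insert each arriving event into a spatial index and update the graph incrementally.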
Collapse
Affiliation(s)
- Hussain Sajwani
  - UAE National Service & Reserve Authority, Abu Dhabi, United Arab Emirates
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Abdulla Ayyad
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Yusra Alkendi
  - Department of Aerospace Engineering, Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Mohamad Halwani
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Yusra Abdulrahman
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
  - Department of Aerospace Engineering, Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Abdulqader Abusafieh
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
  - Research and Development, Strata Manufacturing PJSC, Al Ain 86519, United Arab Emirates
- Yahya Zweiri
  - Advanced Research and Innovation Center (ARIC), Khalifa University, Abu Dhabi 127788, United Arab Emirates
  - Department of Aerospace Engineering, Khalifa University, Abu Dhabi 127788, United Arab Emirates
Collapse
|
50
|
Hattne J, Clabbers MTB, Martynowycz MW, Gonen T. Electron-counting in MicroED. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.29.547123. [PMID: 37425889 PMCID: PMC10327187 DOI: 10.1101/2023.06.29.547123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/11/2023]
Abstract
The combination of high sensitivity and rapid readout makes it possible for electron-counting detectors to record cryogenic electron microscopy data faster and more accurately without increasing the exposure. This is especially useful for MicroED of macromolecular crystals, where the strength of the diffracted signal at high resolution is comparable to the surrounding background. The ability to decrease the exposure also alleviates concerns about radiation damage, which limits the information that can be recovered from a diffraction measurement. However, the limited dynamic range of electron-counting detectors requires careful data collection to avoid errors from coincidence loss. Nevertheless, these detectors are increasingly deployed in cryo-EM facilities, and several have been successfully used for MicroED. Provided coincidence loss can be minimized, electron-counting detectors bring high potential rewards.
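Coincidence loss can be made concrete with a standard dead-time model (our illustration, not a procedure from the preprint): if a counting detector registers at most one electron per pixel per frame and arrivals are Poisson, the measured hit fraction p underestimates the true rate λ, which can be recovered as λ = -ln(1 - p):

    import numpy as np

    def coincidence_corrected_rate(hit_frames, n_frames):
        """Estimate the true mean electrons/pixel/frame from hit counts.

        Assumes Poisson arrivals and at most one recorded count per pixel
        per frame, so p = 1 - exp(-lam) and lam = -ln(1 - p). Actual
        detectors may need vendor-specific corrections.
        """
        p = np.clip(hit_frames / n_frames, 0.0, 1.0 - 1e-12)
        return -np.log1p(-p)                # invert p = 1 - exp(-lam)

    # A pixel hit in 300 of 1000 frames: the naive rate is 0.300 e-/frame,
    # but the coincidence-corrected rate is about 0.357 e-/frame.
    print(coincidence_corrected_rate(np.array([300.0]), 1000))

The correction grows rapidly as p approaches 1, which is why strong low-resolution reflections demand lower exposure or faster framing than the weak high-resolution signal would otherwise require.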
Collapse
Affiliation(s)
- Johan Hattne
  - Howard Hughes Medical Institute, University of California, Los Angeles, CA 90095
  - Department of Biological Chemistry, University of California, Los Angeles, CA 90095
- Max T. B. Clabbers
  - Department of Biological Chemistry, University of California, Los Angeles, CA 90095
- Michael W. Martynowycz
  - Howard Hughes Medical Institute, University of California, Los Angeles, CA 90095
  - Department of Biological Chemistry, University of California, Los Angeles, CA 90095
- Tamir Gonen
  - Howard Hughes Medical Institute, University of California, Los Angeles, CA 90095
  - Department of Biological Chemistry, University of California, Los Angeles, CA 90095
  - Department of Physiology, University of California, Los Angeles, CA 90095
Collapse
|