151
Visual Event-Based Egocentric Human Action Recognition. Pattern Recognition and Image Analysis 2022. [DOI: 10.1007/978-3-031-04881-4_32]
152
ESPEE: Event-Based Sensor Pose Estimation Using an Extended Kalman Filter. Sensors 2021; 21:7840. [PMID: 34883852; PMCID: PMC8659537; DOI: 10.3390/s21237840]
Abstract
Event-based vision sensors show great promise for use in embedded applications requiring low-latency passive sensing at a low computational cost. In this paper, we present an event-based algorithm that relies on an Extended Kalman Filter for 6-Degree-of-Freedom sensor pose estimation. The algorithm updates the sensor pose event-by-event with low latency (worst case of less than 2 μs on an FPGA). Using a single handheld sensor, we test the algorithm on multiple recordings, ranging from a high-contrast printed planar scene to a more natural scene consisting of objects viewed from above. The pose is accurately estimated under rapid motions, up to 2.7 m/s. Thereafter, an extension to multiple sensors is described and tested, highlighting the improved performance of such a setup, as well as the integration with an off-the-shelf mapping algorithm that updates a 3D point-cloud map of the scene and broadens the potential applications of this visual odometry solution.
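A minimal sketch of an event-by-event EKF pose update of the kind the abstract describes (the state layout, the scalar measurement function h, and all noise values are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def ekf_event_update(x, P, z, h, Q, R, dt):
    """One event-by-event EKF step for a 6-DoF pose state.

    x : (6,) pose state [tx, ty, tz, roll, pitch, yaw]
    P : (6, 6) state covariance
    z : scalar measurement derived from a single event
    h : measurement function h(x) -> scalar (e.g. point-to-edge distance)
    Q, R : process / measurement noise; dt : time since the last event
    """
    # Predict: constant-pose model, uncertainty grows with inter-event time.
    P = P + Q * dt

    # Numerically linearize h around x to get the 1x6 Jacobian H.
    eps = 1e-6
    H = np.array([(h(x + eps * e) - h(x)) / eps for e in np.eye(6)])

    # Standard scalar EKF update.
    S = H @ P @ H + R                 # innovation variance (scalar)
    K = (P @ H) / S                   # (6,) Kalman gain
    x = x + K * (z - h(x))
    P = (np.eye(6) - np.outer(K, H)) @ P
    return x, P

# Toy usage: measurement is the distance of a projected event to an edge at x=0.
h = lambda x: x[0]
x, P = np.zeros(6), np.eye(6) * 0.1
x, P = ekf_event_update(x, P, z=0.02, h=h, Q=np.eye(6) * 1e-4, R=1e-3, dt=1e-5)
```

The property exploited here is that each event triggers only a scalar update, which keeps the per-event cost small enough for microsecond-scale latencies.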
153
Osorio Quero CA, Durini D, Rangel-Magdaleno J, Martinez-Carranza J. Single-pixel imaging: An overview of different methods to be used for 3D space reconstruction in harsh environments. The Review of Scientific Instruments 2021; 92:111501. [PMID: 34852525; DOI: 10.1063/5.0050358]
Abstract
Different imaging solutions have been proposed over the last few decades for three-dimensional (3D) space reconstruction and obstacle detection, based either on stereo-vision principles using active pixel sensors operating in the visible part of the spectrum or on active Near Infra-Red (NIR) illumination applying the time-of-flight principle, to mention just a few. Silicon-based detectors yield extremely low quantum efficiencies under NIR active illumination; combined with the huge photon-noise levels produced by background illumination and the Rayleigh scattering that occurs outdoors, this clearly limits the operation of such systems under harsh weather conditions, especially when relatively low-power active illumination is used. If longer wavelengths are applied for active illumination to overcome these issues, indium gallium arsenide (InGaAs) photodetectors become the technology of choice, and for low-cost solutions a single InGaAs photodetector or an InGaAs line sensor becomes a promising option. In this case, the principles of Single-Pixel Imaging (SPI) and compressive sensing acquire paramount importance. Thus, in this paper we review and compare the different SPI developments reported to date, covering SPI system architectures, modulation methods, pattern generation and reconstruction algorithms, embedded system approaches, and 2D/3D image reconstruction methods. In addition, we introduce a Near Infra-Red Single-Pixel Imaging (NIR-SPI) sensor aimed at detecting static and dynamic objects under outdoor conditions for unmanned aerial vehicle applications.
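A minimal sketch of the single-pixel imaging principle the review covers, using Hadamard patterns (the pattern choice, image size, and the plain least-squares recovery are assumptions; practical compressive sensing adds a sparsity prior):

```python
import numpy as np
from scipy.linalg import hadamard

n = 16                                   # image is n x n
H = hadamard(n * n)                      # one pattern per row, entries +/-1
scene = np.random.rand(n, n).ravel()     # unknown scene (stand-in)

# Acquisition: the single pixel records one inner product per pattern.
measurements = H @ scene

# Full sampling: Hadamard matrices satisfy H @ H.T = N * I, so invert cheaply.
recon_full = (H.T @ measurements) / (n * n)

# Compressive sensing: keep only 25% of the patterns and solve a
# least-squares problem (a sparsity prior would normally be added here).
idx = np.random.choice(n * n, size=(n * n) // 4, replace=False)
recon_cs, *_ = np.linalg.lstsq(H[idx], measurements[idx], rcond=None)

print(np.allclose(recon_full, scene))    # True: full sampling is exact
```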
Affiliation(s)
- Carlos A Osorio Quero: Digital Systems Group, Electronics Department, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), 72840 Puebla, Mexico
- Daniel Durini: Digital Systems Group, Electronics Department, INAOE, 72840 Puebla, Mexico
- Jose Rangel-Magdaleno: Digital Systems Group, Electronics Department, INAOE, 72840 Puebla, Mexico
- Jose Martinez-Carranza: Computer Science Department, INAOE, 72840 Puebla, Mexico
154
155
Manzoor S, Joo SH, Kim EJ, Bae SH, In GG, Pyo JW, Kuc TY. 3D Recognition Based on Sensor Modalities for Robotic Systems: A Survey. Sensors 2021; 21:7120. [PMID: 34770429; PMCID: PMC8587961; DOI: 10.3390/s21217120]
Abstract
3D visual recognition is a prerequisite for most autonomous robotic systems operating in the real world. It empowers robots to perform a variety of tasks, such as tracking, understanding the environment, and human-robot interaction. Autonomous robots equipped with 3D recognition capability can better perform their social roles through supportive task assistance in professional jobs and effective domestic services. For active assistance, social robots must recognize their surroundings, including objects and places, to perform tasks more efficiently. This article first highlights the value-centric role of social robots in society by presenting recently developed robots and describing their main features. Motivated by the recognition capability of social robots, we then analyze data representation methods based on sensor modalities for 3D object and place recognition using deep learning models. In this direction, we delineate the research gaps that need to be addressed, summarize 3D recognition datasets, and present performance comparisons. Finally, a discussion of future research directions concludes the article. This survey is intended to show how recent developments in 3D visual recognition based on sensor modalities using deep-learning-based approaches can lay the groundwork for further research, and it serves as a guide for those interested in vision-based robotics applications.
Affiliation(s)
- Tae-Yong Kuc: Department of Electrical and Computer Engineering, College of Information and Communication Engineering, Sungkyunkwan University, Suwon 16419, Korea
156
A new opportunity for the emerging tellurium semiconductor: making resistive switching devices. Nat Commun 2021; 12:6081. [PMID: 34667171; PMCID: PMC8526830; DOI: 10.1038/s41467-021-26399-1]
Abstract
The development of the resistive switching cross-point array as the next-generation platform for high-density storage, in-memory computing and neuromorphic computing relies heavily on the improvement of its two component devices, the volatile selector and the nonvolatile memory, which have distinct operating-current requirements. The perennial current-volatility dilemma that has been widely faced in various device implementations remains a major bottleneck. Here, we show that a device based on an electrochemically active, low-thermal-conductivity and low-melting-temperature semiconducting tellurium filament can solve this dilemma, being able to function as either selector or memory in the respective desired current ranges. Furthermore, we demonstrate one-selector-one-resistor behavior in a tandem of two identical Te-based devices, indicating the potential of the Te-based device as a universal array building block. These nonconventional phenomena can be understood from a combination of unique electrical-thermal properties in Te. Preliminary device-optimization efforts also indicate a large and unique design space for Te-based resistive switching devices. Resistive switching devices hold great promise for a wide variety of technological applications; here, Yang et al. demonstrate that an electrochemically induced tellurium filament can give rise to resistive switching, and show that devices based on this can provide a number of advantages compared with metallic-filament-based devices.
157
Loquercio A, Kaufmann E, Ranftl R, Müller M, Koltun V, Scaramuzza D. Learning high-speed flight in the wild. Sci Robot 2021; 6:eabg5810. [PMID: 34613820; DOI: 10.1126/scirobotics.abg5810]
Abstract
Quadrotors are agile. Unlike most other machines, they can traverse extremely complex environments at high speeds. To date, only expert human pilots have been able to fully exploit their capabilities. Autonomous operation with onboard sensing and computation has been limited to low speeds. State-of-the-art methods generally separate the navigation problem into subtasks: sensing, mapping, and planning. Although this approach has proven successful at low speeds, the separation it builds upon can be problematic for high-speed navigation in cluttered environments. The subtasks are executed sequentially, leading to increased processing latency and a compounding of errors through the pipeline. Here, we propose an end-to-end approach that can autonomously fly quadrotors through complex natural and human-made environments at high speeds with purely onboard sensing and computation. The key principle is to directly map noisy sensory observations to collision-free trajectories in a receding-horizon fashion. This direct mapping drastically reduces processing latency and increases robustness to noisy and incomplete perception. The sensorimotor mapping is performed by a convolutional network that is trained exclusively in simulation via privileged learning: imitating an expert with access to privileged information. By simulating realistic sensor noise, our approach achieves zero-shot transfer from simulation to challenging real-world environments that were never experienced during training: dense forests, snow-covered terrain, derailed trains, and collapsed buildings. Our work demonstrates that end-to-end policies trained in simulation enable high-speed autonomous flight through challenging environments, outperforming traditional obstacle avoidance pipelines.
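A rough sketch of the privileged-learning setup the abstract describes: a student network maps a simulated, noisy depth observation to several candidate trajectories and imitates a privileged expert. The network architecture, trajectory parameterization, and min-over-hypotheses loss below are illustrative assumptions, not the published system:

```python
import torch
import torch.nn as nn

class StudentPolicy(nn.Module):
    """Maps a noisy depth image to k candidate trajectories
    (k trajectories x 5 waypoints x 3 coordinates), receding-horizon style."""
    def __init__(self, k=3, waypoints=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, k * waypoints * 3),
        )
        self.k, self.w = k, waypoints

    def forward(self, depth):
        return self.net(depth).view(-1, self.k, self.w, 3)

student = StudentPolicy()
depth = torch.rand(8, 1, 64, 64)       # simulated noisy depth observations
expert_traj = torch.rand(8, 5, 3)      # collision-free trajectory from the
                                       # expert with privileged scene knowledge

# Penalize only the closest student trajectory, so the k heads are free to
# cover multiple distinct collision-free options.
pred = student(depth)                                          # (8, 3, 5, 3)
dists = ((pred - expert_traj[:, None]) ** 2).mean(dim=(2, 3))  # (8, 3)
loss = dists.min(dim=1).values.mean()
loss.backward()
```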
158
Dinaux R, Wessendorp N, Dupeyroux J, de Croon GCHE. FAITH: Fast Iterative Half-Plane Focus of Expansion Estimation Using Optic Flow. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3100153]
159
160
Beck M, Maier G, Flitter M, Gruna R, Längle T, Heizmann M, Beyerer J. An Extended Modular Processing Pipeline for Event-Based Vision in Automatic Visual Inspection. Sensors 2021; 21:6143. [PMID: 34577349; PMCID: PMC8472878; DOI: 10.3390/s21186143]
Abstract
Dynamic Vision Sensors differ from conventional cameras in that only intensity changes of individual pixels are perceived and transmitted as an asynchronous stream instead of entire frames. The technology promises, among other things, high temporal resolution together with low latency and low data rates. While such sensors currently enjoy much scientific attention, there are only few publications on practical applications. One field of application that has hardly been considered so far, yet potentially fits the sensor principle well due to its special properties, is automatic visual inspection. In this paper, we evaluate current state-of-the-art processing algorithms in this new application domain. We further propose an algorithmic approach for identifying ideal time windows within an event stream for object classification. For the evaluation of our method, we acquire two novel datasets that contain typical visual inspection scenarios, i.e., the inspection of objects on a conveyor belt and during free fall. The success of our algorithmic extension for data processing is demonstrated on these new datasets by showing that the classification accuracy of current algorithms increases substantially. By making our new datasets publicly available, we intend to stimulate further research on the application of Dynamic Vision Sensors in machine vision.
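A toy sketch of selecting a time window from an event stream for classification, using event density as a stand-in criterion (the paper's actual selection criterion and all parameters below are assumptions):

```python
import numpy as np

def best_time_window(timestamps, win_us=5000, step_us=500):
    """Slide a fixed-length window over an event stream and return the
    window with the highest event density (a simple stand-in for the
    'ideal window' selection described in the paper)."""
    t0, t1 = timestamps[0], timestamps[-1] - win_us
    starts = np.arange(t0, max(t0 + 1, t1), step_us)
    counts = [np.count_nonzero((timestamps >= s) &
                               (timestamps < s + win_us)) for s in starts]
    best = starts[int(np.argmax(counts))]
    return best, best + win_us

ts = np.sort(np.random.randint(0, 100_000, size=5000))  # fake event times (us)
start, end = best_time_window(ts)
```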
Affiliation(s)
- Moritz Beck, Georg Maier (corresponding author), Merle Flitter, Robin Gruna, Thomas Längle: Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany
- Michael Heizmann: Institute of Industrial Information Technology (IIIT), Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
- Jürgen Beyerer: Fraunhofer IOSB and Vision and Fusion Laboratory (IES), Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
161
Joubert D, Marcireau A, Ralph N, Jolley A, van Schaik A, Cohen G. Event Camera Simulator Improvements via Characterized Parameters. Front Neurosci 2021; 15:702765. [PMID: 34385903; PMCID: PMC8353146; DOI: 10.3389/fnins.2021.702765]
Abstract
It has been more than two decades since the first neuromorphic Dynamic Vision Sensor (DVS) was invented, and many subsequent prototypes have been built with a wide spectrum of applications in mind. Competing against state-of-the-art neural networks in terms of accuracy is difficult, although there are clear opportunities to outperform conventional approaches in terms of power consumption and processing speed. As neuromorphic sensors generate sparse data at the focal plane itself, they are inherently energy-efficient, data-driven, and fast. In this work, we present an extended DVS pixel simulator for neuromorphic benchmarks which simplifies the latency and noise models. In addition, to model the behaviour of a real pixel more closely, the readout circuitry is modelled, as this can strongly affect the time precision of events in complex scenes. Using a dynamic variant of the MNIST dataset as a benchmarking task, we use this simulator to explore how the latency of the sensor allows it to outperform conventional sensors in terms of sensing speed.
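The core of the DVS pixel model that such simulators extend can be sketched as follows: a pixel emits an event when its log intensity drifts beyond a threshold from the level at which it last fired (the latency, noise, and readout effects that are the paper's actual contributions are deliberately omitted from this sketch):

```python
import numpy as np

def dvs_events(frames, times, threshold=0.2):
    """Generate DVS-style events from a frame sequence: a pixel fires when
    its log intensity moves more than `threshold` away from the level at
    which it last fired."""
    log_ref = np.log(frames[0] + 1e-6)        # per-pixel reference level
    events = []                               # (t, x, y, polarity)
    for frame, t in zip(frames[1:], times[1:]):
        log_i = np.log(frame + 1e-6)
        diff = log_i - log_ref
        for pol, mask in ((1, diff >= threshold), (-1, diff <= -threshold)):
            ys, xs = np.nonzero(mask)
            events += [(t, x, y, pol) for x, y in zip(xs, ys)]
            log_ref[mask] = log_i[mask]       # reset reference where fired
        # Note: a real pixel may fire several events per large change;
        # this sketch emits at most one per frame interval.
    return events

frames = np.random.rand(10, 32, 32) + 0.5
events = dvs_events(frames, times=np.arange(10) * 1000)
```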
Affiliation(s)
- Damien Joubert, Alexandre Marcireau, Nic Ralph, Andrew Jolley, André van Schaik, Gregory Cohen: International Centre for Neuromorphic Systems, The MARCS Institute for Brain, Behaviour, and Development, Western Sydney University, Kingswood, NSW, Australia
162
Li R, Shi D, Zhang Y, Li R, Wang M. Asynchronous event feature generation and tracking based on gradient descriptor for event cameras. Int J Adv Robot Syst 2021. [DOI: 10.1177/17298814211027028]
Abstract
Recently, the event camera has become a popular and promising vision sensor in research on simultaneous localization and mapping (SLAM) and computer vision owing to its advantages: low latency, high dynamic range, and high temporal resolution. As a basic part of a feature-based SLAM system, feature tracking with event cameras is still an open question. In this article, we present a novel asynchronous event feature generation and tracking algorithm operating directly on event streams to fully exploit the natural asynchronism of event cameras. The proposed algorithm consists of an event-corner detection unit, a descriptor construction unit, and an event feature tracking unit. The event-corner detection unit applies a fast, asynchronous corner detector to extract event-corners from event streams. For the descriptor construction unit, we propose a novel asynchronous gradient descriptor inspired by the scale-invariant feature transform (SIFT) descriptor, which enables quantitative measurement of the similarity between event feature pairs. The construction of the gradient descriptor can be decomposed into three stages: speed-invariant time surface maintenance and extraction, principal orientation calculation, and descriptor generation. The event feature tracking unit combines the constructed gradient descriptor with an event feature matching method to achieve asynchronous feature tracking. We implement the proposed algorithm in C++ and evaluate it on a public event dataset. The experimental results show that our proposed method improves tracking accuracy and real-time performance compared with the state-of-the-art asynchronous event-corner tracker, with no compromise on feature tracking lifetime.
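A toy sketch of the speed-invariant time-surface idea underlying such descriptors: timestamps in a local patch are rank-normalized, so the surface depends on event order rather than absolute event rate (patch size and normalization details are assumptions, not the paper's exact construction):

```python
import numpy as np

def local_time_surface(events, center, radius=7):
    """Speed-invariant local time surface around one event-corner: pixels
    are ranked by recency rather than exponentially decayed, so the surface
    looks the same at any motion speed."""
    cx, cy = center
    patch = np.full((2 * radius + 1, 2 * radius + 1), -np.inf)
    for t, x, y, _ in events:
        dx, dy = x - cx, y - cy
        if abs(dx) <= radius and abs(dy) <= radius:
            patch[dy + radius, dx + radius] = max(
                patch[dy + radius, dx + radius], t)
    # Rank-normalize timestamps into [0, 1]: the most recent pixel gets 1.
    flat = patch.ravel()
    ranks = flat.argsort().argsort()          # -inf pixels get lowest ranks
    surface = ranks.reshape(patch.shape) / (flat.size - 1)
    surface[~np.isfinite(patch)] = 0.0        # empty pixels contribute zero
    return surface

events = [(t, np.random.randint(32), np.random.randint(32), 1)
          for t in range(500)]
s = local_time_surface(events, center=(16, 16))
```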
Affiliation(s)
- Ruoxiang Li, Mingkun Wang: National University of Defense Technology, Changsha, China
- Dianxi Shi, Ruihao Li: Artificial Intelligence Research Center (AIRC), National Innovation Institute of Defense Technology (NIIDT), Beijing, China, and Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin, China
- Yongjun Zhang: Artificial Intelligence Research Center (AIRC), National Innovation Institute of Defense Technology (NIIDT), Beijing, China
163
Gehrig M, Aarents W, Gehrig D, Scaramuzza D. DSEC: A Stereo Event Camera Dataset for Driving Scenarios. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3068942]
164
Depth-Image Segmentation Based on Evolving Principles for 3D Sensing of Structured Indoor Environments. Sensors 2021; 21:4395. [PMID: 34198980; PMCID: PMC8271552; DOI: 10.3390/s21134395]
Abstract
This paper presents an approach of depth image segmentation based on the Evolving Principal Component Clustering (EPCC) method, which exploits data locality in an ordered data stream. The parameters of linear prototypes, which are used to describe different clusters, are estimated in a recursive manner. The main contribution of this work is the extension and application of the EPCC to 3D space for recursive and real-time detection of flat connected surfaces based on linear segments, which are all detected in an evolving way. To obtain optimal results when processing homogeneous surfaces, we introduced two-step filtering for outlier detection within a clustering framework and considered the noise model, which allowed for the compensation of characteristic uncertainties that are introduced into the measurements of depth sensors. The developed algorithm was compared with well-known methods for point cloud segmentation. The proposed approach achieves better segmentation results over longer distances for which the signal-to-noise ratio is low, without prior filtering of the data. On the given database, an average rate higher than 90% was obtained for successfully detected flat surfaces, which indicates high performance when processing huge point clouds in a non-iterative manner.
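A minimal sketch of recursive plane-prototype estimation in the spirit of EPCC: cluster moments are accumulated point by point, so plane parameters are available at any time without storing the cloud (this is a generic incremental PCA fit, not the authors' EPCC update rules):

```python
import numpy as np

class RecursivePlane:
    """Recursively fitted planar prototype: accumulates first and second
    moments of the points assigned to the cluster, so mean and normal can
    be updated point-by-point."""
    def __init__(self):
        self.n, self.sum, self.outer = 0, np.zeros(3), np.zeros((3, 3))

    def add(self, p):
        self.n += 1
        self.sum += p
        self.outer += np.outer(p, p)

    def params(self):
        mean = self.sum / self.n
        cov = self.outer / self.n - np.outer(mean, mean)
        w, v = np.linalg.eigh(cov)
        normal = v[:, 0]              # eigenvector of the smallest eigenvalue
        return mean, normal, w[0]     # w[0] ~ residual plane thickness

plane = RecursivePlane()
for p in np.random.rand(200, 3) * [1.0, 1.0, 0.01]:   # nearly flat cloud
    plane.add(p)
mean, normal, thickness = plane.params()
```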
165
Tayarani-Najaran MH, Schmuker M. Event-Based Sensing and Signal Processing in the Visual, Auditory, and Olfactory Domain: A Review. Front Neural Circuits 2021; 15:610446. [PMID: 34135736; PMCID: PMC8203204; DOI: 10.3389/fncir.2021.610446]
Abstract
The nervous system converts the physical quantities sensed by its primary receptors into trains of events that are then processed in the brain. Its unmatched efficiency in information processing has long inspired engineers to seek brain-like approaches to sensing and signal processing. The key principle pursued in neuromorphic sensing is to shed the traditional approach of periodic sampling in favor of an event-driven scheme that mimics sampling as it occurs in the nervous system, where events are preferably emitted upon a change of the sensed stimulus. In this paper, we highlight the advantages and challenges of event-based sensing and signal processing in the visual, auditory, and olfactory domains. We also provide a survey of the literature covering neuromorphic sensing and signal processing in all three modalities. Our aim is to facilitate research in event-based sensing and signal processing by providing a comprehensive overview of the research performed previously, as well as by highlighting conceptual advantages, current progress, and future challenges in the field.
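The event-driven sampling scheme described here can be illustrated in a few lines of send-on-delta sampling (a generic sketch of change-driven encoding, with an arbitrary threshold):

```python
import numpy as np

def send_on_delta(signal, times, delta=0.1):
    """Event-driven ('send-on-delta') sampling of a 1-D signal: an event is
    emitted only when the signal has moved by more than `delta` since the
    last event, mimicking change-driven neural encoding."""
    ref, events = signal[0], []
    for x, t in zip(signal[1:], times[1:]):
        if abs(x - ref) >= delta:
            events.append((t, np.sign(x - ref)))   # (time, polarity)
            ref = x
    return events

t = np.linspace(0, 1, 1000)
events = send_on_delta(np.sin(2 * np.pi * 3 * t), t)
# A slowly varying signal yields few events; fast changes yield many.
```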
Affiliation(s)
- Michael Schmuker: School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, United Kingdom
166
Seeing through Events: Real-Time Moving Object Sonification for Visually Impaired People Using Event-Based Camera. Sensors 2021; 21:3558. [PMID: 34065360; PMCID: PMC8161033; DOI: 10.3390/s21103558]
Abstract
Scene sonification is a powerful technique to help Visually Impaired People (VIP) understand their surroundings. Existing methods usually perform sonification on entire images of the surrounding scene acquired by a standard camera, or on static obstacles identified beforehand by image processing algorithms on an RGB image of the scene. However, if all the information in the scene is delivered to VIP simultaneously, it causes information redundancy. In fact, biological vision is more sensitive to moving objects in the scene than to static ones, which is also the original motivation for the event-based camera. In this paper, we propose a real-time sonification framework to help VIP understand the moving objects in the scene. First, we capture the events in the scene using an event-based camera and cluster them into multiple moving objects without relying on any prior knowledge. Then, MIDI-based sonification of these objects is performed synchronously. Finally, we conduct comprehensive experiments on scene videos with sonification audio, attended by 20 VIP and 20 Sighted People (SP). The results show that our method allows both groups of participants to clearly distinguish the number, size, motion speed, and motion trajectories of multiple objects, and that it is more comfortable to listen to than existing methods in terms of aesthetics.
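A toy sketch of an object-to-MIDI mapping of the general kind described (the paper's actual mapping is not reproduced here; position-to-pitch and size-to-velocity are illustrative guesses):

```python
def objects_to_midi(objects):
    """Map tracked moving objects to MIDI note parameters: horizontal
    position -> pitch, object size -> velocity (loudness), one note per
    object."""
    notes = []
    for obj in objects:                        # obj: x and size, both in [0, 1]
        pitch = int(48 + obj["x"] * 36)        # C3..C6 across the field of view
        velocity = int(40 + obj["size"] * 87)  # bigger objects sound louder
        notes.append({"pitch": pitch, "velocity": min(velocity, 127)})
    return notes

print(objects_to_midi([{"x": 0.2, "size": 0.5}, {"x": 0.9, "size": 0.1}]))
```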
167
Deng Y, Chen H, Chen H, Li Y. Learning From Images: A Distillation Learning Framework for Event Cameras. IEEE Trans Image Process 2021; 30:4919-4931. [PMID: 33961557; DOI: 10.1109/tip.2021.3077136]
Abstract
Event cameras have recently drawn massive attention in the computer vision community because of their low power consumption and high response speed. These cameras produce sparse and non-uniform spatiotemporal representations of a scene. These characteristics of representations make it difficult for event-based models to extract discriminative cues (such as textures and geometric relationships). Consequently, event-based methods usually perform poorly compared to their conventional image counterparts. Considering that traditional images and event signals share considerable visual information, this paper aims to improve the feature extraction ability of event-based models by using knowledge distilled from the image domain to additionally provide explicit feature-level supervision for the learning of event data. Specifically, we propose a simple yet effective distillation learning framework, including multi-level customized knowledge distillation constraints. Our framework can significantly boost the feature extraction process for event data and is applicable to various downstream tasks. We evaluate our framework on high-level and low-level tasks, i.e., object classification and optical flow prediction. Experimental results show that our framework can effectively improve the performance of event-based models on both tasks by a large margin. Furthermore, we present a 10K dataset (CEP-DVS) for event-based object classification. This dataset consists of samples recorded under random motion trajectories that can better evaluate the motion robustness of the event-based model and is compatible with multi-modality vision tasks.
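The feature-level supervision described in the abstract can be sketched as a hint-style distillation loss between paired image and event branches (the layer shapes, the 1x1 adapter, and the event-voxel input are assumptions; the paper's multi-level customized constraints are richer than this):

```python
import torch
import torch.nn as nn

# Feature-level distillation: an event-branch feature map is pushed toward
# the (frozen) image-branch feature map via an L2 hint loss, in addition to
# the usual task loss.
image_teacher = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU()).eval()
event_student = nn.Sequential(nn.Conv2d(5, 32, 3, padding=1), nn.ReLU())
adapter = nn.Conv2d(32, 32, 1)          # aligns student features to teacher

images = torch.rand(4, 3, 64, 64)       # paired image / event inputs
event_voxels = torch.rand(4, 5, 64, 64)

with torch.no_grad():                   # the teacher provides targets only
    f_teacher = image_teacher(images)
f_student = adapter(event_student(event_voxels))

distill_loss = nn.functional.mse_loss(f_student, f_teacher)
distill_loss.backward()                 # combined with the task loss in practice
```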
168
Abstract
Pedestrian detection has attracted great research attention in video surveillance, traffic statistics, and especially autonomous driving. To date, almost all pedestrian detection solutions are derived from conventional frame-based image sensors with limited reaction speed and high data redundancy. The Dynamic Vision Sensor (DVS), which is inspired by biological retinas, efficiently captures visual information as sparse, asynchronous events rather than dense, synchronous frames. It can eliminate redundant data transmission and avoid motion blur or data leakage in high-speed imaging applications. However, it is usually impractical to apply event streams directly to conventional object detection algorithms. To address this issue, we first propose a novel event-to-frame conversion method that integrates the inherent characteristics of events more efficiently. Moreover, we design an improved feature extraction network that can reuse intermediate features to further reduce the computational effort. We evaluate the performance of our proposed method on a custom dataset containing multiple real-world pedestrian scenes. The results indicate that our proposed method improves pedestrian detection accuracy by about 5.6–10.8%, and its detection speed is nearly 20% faster than previously reported methods. Furthermore, it achieves a processing speed of about 26 FPS and an AP of 87.43% when implemented on a single CPU, fully meeting the requirement of real-time detection.
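A generic event-to-frame conversion of the kind the abstract proposes can be sketched as follows (the channel design here, polarity counts plus a recency map, is an illustrative choice, not the paper's method):

```python
import numpy as np

def events_to_frame(events, shape=(260, 346), decay_us=30_000):
    """Convert an event slice into a 3-channel frame (positive count,
    negative count, exponentially decayed 'recency') so that a frame-based
    detector can consume it."""
    frame = np.zeros((3, *shape), dtype=np.float32)
    t_last = events[-1][0]
    for t, x, y, pol in events:
        frame[0 if pol > 0 else 1, y, x] += 1.0
        frame[2, y, x] = np.exp((t - t_last) / decay_us)  # newest -> 1.0
    frame[:2] /= max(frame[:2].max(), 1.0)                # normalize counts
    return frame

events = [(t, np.random.randint(346), np.random.randint(260),
           np.random.choice([-1, 1])) for t in range(10_000)]
frame = events_to_frame(events)
```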
169
Sun S, Cioffi G, de Visser C, Scaramuzza D. Autonomous Quadrotor Flight Despite Rotor Failure With Onboard Vision Sensors: Frames vs. Events. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2020.3048875]
170
Rodriguez-Gomez JP, Tapia R, Paneque JL, Grau P, Gomez Eguiluz A, Martinez-de-Dios JR, Ollero A. The GRIFFIN Perception Dataset: Bridging the Gap Between Flapping-Wing Flight and Robotic Perception. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3056348]
171
Gehrig D, Ruegg M, Gehrig M, Hidalgo-Carrio J, Scaramuzza D. Combining Events and Frames Using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3060707]
172
Review on Vehicle Detection Technology for Unmanned Ground Vehicles. Sensors 2021; 21:1354. [PMID: 33672976; PMCID: PMC7918767; DOI: 10.3390/s21041354]
Abstract
Unmanned ground vehicles (UGVs) have great potential in both civilian and military applications and have become a research focus in many countries. Environmental perception technology is the foundation of UGVs and is of great significance for achieving safer and more efficient performance. This article first introduces sensors commonly used for vehicle detection, lists their application scenarios, and compares the strengths and weaknesses of different sensors. Secondly, related work on vehicle detection, one of the most important aspects of environmental perception technology, is reviewed and compared in detail with respect to the different sensors. Thirdly, several simulation platforms related to UGVs are presented to facilitate simulation testing of vehicle detection algorithms. In addition, some UGV datasets are summarized to support the verification of vehicle detection algorithms in practical applications. Finally, promising topics for future research on vehicle detection technology for UGVs are discussed in detail.
173
Falanga D, Kleber K, Scaramuzza D. Dynamic obstacle avoidance for quadrotors with event cameras. Sci Robot 2020; 5:eaaz9712. [PMID: 33022598; DOI: 10.1126/scirobotics.aaz9712]
Abstract
Today's autonomous drones have reaction times of tens of milliseconds, which is not enough for navigating fast in complex dynamic environments. To safely avoid fast moving objects, drones need low-latency sensors and algorithms. We departed from state-of-the-art approaches by using event cameras, which are bioinspired sensors with reaction times of microseconds. Our approach exploits the temporal information contained in the event stream to distinguish between static and dynamic objects and leverages a fast strategy to generate the motor commands necessary to avoid the approaching obstacles. Standard vision algorithms cannot be applied to event cameras because the output of these sensors is not images but a stream of asynchronous events that encode per-pixel intensity changes. Our resulting algorithm has an overall latency of only 3.5 milliseconds, which is sufficient for reliable detection and avoidance of fast-moving obstacles. We demonstrate the effectiveness of our approach on an autonomous quadrotor using only onboard sensing and computation. Our drone was capable of avoiding multiple obstacles of different sizes and shapes, at relative speeds up to 10 meters/second, both indoors and outdoors.
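A simplified sketch of using temporal information to separate dynamic from static events, in the spirit of the approach described: after ego-motion compensation (assumed already applied here), pixels dominated by recent events are flagged as belonging to moving objects. The threshold and normalization below are illustrative:

```python
import numpy as np

def dynamic_pixel_mask(events, shape=(260, 346), rho=0.4):
    """After ego-motion compensation, static structure produces events
    spread over the whole time window, while a moving object concentrates
    recent events on its silhouette. Build a normalized mean-timestamp
    image and keep the 'recent' pixels."""
    sum_t = np.zeros(shape)
    count = np.zeros(shape)
    for t, x, y, _ in events:
        sum_t[y, x] += t
        count[y, x] += 1
    mean_t = np.divide(sum_t, count, out=np.zeros(shape), where=count > 0)
    lo, hi = events[0][0], events[-1][0]
    norm = (mean_t - lo) / max(hi - lo, 1e-9)
    return (norm > 1 - rho) & (count > 0)   # pixels dominated by recent events

events = sorted((np.random.randint(0, 10_000), np.random.randint(346),
                 np.random.randint(260), 1) for _ in range(20_000))
mask = dynamic_pixel_mask(events)
```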
Affiliation(s)
- Davide Falanga, Kevin Kleber, Davide Scaramuzza: Department of Informatics, University of Zurich, Zurich, Switzerland
174
Cadena PRG, Qian Y, Wang C, Yang M. SPADE-E2VID: Spatially-Adaptive Denormalization for Event-Based Video Reconstruction. IEEE Trans Image Process 2021; 30:2488-2500. [PMID: 33502977; DOI: 10.1109/tip.2021.3052070]
Abstract
Event-based cameras have several advantages over traditional cameras that shoot videos in frames: high temporal resolution, high dynamic range, and almost no motion blur. The data produced by event sensors form a stream of events, each reporting a brightness change at an individual pixel. This characteristic makes it difficult to apply existing algorithms directly and to take full advantage of event camera data. Owing to developments in neural networks, important advances have been made in event-based image reconstruction. Even though these neural networks achieve precise reconstructions while preserving most of the properties of event cameras, there is still an initialization period during which the quality of the reconstructed frames needs to be as high as possible. In this work, we present the SPADE-E2VID neural network model, which improves the quality of early frames in an event-based reconstructed video as well as the overall contrast. SPADE-E2VID improves the quality of the first reconstructed frames by 15.87% in MSE, 4.15% in SSIM, and 2.5% in LPIPS. In addition, the SPADE layer in our model allows it to be trained to reconstruct videos without a temporal loss function. Another advantage is faster training: in a many-to-one training style, we avoid running the loss function at each step and execute it only once at the end of each loop. We also carried out experiments with event cameras that do not provide polarity data; our model produces quality video reconstructions from non-polarity events at HD resolution (1200 × 800). The video, code, and datasets are available at: https://github.com/RodrigoGantier/SPADE_E2VID.
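The spatially-adaptive denormalization (SPADE) layer at the heart of the model can be sketched as follows (channel counts and the event-voxel conditioning input are assumptions; the published architecture differs in detail):

```python
import torch
import torch.nn as nn

class SPADE(nn.Module):
    """Spatially-adaptive denormalization: normalize activations, then
    re-scale and re-shift them with per-pixel gamma/beta maps predicted
    from a conditioning input (here, an event voxel grid)."""
    def __init__(self, feat_ch, cond_ch, hidden=64):
        super().__init__()
        self.norm = nn.BatchNorm2d(feat_ch, affine=False)
        self.shared = nn.Sequential(nn.Conv2d(cond_ch, hidden, 3, padding=1),
                                    nn.ReLU())
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, feat, cond):
        # Resize the conditioning input to the feature map's resolution.
        cond = nn.functional.interpolate(cond, size=feat.shape[2:],
                                         mode="nearest")
        h = self.shared(cond)
        return self.norm(feat) * (1 + self.gamma(h)) + self.beta(h)

spade = SPADE(feat_ch=128, cond_ch=5)
out = spade(torch.rand(2, 128, 32, 32), torch.rand(2, 5, 64, 64))
```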
175
Jiao J, Ye H, Zhu Y, Liu M. Robust Odometry and Mapping for Multi-LiDAR Systems With Online Extrinsic Calibration. IEEE Trans Robot 2021. [DOI: 10.1109/tro.2021.3078287]
176
Risi N, Aimar A, Donati E, Solinas S, Indiveri G. A Spike-Based Neuromorphic Architecture of Stereo Vision. Front Neurorobot 2020; 14:568283. [PMID: 33304262; PMCID: PMC7693562; DOI: 10.3389/fnbot.2020.568283]
Abstract
The problem of finding stereo correspondences in binocular vision is solved effortlessly in nature and yet it is still a critical bottleneck for artificial machine vision systems. As temporal information is a crucial feature in this process, the advent of event-based vision sensors and dedicated event-based processors promises to offer an effective approach to solving the stereo matching problem. Indeed, event-based neuromorphic hardware provides an optimal substrate for fast, asynchronous computation, that can make explicit use of precise temporal coincidences. However, although several biologically-inspired solutions have already been proposed, the performance benefits of combining event-based sensing with asynchronous and parallel computation are yet to be explored. Here we present a hardware spike-based stereo-vision system that leverages the advantages of brain-inspired neuromorphic computing by interfacing two event-based vision sensors to an event-based mixed-signal analog/digital neuromorphic processor. We describe a prototype interface designed to enable the emulation of a stereo-vision system on neuromorphic hardware and we quantify the stereo matching performance with two datasets. Our results provide a path toward the realization of low-latency, end-to-end event-based, neuromorphic architectures for stereo vision.
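A toy software analogue of the temporal-coincidence principle the architecture exploits: two events match if they occur on the same row, with the same polarity, within a short time window (the neuromorphic system implements this with coincidence-detecting neurons rather than a loop; all parameters here are assumptions):

```python
def coincidence_stereo(left, right, dt_max=500, max_disp=40):
    """For each left event (t, x, y, pol), find the right event on the same
    row with the closest timestamp; the column offset of that partner is
    the disparity estimate."""
    matches = []
    for tl, xl, yl, pl in left:
        best = None
        for tr, xr, yr, pr in right:
            if (yr == yl and pr == pl and 0 <= xl - xr <= max_disp
                    and abs(tl - tr) <= dt_max):
                if best is None or abs(tl - tr) < best[0]:
                    best = (abs(tl - tr), xl - xr)
        if best:
            matches.append((xl, yl, best[1]))   # (x, y, disparity)
    return matches

left = [(1000, 20, 5, 1)]
right = [(1200, 12, 5, 1)]
print(coincidence_stereo(left, right))          # [(20, 5, 8)]
```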
Affiliation(s)
- Nicoletta Risi, Alessandro Aimar, Elisa Donati, Sergio Solinas, Giacomo Indiveri: Institute of Neuroinformatics, University of Zurich and Eidgenössische Technische Hochschule Zurich, Zurich, Switzerland
177
Li B, Cao H, Qu Z, Hu Y, Wang Z, Liang Z. Event-Based Robotic Grasping Detection With Neuromorphic Vision Sensor and Event-Grasping Dataset. Front Neurorobot 2020; 14:51. [PMID: 33162883; PMCID: PMC7580650; DOI: 10.3389/fnbot.2020.00051]
Abstract
Robotic grasping plays an important role in the field of robotics. Current state-of-the-art robotic grasping detection systems are usually built on conventional vision sensors such as RGB-D cameras. Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community, and event-based datasets remain scarce because annotating asynchronous event streams is troublesome: annotating large-scale vision datasets often takes substantial computational resources, especially at the video level. In this work, we consider the problem of detecting robotic grasps in a moving camera view of a scene containing objects. To obtain more agile robotic perception, a neuromorphic vision sensor (Dynamic and Active-pixel Vision Sensor, DAVIS) attached to the robot gripper is introduced to explore its potential use in grasping detection. We construct a robotic grasping dataset named the Event-Grasping dataset, containing 91 objects. A spatial-temporal mixed particle filter (SMP Filter) is proposed to track the LED-based grasp rectangles, which enables video-level annotation of a single grasp rectangle per object. As the LEDs blink at high frequency, the Event-Grasping dataset is annotated at a high frequency of 1 kHz. Based on this dataset, we develop a deep neural network for grasping detection that treats the angle learning problem as classification instead of regression. The method achieves high detection accuracy on our Event-Grasping dataset, with 93% precision on an object-wise split. This work provides a large-scale, well-annotated dataset and promotes neuromorphic vision applications in agile robotics.
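The angle-as-classification idea can be sketched in a few lines: grasp orientation is quantized into discrete bins over [0, π), exploiting the grasp rectangle's 180-degree symmetry (the bin count is an assumption, not the paper's setting):

```python
import numpy as np

def angle_to_class(theta, n_bins=18):
    """Quantize a grasp-rectangle orientation into one of n_bins classes
    over [0, pi); the rectangle is symmetric under 180-degree rotation."""
    theta = theta % np.pi
    return int(theta / (np.pi / n_bins)) % n_bins

def class_to_angle(c, n_bins=18):
    """Recover the bin-center angle from a predicted class."""
    return (c + 0.5) * np.pi / n_bins

c = angle_to_class(np.deg2rad(137.0))
print(c, np.rad2deg(class_to_angle(c)))   # bin 13, center 135 degrees
```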
Affiliation(s)
- Bin Li: JingDong Group, Beijing, China
- Hu Cao, Yingbai Hu: Robotics, Artificial Intelligence and Real-time Systems, Technische Universität München, München, Germany
- Zhongnan Qu: Computer Engineering and Networks Lab, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
- Zhenke Wang, Zichen Liang: College of Automotive Engineering, Tongji University, Shanghai, China
178
Hagenaars JJ, Paredes-Valles F, Bohte SM, de Croon GCHE. Evolved Neuromorphic Control for High Speed Divergence-Based Landings of MAVs. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3012129]
179
Fischer T, Milford M. Event-Based Visual Place Recognition With Ensembles of Temporal Windows. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3025505]
180
Hadviger A, Marković I, Petrović I. Stereo dense depth tracking based on optical flow using frames and events. Adv Robot 2020. [DOI: 10.1080/01691864.2020.1821770]
Affiliation(s)
- Antea Hadviger, Ivan Marković, Ivan Petrović: Laboratory for Autonomous Systems and Mobile Robotics (LAMOR), University of Zagreb Faculty of Electrical Engineering and Computing, Zagreb, Croatia
181
Fang Y, Ma Z, Zheng H, Ji W. Trainable TV-L1 model as recurrent nets for low-level vision. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-05146-5]
182
Huang X, Muthusamy R, Hassan E, Niu Z, Seneviratne L, Gan D, Zweiri Y. Neuromorphic Vision Based Contact-Level Classification in Robotic Grasping Applications. Sensors 2020; 20:4724. [PMID: 32825656; PMCID: PMC7506874; DOI: 10.3390/s20174724]
Abstract
In recent years, robotic sorting has become widely used in industry, driven by both necessity and opportunity. In this paper, a novel neuromorphic vision-based tactile sensing approach for robotic sorting applications is proposed. This approach has lower latency and lower power consumption compared with conventional vision-based tactile sensing techniques. Two Machine Learning (ML) methods, namely Support Vector Machine (SVM) and Dynamic Time Warping-K Nearest Neighbor (DTW-KNN), are developed to classify material hardness, object size, and grasping force. An Event-Based Object Grasping (EBOG) experimental setup is developed to acquire datasets, in which 243 experiments are produced to train the proposed classifiers. Based on the classifiers' predictions, objects can be sorted automatically. If the prediction accuracy is below a certain threshold, the gripper re-adjusts and re-grasps until reaching a proper grasp. The proposed ML methods achieve good prediction accuracy, which shows the effectiveness and applicability of the approach. The experimental results show that the developed SVM model outperforms the DTW-KNN model in terms of accuracy and efficiency for real-time contact-level classification.
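A minimal sketch of the DTW-KNN classifier mentioned in the abstract (the feature choice and sequence lengths are illustrative; the SVM branch is omitted):

```python
import numpy as np

def dtw(a, b):
    """Dynamic Time Warping distance between two 1-D sequences (e.g. event
    rates during a grasp), tolerant to speed differences between trials."""
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

def dtw_knn_predict(train, labels, query, k=3):
    """k-nearest-neighbour classification under the DTW distance."""
    d = [dtw(x, query) for x in train]
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique([labels[i] for i in nearest], return_counts=True)
    return vals[np.argmax(counts)]

train = [np.sin(np.linspace(0, 3, 50)), np.cos(np.linspace(0, 3, 60))]
print(dtw_knn_predict(train, ["soft", "hard"],
                      np.sin(np.linspace(0, 3, 55)), k=1))   # -> soft
```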
Affiliation(s)
- Xiaoqian Huang (corresponding author), Rajkumar Muthusamy, Eman Hassan, Zhenwei Niu, Lakmal Seneviratne: Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University of Science and Technology, Abu Dhabi 127788, UAE
- Dongming Gan: School of Engineering Technology, Purdue University, West Lafayette, IN 47907, USA
- Yahya Zweiri: KUCARS, Khalifa University of Science and Technology, Abu Dhabi 127788, UAE, and Faculty of Science, Engineering and Computing, Kingston University, London SW15 3DW, UK
183
Abstract
PURPOSE OF REVIEW: To provide an update on the latest research surrounding optoelectronic devices, highlighting key studies and the benefits and limitations of each device.
RECENT FINDINGS: The Argus II demonstrated long-term safety after five-year follow-up. Due to a lack of tack fixation, subretinal implants appear to displace over time. PRIMA's completed primate trial showed initial safety and potential for improved vision, resulting in ongoing clinical trials. Bionic Vision Australia developed a new 44-electrode suprachoroidal device currently in a clinical trial. Orion (cortical stimulation) is currently undergoing a clinical trial to demonstrate safety.
SUMMARY: Devices using an external camera for images are unaffected by corneal or lens opacities but disconnect eye movements from image perception, while the opposite is true for implants directly detecting light. The visual acuity a device provides is more complicated than implant electrode density alone, and new devices aim to target this with innovative approaches.
Affiliation(s)
- Victor Wang: University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Ajay E. Kuriyan: Retina Service, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, PA, USA, and Flaum Eye Institute, University of Rochester Medical Center, Rochester, NY, USA