1. Guo L, Wang J, Han H, Wang P, Lu Y, Yuan Q, Du C, Yin S, Zhou Y, Zhang C. MXene/WO3 Sensor Array with Improved SNN Algorithm for Accurate Identification of Toxic Gases. ACS Appl Mater Interfaces 2024; 16:62421-62428. [PMID: 39497603 DOI: 10.1021/acsami.4c14793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2024]
Abstract
Gas sensing is pivotal in critical areas such as industrial production and food safety. This study explores the gas classification capabilities of MXene-based gas sensors. Pure V2CTx MXene and an MXene/WO3 nanocomposite were synthesized, and MXene-based gas sensors were integrated into a rudimentary 2 × 2 electronic nose array. Gas sensitivity tests revealed that the inclusion of WO3 nanoparticles (NPs) raised the sensor's response to 10 ppm of NO2 from 2.82 to 3.45 at room temperature. Moreover, the sensor showed rapid response/recovery times of 74.5/149.0 s, excellent environmental stability, and reliable long-term sensing performance. Furthermore, we improved the identification of four toxic gases detected by the MXene-based sensor array using a spiking neural network (SNN) based on a memristive system. This identification method achieved 95.83% accuracy across the four gases. Notably, the improved SNN demonstrated approximately 5% higher accuracy than the other gas recognition algorithm. These results highlight the potential of SNNs as a powerful tool for accurately and reliably identifying toxic gases from a gas sensor array.
Affiliation(s)
- Liangchao Guo
- College of Mechanical Engineering, Yangzhou University, Yangzhou 225127, PR China
- Junke Wang
- College of Mechanical Engineering, Yangzhou University, Yangzhou 225127, PR China
- Haoran Han
- College of Mechanical Engineering, Yangzhou University, Yangzhou 225127, PR China
- Peng Wang
- College of Mechanical Engineering, Yangzhou University, Yangzhou 225127, PR China
- Yunxiang Lu
- Key Laboratory of Advanced Marine Materials, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, PR China
- Qilong Yuan
- Key Laboratory of Advanced Marine Materials, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, PR China
- Chunyu Du
- College of Materials Science and Engineering, Shenzhen University, Shenzhen 518055, PR China
- Shuo Yin
- Department of Mechanical and Manufacturing Engineering, The University of Dublin, Parsons Building, Dublin 2, Ireland
- Ye Zhou
- Institute of Advanced Study, Shenzhen University, Shenzhen 518060, PR China
- Chao Zhang
- College of Mechanical Engineering, Yangzhou University, Yangzhou 225127, PR China
2. Casanueva-Morato D, Ayuso-Martinez A, Dominguez-Morales JP, Jimenez-Fernandez A, Jimenez-Moreno G. Bio-inspired computational memory model of the Hippocampus: An approach to a neuromorphic spike-based Content-Addressable Memory. Neural Netw 2024; 178:106474. [PMID: 38941736 DOI: 10.1016/j.neunet.2024.106474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 04/12/2024] [Accepted: 06/17/2024] [Indexed: 06/30/2024]
Abstract
The brain has computational capabilities that surpass those of modern systems, being able to solve complex problems efficiently in a simple way. Neuromorphic engineering aims to mimic biology in order to develop new systems capable of incorporating such capabilities. Bio-inspired learning systems continue to be a challenge that must be solved, and much work needs to be done in this regard. Among all brain regions, the hippocampus stands out as an autoassociative short-term memory with the capacity to learn and recall memories from any fragment of them. These characteristics make the hippocampus an ideal candidate for developing bio-inspired learning systems that, in addition, resemble content-addressable memories. Therefore, in this work we propose a bio-inspired spiking content-addressable memory model based on the CA3 region of the hippocampus with the ability to learn, forget and recall memories, both orthogonal and non-orthogonal, from any fragment of them. The model was implemented on the SpiNNaker hardware platform using Spiking Neural Networks. A set of experiments based on functional, stress and applicability tests were performed to demonstrate its correct functioning. This work presents the first hardware implementation of a fully-functional bio-inspired spiking hippocampal content-addressable memory model, paving the way for the development of future more complex neuromorphic systems.
Affiliation(s)
- Daniel Casanueva-Morato
- Escuela Técnica Superior de Ingeniería Informática (ETSII), Universidad de Sevilla, Seville, Avenida de Reina Mercedes s/n, 41012, Spain; Robotics and Tech. of Computers Lab., Universidad de Sevilla, Seville, 41012, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, Sevilla, 41011, Spain.
- Alvaro Ayuso-Martinez
- Escuela Técnica Superior de Ingeniería Informática (ETSII), Universidad de Sevilla, Seville, Avenida de Reina Mercedes s/n, 41012, Spain; Robotics and Tech. of Computers Lab., Universidad de Sevilla, Seville, 41012, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, Sevilla, 41011, Spain.
- Juan P Dominguez-Morales
- Escuela Técnica Superior de Ingeniería Informática (ETSII), Universidad de Sevilla, Seville, Avenida de Reina Mercedes s/n, 41012, Spain; Robotics and Tech. of Computers Lab., Universidad de Sevilla, Seville, 41012, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, Sevilla, 41011, Spain; Smart Computer Systems Research and Engineering Lab (SCORE), Research Institute of Computer Engineering (I3US), Universidad de Sevilla, Seville, 41012, Spain.
- Angel Jimenez-Fernandez
- Escuela Técnica Superior de Ingeniería Informática (ETSII), Universidad de Sevilla, Seville, Avenida de Reina Mercedes s/n, 41012, Spain; Robotics and Tech. of Computers Lab., Universidad de Sevilla, Seville, 41012, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, Sevilla, 41011, Spain; Smart Computer Systems Research and Engineering Lab (SCORE), Research Institute of Computer Engineering (I3US), Universidad de Sevilla, Seville, 41012, Spain.
- Gabriel Jimenez-Moreno
- Escuela Técnica Superior de Ingeniería Informática (ETSII), Universidad de Sevilla, Seville, Avenida de Reina Mercedes s/n, 41012, Spain; Robotics and Tech. of Computers Lab., Universidad de Sevilla, Seville, 41012, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, Sevilla, 41011, Spain; Smart Computer Systems Research and Engineering Lab (SCORE), Research Institute of Computer Engineering (I3US), Universidad de Sevilla, Seville, 41012, Spain.
3. Jarne C. Exploring Flip Flop memories and beyond: training Recurrent Neural Networks with key insights. Front Syst Neurosci 2024; 18:1269190. [PMID: 38600907 PMCID: PMC11004305 DOI: 10.3389/fnsys.2024.1269190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 03/11/2024] [Indexed: 04/12/2024] Open
Abstract
Training neural networks to perform different tasks is relevant across various disciplines. In particular, Recurrent Neural Networks (RNNs) are of great interest in Computational Neuroscience. Open-source Machine Learning frameworks such as TensorFlow and Keras have produced significant changes in the development of technologies that we currently use. This work contributes by comprehensively investigating and describing the application of RNNs for temporal processing through a study of a 3-bit Flip Flop memory implementation. We delve into the entire modeling process, encompassing equations, task parametrization, and software development. The obtained networks are meticulously analyzed to elucidate their dynamics, aided by an array of visualization and analysis tools. Moreover, the provided code is versatile enough to facilitate the modeling of diverse tasks and systems. Furthermore, we present how memory states can be efficiently stored in the vertices of a cube in the dimensionally reduced space, supplementing previous results with a distinct approach.
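As an illustration of the task this abstract describes, the following is a minimal sketch of how training data for an n-bit flip-flop might be generated. It is not the author's code: the function name `make_flip_flop_trial`, the pulse probability, and the trial length are all assumptions chosen for the example. Each input channel carries sparse +1/-1 pulses and the matching target channel holds the sign of the most recent pulse.

```python
import numpy as np

def make_flip_flop_trial(n_steps=200, n_bits=3, pulse_prob=0.05, seed=0):
    """Generate one trial of an n-bit flip-flop task (illustrative only).

    inputs[t, k]  is +1 or -1 when channel k receives a pulse, else 0.
    targets[t, k] is the sign of the last pulse seen on channel k
                  (0 until the first pulse arrives).
    """
    rng = np.random.default_rng(seed)
    pulses = rng.random((n_steps, n_bits)) < pulse_prob
    signs = rng.choice([-1.0, 1.0], size=(n_steps, n_bits))
    inputs = np.where(pulses, signs, 0.0)

    targets = np.zeros_like(inputs)
    state = np.zeros(n_bits)          # memory held by the ideal flip-flop
    for t in range(n_steps):
        active = inputs[t] != 0       # channels that were pulsed at step t
        state[active] = inputs[t, active]
        targets[t] = state
    return inputs, targets

inputs, targets = make_flip_flop_trial()
```

Once every channel has been pulsed at least once, a 3-bit target takes one of 2^3 = 8 sign combinations, which is consistent with the cube-vertex picture mentioned in the abstract.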
Affiliation(s)
- Cecilia Jarne
- Departamento de Ciencia y Tecnologia de la Universidad Nacional de Quilmes, Bernal, Quilmes, Buenos Aires, Argentina
- CONICET, Buenos Aires, Argentina
- Department of Clinical Medicine, Center of Functionally Integrative Neuroscience, Aarhus University, Aarhus, Denmark
4. Islam M, Hasan Majumder M, Hussein M, Hossain KM, Miah M. A review of machine learning and deep learning algorithms for Parkinson's disease detection using handwriting and voice datasets. Heliyon 2024; 10:e25469. [PMID: 38356538 PMCID: PMC10865258 DOI: 10.1016/j.heliyon.2024.e25469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 11/30/2023] [Accepted: 01/27/2024] [Indexed: 02/16/2024] Open
Abstract
Parkinson's Disease (PD) is a prevalent neurodegenerative disorder with significant clinical implications. Early and accurate diagnosis of PD is crucial for timely intervention and personalized treatment. In recent years, Machine Learning (ML) and Deep Learning (DL) techniques have emerged as promising tools for improving PD diagnosis. This review paper presents a detailed analysis of the current state of ML and DL-based PD diagnosis, focusing on voice, handwriting, and wave spiral datasets. The study also evaluates the effectiveness of various ML and DL algorithms, including classifiers, on these datasets and highlights their potential in enhancing diagnostic accuracy and aiding clinical decision-making. Additionally, the paper explores the identification of biomarkers using these techniques, offering insights into improving the diagnostic process. The discussion encompasses different data formats and commonly employed ML and DL methods in PD diagnosis, providing a comprehensive overview of the field. This review serves as a roadmap for future research, guiding the development of ML and DL-based tools for PD detection. It is expected to benefit both the scientific community and medical practitioners by advancing our understanding of PD diagnosis and ultimately improving patient outcomes.
Affiliation(s)
- Md. Ariful Islam
- Department of Robotics and Mechatronics Engineering, University of Dhaka, Nilkhet Rd, Dhaka, 1000, Bangladesh
- Md. Ziaul Hasan Majumder
- Institute of Electronics, Bangladesh Atomic Energy Commission, Dhaka, 1207, Bangladesh
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Md. Alomgeer Hussein
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Khondoker Murad Hossain
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Md. Sohel Miah
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Moulvibazar Polytechnic Institute, Bangladesh
5. Siddique MAB, Zhang Y, An H. Monitoring time domain characteristics of Parkinson's disease using 3D memristive neuromorphic system. Front Comput Neurosci 2023; 17:1274575. [PMID: 38162516 PMCID: PMC10754992 DOI: 10.3389/fncom.2023.1274575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 11/06/2023] [Indexed: 01/03/2024] Open
Abstract
Introduction: Parkinson's disease (PD) is a neurodegenerative disorder affecting millions of patients. Closed-Loop Deep Brain Stimulation (CL-DBS) is a therapy that can alleviate the symptoms of PD. The CL-DBS system consists of an electrode sending electrical stimulation signals to a specific region of the brain and a battery-powered stimulator implanted in the chest. The electrical stimuli in CL-DBS systems need to be adjusted in real time in accordance with the state of PD symptoms. Therefore, fast and precise monitoring of PD symptoms is a critical function for CL-DBS systems. However, current CL-DBS techniques suffer from high computational demands for real-time PD symptom monitoring, which are not feasible for implanted and wearable medical devices. Methods: In this paper, we present an energy-efficient neuromorphic PD symptom detector using memristive three-dimensional integrated circuits (3D-ICs). The excessive oscillation at beta frequencies (13-35 Hz) at the subthalamic nucleus (STN) is used as a biomarker of PD symptoms. Results: Simulation results demonstrate that our neuromorphic PD detector, implemented with an 8-layer spiking Long Short-Term Memory (S-LSTM), excels in recognizing PD symptoms, achieving a training accuracy of 99.74% and a validation accuracy of 99.52% for a 75%-25% data split. Furthermore, we evaluated the improvement of our neuromorphic CL-DBS detector using NeuroSIM. The chip area, latency, energy, and power consumption of our CL-DBS detector were reduced by 47.4%, 66.63%, 65.6%, and 67.5%, respectively, for monolithic 3D-ICs. Similarly, for heterogeneous 3D-ICs, employing memristive synapses to replace traditional Static Random Access Memory (SRAM) resulted in reductions of 44.8%, 64.75%, 65.28%, and 67.7% in chip area, latency, energy, and power usage, respectively. Discussion: This study introduces a novel approach for PD symptom evaluation by directly utilizing spiking signals from neural activities in the time domain. This method significantly reduces the time and energy required for signal conversion compared to traditional frequency domain approaches. The study pioneers the use of neuromorphic computing and memristors in designing CL-DBS systems, surpassing SRAM-based designs in chip area, latency, and energy efficiency. Lastly, the proposed neuromorphic PD detector demonstrates high resilience to timing variations in brain neural signals, as confirmed by robustness analysis.
Affiliation(s)
- Md Abu Bakr Siddique
- Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI, United States
- Yan Zhang
- Department of Biological Sciences, Michigan Technological University, Houghton, MI, United States
- Hongyu An
- Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI, United States
6. Bitar A, Rosales R, Paulitsch M. Gradient-based feature-attribution explainability methods for spiking neural networks. Front Neurosci 2023; 17:1153999. [PMID: 37829721 PMCID: PMC10565802 DOI: 10.3389/fnins.2023.1153999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 09/01/2023] [Indexed: 10/14/2023] Open
Abstract
Introduction: Spiking neural networks (SNNs) are a model of computation that mimics the behavior of biological neurons. SNNs process event data (spikes) and operate more sparsely than artificial neural networks (ANNs), resulting in ultra-low latency and small power consumption. This paper aims to adapt and evaluate gradient-based explainability methods for SNNs, which were originally developed for conventional ANNs. Methods: The adapted methods aim to create input feature attribution maps for SNNs trained through backpropagation that process either event-based spiking data or real-valued data. The methods address the limitations of existing work on explainability methods for SNNs, such as poor scalability, restriction to convolutional layers, the need to train another model, and the provision of activation-value maps instead of true attribution scores. The adapted methods are evaluated on classification tasks for both real-valued and spiking data, and the accuracy of the proposed methods is confirmed through perturbation experiments at the pixel and spike levels. Results and discussion: The results reveal that gradient-based SNN attribution methods successfully identify highly contributing pixels and spikes with significantly less computation time than model-agnostic methods. Additionally, we observe that the chosen coding technique has a noticeable effect on which input features are most significant. These findings demonstrate the potential of gradient-based explainability methods for SNNs in improving our understanding of how these networks process information and in contributing to the development of more efficient and accurate SNNs.
Affiliation(s)
- Ammar Bitar
- Intel Labs, Munich, Germany
- Department of Knowledge Engineering, Maastricht University, Maastricht, Netherlands
7. Rivero-Ortega JD, Mosquera-Maturana JS, Pardo-Cabrera J, Hurtado-López J, Hernández JD, Romero-Cano V, Ramírez-Moreno DF. Ring attractor bio-inspired neural network for social robot navigation. Front Neurorobot 2023; 17:1211570. [PMID: 37719331 PMCID: PMC10501606 DOI: 10.3389/fnbot.2023.1211570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 08/14/2023] [Indexed: 09/19/2023] Open
Abstract
Introduction: We introduce a bio-inspired navigation system for a robot to guide a social agent to a target location while avoiding static and dynamic obstacles. Robot navigation can be accomplished through a model of ring attractor neural networks. This connectivity pattern between neurons enables the generation of stable activity patterns that can represent continuous variables such as heading direction or position. The integration of sensory representation, decision-making, and motor control through ring attractor networks offers a biologically-inspired approach to navigation in complex environments. Methods: The navigation system is divided into perception, planning, and control stages. Our approach is compared to the widely-used Social Force Model and Rapidly Exploring Random Tree Star methods using the Social Individual Index and Relative Motion Index as metrics in simulated experiments. We created a virtual scenario of a pedestrian area with various obstacles and dynamic agents. Results: The results obtained in our experiments demonstrate the effectiveness of this architecture in guiding a social agent while avoiding obstacles, and the metrics used for evaluating the system indicate that our proposal outperforms the widely used Social Force Model. Discussion: Our approach points to improving safety and comfort specifically for human-robot interactions. By integrating the Social Individual Index and Relative Motion Index, this approach considers both social comfort and collision avoidance features, resulting in better human-robot interactions in a crowded environment.
Affiliation(s)
- Josh Pardo-Cabrera
- Department of Engineering, Universidad Autónoma de Occidente, Cali, Colombia
- Juan D. Hernández
- School of Computer Science and Informatics, Cardiff University, Cardiff, United Kingdom
- Victor Romero-Cano
- Robotics and Autonomous Systems Laboratory, Faculty of Engineering, Universidad Autónoma de Occidente, Cali, Colombia
- Rimac Technology, Zagreb, Croatia
8. Sanaullah, Koravuna S, Rückert U, Jungeblut T. Exploring spiking neural networks: a comprehensive analysis of mathematical models and applications. Front Comput Neurosci 2023; 17:1215824. [PMID: 37692462 PMCID: PMC10483570 DOI: 10.3389/fncom.2023.1215824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 08/07/2023] [Indexed: 09/12/2023] Open
Abstract
This article presents a comprehensive analysis of spiking neural networks (SNNs) and their mathematical models for simulating the behavior of neurons through the generation of spikes. The study explores various models, including LIF and NLIF, for constructing SNNs and investigates their potential applications in different domains. However, implementation poses several challenges, including identifying the most appropriate model for classification tasks that demand high accuracy and low performance loss. To address this issue, this research study compares the performance, behavior, and spike generation of multiple SNN models using consistent inputs and neurons. Moreover, the study quantifies the number of spiking operations required by each model to process the same inputs and produce equivalent outputs, enabling a thorough assessment of computational efficiency. The findings provide valuable insights into the benefits, challenges, and limitations of SNNs and their models, and underscore the significance of comparing multiple models to make informed decisions in practical applications. Additionally, the results reveal essential variations in biological plausibility and computational efficiency among the models, further emphasizing the importance of selecting the most suitable model for a given task. Overall, this study contributes to a deeper understanding of SNNs and offers practical guidelines for using their potential in real-world scenarios.
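For readers unfamiliar with the LIF (leaky integrate-and-fire) model named in this abstract, the following is a minimal textbook-style sketch using forward-Euler integration. It is not taken from the paper: the function name `simulate_lif` and all parameter values (time step, membrane time constant, threshold, reset) are assumptions picked for illustration.

```python
import numpy as np

def simulate_lif(current, dt=1e-3, tau=20e-3, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron (illustrative parameters).

    The membrane potential follows tau * dv/dt = -(v - v_rest) + I(t).
    When v reaches v_thresh, a spike is recorded and v is reset.
    Returns the potential trace and the list of spike step indices.
    """
    v = v_rest
    trace = np.empty(len(current))
    spikes = []
    for i, i_in in enumerate(current):
        v += dt / tau * (-(v - v_rest) + i_in)   # forward-Euler update
        if v >= v_thresh:
            spikes.append(i)
            v = v_reset
        trace[i] = v                              # stored after any reset
    return trace, spikes

# A constant drive above threshold produces regular spiking;
# a drive whose steady state lies below threshold produces none.
trace, spikes = simulate_lif(np.full(1000, 1.5))
weak_trace, weak_spikes = simulate_lif(np.full(1000, 0.5))
```

Counting `len(spikes)` for a fixed input is one simple way to compare the number of spiking operations across neuron models, in the spirit of the comparison the abstract describes.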
Affiliation(s)
- Sanaullah
- Industrial Internet of Things, Department of Engineering and Mathematics, Bielefeld University of Applied Sciences and Arts, Bielefeld, Germany
- Shamini Koravuna
- AG Kognitronik & Sensorik, Technical Faculty, Universität Bielefeld, Bielefeld, Germany
- Ulrich Rückert
- AG Kognitronik & Sensorik, Technical Faculty, Universität Bielefeld, Bielefeld, Germany
- Thorsten Jungeblut
- Industrial Internet of Things, Department of Engineering and Mathematics, Bielefeld University of Applied Sciences and Arts, Bielefeld, Germany
9. Aceituno PV, Farinha MT, Loidl R, Grewe BF. Learning cortical hierarchies with temporal Hebbian updates. Front Comput Neurosci 2023; 17:1136010. [PMID: 37293353 PMCID: PMC10244748 DOI: 10.3389/fncom.2023.1136010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 04/25/2023] [Indexed: 06/10/2023] Open
Abstract
A key driver of mammalian intelligence is the ability to represent incoming sensory information across multiple abstraction levels. For example, in the visual ventral stream, incoming signals are first represented as low-level edge filters and then transformed into high-level object representations. Similar hierarchical structures routinely emerge in artificial neural networks (ANNs) trained for object recognition tasks, suggesting that similar structures may underlie biological neural networks. However, the classical ANN training algorithm, backpropagation, is considered biologically implausible, and thus alternative biologically plausible training methods have been developed, such as Equilibrium Propagation, Deep Feedback Control, Supervised Predictive Coding, and Dendritic Error Backpropagation. Several of those models propose that local errors are calculated for each neuron by comparing apical and somatic activities. From a neuroscience perspective, however, it is not clear how a neuron could compare compartmental signals. Here, we propose a solution to this problem: we let the apical feedback signal change the postsynaptic firing rate and combine this with a differential Hebbian update, a rate-based version of classical spike-timing-dependent plasticity (STDP). We prove that weight updates of this form minimize two alternative loss functions that we prove to be equivalent to the error-based losses used in machine learning: the inference latency and the amount of top-down feedback necessary. Moreover, we show that differential Hebbian updates work similarly well in other feedback-based deep learning frameworks such as Predictive Coding or Equilibrium Propagation. Finally, our work removes a key requirement of biologically plausible models for deep learning and proposes a learning mechanism that would explain how temporal Hebbian learning rules can implement supervised hierarchical learning.
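The abstract does not give the exact update rule, so the following is only a generic sketch of a differential Hebbian update in its common textbook form, where the weight change is proportional to the presynaptic rate times the temporal change of the postsynaptic rate. The function name `differential_hebbian_update`, the learning rate, and the example rates are assumptions; here the "before" and "after" postsynaptic rates stand for activity before and after feedback has nudged the neuron.

```python
import numpy as np

def differential_hebbian_update(weights, pre_rate, post_before, post_after,
                                dt=1.0, lr=0.01):
    """Generic differential Hebbian step (illustrative, not the paper's rule).

    delta_w[i, j] = lr * d(post_i)/dt * pre_j, with the time derivative
    approximated by a finite difference of the postsynaptic rate.
    """
    d_post = (post_after - post_before) / dt
    return weights + lr * np.outer(d_post, pre_rate)

weights = np.zeros((2, 3))                  # 2 postsynaptic, 3 presynaptic units
pre_rate = np.array([1.0, 0.0, 2.0])
post_before = np.array([0.5, 0.5])
post_after = np.array([1.0, 0.25])          # feedback raised unit 0, lowered unit 1
new_weights = differential_hebbian_update(weights, pre_rate, post_before, post_after)
```

Synapses onto a neuron whose rate increased are strengthened in proportion to presynaptic activity, those onto a neuron whose rate decreased are weakened, and silent inputs are left unchanged.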
Affiliation(s)
- Pau Vilimelis Aceituno
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- ETH AI Center, ETH Zurich, Zurich, Switzerland
- Reinhard Loidl
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- Benjamin F. Grewe
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Zurich, Switzerland
- ETH AI Center, ETH Zurich, Zurich, Switzerland
10. Li Y, Jia S, Li Q. BalanceHRNet: An effective network for bottom-up human pose estimation. Neural Netw 2023; 161:297-305. [PMID: 36774867 DOI: 10.1016/j.neunet.2023.01.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 12/04/2022] [Accepted: 01/24/2023] [Indexed: 02/05/2023]
Abstract
In human pose estimation, which is widely used in safety and sports scenes, the performance of deep learning methods drops sharply in crowded scenes with a high overlap rate. Therefore, we propose a bottom-up model, called BalanceHRNet, which is based on a balanced high-resolution module and a new branch attention module. BalanceHRNet draws on the multi-branch structure and fusion method of the popular HigherHRNet model, and overcomes HigherHRNet's inability to obtain a sufficiently large receptive field. Specifically, through the connecting structure in the balanced high-resolution module, we can connect almost all convolutional layers and obtain a sufficiently large receptive field. At the same time, the balanced high-resolution module maintains the multi-resolution representation, which enables our model to recognize objects at a richer range of scales and obtain more complex semantic information. For branch fusion, we design branch attention to obtain the importance of different branches at different stages. Finally, our model improves accuracy while requiring less computation than HigherHRNet. The CrowdPose dataset is used as the test dataset, and HigherHRNet, AlphaPose, OpenPose and others are taken as comparison models. The AP measured by BalanceHRNet is 63.0%, an increase of 3.1% over the best comparison model, HigherHRNet. We also demonstrate the effectiveness of our network on the COCO (2017) keypoint detection dataset. Compared with HigherHRNet-w32, the AP of our model is improved by 1.6%.
Affiliation(s)
- Yaoping Li
- No. 36 North Third Ring East Road, Beijing, China
- Shuangcheng Jia
- No. 36 North Third Ring East Road, Beijing, China. http://www.zhidaohulian.com/
- Qian Li
- No. 36 North Third Ring East Road, Beijing, China.
11. Shang F, Lan Y, Yang J, Li E, Kang X. Robust data hiding for JPEG images with invertible neural network. Neural Netw 2023; 163:219-232. [PMID: 37062180 DOI: 10.1016/j.neunet.2023.03.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 03/15/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023]
Abstract
JPEG compression causes severe distortion to a shared compressed image, which makes it very challenging to extract messages correctly from the stego image. To address this challenge, we propose a novel end-to-end robust data hiding scheme for JPEG images. The embedding and extraction of secret messages on the quantized discrete cosine transform (DCT) coefficients are implemented by the bi-directional process of an invertible neural network (INN), which can provide intrinsic robustness against lossy JPEG compression. We design a JPEG compression attack module to simulate the JPEG compression process, which helps the network automatically learn how to recover the secret message from a JPEG-compressed image. Experimental results demonstrate that our method achieves strong robustness against lossy JPEG compression, and also significantly improves security compared with existing data hiding methods while ensuring image quality and high capacity. For example, the detection error of our method against XuNet is 3.45% higher than that of the existing data hiding methods.
12. Pandey A, Vishwakarma DK. VABDC-Net: A framework for Visual-Caption Sentiment Recognition via spatio-depth visual attention and bi-directional caption processing. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
13. Faheem ZB, Ishaq A, Rustam F, de la Torre Díez I, Gavilanes D, Vergara MM, Ashraf I. Image Watermarking Using Least Significant Bit and Canny Edge Detection. Sensors (Basel) 2023; 23:1210. [PMID: 36772250 PMCID: PMC9921098 DOI: 10.3390/s23031210] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/11/2022] [Revised: 01/11/2023] [Accepted: 01/16/2023] [Indexed: 06/18/2023]
Abstract
With the advancement of information technology, stealing and duplicating digital data have become easier. Over a trillion bytes of data are generated and shared on social media through the internet in a single day, and the authenticity of digital data is currently a major problem. Cryptography and image watermarking are domains that provide multiple security services, such as authenticity, integrity, and privacy. In this paper, a digital image watermarking technique is proposed that employs the least significant bit (LSB) and the Canny edge detection method. The proposed method provides better security services and is computationally less expensive, which is the demand of today's world. The major contribution of this method is to find suitable places for watermark embedding and to provide additional watermark security by scrambling the watermark image. A digital image is divided into non-overlapping blocks, and the gradient is calculated for each block. Then convolution masks are applied to find the gradient direction and magnitude, and non-maximum suppression is applied. Finally, LSB is used to embed the watermark in the hysteresis step. Furthermore, additional security is provided by scrambling the watermark signal using our chaotic substitution box. The proposed technique is more secure because of LSB's high payload and because the watermark is embedded after a Canny edge detection filter. The Canny edge gradient direction and magnitude determine how many bits will be embedded. To test the performance of the proposed technique, several image processing and geometrical attacks are performed. The proposed method shows high robustness to image processing and geometrical attacks.
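The LSB embedding step can be illustrated with a deliberately simplified sketch. It is not the authors' pipeline: it embeds one bit per selected pixel, omits the chaotic scrambling, and replaces the Canny-derived embedding locations with a hypothetical boolean `mask` that sender and receiver are assumed to share. The function names `embed_lsb` and `extract_lsb` are likewise invented for this example.

```python
import numpy as np

def embed_lsb(cover, bits, mask):
    """Write watermark bits into the least significant bit of masked pixels."""
    bits = np.asarray(bits, dtype=np.uint8)
    rows, cols = np.nonzero(mask)              # candidate embedding positions
    if len(bits) > len(rows):
        raise ValueError("watermark longer than the number of selected pixels")
    rows, cols = rows[:len(bits)], cols[:len(bits)]
    stego = cover.copy()
    # Clear the lowest bit of each selected pixel, then set it to the payload bit.
    stego[rows, cols] = (stego[rows, cols] & 0xFE) | bits
    return stego

def extract_lsb(stego, n_bits, mask):
    """Read back the first n_bits payload bits from the masked pixels."""
    rows, cols = np.nonzero(mask)
    return stego[rows[:n_bits], cols[:n_bits]] & 1

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[::2, :] = True                            # stand-in for an edge map
watermark = rng.integers(0, 2, size=16, dtype=np.uint8)

stego = embed_lsb(cover, watermark, mask)
recovered = extract_lsb(stego, len(watermark), mask)
```

Because only the lowest bit changes, each pixel differs from the cover by at most one gray level, which is why LSB embedding is visually unobtrusive.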
Affiliation(s)
- Zaid Bin Faheem
- Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
| | - Abid Ishaq
- Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
| | - Furqan Rustam
- School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
| | - Isabel de la Torre Díez
- Department of Signal Theory and Communications and Telematic Engineering, University of Valladolid, Paseo de Belén 15, 47011 Valladolid, Spain
| | - Daniel Gavilanes
- Center for Nutrition & Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
- Universidad Internacional Iberoamericana, Arecibo, PR 00613, USA
- Universidade Internacional do Cuanza, Cuito EN250, Angola
| | - Manuel Masias Vergara
- Center for Nutrition & Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
- Área de Nutrición y Salud, Universidad Internacional Iberoamericana, Campeche 24560, Mexico
- Fundación Universitaria Internacional de Colombia, Bogotá 111311, Colombia
| | - Imran Ashraf
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
| |
|
14
|
Wang Y, Li H, Zheng Y, Peng J. A directionally selective collision-sensing visual neural network based on fractional-order differential operator. Front Neurorobot 2023; 17:1149675. [PMID: 37152416 PMCID: PMC10160397 DOI: 10.3389/fnbot.2023.1149675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Accepted: 03/30/2023] [Indexed: 05/09/2023] Open
Abstract
In this paper, we propose a directionally selective fractional-order lobular giant motion detector (LGMD) visual neural network. Unlike most collision-sensing network models based on LGMDs, our model can not only sense collision threats but also obtain the motion direction of the collision object. Firstly, this paper simulates the membrane potential response of neurons using the fractional-order differential operator to generate reliable collision response spikes. Then, a new correlation mechanism is proposed to obtain the motion direction of objects. Specifically, this paper performs correlation operation on the signals extracted from two pixels, utilizing the temporal delay of the signals to obtain their position relationship. In this way, the response characteristics of direction-selective neurons can be characterized. Finally, ON/OFF visual channels are introduced to encode increases and decreases in brightness, respectively, thereby modeling the bipolar response of special neurons. Extensive experimental results show that the proposed visual neural system conforms to the response characteristics of biological LGMD and direction-selective neurons, and that the performance of the system is stable and reliable.
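The delay-and-correlate idea for recovering motion direction from two pixel signals can be sketched with a classic Reichardt-type correlator. This is a generic textbook construction chosen for illustration, not the fractional-order model proposed in the paper; the function names and the one-sample delay are assumptions.

```python
import numpy as np

def delayed(x, d):
    """Shift x right by d samples, zero-padding the start."""
    y = np.zeros_like(x)
    y[d:] = x[:len(x) - d]
    return y

def direction(a, b, d=1):
    """Opponent correlation of two pixel signals.

    Positive: the stimulus reached pixel a before pixel b.
    Negative: it reached b first. Zero: no preferred direction.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum(delayed(a, d) * b) - np.sum(delayed(b, d) * a))

# A brightness pulse passes pixel a at t=1 and pixel b at t=2.
a = [0, 1, 0, 0, 0]
b = [0, 0, 1, 0, 0]
print(direction(a, b))  # 1.0  (motion from a to b)
print(direction(b, a))  # -1.0 (motion from b to a)
```

The sign of the output encodes which pixel was stimulated first, which is the position relationship the abstract refers to.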
|
15
|
Hierarchically stacked graph convolution for emotion recognition in conversation. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
|
16
|
Walther D, Viehweg J, Haueisen J, Mäder P. A systematic comparison of deep learning methods for EEG time series analysis. Front Neuroinform 2023; 17:1067095. [PMID: 36911074 PMCID: PMC9995756 DOI: 10.3389/fninf.2023.1067095] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/11/2022] [Accepted: 01/30/2023] [Indexed: 02/25/2023] Open
Abstract
Analyzing time series data like EEG or MEG is challenging due to noisy, high-dimensional, and patient-specific signals. Deep learning methods have been demonstrated to be superior in analyzing time series data compared to shallow learning methods, which utilize handcrafted and often subjective features. In particular, recurrent deep neural networks (RNN) are considered suitable for analyzing such continuous data. However, previous studies show that they are computationally expensive and difficult to train. In contrast, feed-forward networks (FFN) have previously mostly been considered in combination with hand-crafted and problem-specific feature extractions, such as short-time Fourier and discrete wavelet transforms. Easily applicable methods that efficiently analyze raw data, removing the need for problem-specific adaptations, are sought after. In this work, we systematically compare RNN and FFN topologies as well as advanced architectural concepts on multiple datasets with the same data preprocessing pipeline. We examine the behavior of those approaches to provide an update and guideline for researchers who deal with automated analysis of EEG time series data. To ensure that the results are meaningful, it is important to compare the presented approaches while keeping the same experimental setup, which to our knowledge has never been done before. This paper is a first step toward a fairer comparison of different methodologies with EEG time series data. Our results indicate that a recurrent LSTM architecture with attention performs best on less complex tasks, while the temporal convolutional network (TCN) outperforms all the recurrent architectures on the most complex dataset, yielding an 8.61% accuracy improvement. In general, we found the attention mechanism to substantially improve classification results of RNNs. Toward a light-weight and online learning-ready approach, we found extreme learning machines (ELM) to yield comparable results for the less complex tasks.
Affiliation(s)
- Dominik Walther
- Data-Intensive Systems and Visualization Group (dAI.SY), Technische Universität Ilmenau, Ilmenau, Germany
| | - Johannes Viehweg
- Data-Intensive Systems and Visualization Group (dAI.SY), Technische Universität Ilmenau, Ilmenau, Germany
| | - Jens Haueisen
- Institute of Biomedical Engineering and Informatics, Technische Universität Ilmenau, Ilmenau, Germany
| | - Patrick Mäder
- Data-Intensive Systems and Visualization Group (dAI.SY), Technische Universität Ilmenau, Ilmenau, Germany; Faculty of Biological Sciences, Friedrich Schiller University, Jena, Germany
| |
|
17
|
Feng H, Zeng Y. A brain-inspired robot pain model based on a spiking neural network. Front Neurorobot 2022; 16:1025338. [PMID: 36605522 PMCID: PMC9807619 DOI: 10.3389/fnbot.2022.1025338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
Introduction: Pain is a crucial function for organisms. Building a "Robot Pain" model inspired by organisms' pain could help the robot learn self-preservation and extend longevity. Most previous studies about robots and pain focus on robots interacting with people by recognizing their pain expressions or scenes, or avoiding obstacles by recognizing dangerous objects. Robots do not have human-like pain capacity and cannot adaptively respond to danger. Inspired by the evolutionary mechanisms of pain emergence and the Free Energy Principle (FEP) in the brain, we summarize the neural mechanisms of pain and construct a Brain-inspired Robot Pain Spiking Neural Network (BRP-SNN) with a spike-timing-dependent plasticity (STDP) learning rule and a population coding method. Methods: The proposed model can quantify machine injury by detecting the coupling relationship between multi-modality sensory information and generating "robot pain" as an internal state. Results: We provide a comparative analysis with the results of neuroscience experiments, showing that our model has biological interpretability. We also successfully tested our model on two tasks with real robots: the alerting actual injury task and the preventing potential injury task. Discussion: Our work has two major contributions: (1) It has positive implications for the integration of pain concepts into robotics in the intelligent robotics field. (2) Our summary of pain's neural mechanisms and the implemented computational simulations provide a new perspective to explore the nature of pain, which has significant value for future pain research in the cognitive neuroscience field.
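For readers unfamiliar with the STDP learning rule mentioned above, a minimal pair-based form can be sketched as follows. This is the standard textbook exponential window, not necessarily the variant used in BRP-SNN; the function name and all parameter values are hypothetical.

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair.

    dt = t_post - t_pre in milliseconds (all constants are illustrative).
    Pre-before-post (dt > 0) potentiates; post-before-pre (dt < 0) depresses.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

# Closely timed causal pairs strengthen the synapse more than distant ones.
near, far, reverse = stdp_dw(1.0), stdp_dw(10.0), stdp_dw(-5.0)
```

The magnitude of the change decays exponentially with the spike-time difference, so only nearly coincident spikes modify the weight appreciably.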
Affiliation(s)
- Hui Feng
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
| | - Yi Zeng
- Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China; *Correspondence: Yi Zeng
| |
|
18
|
Li C, Hou L, Pan J, Chen H, Cai X, Liang G. Tuberculous pleural effusion prediction using ant colony optimizer with grade-based search assisted support vector machine. Front Neuroinform 2022; 16:1078685. [PMID: 36601381 PMCID: PMC9806141 DOI: 10.3389/fninf.2022.1078685] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/24/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Introduction Although tuberculous pleural effusion (TBPE) is simply an inflammatory response of the pleura caused by tuberculosis infection, it can lead to pleural adhesions and cause sequelae of pleural thickening, which may severely affect the mobility of the chest cavity. Methods In this study, we propose bGACO-SVM, a model with good diagnostic power, for the adjunctive diagnosis of TBPE. The model is based on an enhanced continuous ant colony optimization (ACOR) with grade-based search technique (GACO) and support vector machine (SVM) for wrapped feature selection. In GACO, grade-based search greatly improves the convergence performance of the algorithm and the ability to avoid getting trapped in local optimization, which improves the classification capability of bGACO-SVM. Results To test the performance of GACO, this work conducts comparative experiments between GACO and nine basic algorithms and nine state-of-the-art variants as well. Although the proposed GACO does not offer much advantage in terms of time complexity, the experimental results strongly demonstrate the core advantages of GACO. The accuracy of bGACO-predictive SVM was evaluated using existing datasets from the UCI and TBPE datasets. Discussion In the TBPE dataset trial, 147 TBPE patients were evaluated using the created bGACO-SVM model, showing that the bGACO-SVM method is an effective technique for accurately predicting TBPE.
Affiliation(s)
- Chengye Li
- Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Lingxian Hou
- Department of Rehabilitation, Wenzhou Hospital of Integrated Traditional Chinese and Western Medicine, Wenzhou, China
| | - Jingye Pan
- Key Laboratory of Intelligent Treatment and Life Support for Critical Diseases of Zhejiang Province, Wenzhou, Zhejiang, China; Collaborative Innovation Center for Intelligence Medical Education, Wenzhou, Zhejiang, China; Zhejiang Engineering Research Center for Hospital Emergency and Process Digitization, Wenzhou, Zhejiang, China; Department of Intensive Care Unit, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
| | - Huiling Chen
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, Zhejiang, China; *Correspondence: Huiling Chen
| | - Xueding Cai
- Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China; *Correspondence: Xueding Cai
| | - Guoxi Liang
- Department of Information Technology, Wenzhou Polytechnic, Wenzhou, China
| |
|
19
|
Rahmani S, Hosseini S, Zall R, Kangavari MR, Kamran S, Hua W. Transfer-based adaptive tree for multimodal sentiment analysis based on user latent aspects. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
20
|
Li P, Liu Q, Liu Z. Outer-synchronization criterions for asymmetric recurrent time-varying neural networks described by differential-algebraic system via data-sampling principles. Front Comput Neurosci 2022; 16:1029235. [DOI: 10.3389/fncom.2022.1029235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Accepted: 10/24/2022] [Indexed: 11/18/2022] Open
Abstract
Asymmetric recurrent time-varying neural networks (ARTNNs) can enable realistic brain-like models that help scholars explore the mechanisms of the human brain and thus realize applications of artificial intelligence. Their dynamical behaviors, such as synchronization, have attracted extensive research interest due to their superior applicability and flexibility. In this paper, we examined the outer-synchronization of ARTNNs, which are described by the differential-algebraic system (DAS). By designing appropriate centralized and decentralized data-sampling approaches, which fully account for information gathering at the times tk and tki, and by using the characteristics of integral inequalities and the theory of differential equations, several novel suitable outer-synchronization conditions were established. Those conditions facilitate the analysis and applications of dynamical behaviors of ARTNNs. The superiority of the theoretical results was then demonstrated by using a numerical example.
|
21
|
Lee J, Jo J, Lee B, Lee JH, Yoon S. Brain-inspired Predictive Coding Improves the Performance of Machine Challenging Tasks. Front Comput Neurosci 2022; 16:1062678. [PMID: 36465966 PMCID: PMC9709416 DOI: 10.3389/fncom.2022.1062678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 10/28/2022] [Indexed: 09/19/2023] Open
Abstract
Backpropagation has been regarded as the most favorable algorithm for training artificial neural networks. However, it has been criticized for its biological implausibility because its learning mechanism contradicts the human brain. Although backpropagation has achieved super-human performance in various machine learning applications, it often shows limited performance in specific tasks. We collectively referred to such tasks as machine-challenging tasks (MCTs) and aimed to investigate methods to enhance machine learning for MCTs. Specifically, we start with a natural question: Can a learning mechanism that mimics the human brain lead to the improvement of MCT performances? We hypothesized that a learning mechanism replicating the human brain is effective for tasks where machine intelligence is difficult. Multiple experiments corresponding to specific types of MCTs where machine intelligence has room to improve performance were performed using predictive coding, a more biologically plausible learning algorithm than backpropagation. This study regarded incremental learning, long-tailed, and few-shot recognition as representative MCTs. With extensive experiments, we examined the effectiveness of predictive coding that robustly outperformed backpropagation-trained networks for the MCTs. We demonstrated that predictive coding-based incremental learning alleviates the effect of catastrophic forgetting. Next, predictive coding-based learning mitigates the classification bias in long-tailed recognition. Finally, we verified that the network trained with predictive coding could correctly predict corresponding targets with few samples. We analyzed the experimental result by drawing analogies between the properties of predictive coding networks and those of the human brain and discussing the potential of predictive coding networks in general machine learning.
Affiliation(s)
- Jangho Lee
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
| | - Jeonghee Jo
- Institute of New Media and Communications, Seoul National University, Seoul, South Korea
| | - Byounghwa Lee
- CybreBrain Research Section, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
| | - Jung-Hoon Lee
- CybreBrain Research Section, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea
| | - Sungroh Yoon
- Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea
| |
|
22
|
EIAASG: Emotional Intensive Adaptive Aspect-Specific GCN for sentiment classification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
23
|
Deckers L, Tsang IJ, Van Leekwijck W, Latré S. Extended liquid state machines for speech recognition. Front Neurosci 2022; 16:1023470. [PMID: 36389242 PMCID: PMC9651956 DOI: 10.3389/fnins.2022.1023470] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/19/2022] [Accepted: 10/03/2022] [Indexed: 04/19/2024] Open
Abstract
A liquid state machine (LSM) is a biologically plausible model of a cortical microcircuit. It consists of a random, sparse reservoir of recurrently connected spiking neurons with fixed synapses and a trainable readout layer. The LSM exhibits low training complexity and enables backpropagation-free learning in a powerful, yet simple computing paradigm. In this work, the liquid state machine is enhanced by a set of bio-inspired extensions to create the extended liquid state machine (ELSM), which is evaluated on a set of speech data sets. Firstly, we ensure excitatory/inhibitory (E/I) balance to enable the LSM to operate in the edge-of-chaos regime. Secondly, spike-frequency adaptation (SFA) is introduced in the LSM to improve the memory capabilities. Lastly, neuronal heterogeneity, by means of a differentiation in time constants, is introduced to extract a richer dynamical LSM response. By including E/I balance, SFA, and neuronal heterogeneity, we show that the ELSM consistently improves upon the LSM while retaining the benefits of the straightforward LSM structure and training procedure. The proposed extensions led to an increase in accuracy of up to 5.2% while decreasing the number of spikes in the ELSM by up to 20.2% on benchmark speech data sets. On some benchmarks, the ELSM can even attain performance similar to the current state of the art in spiking neural networks. Furthermore, we illustrate that the ELSM input-liquid and recurrent synaptic weights can be reduced to 4-bit resolution without any significant loss in classification performance. We thus show that the ELSM is a powerful, biologically plausible and hardware-friendly spiking neural network model that can attain near state-of-the-art accuracy on speech recognition benchmarks for spiking neural networks.
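To give a sense of what reducing synaptic weights to 4-bit resolution involves, the sketch below applies uniform quantization over the weight range. The abstract does not state which quantization scheme was used, so the uniform min-max scheme, the function name, and the example weights are all assumptions made for illustration.

```python
import numpy as np

def quantize(w, bits=4):
    """Uniformly quantize weights to 2**bits levels spanning [min(w), max(w)].

    Assumed scheme for illustration only; returns de-quantized float values.
    """
    w = np.asarray(w, dtype=float)
    lo, hi = w.min(), w.max()
    if hi == lo:
        return w.copy()
    step = (hi - lo) / (2**bits - 1)   # 15 intervals give 16 levels for 4 bits
    return lo + np.round((w - lo) / step) * step

# Hypothetical reservoir weights in [-1, 1].
w = np.linspace(-1.0, 1.0, 101)
q = quantize(w, bits=4)
max_err = float(np.max(np.abs(q - w)))
```

With this scheme the rounding error per weight is bounded by half a quantization step, which is one way to reason about why a coarse resolution can leave classification performance largely intact.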
Affiliation(s)
- Lucas Deckers
- imec IDLab, Department of Computer Science, University of Antwerp, Antwerp, Belgium
| | | | | | | |
|
24
|
Chen W, Zhang W, Wang W. A multi-view convolutional neural network based on cross-connection and residual-wider. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04248-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
25
|
Dai J, Liu S, Hao X, Ren Z, Yang X. UAV Localization Algorithm Based on Factor Graph Optimization in Complex Scenes. SENSORS (BASEL, SWITZERLAND) 2022; 22:5862. [PMID: 35957418 PMCID: PMC9370926 DOI: 10.3390/s22155862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 07/25/2022] [Accepted: 08/02/2022] [Indexed: 06/15/2023]
Abstract
With the increasingly widespread application of UAV intelligence, the need for autonomous navigation and positioning is becoming more and more important. To solve the problem that UAVs cannot perform localization in complex scenes, this paper uses a new multi-source fusion framework, a factor graph optimization algorithm based on IMU/GNSS/VO multi-source sensors, for UAV localization state estimation. Based on the factor graph model and the iSAM incremental inference algorithm, a multi-source fusion model of IMU/GNSS/VO is established, including the IMU pre-integration factor, IMU bias factor, GNSS factor, and VO factor. Mathematical simulations and validations on the EuRoC dataset show that, when the selected sliding window size is 30, the factor graph optimization (FGO) algorithm not only meets the requirements of real-time operation and accuracy at the same time, but also achieves a plug-and-play function in the event of local sensor failures. Finally, compared with the traditional federated Kalman algorithm and the adaptive federated Kalman algorithm, the positioning accuracy of the FGO algorithm in this paper is improved by 1.5-2-fold, and the algorithm can effectively improve the robustness and flexibility of the autonomous navigation system in complex scenarios. Moreover, the multi-source fusion framework in this paper is a general algorithm framework that can accommodate other scenarios and other types of sensor combinations.
Affiliation(s)
- Jun Dai
- Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
- School of Aerospace Engineering, Zhengzhou University of Aeronautics, Zhengzhou 450001, China
| | - Songlin Liu
- Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
| | - Xiangyang Hao
- Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
| | - Zongbin Ren
- Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
| | - Xiao Yang
- Dengzhou Water Conservancy Bureau, Dengzhou 474150, China
| |
|