1. Zafar MH, Moosavi SKR, Sanfilippo F. Enhancing unmanned ground vehicle performance in SAR operations: integrated gesture-control and deep learning framework for optimised victim detection. Front Robot AI 2024; 11:1356345. PMID: 38957217; PMCID: PMC11217714; DOI: 10.3389/frobt.2024.1356345. Open access.
Abstract
In this study, we address the critical need for enhanced situational awareness and victim detection in Search and Rescue (SAR) operations amidst disasters. Traditional unmanned ground vehicles (UGVs) often struggle in such chaotic environments due to their limited manoeuvrability and the difficulty of distinguishing victims from debris. Recognising these gaps, our research introduces a novel technological framework that integrates advanced gesture recognition with deep learning for camera-based victim identification, specifically designed to empower UGVs in disaster scenarios. At the core of our methodology is the development and implementation of the Meerkat Optimization Algorithm-Stacked Convolutional Neural Network-Bi-Long Short Term Memory-Gated Recurrent Unit (MOA-SConv-Bi-LSTM-GRU) model, which sets a new benchmark for hand gesture detection, with accuracy, precision, recall, and F1-score all approximately 0.9866. This model enables intuitive, real-time control of UGVs through hand gestures, allowing precise navigation in confined and obstacle-ridden spaces, which is vital for effective SAR operations. Furthermore, we leverage the latest YOLOv8 deep learning model, trained on specialised datasets, to accurately detect human victims under a wide range of challenging conditions, such as varying occlusions, lighting, and perspectives. Comprehensive testing in simulated emergency scenarios validates the effectiveness of our integrated approach. The system demonstrated exceptional proficiency in navigating through obstructions and rapidly locating victims, even in visually degraded environments with smoke, clutter, and poor lighting. Our study not only highlights critical gaps in current SAR response capabilities but also offers a pioneering solution through a synergistic blend of gesture-based control, deep learning, and purpose-built robotics.
The key findings underscore the potential of our integrated technological framework to significantly enhance UGV performance in disaster scenarios, thereby optimising life-saving outcomes when time is of the essence. This research paves the way for future advancements in SAR technology, with the promise of more efficient and reliable rescue operations in the face of disaster.
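As an illustration of how classified gestures can drive a UGV, the sketch below maps gesture labels to velocity commands behind a confidence gate. The labels, velocity values, and threshold are illustrative assumptions; the abstract does not publish the paper's actual command set.

```python
# Illustrative mapping from classified hand gestures to UGV motion commands.
# All gesture names and velocities here are assumptions, not the cited
# paper's command set.

GESTURE_COMMANDS = {
    "open_palm":   (0.0, 0.0),   # stop: (linear m/s, angular rad/s)
    "fist":        (0.5, 0.0),   # move forward
    "point_left":  (0.2, 0.8),   # arc left
    "point_right": (0.2, -0.8),  # arc right
}

def gesture_to_command(label, confidence, threshold=0.9):
    """Return a (linear, angular) velocity command; fall back to a full
    stop when the classifier is not confident enough or the label is
    unknown, a common safety behaviour for teleoperated robots."""
    if confidence < threshold or label not in GESTURE_COMMANDS:
        return (0.0, 0.0)
    return GESTURE_COMMANDS[label]
```

A confidence gate of this kind matters in SAR settings, where a misread gesture could send the vehicle into debris.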
Affiliation(s)
- Filippo Sanfilippo
- Department of Engineering Sciences, University of Agder, Grimstad, Norway
- Department of Software Engineering, Kaunas University of Technology, Kaunas, Lithuania
2. Qiu Y, Liu Y, Li S, Xu J. MiniSeg: An Extremely Minimum Network Based on Lightweight Multiscale Learning for Efficient COVID-19 Segmentation. IEEE Trans Neural Netw Learn Syst 2024; 35:8570-8584. PMID: 37015641; DOI: 10.1109/tnnls.2022.3230821.
Abstract
The rapid spread of the coronavirus disease 2019 (COVID-19) pandemic has severely threatened global health. Deep-learning-based computer-aided screening, e.g., segmentation of COVID-19-infected areas from computed tomography (CT) images, has attracted much attention as an adjunct that increases the accuracy of COVID-19 screening and clinical diagnosis. Although lesion segmentation is a hot topic, traditional deep learning methods are usually data-hungry, with millions of parameters, and prone to overfitting under the limited available COVID-19 training data. On the other hand, fast training/testing and low computational cost are also necessary for the quick deployment and development of COVID-19 screening systems, but traditional methods are usually computationally intensive. To address these two problems, we propose MiniSeg, a lightweight model for efficient COVID-19 segmentation from CT images. Our efforts start with the design of an attentive hierarchical spatial pyramid (AHSP) module for lightweight, efficient, and effective multiscale learning, which is essential for image segmentation. Then, we build a two-path (TP) encoder for deep feature extraction, where one path uses AHSP modules for learning multiscale contextual features and the other is a shallow convolutional path for capturing fine details. The two paths interact with each other to learn effective representations. Based on the extracted features, a simple decoder is added for COVID-19 segmentation. To compare MiniSeg with previous methods, we build a comprehensive COVID-19 segmentation benchmark. Extensive experiments demonstrate that MiniSeg achieves better accuracy because its mere 83k parameters make it less prone to overfitting. Its high efficiency also makes it easy to deploy and develop. The code has been released at https://github.com/yun-liu/MiniSeg.
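To see why lightweight designs can stay near 83k parameters, compare the weight counts of a standard convolution and a depthwise-separable one; the channel sizes below are illustrative, not MiniSeg's actual configuration.

```python
# Back-of-the-envelope parameter counts for one convolutional layer.
# Illustrative channel sizes only; not the actual MiniSeg architecture.

def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, followed by a 1 x 1
    pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out

standard = conv_params(3, 64, 64)                   # 36,864 weights
separable = depthwise_separable_params(3, 64, 64)   # 4,672 weights
print(standard, separable)
```

Roughly an eightfold saving per layer at these sizes, which compounds across a network and is one standard route to parameter budgets in the tens of thousands.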
3. Zhang J, Wang S, Jiang Z, Chen Z, Bai X. CD-Net: Cascaded 3D Dilated convolutional neural network for pneumonia lesion segmentation. Comput Biol Med 2024; 173:108311. PMID: 38513395; DOI: 10.1016/j.compbiomed.2024.108311.
Abstract
COVID-19 is a global pandemic that has caused significant social and economic disruption. To effectively assist in screening and monitoring diagnosed cases, it is crucial to accurately segment lesions from computed tomography (CT) scans. Due to the lack of labeled data and the presence of redundant parameters in 3D CT, significant challenges remain in diagnosing COVID-19 in related fields. To address this problem, we have developed a new model, the Cascaded 3D Dilated convolutional neural network (CD-Net), for directly processing CT volume data. To reduce the memory consumption of cutting volume data into small patches, we first design a cascade architecture in CD-Net to preserve global information. Then, we construct a Multi-scale Parallel Dilated Convolution (MPDC) block to aggregate features of different sizes while simultaneously reducing the number of parameters. Moreover, to alleviate the shortage of labeled data, we employ classical transfer learning, which requires only a small amount of data while achieving better performance. Experimental results on different publicly available datasets verify that the proposed CD-Net reduces the negative-positive ratio and outperforms existing segmentation methods while requiring less data.
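The MPDC block relies on dilated convolutions, which enlarge the receptive field without adding weights. A minimal 1D sketch of the idea (CD-Net itself uses 3D convolutions):

```python
def dilated_conv1d(x, w, dilation=1):
    """Valid-mode 1D convolution (cross-correlation) with a dilated kernel.
    The same len(w) weights cover an effective extent of
    (len(w) - 1) * dilation + 1 input samples, so the receptive field
    grows with the dilation rate at zero parameter cost."""
    span = (len(w) - 1) * dilation + 1
    return [
        sum(w[j] * x[i + j * dilation] for j in range(len(w)))
        for i in range(len(x) - span + 1)
    ]
```

With the same 3-tap kernel, dilation 1 spans 3 samples while dilation 2 spans 5; running several dilation rates in parallel and concatenating the outputs is the "multi-scale parallel" part of the block.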
Affiliation(s)
- Jinli Zhang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Shaomeng Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zongli Jiang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhijie Chen
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Xiaolu Bai
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
4. Alkhodari M, Hadjileontiadis LJ, Khandoker AH. Identification of Congenital Valvular Murmurs in Young Patients Using Deep Learning-Based Attention Transformers and Phonocardiograms. IEEE J Biomed Health Inform 2024; 28:1803-1814. PMID: 38261492; DOI: 10.1109/jbhi.2024.3357506.
Abstract
One in every four newborns suffers from congenital heart disease (CHD), which causes defects in the heart structure. The current gold-standard assessment technique, echocardiography, delays diagnosis owing to the need for experts who vary markedly in their ability to detect and interpret pathological patterns; it also remains costly for low- and middle-income countries. Here, we developed a deep learning-based attention transformer model to automate the detection of heart murmurs caused by CHD at an early stage of life using cost-effective and widely available phonocardiography (PCG). PCG recordings were obtained from 942 young patients at four major auscultation locations, including the aortic valve (AV), mitral valve (MV), pulmonary valve (PV), and tricuspid valve (TV), and were annotated by experts as having absent, present, or unknown murmurs. A transformation to wavelet features was performed to reduce the dimensionality before the deep learning stage for inferring the medical condition. Performance was validated through 10-fold cross-validation and yielded an average accuracy and sensitivity of 90.23% and 72.41%, respectively. The accuracy of discriminating between murmur absence and presence reached 76.10% on unseen data. The model had accuracies of 70%, 88%, and 86% in predicting murmur presence in infants, children, and adolescents, respectively. Interpretation of the model revealed proper discrimination between the learned attributes: the AV channel was found to be important (score 0.75) for murmur-absence predictions, while the MV and TV channels were more important for murmur-presence predictions. The findings highlight deep learning as a powerful front-line tool for inferring CHD status from PCG recordings, enabling early detection of heart anomalies in young people, and suggest it can be used independently of high-cost machinery or expert assessment.
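The wavelet dimensionality-reduction step can be illustrated with a single-level Haar decomposition, which halves the signal length while retaining pairwise averages and differences. The wavelet family and decomposition depth actually used for the PCG features are not stated in the abstract, so this is only a generic sketch.

```python
def haar_step(signal):
    """One level of the (unnormalised) Haar wavelet transform: pairwise
    averages form the approximation coefficients and pairwise differences
    the detail coefficients. Each level halves the length, which is the
    dimensionality reduction applied before the deep learning stage.
    Assumes an even-length signal."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail
```

Applying the step recursively to the approximation yields a multi-resolution summary of the heart-sound recording in far fewer coefficients than raw samples.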
5. Mansfield D, Montazeri A. A survey on autonomous environmental monitoring approaches: towards unifying active sensing and reinforcement learning. Front Robot AI 2024; 11:1336612. PMID: 38533524; PMCID: PMC10964253; DOI: 10.3389/frobt.2024.1336612. Open access.
Abstract
Environmental pollution from various sources has escalated the climate crisis, making the need to establish reliable, intelligent, and persistent environmental monitoring solutions more crucial than ever. Mobile sensing systems are a popular platform due to their cost-effectiveness and adaptability. In practice, however, operating environments demand highly intelligent and robust systems that can cope with changing dynamics. To achieve this, reinforcement learning has become a popular tool, as it facilitates the training of intelligent and robust sensing agents that can handle unknown and extreme conditions. In this paper, a framework that formulates active sensing as a reinforcement learning problem is proposed. This framework unifies multiple essential environmental monitoring tasks and algorithms, such as coverage, patrolling, source seeking, exploration, and search and rescue. The unified framework represents a step towards bridging the divide between theoretical advancements in reinforcement learning and real-world applications in environmental monitoring. A critical review of the literature in this field finds that, despite the potential of reinforcement learning for environmental active sensing applications, there is still a lack of practical implementation, and most work remains in the simulation phase. It is also noted that, despite the consensus that multi-agent systems are crucial to fully realising the potential of active sensing, there is a lack of research in this area.
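To make the active-sensing-as-RL formulation concrete, the textbook tabular Q-learning update is sketched below; in a coverage task the reward might be the number of newly observed cells. This is a generic update, not an algorithm from any particular surveyed paper.

```python
def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    Q is a dict of dicts mapping state -> action -> value; for an active
    sensing agent, the reward could be freshly covered area or an
    information-gain measure."""
    best_next = max(Q[next_state].values())
    td_target = reward + gamma * best_next
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q
```

The same update applies whatever the monitoring task, which is precisely what lets coverage, patrolling, and source seeking share one reinforcement-learning formulation: only the state, action, and reward definitions change.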
6. Ma Y, Pei Y, Li C. Predictive Recognition of DNA-binding Proteins Based on Pre-trained Language Model BERT. J Bioinform Comput Biol 2023; 21:2350028. PMID: 38248912; DOI: 10.1142/s0219720023500282.
Abstract
Identifying DNA-binding proteins is crucial for disease diagnosis and treatment. With the growing number of known proteins, large-scale batch prediction is essential, but traditional biological experiments are time-consuming and expensive, making it difficult to accomplish this task efficiently. Deep learning algorithms based on big-data analysis have shown potential in this respect. In recent years, language representation models, especially BERT, have made significant advances in natural language processing. In this paper, using three protein segmentation methods and three encoder counts, we construct nine BERT models of different sizes to predict whether known proteins are DNA-binding proteins. Furthermore, based on the concept of protein motifs, multi-scale convolutional networks are fused into the models to extract local features of DNA-binding proteins. We find that, when each amino acid in the protein is treated as a word, predictions improve with the number of encoders. Our proposed algorithm achieves 81.88% sensitivity and an MCC of 0.39 on the test set, and 62.41% accuracy on the independent test set PDB2272. These results indicate that our method can serve as a tool to assist in the identification of DNA-binding proteins.
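Two pieces of the pipeline above can be sketched generically: segmenting a protein into overlapping k-mer "words" for a BERT-style tokenizer (k = 1 corresponds to treating each amino acid as a word, the setting the authors found best with more encoders), and the Matthews correlation coefficient (MCC) used as a metric. These are standard formulations, not the authors' exact code.

```python
import math

def kmer_words(sequence, k):
    """Split a protein sequence into overlapping k-mer 'words', one common
    way to segment a sequence before feeding it to a language model."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts;
    balanced even when the positive and negative classes are skewed."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC ranges from -1 to 1, with 0 meaning no better than chance, which is why it complements sensitivity on imbalanced protein datasets.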
Affiliation(s)
- Yue Ma
- School of Computer Science and Technology, Tiangong University, Tianjin, P. R. China
- Yongzhen Pei
- School of Mathematical Sciences, Tiangong University, Tianjin, P. R. China
- Changguo Li
- Department of Basic Science, Army Military Transportation University, Tianjin, P. R. China
7. Su HH, Lu CP. Development of a Deep Learning-Based Epiglottis Obstruction Ratio Calculation System. Sensors (Basel) 2023; 23:7669. PMID: 37765726; PMCID: PMC10535372; DOI: 10.3390/s23187669.
Abstract
Surgeons determine the treatment for patients with epiglottis obstruction based on its severity, often by estimating the severity (using three obstruction degrees) from drug-induced sleep endoscopy images. However, these coarse obstruction degrees are inadequate and fail to correspond to changes in respiratory airflow. Current artificial intelligence image technologies can effectively address this issue. To enhance the accuracy of epiglottis obstruction assessment and replace obstruction degrees with obstruction ratios, this study developed a computer vision system with a deep learning-based method for calculating epiglottis obstruction ratios. The system employs a convolutional neural network, the YOLOv4 model, for epiglottis cartilage localization, a color quantization method to transform pixels into regions, and a region puzzle algorithm to calculate the extent of a patient's epiglottis airway. This information is then used to compute the obstruction ratio of the patient's epiglottis site. The system integrates web-based and PC-based programming technologies to realize its functionality. Experimental validation showed that the system autonomously calculates obstruction ratios with a resolution of 0.1% (over a range of 0% to 100%). It presents epiglottis obstruction levels as continuous data, providing crucial diagnostic insight for surgeons assessing the severity of epiglottis obstruction in patients.
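Once the airway region has been segmented, an obstruction ratio reduces to a pixel-count comparison. The definition below (the blocked fraction of a reference airway area, reported in 0.1% steps) is an assumption for illustration; the abstract does not give the paper's exact formula.

```python
def obstruction_ratio(baseline_airway_px, obstructed_airway_px):
    """Hypothetical obstruction ratio: the fraction of a reference (open)
    airway area that is blocked in the current frame, as a percentage
    rounded to 0.1%. Both inputs are pixel counts assumed to come from
    upstream segmentation/region steps; the cited system's actual
    definition may differ."""
    if baseline_airway_px <= 0:
        raise ValueError("baseline airway area must be positive")
    ratio = 1.0 - obstructed_airway_px / baseline_airway_px
    return round(max(0.0, min(1.0, ratio)) * 100, 1)  # percent, 0.1% steps
```

Reporting a continuous percentage rather than one of three degrees is what lets the output track gradual changes in respiratory airflow.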
Affiliation(s)
- Hsing-Hao Su
- Department of Otorhinolaryngology-Head and Neck Surgery, Kaohsiung Veterans General Hospital, Kaohsiung 81362, Taiwan
- Department of Physical Therapy, Shu-Zen Junior College of Medicine and Management, Kaohsiung 82144, Taiwan
- Department of Pharmacy and Master Program, College of Pharmacy & Health Care, Tajen University, Pingtung 90741, Taiwan
- Chuan-Pin Lu
- Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 41349, Taiwan
8. Li S, Yang X, Lin X, Zhang Y, Wu J. Real-Time Vehicle Detection from UAV Aerial Images Based on Improved YOLOv5. Sensors (Basel) 2023; 23:5634. PMID: 37420800; DOI: 10.3390/s23125634.
Abstract
Aerial vehicle detection has significant applications in aerial surveillance and traffic control. Images captured by UAVs contain many tiny objects and vehicles that occlude each other, significantly increasing the detection challenge, and missed and false detections are widespread when detecting vehicles in aerial images. We therefore customize a YOLOv5-based model to be better suited to detecting vehicles in aerial images. First, we add an additional prediction head to detect smaller-scale objects. Furthermore, to keep the original features involved in the training process, we introduce a Bidirectional Feature Pyramid Network (BiFPN) to fuse feature information from various scales. Lastly, Soft-NMS (soft non-maximum suppression) is employed to filter prediction boxes, alleviating missed detections caused by closely packed vehicles. Experimental findings on our self-made dataset indicate that, compared with YOLOv5s, the mAP@0.5 and mAP@0.5:0.95 of YOLOv5-VTO increase by 3.7% and 4.7%, respectively, and accuracy and recall are also improved.
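Soft-NMS, the prediction-box filter mentioned above, decays the scores of boxes that overlap the current top detection instead of deleting them, so closely packed vehicles are not suppressed outright. A minimal Gaussian-penalty sketch (the sigma and threshold are typical defaults, not necessarily the paper's settings):

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: repeatedly select the highest-scoring box, then
    multiply each remaining box's score by exp(-iou^2 / sigma) instead of
    discarding overlaps. Returns indices in final ranking order."""
    idxs = list(range(len(boxes)))
    scores = list(scores)
    keep = []
    while idxs:
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        for i in idxs:
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
        idxs = [i for i in idxs if scores[i] > score_thresh]
    return keep
```

A heavily overlapped box is demoted in the ranking rather than eliminated, which is exactly the behaviour that recovers detections of vehicles parked bumper to bumper.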
Affiliation(s)
- Shuaicai Li
- College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
- Xiaodong Yang
- College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
- Xiaoxia Lin
- College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
- Yanyi Zhang
- College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
- Jiahui Wu
- College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271019, China
9. Wu S, Yan Y, Wang W. CF-YOLOX: An Autonomous Driving Detection Model for Multi-Scale Object Detection. Sensors (Basel) 2023; 23:3794. PMID: 37112134; PMCID: PMC10144478; DOI: 10.3390/s23083794.
Abstract
In self-driving cars, object detection algorithms are becoming increasingly important, and accurate, fast recognition of objects is critical to realizing autonomous driving. Existing detection algorithms are not ideal for detecting small objects. This paper proposes a YOLOX-based network model for multi-scale object detection in complex scenes. The method adds a CBAM-G module, which performs grouping operations on CBAM, to the backbone of the original network, and changes the height and width of the spatial attention module's convolution kernel to 7 × 1 to improve the model's ability to extract prominent features. We also propose an object-contextual feature fusion module that provides more semantic information and improves the perception of multi-scale objects. Finally, considering that small objects have fewer samples and contribute less to the loss, we introduce a scaling factor that increases the loss of small objects to improve their detection. We validated the effectiveness of the proposed method on the KITTI dataset, where the mAP value was 2.46% higher than that of the original model, and experimental comparisons showed that our model achieves superior detection performance compared with other models.
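The abstract states only that a scaling factor increases the loss contribution of small objects. One plausible form is a factor that grows as the box's area fraction shrinks, as in this illustrative sketch; the power law, divisor, and cap below are assumptions, not the paper's values.

```python
def small_object_scale(box_area, image_area, gamma=0.5, max_scale=3.0):
    """Hypothetical loss weight that grows as a box occupies a smaller
    fraction of the image, capped at max_scale so tiny boxes do not
    dominate training. With these constants, a box covering 1% of the
    image gets weight ~1.0 and smaller boxes get progressively more."""
    frac = max(box_area / image_area, 1e-8)  # guard against zero area
    return min(max_scale, frac ** -gamma / 10.0)
```

Multiplying each box's loss term by such a factor counteracts both the scarcity of small-object samples and their small per-box loss, which is the imbalance the authors describe.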
10. Liu L, Duffy VG. Exploring the Future Development of Artificial Intelligence (AI) Applications in Chatbots: A Bibliometric Analysis. Int J Soc Robot 2023. DOI: 10.1007/s12369-022-00956-0.
11. Tutsoy O, Tanrikulu MY. Priority and age specific vaccination algorithm for the pandemic diseases: a comprehensive parametric prediction model. BMC Med Inform Decis Mak 2022; 22:4. PMID: 34991566; PMCID: PMC8733450; DOI: 10.1186/s12911-021-01720-6. Open access.
Abstract
BACKGROUND: There have been several destructive pandemic diseases in human history. Because these diseases spread through human-to-human infection, a number of non-pharmacological policies have been enforced until an effective vaccine is developed. Moreover, even after a vaccine is developed, challenges in its production and distribution force the authorities to optimize vaccination policies based on priorities. Considering these facts, this study proposes a comprehensive yet simple parametric model, enriched with pharmacological and non-pharmacological policies, to analyse and predict future pandemic casualties.
METHOD: The paper develops a priority and age-specific vaccination policy and modifies non-pharmacological policies including curfews, lockdowns, and restrictions. These policies are incorporated into susceptible, suspicious, infected, hospitalized, intensive care, intubated, recovered, and death sub-models. The resulting model is parameterizable from the available data, with a recursive least squares algorithm under inequality constraints optimizing the unknown parameters. The inequality constraints ensure that the structural requirements are satisfied and that the parameter weights are distributed proportionally.
RESULTS: The results exhibit a distinctive third peak in casualties occurring within 40 days and confirm that, under the priority and age-specific vaccination policy, the intensive care, intubated, and death casualties converge to zero faster than the susceptible, suspicious, and infected casualties. The model also estimates that removing curfews on weekends and holidays causes more casualties than lifting restrictions on people with chronic diseases and those over 65.
CONCLUSION: Sophisticated parametric models equipped with pharmacological and non-pharmacological policies can predict future pandemic casualties in various scenarios.
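The parameter-fitting step can be sketched with the scalar recursive least squares (RLS) recursion; the paper's version adds inequality constraints that project the parameters back into their feasible set, which are omitted in this minimal illustration.

```python
def rls_scalar(xs, ys, lam=1.0, p0=1000.0):
    """Recursive least squares for a single unknown parameter theta in
    y = theta * x, updated one sample at a time as new data arrive,
    as when fitting a compartmental model's coefficients from incoming
    casualty counts. lam is the forgetting factor (1.0 = no forgetting)
    and p0 the initial covariance (large = uninformative prior)."""
    theta, p = 0.0, p0
    for x, y in zip(xs, ys):
        k = p * x / (lam + x * p * x)   # gain
        theta += k * (y - theta * x)    # correct using prediction error
        p = (p - k * x * p) / lam       # covariance update
    return theta
```

Because each update is O(1), the same recursion can re-estimate the model every day as new case data arrive; the paper's inequality constraints would additionally clip each updated parameter to its structurally admissible range.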
Affiliation(s)
- Onder Tutsoy
- Department of Electrical-Electronics Engineering, Adana Alparslan Turkes Science and Technology University, Adana 01250, Turkey
- Mahmud Yusuf Tanrikulu
- Department of Electrical-Electronics Engineering, Adana Alparslan Turkes Science and Technology University, Adana 01250, Turkey
- METU MEMS Center, Middle East Technical University, Ankara 06800, Turkey