101
D’Souza G, Siddalingaswamy PC, Pandya MA. AlterNet-K: a small and compact model for the detection of glaucoma. Biomed Eng Lett 2024;14:23-33. PMID: 38186944; PMCID: PMC10770015; DOI: 10.1007/s13534-023-00307-6.
Abstract
Glaucoma is one of the leading causes of permanent blindness in the world. It is caused by an increase in intraocular pressure that damages the optic nerve. People suffering from glaucoma often do not notice any changes in their vision in the early stages; however, as the disease progresses, it usually leads to vision loss that is irreversible in many cases. Early diagnosis of this eye disease is therefore of critical importance. The fundus image is one of the most widely used diagnostic tools for glaucoma detection, but drawing accurate insights from these images requires manual analysis by medical experts, which is a time-consuming process. In this work, we propose a parameter-efficient AlterNet-K model based on an alternating design pattern, which combines ResNets and multi-head self-attention (MSA) to leverage their complementary properties and improve the generalizability of the overall model. The model was trained on the Rotterdam EyePACS AIROGS dataset, comprising 113,893 colour fundus images from 60,357 subjects. AlterNet-K outperformed transformer models such as ViT, DeiT-S, and the Swin transformer, as well as standard DCNN models including ResNet, EfficientNet, MobileNet, and VGG, with an accuracy of 0.916, an AUROC of 0.968, and an F1 score of 0.915. The results indicate that smaller CNN models combined with self-attention mechanisms can achieve high classification accuracy, and that small, compact ResNet models combined with MSA outperform their larger counterparts. The models in this work can be extended to classification tasks in other medical imaging domains.
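The attention half of the alternating ResNet-plus-MSA design is ordinary multi-head self-attention. As a minimal NumPy sketch of MSA alone (illustrative shapes and randomly initialized weights; this is not the authors' AlterNet-K code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """x: (seq_len, d_model); all weight matrices: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    # project, then split the model dimension into heads
    q = (x @ w_q).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    k = (x @ w_k).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    v = (x @ w_v).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    # scaled dot-product attention per head
    scores = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    out = (scores @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((4, d))            # 4 tokens, width 8
w = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
y = multi_head_self_attention(x, *w, n_heads=2)
print(y.shape)  # (4, 8)
```

In an alternating design, blocks like this would be interleaved with ResNet stages; the sketch only shows the head-splitting and scaled dot-product steps.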
Affiliation(s)
- Gavin D’Souza
- Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104 India
- P. C. Siddalingaswamy
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104 India
- Mayur Anand Pandya
- Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104 India
102
Zhang W, Lu F, Su H, Hu Y. Dual-branch multi-information aggregation network with transformer and convolution for polyp segmentation. Comput Biol Med 2024;168:107760. PMID: 38064849; DOI: 10.1016/j.compbiomed.2023.107760.
Abstract
Computer-aided diagnosis (CAD) for polyp detection is one of the most notable showcases of deep learning in medicine: with deep learning technologies, the accuracy of polyp segmentation is surpassing that of human experts. A critical step in such a CAD process is segmenting colorectal polyps from colonoscopy images. Despite the remarkable successes of recent deep learning work, much improvement is still needed to tackle challenging cases. For instance, motion blur and light reflection can introduce significant noise into the image, and polyps of the same type vary widely in size, color, and texture. To address these challenges, this paper proposes a novel dual-branch multi-information aggregation network (DBMIA-Net) for polyp segmentation, which accurately, reliably, and efficiently segments a variety of colorectal polyps. Specifically, a dual-branch encoder with transformer and convolutional neural network (CNN) branches is employed to extract polyp features, and two multi-information aggregation modules, a global information aggregation (GIA) module and an edge information aggregation (EIA) module, are applied in the decoder to fuse multi-scale features adaptively. In addition, to enhance representation learning of potential channel feature associations, this paper proposes a novel adaptive channel graph convolution (ACGC). To validate the effectiveness and advantages of the proposed network, we compare it with several state-of-the-art (SOTA) methods on five public datasets. Experimental results consistently demonstrate that DBMIA-Net obtains significantly superior segmentation performance across six widely used evaluation metrics. In particular, we achieve 94.12% mean Dice on the CVC-ClinicDB dataset, a 4.22% improvement over the previous state-of-the-art method PraNet. Compared with SOTA algorithms, DBMIA-Net has better fitting ability and stronger generalization ability.
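The mean Dice score reported above is the standard overlap metric for segmentation masks. A minimal sketch of the per-image computation on flattened binary masks (illustrative only; the paper's evaluation protocol may differ in smoothing and averaging details):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A|+|B|) for binary masks given as flat 0/1 lists."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * inter + eps) / (total + eps)  # eps guards empty masks

pred   = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice_coefficient(pred, target), 3))  # 0.667
```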
Affiliation(s)
- Wenyu Zhang
- School of Information Science and Engineering, Lanzhou University, China
- Fuxiang Lu
- School of Information Science and Engineering, Lanzhou University, China
- Hongjing Su
- School of Information Science and Engineering, Lanzhou University, China
- Yawen Hu
- School of Information Science and Engineering, Lanzhou University, China
103
Chopannejad S, Roshanpoor A, Sadoughi F. Attention-assisted hybrid CNN-BiLSTM-BiGRU model with SMOTE-Tomek method to detect cardiac arrhythmia based on 12-lead electrocardiogram signals. Digit Health 2024;10:20552076241234624. PMID: 38449680; PMCID: PMC10916475; DOI: 10.1177/20552076241234624.
Abstract
Objectives: Cardiac arrhythmia is one of the most severe cardiovascular diseases and can be fatal, so its early detection is critical. However, detection of arrhythmia types by physicians based on visual identification is time-consuming and subjective. Deep learning can provide effective approaches to classify arrhythmias accurately and quickly. This study proposed a deep learning approach, developed on the Chapman-Shaoxing electrocardiogram (ECG) dataset, to detect seven types of arrhythmia. Method: Our model is a hybrid CNN-BiLSTM-BiGRU algorithm assisted by a multi-head self-attention mechanism, addressing the challenging problem of classifying various arrhythmias from ECG signals. Additionally, the synthetic minority oversampling technique with Tomek links (SMOTE-Tomek) was utilized to address the data imbalance problem. Result: The proposed model, trained with a single lead, was tested using a dataset containing 10,466 participants, and its performance was evaluated using a random split validation approach. The algorithm achieved an accuracy of 98.57% with lead II and 98.34% with lead aVF for the classification of arrhythmias. Conclusion: We analyzed single-lead ECG signals to evaluate the effectiveness of our proposed hybrid model in diagnosing and classifying different types of arrhythmia, training a separate classification model for each signal lead. We implemented the SMOTE-Tomek technique, together with cross-entropy loss as the cost function, to address the class imbalance problem, and utilized a multi-head self-attention mechanism to adjust the network structure and classify the seven arrhythmia classes. Our model achieved high accuracy and demonstrated good generalization in detecting ECG arrhythmias; however, further testing with diverse datasets is needed to validate its performance.
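SMOTE-Tomek combines two ideas: SMOTE synthesizes new minority samples by interpolating between minority-class neighbours, and Tomek-link removal deletes majority samples that form borderline opposite-class nearest-neighbour pairs. A small pure-Python sketch of both steps under simplified assumptions (2-D points, label 0 as the majority class; not the authors' pipeline, which operates on ECG data, typically via a library such as imbalanced-learn):

```python
import math
import random

def euclidean(a, b):
    return math.dist(a, b)

def smote(minority, k=2, n_new=4, rng=random.Random(0)):
    """Create synthetic minority samples by interpolating toward a random
    k-nearest minority neighbour (the core SMOTE step)."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not x),
                            key=lambda m: euclidean(x, m))[:k]
        nb = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + lam * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic

def tomek_links(samples, labels):
    """Indices of majority points in Tomek links (mutual nearest neighbours
    with opposite labels); removing them cleans the class boundary."""
    def nearest(i):
        return min((j for j in range(len(samples)) if j != i),
                   key=lambda j: euclidean(samples[i], samples[j]))
    drop = set()
    for i in range(len(samples)):
        j = nearest(i)
        if labels[i] != labels[j] and nearest(j) == i:
            drop.add(i if labels[i] == 0 else j)  # label 0 = majority
    return drop

minority = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
synthetic = smote(minority)
samples = [(0.0, 0.0), (1.0, 0.0), (0.9, 0.9), (1.0, 1.0), (2.0, 2.0)]
labels  = [0, 0, 0, 1, 1]
print(len(synthetic), sorted(tomek_links(samples, labels)))  # 4 [2]
```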
Affiliation(s)
- Sara Chopannejad
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
- Arash Roshanpoor
- Department of Computer, Yadegar-e-Imam Khomeini (RAH), Janat-abad Branch, Islamic Azad University, Tehran, Iran
- Farahnaz Sadoughi
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
104
Maedera S, Mizuno T, Kusuhara H. Investigation of latent representation of toxicopathological images extracted by CNN model for understanding compound properties in vivo. Comput Biol Med 2024;168:107748. PMID: 38016375; DOI: 10.1016/j.compbiomed.2023.107748.
Abstract
Toxicopathological images acquired during safety assessment capture an individual's biological responses to a given compound, and converting them into numerical representations can yield valuable insights for assessing compound properties. Currently, toxicopathological images are mainly encoded as pathological findings evaluated by pathologists, which introduces challenges when they are used as model input, specifically in terms of representation capability and comparability. In this study, we assessed the usefulness of latent representations extracted from toxicopathological images with a convolutional neural network (CNN) for estimating compound properties in vivo. Special emphasis was placed on the impact of learning pathological findings, the depth of frozen layers during learning, and the selection of the layer for the latent representation. Our findings demonstrate that a machine learning model fed with the latent representation surpassed the performance of a model directly using pathological findings as input, particularly in classifying a compound's mechanism of action and in predicting late-phase findings from early-phase images in repeated-dose tests. While learning pathological findings did improve accuracy, the improvement was relatively modest, and the effect of freezing layers during learning was similarly limited. Notably, the selection of the layer for the latent representation had a substantial impact on how accurately compound properties in vivo were estimated.
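"Selecting the layer for the latent representation" amounts to truncating the forward pass at an intermediate layer and using that activation as the feature vector. A toy sketch of the idea (plain Python functions standing in for CNN layers; the layers and values are purely illustrative):

```python
def forward_to_layer(layers, x, k):
    """Run the input through the first k layers and return the intermediate
    activation as the latent representation."""
    for layer in layers[:k]:
        x = layer(x)
    return x

# toy "network": each layer is just a function on a list of floats
layers = [
    lambda v: [2 * a for a in v],        # stand-in for a conv layer
    lambda v: [max(a, 0.0) for a in v],  # ReLU
    lambda v: [sum(v)],                  # stand-in for pooling / head
]
x = [1.0, -2.0, 3.0]
print(forward_to_layer(layers, x, 2))  # [2.0, 0.0, 6.0]  <- latent features
print(forward_to_layer(layers, x, 3))  # [8.0]            <- final output
```

In a real framework the same effect is obtained by registering a hook on, or slicing the model at, the chosen layer; the study's point is that which k you pick materially changes downstream performance.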
Affiliation(s)
- Shotaro Maedera
- Laboratory of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo, Japan
- Tadahaya Mizuno
- Laboratory of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo, Japan
- Hiroyuki Kusuhara
- Laboratory of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo, Japan
105
Chugh N, Aggarwal S, Balyan A. The Hybrid Deep Learning Model for Identification of Attention-Deficit/Hyperactivity Disorder Using EEG. Clin EEG Neurosci 2024;55:22-33. PMID: 37682533; DOI: 10.1177/15500594231193511.
Abstract
Attention-deficit/hyperactivity disorder (ADHD) is a common behavioral disorder that prevents children from paying attention to tasks and interacting with their surroundings appropriately. Early and timely diagnosis of this disorder remains a significant problem in studies of children's behavior. To diagnose it, doctors often rely on patient descriptions, questionnaires, psychological tests, and observed behavior, whose reliability is questionable. The convolutional neural network (CNN) is one deep learning technique that has been used for the diagnosis of ADHD; CNNs, however, do not account for how signals change over time, which leads to low classification performance and ambiguous findings. In this study, the authors designed a hybrid deep learning model that combines long short-term memory (LSTM) and CNN to simultaneously extract and learn the spatial features and long-term dependencies of electroencephalography (EEG) data. The effectiveness of the proposed hybrid model was assessed using two publicly available EEG datasets. The model achieves a classification accuracy of 98.86% on the ADHD dataset and 98.28% on the FOCUS dataset. The experimental findings show that the proposed hybrid CNN-LSTM model outperforms state-of-the-art methods for diagnosing ADHD using EEG and could therefore help with the clinical diagnosis of ADHD patients.
Affiliation(s)
- Nupur Chugh
- Netaji Subhas Institute of Technology, New Delhi, India
- Swati Aggarwal
- Netaji Subhas University of Technology, New Delhi, India
- Arnav Balyan
- Netaji Subhas Institute of Technology, New Delhi, India
106
Peluso A, Danciu I, Yoon HJ, Yusof JM, Bhattacharya T, Spannaus A, Schaefferkoetter N, Durbin EB, Wu XC, Stroup A, Doherty J, Schwartz S, Wiggins C, Coyle L, Penberthy L, Tourassi GD, Gao S. Deep learning uncertainty quantification for clinical text classification. J Biomed Inform 2024;149:104576. PMID: 38101690; DOI: 10.1016/j.jbi.2023.104576.
Abstract
INTRODUCTION Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines, so the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) represent the state-of-the-art models for real-world classification. Although the strength of activation in DNNs is often correlated with the network's confidence, in-depth analyses are needed to establish whether they are well calibrated. METHOD In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating information extraction of disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification that achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount, that is, the number of electronic pathology reports for which the model's predictions are unreliable. We evaluate the proposed methods against the current in-house deep learning-based abstaining classifier. RESULTS Overall, all the proposed selective classification methods achieve the targeted level of accuracy or higher in a trade-off analysis aimed at minimizing the rejection rate. On in-distribution validation and holdout test data, all the proposed methods reach the required target accuracy on all tasks with a lower rejection rate than the deep abstaining classifier (DAC). The results for the out-of-distribution test data are harder to interpret; nevertheless, in this case as well, the rejection rate of the best proposed method achieving 97% accuracy or higher is lower than that of the DAC. CONCLUSIONS We show that although both approaches can flag samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction of samples and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier.
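A common form of selective classification, shown here only as a generic illustration (the paper introduces several methods, and this threshold-on-confidence sketch is not necessarily any of them), accepts a prediction only when the model's top probability clears a threshold and rejects the rest for manual review:

```python
def selective_classify(probs, labels, threshold):
    """Accept a prediction only when the top probability clears the
    threshold; report accuracy on accepted samples and the rejection rate."""
    accepted = correct = 0
    for p, y in zip(probs, labels):
        conf = max(p)
        if conf < threshold:
            continue  # rejected: defer to a human annotator
        accepted += 1
        correct += (p.index(conf) == y)
    n = len(labels)
    acc = correct / accepted if accepted else float("nan")
    return acc, (n - accepted) / n

# toy two-class softmax outputs and gold labels
probs  = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 1, 0]
acc, rej = selective_classify(probs, labels, threshold=0.6)
print(acc, rej)  # 1.0 0.5
```

Sweeping the threshold traces the accuracy-versus-rejection trade-off the abstract describes: a higher threshold raises accuracy on accepted reports at the cost of rejecting more of them.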
Affiliation(s)
- Alina Peluso
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Ioana Danciu
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Hong-Jun Yoon
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Adam Spannaus
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Eric B Durbin
- University of Kentucky, Lexington, KY 40536, United States
- Xiao-Cheng Wu
- Louisiana State University, New Orleans, LA 70112, United States
- Antoinette Stroup
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, United States
- Stephen Schwartz
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States
- Charles Wiggins
- University of New Mexico, Albuquerque, NM 87131, United States
- Linda Coyle
- Information Management Services Inc., Calverton, MD 20705, United States
- Lynne Penberthy
- National Cancer Institute, Bethesda, MD 20814, United States
- Shang Gao
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
107
Dokania H, Chattaraj N. An assistive interface protocol for communication between visually and hearing-speech impaired persons in internet platform. Disabil Rehabil Assist Technol 2024;19:233-246. PMID: 35618260; DOI: 10.1080/17483107.2022.2078898.
Abstract
PURPOSE The article presents the design and development of a generic assistive system that establishes an independent conversation platform for hearing-speech impaired and visually impaired persons. MATERIALS The software system was developed in Python and HTML. METHODS Considering the constraints associated with these impairments, the system implements both speech-to-text/gesture and text/gesture-to-speech conversion. Real-time hand-gesture-to-speech generation is implemented using static image tracking, a CNN-based deep learning technique, and the MediaPipe hand-tracking solution. The software prototype terminals can be accessed over the internet using the MQTT protocol to support communicative conversation between visually impaired and hearing-speech impaired persons. RESULTS The software system exhibits average prediction times of roughly 1 s for a four-letter audio word and 2 s for a single hand gesture, which are commensurate with the average time complexity of human-to-human conversation. The average accuracy and loss for hand-gesture recognition with the CNN-based deep learning model are 0.9996 and 0.0008, respectively, and the confusion matrix for alphabet-specific hand gestures shows satisfactory recognition performance. CONCLUSIONS The software prototype of the generic assistive device shows its potential to establish exclusive communication between a visually impaired and a hearing-speech impaired person through the internet. The same software interface can also support conversations between only visually impaired persons or only hearing-speech impaired persons. IMPLICATIONS FOR REHABILITATION The design can be further extended by incorporating multi-modal impairments to make a universal assistive device for all-in-one communication.
Affiliation(s)
- Harsh Dokania
- Department of Electronics and Communication Engineering, National Institute of Technology, Durgapur, India
- Nilanjan Chattaraj
- Department of Electronics and Communication Engineering, National Institute of Technology, Durgapur, India
108
Cumbajin E, Rodrigues N, Costa P, Miragaia R, Frazão L, Costa N, Fernández-Caballero A, Carneiro J, Buruberri LH, Pereira A. A Real-Time Automated Defect Detection System for Ceramic Pieces Manufacturing Process Based on Computer Vision with Deep Learning. Sensors (Basel) 2023;24:232. PMID: 38203095; PMCID: PMC10781230; DOI: 10.3390/s24010232.
Abstract
Defect detection is a key element of quality control in today's industries, and the process requires automated methods, including image sensors, to detect any potential defects that may occur during manufacturing. While various methods exist for inspecting surfaces such as metal and building materials, only a limited number of techniques are specifically designed for specialized surfaces such as ceramics, which can reveal distinctive anomalies or characteristics that require a more precise and focused approach. This article describes a study and proposes an extended solution for defect detection on ceramic pieces in an industrial environment, utilizing a computer vision system with deep learning models. The solution includes an image acquisition process and a labeling platform for creating training datasets, as well as an image preprocessing technique, feeding a machine learning algorithm based on convolutional neural networks (CNNs) capable of running in real time within a manufacturing environment. The developed solution was implemented and evaluated at a leading Portuguese company that specializes in the manufacture of tableware and fine stoneware. The collaboration between the research team and the company resulted in an automated and effective system for detecting defects in ceramic pieces, achieving an accuracy of 98.00% and an F1-score of 97.29%.
Affiliation(s)
- Esteban Cumbajin
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Rodrigues
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Paulo Costa
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Rolando Miragaia
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Luís Frazão
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Nuno Costa
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- Antonio Fernández-Caballero
- Instituto de Investigación en Informática de Albacete, 02071 Albacete, Spain
- Departamento de Sistemas Informáticos, Universidad de Castilla-La Mancha, 02071 Albacete, Spain
- Jorge Carneiro
- Grestel-Produtos Cerâmicos S.A, Zona Industrial de Vagos-Lote 78, 3840-385 Vagos, Portugal
- Leire H. Buruberri
- Grestel-Produtos Cerâmicos S.A, Zona Industrial de Vagos-Lote 78, 3840-385 Vagos, Portugal
- António Pereira
- Computer Science and Communications Research Centre, School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
- INOV INESC Inovação, Institute of New Technologies, Leiria Office, 2411-901 Leiria, Portugal
109
Itu R, Danescu R. Fully Convolutional Neural Network for Vehicle Speed and Emergency-Brake Prediction. Sensors (Basel) 2023;24:212. PMID: 38203074; PMCID: PMC10781285; DOI: 10.3390/s24010212.
Abstract
Ego-vehicle state prediction is a complex and challenging problem for self-driving and autonomous vehicles. Perception-based solutions use sensor information and on-board cameras to understand the state of the vehicle and the surrounding traffic conditions. Monocular camera-based methods are becoming increasingly popular for driver assistance, and precise prediction of vehicle speed and emergency braking is important for enhancing road safety, especially in preventing speed-related accidents. In this paper, we introduce a convolutional neural network (CNN) model tailored to predicting vehicle velocity, braking events, and emergency braking, employing sequences of images and velocity data as inputs. The CNN model is trained on a dataset of sequences of 20 consecutive images with corresponding velocity values, all obtained from a moving vehicle navigating road-traffic scenarios. The model's primary objective is to predict the current vehicle speed, braking actions, and the occurrence of an emergency-brake situation using the information encoded in the preceding 20 frames. We evaluate the proposed model using regression and classification metrics and compare it with existing published work based on recurrent neural networks (RNNs). By improving the prediction accuracy for velocity, braking behavior, and emergency-brake events, we contribute to road safety and offer insights for the development of perception-based techniques in the field of autonomous vehicles.
Affiliation(s)
- Razvan Itu
- Computer Science Department, Technical University of Cluj-Napoca, St. Memorandumului 28, 400114 Cluj-Napoca, Romania
110
Baban A Erep TR, Chaari L. mid-DeepLabv3+: A Novel Approach for Image Semantic Segmentation Applied to African Food Dietary Assessments. Sensors (Basel) 2023;24:209. PMID: 38203070; PMCID: PMC10781344; DOI: 10.3390/s24010209.
Abstract
Recent decades have witnessed the development of vision-based dietary assessment (VBDA) systems. These systems generally consist of three main stages: food image analysis, portion estimation, and nutrient derivation. The effectiveness of the first stage depends heavily on accurate segmentation and image recognition models and on the availability of high-quality training datasets. Food image segmentation still faces various challenges, and most existing research focuses mainly on Asian and Western food images. For this reason, this study is based on food images from sub-Saharan Africa, which pose their own problems, such as inter-class similarity and dishes with mixed-class food. This work focuses on the first stage of VBDA, where we introduce two notable contributions. First, we propose mid-DeepLabv3+, an enhanced food image segmentation model based on DeepLabv3+ with a ResNet50 backbone; our approach adds a middle layer in the decoder path and a SimAM attention module after each extracted backbone feature layer. Second, we present CamerFood10, the first food image dataset specifically designed for sub-Saharan African food segmentation, covering 10 classes of the food items most consumed in Cameroon. On our dataset, mid-DeepLabv3+ outperforms benchmark convolutional neural network models for semantic image segmentation, with a mean Intersection over Union (mIoU) of 65.20%, a +10.74% improvement over DeepLabv3+ with the same backbone.
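The mIoU figure above averages per-class Intersection over Union across the segmentation classes. A minimal sketch of the computation on flat per-pixel label lists (illustrative; evaluation toolkits differ in how absent classes and ignore labels are handled):

```python
def mean_iou(pred, target, n_classes):
    """mIoU over flat per-pixel class labels; classes absent from both
    prediction and ground truth are skipped."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

pred   = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 3))  # 0.5
```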
Affiliation(s)
- Thierry Roland Baban A Erep
- Toulouse INP, University of Toulouse, Institut de Recherche en Informatique de Toulouse, 31400 Toulouse, France
- Lotfi Chaari
- Toulouse INP, University of Toulouse, Institut de Recherche en Informatique de Toulouse, 31400 Toulouse, France
111
Stulpinas R, Morkunas M, Rasmusson A, Drachneris J, Augulis R, Gulla A, Strupas K, Laurinavicius A. Improving HCC Prognostic Models after Liver Resection by AI-Extracted Tissue Fiber Framework Analytics. Cancers (Basel) 2023;16:106. PMID: 38201532; PMCID: PMC10778366; DOI: 10.3390/cancers16010106.
Abstract
Despite advances in diagnostic and treatment technologies, predicting outcomes for patients with hepatocellular carcinoma (HCC) remains a challenge. Prognostic models are further complicated by the variable impact of the tumor properties and of the remaining liver parenchyma, often affected by the cirrhosis or non-alcoholic fatty liver disease that tend to precede HCC. This study investigated the prognostic value of reticulin and collagen microarchitecture in liver resection samples. We analyzed 105 scanned tissue sections stained using a Gordon and Sweet's silver impregnation protocol combined with Picric Acid-Sirius Red. A convolutional neural network was utilized to segment the red-staining collagen and the black linear reticulin strands, generating a detailed map of the fiber structure within the HCC and the adjacent liver tissue. Subsequent hexagonal grid subsampling, coupled with automated epithelial edge detection and computational fiber morphometry, provided the foundation for region-specific tissue analysis. Two penalized Cox regression models using LASSO achieved a concordance index (C-index) greater than 0.7. These models incorporated variables such as patient age, tumor multifocality, and fiber-derived features from the epithelial edge in both the tumor and liver compartments. The prognostic value at the tumor edge was derived from the reticulin structure, while collagen characteristics were significant at the epithelial edge of the peritumoral liver. The prognostic performance of these models was superior to that of models relying solely on conventional clinicopathologic parameters, highlighting the utility of AI-extracted microarchitectural features for the management of HCC.
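The concordance index (C-index) used to evaluate the Cox models measures how often the model ranks predicted risk consistently with observed survival. A small sketch of Harrell's C-index for right-censored data (illustrative only; ties in time and other refinements are omitted):

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: among comparable pairs (the earlier time is an
    observed event, event flag 1), count pairs where the higher-risk patient
    failed earlier; ties in risk count as 0.5."""
    num = den = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                den += 1
                if risks[i] > risks[j]:
                    num += 1
                elif risks[i] == risks[j]:
                    num += 0.5
    return num / den

times  = [2, 4, 6, 8]     # follow-up times
events = [1, 1, 0, 1]     # 1 = event observed, 0 = censored
risks  = [0.9, 0.7, 0.8, 0.1]  # model-predicted risk scores
print(concordance_index(times, events, risks))  # 0.8
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the >0.7 reported above indicates useful discriminative ability.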
Affiliation(s)
- Rokas Stulpinas
- Faculty of Medicine, Institute of Biomedical Sciences, Department of Pathology and Forensic Medicine, Vilnius University, 03101 Vilnius, Lithuania
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
- Mindaugas Morkunas
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
- Vilnius Santaros Klinikos Biobank, Vilnius University Hospital Santaros Klinikos, 08661 Vilnius, Lithuania
- Allan Rasmusson
- Faculty of Medicine, Institute of Biomedical Sciences, Department of Pathology and Forensic Medicine, Vilnius University, 03101 Vilnius, Lithuania
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
- Julius Drachneris
- Faculty of Medicine, Institute of Biomedical Sciences, Department of Pathology and Forensic Medicine, Vilnius University, 03101 Vilnius, Lithuania
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
- Renaldas Augulis
- Faculty of Medicine, Institute of Biomedical Sciences, Department of Pathology and Forensic Medicine, Vilnius University, 03101 Vilnius, Lithuania
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
- Aiste Gulla
- Faculty of Medicine, Institute of Clinical Medicine, Vilnius University, 03101 Vilnius, Lithuania
- Faculty of Medicine, Centre for Visceral Medicine and Translational Research, Vilnius University, 03101 Vilnius, Lithuania
- Center of Abdominal Surgery, Vilnius University Hospital Santaros Klinikos, 08410 Vilnius, Lithuania
- Kestutis Strupas
- Faculty of Medicine, Institute of Clinical Medicine, Vilnius University, 03101 Vilnius, Lithuania
- Faculty of Medicine, Centre for Visceral Medicine and Translational Research, Vilnius University, 03101 Vilnius, Lithuania
- Center of Abdominal Surgery, Vilnius University Hospital Santaros Klinikos, 08410 Vilnius, Lithuania
- Arvydas Laurinavicius
- Faculty of Medicine, Institute of Biomedical Sciences, Department of Pathology and Forensic Medicine, Vilnius University, 03101 Vilnius, Lithuania
- National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Klinikos, 08406 Vilnius, Lithuania
112
Sauter D, Lodde G, Nensa F, Schadendorf D, Livingstone E, Kukuk M. A Systematic Comparison of Task Adaptation Techniques for Digital Histopathology. Bioengineering (Basel) 2023; 11:19. [PMID: 38247897; PMCID: PMC10813343; DOI: 10.3390/bioengineering11010019] [Received: 11/20/2023; Revised: 12/20/2023; Accepted: 12/21/2023]
Abstract
Due to an insufficient amount of image annotation, artificial intelligence in computational histopathology usually relies on fine-tuning pre-trained neural networks. While vanilla fine-tuning has been shown to be effective, recent computer vision research has proposed improved task adaptation algorithms that promise better accuracy. While initial studies have demonstrated the benefits of these algorithms for medical AI, in particular in radiology, there is no empirical evidence of improved accuracy in histopathology. Therefore, based on the ConvNeXt architecture, our study performs a systematic comparison of nine task adaptation techniques, namely, DELTA, L2-SP, MARS-PGM, Bi-Tuning, BSS, MultiTune, SpotTune, Co-Tuning, and vanilla fine-tuning, on five histopathological classification tasks using eight datasets. The results are based on external testing and statistical validation and reveal a multifaceted picture: some techniques are better suited for histopathology than others, but, depending on the classification task, a significant relative improvement in accuracy was observed for five advanced task adaptation techniques over the control method, i.e., vanilla fine-tuning (e.g., Co-Tuning: P(≫) = 0.942, d = 2.623). Furthermore, we studied the classification accuracy of three of the nine methods as a function of training set size (e.g., Co-Tuning: P(≫) = 0.951, γ = 0.748). Overall, our results show that the performance of advanced task adaptation techniques in histopathology is affected by influencing factors such as the specific classification task or the size of the training dataset.
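Of the techniques compared, L2-SP is among the simplest to sketch: instead of plain weight decay toward zero, it penalizes the drift of the shared layers from their pre-trained starting point, with ordinary L2 on the newly added head. A minimal scalar-weight illustration (function name, coefficients, and values are hypothetical, not the paper's implementation):

```python
def l2_sp_penalty(weights, pretrained, alpha=0.01, beta=0.01, n_shared=None):
    """L2-SP regularizer: alpha term pulls the shared layers toward the
    pre-trained weights w0; beta term is plain L2 on the new head weights."""
    if n_shared is None:
        n_shared = len(pretrained)
    # distance of shared weights from the pre-trained starting point
    sp = sum((w - w0) ** 2 for w, w0 in zip(weights[:n_shared], pretrained))
    # ordinary L2 on the remaining (newly initialized) weights
    l2 = sum(w ** 2 for w in weights[n_shared:])
    return alpha / 2 * sp + beta / 2 * l2

# two shared weights (one drifted by 1.0) plus one new head weight of 0.5
penalty = l2_sp_penalty([1.0, 2.0, 0.5], [1.0, 1.0], alpha=1.0, beta=1.0)
print(penalty)  # 0.5 * 1.0 + 0.5 * 0.25 = 0.625
```

In practice the penalty is added to the task loss during fine-tuning, so the optimizer trades task accuracy against staying close to the pre-trained solution.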
Affiliation(s)
- Daniel Sauter
- Department of Computer Science, Fachhochschule Dortmund, 44227 Dortmund, Germany
- Georg Lodde
- Department of Dermatology, University Hospital Essen, 45147 Essen, Germany
- Felix Nensa
- Institute for AI in Medicine (IKIM), University Hospital Essen, 45131 Essen, Germany
- Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen, 45147 Essen, Germany
- Dirk Schadendorf
- Department of Dermatology, University Hospital Essen, 45147 Essen, Germany
- Elisabeth Livingstone
- Department of Dermatology, University Hospital Essen, 45147 Essen, Germany
- Markus Kukuk
- Department of Computer Science, Fachhochschule Dortmund, 44227 Dortmund, Germany
113
Tong MW, Tolpadi AA, Bhattacharjee R, Han M, Majumdar S, Pedoia V. Synthetic Knee MRI T1ρ Maps as an Avenue for Clinical Translation of Quantitative Osteoarthritis Biomarkers. Bioengineering (Basel) 2023; 11:17. [PMID: 38247894; PMCID: PMC10812962; DOI: 10.3390/bioengineering11010017] [Received: 11/17/2023; Revised: 12/15/2023; Accepted: 12/21/2023]
Abstract
A 2D U-Net was trained to generate synthetic T1ρ maps from T2 maps for knee MRI, to explore the feasibility of domain adaptation for enriching existing datasets and enabling rapid, reliable image reconstruction. The network was developed using 509 healthy contralateral and injured ipsilateral knee images from patients with ACL injuries and reconstruction surgeries, acquired across three institutions. Network generalizability was evaluated on 343 knees acquired in a clinical setting and 46 knees from simultaneous bilateral acquisition in a research setting. The deep neural network synthesized high-fidelity reconstructions of T1ρ maps, preserving textures and local T1ρ elevation patterns in cartilage, with a normalized mean square error of 2.4% and a Pearson correlation coefficient of 0.93. Analysis of reconstructed T1ρ maps within cartilage compartments revealed minimal bias (-0.10 ms), tight limits of agreement, and a quantification error (5.7%) below the threshold for clinically significant change (6.42%) associated with osteoarthritis. In an out-of-distribution external test set, synthetic maps preserved T1ρ textures but exhibited increased bias and wider limits of agreement. This study demonstrates the capability of image synthesis to reduce acquisition time and derive meaningful information from existing datasets, and suggests a pathway for standardizing T1ρ as a quantitative biomarker for osteoarthritis.
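The two fidelity metrics quoted above, normalized mean square error and Pearson's correlation coefficient, are straightforward to compute over flattened map values; a pure-Python sketch (toy inputs, not study data):

```python
import math

def nmse(y_true, y_pred):
    """Normalized mean square error: squared error normalized by the
    squared magnitude of the reference signal."""
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum(t ** 2 for t in y_true)
    return num / den

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# a perfectly scaled prediction has r = 1 but nonzero NMSE
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 1.0
```

The example highlights why both metrics are reported: correlation captures preserved texture patterns, while NMSE captures absolute quantitative agreement.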
Affiliation(s)
- Michelle W. Tong
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
- Department of Bioengineering, University of California Berkeley, Berkeley, CA 94720, USA
- Aniket A. Tolpadi
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
- Department of Bioengineering, University of California Berkeley, Berkeley, CA 94720, USA
- Rupsa Bhattacharjee
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
- Misung Han
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
- Sharmila Majumdar
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
- Valentina Pedoia
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA
114
Saikia MJ, Kuanar S, Mahapatra D, Faghani S. Multi-Modal Ensemble Deep Learning in Head and Neck Cancer HPV Sub-Typing. Bioengineering (Basel) 2023; 11:13. [PMID: 38247890; DOI: 10.3390/bioengineering11010013] [Received: 11/27/2023; Revised: 12/14/2023; Accepted: 12/21/2023]
Abstract
Oropharyngeal squamous cell carcinoma (OPSCC) is a common and heterogeneous form of head and neck cancer. Infection with human papillomavirus (HPV) has been identified as a major risk factor for OPSCC; therefore, differentiating HPV-positive from HPV-negative cases in OPSCC patients is an essential diagnostic factor influencing future treatment decisions. In this study, we investigated the accuracy of a deep learning-based method that automatically detects the HPV status of OPSCC from routinely acquired computed tomography (CT) and positron emission tomography (PET) images. We introduce a 3D CNN-based multi-modal feature fusion architecture for HPV status prediction in primary tumor lesions. The architecture is composed of an ensemble of CNNs and merges image features in a softmax classification layer. The pipeline separately learns the intensity, contrast variation, shape, texture heterogeneity, and metabolic assessment from CT and PET tumor volume regions and fuses these multi-modal features for the final HPV status classification. The precision, recall, and AUC scores of the proposed method were computed, and the results were compared with those of existing models. The experimental results demonstrate that the multi-modal ensemble model with soft voting outperformed single-modality PET/CT, with an AUC of 0.76 and an F1 score of 0.746 on the publicly available TCGA and MAASTRO datasets. On the MAASTRO dataset, our model achieved an AUC of 0.74 over primary tumor volumes of interest (VOIs). In the future, validation on more extensive cohorts may further improve diagnostic accuracy and provide a preliminary assessment before biopsy.
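The soft-voting fusion mentioned above amounts to averaging the class-probability vectors produced by the per-modality networks and taking the argmax of the average. A minimal sketch (the helper name and the probability values are illustrative, not from the study):

```python
def soft_vote(prob_lists):
    """Average class-probability vectors from several models and return
    (predicted class index, averaged distribution)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# two hypothetical modality branches (e.g. a CT branch and a PET branch)
pred, avg = soft_vote([[0.4, 0.6], [0.7, 0.3]])
print(pred)  # 0 — the averaged distribution favors class 0
```

Unlike hard (majority) voting, soft voting lets a confident branch outweigh an uncertain one, which is why the CT branch's 0.7 flips the decision here.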
Affiliation(s)
- Manob Jyoti Saikia
- Electrical Engineering, University of North Florida, Jacksonville, FL 32224, USA
- Shiba Kuanar
- Department of Radiology, Mayo Clinic, Rochester, MN 55905, USA
- Dwarikanath Mahapatra
- Inception Institute of Artificial Intelligence, Abu Dhabi 127788, United Arab Emirates
115
Chen L, Li J, Zou Y, Wang T. ETU-Net: edge enhancement-guided U-Net with transformer for skin lesion segmentation. Phys Med Biol 2023; 69:015001. [PMID: 38131313; DOI: 10.1088/1361-6560/ad13d2] [Received: 09/09/2023; Accepted: 12/08/2023]
Abstract
Objective. Convolutional neural network (CNN)-based deep learning algorithms have been widely used in recent years for automatic skin lesion segmentation. However, the limited receptive fields of convolutional architectures hinder their ability to model dependencies between distant image regions. The transformer, which excels at capturing long-range dependencies, is therefore often employed in conjunction with a CNN to extract both global and local information from images. However, this approach cannot accurately segment skin lesions with blurred boundaries. To overcome this difficulty, we propose ETU-Net. Approach. ETU-Net, a novel multi-scale architecture, combines edge enhancement, CNN, and transformer. We introduce the concept of edge detection operators into difference convolution, resulting in the design of the edge enhanced convolution block (EC block) and the local transformer block (LT block), which emphasize edge features. To capture the semantic information contained in local features, we propose the multi-scale local attention block (MLA block), which utilizes convolutions with different kernel sizes. Furthermore, to address the boundary uncertainty caused by patch division in the transformer, we introduce a novel global transformer block (GT block), which allows each patch to gather full-size feature information. Main results. Extensive experiments on three publicly available skin datasets (PH2, ISIC-2017, and ISIC-2018) demonstrate that ETU-Net outperforms state-of-the-art hybrid methods based on CNN and transformer in terms of segmentation performance. Moreover, ETU-Net exhibits excellent generalization ability in practical segmentation applications on dermatoscopy images contributed by the Wuxi No. 2 People's Hospital. Significance. We propose ETU-Net, a novel multi-scale U-Net model guided by edge enhancement, which addresses the challenges posed by complex lesion shapes and ambiguous boundaries in skin lesion segmentation.
Affiliation(s)
- Lifang Chen
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, People's Republic of China
- Jiawei Li
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, People's Republic of China
- Yunmin Zou
- Department of Dermatology, Wuxi No.2 People's Hospital, Wuxi, People's Republic of China
- Tao Wang
- School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, People's Republic of China
116
Chen C, Teng Y, Tan S, Wang Z, Zhang L, Xu J. Performance Test of a Well-Trained Model for Meningioma Segmentation in Health Care Centers: Secondary Analysis Based on Four Retrospective Multicenter Data Sets. J Med Internet Res 2023; 25:e44119. [PMID: 38100181; PMCID: PMC10757229; DOI: 10.2196/44119] [Received: 11/08/2022; Revised: 06/21/2023; Accepted: 11/22/2023]
Abstract
BACKGROUND Convolutional neural networks (CNNs) have produced state-of-the-art results in meningioma segmentation on magnetic resonance imaging (MRI). However, images obtained from different institutions, protocols, or scanners may show significant domain shift, leading to performance degradation and challenging model deployment in real clinical scenarios. OBJECTIVE This research aims to investigate the realistic performance of a well-trained meningioma segmentation model when deployed across different health care centers and to verify methods for enhancing its generalization. METHODS This study was performed in four centers. A total of 606 patients with 606 MRIs were enrolled between January 2015 and December 2021. Manual segmentations, determined through consensus readings by neuroradiologists, were used as the ground truth masks. The model was previously trained using a standard supervised CNN (Deeplab V3+) and was deployed and tested separately in the four health care centers. To determine the appropriate approach to mitigating the observed performance degradation, two methods were used: unsupervised domain adaptation and supervised retraining. RESULTS The trained model showed state-of-the-art segmentation performance in two health care institutions, with a Dice ratio of 0.887 (SD 0.108, 95% CI 0.903-0.925) in center A and 0.874 (SD 0.800, 95% CI 0.854-0.894) in center B. In the other two institutions, which obtained their MRIs using different scanning protocols, performance declined, with Dice ratios of 0.631 (SD 0.157, 95% CI 0.556-0.707) in center C and 0.649 (SD 0.187, 95% CI 0.566-0.732) in center D. Unsupervised domain adaptation yielded a significant improvement, with Dice ratios of 0.842 (SD 0.073, 95% CI 0.820-0.864) in center C and 0.855 (SD 0.097, 95% CI 0.826-0.886) in center D. Nonetheless, it did not outperform supervised retraining, which achieved Dice ratios of 0.899 (SD 0.026, 95% CI 0.889-0.906) in center C and 0.886 (SD 0.046, 95% CI 0.870-0.903) in center D. CONCLUSIONS Deploying a trained CNN model in different health care institutions may show significant performance degradation due to the domain shift of MRIs. Under this circumstance, unsupervised domain adaptation or supervised retraining should be considered, taking into account the balance between clinical requirements, model performance, and the size of the available data.
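The Dice ratio used throughout this evaluation is the overlap between a predicted binary mask and the ground truth; a minimal sketch over flattened 0/1 masks (toy values, not study data):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks,
    given as flattened lists of 0/1 values."""
    inter = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # convention: two empty masks are a perfect match
    return 2 * inter / total if total else 1.0

pred  = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
print(dice(pred, truth))  # 2*2 / (3+3) = 0.666...
```

A Dice of 1.0 means perfect overlap and 0.0 none, which makes the drop from ~0.88 to ~0.63 across centers a clinically meaningful degradation.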
Affiliation(s)
- Chaoyue Chen
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
- Yuen Teng
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
- Shuo Tan
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
- Zizhou Wang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
- Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore
- Lei Zhang
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
- Jianguo Xu
- Neurosurgery Department, West China Hospital, Sichuan University, Chengdu, China
117
Cabrera Castillos K, Ladouce S, Darmet L, Dehais F. Burst c-VEP Based BCI: Optimizing stimulus design for enhanced classification with minimal calibration data and improved user experience. Neuroimage 2023; 284:120446. [PMID: 37949256; DOI: 10.1016/j.neuroimage.2023.120446] [Received: 08/18/2023; Revised: 10/31/2023; Accepted: 11/06/2023]
Abstract
The use of aperiodic flickering visual stimuli in the form of code-modulated visual evoked potentials (c-VEP) represents a pivotal advancement in the field of reactive brain-computer interfaces (rBCI). A major advantage of the c-VEP approach is that model training is independent of the number and complexity of targets, which helps reduce calibration time. Nevertheless, existing c-VEP stimulus designs can be further improved, both in terms of visual user experience and to achieve a higher signal-to-noise ratio while shortening selection times and the calibration process. In this study, we introduce an innovative variant of code-VEP, referred to as "burst c-VEP". This approach presents short bursts of aperiodic visual flashes at a deliberately slow rate, typically two to four flashes per second. The rationale behind this design is to leverage the sensitivity of the primary visual cortex to transient changes in low-level stimulus features to reliably elicit distinctive series of visual evoked potentials. In comparison to faster-paced code sequences, burst c-VEPs exhibit favorable properties for achieving high bitwise decoding performance using convolutional neural networks (CNNs), with the potential to attain faster selection times while requiring less calibration data. Furthermore, we investigated reducing the perceptual saliency of c-VEP stimuli by attenuating their contrast and intensity to significantly improve users' visual comfort. The proposed solutions were tested in an offline four-class c-VEP protocol involving 12 participants. Following a factorial design, participants were instructed to focus on c-VEP targets whose pattern (burst and maximum-length sequences) and amplitude (100% or 40% amplitude depth modulation) were manipulated across experimental conditions. Firstly, the full-amplitude burst c-VEP sequences exhibited higher accuracy, ranging from 90.5% (with 17.6 s of calibration data) to 95.6% (with 52.8 s of calibration data), than their m-sequence counterparts (71.4% to 85.0%). The mean selection time for both types of codes (1.5 s) compares favorably with reports from previous studies. Secondly, our findings revealed that lowering the intensity of the stimuli only slightly decreased the accuracy of the burst code sequences, to 94.2%, while substantially improving user experience. Taken together, these results demonstrate the high potential of the proposed burst codes to advance reactive BCIs in terms of both performance and usability. The collected dataset, along with the proposed CNN architecture implementation, is shared through open-access repositories.
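The baseline maximum-length (m-)sequences that the burst codes are compared against are classically generated with a linear-feedback shift register (LFSR). A small sketch; the tap choice shown corresponds to a standard primitive polynomial and is not necessarily the register used in the study:

```python
def m_sequence(taps, n_bits, seed=None):
    """Generate one period of a maximum-length sequence from a Fibonacci
    LFSR. `taps` are 1-based register positions XORed into the feedback."""
    state = seed or [1] * n_bits      # any nonzero seed works
    out = []
    for _ in range(2 ** n_bits - 1):  # an m-sequence's full period
        out.append(state[-1])         # output the last register bit
        fb = 0
        for t in taps:
            fb ^= state[t - 1]        # feedback = XOR of tapped bits
        state = [fb] + state[:-1]     # shift right, insert feedback
    return out

# x^3 + x^2 + 1 is primitive, so the period is 2^3 - 1 = 7
seq = m_sequence(taps=(3, 2), n_bits=3)
print(seq)  # [1, 1, 1, 0, 0, 1, 0]
```

The period 2^n - 1 and the near-balanced number of ones give m-sequences their flat, noise-like autocorrelation, which is exactly the property the slower burst codes trade away in favor of well-separated evoked responses.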
Affiliation(s)
- Kalou Cabrera Castillos
- Human Factors and Neuroergonomics, Institut Supérieur de l'Aéronautique et de l'Espace, 10 Av. Edouard Belin, Toulouse, 31400, France
- Simon Ladouce
- Human Factors and Neuroergonomics, Institut Supérieur de l'Aéronautique et de l'Espace, 10 Av. Edouard Belin, Toulouse, 31400, France
- Ludovic Darmet
- Human Factors and Neuroergonomics, Institut Supérieur de l'Aéronautique et de l'Espace, 10 Av. Edouard Belin, Toulouse, 31400, France
- Frédéric Dehais
- Human Factors and Neuroergonomics, Institut Supérieur de l'Aéronautique et de l'Espace, 10 Av. Edouard Belin, Toulouse, 31400, France
- Biomedical Engineering, Drexel University, Philadelphia, PA 19104, United States
118
El-Hussieny H, Hameed IA, Nada AA. Deep CNN-Based Static Modeling of Soft Robots Utilizing Absolute Nodal Coordinate Formulation. Biomimetics (Basel) 2023; 8:611. [PMID: 38132550; PMCID: PMC10742251; DOI: 10.3390/biomimetics8080611] [Received: 11/12/2023; Revised: 11/26/2023; Accepted: 12/11/2023]
Abstract
Soft continuum robots, inspired by the adaptability and agility of natural soft-bodied organisms like octopuses and elephant trunks, present a frontier in robotics research. However, exploiting their full potential necessitates precise modeling and control for specific motion and manipulation tasks. This study introduces an innovative approach using Deep Convolutional Neural Networks (CNN) for the inverse quasi-static modeling of these robots within the Absolute Nodal Coordinate Formulation (ANCF) framework. The ANCF effectively represents the complex non-linear behavior of soft continuum robots, while the CNN-based models are optimized for computational efficiency and precision. This combination is crucial for addressing the complex inverse statics problems associated with ANCF-modeled robots. Extensive numerical experiments were conducted to assess the performance of these Deep CNN-based models, demonstrating their suitability for real-time simulation and control in statics modeling. Additionally, this study includes a detailed cross-validation experiment to identify the most effective model architecture, taking into account factors such as the number of layers, activation functions, and unit configurations. The results highlight the significant benefits of integrating Deep CNN with ANCF models, paving the way for advanced statics modeling in soft continuum robotics.
Affiliation(s)
- Haitham El-Hussieny
- Department of Mechatronics and Robotics Engineering, Egypt-Japan University of Science and Technology (E-JUST), Alexandria 21934, Egypt
- Ibrahim A. Hameed
- Department of ICT and Natural Sciences, Norwegian University of Science and Technology, 7034 Trondheim, Norway
- Ayman A. Nada
- Department of Mechatronics and Robotics Engineering, Egypt-Japan University of Science and Technology (E-JUST), Alexandria 21934, Egypt
119
Phumkuea T, Wongsirichot T, Damkliang K, Navasakulpong A, Andritsch J. MSTAC: A Multi-Stage Automated Classification of COVID-19 Chest X-ray Images Using Stacked CNN Models. Tomography 2023; 9:2233-2246. [PMID: 38133077; PMCID: PMC10747997; DOI: 10.3390/tomography9060173] [Received: 11/02/2023; Revised: 12/08/2023; Accepted: 12/08/2023]
Abstract
This study introduces a Multi-Stage Automated Classification (MSTAC) system for COVID-19 chest X-ray (CXR) images, utilizing stacked convolutional neural network (CNN) models. Suspected COVID-19 patients often undergo CXR imaging, making it valuable for disease classification. The study collected CXR images from public datasets and aimed to differentiate between COVID-19, non-COVID-19, and healthy cases. MSTAC employs two classification stages: the first distinguishes healthy from unhealthy cases, and the second further separates COVID-19 from non-COVID-19 cases. Compared to a single CNN-Multiclass model, MSTAC demonstrated superior classification performance, achieving 97.30% accuracy and sensitivity versus 94.76% for the CNN-Multiclass model. The system also outperformed similar techniques, underscoring its accuracy and efficiency in COVID-19 diagnosis and its potential to assist healthcare professionals in efficiently diagnosing COVID-19 cases.
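The two-stage cascade described above reduces a three-way problem to two binary decisions. A minimal control-flow sketch, with stand-in classifiers whose names, features, and thresholds are hypothetical placeholders for the trained CNN stages:

```python
def mstac_predict(x, stage1, stage2):
    """Two-stage cascade: stage1 separates healthy from unhealthy cases;
    only unhealthy cases are passed to stage2 (COVID-19 vs non-COVID-19)."""
    if stage1(x) == "healthy":
        return "healthy"
    return stage2(x)  # "covid" or "non-covid"

# hypothetical stand-ins for the two trained CNN stages
stage1 = lambda x: "healthy" if x["opacity"] < 0.2 else "unhealthy"
stage2 = lambda x: "covid" if x["bilateral"] else "non-covid"

print(mstac_predict({"opacity": 0.1, "bilateral": False}, stage1, stage2))  # healthy
print(mstac_predict({"opacity": 0.8, "bilateral": True}, stage1, stage2))   # covid
```

Splitting the decision this way lets each stage specialize on an easier binary boundary, which is one plausible reason the cascade outperforms a single multiclass model.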
Affiliation(s)
- Thanakorn Phumkuea
- College of Digital Science, Prince of Songkla University, Songkhla 90110, Thailand
- Thakerng Wongsirichot
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla 90110, Thailand
- Kasikrit Damkliang
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla 90110, Thailand
- Asma Navasakulpong
- Division of Respiratory and Respiratory Critical Care Medicine, Prince of Songkla University, Songkhla 90110, Thailand
- Jarutas Andritsch
- Faculty of Business, Law and Digital Technologies, Solent University, Southampton SO14 0YN, UK
120
Schaufelberger M, Kühle RP, Wachter A, Weichel F, Hagen N, Ringwald F, Eisenmann U, Hoffmann J, Engel M, Freudlsperger C, Nahm W. Impact of data synthesis strategies for the classification of craniosynostosis. Front Med Technol 2023; 5:1254690. [PMID: 38192519; PMCID: PMC10773901; DOI: 10.3389/fmedt.2023.1254690] [Received: 07/07/2023; Accepted: 11/23/2023]
Abstract
Introduction. Photogrammetric surface scans provide a radiation-free option to assess and classify craniosynostosis. Due to the low prevalence of craniosynostosis and strict patient restrictions, clinical data are rare. Synthetic data could support or even replace clinical data for the classification of craniosynostosis, but this has never been studied systematically. Methods. We tested combinations of three different synthetic data sources, a statistical shape model (SSM), a generative adversarial network (GAN), and image-based principal component analysis, for a convolutional neural network (CNN)-based classification of craniosynostosis. The CNN was trained only on synthetic data but validated and tested on clinical data. Results. The combination of an SSM and a GAN achieved an accuracy of 0.960 and an F1 score of 0.928 on the unseen test set; the difference from training on clinical data was smaller than 0.01. Including a second image modality improved classification performance for all data sources. Conclusions. Without a single clinical training sample, a CNN was able to classify head deformities with accuracy similar to that of training on clinical data. Using multiple data sources was key to good classification based on synthetic data alone. Synthetic data might play an important future role in the assessment of craniosynostosis.
Affiliation(s)
- Matthias Schaufelberger
- Institute of Biomedical Engineering (IBT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Reinald Peter Kühle
- Department of Oral, Dental and Maxillofacial Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Andreas Wachter
- Institute of Biomedical Engineering (IBT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Frederic Weichel
- Department of Oral, Dental and Maxillofacial Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Niclas Hagen
- Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Friedemann Ringwald
- Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Urs Eisenmann
- Institute of Medical Informatics, Heidelberg University Hospital, Heidelberg, Germany
- Jürgen Hoffmann
- Department of Oral, Dental and Maxillofacial Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Michael Engel
- Department of Oral, Dental and Maxillofacial Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Christian Freudlsperger
- Department of Oral, Dental and Maxillofacial Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Werner Nahm
- Institute of Biomedical Engineering (IBT), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
121
Liu S, Shen W, Wu CQ, Lyu X. Optimizing Temperature Setting for Decomposition Furnace Based on Attention Mechanism and Neural Networks. Sensors (Basel) 2023; 23:9754. [PMID: 38139598; PMCID: PMC10747360; DOI: 10.3390/s23249754] [Received: 11/24/2023; Revised: 12/06/2023; Accepted: 12/08/2023]
Abstract
The temperature setting for a decomposition furnace is of great importance for maintaining the normal operation of the furnace and other equipment in a cement plant and for ensuring the output of high-quality cement products. Based on the principles of deep convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and attention mechanisms, we propose a CNN-LSTM-A model to optimize the temperature settings for a decomposition furnace. The proposed model combines features selected by the Least Absolute Shrinkage and Selection Operator (Lasso) with others suggested by domain experts as inputs, and uses the CNN to mine spatial features, the LSTM to extract time-series information, and an attention mechanism to optimize weights. We deployed sensors to collect production measurements at a real-life cement factory for experimentation and investigated the impact of hyperparameter changes on the performance of the proposed model. Experimental results show that CNN-LSTM-A achieves superior prediction accuracy over existing models such as the basic LSTM model, the deep-convolution-based LSTM model, and the attention-mechanism-based LSTM model. The proposed model has potential for wide deployment in cement plants to automate and optimize the operation of decomposition furnaces.
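The attention stage in such a CNN-LSTM-A pipeline typically softmaxes a score per time step and pools the LSTM outputs into one weighted summary vector. A pure-Python sketch of that pooling step (names and values are illustrative, not the paper's implementation):

```python
import math

def attention_pool(h, scores):
    """Softmax the per-time-step attention scores, then return the
    weighted sum of the hidden-state vectors h plus the weights."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(h[0])
    pooled = [sum(w * step[d] for w, step in zip(weights, h))
              for d in range(dim)]
    return pooled, weights

# three time steps of 2-d LSTM outputs; equal scores reduce to averaging
pooled, w = attention_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                           [0.0, 0.0, 0.0])
print([round(v, 3) for v in pooled])  # [0.667, 0.667]
```

Learned (non-uniform) scores shift the weights toward the most informative time steps, which is the "optimize weights" role the abstract assigns to the attention mechanism.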
Affiliation(s)
- Shangkun Liu
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
- Wei Shen
- School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
- Chase Q. Wu
- Department of Data Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
- Xukang Lyu
- Zhejiang New Rise Digital Technology Co., Ltd., Hangzhou 311899, China
122
Sharma D, Singh J, Shah B, Ali F, AlZubi AA, AlZubi MA. Public mental health through social media in the post COVID-19 era. Front Public Health 2023; 11:1323922. [PMID: 38146469 PMCID: PMC10749364 DOI: 10.3389/fpubh.2023.1323922]
Abstract
Social media is a powerful communication tool and a reflection of our digital environment. Social media acted as an augmenter and influencer during and after COVID-19. Many of the people sharing social media posts were not actually aware of their mental health status. This situation warrants automating the detection of mental disorders. This paper presents a methodology for the detection of mental disorders using micro facial expressions. Micro-expressions are momentary, involuntary facial expressions that can be indicative of deeper feelings and mental states. Nevertheless, manually detecting and interpreting micro-expressions can be rather challenging. A deep learning HybridMicroNet model, based on convolutional neural networks, is proposed for emotion recognition from micro-expressions. Further, a case study for the detection of mental health disorders has been undertaken. The findings demonstrate that the proposed model achieved high accuracy when diagnosing mental health disorders based on micro-expressions. The accuracy attained on the CASME dataset was 99.08%, whereas the accuracy achieved on the SAMM dataset was 97.62%. Based on these findings, deep learning may prove to be an effective method for diagnosing mental health conditions by analyzing micro-expressions.
Affiliation(s)
- Deepika Sharma
- Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
- Jaiteg Singh
- Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
- Babar Shah
- College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
- Farman Ali
- Department of Computer Science and Engineering, School of Convergence, College of Computing and Informatics, Sungkyunkwan University, Seoul, Republic of Korea
- Ahmad Ali AlZubi
- Department of Computer Science, Community College, King Saud University, Riyadh, Saudi Arabia
- Mallak Ahmad AlZubi
- Faculty of Medicine, Jordan University of Science and Technology, Irbid, Jordan
123
Alohali MA, Elsadig M, Hilal AM, Mutwakel A. Emerging framework for attack detection in cyber-physical systems using heuristic-based optimization algorithm. PeerJ Comput Sci 2023; 9:e1596. [PMID: 38192469 PMCID: PMC10773567 DOI: 10.7717/peerj-cs.1596]
Abstract
In recent days, cyber-physical systems (CPS) have become a new wave generation of human life, exploiting various smart and intelligent uses of automotive systems. In these systems, information is shared through networks, and data is collected from multiple sensor devices. This network has sophisticated control, wireless communication, and high-speed computation. These features are commonly available in CPS, allowing multiple users to access and share information through the network via remote access. Therefore, protecting resources and sensitive information in the network is essential. Many research works have been developed for detecting insecure networks and attacks in the network. This article introduces a framework, namely the Deep Bagging Convolutional Neural Network with Heuristic Multiswarm Ant Colony Optimization (DCNN-HMACO), designed to enhance the secure transmission of information, improve efficiency, and provide convenience in CPS. The proposed framework aims to detect attacks in CPS effectively. Compared to existing methods, the DCNN-HMACO framework significantly improves attack detection rates and enhances overall system protection. While the accuracy rates of CNN and FCM are reported as 72.12% and 79.56% respectively, the proposed framework achieves a remarkable accuracy rate of 92.14%.
Affiliation(s)
- Manal Abdullah Alohali
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Muna Elsadig
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Anwer Mustafa Hilal
- Department of Computer and Self Development, Prince Sattam bin Abdulaziz University, Saudi Arabia
- Abdulwahed Mutwakel
- Department of Information Systems, Prince Sattam bin Abdulaziz University, Saudi Arabia
124
Wu JCH, Yu HW, Tsai TH, Lu HHS. Dynamically Synthetic Images for Federated Learning of medical images. Comput Methods Programs Biomed 2023; 242:107845. [PMID: 37852147 DOI: 10.1016/j.cmpb.2023.107845]
Abstract
BACKGROUND To develop deep learning models for medical diagnosis, it is important to collect more medical data from several medical institutions. Due to regulations addressing privacy concerns, it is infeasible to gather data from various medical institutions into one institution for centralized learning. Federated Learning (FL) provides a feasible approach to jointly train a deep learning model with data stored in various medical institutions instead of collected together. However, the resulting FL models can be biased towards institutions with larger training datasets. METHODOLOGY In this study, we propose the method of Dynamically Synthetic Images for Federated Learning (DSIFL), which aims to integrate the information of local institutions with heterogeneous types of data. The main technique of DSIFL is a synthesis method that dynamically adjusts the number of synthetic images resembling local data that are misclassified by the current model. The resulting global model can handle the diversity in heterogeneous types of data collected in local medical institutions by including synthetic images similar to misclassified cases in local collections during training. RESULTS As the model performance evaluation metric, we focus on the accuracy on each client's dataset. In the experiments, the DSIFL model achieves higher accuracy than the FL approach. CONCLUSION In this study, we propose the DSIFL framework, which achieves improvements over the conventional FL approach. We conduct empirical studies with two kinds of medical images and compare the performance of variants of the FL and DSIFL approaches. The performance of individual training is used as the baseline, whereas the performance of centralized learning is used as the target for the comparison studies. The empirical findings suggest that DSIFL improves upon FL via its technique of dynamically synthetic images in training.
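The abstract's key idea, scaling each client's number of synthetic images with the cases the current global model misclassifies there, can be illustrated with a toy budgeting rule (the function name, per-case multiplier, and cap are hypothetical, not the paper's actual procedure):

```python
def synthetic_budget(n_misclassified, per_case=5, cap=500):
    """Toy rule: generate more synthetic images for clients whose local
    data the current global model misclassifies more often, up to a cap."""
    return min(cap, per_case * n_misclassified)

# A client with no misclassified cases contributes no synthetic images;
# a struggling client's budget grows until it hits the cap.
budgets = [synthetic_budget(n) for n in (0, 10, 1000)]  # [0, 50, 500]
```

The cap keeps a single badly-fitting client from dominating the next training round, which is in the spirit of counteracting the size bias the abstract describes.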
Affiliation(s)
- Jacky Chung-Hao Wu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Hsuan-Wen Yu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Tsung-Hung Tsai
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
- Henry Horng-Shing Lu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC; Department of Statistics and Data Science, Cornell University, New York, USA
125
Andrade-Miranda G, Jaouen V, Tankyevych O, Cheze Le Rest C, Visvikis D, Conze PH. Multi-modal medical Transformers: A meta-analysis for medical image segmentation in oncology. Comput Med Imaging Graph 2023; 110:102308. [PMID: 37918328 DOI: 10.1016/j.compmedimag.2023.102308]
Abstract
Multi-modal medical image segmentation is a crucial task in oncology that enables the precise localization and quantification of tumors. The aim of this work is to present a meta-analysis of the use of multi-modal medical Transformers for medical image segmentation in oncology, specifically focusing on multi-parametric MR brain tumor segmentation (BraTS2021), and head and neck tumor segmentation using PET-CT images (HECKTOR2021). The multi-modal medical Transformer architectures presented in this work exploit the idea of modality interaction schemes based on visio-linguistic representations: (i) single-stream, where modalities are jointly processed by one Transformer encoder, and (ii) multiple-stream, where the inputs are encoded separately before being jointly modeled. A total of fourteen multi-modal architectures are evaluated using different ranking strategies based on dice similarity coefficient (DSC) and average symmetric surface distance (ASSD) metrics. In addition, cost indicators such as the number of trainable parameters and the number of multiply-accumulate operations (MACs) are reported. The results demonstrate that multi-path hybrid CNN-Transformer-based models improve segmentation accuracy when compared to traditional methods, but come at the cost of increased computation time and potentially larger model size.
Affiliation(s)
- Vincent Jaouen
- LaTIM UMR 1101, Inserm, Brest, France; IMT Atlantique, Brest, France
- Olena Tankyevych
- LaTIM UMR 1101, Inserm, Brest, France; Nuclear Medicine, University Hospital of Poitiers, Poitiers, France
- Catherine Cheze Le Rest
- LaTIM UMR 1101, Inserm, Brest, France; Nuclear Medicine, University Hospital of Poitiers, Poitiers, France
126
Choi Y, Jang H, Baek J. Chest tomosynthesis deblurring using CNN with deconvolution layer for vertebrae segmentation. Med Phys 2023; 50:7714-7730. [PMID: 37401539 DOI: 10.1002/mp.16576]
Abstract
BACKGROUND Limited scan angles cause severe distortions and artifacts in reconstructed tomosynthesis images when the Feldkamp-Davis-Kress (FDK) algorithm is used for the purpose, which degrades clinical diagnostic performance. These blurring artifacts are fatal in chest tomosynthesis images because precise vertebrae segmentation is crucial for various diagnostic analyses, such as early diagnosis, surgical planning, and injury detection. Moreover, because most spinal pathologies are related to vertebral conditions, the development of methods for accurate and objective vertebrae segmentation in medical images is an important and challenging research area. PURPOSE The existing point-spread-function-(PSF)-based deblurring methods use the same PSF in all sub-volumes without considering the spatially varying property of tomosynthesis images. This increases the PSF estimation error, thus further degrading the deblurring performance. However, the proposed method estimates the PSF more accurately by using sub-CNNs that contain a deconvolution layer for each sub-system, which improves the deblurring performance. METHODS To minimize the effect of the spatially varying property, the proposed deblurring network architecture comprises four modules: (1) block division module, (2) partial PSF module, (3) deblurring block module, and (4) assembling block module. We compared the proposed DL-based method with the FDK algorithm, total-variation iterative reconstruction with GP-BB (TV-IR), 3D U-Net, FBPConvNet, and two-phase deblurring method. To investigate the deblurring performance of the proposed method, we evaluated its vertebrae segmentation performance by comparing the pixel accuracy (PA), intersection-over-union (IoU), and F-score values of reference images to those of the deblurred images. Also, pixel-based evaluations of the reference and deblurred images were performed by comparing their root mean squared error (RMSE) and visual information fidelity (VIF) values. 
In addition, 2D analysis of the deblurred images was performed using the artifact spread function (ASF) and the full width at half maximum (FWHM) of the ASF curve. RESULTS The proposed method was able to recover the original structure significantly, thereby further improving the image quality. The proposed method yielded the best deblurring performance in terms of vertebrae segmentation and similarity. The IoU, F-score, and VIF values of the chest tomosynthesis images reconstructed using the proposed SV method were 53.5%, 28.7%, and 63.2% higher, respectively, than those of the images reconstructed using the FDK method, and the RMSE value was 80.3% lower. These quantitative results indicate that the proposed method can effectively restore both the vertebrae and the surrounding soft tissue. CONCLUSIONS We proposed a chest tomosynthesis deblurring technique for vertebrae segmentation by considering the spatially varying property of tomosynthesis systems. The results of quantitative evaluations indicated that the vertebrae segmentation performance of the proposed method was better than that of the existing deblurring methods.
Affiliation(s)
- Yunsu Choi
- School of Integrated Technology, Yonsei University, Incheon, South Korea
- Hanjoo Jang
- School of Integrated Technology, Yonsei University, Incheon, South Korea
- Jongduk Baek
- Department of Artificial Intelligence, College of Computing, Yonsei University, Incheon, South Korea
127
Chinnasamy P, Wong WK, Raja AA, Khalaf OI, Kiran A, Babu JC. Health Recommendation System using Deep Learning-based Collaborative Filtering. Heliyon 2023; 9:e22844. [PMID: 38144343 PMCID: PMC10746410 DOI: 10.1016/j.heliyon.2023.e22844]
Abstract
Healthcare is the crucial aspect of the medical sector in today's modern society. To analyze a massive quantity of medical information, a medical system is necessary to gain additional perspectives and facilitate prediction and diagnosis. Such a system should be intelligent enough to analyze a patient's state of health through social activities, individual health information, and behavior analysis. The Health Recommendation System (HRS) has become an essential mechanism for medical care. In this sense, efficient healthcare networks are critical for medical decision-making processes. The fundamental purpose is to ensure that sensitive information is shared only at the right moment while guaranteeing data effectiveness, authenticity, security, and compliance with legal concerns. As some people use social media to recognize their medical problems, healthcare recommendation systems need to generate findings such as diagnosis recommendations, medical insurance, pathway-based care strategies, and homeopathic remedies associated with a patient's health status. New studies aimed at the use of vast amounts of health information by integrating multidisciplinary data from various sources are addressed, which also decreases the burden and cost of health care. This article presents an intelligent HRS using a deep learning system combining a Restricted Boltzmann Machine (RBM) with a Convolutional Neural Network (CNN), provides insights into how data mining techniques can be used to build an efficient and effective health recommendation engine, and highlights the pharmaceutical industry's ability to transition from a conventional scenario towards a more personalized one. We developed the proposed system using TensorFlow and Python. We evaluate the suggested method's performance using distinct error measures, compared with alternative methods on a health care dataset.
Furthermore, the suggested approach's accuracy, precision, recall, and F-measure were compared with the current methods.
Affiliation(s)
- P. Chinnasamy
- Department of Computer Science and Engineering, MLR Institute of Technology, Hyderabad, India
- A. Ambeth Raja
- PG Department of Computer Science, Thiruthangal Nadar College, Chennai, 600051, India
- Osamah Ibrahim Khalaf
- Department of Solar, Al-Nahrain Research Center for Renewable Energy, Al-Nahrain University, Jadriya, Baghdad, Iraq
- Ajmeera Kiran
- Department of Computer Science and Engineering, MLR Institute of Technology, Hyderabad, Telangana, 500043, India
- J. Chinna Babu
- Department of Electronics and Communication Engineering, Annamacharya Institute of Technology and Sciences, Rajampet, AP, India
128
Ozaltin O, Yeniay O, Subasi A. OzNet: A New Deep Learning Approach for Automated Classification of COVID-19 Computed Tomography Scans. Big Data 2023; 11:420-436. [PMID: 36927081 DOI: 10.1089/big.2022.0042]
Abstract
Coronavirus disease 2019 (COVID-19) is spreading rapidly around the world. Therefore, the classification of computed tomography (CT) scans alleviates the workload of experts, which increased considerably during the pandemic. Convolutional neural network (CNN) architectures are successful for the classification of medical images. In this study, we have developed a new deep CNN architecture called OzNet and compared it with pretrained architectures, namely AlexNet, DenseNet201, GoogleNet, NASNetMobile, ResNet-50, SqueezeNet, and VGG-16. In addition, we have compared the classification success of three preprocessing methods against raw CT scans: we have not only classified the raw CT scans, but also performed the classification with discrete wavelet transform (DWT), intensity adjustment, and gray-to-color red, green, blue image conversion applied to the data sets. The architecture's performance increases with the use of the DWT preprocessing method rather than the raw data set. The results are extremely promising for the CNN algorithms using the COVID-19 CT scans processed with the DWT. The proposed DWT-OzNet has achieved a high classification performance of more than 98.8% for each calculated metric.
Affiliation(s)
- Oznur Ozaltin
- Department of Statistics, Institute of Science, Hacettepe University, Ankara, Turkey
- Ozgur Yeniay
- Department of Statistics, Institute of Science, Hacettepe University, Ankara, Turkey
- Abdulhamit Subasi
- Institute of Biomedicine, Faculty of Medicine, University of Turku, Turku, Finland
- Department of Computer Science, College of Engineering, Effat University, Jeddah, Saudi Arabia
129
Abdollahifard S, Farrokhi A, Kheshti F, Jalali M, Mowla A. Application of convolutional network models in detection of intracranial aneurysms: A systematic review and meta-analysis. Interv Neuroradiol 2023; 29:738-747. [PMID: 35549574 PMCID: PMC10680951 DOI: 10.1177/15910199221097475]
Abstract
INTRODUCTION Intracranial aneurysms have a high prevalence in the human population, a heavy burden of disease, and a high mortality rate in the case of rupture. A convolutional neural network (CNN) is a type of deep learning architecture that has proven powerful for detecting intracranial aneurysms. METHODS Four databases were searched using "artificial intelligence", "intracranial aneurysms", and synonyms to find eligible studies. Articles that applied CNNs for the detection of intracranial aneurysms were included in this review. The sensitivity and specificity of the models and of human readers, by modality and by aneurysm size and location, were extracted. A random-effects model was preferred for the analyses, which used CMA 2 to determine pooled sensitivity and specificity. RESULTS Overall, 20 studies were used in this review. Deep learning models could detect intracranial aneurysms with a sensitivity of 90.6% (CI: 87.2-93.2%) and a specificity of 94.6% (CI: 0.914-0.966). CTA was the most sensitive modality (92.0% (CI: 85.2-95.8%)). The overall sensitivity of the models for aneurysms larger than 3 mm was above 98% (98-100%), and 74.6% for aneurysms smaller than 3 mm. With the aid of AI, the clinicians' sensitivity increased by 12.8% and interrater agreement by 0.193. CONCLUSION CNN models had an acceptable sensitivity for the detection of intracranial aneurysms, surpassing human readers in some fields. The logical approach for the application of deep learning models would be their use as highly capable assistants. In essence, deep learning models are a groundbreaking technology that can assist clinicians and allow them to diagnose intracranial aneurysms more accurately.
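The pooled sensitivity and specificity reported above come from a random-effects meta-analysis (computed in CMA 2). A minimal sketch of the standard DerSimonian-Laird random-effects pooling on the logit scale, using made-up study values rather than the review's actual data:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def dersimonian_laird(effects, variances):
    """Pool per-study effects with DerSimonian-Laird random-effects weights."""
    w = [1.0 / v for v in variances]                              # inverse-variance weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)     # fixed-effect estimate
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))   # Cochran's Q
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)                 # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]                # random-effects weights
    return sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)

# Hypothetical per-study sensitivities and logit-scale variances.
sens = [0.88, 0.92, 0.90]
variances = [0.05, 0.04, 0.06]
pooled = inv_logit(dersimonian_laird([logit(s) for s in sens], variances))
```

Pooling on the logit scale and back-transforming keeps the pooled proportion inside (0, 1), which is why meta-analysis software typically works with logit-transformed sensitivities.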
Affiliation(s)
- Saeed Abdollahifard
- Research center for neuromodulation and pain, Shiraz, Iran
- Student research committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Amirmohammad Farrokhi
- Research center for neuromodulation and pain, Shiraz, Iran
- Student research committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Fatemeh Kheshti
- Research center for neuromodulation and pain, Shiraz, Iran
- Student research committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Mahtab Jalali
- Research center for neuromodulation and pain, Shiraz, Iran
- Student research committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Ashkan Mowla
- Division of Stroke and Endovascular Neurosurgery, Department of Neurological Surgery, Keck School of Medicine, University of Southern California (USC), Los Angeles, CA, USA
130
Zhang Z, Chen T, Liu Y, Wang C, Zhao K, Liu CH, Fu X. Decoding the temporal representation of facial expression in face-selective regions. Neuroimage 2023; 283:120442. [PMID: 37926217 DOI: 10.1016/j.neuroimage.2023.120442]
Abstract
The ability of humans to discern facial expressions in a timely manner typically relies on distributed face-selective regions for rapid neural computations. To study the time course of this process in regions of interest, we used magnetoencephalography (MEG) to measure neural responses while participants viewed facial expressions depicting seven types of emotions (happiness, sadness, anger, disgust, fear, surprise, and neutral). Analysis of the time-resolved decoding of neural responses in face-selective sources within the inferior parietal cortex (IP-faces), lateral occipital cortex (LO-faces), fusiform gyrus (FG-faces), and posterior superior temporal sulcus (pSTS-faces) revealed that facial expressions were successfully classified starting from ∼100 to 150 ms after stimulus onset. Interestingly, the LO-faces and IP-faces showed greater accuracy than FG-faces and pSTS-faces. To examine the nature of the information processed in these face-selective regions, we entered the facial expression stimuli into a convolutional neural network (CNN) to perform similarity analyses against human neural responses. The results showed that neural responses in the LO-faces and IP-faces, starting ∼100 ms after the stimuli, were more strongly correlated with deep representations of emotional categories than with image-level information from the input images. Additionally, we observed a relationship between behavioral performance and the neural responses in the LO-faces and IP-faces, but not in the FG-faces and pSTS-faces. Together, these results provide a comprehensive picture of the time course and nature of the information involved in facial expression discrimination across multiple face-selective regions, which advances our understanding of how the human brain processes facial expressions.
Affiliation(s)
- Zhihao Zhang
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Tong Chen
- Chongqing Key Laboratory of Non-Linear Circuit and Intelligent Information Processing, Southwest University, Chongqing 400715, China; Chongqing Key Laboratory of Artificial Intelligence and Service Robot Control Technology, Chongqing 400715, China
- Ye Liu
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Chongyang Wang
- Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Ke Zhao
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
- Chang Hong Liu
- Department of Psychology, Bournemouth University, Dorset, United Kingdom
- Xiaolan Fu
- State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
131
Mai M, Luo S, Fasciano S, Oluwole TE, Ortiz J, Pang Y, Wang S. Morphology-based deep learning approach for predicting adipogenic and osteogenic differentiation of human mesenchymal stem cells (hMSCs). Front Cell Dev Biol 2023; 11:1329840. [PMID: 38099293 PMCID: PMC10720363 DOI: 10.3389/fcell.2023.1329840]
Abstract
Human mesenchymal stem cells (hMSCs) are multipotent progenitor cells with the potential to differentiate into various cell types, including osteoblasts, chondrocytes, and adipocytes. These cells have been extensively employed in the field of cell-based therapies and regenerative medicine due to their inherent attributes of self-renewal and multipotency. Traditional approaches for assessing hMSCs differentiation capacity have relied heavily on labor-intensive techniques, such as RT-PCR, immunostaining, and Western blot, to identify specific biomarkers. However, these methods are not only time-consuming and economically demanding, but also require the fixation of cells, resulting in the loss of temporal data. Consequently, there is an emerging need for a more efficient and precise approach to predict hMSCs differentiation in live cells, particularly for osteogenic and adipogenic differentiation. In response to this need, we developed innovative approaches that combine live-cell imaging with cutting-edge deep learning techniques, specifically employing a convolutional neural network (CNN) to meticulously classify osteogenic and adipogenic differentiation. Specifically, four notable pre-trained CNN models, VGG 19, Inception V3, ResNet 18, and ResNet 50, were developed and tested for identifying adipogenic and osteogenic differentiated cells based on cell morphology changes. We rigorously evaluated the performance of these four models concerning binary and multi-class classification of differentiated cells at various time intervals, focusing on pivotal metrics such as accuracy, the area under the receiver operating characteristic curve (AUC), sensitivity, precision, and F1-score. Among these four different models, ResNet 50 has proven to be the most effective choice with the highest accuracy (0.9572 for binary, 0.9474 for multi-class) and AUC (0.9958 for binary, 0.9836 for multi-class) in both multi-class and binary classification tasks. 
Although VGG 19 matched the accuracy of ResNet 50 in both tasks, ResNet 50 consistently outperformed it in terms of AUC, underscoring its superior effectiveness in identifying differentiated cells. Overall, our study demonstrated the capability to use a CNN approach to predict stem cell fate based on morphology changes, which will potentially provide insights for the application of cell-based therapy and advance our understanding of regenerative medicine.
Affiliation(s)
- Maxwell Mai
- Department of Mathematics, Southern Connecticut State University, New Haven, CT, United States
- Shuai Luo
- Department of Chemistry, Chemical and Biomedical Engineering, University of New Haven, West Haven, CT, United States
- Samantha Fasciano
- Department of Cellular and Molecular Biology, University of New Haven, West Haven, CT, United States
- Timilehin Esther Oluwole
- Department of Chemistry, Chemical and Biomedical Engineering, University of New Haven, West Haven, CT, United States
- Justin Ortiz
- Department of Mechanical and Industrial Engineering, University of New Haven, West Haven, CT, United States
- Yulei Pang
- Department of Mathematics, Southern Connecticut State University, New Haven, CT, United States
- Shue Wang
- Department of Chemistry, Chemical and Biomedical Engineering, University of New Haven, West Haven, CT, United States
132
Utku A. Deep learning based hybrid prediction model for predicting the spread of COVID-19 in the world's most populous countries. Expert Syst Appl 2023; 231:120769. [PMID: 37334273 PMCID: PMC10260264 DOI: 10.1016/j.eswa.2023.120769]
Abstract
COVID-19 is not only a disease and health phenomenon; it also has adverse sociological and economic effects. Accurate prediction of the spread of the epidemic will help in the planning of health management and the development of economic and sociological action plans. In the literature, there are many studies that analyse and predict the spread of COVID-19 in cities and countries. However, there is no study that predicts and analyses the cross-country spread in the world's most populous countries. This study aimed to predict the spread of the COVID-19 epidemic. The motivation is to reduce the workload of health workers, take preventive measures, and optimize health processes by predicting the spread of the epidemic. A hybrid deep learning model was developed to predict and analyse the cross-country spread of COVID-19, and a case study was carried out for the world's most populous countries. The developed model was tested extensively using RMSE, MAE and R2. The experimental results showed that the developed model was more successful at predicting and analysing the cross-country spread of COVID-19 in the world's most populous countries than LR, RF, SVM, MLP, CNN, GRU, LSTM and a base CNN-GRU. In the developed model, CNN performs convolution and pooling operations to extract spatial features from the input data, and GRU learns the long-term and non-linear relationships inferred by the CNN. The developed hybrid model was more successful than the other models compared because it enabled the effective features of the CNN and GRU models to be used together. The prediction and analysis of the cross-country spread of COVID-19 in the world's most populated countries can be presented as a novelty of this study.
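The model above is scored with RMSE, MAE and R2, which reduce to a few lines of arithmetic. A hedged sketch of those metrics (not the paper's code):

```python
def rmse(y_true, y_pred):
    """Root mean squared error."""
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

RMSE penalises large misses more than MAE, while R2 normalises against a constant-mean baseline, which is why the three are typically reported together for case-count forecasts.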
Affiliation(s)
- Anil Utku
- Department of Computer Engineering, Faculty of Engineering, Munzur University, 62100 Tunceli, Turkey
133
Bhatt P, Kumar Y, Soulaïmani A. Deep convolutional architectures for extrapolative forecasts in time-dependent flow problems. Adv Model Simul Eng Sci 2023; 10:17. [PMID: 38046086 PMCID: PMC10689563 DOI: 10.1186/s40323-023-00254-y] [Received: 08/04/2023] [Accepted: 11/08/2023] [Indexed: 12/05/2023]
Abstract
Physical systems whose dynamics are governed by partial differential equations (PDEs) find numerous applications in science and engineering. The process of obtaining the solution from such PDEs may be computationally expensive for large-scale and parameterized problems. In this work, deep learning techniques developed especially for time-series forecasts, such as LSTM and TCN, or for spatial-feature extraction, such as CNN, are employed to model the system dynamics for advection-dominated problems. This paper proposes a Convolutional Autoencoder (CAE) model for compression and a CNN future-step predictor for forecasting. These models take as input a sequence of high-fidelity vector solutions for consecutive time steps obtained from the PDEs and forecast the solutions for the subsequent time steps using auto-regression, thereby reducing the computation time and power needed to obtain such high-fidelity solutions. Non-intrusive reduced-order modeling techniques such as deep auto-encoder networks are utilized to compress the high-fidelity snapshots before feeding them as input to the forecasting models in order to reduce the complexity and the required computations in the online and offline stages. The models are tested on numerical benchmarks (1D Burgers' equation and Stoker's dam-break problem) to assess the long-term prediction accuracy, even outside the training domain (i.e. extrapolation). The most accurate model is then used to model a hypothetical dam break in a river with complex 2D bathymetry. The proposed CNN future-step predictor revealed much more accurate forecasting than LSTM and TCN in the considered spatiotemporal problems.
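The auto-regressive forecasting loop described above — predict one step, feed the prediction back, repeat — can be sketched generically. The `predictor` below is a deliberately trivial linear-extrapolation stand-in for the paper's CNN future-step model:

```python
def autoregressive_forecast(history, predictor, window, steps):
    """Roll a one-step predictor forward: each new prediction is appended
    to the sequence and fed back as input for the next step."""
    seq = list(history)
    forecasts = []
    for _ in range(steps):
        nxt = predictor(seq[-window:])
        forecasts.append(nxt)
        seq.append(nxt)
    return forecasts

def linear_extrapolate(w):
    # toy stand-in predictor: continue the last linear trend
    return 2 * w[-1] - w[-2]

print(autoregressive_forecast([0, 1, 2, 3], linear_extrapolate, window=2, steps=3))  # → [4, 5, 6]
```

Because errors compound through the feedback loop, long rollouts like this are exactly where extrapolation accuracy (the paper's focus) becomes the hard part.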
Affiliation(s)
- Pratyush Bhatt
- Department of Mechanical Engineering, Delhi Technological University, P4X9+Q8X, Bawana Rd, Shahbad Daulatpur Village, Rohini, New Delhi, 110042 Delhi India
- Yash Kumar
- Department of Mechanical Engineering, Delhi Technological University, P4X9+Q8X, Bawana Rd, Shahbad Daulatpur Village, Rohini, New Delhi, 110042 Delhi India
- Azzeddine Soulaïmani
- Department of Mechanical Engineering, École de technologie supérieure, 1100 Rue Notre-Dame W., Montreal, H3C1K3 QC Canada
134
El Abbaoui A, Sodoyer D, Elbahhar F. Contactless Heart and Respiration Rates Estimation and Classification of Driver Physiological States Using CW Radar and Temporal Neural Networks. Sensors (Basel) 2023; 23:9457. [PMID: 38067830 PMCID: PMC10708560 DOI: 10.3390/s23239457] [Received: 10/03/2023] [Revised: 10/27/2023] [Accepted: 10/30/2023] [Indexed: 12/18/2023]
Abstract
The measurement and analysis of vital signs is a subject of significant research interest, particularly for monitoring the driver's physiological state, which is of crucial importance for road safety. Various approaches have been proposed using contact techniques to measure vital signs. However, all of these methods are invasive and cumbersome for the driver. This paper proposes using a non-contact sensor based on continuous wave (CW) radar at 24 GHz to measure vital signs. We associate these measurements with distinct temporal neural networks that analyze the signals to detect and extract heart and respiration rates as well as classify the physiological state of the driver. This approach offers robust performance in estimating the exact values of heart and respiration rates and in classifying the driver's physiological state. It is non-invasive and requires no physical contact with the driver, making it particularly practical and safe. The results presented in this paper are derived from the use of a 1D Convolutional Neural Network (1D-CNN), a Temporal Convolutional Network (TCN), a Recurrent Neural Network, specifically the Bidirectional Long Short-Term Memory (Bi-LSTM), and a Convolutional Recurrent Neural Network (CRNN). Among these, the CRNN emerged as the most effective deep learning approach for vital signal analysis.
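As a toy illustration of what extracting a heart or respiration rate from a periodic component involves, a rate can be estimated by counting rising zero-crossings of a clean signal; the paper's neural networks handle the far noisier radar case. The 1.2 Hz frequency and 50 Hz sampling rate below are illustrative values, not taken from the paper:

```python
import math

def estimate_rate_bpm(signal, fs):
    """Estimate a periodic rate (beats/breaths per minute) by counting
    rising zero-crossings of the mean-removed signal."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]
    rising = sum(1 for a, b in zip(x, x[1:]) if a < 0 <= b)
    duration_s = len(signal) / fs
    return 60.0 * rising / duration_s

# synthetic 1.2 Hz "heartbeat" component (72 bpm), 60 s sampled at 50 Hz
fs = 50
sig = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(60 * fs)]
rate = estimate_rate_bpm(sig, fs)  # close to 72 bpm (off by ~1 from edge effects)
```

Real radar returns mix respiration, heartbeat, and body motion, which is why simple crossing counts fail there and learned temporal models are used instead.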
Affiliation(s)
- Amal El Abbaoui
- COSYS-LEOST, University Gustave Eiffel, F-59650 Villeneuve d’Ascq, France
- Fouzia Elbahhar
- COSYS-LEOST, University Gustave Eiffel, F-59650 Villeneuve d’Ascq, France
135
Chen J, Cui Y, Qian C, He E. A fine-tuning deep residual convolutional neural network for emotion recognition based on frequency-channel matrices representation of one-dimensional electroencephalography. Comput Methods Biomech Biomed Engin 2023:1-11. [PMID: 38017703 DOI: 10.1080/10255842.2023.2286918] [Received: 06/22/2023] [Accepted: 11/18/2023] [Indexed: 11/30/2023]
Abstract
Emotion recognition (ER) plays a crucial role in enabling machines to perceive human emotional and psychological states, thus enhancing human-machine interaction. Recently, there has been a growing interest in ER based on electroencephalogram (EEG) signals. However, due to the noisy, nonlinear, and nonstationary properties of EEG signals, developing an automatic and high-accuracy ER system is still a challenging task. In this study, a pretrained deep residual convolutional neural network, comprising 17 convolutional layers and one fully connected layer, is combined with a transfer learning technique and with frequency-channel matrices (FCM), two-dimensional representations computed from Welch power spectral density estimates of the one-dimensional EEG data, to improve ER by automatically learning the underlying intrinsic features of multi-channel EEG data. The experiments show a mean accuracy of 93.61 ± 0.84%, a mean precision of 94.70 ± 0.60%, a mean sensitivity of 95.13 ± 1.02%, a mean specificity of 91.04 ± 1.02%, and a mean F1-score of 94.91 ± 0.68% using 5-fold cross-validation on the DEAP dataset. Meanwhile, to better explore and understand how the proposed model works, we used the t-distributed stochastic neighbor embedding strategy to rank how well the activations of different layers cluster FCMs of the same category: the softmax layer activation clusters best, the middle convolutional layer activation is second, and the early max pooling layer activation is worst. These findings confirm the promising potential of combining deep learning approaches with transfer learning techniques and FCM for effective ER tasks.
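The FCM representation rests on Welch power spectral density estimates, which just average windowed periodograms over overlapping segments (one spectrum per EEG channel is then stacked into the matrix). A self-contained sketch with a naive DFT — illustrative only; real pipelines use an FFT, and the segment length and hop below are arbitrary choices, not the paper's parameters:

```python
import math

def welch_psd(x, fs, seg_len, hop):
    """Welch's method: average Hann-windowed periodograms over overlapping
    segments. Naive O(n^2) DFT for clarity."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (seg_len - 1)) for n in range(seg_len)]
    n_bins = seg_len // 2 + 1
    psd = [0.0] * n_bins
    n_segs = 0
    for start in range(0, len(x) - seg_len + 1, hop):
        seg = [x[start + n] * hann[n] for n in range(seg_len)]
        for k in range(n_bins):
            re = sum(seg[n] * math.cos(2 * math.pi * k * n / seg_len) for n in range(seg_len))
            im = sum(seg[n] * math.sin(2 * math.pi * k * n / seg_len) for n in range(seg_len))
            psd[k] += re * re + im * im
        n_segs += 1
    freqs = [k * fs / seg_len for k in range(n_bins)]
    return freqs, [p / n_segs for p in psd]

# an 8 Hz sinusoid sampled at 64 Hz should peak in the 8 Hz bin
fs = 64
x = [math.sin(2 * math.pi * 8 * n / fs) for n in range(256)]
freqs, psd = welch_psd(x, fs, seg_len=64, hop=32)
```

Averaging over segments trades frequency resolution for variance reduction, which matters for short, noisy EEG epochs.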
Affiliation(s)
- Jichi Chen
- School of Mechanical Engineering, Shenyang University of Technology, Shenyang, Liaoning, China
- Yuguo Cui
- School of Mechanical Engineering, Shenyang University of Technology, Shenyang, Liaoning, China
- Cheng Qian
- School of Mechanical Engineering, Shenyang University of Technology, Shenyang, Liaoning, China
- Enqiu He
- School of Chemical Equipment, Shenyang University of Technology, Liaoyang, Liaoning, China
136
Li Y. CNN-Based Image Analysis for EEG Signal Characterization. Stud Health Technol Inform 2023; 308:20-30. [PMID: 38007721 DOI: 10.3233/shti230820] [Indexed: 11/28/2023]
Abstract
This article focuses on classifying and recognizing characterized images of EEG signals directly. For EEG signals, the recognition and judgment of different signals have been a key direction of research. CNN (Convolutional Neural Network) models are usually used to recognize raw EEG signals, for example from motor movement and imagery datasets. The images of raw EEG signals, however, are basically unreadable for researchers, so characterization is a common tool. Direct recognition of the characterized images remains a relatively unexplored area in the existing research because it requires much higher machine performance than traditional raw-signal recognition. Nevertheless, feeding the extracted feature images into a CNN and training on them can be an efficient and intuitive way to probe the potential of EEG for brain mapping. The main goal of this research is to examine the discriminative capabilities of traditional visual and image neural networks on pictures derived from EEG data, which is not typical in contemporary brain-computer interface research. Direct recognition of the characterized images consumes substantial GPU (graphics processing unit) resources, but the characterized images are easier for people to read than the original images. This work indicates the viability of direct research on characterized images and broadens the application scenarios of EEG signals.
Affiliation(s)
- Yanqi Li
- Department of Architecture & Information Technology, Faculty of Engineering, University of Queensland, Brisbane, Australia
137
Chen JQ, Zhu ZC, Zhang F, Zeng K, Jiang HZ, Cheng ZN. A BIGRU-Based Stacked Attention Network for Biomedical Named Entity Recognition with Chinese EMRs. Stud Health Technol Inform 2023; 308:757-767. [PMID: 38007808 DOI: 10.3233/shti230909] [Indexed: 11/28/2023]
Abstract
Biomedical named entity recognition (BNER) is an effective method to structure medical text data. It is an important basic task for building medical application services such as medical knowledge graphs and intelligent auxiliary diagnosis systems. Existing medical named entity recognition methods generally leverage a word embedding model to construct the text representation and then integrate multiple semantic understanding models to enhance the semantic understanding ability of the model and achieve high-performance entity recognition. However, the medical field contains many professional terms that rarely appear in the general domain and therefore cannot be represented well by a general-domain word embedding model. Second, existing approaches typically focus only on the extraction of global semantic features, which causes a loss of local semantic features between characters. Moreover, as the word embedding dimension becomes much higher, a standard single-layer structure fails to fully and deeply extract the global semantic features. We put forward the BIGRU-based Stacked Attention Network (BSAN) model for biomedical named entity recognition. First, we use large-scale real-world medical electronic medical record (EMR) data to fine-tune BERT and build proprietary embedding representations of the medical terms. Second, we use a Convolutional Neural Network model to extract local semantic features. Finally, a stacked BIGRU is constructed using a multi-layer structure and a novel stacking method. It not only enables comprehensive and in-depth extraction of global semantic features but also requires less time. Experimentally validated on real-world datasets of Chinese EMRs, the proposed BSAN model achieves an F1-score of 90.9%, which is stronger than the BNER performance of other state-of-the-art models.
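The 90.9% F1 figure above is an entity-level score: a prediction counts only when its span and type both match a gold entity. A small sketch of that metric, with hypothetical entity tuples (not the paper's evaluation code):

```python
def entity_f1(gold, pred):
    """Entity-level precision/recall/F1 for NER: a predicted entity is
    correct only if its (start, end, type) tuple exactly matches gold."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Exact-match scoring is strict: an entity with the right span but wrong type, or a span off by one character, contributes to neither precision nor recall.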
Affiliation(s)
- Jie-Qing Chen
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
- Zhi-Chao Zhu
- Beijing University of Technology, Beijing, China
- Feng Zhang
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
- Ke Zeng
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
- Hui-Zhen Jiang
- Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
138
Khan AR, Manzoor HU, Ayaz F, Imran MA, Zoha A. A Privacy and Energy-Aware Federated Framework for Human Activity Recognition. Sensors (Basel) 2023; 23:9339. [PMID: 38067712 PMCID: PMC10708886 DOI: 10.3390/s23239339] [Received: 10/14/2023] [Revised: 11/08/2023] [Accepted: 11/17/2023] [Indexed: 12/18/2023]
Abstract
Human activity recognition (HAR) using wearable sensors enables continuous monitoring for healthcare applications. However, the conventional centralised training of deep learning models on sensor data poses challenges related to privacy, communication costs, and on-device efficiency. This paper proposes a federated learning framework integrating spiking neural networks (SNNs) with long short-term memory (LSTM) networks for energy-efficient and privacy-preserving HAR. The hybrid spiking-LSTM (S-LSTM) model synergistically combines the event-driven efficiency of SNNs and the sequence modelling capability of LSTMs. The model is trained using surrogate gradient learning and backpropagation through time, enabling fully supervised end-to-end learning. Extensive evaluations on two public datasets demonstrate that the proposed approach outperforms LSTM, CNN, and S-CNN models in accuracy and energy efficiency. For instance, the proposed S-LSTM achieved an accuracy of 97.36% and 89.69% for indoor and outdoor scenarios, respectively. Furthermore, the results also showed a significant improvement in energy efficiency of 32.30% compared to a simple LSTM. Additionally, we highlight the significance of personalisation in HAR, where fine-tuning with local data enhances model accuracy by up to 9% for individual users.
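The event-driven efficiency of SNNs comes from leaky integrate-and-fire dynamics: a neuron computes (and transmits) only when its membrane potential crosses threshold. A forward-pass sketch with generic, illustrative parameters — surrogate gradient training, as used in the paper, replaces the non-differentiable threshold with a smooth proxy during backpropagation and is not shown here:

```python
import math

def lif_simulate(inputs, tau=10.0, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire forward pass: the membrane potential leaks
    toward rest, integrates input current, and emits a binary spike when
    it crosses threshold (then resets)."""
    decay = math.exp(-1.0 / tau)  # per-timestep leak factor
    v, spikes = 0.0, []
    for current in inputs:
        v = v * decay + current
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes
```

A weak input settles below threshold and produces no spikes at all, which is where the energy savings over a dense LSTM activation come from.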
Affiliation(s)
- Ahsan Raza Khan
- James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
- Habib Ullah Manzoor
- James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
- FSD-Campus, University of Engineering and Technology, Lahore 38000, Pakistan
- Fahad Ayaz
- James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
- Muhammad Ali Imran
- James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
- Ahmed Zoha
- James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
139
Pianfetti E, Lovino M, Ficarra E, Martignetti L. MiREx: mRNA levels prediction from gene sequence and miRNA target knowledge. BMC Bioinformatics 2023; 24:443. [PMID: 37993778 PMCID: PMC10666312 DOI: 10.1186/s12859-023-05560-1] [Received: 06/30/2023] [Accepted: 11/06/2023] [Indexed: 11/24/2023] Open
Abstract
Messenger RNA (mRNA) has an essential role in the protein production process. Predicting mRNA expression levels accurately is crucial for understanding gene regulation, and various models (statistical and neural network-based) have been developed for this purpose. A few models predict mRNA expression levels from the DNA sequence, exploiting the DNA sequence and gene features (e.g., number of exons/introns, gene length). Other models include information about long-range interaction molecules (i.e., enhancers/silencers) and transcriptional regulators as predictive features, such as transcription factors (TFs) and small RNAs (e.g., microRNAs - miRNAs). Recently, a convolutional neural network (CNN) model, called Xpresso, has been proposed for mRNA expression level prediction leveraging the promoter sequence and mRNAs' half-life features (gene features). To push mRNA expression level prediction forward, we present MiREx, a CNN-based tool that includes information about miRNA targets and expression levels in the model. Indeed, each miRNA can target specific genes, and the model exploits this information to guide the learning process. Not all miRNAs are included; only a selected subset with the highest impact on the model is used. MiREx has been evaluated on four cancer primary sites from the genomics data commons (GDC) database: lung, kidney, breast, and corpus uteri. Results show that mRNA level prediction benefits from selected miRNA targets and expression information. Future model developments could include other transcriptional regulators or be trained with proteomics data to infer protein levels.
Affiliation(s)
- Elena Pianfetti
- Department of Engineering, University of Modena and Reggio Emilia, Via Vivarelli 10/1, Modena, 41225, Italy
- Marta Lovino
- Department of Engineering, University of Modena and Reggio Emilia, Via Vivarelli 10/1, Modena, 41225, Italy
- Elisa Ficarra
- Department of Engineering, University of Modena and Reggio Emilia, Via Vivarelli 10/1, Modena, 41225, Italy
- Loredana Martignetti
- Institut Curie, Rue d'Ulm 26, Paris, 75005, France
- Inserm U900, Paris, France
- CBIO-Centre for Computational Biology, Paris, France
- PSL Research University, Paris, France
140
Carlier A, Dandrifosse S, Dumont B, Mercatoris B. Comparing CNNs and PLSr for estimating wheat organs biophysical variables using proximal sensing. Front Plant Sci 2023; 14:1204791. [PMID: 38053768 PMCID: PMC10694231 DOI: 10.3389/fpls.2023.1204791] [Received: 04/14/2023] [Accepted: 10/30/2023] [Indexed: 12/07/2023]
Abstract
Estimation of biophysical vegetation variables is of interest for diverse applications, such as monitoring of crop growth and health or yield prediction. However, remote estimation of these variables remains challenging due to the inherent complexity of plant architecture, biology and the surrounding environment, and the need for feature engineering. Recent advancements in deep learning, particularly convolutional neural networks (CNN), offer promising solutions to address this challenge. Unfortunately, the limited availability of labeled data has hindered the exploration of CNNs for regression tasks, especially in the context of crop phenotyping. In this study, the effectiveness of various CNN models in predicting wheat dry matter, nitrogen uptake, and nitrogen concentration from RGB and multispectral images taken from tillering to maturity was examined. To overcome the scarcity of labeled data, a training pipeline was devised. This pipeline involves transfer learning, pseudo-labeling of unlabeled data and temporal relationship correction. The results demonstrated that CNN models significantly benefit from the pseudo-labeling method, while the machine learning approach employing PLSr did not show comparable performance. Among the models evaluated, EfficientNetB4 achieved the highest accuracy for predicting above-ground biomass, with an R² value of 0.92. In contrast, Resnet50 demonstrated superior performance in predicting LAI, nitrogen uptake, and nitrogen concentration, with R² values of 0.82, 0.73, and 0.80, respectively. Moreover, the study explored multi-output models to predict the distribution of dry matter and nitrogen uptake between stem, inferior leaves, flag leaf, and ear. The findings indicate that CNNs hold promise as accessible and promising tools for phenotyping quantitative biophysical variables of crops. However, further research is required to harness their full potential.
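The pseudo-labeling step in the pipeline above — fit on labeled data, adopt the model's confident predictions on unlabeled samples as labels, refit — can be sketched generically. The `fit_midpoint`/`proba_one` pair below is a deliberately trivial one-dimensional stand-in, not the paper's CNNs, and the 0.9 confidence threshold is an illustrative choice:

```python
import math

def pseudo_label(train, unlabeled, fit, proba_one, threshold=0.9, rounds=3):
    """Self-training: fit on labeled data, adopt confident predictions on
    unlabeled samples as pseudo-labels, refit, repeat."""
    labeled, pool = list(train), list(unlabeled)
    for _ in range(rounds):
        model = fit(labeled)
        remaining = []
        for x in pool:
            p = proba_one(model, x)  # estimated P(class 1 | x)
            if p >= threshold:
                labeled.append((x, 1))
            elif p <= 1.0 - threshold:
                labeled.append((x, 0))
            else:
                remaining.append(x)   # not confident enough yet
        pool = remaining
    return fit(labeled), labeled

# trivial 1-D stand-in "model": decision midpoint between class means
def fit_midpoint(data):
    ones = [x for x, y in data if y == 1]
    zeros = [x for x, y in data if y == 0]
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2

def proba_one(mid, x):
    return 1.0 / (1.0 + math.exp(-(x - mid)))
```

Ambiguous samples stay in the pool, which is the mechanism that keeps low-confidence noise out of the growing training set.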
Affiliation(s)
- Alexis Carlier
- Biosystems Dynamics and Exchanges, TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
- Sébastien Dandrifosse
- Biosystems Dynamics and Exchanges, TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
- Benjamin Dumont
- Plant Sciences, TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
- Benoit Mercatoris
- Biosystems Dynamics and Exchanges, TERRA Teaching and Research Center, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
141
Kim H, Chae H, Kwon S, Lee S. Optimization of Deep Learning Parameters for Magneto-Impedance Sensor in Metal Detection and Classification. Sensors (Basel) 2023; 23:9259. [PMID: 38005645 PMCID: PMC10674819 DOI: 10.3390/s23229259] [Received: 09/02/2023] [Revised: 11/05/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
Deep learning technology is generally applied to analyze periodic data, such as electromyography (EMG) and acoustic signals. However, its accuracy is compromised when applied to the anomalous and irregular data obtained using a magneto-impedance (MI) sensor. Thus, we propose and analyze a deep learning model based on recurrent neural networks (RNNs) optimized for the MI sensor, such that it can detect and classify data that are relatively irregular and diverse compared to EMG and acoustic signals. Our proposed method combines the long short-term memory (LSTM) and gated recurrent unit (GRU) models to detect and classify metal objects from signals acquired by an MI sensor. First, we configured the various layers used in RNNs with a basic model structure and tested the performance of each layer type. In addition, we increased the accuracy by adjusting the sequence length of the input data and performing additional processing in the prediction step. An MI sensor acquires data in a non-contact mode; therefore, the proposed deep learning approach can be applied to drone control, electronic maps, geomagnetic measurement, autonomous driving, and foreign object detection.
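Adjusting the sequence length of the input data amounts to slicing the sensor stream into fixed-length windows, each paired with the sample the network should predict. A minimal sketch (the window and step values are illustrative, not the paper's settings):

```python
def make_windows(signal, seq_len, step=1):
    """Slice a 1-D sensor stream into fixed-length overlapping windows,
    each paired with the next sample as its prediction target."""
    windows, targets = [], []
    for start in range(0, len(signal) - seq_len, step):
        windows.append(signal[start:start + seq_len])
        targets.append(signal[start + seq_len])
    return windows, targets

windows, targets = make_windows(list(range(10)), seq_len=4, step=2)
# windows → [[0,1,2,3], [2,3,4,5], [4,5,6,7]], targets → [4, 6, 8]
```

Longer windows give an RNN more context per prediction but fewer training examples per recording; tuning that trade-off is the "sequence length" work the abstract refers to.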
Affiliation(s)
- Hoijun Kim
- Department of Plasma Bio Display, Kwangwoon University, 20 Kwangwoon-ro, Seoul 01897, Republic of Korea
- Hobyung Chae
- Industry-Academic Cooperation Foundation, Kwangwoon University, 20 Kwangwoon-ro, Seoul 01897, Republic of Korea
- Soonchul Kwon
- Department of Smart Convergence, Kwangwoon University, 20 Kwangwoon-ro, Seoul 01897, Republic of Korea
- Seunghyun Lee
- Ingenium College of Liberal Arts, Kwangwoon University, 20 Kwangwoon-ro, Seoul 01897, Republic of Korea
142
Nagarajan B, Chakravarthy S, Venkatesan VK, Ramakrishna MT, Khan SB, Basheer S, Albalawi E. A Deep Learning Framework with an Intermediate Layer Using the Swarm Intelligence Optimizer for Diagnosing Oral Squamous Cell Carcinoma. Diagnostics (Basel) 2023; 13:3461. [PMID: 37998597 PMCID: PMC10670914 DOI: 10.3390/diagnostics13223461] [Received: 10/04/2023] [Revised: 11/07/2023] [Accepted: 11/08/2023] [Indexed: 11/25/2023] Open
Abstract
One of the most prevalent cancers is oral squamous cell carcinoma, and preventing mortality from this disease primarily depends on early detection. Clinicians will greatly benefit from automated diagnostic techniques that analyze a patient's histopathology images to identify abnormal oral lesions. A deep learning framework was designed with an intermediate layer between the feature extraction layers and the classification layers for classifying histopathological images into two categories, namely, normal and oral squamous cell carcinoma. The intermediate layer is constructed using the proposed swarm intelligence technique called the Modified Gorilla Troops Optimizer. While many optimization algorithms have been used in the literature for feature selection, weight updating, and optimal parameter identification in deep learning models, this work focuses on using an optimization algorithm as an intermediate layer that converts extracted features into features better suited for classification. Three datasets comprising 2784 normal and 3632 oral squamous cell carcinoma subjects are considered in this work. Three popular CNN architectures, namely, InceptionV2, MobileNetV3, and EfficientNetB3, are investigated as feature extraction layers. Two fully connected neural network layers, batch normalization, and dropout are used as classification layers. With the best accuracy of 0.89 among the examined feature extraction models, MobileNetV3 exhibits good performance. This accuracy increases to 0.95 when the proposed Modified Gorilla Troops Optimizer is used as the intermediate layer.
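As a generic illustration of the swarm-intelligence idea — explicitly not the Modified Gorilla Troops Optimizer, whose update rules are specific to the paper — a minimal swarm moves agents toward the best position found so far with random perturbation and greedy acceptance:

```python
import random

def swarm_minimize(f, dim, n_agents=20, iters=200, bounds=(-5.0, 5.0), seed=1):
    """Minimal swarm search sketch: each agent proposes a move toward the
    best-known position plus noise; moves are accepted only if they improve
    the agent. Illustrative only, NOT the paper's optimizer."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=f)[:]
    for _ in range(iters):
        for a in agents:
            cand = [min(hi, max(lo, x + rng.random() * (b - x) + 0.1 * rng.uniform(-1, 1)))
                    for x, b in zip(a, best)]
            if f(cand) < f(a):        # greedy per-agent acceptance
                a[:] = cand
            if f(a) < f(best):
                best = a[:]
    return best, f(best)

sphere = lambda v: sum(x * x for x in v)
```

The same population-search pattern, applied to a feature vector with a classification loss as `f`, is the sense in which an optimizer can act as a feature-transforming "intermediate layer".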
Affiliation(s)
- Bharanidharan Nagarajan
- School of Computer Science Engineering and Information Systems (SCORE), Vellore Institute of Technology, Vellore 632014, India
- Sannasi Chakravarthy
- Department of ECE, Bannari Amman Institute of Technology, Sathyamangalam 638401, India
- Vinoth Kumar Venkatesan
- School of Computer Science Engineering and Information Systems (SCORE), Vellore Institute of Technology, Vellore 632014, India
- Mahesh Thyluru Ramakrishna
- Department of Computer Science and Engineering, Faculty of Engineering and Technology, JAIN (Deemed-to-Be University), Bangalore 562112, India
- Surbhi Bhatia Khan
- Department of Data Science, School of Science Engineering and Environment, University of Salford, Manchester M5 4WT, UK
- Department of Engineering and Environment, University of Religions and Denominations, Qom 13357, Iran
- Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 13-5053, Lebanon
- Shakila Basheer
- Department of Information Systems, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Eid Albalawi
- Department of Computer Science, School of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
143
Atanane O, Mourhir A, Benamar N, Zennaro M. Smart Buildings: Water Leakage Detection Using TinyML. Sensors (Basel) 2023; 23:9210. [PMID: 38005596 PMCID: PMC10675406 DOI: 10.3390/s23229210] [Received: 10/10/2023] [Revised: 11/07/2023] [Accepted: 11/12/2023] [Indexed: 11/26/2023]
Abstract
The escalating global water usage and the increasing strain on major cities due to water shortages highlight the critical need for efficient water management practices. In water-stressed regions worldwide, significant water wastage is primarily attributed to leakages, inefficient use, and aging infrastructure. Undetected water leakages in buildings' pipelines contribute to the water waste problem. To address this issue, an effective water leak detection method is required. In this paper, we explore the application of edge computing in smart buildings to enhance water management. By integrating sensors and embedded machine learning models, known as TinyML, smart water management systems can collect real-time data, analyze it, and make accurate decisions for efficient water utilization. The transition to TinyML enables faster and more cost-effective local decision-making, reducing the dependence on centralized entities. In this work, we propose a solution that can be adapted for effective leakage detection in real-world scenarios with minimal human intervention using TinyML. We follow an approach similar to a typical machine learning lifecycle in production, spanning stages including data collection, training, hyperparameter tuning, offline evaluation, and model optimization for on-device resource efficiency before deployment. We considered an existing water leakage acoustic dataset for polyvinyl chloride pipelines. To prepare the acoustic data for analysis, we performed preprocessing to transform it into scalograms. We devised a water leak detection method by applying transfer learning to five distinct Convolutional Neural Network (CNN) variants, namely EfficientNet, ResNet, AlexNet, MobileNet V1, and MobileNet V2. The CNN models were found to be able to detect leakages, with a maximum testing accuracy, recall, precision, and F1 score of 97.45%, 98.57%, 96.70%, and 97.63%, respectively, observed using the EfficientNet model. To enable seamless deployment on the Arduino Nano 33 BLE edge device, the EfficientNet model is compressed using quantization, resulting in a low inference time of 1932 ms, a peak RAM usage of 255.3 kilobytes, and a flash usage requirement of merely 48.7 kilobytes.
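The compression step above relies on post-training quantization: weights are stored as 8-bit integers plus a floating-point scale and zero-point. A sketch of the affine scheme (illustrative of the idea only, not the exact converter used in the paper's toolchain):

```python
def quantize_params(weights, num_bits=8):
    """Affine quantization: map floats to unsigned ints via a scale and
    zero-point, widening the range so 0.0 stays exactly representable."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / (qmax - qmin)
    zero_point = round(qmin - w_min / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]
```

Each weight shrinks from 4 bytes to 1, at the cost of a rounding error bounded by about half the scale, which is how the model fits the kilobyte-scale RAM and flash budgets quoted above.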
Affiliation(s)
- Othmane Atanane
- School of Science and Engineering, Al Akhawayn University in Ifrane, P.O. Box 104, Hassan II Avenue, Ifrane 53000, Morocco; (O.A.); (N.B.)
- Asmaa Mourhir
- School of Science and Engineering, Al Akhawayn University in Ifrane, P.O. Box 104, Hassan II Avenue, Ifrane 53000, Morocco; (O.A.); (N.B.)
- Nabil Benamar
- School of Science and Engineering, Al Akhawayn University in Ifrane, P.O. Box 104, Hassan II Avenue, Ifrane 53000, Morocco; (O.A.); (N.B.)
- School of Technology, Moulay Ismail University of Meknes, Meknes 50050, Morocco
- Marco Zennaro
- The Abdus Salam International Centre for Theoretical Physics, 34151 Trieste, Italy
144
Ramírez-Ayala O, González-Hernández I, Salazar S, Flores J, Lozano R. Real-Time Person Detection in Wooded Areas Using Thermal Images from an Aerial Perspective. Sensors (Basel) 2023; 23:9216. [PMID: 38005600 PMCID: PMC10675173 DOI: 10.3390/s23229216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 11/01/2023] [Accepted: 11/09/2023] [Indexed: 11/26/2023]
Abstract
Detecting people in images and videos captured from an aerial platform in wooded areas for search and rescue operations remains an open problem. Detection is difficult due to the relatively small dimensions of the person captured by the sensor in relation to the environment, and the environment can generate occlusion, complicating timely detection. Numerous RGB image datasets are currently available for person detection tasks in urban and wooded areas; they consider the general characteristics of a person, such as size, shape, and height, but not the occlusion of the object of interest. The present research work focuses on developing a thermal image dataset that accounts for occlusion, and on using it to develop convolutional neural network (CNN) deep learning models that perform detection tasks in real time from an aerial perspective, using altitude control in a quadcopter prototype. Extended models are proposed that consider the occlusion of the person, in conjunction with a thermal sensor, which allows for highlighting the desired characteristics of the occluded person.
Affiliation(s)
- Rogelio Lozano
- Aerial and Submarine Autonomous Navigation Systems Program, Cinvestav, Mexico City 07360, Mexico; (O.R.-A.); (I.G.-H.); (S.S.); (J.F.)
145
Zhu Z, Parker W, Wong A. Leveraging deep learning for automatic recognition of microplastics (MPs) via focal plane array (FPA) micro-FT-IR imaging. Environ Pollut 2023; 337:122548. [PMID: 37757933 DOI: 10.1016/j.envpol.2023.122548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2023] [Revised: 08/14/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023]
Abstract
The fast and accurate identification of MPs in environmental samples is essential for understanding the fate and transport of MPs in ecosystems. The recognition of MPs in environmental samples by spectral classification using conventional library search routines can be challenging due to the presence of additives, surface modification, and adsorbed contaminants. Further, the thickness of MPs also impacts the shape of spectra when FTIR spectra are collected in transmission mode. To overcome these challenges, PlasticNet, a deep learning convolutional neural network architecture, was developed for enhanced MP recognition. Once trained with 8000+ spectra of virgin plastic, PlasticNet successfully classified 11 types of common plastic with accuracy higher than 95%. The errors in identification, as indicated by a confusion matrix, were found to be caused by edge effects, molecular similarity of plastics, and contamination of standards. When PlasticNet was trained with spectra of virgin plastic, it showed good performance (92%+) in recognizing spectra of increased complexity due to the presence of additives and weathering. Re-training PlasticNet with more complex spectra further enhanced the model's capability to recognize complex spectra. PlasticNet was also able to successfully identify MPs despite variations in spectra caused by variations in MP thickness. When compared with the performance of a library search in identifying MPs in the same complex dataset collected from an environmental sample, PlasticNet achieved comparable performance in identifying PP MPs, but a 17.3% improvement. PlasticNet has the potential to become a standard approach for rapid and accurate automatic recognition of MPs in environmental samples analyzed by FPA FT-IR imaging.
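As a point of reference, the conventional library-search baseline that PlasticNet is compared against can be sketched as nearest-match by cosine similarity over reference spectra. This is a toy illustration with synthetic Gaussian absorption bands; the polymer names and band positions are only loosely indicative and are not the authors' data or method.

```python
import numpy as np

rng = np.random.default_rng(0)
wavenumbers = np.linspace(600, 4000, 500)

def synth_spectrum(peaks):
    """Sum of Gaussian absorption bands, a crude stand-in for an FT-IR spectrum."""
    s = np.zeros_like(wavenumbers)
    for center, width, height in peaks:
        s += height * np.exp(-((wavenumbers - center) / width) ** 2)
    return s

# Tiny reference "library" of virgin-polymer spectra (band positions are illustrative).
library = {
    "PP": synth_spectrum([(2950, 40, 1.0), (1455, 30, 0.6)]),
    "PE": synth_spectrum([(2915, 40, 1.0), (1470, 30, 0.5)]),
    "PS": synth_spectrum([(3025, 30, 0.8), (1600, 25, 0.4)]),
}

def identify(sample):
    """Return the library entry with the highest cosine similarity to the sample."""
    return max(library, key=lambda k: np.dot(sample, library[k]) /
               (np.linalg.norm(sample) * np.linalg.norm(library[k])))

# A "measured" spectrum: a library spectrum perturbed by additive noise.
noisy_pp = library["PP"] + 0.05 * rng.standard_normal(wavenumbers.size)
```

Such similarity matching breaks down exactly where the abstract says it does: additives, weathering, and thickness effects distort the sample spectrum away from any single library entry, which is what motivates a learned classifier.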
Affiliation(s)
- Ziang Zhu
- Department of Systems Design Engineering, University of Waterloo, 200 University Ave W, Waterloo, ON, N2L 3G1, Canada.
- Wayne Parker
- Department of Systems Design Engineering, University of Waterloo, 200 University Ave W, Waterloo, ON, N2L 3G1, Canada
- Alexander Wong
- Department of Civil and Environmental Engineering, University of Waterloo, 200 University Ave W, Waterloo, ON, N2L 3G1, Canada
146
You J, Gu J, Du Y, Wan M, Xie C, Xiang Z. Atmospheric Turbulence Aberration Correction Based on Deep Learning Wavefront Sensing. Sensors (Basel) 2023; 23:9159. [PMID: 38005546 PMCID: PMC10675706 DOI: 10.3390/s23229159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/20/2023] [Revised: 11/05/2023] [Accepted: 11/10/2023] [Indexed: 11/26/2023]
Abstract
In this paper, research was conducted on Deep Learning Wavefront Sensing (DLWS) neural networks using simulated atmospheric turbulence datasets, and a novel DLWS method based on attention mechanisms and Convolutional Neural Networks (CNNs) was proposed. The study encompassed both indoor experiments and kilometer-range laser transmission experiments employing DLWS. For the indoor experiments, data were collected and training was performed on a platform we built. Subsequent comparative experiments with the Shack-Hartmann Wavefront Sensing (SHWS) method revealed that our DLWS model achieved accuracy on par with SHWS. For the kilometer-scale experiments, we directly applied the DLWS model obtained from the indoor platform, eliminating the need for new data collection or additional training. The DLWS predicts the wavefront from the beacon-light PSF in real time and then uses it for aberration correction of the emitted laser. The results demonstrate a substantial improvement in the average peak intensity of the light spot at the target position after closed-loop correction, a remarkable increase of 5.35 times compared to the open-loop configuration.
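The inverse problem DLWS solves has a simple forward model: a pupil-plane phase aberration determines the far-field PSF via a Fourier transform. Below is a minimal numpy sketch of that forward model using generic Fourier optics; the grid size and the aberration are arbitrary choices for illustration, not the authors' simulation code.

```python
import numpy as np

N = 64
y, x = np.mgrid[-N // 2:N // 2, -N // 2:N // 2]
r = np.hypot(x, y) / (N // 4)            # radial coordinate, normalized to pupil edge
pupil = (r <= 1.0).astype(float)         # circular aperture

# A simple illustrative aberration: tilt plus a defocus-like quadratic term (radians).
phase = 2.0 * x / N + 1.5 * r ** 2 * pupil

field = pupil * np.exp(1j * phase)                        # complex pupil-plane field
psf = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2    # far-field intensity (PSF)
```

A DLWS network is trained on many (phase, PSF) pairs generated this way, learning to invert the mapping, i.e. to regress the wavefront (for example, as Zernike coefficients) directly from a measured PSF image.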
Affiliation(s)
- Jiang You
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
- Graduate School of China Academy of Engineering Physics, Beijing 100088, China
- Jingliang Gu
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
- Yinglei Du
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
- Min Wan
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
- Chuanlin Xie
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
- Zhenjiao Xiang
- Institute of Applied Electronics, China Academy of Engineering Physics, Mianyang 621900, China; (J.Y.); (J.G.); (Y.D.); (M.W.); (C.X.)
147
Dong L, Yu Y, Zhang D, Huo Y. Attention-Assisted Feature Comparison and Feature Enhancement for Class-Agnostic Counting. Sensors (Basel) 2023; 23:9126. [PMID: 38005514 PMCID: PMC10675645 DOI: 10.3390/s23229126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Revised: 11/02/2023] [Accepted: 11/09/2023] [Indexed: 11/26/2023]
Abstract
In this study, we address the class-agnostic counting (CAC) challenge, aiming to count instances in a query image using just a few exemplars. Recent research has shifted towards few-shot counting (FSC), which involves counting previously unseen object classes. We present ACECount, an FSC framework that combines attention mechanisms and convolutional neural networks (CNNs). ACECount identifies query-image-exemplar similarities using cross-attention mechanisms, enhances feature representations with a feature attention module, and employs a multi-scale regression head to handle scale variations in CAC. In experiments on the FSC-147 dataset, ACECount achieved a reduction of 0.3 in mean absolute error (MAE) on the validation set and a reduction of 0.26 on the test set, compared to previous methods. Notably, ACECount also demonstrated convincing performance in class-specific counting (CSC) tasks. Evaluation on crowd and vehicle counting datasets revealed that ACECount surpasses FSC algorithms such as GMN, FamNet, SAFECount, LOCA, and SPDCN. These results highlight the robust dataset generalization capabilities of the proposed algorithm.
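The cross-attention used to compare exemplar and query features reduces to scaled dot-product attention; a generic single-head numpy sketch is below. ACECount's actual modules are more involved (learned projections, multiple heads, feature enhancement), so treat this only as the core operation, with shapes chosen for illustration.

```python
import numpy as np

def cross_attention(query, key, value):
    """Single-head scaled dot-product cross-attention.

    query: (Nq, d) query-image features; key/value: (Nk, d) exemplar features.
    Every query location attends to all exemplar locations.
    """
    d = query.shape[-1]
    scores = query @ key.T / np.sqrt(d)              # (Nq, Nk) similarity map
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over exemplar positions
    return weights @ value                           # (Nq, d) attended features

rng = np.random.default_rng(0)
q = rng.standard_normal((100, 32))   # e.g. a 10x10 query feature map, flattened
k = rng.standard_normal((9, 32))     # e.g. a 3x3 exemplar patch, flattened
out = cross_attention(q, k, k)
```

The intermediate `(Nq, Nk)` score map is exactly the query-exemplar similarity the abstract refers to; a counting head can then regress a density map from the attended features.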
Affiliation(s)
- Liang Dong
- College of Information Engineering, Shenyang University, Shenyang 110044, China
- Yian Yu
- College of Information Engineering, Shenyang University, Shenyang 110044, China
- Di Zhang
- College of Information Engineering, Shenyang University, Shenyang 110044, China
- Yan Huo
- College of Information Engineering, Shenyang University, Shenyang 110044, China
148
Topalidis PI, Baron S, Heib DPJ, Eigl ES, Hinterberger A, Schabus M. From Pulses to Sleep Stages: Towards Optimized Sleep Classification Using Heart-Rate Variability. Sensors (Basel) 2023; 23:9077. [PMID: 38005466 PMCID: PMC10674316 DOI: 10.3390/s23229077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Revised: 10/31/2023] [Accepted: 11/03/2023] [Indexed: 11/26/2023]
Abstract
More and more people quantify their sleep using wearables and are becoming obsessed with their pursuit of optimal sleep ("orthosomnia"). However, many of these wearables are criticized for giving inaccurate feedback, which can even lead to negative daytime consequences. Acknowledging these facts, we here optimize our previously suggested sleep classification procedure in a new sample of 136 self-reported poor sleepers to minimize erroneous classification during ambulatory sleep sensing. First, we introduce an advanced interbeat-interval (IBI) quality control using a random forest method to account for wearable recordings in naturalistic and noisier settings. We further aim to improve sleep classification by opting for a loss function model instead of overall epoch-by-epoch accuracy, to avoid model biases towards the majority class (i.e., "light sleep"). Using these implementations, we compare the classification performance between the optimized (loss function) model and the accuracy model. We use signals derived from PSG, one-channel ECG, and two consumer wearables: the ECG breast belt Polar® H10 (H10) and the Polar® Verity Sense (VS), an optical photoplethysmography (PPG) heart-rate sensor. The results reveal a high overall accuracy for the loss function model in ECG (86.3%, κ = 0.79), as well as with the H10 (84.4%, κ = 0.76) and VS (84.2%, κ = 0.75) sensors, with improvements in deep sleep and wake. In addition, the new optimized model displays moderate to high correlations and agreement with PSG on primary sleep parameters, while measures of reliability, expressed as intra-class correlations, suggest excellent reliability for most sleep parameters. Finally, it is demonstrated that the new model still classifies sleep accurately into four classes in users taking heart-affecting and/or psychoactive medication, which can be considered a prerequisite in older individuals with or without common disorders.
Further improving and validating automatic sleep stage classification algorithms based on signals from affordable wearables may resolve existing scepticism and open the door for such approaches in clinical practice.
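The majority-class bias described above is commonly counteracted by weighting the training loss against class frequency. A minimal numpy sketch of inverse-frequency weighted cross-entropy follows; this illustrates the general idea only, not the authors' exact loss formulation, and the stage counts are invented for the example.

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Weight each class inversely to its frequency, so rare stages (e.g. wake,
    deep sleep) contribute as much to the loss as abundant light sleep."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * counts)

def weighted_cross_entropy(probs, labels, class_weights):
    """Cross-entropy over epochs, each epoch scaled by its class weight."""
    w = class_weights[labels]
    nll = -np.log(probs[np.arange(len(labels)), labels])
    return float((w * nll).sum() / w.sum())

# Toy example: 4 sleep stages, heavily imbalanced toward class 1 ("light sleep").
labels = np.array([1] * 70 + [0] * 10 + [2] * 10 + [3] * 10)
weights = inverse_frequency_weights(labels, 4)
```

With these weights, misclassifying a rare wake or deep-sleep epoch costs the model far more than misclassifying a light-sleep epoch, so training no longer optimizes accuracy by collapsing onto the majority class.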
Affiliation(s)
- Pavlos I. Topalidis
- Laboratory for Sleep, Cognition and Consciousness Research, Department of Psychology, Centre for Cognitive Neuroscience Salzburg (CCNS), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria; (P.I.T.); (D.P.J.H.); (E.-S.E.)
- Sebastian Baron
- Department of Mathematics, Paris-Lodron University of Salzburg, 5020 Salzburg, Austria
- Department of Artificial Intelligence and Human Interfaces (AIHI), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria
- Dominik P. J. Heib
- Laboratory for Sleep, Cognition and Consciousness Research, Department of Psychology, Centre for Cognitive Neuroscience Salzburg (CCNS), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria; (P.I.T.); (D.P.J.H.); (E.-S.E.)
- Institut Proschlaf, 5020 Salzburg, Austria
- Esther-Sevil Eigl
- Laboratory for Sleep, Cognition and Consciousness Research, Department of Psychology, Centre for Cognitive Neuroscience Salzburg (CCNS), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria; (P.I.T.); (D.P.J.H.); (E.-S.E.)
- Alexandra Hinterberger
- Laboratory for Sleep, Cognition and Consciousness Research, Department of Psychology, Centre for Cognitive Neuroscience Salzburg (CCNS), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria; (P.I.T.); (D.P.J.H.); (E.-S.E.)
- Manuel Schabus
- Laboratory for Sleep, Cognition and Consciousness Research, Department of Psychology, Centre for Cognitive Neuroscience Salzburg (CCNS), Paris-Lodron University of Salzburg, 5020 Salzburg, Austria; (P.I.T.); (D.P.J.H.); (E.-S.E.)
149
Shim J, Koo J, Park Y. A Methodology of Condition Monitoring System Utilizing Supervised and Semi-Supervised Learning in Railway. Sensors (Basel) 2023; 23:9075. [PMID: 38005464 PMCID: PMC10674533 DOI: 10.3390/s23229075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/14/2023] [Revised: 10/08/2023] [Accepted: 10/28/2023] [Indexed: 11/26/2023]
Abstract
In this paper, research was conducted on anomaly detection of wheel flats. In the railway sector, conducting tests with actual railway vehicles is challenging due to passenger safety concerns and maintenance issues, as it is a public industry; therefore, dynamics simulation software was utilized. Next, the short-time Fourier transform (STFT) was performed to create spectrogram images. In railway vehicles, control, monitoring, and communication are performed through the TCMS, but complex analysis and data processing are difficult because there are no devices such as GPUs, and memory is limited. Therefore, the relatively lightweight models LeNet-5, ResNet-20, and MobileNet-V3 were selected for the deep learning experiments, with the LeNet-5 and MobileNet-V3 models modified from their basic architectures. Since railway vehicles undergo preventive maintenance, fault data are difficult to obtain, so semi-supervised learning was also performed, following the Deep One-Class Classification approach. The evaluation results indicated that the modified LeNet-5 and MobileNet-V3 models achieved approximately 97% and 96% accuracy, respectively, with the LeNet-5 model training 12 min faster than the MobileNet-V3 model. In addition, the semi-supervised learning results showed a significant outcome of approximately 94% accuracy when considering the railway maintenance environment. In conclusion, considering the railway vehicle maintenance environment and device specifications, it was inferred that the relatively simple and lightweight LeNet-5 model can be effectively utilized with small images.
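The STFT preprocessing step can be sketched in a few lines of numpy. The window length, hop size, sampling rate, and the synthetic wheel-flat signal below are all hypothetical illustrations, not the paper's settings; a wheel flat produces one impact per wheel revolution, which appears as periodic broadband stripes in the spectrogram.

```python
import numpy as np

def stft_magnitude(signal, win_len=256, hop=128):
    """Magnitude STFT: Hann-windowed frames -> FFT -> array of (freq_bins, n_frames)."""
    window = np.hanning(win_len)
    starts = range(0, len(signal) - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic axle-box signal: a background vibration tone plus periodic decaying
# impulses standing in for wheel-flat impacts (one per assumed revolution).
fs = 8000
t = np.arange(fs) / fs                       # 1 s of signal
sig = 0.2 * np.sin(2 * np.pi * 50 * t)       # running-gear background tone
for k in range(0, fs, 800):                  # impact every 0.1 s
    sig[k:k + 40] += np.exp(-np.arange(40) / 8.0)

spec = stft_magnitude(sig)                   # spectrogram image fed to the CNN
```

Rendering `spec` (e.g. on a log scale) gives the small spectrogram images that a compact model such as LeNet-5 can classify within TCMS-like memory constraints.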
Affiliation(s)
- Jaeseok Shim
- Complex Research Center for Materials & Components of Railway, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea;
- Jeongseo Koo
- Department of Railway Safety Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
- Yongwoon Park
- A2Mind, 213, Toegye-ro, Jung-gu, Seoul 04557, Republic of Korea
150
Lam Thuy LN, Hoang Trong V, Trung Hieu H, Pham TB. Detection of Abnormality in Coronary Artery Magnetic Resonance Imaging using Bit Plane Slicing and Deep Learning. Curr Med Imaging 2023; 20:CMIR-EPUB-135925. [PMID: 37936441 DOI: 10.2174/0115734056252243231025113221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2023] [Revised: 08/05/2023] [Accepted: 09/25/2023] [Indexed: 11/09/2023]
Abstract
INTRODUCTION: This paper presents a novel approach for detecting abnormality in coronary arteries using MRI data represented as RGB images. The study evaluates the test accuracy of the weak classifiers and the test accuracy and F1 score of the strong classifier.
METHODS: The method involves separating the image into information planes (the R, G, and B color channels, or bit-planes) and training a VGG-like convolutional neural network model on each plane separately; each such model is referred to as a "weak classifier." The classification results of these planes are aggregated using a proposed soft voting method, forming a "strong classifier," with the aggregation weights determined by each model's performance on the training set.
RESULTS: The strong classifier achieves a test accuracy and F1 score of around 68% to 74% on our private coronary artery dataset. Moreover, aggregating the top three highest bit-plane levels of a grayscale image yields accuracy slightly lower than that of the three color channels but requires a significantly smaller CNN model of nearly 4M parameters.
CONCLUSION: The potential of bit-planes in reducing model storage costs is suggested. This approach holds promise for improving the detection of abnormalities in coronary arteries using MRI data.
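Bit-plane slicing itself is a one-line operation per plane: an 8-bit image decomposes into eight binary planes, and keeping only the top few retains the most significant intensity structure. The numpy sketch below shows the decomposition and the top-three-plane recombination the abstract refers to; the classifiers trained on these planes are of course not reproduced here.

```python
import numpy as np

def bit_plane(img, b):
    """Extract bit plane b (0 = least significant) of an 8-bit image as 0/1 values."""
    return (img >> b) & 1

def top_planes_image(img, planes=(7, 6, 5)):
    """Recombine the selected high-order bit planes into a reduced-depth image."""
    out = np.zeros_like(img)
    for b in planes:
        out |= bit_plane(img, b) << b
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)   # stand-in for an MRI slice
reduced = top_planes_image(img)
```

Each binary plane is a much simpler input than a full-depth channel, which is what allows the per-plane "weak classifiers" to be small, and the top three planes alone preserve the coarse intensity structure the ensemble votes on.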
Affiliation(s)
- Le Nhi Lam Thuy
- Information Science Faculty, Sai Gon University, HCM City, Vietnam
- Information Technology Faculty, Industrial University of HCM City, Vietnam
- Vo Hoang Trong
- Department of ICT Convergence System Engineering, Chonnam National University, Gwangju, Republic of Korea
- Huynh Trung Hieu
- Information Technology Faculty, Industrial University of HCM City, Vietnam
- The Bao Pham
- Information Science Faculty, Sai Gon University, HCM City, Vietnam