1
Lewis M, Jiang W, Theis ND, Cape J, Prasad KM. Classification of psychosis spectrum disorders using graph convolutional networks with structurally constrained functional connectomes. Neural Netw 2024; 181:106771. [PMID: 39383678 DOI: 10.1016/j.neunet.2024.106771] [Received: 02/11/2024] [Revised: 09/06/2024] [Accepted: 09/28/2024] [Indexed: 10/11/2024]
Abstract
This article considers the problem of classifying individuals in a dataset of diverse psychosis spectrum conditions, including persons with subsyndromal psychotic-like experiences (PLEs) and healthy controls. This task is more challenging than the traditional problem of distinguishing patients with a diagnosed disorder from controls using brain network features, since the neurobiological differences between PLE individuals and healthy persons are less pronounced. Further, examining a transdiagnostic sample compared to controls is concordant with contemporary approaches to understanding the full spectrum of neurobiology of psychoses. We consider both support vector machines (SVMs) and graph convolutional networks (GCNs) for classification, with a variety of edge selection methods for processing the inputs. We also employ the MultiVERSE algorithm to generate network embeddings of the functional and structural networks for each subject, which are used as inputs for the SVMs. The best models among SVMs and GCNs yielded accuracies >63%. Investigation of network connectivity between persons with PLE and controls identified a region within the right inferior parietal cortex, called the PGi, as a central region for communication among modules (network hub). Class activation mapping revealed that the PLE group had salient regions in the dorsolateral prefrontal, orbital and polar frontal cortices, and the lateral temporal cortex, whereas the controls did not. Our study demonstrates the potential usefulness of deep learning methods to distinguish persons with subclinical psychosis and diagnosable disorders from controls. In the long term, this could help improve accuracy and reliability of clinical diagnoses, provide neurobiological bases for making diagnoses, and initiate early intervention strategies.
Affiliation(s)
- Madison Lewis
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Wenlong Jiang
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Nicholas D Theis
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, United States
- Joshua Cape
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 52706, United States
- Konasale M Prasad
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15213, United States; Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, United States; Veterans Affairs Pittsburgh Healthcare System, Pittsburgh, PA 15240, United States.
2
Huang C, Sarabi M, Ragab AE. MobileNet-V2/IFHO model for Accurate Detection of early-stage diabetic retinopathy. Heliyon 2024; 10:e37293. [PMID: 39296185 PMCID: PMC11409123 DOI: 10.1016/j.heliyon.2024.e37293] [Received: 02/26/2024] [Revised: 08/27/2024] [Accepted: 08/30/2024] [Indexed: 09/21/2024] Open
Abstract
Diabetic retinopathy is a serious eye disease that may lead to loss of vision if it is not treated. Early detection is crucial in preventing further vision impairment and enabling timely interventions. Despite notable advancements in AI-based methods for detecting diabetic retinopathy, researchers are still striving to enhance the efficiency of these techniques. Therefore, obtaining an efficient technique in this field is essential. In this research, a new strategy has been proposed to improve the detection of diabetic retinopathy by increasing the accuracy of diagnosis and identifying cases in the initial stages. To achieve this, it has been proposed to integrate the MobileNet-V2 deep learning-based neural network with Improved Fire Hawk Optimizer (IFHO). The MobileNet-V2 network has been renowned for its efficiency and accuracy in image classification tasks, making it a suitable candidate for diabetic retinopathy detection. By combining it with the IFHO, the feature selection process has been optimized, which is essential for identifying relevant patterns and abnormalities related to diabetic retinopathy. The Diabetic Retinopathy 2015 dataset has been used to evaluate the effectiveness of the MobileNet-V2/IFHO model. The study results indicate that the DRMNV2/IFHO model consistently outperforms other methods in terms of precision, accuracy, and recall. Specifically, the model achieves an average precision of 97.521 %, accuracy of 96.986 %, and recall of 98.543 %. Moreover, when compared to advanced techniques, the DRMNV2/IFHO model demonstrates superior performance in specificity, F1-score, and AUC, with average values of 97.233 %, 93.8 %, and 0.927, respectively. These results underscore the potential of the DRMNV2/IFHO model as a valuable tool for improving the accuracy and efficiency of DR diagnosis. 
Nevertheless, additional validation and testing on larger datasets are required to verify the model's effectiveness and robustness in real-world clinical scenarios.
Affiliation(s)
- Mohammad Sarabi
- Ankara Yıldırım Beyazıt University (AYBU), 06010, Ankara, Turkey
- Adham E Ragab
- Industrial Engineering Department, College of Engineering, King Saud University, PO Box 800, Riyadh 11421, Saudi Arabia
3
Easwaran S, Venugopal JP, Subramanian AAV, Sundaram G, Naseeba B. A comprehensive learning based swarm optimization approach for feature selection in gene expression data. Heliyon 2024; 10:e37165. [PMID: 39296018 PMCID: PMC11408137 DOI: 10.1016/j.heliyon.2024.e37165] [Received: 03/28/2024] [Revised: 08/20/2024] [Accepted: 08/28/2024] [Indexed: 09/21/2024] Open
Abstract
Gene expression data analysis is challenging due to the high dimensionality and complexity of the data. Feature selection, which identifies relevant genes, is a common preprocessing step. We propose a Comprehensive Learning-Based Swarm Optimization (CLBSO) approach for feature selection in gene expression data. CLBSO leverages the strengths of ants and grasshoppers to efficiently explore the high-dimensional search space. Ants perform local search and leave pheromone trails to guide the swarm, while grasshoppers use their ability to jump long distances to explore new regions and avoid local optima. The proposed approach was evaluated on several publicly available gene expression datasets and compared with state-of-the-art feature selection methods. CLBSO achieved an average accuracy improvement of 15% over the original high-dimensional data and outperformed other feature selection methods by up to 10%. For instance, in the Pancreatic cancer dataset, CLBSO achieved 97.2% accuracy, significantly higher than XGBoost-MOGA's 84.0%. Convergence analysis showed CLBSO required fewer iterations to reach optimal solutions. Statistical analysis confirmed significant performance improvements, and stability analysis demonstrated consistent gene subset selection across different runs. These findings highlight the robustness and efficacy of CLBSO in handling complex gene expression datasets, making it a valuable tool for enhancing classification tasks in bioinformatics.
Affiliation(s)
- Subha Easwaran
- Department of Science and Humanities, Karpagam College of Engineering, Myleripalayam Village, Coimbatore-641032, Tamilnadu, India
- Jothi Prakash Venugopal
- Department of Information Technology, Karpagam College of Engineering, Myleripalayam Village, Coimbatore-641032, Tamilnadu, India
- Arul Antran Vijay Subramanian
- Department of Computer Science and Engineering, Karpagam College of Engineering, Myleripalayam Village, Coimbatore-641032, Tamilnadu, India
- Gopikrishnan Sundaram
- School of Computer Science and Engineering, VIT-AP University, Amaravathi-522241, Andhra Pradesh, India
- Beebi Naseeba
- School of Computer Science and Engineering, VIT-AP University, Amaravathi-522241, Andhra Pradesh, India
4
Oyebola K, Ligali F, Owoloye A, Erinwusi B, Alo Y, Musa AZ, Aina O, Salako B. Machine Learning-Based Hyperglycemia Prediction: Enhancing Risk Assessment in a Cohort of Undiagnosed Individuals. JMIRX MED 2024; 5:e56993. [PMID: 39263921 PMCID: PMC11441453 DOI: 10.2196/56993] [Received: 02/01/2024] [Revised: 04/06/2024] [Accepted: 04/24/2024] [Indexed: 09/13/2024]
Abstract
Background Noncommunicable diseases continue to pose a substantial health challenge globally, with hyperglycemia serving as a prominent indicator of diabetes. Objective This study employed machine learning algorithms to predict hyperglycemia in a cohort of individuals who were asymptomatic and unraveled crucial predictors contributing to early risk identification. Methods This dataset included an extensive array of clinical and demographic data obtained from 195 adults who were asymptomatic and residing in a suburban community in Nigeria. The study conducted a thorough comparison of multiple machine learning algorithms to ascertain the most effective model for predicting hyperglycemia. Moreover, we explored feature importance to pinpoint correlates of high blood glucose levels within the cohort. Results Elevated blood pressure and prehypertension were recorded in 8 (4.1%) and 18 (9.2%) of the 195 participants, respectively. A total of 41 (21%) participants presented with hypertension, of which 34 (83%) were female. However, sex adjustment showed that 34 of 118 (28.8%) female participants and 7 of 77 (9%) male participants had hypertension. Age-based analysis revealed an inverse relationship between normotension and age (r=-0.88; P=.02). Conversely, hypertension increased with age (r=0.53; P=.27), peaking between 50-59 years. Of the 195 participants, isolated systolic hypertension and isolated diastolic hypertension were recorded in 16 (8.2%) and 15 (7.7%) participants, respectively, with female participants recording a higher prevalence of isolated systolic hypertension (11/16, 69%) and male participants reporting a higher prevalence of isolated diastolic hypertension (11/15, 73%). Following class rebalancing, the random forest classifier gave the best performance (accuracy score 0.89; receiver operating characteristic-area under the curve score 0.89; F1-score 0.89) of the 26 model classifiers. 
The feature selection model identified uric acid and age as important variables associated with hyperglycemia. Conclusions The random forest classifier identified significant clinical correlates associated with hyperglycemia, offering valuable insights for the early detection of diabetes and informing the design and deployment of therapeutic interventions. However, to achieve a more comprehensive understanding of each feature's contribution to blood glucose levels, modeling additional relevant clinical features in larger datasets could be beneficial.
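The sex-specific hypertension figures in the Results follow from simple proportions of the reported counts. As a quick arithmetic check, a minimal sketch (the variable and function names are illustrative and do not come from the paper):

```python
# Counts reported in the abstract (hypertension by sex among 195 participants).
total_hypertensive = 41
female_hypertensive, female_total = 34, 118
male_hypertensive, male_total = 7, 77

def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Share of all hypertensive participants who were female (reported as 83%).
share_female = pct(female_hypertensive, total_hypertensive)
# Prevalence within each sex (reported as 28.8% and 9%).
prev_female = pct(female_hypertensive, female_total)
prev_male = pct(male_hypertensive, male_total)

print(share_female, prev_female, prev_male)
```

The raw share partly reflects that more women than men were enrolled; the within-sex prevalence is the comparison that adjusts for this.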
Affiliation(s)
- Kolapo Oyebola
- Nigerian Institute of Medical Research, Lagos, Nigeria
- Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria
- Funmilayo Ligali
- Nigerian Institute of Medical Research, Lagos, Nigeria
- Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria
- Afolabi Owoloye
- Nigerian Institute of Medical Research, Lagos, Nigeria
- Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria
- Blessing Erinwusi
- Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria
- Yetunde Alo
- Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria
5
Gunasekaran S, Mercy Bai PS, Mathivanan SK, Rajadurai H, Shivahare BD, Shah MA. Automated brain tumor diagnostics: Empowering neuro-oncology with deep learning-based MRI image analysis. PLoS One 2024; 19:e0306493. [PMID: 39190622 PMCID: PMC11349112 DOI: 10.1371/journal.pone.0306493] [Received: 01/25/2024] [Accepted: 06/18/2024] [Indexed: 08/29/2024] Open
Abstract
Brain tumors, characterized by the uncontrolled growth of abnormal cells, pose a significant threat to human health. Early detection is crucial for successful treatment and improved patient outcomes. Magnetic Resonance Imaging (MRI) is the primary diagnostic tool for brain tumors, providing detailed visualizations of the brain's intricate structures. However, the complexity and variability of tumor shapes and locations often challenge physicians in achieving accurate tumor segmentation on MRI images. Precise tumor segmentation is essential for effective treatment planning and prognosis. To address this challenge, we propose a novel hybrid deep learning technique, Convolutional Neural Network and ResNeXt101 (ConvNet-ResNeXt101), for automated tumor segmentation and classification. Our approach commences with data acquisition from the BRATS 2020 dataset, a benchmark collection of MRI images with corresponding tumor segmentations. Next, we employ batch normalization to smooth and enhance the collected data, followed by feature extraction using the AlexNet model. This involves extracting features based on tumor shape, position, and surface characteristics. To select the most informative features for effective segmentation, we utilize an advanced meta-heuristic algorithm called Advanced Whale Optimization (AWO). AWO mimics the hunting behavior of humpback whales to iteratively search for the optimal feature subset. With the selected features, we perform image segmentation using the ConvNet-ResNeXt101 model. This deep learning architecture combines the strengths of ConvNet and ResNeXt101, a type of ConvNet with aggregated residual connections. Finally, we apply the same ConvNet-ResNeXt101 model for tumor classification, categorizing the segmented tumor into distinct types.
Our experiments demonstrate the superior performance of our proposed ConvNet-ResNeXt101 model compared to existing approaches, achieving an accuracy of 99.27% for the tumor core class with a minimum learning elapsed time of 0.53 s.
Affiliation(s)
- Subathra Gunasekaran
- Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, Chennai, India
- Hariharan Rajadurai
- School of Computing Science and Engineering, VIT Bhopal University, Sehore, India
- Basu Dev Shivahare
- School of Computer Science and Engineering, Galgotias University, Greater Noida, India
- Mohd Asif Shah
- Faculty of Kebri Dehar University, Somali, Ethiopia
- Division of Research and Development, Lovely Professional University, Phagwara, Punjab, India
- Centre of Research Impact and Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India
6
Khan HU, Ali Y, Azeem Akbar M, Khan F. A comprehensive survey on exploring and analyzing COVID-19 mobile apps: Meta and exploratory analysis. Heliyon 2024; 10:e35137. [PMID: 39170132 PMCID: PMC11336479 DOI: 10.1016/j.heliyon.2024.e35137] [Received: 02/08/2024] [Revised: 07/23/2024] [Accepted: 07/23/2024] [Indexed: 08/23/2024] Open
Abstract
During the current COVID-19 pandemic, many digital solutions have been proposed around the world to cope with the deadly virus, and mobile applications have played a dominant role among them. In Pakistan, an array of mobile health applications (apps) and platforms has been launched to grapple with the impacts of the COVID-19 situation. In this survey, our major focus is to explore and analyze the role of mobile apps, based on their features and functionalities, in tackling COVID-19, particularly in Pakistan. Over fifty (50) mobile apps were scraped from three well-known sources: the Google Play Store, the iOS Play Store, and the web. We developed our own data set after searching through the different play stores. We designed two sets of criteria: eligibility criteria and assessment criteria. The features and functions of each mobile app are pinpointed and discussed against the parameters of the assessment criteria. The major parameters of the assessment criteria are: (i) home monitoring; (ii) COVID-19 awareness; (iii) contact tracing; (iv) telemedicine; (v) health education; (vi) COVID-19 surveillance; (vii) self-assessment; (viii) security; and (ix) accessibility. This study conducted exploratory analysis and quantitative meta-data analysis by adopting PRISMA guidelines. This survey not only discusses the functions and features of each COVID-19-centered app in Pakistan but also sheds light on the limitations of each app. The results of this survey might help mobile developers review current app products and enhance existing mobile platforms targeted at the COVID-19 pandemic. This is the first attempt of its kind to present a state-of-the-art survey of COVID-19-centered mobile health apps in Pakistan.
Affiliation(s)
- Habib Ullah Khan
- Department of Accounting and Information Systems, College of Business and Economics, Qatar University, Doha, Qatar
- Yasir Ali
- Shahzeb Shaheed Government Degree College Razzar, Swabi, Higher Education, KP, Pakistan
- Muhammad Azeem Akbar
- Software Engineering Department, Lappeenranta-Lahti University of Technology, 15210, Lappeenranta, Finland
- Faheem Khan
- Department of Computer Engineering, Gachon University, Seongnam-si, 13120, South Korea
7
Shao H, Liu X, Zong D, Song Q. Optimization of diabetes prediction methods based on combinatorial balancing algorithm. Nutr Diabetes 2024; 14:63. [PMID: 39143066 PMCID: PMC11324958 DOI: 10.1038/s41387-024-00324-z] [Received: 10/13/2023] [Revised: 07/22/2024] [Accepted: 07/26/2024] [Indexed: 08/16/2024] Open
Abstract
BACKGROUND Diabetes, as a significant disease affecting public health, requires early detection for effective management and intervention. However, imbalanced datasets pose a challenge to accurate diabetes prediction. This imbalance often results in models performing poorly in predicting minority classes, affecting overall diagnostic performance. OBJECTIVES To address this issue, this study employs a combination of Synthetic Minority Over-sampling Technique (SMOTE) and Random Under-Sampling (RUS) for data balancing and uses Optuna for hyperparameter optimization of machine learning models. This approach aims to fill the gap in current research concerning data balancing and model optimization, thereby improving prediction accuracy and computational efficiency. METHODS First, the study uses SMOTE and RUS methods to process the imbalanced diabetes dataset, balancing the data distribution. Then, Optuna is utilized to optimize the hyperparameters of the LightGBM model to enhance its performance. During the experiment, the effectiveness of the proposed methods is evaluated by comparing the training results of the dataset before and after balancing. RESULTS The experimental results show that the enhanced LightGBM-Optuna model improves the accuracy from 97.07% to 97.11%, and the precision from 97.17% to 98.99%. The time required for a single search is only 2.5 seconds. These results demonstrate the superiority of the proposed method in handling imbalanced datasets and optimizing model performance. CONCLUSIONS The study indicates that combining SMOTE and RUS data balancing algorithms with Optuna for hyperparameter optimization can effectively enhance machine learning models, especially in dealing with imbalanced datasets for diabetes prediction.
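The balancing step described in the Methods can be illustrated with a small, dependency-free sketch. This is not the authors' pipeline: it is a simplified SMOTE (interpolating between a minority point and one of its nearest minority neighbours) combined with random under-sampling, and the function names, the toy data, and the target size of 40 rows per class are all assumptions made for illustration. The Optuna and LightGBM stages are left out.

```python
import random

def random_under_sample(majority, n_keep, rng):
    """Keep a random subset of the majority class (RUS)."""
    return rng.sample(majority, n_keep)

def smote(minority, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating between a
    sampled point and one of its k nearest minority neighbours."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def balance(majority, minority, target, k=3, seed=0):
    """Under-sample the majority and over-sample the minority to `target` rows each."""
    rng = random.Random(seed)
    kept = random_under_sample(majority, target, rng)
    grown = minority + smote(minority, target - len(minority), k, rng)
    return kept, grown

# Toy imbalanced data: 90 majority rows, 10 minority rows, 2 features each.
data_rng = random.Random(42)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]

kept, grown = balance(majority, minority, target=40)
print(len(kept), len(grown))  # both classes now have 40 rows
```

Meeting in the middle, rather than over-sampling the minority all the way up to the majority size, limits both the number of synthetic rows and the amount of discarded real data.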
Affiliation(s)
- HuiZhi Shao
- Jinan Engineering Polytechnic, Ji-Nan, Shandong, China
- College of Intelligent Equipment, Shandong University of Science & Technology, Tai-an, Shandong, China
- Xiang Liu
- College of Intelligent Equipment, Shandong University of Science & Technology, Tai-an, Shandong, China
- DaShuai Zong
- College of Intelligent Equipment, Shandong University of Science & Technology, Tai-an, Shandong, China
- QingJun Song
- College of Intelligent Equipment, Shandong University of Science & Technology, Tai-an, Shandong, China.
8
Zhang H, Cai Z. ConvNextUNet: A small-region attentioned model for cardiac MRI segmentation. Comput Biol Med 2024; 177:108592. [PMID: 38781642 DOI: 10.1016/j.compbiomed.2024.108592] [Received: 11/06/2023] [Revised: 04/08/2024] [Accepted: 05/09/2024] [Indexed: 05/25/2024]
Abstract
Cardiac MRI segmentation is a significant research area in medical image processing, holding immense clinical and scientific importance in assisting the diagnosis and treatment of heart diseases. Currently, existing cardiac MRI segmentation algorithms are often constrained by specific datasets and conditions, leading to a notable decrease in segmentation performance when applied to diverse datasets. These limitations affect the algorithm's overall performance and generalization capabilities. Inspired by ConvNext, we introduce a two-dimensional cardiac MRI segmentation U-shaped network called ConvNextUNet. It is the first application of a combination of ConvNext and the U-shaped architecture in the field of cardiac MRI segmentation. Firstly, we incorporate up-sampling modules into the original ConvNext architecture and combine it with the U-shaped framework to achieve accurate reconstruction. Secondly, we integrate Input Stem into ConvNext, and introduce attention mechanisms along the bridging path. By merging features extracted from both the encoder and decoder, a probability distribution is obtained through linear and nonlinear transformations, serving as attention weights, thereby enhancing the signal of the same region of interest. The resulting attention weights are applied to the decoder features, highlighting the region of interest. This allows the model to simultaneously consider local context and global details during the learning phase, fully leveraging the advantages of both global and local perception for a more comprehensive understanding of cardiac anatomical structures. Consequently, the model demonstrates a clear advantage and robust generalization capability, especially in small-region segmentation. Experimental results on the ACDC, LVQuan19, and RVSC datasets confirm that the ConvNextUNet model outperforms the current state-of-the-art models, particularly in small-region segmentation tasks. 
Furthermore, we conducted cross-dataset training and testing experiments, which revealed that the pre-trained model can accurately segment diverse cardiac datasets, showcasing its powerful generalization capabilities. The source code of this project is available at https://github.com/Zemin-Cai/ConvNextUNet.
Affiliation(s)
- Huiyi Zhang
- The Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063, PR China; Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Shantou, Guangdong 515063, PR China
- Zemin Cai
- The Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063, PR China; Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Shantou, Guangdong 515063, PR China.
9
Khan AQ, Sun G, Khalid M, Imran A, Bilal A, Azam M, Sarwar R. A novel fusion of genetic grey wolf optimization and kernel extreme learning machines for precise diabetic eye disease classification. PLoS One 2024; 19:e0303094. [PMID: 38768222 PMCID: PMC11147523 DOI: 10.1371/journal.pone.0303094] [Received: 01/31/2024] [Accepted: 04/18/2024] [Indexed: 05/22/2024] Open
Abstract
In response to the growing number of diabetes cases worldwide, our study addresses the escalating issue of diabetic eye disease (DED), a significant contributor to vision loss globally. We propose a novel integration of a Genetic Grey Wolf Optimization (G-GWO) algorithm with a Fully Convolutional Encoder-Decoder Network (FCEDN), further enhanced by a Kernel Extreme Learning Machine (KELM) for refined image segmentation and disease classification. This combination leverages the genetic algorithm and grey wolf optimization to boost the FCEDN's efficiency, enabling precise detection of DED stages and differentiation among disease types. Tested across diverse datasets, including IDRiD, DR-HAGIS, and ODIR, our model achieved classification accuracies between 98.5% and 98.8%, surpassing existing methods. This advancement sets a new standard in DED detection and offers significant potential for automating fundus image analysis, reducing reliance on manual examination, and improving patient care efficiency. Our findings are crucial to enhancing diagnostic accuracy and patient outcomes in DED management.
Affiliation(s)
- Abdul Qadir Khan
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
- Guangmin Sun
- Faculty of Information Technology, Beijing University of Technology, Beijing, China
- Majdi Khalid
- Department of Computer Science and Artificial Intelligence, College of Computing, Umm Al-Qura University, Makkah, Saudi Arabia
- Azhar Imran
- Department of Creative Technologies, Air University, Islamabad, Pakistan
- Anas Bilal
- College of Information Science and Technology, Hainan Normal University, Haikou, China
- Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou, China
- Muhammad Azam
- Department of Computer Science, Superior University, Lahore, Pakistan
- Raheem Sarwar
- OTEHM, Manchester Metropolitan University, Manchester, United Kingdom
10
Hassan E, Abd El-Hafeez T, Shams MY. Optimizing classification of diseases through language model analysis of symptoms. Sci Rep 2024; 14:1507. [PMID: 38233458 PMCID: PMC10794698 DOI: 10.1038/s41598-024-51615-5] [Received: 10/07/2023] [Accepted: 01/07/2024] [Indexed: 01/19/2024] Open
Abstract
This paper investigated the use of language models and deep learning techniques for automating disease prediction from symptoms. Specifically, we explored the use of two Medical Concept Normalization-Bidirectional Encoder Representations from Transformers (MCN-BERT) models and a Bidirectional Long Short-Term Memory (BiLSTM) model, each optimized with a different hyperparameter optimization method, to predict diseases from symptom descriptions. We utilized two distinct datasets, Dataset-1 and Dataset-2. Dataset-1 consists of 1,200 data points, each representing a unique combination of disease labels and symptom descriptions. Dataset-2 is designed to identify Adverse Drug Reactions (ADRs) from Twitter data and comprises 23,516 rows categorized as ADR (1) or Non-ADR (0) tweets. The results indicate that the MCN-BERT model optimized with AdamP achieved 99.58% accuracy for Dataset-1 and 96.15% accuracy for Dataset-2. The MCN-BERT model optimized with AdamW performed well with 98.33% accuracy for Dataset-1 and 95.15% for Dataset-2, while the BiLSTM model optimized with Hyperopt achieved 97.08% accuracy for Dataset-1 and 94.15% for Dataset-2. Our findings suggest that language models and deep learning techniques have promise for supporting earlier detection and more prompt treatment of diseases, as well as expanding remote diagnostic capabilities. The MCN-BERT and BiLSTM models demonstrated robust performance in accurately predicting diseases from symptoms, indicating the potential for further related research.
Affiliation(s)
- Esraa Hassan
- Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt.
- Tarek Abd El-Hafeez
- Department of Computer Science, Faculty of Science, Minia University, Minia, 61519, Egypt.
- Computer Science Unit, Deraya University, Minia University, Minia, 61765, Egypt.
- Mahmoud Y Shams
- Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt.
11
Zhu W, Zhang L, Jiang X, Zhou P, Xie X, Wang H. A method combining LDA and neural networks for antitumor drug efficacy prediction. Digit Health 2024; 10:20552076241280103. [PMID: 39257869 PMCID: PMC11384538 DOI: 10.1177/20552076241280103] [Received: 11/26/2023] [Accepted: 08/09/2024] [Indexed: 09/12/2024] Open
Abstract
Background In recent years, personalized medicine has gained attention for precision cancer treatment because of genetic heterogeneity among patients. However, predicting the efficacy of antitumor drugs in advance remains a significant challenge. Objective This study aims to predict the efficacy of antitumor drugs in individual cancer patients based on clinical data. Methods This paper proposes to predict personalized antitumor drug efficacy based on clinical data. Specifically, we encode the clinical text of cancer patients as a probability distribution vector in a hidden topic space using the Latent Dirichlet Allocation (LDA) model; we call this the LDA representation. A neural network is then designed, and the LDA representation is input into it to predict drug response in cancer patients treated with platinum drugs. To evaluate the effectiveness of the proposed method, we gathered and organized clinical records of lung and bowel cancer patients who underwent platinum-based treatment. The prediction performance is assessed using the following metrics: Precision, Recall, F1-score, Accuracy, and Area Under the ROC Curve (AUC). Results The study analyzed a dataset of 958 patients with non-small cell cancer treated with antitumor drugs. In stratified 5-fold cross-validation, the proposed method achieved an average Precision of 0.81, Recall of 0.89, F1-score of 0.85, Accuracy of 0.77, and AUC of 0.81 for cisplatin efficacy prediction, most of which are better than those of previous methods. Of these, the AUC value is at least 4% higher than that of previous methods. The advantage over previous methods persisted on an independent dataset of 266 bowel cancer patients, showing the generalizability of the proposed method. These results demonstrate the potential value of precise tumor treatment in clinical practice.
Conclusions Combining LDA and neural networks can help predict the efficacy of antitumor drugs based on clinical text. Our approach outperforms previous methods in predicting drug clinical efficacy.
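The reported metrics can be cross-checked against one another, since F1 is the harmonic mean of precision and recall. A minimal sketch of that check follows (the function name is illustrative). One caveat: the abstract reports cross-validation averages, and the F1 of averaged precision and recall need not equal the average of per-fold F1 scores, so agreement here is only approximate.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Cross-validated averages reported in the abstract.
f1 = f1_score(0.81, 0.89)
print(round(f1, 2))  # 0.85, matching the reported F1-score
```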
Affiliation(s)
- Weiwei Zhu
- University of Science and Technology of China, Hefei, Anhui, China
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui, China
- Lei Zhang
- Department of Pharmacy, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Xiaodong Jiang
- Medical Oncology Department, The First Affiliated Hospital of University of Science and Technology of China, Hefei, Anhui, China
- Peng Zhou
- School of Life Science, Hefei Normal University, Hefei, Anhui, China
- Xinping Xie
- School of Mathematics and Physics, Anhui Jianzhu University, Hefei, Anhui, China
- Hongqiang Wang
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui, China
12
Alizargar A, Chang YL, Tan TH. Performance Comparison of Machine Learning Approaches on Hepatitis C Prediction Employing Data Mining Techniques. Bioengineering (Basel) 2023; 10:481. [PMID: 37106668 PMCID: PMC10135598 DOI: 10.3390/bioengineering10040481] [Received: 03/20/2023] [Revised: 04/06/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023] Open
Abstract
Hepatitis C is a liver infection caused by the hepatitis C virus (HCV). Because symptoms appear late, early diagnosis of this disease is difficult. Efficient prediction can help patients before permanent liver damage occurs. The main objective of this study is to employ various machine learning techniques to predict this disease from common and affordable blood test data, so that patients can be diagnosed and treated in the early stages. In this study, six machine learning algorithms (Support Vector Machine (SVM), K-nearest Neighbors (KNN), Logistic Regression, decision tree, extreme gradient boosting (XGBoost), and artificial neural networks (ANN)) were applied to two datasets. The performance of these techniques was compared in terms of confusion matrix, precision, recall, F1 score, accuracy, receiver operating characteristics (ROC), and the area under the curve (AUC) to identify a method that is appropriate for predicting this disease. The analysis, on the NHANES and UCI datasets, revealed that SVM and XGBoost (with the highest accuracy and AUC among the tested models, >80%) can be effective tools for medical professionals using routine and affordable blood test data to predict hepatitis C.
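Of the metrics listed, AUC is the least self-explanatory: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are invented for illustration and do not come from the NHANES or UCI data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for six records (1 = HCV positive).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)  # 8 of the 9 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for a worked example; library implementations typically use a sort-based computation instead.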
Affiliation(s)
- Tan-Hsu Tan
- Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Taipei University of Technology, Taipei 10608, Taiwan; (A.A.); (Y.-L.C.)