1
Chiu YM, Sirois C, Simard M, Gagnon ME, Talbot D. Traditional Methods Hold Their Ground Against Machine Learning in Predicting Potentially Inappropriate Medication Use in Older Adults. VALUE IN HEALTH : THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2024:S1098-3015(24)02743-8. [PMID: 38977181 DOI: 10.1016/j.jval.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 05/30/2024] [Accepted: 06/27/2024] [Indexed: 07/10/2024]
Abstract
OBJECTIVES Machine learning methods have gained much attention in health sciences for predicting various health outcomes but are scarcely used in pharmacoepidemiology. The ability to identify predictors of suboptimal medication use is essential for conducting interventions aimed at improving medication outcomes. It remains uncertain whether machine learning methods could enhance the identification of potentially inappropriate medication use among older adults compared with traditional methods. This study aimed (1) to compare the performances of machine learning models in predicting use of potentially inappropriate medications and (2) to quantify and compare the relative importance of predictors in a population of community-dwelling older adults (>65 years) in the province of Québec, Canada. METHODS We used the Québec Integrated Chronic Disease Surveillance System and selected a cohort of 1 105 295 older adults, of whom 533 719 were potentially inappropriate medication users. Potentially inappropriate medications were defined according to the Beers list. We compared performances between 5 popular machine learning models (gradient boosting machines, logistic regression, naive Bayes, neural networks, and random forests) based on receiver operating characteristic curves and other performance criteria, using a set of sociodemographic and medical predictors. RESULTS No model clearly outperformed the others. All models except neural networks were in agreement regarding the top predictors (sex and anxiety-depressive disorders and schizophrenia) and the bottom predictors (rurality and social and material deprivation indices). CONCLUSIONS Including other types of predictors (eg, unstructured data) may be more useful for increasing performance in prediction of potentially inappropriate medication use.
Affiliation(s)
- Yohann Moanahere Chiu
  - Faculté de pharmacie, Université Laval, Québec, QC, Canada; Institut national de santé publique du Québec, Québec, QC, Canada; VITAM-Centre de recherche en santé durable, Centre intégré de santé et de services sociaux de la Capitale Nationale, Québec, QC, Canada.
- Caroline Sirois
  - Faculté de pharmacie, Université Laval, Québec, QC, Canada; Institut national de santé publique du Québec, Québec, QC, Canada; VITAM-Centre de recherche en santé durable, Centre intégré de santé et de services sociaux de la Capitale Nationale, Québec, QC, Canada; Centre de recherche du CHU de Québec-Université Laval, Québec, QC, Canada
- Marc Simard
  - Institut national de santé publique du Québec, Québec, QC, Canada; VITAM-Centre de recherche en santé durable, Centre intégré de santé et de services sociaux de la Capitale Nationale, Québec, QC, Canada; Département de médecine sociale et préventive, Faculté de médecine, Université Laval, Québec, QC, Canada
- Marie-Eve Gagnon
  - Faculté de pharmacie, Université Laval, Québec, QC, Canada; VITAM-Centre de recherche en santé durable, Centre intégré de santé et de services sociaux de la Capitale Nationale, Québec, QC, Canada; Département des Sciences de la Santé, Université du Québec à Rimouski, Québec, QC, Canada
- Denis Talbot
  - Département de médecine sociale et préventive, Faculté de médecine, Université Laval, Québec, QC, Canada; Centre de recherche du CHU de Québec-Université Laval, Québec, QC, Canada
2
Liu G, Li X, Guo Y, Zhang L, Liu H, Ai H. Ensemble multiclassification model for predicting developmental toxicity in zebrafish. AQUATIC TOXICOLOGY (AMSTERDAM, NETHERLANDS) 2024; 271:106936. [PMID: 38723470 DOI: 10.1016/j.aquatox.2024.106936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Revised: 04/29/2024] [Accepted: 05/01/2024] [Indexed: 05/21/2024]
Abstract
In recent years, with the rapid development of society, organic compounds have been released into aquatic environments in various forms, posing a significant threat to the survival of aquatic organisms. The assessment of developmental toxicity is an important part of environmental safety risk systems, helping to identify the potential impacts of organic compounds on the embryonic development of aquatic organisms and enabling early detection and warning of potential ecological risks. Additionally, binary classification models cannot accurately classify organic compounds. Therefore, it is crucial to construct a multiclassification model for predicting the developmental toxicity of organic compounds. In this study, binary and multiclassification models were developed based on the ToxCast™ Phase I chemical library and literature data. The random forest, support vector machine, extreme gradient boosting, adaptive gradient boosting, and C5.0 decision tree algorithms, as well as 8 types of molecular fingerprints, were used to establish multiclassification base models for predicting developmental toxicity through 5-fold cross-validation and external validation. Ultimately, a multiclassification ensemble model was derived through a voting method. The performance of the binary ensemble model, as measured by the balanced accuracy, was 0.918, while that of the multiclassification model was 0.819. The developmental toxicity voting ensemble model (DT-VEM) achieved accuracies of 0.804, 0.834, and 0.855. Furthermore, by utilizing the XGBoost machine learning algorithm to construct separate models for molecular descriptors and substructure molecular fingerprints, we identified several substructures and physical properties related to developmental toxicity. Our research contributes to a more detailed classification of developmental toxicity, providing a new and valuable tool for predicting the developmental toxicity effects of unknown compounds. This supplement addresses the limitations of previous tools, as it offers an enhanced ability to predict potential developmental toxicity in novel compounds.
Affiliation(s)
- Gaohua Liu
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Xinran Li
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Yaxu Guo
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
  - College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China
- Hongsheng Liu
  - College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China
- Haixin Ai
  - College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China.
3
Chen MS, Liu TC, Jhou MJ, Yang CT, Lu CJ. Analyzing Longitudinal Health Screening Data with Feature Ensemble and Machine Learning Techniques: Investigating Diagnostic Risk Factors of Metabolic Syndrome for Chronic Kidney Disease Stages 3a to 3b. Diagnostics (Basel) 2024; 14:825. [PMID: 38667472 PMCID: PMC11048899 DOI: 10.3390/diagnostics14080825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/12/2024] [Accepted: 04/13/2024] [Indexed: 04/28/2024] Open
Abstract
Longitudinal data, while often limited, contain valuable insights into features impacting clinical outcomes. To predict the progression of chronic kidney disease (CKD) in patients with metabolic syndrome, particularly those transitioning from stage 3a to 3b, where data are scarce, utilizing feature ensemble techniques can be advantageous. Such techniques can effectively identify crucial risk factors influencing CKD progression, thereby enhancing model performance. Machine learning (ML) methods have gained popularity due to their ability to perform feature selection and handle complex feature interactions more effectively than traditional approaches. However, different ML methods yield varying feature importance information. This study proposes a multiphase hybrid risk factor evaluation scheme to consider the diverse feature information generated by ML methods. The scheme incorporates variable ensemble rules (VERs) to combine feature importance information, thereby aiding in the identification of important features influencing CKD progression and supporting clinical decision making. In the proposed scheme, we employ six ML models (Lasso, RF, MARS, LightGBM, XGBoost, and CatBoost), each renowned for its distinct feature selection mechanisms and widespread usage in clinical studies. By implementing our proposed scheme, thirteen features affecting CKD progression are identified, and a promising AUC score of 0.883 can be achieved when constructing a model with them.
Affiliation(s)
- Ming-Shu Chen
  - Department of Healthcare Administration, College of Healthcare & Management, Asia Eastern University of Science and Technology, New Taipei City 220, Taiwan
- Tzu-Chi Liu
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242, Taiwan
- Mao-Jhen Jhou
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242, Taiwan
- Chih-Te Yang
  - Department of Business Administration, Tamkang University, New Taipei City 251, Taiwan
- Chi-Jie Lu
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242, Taiwan
  - Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City 242, Taiwan
  - Department of Information Management, Fu Jen Catholic University, New Taipei City 242, Taiwan
4
Leme DEDC, de Oliveira C. Machine Learning Models to Predict Future Frailty in Community-Dwelling Middle-Aged and Older Adults: The ELSA Cohort Study. J Gerontol A Biol Sci Med Sci 2023; 78:2176-2184. [PMID: 37209408 PMCID: PMC10613015 DOI: 10.1093/gerona/glad127] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/19/2022] [Indexed: 05/22/2023] Open
Abstract
BACKGROUND Machine learning (ML) models can be used to predict future frailty in the community setting. However, outcome variables for epidemiologic data sets such as frailty usually have an imbalance between categories, that is, there are far fewer individuals classified as frail than as nonfrail, adversely affecting the performance of ML models when predicting the syndrome. METHODS A retrospective cohort study with participants (50 years or older) from the English Longitudinal Study of Ageing who were nonfrail at baseline (2008-2009) and reassessed for the frailty phenotype at 4-year follow-up (2012-2013). Social, clinical, and psychosocial baseline predictors were selected to predict frailty at follow-up in ML models (Logistic Regression, Random Forest [RF], Support Vector Machine, Neural Network, K-nearest neighbor, and Naive Bayes classifier). RESULTS Of all the 4 378 nonfrail participants at baseline, 347 became frail at follow-up. The proposed combined oversampling and undersampling method to adjust imbalanced data improved the performance of the models, and RF had the best performance, with areas under the receiver-operating characteristic curve and the precision-recall curve of 0.92 and 0.97, respectively, specificity of 0.83, sensitivity of 0.88, and balanced accuracy of 85.5% for balanced data. Age, chair-rise test, household wealth, balance problems, and self-rated health were the most important frailty predictors in most of the models trained with balanced data. CONCLUSIONS ML proved useful in identifying individuals who became frail over time, and this result was made possible by balancing the data set. This study highlighted factors that may be useful in the early detection of frailty.
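The reported balanced accuracy can be checked against the reported sensitivity and specificity. The sketch below assumes the standard binary definition (the mean of sensitivity and specificity); the abstract does not state which definition the authors used, and the function name is illustrative only.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy for a binary classifier.

    Assumes the common definition: the arithmetic mean of sensitivity
    (true-positive rate) and specificity (true-negative rate).
    """
    return (sensitivity + specificity) / 2.0


# Values reported in the abstract for the random forest on balanced data.
rf_balanced_accuracy = balanced_accuracy(sensitivity=0.88, specificity=0.83)
print(f"{rf_balanced_accuracy:.3f}")
```

Under that assumption the result is 0.855, which agrees with the 85.5% stated in the abstract.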
Affiliation(s)
- Cesar de Oliveira
  - Department of Epidemiology and Public Health, University College London, London, UK
5
Predict, diagnose, and treat chronic kidney disease with machine learning: a systematic literature review. J Nephrol 2023; 36:1101-1117. [PMID: 36786976 PMCID: PMC10227138 DOI: 10.1007/s40620-023-01573-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Accepted: 01/01/2023] [Indexed: 02/15/2023]
Abstract
OBJECTIVES In this systematic review we aimed at assessing how artificial intelligence (AI), including machine learning (ML) techniques, has been deployed to predict, diagnose, and treat chronic kidney disease (CKD). We systematically reviewed the available evidence on these innovative techniques to improve CKD diagnosis and patient management. METHODS We included English language studies retrieved from PubMed. The review is therefore to be classified as a "rapid review", since it includes one database only, and has language restrictions; the novelty and importance of the issue make missing relevant papers unlikely. We extracted 16 variables, including: main aim, studied population, data source, sample size, problem type (regression, classification), predictors used, and performance metrics. We followed the Preferred Reporting Items for Systematic Reviews (PRISMA) approach; all main steps were done in duplicate. RESULTS From a total of 648 studies initially retrieved, 68 articles met the inclusion criteria. Models, as reported by authors, performed well, but the reported metrics were not homogeneous across articles and therefore direct comparison was not feasible. The most common aim was prediction of prognosis, followed by diagnosis of CKD. Algorithm generalizability and testing on diverse populations were rarely taken into account. Furthermore, the clinical evaluation and validation of the models/algorithms was perused; only a fraction of the included studies, 6 out of 68, were performed in a clinical context. CONCLUSIONS Machine learning is a promising tool for the prediction of risk, diagnosis, and therapy management for CKD patients. Nonetheless, future work is needed to address the interpretability, generalizability, and fairness of the models to ensure the safe application of such technologies in routine clinical practice.
6
Li X, Liu G, Wang Z, Zhang L, Liu H, Ai H. Ensemble multiclassification model for aquatic toxicity of organic compounds. AQUATIC TOXICOLOGY (AMSTERDAM, NETHERLANDS) 2023; 255:106379. [PMID: 36587517 DOI: 10.1016/j.aquatox.2022.106379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 12/04/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
With environmental pollution becoming increasingly serious, organic compounds have become the main hazard of environmental pollution and exert substantial negative impacts on aquatic organisms. In research pertaining to the acute toxicity of organic compounds, traditional biological experimental methods are time-consuming and expensive. In addition, computer-aided binary classification models cannot accurately classify acute toxicity. Therefore, a multiclassification model is necessary for more accurate classification of acute toxicity. In this study, median lethal concentrations of 373 organic compounds in the environmental toxicology datasets ECOTOX and EAT5 were used. These chemicals were classified into four categories based on the European Economic Community criteria. Then the random forest, support vector machine, extreme gradient boosting, adaptive gradient boosting, and C5.0 decision tree algorithms and eight molecular fingerprints were used to build multiclassification base models for the acute toxicity of organic compounds. The base models were repeated 100 times with fivefold cross-validation and external validation. The ensemble model was obtained by the voting method. The best base classifier was ExtendFP-C5.0, which had accuracy, sensitivity, and specificity values of 87.30%, 87.32%, and 95.76%, respectively, for external validation; the voting ensemble model achieved 96.92%, 96.93%, and 98.97%, respectively. The ensemble model achieved a higher accuracy than previously reported studies. Our study will help to further classify the acute toxicity of organic compounds to aquatic organisms and predict the hazard classes of organic compounds.
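A voting ensemble of the kind described can be sketched as hard majority voting over the class labels predicted by each base model. This is a minimal illustration, not the authors' implementation: the function name, the integer labels 1 to 4 standing in for the four hazard categories, the example predictions, and the tie-breaking rule (smallest label wins) are all assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list of voted labels.

    `predictions` holds one list of class labels per base model; all lists
    must cover the same samples in the same order. Ties are broken by
    taking the smallest label, which is an arbitrary illustrative choice.
    """
    n_samples = len(predictions[0])
    if any(len(model_preds) != n_samples for model_preds in predictions):
        raise ValueError("every model must predict the same number of samples")

    voted = []
    for sample_labels in zip(*predictions):
        counts = Counter(sample_labels)
        top_count = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top_count))
    return voted


# Hypothetical predictions from three base models on four compounds.
base_predictions = [
    [1, 2, 3, 4],
    [1, 2, 4, 4],
    [2, 2, 3, 1],
]
ensemble_labels = majority_vote(base_predictions)
print(ensemble_labels)
```

Hard voting lets a compound that one base model misclassifies still receive the label the other models agree on, which is one plausible reason an ensemble can exceed its best base classifier.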
Affiliation(s)
- Xinran Li
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Gaohua Liu
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Zhibo Wang
  - College of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
  - College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China
- Hongsheng Liu
  - China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China; College of Pharmacy, Liaoning University, Shenyang, 110036, China
- Haixin Ai
  - College of Life Science, Liaoning University, Shenyang, 110036, China; China Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China.
7
Accurate Evaluation of Feature Contributions for Sentinel Lymph Node Status Classification in Breast Cancer. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12147227] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
The current guidelines recommend the sentinel lymph node biopsy to evaluate the lymph node involvement for breast cancer patients with clinically negative lymph nodes on clinical or radiological examination. Machine learning (ML) models have significantly improved the prediction of lymph node status based on clinical features, thus avoiding expensive, time-consuming and invasive procedures. However, the classification of sentinel lymph node status represents a typical example of an unbalanced classification problem. In this work, we developed an ML framework to explore the effects of unbalanced populations on the performance and stability of feature ranking for sentinel lymph node status classification in breast cancer. Our results indicate state-of-the-art AUC (Area under the Receiver Operating Characteristic curve) values on a hold-out set (67%) while providing particularly stable features related to tumor size, histological subtype and estrogen receptor expression, which should therefore be considered as potential biomarkers.
8
Kobayashi M, Yamada Y, Shinkawa K, Nemoto M, Nemoto K, Arai T. Automated Early Detection of Alzheimer's Disease by Capturing Impairments in Multiple Cognitive Domains with Multiple Drawing Tasks. J Alzheimers Dis 2022; 88:1075-1089. [PMID: 35723100 PMCID: PMC9484124 DOI: 10.3233/jad-215714] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/15/2022]
Abstract
BACKGROUND Automatic analysis of the drawing process using a digital tablet and pen has been applied to successfully detect Alzheimer's disease (AD) and mild cognitive impairment (MCI). However, most studies focused on analyzing individual drawing tasks separately, and the question of how a combination of drawing tasks could improve the detection performance thus remains unexplored. OBJECTIVE We aimed to investigate whether analysis of the drawing process in multiple drawing tasks could capture different, complementary aspects of cognitive impairments, with a view toward combining multiple tasks to effectively improve the detection capability. METHODS We collected drawing data from 144 community-dwelling older adults (27 AD, 65 MCI, and 52 cognitively normal, or CN) who performed five drawing tasks. We then extracted motion- and pause-related drawing features for each task and investigated the statistical associations of the features with the participants' diagnostic statuses and cognitive measures. RESULTS The drawing features showed gradual changes from CN to MCI and then to AD, and the changes in the features for each task were statistically associated with cognitive impairments in different domains. For classification into the three diagnostic categories, a machine learning model using the features from all five tasks achieved a classification accuracy of 75.2%, an improvement by 7.8% over that of the best single-task model. CONCLUSION Our results demonstrate that a common set of drawing features from multiple drawing tasks can capture different, complementary aspects of cognitive impairments, which may lead to a scalable way to improve the automated detection of AD and MCI.
9
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Khanna NN, Ruzsa Z, Sharma AM, Saxena S, Faa G, Laird JR, Johri AM, Kalra MK, Paraskevas KI, Saba L. A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review. Diagnostics (Basel) 2022; 12:diagnostics12030722. [PMID: 35328275 PMCID: PMC8947682 DOI: 10.3390/diagnostics12030722] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/21/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022] [Indexed: 12/16/2022] Open
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. Due to their popularity and recent development, the study analyzed the above three paradigms using machine learning (ML) frameworks. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: The most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction had shown promising results. Ground truth (GT) selection for AI-based training along with scientific and clinical validation is very important for CVD stratification to avoid RoB. It was observed that the most popular classification paradigm is multiclass, followed by the ensemble and multi-label paradigms. The use of deep learning techniques in CVD risk stratification is in a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent the RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
Affiliation(s)
- Jasjit S. Suri
  - Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA 95661, USA
  - Correspondence: Tel.: +1-(916)-749-5628
- Mrinalini Bhagawati
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Sudip Paul
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Athanasios D. Protogerou
  - Research Unit Clinic, Laboratory of Pathophysiology, Department of Cardiovascular Prevention, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Petros P. Sfikakis
  - Rheumatology Unit, National Kapodistrian University of Athens, 11527 Athens, Greece
- George D. Kitas
  - Arthritis Research UK Centre for Epidemiology, Manchester University, Manchester 46962, UK
- Narendra N. Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi 110020, India
- Zoltan Ruzsa
  - Department of Internal Medicines, Invasive Cardiology Division, University of Szeged, 6720 Szeged, Hungary
- Aditya M. Sharma
  - Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Sanjay Saxena
  - Department of CSE, International Institute of Information Technology, Bhubaneswar 751003, India
- Gavino Faa
  - Department of Pathology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
- John R. Laird
  - Cardiology Department, St. Helena Hospital, St. Helena, CA 94574, USA
- Amer M. Johri
  - Department of Medicine, Division of Cardiology, Queen’s University, Kingston, ON K7L 3N6, Canada
- Manudeep K. Kalra
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
- Kosmas I. Paraskevas
  - Department of Vascular Surgery, Central Clinic of Athens, N. Iraklio, 14122 Athens, Greece
- Luca Saba
  - Department of Radiology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
10
Jin X, Wang J, Ge L, Hu Q. Identification of Immune-Related Biomarkers for Sciatica in Peripheral Blood. Front Genet 2021; 12:781945. [PMID: 34925462 PMCID: PMC8677837 DOI: 10.3389/fgene.2021.781945] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Accepted: 11/04/2021] [Indexed: 01/22/2023] Open
Abstract
Objective: Sciatica pertains to neuropathic pain that has been associated with inflammatory response. We aimed to identify significant immune-related biomarkers for sciatica in peripheral blood. Methods: We utilized the GSE150408 expression profiling data from the Gene Expression Omnibus (GEO) database as the training dataset and extracted immune-related genes for further analysis. Differentially expressed immune-related genes (DEIRGs) between healthy controls and patients with sciatica were selected using the "limma" package and verified in clinical specimens by quantitative reverse transcription PCR (RT-qPCR). A diagnostic immune-related gene signature was established using the training model and random forest (RF), generalized linear model (GLM), and support vector machine (SVM) models. Sciatica patient subtypes were identified using the consensus clustering method. Results: Thirteen significant DEIRGs were acquired, of which five (CRP, EREG, FAM19A4, RLN1, and WFIKKN1) were selected to establish a diagnostic immune-related gene signature according to the most appropriate training model, namely, the RF model. A clinical application nomogram model was established based on the expression level of the five DEIRGs. The sciatica patients were divided into two subtypes (C1 and C2) according to the consensus clustering method. Conclusions: Our research established a diagnostic five immune-related gene signature to discriminate sciatica and identified two sciatica subtypes, which may be beneficial to the clinical diagnosis and treatment of sciatica.
Affiliation(s)
- Xin Jin
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Jun Wang
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Lina Ge
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Qing Hu
  - Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
11
Chowdhury NH, Reaz MBI, Haque F, Ahmad S, Ali SHM, A Bakar AA, Bhuiyan MAS. Performance Analysis of Conventional Machine Learning Algorithms for Identification of Chronic Kidney Disease in Type 1 Diabetes Mellitus Patients. Diagnostics (Basel) 2021; 11:diagnostics11122267. [PMID: 34943504 PMCID: PMC8700037 DOI: 10.3390/diagnostics11122267] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/13/2021] [Revised: 11/12/2021] [Accepted: 12/01/2021] [Indexed: 12/18/2022] Open
Abstract
Chronic kidney disease (CKD) is one of the severe side effects of type 1 diabetes mellitus (T1DM). However, the detection and diagnosis of CKD are often delayed because of its asymptomatic nature. In addition, patients often tend to bypass the traditional urine protein (urinary albumin)-based CKD detection test. Even though disease detection using machine learning (ML) is a well-established field of study, it is rarely used to diagnose CKD in T1DM patients. This research aimed to employ and evaluate several ML algorithms to develop models to quickly predict CKD in patients with T1DM using easily available routine checkup data. This study analyzed 16 years of data of 1375 T1DM patients, obtained from the Epidemiology of Diabetes Interventions and Complications (EDIC) clinical trials directed by the National Institute of Diabetes, Digestive, and Kidney Diseases, USA. Three data imputation techniques (RF, KNN, and MICE) and the SMOTETomek resampling technique were used to preprocess the primary dataset. Ten ML algorithms, including logistic regression (LR), k-nearest neighbor (KNN), Gaussian naïve Bayes (GNB), support vector machine (SVM), stochastic gradient descent (SGD), decision tree (DT), gradient boosting (GB), random forest (RF), extreme gradient boosting (XGB), and light gradient-boosted machine (LightGBM), were applied to develop prediction models. Each model included 19 demographic, medical history, behavioral, and biochemical features, and every feature’s effect was ranked using three feature ranking techniques (XGB, RF, and Extra Tree). Lastly, each model’s ROC, sensitivity (recall), specificity, accuracy, precision, and F-1 score were estimated to find the best-performing model. The RF classifier model exhibited the best performance with 0.96 (±0.01) accuracy, 0.98 (±0.01) sensitivity, and 0.93 (±0.02) specificity. LightGBM performed second best and was quite close to RF with 0.95 (±0.06) accuracy. In addition to these two models, KNN, SVM, DT, GB, and XGB models also achieved more than 90% accuracy.
Affiliation(s)
- Nakib Hayat Chowdhury
- Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology (BAUST), Saidpur Cantonment, Saidpur 5310, Bangladesh
- Mamun Bin Ibne Reaz
- Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Fahmida Haque
- Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Shamim Ahmad
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
- Sawal Hamid Md Ali
- Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Ahmad Ashrif A Bakar
- Department of Electrical, Electronic and Systems Engineering, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Mohammad Arif Sobhan Bhuiyan
- Department of Electrical and Electronics Engineering, Xiamen University Malaysia, Bandar Sunsuria, Sepang 43900, Selangor, Malaysia
|
12
|
Wang Y, Li Z, Song G, Wang J. Potential of Immune-Related Genes as Biomarkers for Diagnosis and Subtype Classification of Preeclampsia. Front Genet 2020; 11:579709. [PMID: 33335538 PMCID: PMC7737719 DOI: 10.3389/fgene.2020.579709] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/03/2020] [Accepted: 10/30/2020] [Indexed: 11/13/2022]
Abstract
Objective Preeclampsia (PE) is the main cause of maternal mortality due to a lack of diagnostic biomarkers and effective prevention and treatment. The immune system plays an important role in the occurrence and development of PE. This research aimed to identify significant immune-related genes that could predict PE and point to possible prevention and control methods. Methods Differential expression analysis between normotensive and PE pregnancies was performed to identify significantly changed immune-related genes. Generalized linear model (GLM), random forest (RF), and support vector machine (SVM) models were established separately to screen the most suitable biomarkers for the diagnosis of PE among these significantly changed immune-related genes. The consensus clustering method was used to divide the PE cases into several subgroups to explore the function of the significantly changed immune-related genes in PE. Results Thirteen significantly changed immune-related genes were obtained by the differential expression analysis. RF was the best model and was used to select the four most important explanatory variables (CRH, PI3, CCL18, and CCL2) to diagnose PE. A nomogram model was constructed to predict PE based on these four variables. The decision curve analysis (DCA) and clinical impact curves revealed that PE patients could significantly benefit from this nomogram. Consensus clustering analysis of the 13 differentially expressed immune-related genes (DIRGs) was used to identify 3 subgroups of PE pregnancies with different clinical outcomes and immune cell infiltration. Conclusion Our study identified four immune-related genes to predict PE and three subgroups of PE with different clinical outcomes and immune cell infiltration. Future studies on the three subgroups may provide direction for individualized treatment of PE patients.
Affiliation(s)
- Ying Wang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Zhen Li
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Guiyu Song
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
- Jun Wang
- Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, China
|
13
|
Multiclass machine learning vs. conventional calculators for stroke/CVD risk assessment using carotid plaque predictors with coronary angiography scores as gold standard: a 500 participants study. Int J Cardiovasc Imaging 2020; 37:1171-1187. [DOI: 10.1007/s10554-020-02099-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/13/2020] [Accepted: 11/03/2020] [Indexed: 02/07/2023]
|