1
Jabara M, Kose O, Perlman G, Corcos S, Pelletier MA, Possik E, Tsoukas M, Sharma A. Artificial Intelligence-Based Digital Biomarkers for Type 2 Diabetes: A Review. Can J Cardiol 2024; 40:1922-1933. [PMID: 39111729 DOI: 10.1016/j.cjca.2024.07.028] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/11/2024] [Revised: 07/27/2024] [Accepted: 07/29/2024] [Indexed: 09/10/2024] Open
Abstract
Type 2 diabetes mellitus (T2DM), a complex metabolic disorder that burdens the health care system, requires early detection and treatment. Recent strides in digital health technologies, coupled with artificial intelligence (AI), may revolutionize T2DM screening, diagnosis of complications, and management through the development of digital biomarkers. This review provides an overview of the potential applications of AI-driven biomarkers in screening, diagnosing complications, and managing patients with T2DM. The benefits of using multisensor devices to develop digital biomarkers are discussed. A summary of these findings, and of the patterns between model architecture and sensor type, is presented. In addition, we highlight the pivotal role of AI techniques in clinical intervention and implementation, encompassing clinical decision support systems, telemedicine interventions, and population health initiatives. Challenges such as data privacy, algorithm interpretability, and regulatory considerations are also highlighted, alongside future research directions to explore the use of AI-driven digital biomarkers in T2DM screening and management.
Affiliation(s)
- Mariam Jabara
- Centre for Outcome Research & Evaluation, McGill University Health Centre, Montréal, Québec, Canada; Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, Québec, Canada
- Orhun Kose
- Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, Québec, Canada; DREAM-CV Lab, Research Institute of the McGill University Health Centre, Montréal, Québec, Canada
- George Perlman
- Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, Québec, Canada; DREAM-CV Lab, Research Institute of the McGill University Health Centre, Montréal, Québec, Canada
- Simon Corcos
- HOP-Child Technologies, Sherbrooke, Québec, Canada
- Elite Possik
- DREAM-CV Lab, Research Institute of the McGill University Health Centre, Montréal, Québec, Canada
- Michael Tsoukas
- Centre for Outcome Research & Evaluation, McGill University Health Centre, Montréal, Québec, Canada; Department of Endocrinology, McGill University Health Centre, Montréal, Québec, Canada
- Abhinav Sharma
- Centre for Outcome Research & Evaluation, McGill University Health Centre, Montréal, Québec, Canada; Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, Québec, Canada; DREAM-CV Lab, Research Institute of the McGill University Health Centre, Montréal, Québec, Canada.
2
Ayub H, Khan MA, Shehryar Ali Naqvi S, Faseeh M, Kim J, Mehmood A, Kim YJ. Unraveling the Potential of Attentive Bi-LSTM for Accurate Obesity Prognosis: Advancing Public Health towards Sustainable Cities. Bioengineering (Basel) 2024; 11:533. [PMID: 38927769 PMCID: PMC11200407 DOI: 10.3390/bioengineering11060533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 05/13/2024] [Accepted: 05/19/2024] [Indexed: 06/28/2024] Open
Abstract
The global prevalence of obesity presents a pressing challenge to public health and healthcare systems, necessitating accurate prediction and understanding for effective prevention and management strategies. This article addresses the need for improved obesity prediction models by conducting a comprehensive analysis of existing machine learning (ML) and deep learning (DL) approaches. This study introduces a novel hybrid model, Attention-based Bi-LSTM (ABi-LSTM), which integrates attention mechanisms with bidirectional Long Short-Term Memory (Bi-LSTM) networks to enhance interpretability and performance in obesity prediction. Our study fills a crucial gap by bridging healthcare and urban planning domains, offering insights into data-driven approaches to promote healthier living within urban environments. The proposed ABi-LSTM model demonstrates exceptional performance, achieving a remarkable accuracy of 96.5% in predicting obesity levels. Comparative analysis showcases its superiority over conventional approaches, with superior precision, recall, and overall classification balance. This study highlights significant advancements in predictive accuracy and positions the ABi-LSTM model as a pioneering solution for accurate obesity prognosis. The implications extend beyond healthcare, offering a precise tool to address the global obesity epidemic and foster sustainable development in smart cities.
Affiliation(s)
- Hina Ayub
- Interdisciplinary Graduate Program in Advance Convergence Technology and Science, Jeju National University, Jeju 63243, Republic of Korea
- Murad-Ali Khan
- Department of Computer Engineering, Jeju National University, Jeju 63243, Republic of Korea
- Syed Shehryar Ali Naqvi
- Department of Electronics Engineering, Jeju National University, Jeju 63243, Republic of Korea
- Muhammad Faseeh
- Department of Electronics Engineering, Jeju National University, Jeju 63243, Republic of Korea
- Jungsuk Kim
- Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Asif Mehmood
- Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Young-Jin Kim
- Medical Device Development Center, Osong Medical Innovation Foundation, Cheongju 28160, Republic of Korea
3
Zhang H, Zeng T, Zhang J, Zheng J, Min J, Peng M, Liu G, Zhong X, Wang Y, Qiu K, Tian S, Liu X, Huang H, Surmach M, Wang P, Hu X, Chen L. Development and validation of machine learning-augmented algorithm for insulin sensitivity assessment in the community and primary care settings: a population-based study in China. Front Endocrinol (Lausanne) 2024; 15:1292346. [PMID: 38332892 PMCID: PMC10850228 DOI: 10.3389/fendo.2024.1292346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 01/11/2024] [Indexed: 02/10/2024] Open
Abstract
Objective Insulin plays a central role in the regulation of energy and glucose homeostasis, and insulin resistance (IR) is widely considered the "common soil" of a cluster of cardiometabolic disorders. Assessment of insulin sensitivity is very important in preventing and treating IR-related disease. This study aims to develop and validate machine learning (ML)-augmented algorithms for insulin sensitivity assessment in the community and primary care settings. Methods We analyzed the data of 9358 participants over 40 years old who participated in the population-based cohort of the Hubei center of the REACTION study (Risk Evaluation of Cancers in Chinese Diabetic Individuals). Three non-ensemble algorithms and four ensemble algorithms were used to develop the models, with 70 non-laboratory variables for the community setting and 87 (70 non-laboratory and 17 laboratory) variables for the primary care setting, in order to identify the best-performing classifier. The models with the best performance were further streamlined using the top-ranked 5, 8, 10, 13, 15, and 20 features. Performance of these ML models was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPR), and the Brier score. Shapley additive explanation (SHAP) analysis was employed to evaluate the importance of features and interpret the models. Results The LightGBM models developed for the community (AUROC 0.794, AUPR 0.575, Brier score 0.145) and primary care settings (AUROC 0.867, AUPR 0.705, Brier score 0.119) achieved higher performance than the models constructed by the other six algorithms. The streamlined LightGBM models for the community (AUROC 0.791, AUPR 0.563, Brier score 0.146) and primary care settings (AUROC 0.863, AUPR 0.692, Brier score 0.124) using the 20 top-ranked variables also showed excellent performance. SHAP analysis indicated that the top-ranked features included fasting plasma glucose (FPG), waist circumference (WC), body mass index (BMI), triglycerides (TG), gender, waist-to-height ratio (WHtR), the number of daughters born, resting pulse rate (RPR), etc. Conclusion The ML models using the LightGBM algorithm can accurately predict insulin sensitivity in the community and primary care settings and might become an efficient and practical tool for insulin sensitivity assessment in these settings.
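The three evaluation metrics named in this abstract (AUROC, AUPR, and the Brier score) can be illustrated with a minimal standard-library sketch. The function names and the toy labels and probabilities below are assumptions made for illustration; they are not the study's code or data, and the AUPR here is the step-wise "average precision" estimate, which ignores tied scores.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at each positive case when cases are
    ranked by descending score. Assumes at least one positive case.
    """
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy data: 1 = outcome present, 0 = outcome absent.
labels = [1, 0, 1, 0, 1]
probs = [0.9, 0.2, 0.7, 0.6, 0.4]
print(auroc(labels, probs), average_precision(labels, probs), brier_score(labels, probs))
```

AUROC and AUPR depend only on how cases are ranked, whereas the Brier score also penalizes poorly calibrated probabilities, which is one reason to report all three together.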
Affiliation(s)
- Hao Zhang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Tianshu Zeng
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Jiaoyue Zhang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Juan Zheng
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Jie Min
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Miaomiao Peng
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Geng Liu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Xueyu Zhong
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Ying Wang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Kangli Qiu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Shenghua Tian
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Xiaohuan Liu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Hantao Huang
- Department of Emergency Medicine, Yichang Yiling Hospital, Yichang, China
- Marina Surmach
- Department of Public Health and Health Services, Grodno State Medical University, Grodno, Belarus
- Ping Wang
- Precision Health Program, Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, United States
- Xiang Hu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- Lulu Chen
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei Provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
4
Shojaee-Mend H, Velayati F, Tayefi B, Babaee E. Prediction of Diabetes Using Data Mining and Machine Learning Algorithms: A Cross-Sectional Study. Healthc Inform Res 2024; 30:73-82. [PMID: 38359851 PMCID: PMC10879823 DOI: 10.4258/hir.2024.30.1.73] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 01/24/2024] [Accepted: 01/24/2024] [Indexed: 02/17/2024] Open
Abstract
OBJECTIVES This study aimed to develop a model to predict fasting blood glucose status using machine learning and data mining, since the early diagnosis and treatment of diabetes can improve outcomes and quality of life. METHODS This cross-sectional study analyzed data from 3376 adults over 30 years old who participated in a diabetes screening program at 16 comprehensive health service centers in Tehran, Iran. The dataset was balanced using random sampling and the synthetic minority over-sampling technique (SMOTE). The dataset was split into a training set (80%) and a test set (20%). Shapley values were calculated to select the most important features. Noise analysis was performed by adding Gaussian noise to the numerical features to evaluate the robustness of feature importance. Five different machine learning algorithms, namely CatBoost, random forest, XGBoost, logistic regression, and an artificial neural network, were used to model the dataset. Accuracy, sensitivity, specificity, the F1-score, and the area under the curve were used to evaluate the models. RESULTS Age, waist-to-hip ratio, body mass index, and systolic blood pressure were the most important factors for predicting fasting blood glucose status. Though the models achieved similar predictive ability, the CatBoost model performed slightly better overall, with an area under the curve (AUC) of 0.737. CONCLUSIONS A gradient boosted decision tree model identified the most important risk factors related to diabetes. Age, waist-to-hip ratio, body mass index, and systolic blood pressure were, in that order, the most important risk factors for diabetes. This model can support planning for diabetes management and prevention.
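The abstract names SMOTE as one of its balancing steps without describing it. As a rough sketch of the core idea only: each synthetic minority-class sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_like`, its parameters, and the toy points are hypothetical, and this simplified version is not the study's pipeline or a substitute for a maintained implementation.

```python
import random
from typing import List

Point = List[float]


def smote_like(minority: List[Point], n_new: int, k: int = 5, seed: int = 0) -> List[Point]:
    """Create n_new synthetic minority samples by interpolating between neighbours.

    Simplified sketch: needs at least two minority points and uses plain
    Euclidean distance on numeric features only.
    """
    rng = random.Random(seed)
    synthetic: List[Point] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples with two numeric features.
minority_points = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_points = smote_like(minority_points, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority points, it stays inside the range already covered by the minority class.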
Affiliation(s)
- Hassan Shojaee-Mend
- Infectious Diseases Research Center, Gonabad University of Medical Sciences, Gonabad, Iran
- Farnia Velayati
- Telemedicine Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Batool Tayefi
- Preventive Medicine and Public Health Research Center, Psychosocial Health Research Institute, Department of Community and Family Medicine, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Ebrahim Babaee
- Preventive Medicine and Public Health Research Center, Psychosocial Health Research Institute, Department of Community and Family Medicine, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Vaccine Research Center, Iran University of Medical Sciences, Tehran, Iran
5
He Y, Matsunaga M, Li Y, Kishi T, Tanihara S, Iwata N, Tabuchi T, Ota A. Classifying Schizophrenia Cases by Artificial Neural Network Using Japanese Web-Based Survey Data: Case-Control Study. JMIR Form Res 2023; 7:e50193. [PMID: 37966882 PMCID: PMC10687680 DOI: 10.2196/50193] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/26/2023] [Revised: 09/18/2023] [Accepted: 10/08/2023] [Indexed: 11/16/2023] Open
Abstract
BACKGROUND In Japan, challenges have been reported in accurately estimating the prevalence of schizophrenia among the general population. From previous studies, we found that patients with schizophrenia were more likely to experience poor subjective well-being and various physical, psychiatric, and social comorbidities. These factors might have great potential for precisely classifying schizophrenia cases in order to estimate the prevalence. Machine learning has shown a positive impact on many fields, including epidemiology, due to its high-precision modeling capability. It has been applied in research on mental disorders. However, few studies have applied machine learning technology to the precise classification of schizophrenia cases by variables of demographic and health-related backgrounds, especially using large-scale web-based surveys. OBJECTIVE The aim of the study is to construct an artificial neural network (ANN) model that can accurately classify schizophrenia cases from large-scale Japanese web-based survey data and to verify the generalizability of the model. METHODS Data were obtained from a large Japanese internet research pooled panel (Rakuten Insight, Inc) in 2021. A total of 223 individuals with schizophrenia, aged 20-75 years, and 1776 healthy controls were included. Answers to the questions in a web-based survey were formatted as 1 response variable (self-reported diagnosis of schizophrenia) and multiple feature variables (demographic, health-related backgrounds, physical comorbidities, psychiatric comorbidities, and social comorbidities). An ANN was applied to construct a model for classifying schizophrenia cases. Logistic regression (LR) was used as a reference. The performances of the models and algorithms were then compared. RESULTS The model trained by the ANN performed better than LR in terms of area under the receiver operating characteristic curve (0.86 vs 0.78), accuracy (0.93 vs 0.91), and specificity (0.96 vs 0.94), while the model trained by LR showed better sensitivity (0.63 vs 0.56). Comparing the performances of the ANN and LR, the ANN was better in terms of area under the receiver operating characteristic curve (bootstrapping: 0.847 vs 0.773 and cross-validation: 0.81 vs 0.72), while LR performed better in terms of accuracy (0.894 vs 0.856). Sleep medication use, age, household income, and employment type were the top 4 variables in terms of importance. CONCLUSIONS This study constructed an ANN model to classify schizophrenia cases using web-based survey data. Our model showed high internal validity. The findings are expected to provide evidence for estimating the prevalence of schizophrenia in the Japanese population and to inform future epidemiological studies.
Affiliation(s)
- Yupeng He
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Japan
- Masaaki Matsunaga
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Japan
- Yuanying Li
- Department of Public Health and Health Systems, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Taro Kishi
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- Shinichi Tanihara
- Department of Public Health, Kurume University School of Medicine, Kurume, Japan
- Nakao Iwata
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Japan
- Takahiro Tabuchi
- Cancer Control Center, Osaka International Cancer Institute, Osaka, Japan
- Atsuhiko Ota
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Japan
6
Chellappan D, Rajaguru H. Enhancement of Classifier Performance Using Swarm Intelligence in Detection of Diabetes from Pancreatic Microarray Gene Data. Biomimetics (Basel) 2023; 8:503. [PMID: 37887634 PMCID: PMC10604158 DOI: 10.3390/biomimetics8060503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 10/08/2023] [Accepted: 10/20/2023] [Indexed: 10/28/2023] Open
Abstract
In this study, we focused on using microarray gene data from pancreatic sources to detect diabetes mellitus. Dimensionality reduction (DR) techniques were used to reduce the high-dimensional microarray gene data. The DR methods used were the Bessel function, Discrete Cosine Transform (DCT), Least Squares Linear Regression (LSLR), and Artificial Algae Algorithm (AAA). Subsequently, we applied meta-heuristic algorithms, namely the Dragonfly Optimization Algorithm (DOA) and Elephant Herding Optimization Algorithm (EHO), for feature selection. Classifiers such as Nonlinear Regression (NLR), Linear Regression (LR), Gaussian Mixture Model (GMM), Expectation Maximization (EM), Bayesian Linear Discriminant Classifier (BLDC), Logistic Regression (LoR), Softmax Discriminant Classifier (SDC), and Support Vector Machine (SVM) with three types of kernels (Linear, Polynomial, and Radial Basis Function (RBF)) were utilized to detect diabetes. The classifiers' performance was analyzed based on parameters such as accuracy, F1 score, MCC, error rate, FM metric, and Kappa. Without feature selection, the SVM (RBF) classifier achieved a high accuracy of 90% using the AAA DR method. The SVM (RBF) classifier using the AAA DR method with EHO feature selection outperformed the other classifiers with an accuracy of 95.714%. This improvement in classifier accuracy emphasizes the role of feature selection methods.
Affiliation(s)
- Dinesh Chellappan
- Department of Electrical and Electronics Engineering, KPR Institute of Engineering and Technology, Coimbatore 641 407, Tamil Nadu, India
- Harikumar Rajaguru
- Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam 638 401, Tamil Nadu, India
7
Li S, Chen Y, Zhang L, Li R, Kang N, Hou J, Wang J, Bao Y, Jiang F, Zhu R, Wang C, Zhang L. An environment-wide association study for the identification of non-invasive factors for type 2 diabetes mellitus: Analysis based on the Henan Rural Cohort study. Diabetes Res Clin Pract 2023; 204:110917. [PMID: 37748711 DOI: 10.1016/j.diabres.2023.110917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 09/16/2023] [Accepted: 09/21/2023] [Indexed: 09/27/2023]
Abstract
AIM To explore the factors influencing type 2 diabetes mellitus (T2DM) in the rural population of Henan Province and to evaluate the ability of non-invasive factors to predict T2DM. METHODS A total of 30,020 participants from the Henan Rural Cohort Study in China were included in this study. The dataset was randomly divided into a training set and a testing set with a 50:50 split for validation purposes. We used logistic regression analysis to investigate the association between 56 factors and T2DM in the training set (false discovery rate < 5%), and significant factors were further validated in the testing set (P < 0.05). A Gradient Boosting Machine (GBM) model was used to determine the ability of the non-invasive variables to classify T2DM individuals accurately and to rank the importance of these variables. RESULTS The overall population prevalence of T2DM was 9.10%. After adjusting for age, sex, educational level, marital status, and body mass index (BMI), we identified 13 non-invasive variables and 6 blood biochemical indexes associated with T2DM in the training and testing datasets. The top three factors according to the GBM importance ranking were pulse pressure (PP), urine glucose (UGLU), and waist-to-hip ratio (WHR). The GBM model achieved an area under the receiver operating characteristic curve (AUC) of 0.837 with non-invasive variables and 0.847 for the full model. CONCLUSIONS Our findings demonstrate that non-invasive variables that can be easily measured and quickly obtained may be used to predict T2DM risk in rural populations in Henan Province.
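The abstract applies a false discovery rate threshold of 5% to the 56 tested factors but does not name the procedure. A minimal sketch, assuming the Benjamini-Hochberg step-up procedure (an assumption, not something the abstract states), could look like this; the function name and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is declared significant at FDR level alpha.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that satisfies the bound
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for eight candidate factors (not from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(example_p))
```

In this example, several p-values below 0.05 are not declared significant, which is the intended effect of controlling the false discovery rate across many simultaneous tests.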
Affiliation(s)
- Shuoyi Li
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Ying Chen
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Liying Zhang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Ruiying Li
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Ning Kang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Jian Hou
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Jing Wang
- China-Australia Joint Research Center for Infectious Diseases, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, PR China
- Yining Bao
- China-Australia Joint Research Center for Infectious Diseases, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, PR China
- Feng Jiang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Ruifang Zhu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China
- Chongjian Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan 450001, PR China.
- Lei Zhang
- China-Australia Joint Research Center for Infectious Diseases, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, PR China; Artificial Intelligence and Modelling in Epidemiology Program, Melbourne Sexual Health Centre, Alfred Health, Melbourne, Australia; Central Clinical School, Faculty of Medicine, Monash University, Melbourne, Australia.
8
Lee Y, Seo J. Suggestion of statistical validation on feature importance of machine learning. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. [PMID: 38083557 DOI: 10.1109/embc40787.2023.10340208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Feature importance methods are widely used in machine learning analysis for medical datasets as both primary and subsidiary tools. These methods aid in selecting biomarkers or markers indicating target diseases, and can provide valuable insight into the mechanism of a disease. However, simply listing features with their corresponding importance rank is not sufficient to determine the statistical significance of these features. In this paper, we propose a simple method for evaluating the statistical significance of feature importance values and selecting the optimal number of biomarkers. We demonstrate the application of this method using a public open dataset on heart failure. Clinical Relevance: In order for important indicators to be clinically useful, their statistical significance must be defined. By proposing a simple method for calculating statistical significance, this paper enables clinicians to select a group of biomarkers based on their feature importance in a machine learning model. This approach improves the accuracy and effectiveness of clinical decision-making, leading to more precise diagnosis, treatment, and management of various medical conditions.
9
Dong C, Nemet G, Gao X, Barbose G, Sigrin B, O'Shaughnessy E. Machine learning reduces soft costs for residential solar photovoltaics. Sci Rep 2023; 13:7213. [PMID: 37137971 PMCID: PMC10156750 DOI: 10.1038/s41598-023-33014-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 04/05/2023] [Indexed: 05/05/2023] Open
Abstract
Further deployment of rooftop solar photovoltaics (PV) hinges on the reduction of soft (non-hardware) costs, which are now larger and more resistant to reductions than hardware costs. The largest portion of these soft costs is the expenses solar companies incur to acquire new customers. In this study, we demonstrate the value of a shift from significance-based methodologies to prediction-oriented models to better identify PV adopters and reduce soft costs. We employ machine learning to predict PV adopters and non-adopters, and compare its prediction performance with logistic regression, the dominant significance-based method in technology adoption studies. Our results show that machine learning substantially enhances adoption prediction performance: the true positive rate of predicting adopters increased from 66 to 87%, and the true negative rate of predicting non-adopters increased from 75 to 88%. We attribute the enhanced performance to complex variable interactions and nonlinear effects incorporated by machine learning. With more accurate predictions, machine learning is able to reduce customer acquisition costs by 15% ($0.07/Watt) and identify new market opportunities for solar companies to expand and diversify their customer bases. Our research methods and findings provide broader implications for the adoption of similar clean energy technologies and related policy challenges such as market growth and energy inequality.
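The true positive and true negative rates quoted in this abstract are standard confusion-matrix quantities. A short sketch of how they are computed follows; the function name and the toy labels are hypothetical and are not the study's data.

```python
from typing import Sequence, Tuple


def tpr_tnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate) for 0/1 labels.

    TPR = TP / (TP + FN): share of actual adopters predicted as adopters.
    TNR = TN / (TN + FP): share of actual non-adopters predicted as non-adopters.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels (1 = adopter, 0 = non-adopter); illustrative only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(tpr_tnr(actual, predicted))
```

Reporting both rates separately matters when adopters are a small share of households, because overall accuracy can then be dominated by the non-adopter class.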
Affiliation(s)
- Changgui Dong
- School of Public Administration and Policy, Renmin University of China, Beijing, 100872, China.
- Gregory Nemet
- La Follette School of Public Affairs, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Xue Gao
- Department of Political Science, University of Miami, Coral Gables, FL, 33146, USA.
- Askew School of Public Administration and Policy, Florida State University, Tallahassee, FL, 32306, USA.
- Galen Barbose
- Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Benjamin Sigrin
- National Renewable Energy Laboratory, Golden, CO, 80401, USA
10
Cheng YL, Wu YR, Lin KD, Lin CHR, Lin IM. Using Machine Learning for the Risk Factors Classification of Glycemic Control in Type 2 Diabetes Mellitus. Healthcare (Basel) 2023; 11:1141. [PMID: 37107975 PMCID: PMC10138388 DOI: 10.3390/healthcare11081141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 04/05/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023] Open
Abstract
Several risk factors are related to glycemic control in patients with type 2 diabetes mellitus (T2DM), including demographics, medical conditions, negative emotions, lipid profiles, and heart rate variability (HRV; representing cardiac autonomic activity). The interactions between these risk factors remain unclear. This study aimed to use machine learning methods to explore the relationships between various risk factors and glycemic control in T2DM patients. The study utilized a database from Lin et al. (2022) that included 647 T2DM patients. Regression tree analysis was conducted to identify the interactions among risk factors that contribute to glycated hemoglobin (HbA1c) values, and various machine learning methods were compared for their accuracy in classifying T2DM patients. The results of the regression tree analysis revealed that high depression scores may be a risk factor in one subgroup but not in others. When comparing different machine learning classification methods, the random forest algorithm emerged as the best-performing method with a small set of features. Specifically, the random forest algorithm achieved 84% accuracy, 95% area under the curve (AUC), 77% sensitivity, and 91% specificity. Using machine learning methods can provide significant value in accurately classifying patients with T2DM when considering depression as a risk factor.
Affiliation(s)
- Yi-Ling Cheng
- Department of Psychology, College of Humanities and Social Sciences, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
| | - Ying-Ru Wu
- Department of Psychology, College of Humanities and Social Sciences, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
| | | | - Chun-Hung Richard Lin
- Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
| | - I-Mei Lin
- Department of Psychology, College of Humanities and Social Sciences, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
- Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung 807378, Taiwan
| |
|
11
|
Liu X, Huang X, Zhao J, Su Y, Shen L, Duan Y, Gong J, Zhang Z, Piao S, Zhu Q, Rong X, Guo J. Application of machine learning in Chinese medicine differentiation of dampness-heat pattern in patients with type 2 diabetes mellitus. Heliyon 2023; 9:e13289. [PMID: 36873141 PMCID: PMC9975099 DOI: 10.1016/j.heliyon.2023.e13289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 02/15/2023] Open
Abstract
Background: China has become the country with the largest number of people with type 2 diabetes mellitus (T2DM). Chinese medicine (CM) has unique advantages in preventing and treating T2DM, and accurate pattern differentiation is the guarantee for proper treatment. Objective: Establishing a CM pattern differentiation model of T2DM is helpful for the pattern diagnosis of the disease. At present, there are few studies on dampness-heat pattern differentiation models of T2DM. Therefore, we established a machine learning model, hoping to provide an efficient tool for the CM pattern diagnosis of T2DM in the future. Methods: A total of 1021 effective samples of T2DM patients from ten CM hospitals or clinics were collected by a questionnaire covering patients' demographics and dampness-heat-related symptoms and signs. All information and the diagnosis of the dampness-heat pattern of patients were completed by experienced CM physicians at each visit. We applied six machine learning algorithms (Artificial Neural Network [ANN], K-Nearest Neighbor [KNN], Naïve Bayes [NB], Support Vector Machine [SVM], Extreme Gradient Boosting [XGBoost] and Random Forest [RF]) and compared their performance. We then used the Shapley additive explanations (SHAP) method to explain the best-performing model. Results: The XGBoost model had the highest AUC (0.951, 95% CI 0.925-0.978) among the six models, with the best sensitivity, accuracy, F1 score, negative predictive value, and excellent specificity, precision, and positive predictive value. The SHAP method based on XGBoost showed that slimy yellow tongue fur was the most important sign in dampness-heat pattern diagnosis. The slippery pulse or rapid-slippery pulse and sticky stool with ungratifying defecation also played an important role in this diagnostic model. Furthermore, the red tongue acted as an important tongue sign for the dampness-heat pattern.
Conclusion: This study constructed a dampness-heat pattern differentiation model of T2DM based on machine learning. The XGBoost model is a tool with the potential to help CM practitioners make quick diagnosis decisions and contribute to the standardization and international application of CM patterns.
Affiliation(s)
- Xinyu Liu
- Guangdong Metabolic Diseases Research Center of Integrated Chinese and Western Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Key Laboratory of Glucolipid Metabolic Disorder, Ministry of Education of China, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Guangdong TCM Key Laboratory for Metabolic Diseases, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Institute of Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China
| | - Xiaoqiang Huang
- Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou, 510006, China
| | - Jindong Zhao
- The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, 230031, China
| | - Yanjin Su
- Shaanxi University of Chinese Medicine, Xi'an, 712046, China
| | - Lu Shen
- Shaanxi Provincial Hospital of Traditional Chinese Medicine, Xi'an, 710003, China
| | - Yuhong Duan
- Affiliated Hospital of Shaanxi University of Chinese Medicine, Xi'an, 712000, China
| | - Jing Gong
- Department of Integrated Traditional Chinese and Western Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
| | - Zhihai Zhang
- The First Affiliated Hospital of Xiamen University, Xiamen, 361003, China
| | - Shenghua Piao
- Guangdong Metabolic Diseases Research Center of Integrated Chinese and Western Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Key Laboratory of Glucolipid Metabolic Disorder, Ministry of Education of China, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Guangdong TCM Key Laboratory for Metabolic Diseases, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Institute of Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China
| | - Qing Zhu
- Guangdong Metabolic Diseases Research Center of Integrated Chinese and Western Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Key Laboratory of Glucolipid Metabolic Disorder, Ministry of Education of China, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Guangdong TCM Key Laboratory for Metabolic Diseases, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Institute of Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China
| | - Xianglu Rong
- Guangdong Metabolic Diseases Research Center of Integrated Chinese and Western Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Key Laboratory of Glucolipid Metabolic Disorder, Ministry of Education of China, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Guangdong TCM Key Laboratory for Metabolic Diseases, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Institute of Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China
| | - Jiao Guo
- Guangdong Metabolic Diseases Research Center of Integrated Chinese and Western Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Key Laboratory of Glucolipid Metabolic Disorder, Ministry of Education of China, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Guangdong TCM Key Laboratory for Metabolic Diseases, Guangdong Pharmaceutical University, Guangzhou, 510006, China; Institute of Chinese Medicine, Guangdong Pharmaceutical University, Guangzhou, 510006, China
| |
|
12
|
Smail HO, Mohamad DA. Identification of DNA methylation of CAPN10 gene changes in the patients with type 2 diabetes mellitus as a predictive biomarker instead of HbA1c, random blood sugar, lipid profile, kidney function test, and some risk factors. Endocr Regul 2023; 57:221-234. [PMID: 37823570 DOI: 10.2478/enr-2023-0025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023] Open
Abstract
Objective. Type 2 diabetes mellitus (T2DM) is the most common chronic endocrine disorder, affecting an estimated 5-10% of adults worldwide, and its prevalence is rising rapidly in the population of the Kurdistan region. This research aims to identify DNA methylation changes in the CAPN10 gene as a predictive biomarker in T2DM and the association of DNA methylation status with lipid profile and kidney function tests. Methods. The participants (113) were divided into three groups: diabetes group (47), prediabetes group (36), and control group (30). The study was carried out on patients who visited private clinics between August and December 2021 in Koya city, Kurdistan region of Iraq. To determine DNA methylation status, methylation-specific PCR (MSP) with paired primers for each methylated and unmethylated region was used. The Mann-Whitney U test and Spearman's correlation were performed for statistical analysis of data, and a value of p<0.05 was considered significant. Results. The obtained results show that DNA hypermethylation was recorded in the promoter region in the samples of the diabetes and prediabetes groups compared to the healthy group (control). Various factors also affected the level of DNA methylation, such as HbA1c in the prediabetes group and body mass index in the control group. Conclusion. These results indicate that DNA methylation changes in the CAPN10 gene promoter region may be used as a potential predictive biomarker to diagnose T2DM; however, further data are required to support this evidence.
Affiliation(s)
- Harem Othman Smail
- Department of Biology, Faculty of Science and Health, Koya University, Koya KOY45, Kurdistan Region - F.R. Iraq
| | - Dlnya Asaad Mohamad
- Department of Biology, College of Science, University of Sulaimani, Sulaymanyah, Iraq
| |
|
13
|
Afsaneh E, Sharifdini A, Ghazzaghi H, Ghobadi MZ. Recent applications of machine learning and deep learning models in the prediction, diagnosis, and management of diabetes: a comprehensive review. Diabetol Metab Syndr 2022; 14:196. [PMID: 36572938 PMCID: PMC9793536 DOI: 10.1186/s13098-022-00969-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 10/06/2022] [Accepted: 12/16/2022] [Indexed: 12/28/2022] Open
Abstract
Diabetes as a metabolic illness can be characterized by increased amounts of blood glucose. This abnormal increase can lead to critical detriment to the other organs such as the kidneys, eyes, heart, nerves, and blood vessels. Therefore, its prediction, prognosis, and management are essential to prevent harmful effects and also recommend more useful treatments. For these goals, machine learning algorithms have found considerable attention and have been developed successfully. This review surveys the recently proposed machine learning (ML) and deep learning (DL) models for the objectives mentioned earlier. The reported results disclose that the ML and DL algorithms are promising approaches for controlling blood glucose and diabetes. However, they should be improved and employed in large datasets to affirm their applicability.
|
14
|
Srinivasu PN, Shafi J, Krishna TB, Sujatha CN, Praveen SP, Ijaz MF. Using Recurrent Neural Networks for Predicting Type-2 Diabetes from Genomic and Tabular Data. Diagnostics (Basel) 2022; 12:3067. [PMID: 36553074 PMCID: PMC9776641 DOI: 10.3390/diagnostics12123067] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2022] [Revised: 12/01/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022] Open
Abstract
The development of genomic technology for smart diagnosis and therapies for various diseases has lately been the most demanding area for computer-aided diagnostic and treatment research. Exponential breakthroughs in artificial intelligence and machine intelligence technologies could pave the way for identifying challenges afflicting the healthcare industry. Genomics is paving the way for predicting future illnesses, including cancer, Alzheimer's disease, and diabetes. Machine learning advancements have expedited the pace of biomedical informatics research and inspired new branches of computational biology. Furthermore, knowing gene relationships has resulted in developing more accurate models that can effectively detect patterns in vast volumes of data, making classification models important in various domains. Recurrent Neural Network models have a memory that allows them to retain knowledge from previous cycles and process genetic data. The present work focuses on type 2 diabetes prediction using gene sequences derived from genomic DNA fragments through automated feature selection and feature extraction procedures for matching gene patterns with training data. The suggested model was tested using tabular data to predict type 2 diabetes based on several parameters. The performance of neural networks incorporating Recurrent Neural Network (RNN) components, Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) was tested in this research. The model's efficiency is assessed using evaluation metrics such as Sensitivity, Specificity, Accuracy, F1-Score, and Matthews Correlation Coefficient (MCC). The suggested technique predicted future illnesses with fair Accuracy. Furthermore, our research showed that the suggested model could be used in real-world scenarios and that input risk variables from an end-user Android application could be kept and evaluated on a secure remote server.
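As an illustration of the evaluation metrics named in this abstract, the following minimal Python sketch derives sensitivity, specificity, accuracy, F1-score, and MCC from binary confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical; they are not taken from the study.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)            # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)            # true negative rate
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient: ranges from -1 to +1.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when classes are imbalanced.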
Affiliation(s)
- Parvathaneni Naga Srinivasu
- Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada 520007, Andhra Pradesh, India
| | - Jana Shafi
- Department of Computer Science, College of Arts and Science, Prince Sattam bin Abdul Aziz University, Wadi Ad-Dawasir 11991, Saudi Arabia
| | - T Balamurali Krishna
- Department of Computer Science and Engineering, Dhanekula Institute of Engineering and Technology, Vijayawada 521139, Andhra Pradesh, India
| | - Canavoy Narahari Sujatha
- Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad 501301, Telangana, India
| | - S Phani Praveen
- Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada 520007, Andhra Pradesh, India
| | - Muhammad Fazal Ijaz
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Republic of Korea
| |
|
15
|
Kanda E, Suzuki A, Makino M, Tsubota H, Kanemata S, Shirakawa K, Yajima T. Machine learning models for prediction of HF and CKD development in early-stage type 2 diabetes patients. Sci Rep 2022; 12:20012. [PMID: 36411366 PMCID: PMC9678863 DOI: 10.1038/s41598-022-24562-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/14/2022] [Accepted: 11/17/2022] [Indexed: 11/23/2022] Open
Abstract
Chronic kidney disease (CKD) and heart failure (HF) are the first and most frequent comorbidities associated with mortality risks in early-stage type 2 diabetes mellitus (T2DM). However, efficient screening and risk assessment strategies for identifying T2DM patients at high risk of developing CKD and/or HF (CKD/HF) remain to be established. This study aimed to generate a novel machine learning (ML) model to predict the risk of developing CKD/HF in early-stage T2DM patients. The models were derived from a retrospective cohort of 217,054 T2DM patients without a history of cardiovascular and renal diseases extracted from a Japanese claims database. Among the ML algorithms used, extreme gradient boosting exhibited the best performance for CKD/HF diagnosis and hospitalization after internal validation and was further validated using another dataset including 16,822 patients. In the external validation, the 5-year prediction areas under the receiver operating characteristic curve for CKD/HF diagnosis and hospitalization were 0.718 and 0.837, respectively. In Kaplan-Meier curve analysis, patients predicted to be at high risk showed a significant increase in CKD/HF diagnosis and hospitalization compared with those at low risk. Thus, the developed model predicted the risk of developing CKD/HF in T2DM patients with reasonable probability in the external validation cohort. A clinical approach that identifies T2DM patients at high risk of developing CKD/HF using ML models may contribute to improved prognosis by promoting early diagnosis and intervention.
Affiliation(s)
- Eiichiro Kanda
- Medical Science, Kawasaki Medical University, Okayama, Japan
| | - Atsushi Suzuki
- Department of Endocrinology, Diabetes and Metabolism, Fujita Health University, Toyoake, Aichi, Japan
| | - Masaki Makino
- Department of Endocrinology, Diabetes and Metabolism, Fujita Health University, Toyoake, Aichi, Japan
| | - Hiroo Tsubota
- AstraZeneca K.K., Osaka, Japan
| | - Satomi Kanemata
- Ono Pharmaceutical Co., Ltd., Osaka, Japan
| | | | | |
|
16
|
Mao Y, Zhu Z, Pan S, Lin W, Liang J, Huang H, Li L, Wen J, Chen G. Value of machine learning algorithms for predicting diabetes risk: A subset analysis from a real-world retrospective cohort study. J Diabetes Investig 2022; 14:309-320. [PMID: 36345236 PMCID: PMC9889616 DOI: 10.1111/jdi.13937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 10/04/2022] [Accepted: 10/16/2022] [Indexed: 11/11/2022] Open
Abstract
AIMS/INTRODUCTION To compare the application value of different machine learning (ML) algorithms for diabetes risk prediction. MATERIALS AND METHODS This is a 3-year retrospective cohort study with a total of 3,687 participants included in the data analysis. Modeling variable screening and predictive model building were carried out using logistic regression (LR) analysis and 10-fold cross-validation, respectively. In total, six different ML algorithms, including random forests, light gradient boosting machine, extreme gradient boosting, adaptive boosting (AdaBoost), multi-layer perceptrons and Gaussian naive Bayes, were used for model construction. Model performance was mainly evaluated by the area under the receiver operating characteristic curve (AUC). The best-performing ML model was selected for comparison with the traditional LR model and visualized using Shapley additive explanations. RESULTS A total of eight risk factors most associated with the development of diabetes were identified by univariate and multivariate LR analysis, and they were visualized in the form of a nomogram. Among the six different ML models, the random forests model had the best predictive performance. After 10-fold cross-validation, its optimal model had an AUC of 0.855 (95% confidence interval [CI] 0.823-0.886) in the training set and 0.835 (95% CI 0.779-0.892) in the test set. The traditional LR model had an AUC of 0.840 (95% CI 0.814-0.866) in the training set and 0.834 (95% CI 0.785-0.884) in the test set. CONCLUSIONS In real-world epidemiological research, combining traditional variable screening with ML algorithms to construct a diabetes risk prediction model has satisfactory clinical application value.
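The area under the receiver operating characteristic curve used here to compare models can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 outcomes; scores: predicted risk for each case.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: six hypothetical participants with predicted diabetes risk.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; rank-based formulas give the same value more efficiently on large cohorts.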
Affiliation(s)
- Yaqian Mao
- Department of Internal Medicine, Fujian Provincial Hospital South Branch, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Zheng Zhu
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Shuyao Pan
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Wei Lin
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Jixing Liang
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Huibin Huang
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Liantao Li
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Junping Wen
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China
| | - Gang Chen
- Department of Endocrinology, Fujian Provincial Hospital, Shengli Clinical Medical College of Fujian Medical University, Fuzhou, China; Fujian Provincial Key Laboratory of Medical Analysis, Fujian Academy of Medical, Fuzhou, China
| |
|
17
|
Genis-Mendoza AD, González-Castro TB, Tovilla-Vidal G, Juárez-Rojop IE, Castillo-Avila RG, López-Narváez ML, Tovilla-Zárate CA, Sánchez-de la Cruz JP, Fresán A, Nicolini H. Increased Levels of HbA1c in Individuals with Type 2 Diabetes and Depression: A Meta-Analysis of 34 Studies with 68,398 Participants. Biomedicines 2022; 10:biomedicines10081919. [PMID: 36009468 PMCID: PMC9405837 DOI: 10.3390/biomedicines10081919] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2022] [Revised: 07/19/2022] [Accepted: 07/23/2022] [Indexed: 01/10/2023] Open
Abstract
Glycosylated hemoglobin is used to diagnose type 2 diabetes mellitus and assess metabolic control. Depression itself has been associated with high levels of HbA1c in individuals with T2DM. The association between diabetes and depression suggests the usefulness of determining HbA1c as a biological marker of depressive symptoms. The aim of this study was to determine HbA1c levels in individuals with T2DM with vs. without depression. Additionally, we analyzed the influence of pharmacological treatments, time of evolution, and complications of disease. We performed a literature search in different databases published up to January 2020. A total of 34 articles were included. Our results showed that individuals with T2DM with depression showed increased levels of HbA1c in comparison to individuals with T2DM without depression (d = 0.18, 95% CI: 0.12–0.29, p(Z) < 0.001; I² = 85.00). We also found that HbA1c levels remained elevated in individuals with T2DM with depression who were taking hypoglycemic drugs (d = 0.20, 95% CI: 0.11–0.30, p(Z) < 0.001; I² = 86.80), in individuals with less than 10 years of evolution (d = 0.17, 95% CI: 0.09–0.26, p(Z) = 0.001; I² = 66.03) and in individuals with complications of the disease (d = 0.17, 95% CI: 0.07–0.26, p(Z) < 0.001; I² = 58.41). Our results show that HbA1c levels in individuals with T2DM with depression are significantly increased compared to controls with T2DM without depression. Additionally, these levels remained elevated in individuals who were taking hypoglycemic drugs, those with less than 10 years of disease evolution, and those with complications related to diabetes. It is necessary to examine the existence of a diabetes-HbA1c-depression connection.
Affiliation(s)
- Alma Delia Genis-Mendoza
- Laboratorio de Genómica de Enfermedades Psiquiátricas y Neurodegenerativas, Instituto Nacional de Medicina Genómica, Ciudad de México 14610, Mexico
| | - Thelma Beatriz González-Castro
- División Académica Multidisciplinaria de Jalpa de Méndez, Universidad Juárez Autónoma de Tabasco, Jalpa de Méndez 86040, Tabasco, Mexico
| | - Gisselle Tovilla-Vidal
- División Académica de Ciencias de la Salud, Universidad Juárez Autónoma de Tabasco, Villahermosa 86100, Tabasco, Mexico
| | - Isela Esther Juárez-Rojop
- División Académica de Ciencias de la Salud, Universidad Juárez Autónoma de Tabasco, Villahermosa 86100, Tabasco, Mexico
| | - Rosa Giannina Castillo-Avila
- División Académica de Ciencias de la Salud, Universidad Juárez Autónoma de Tabasco, Villahermosa 86100, Tabasco, Mexico
| | - María Lilia López-Narváez
- Hospital Chiapas Nos Une “Dr. Gilberto Gómez Maza”, Secretaría de Salud de Chiapas, Tuxtla Gutiérrez 29045, Chiapas, Mexico
| | - Carlos Alfonso Tovilla-Zárate
- División Académica Multidisciplinaria de Comalcalco, Universidad Juárez Autónoma de Tabasco, Comalcalco 86040, Tabasco, Mexico
- Correspondence: Tel.: +52-993-358-1500 (ext. 6901) (C.A.T.-Z.); +52-5350-1900 (ext. 1197) (H.N.)
| | - Juan Pablo Sánchez-de la Cruz
- División Académica Multidisciplinaria de Comalcalco, Universidad Juárez Autónoma de Tabasco, Comalcalco 86040, Tabasco, Mexico
| | - Ana Fresán
- Subdirección de Investigaciones Clínicas, Instituto Nacional de Psiquiatría Ramón de la Fuente Muñíz, Ciudad de México 14370, Mexico
| | - Humberto Nicolini
- Laboratorio de Genómica de Enfermedades Psiquiátricas y Neurodegenerativas, Instituto Nacional de Medicina Genómica, Ciudad de México 14610, Mexico
- Correspondence: Tel.: +52-993-358-1500 (ext. 6901) (C.A.T.-Z.); +52-5350-1900 (ext. 1197) (H.N.)
| |
|
18
|
Motaib I, Aitlahbib F, Fadil A, Z Rhmari Tlemcani F, Elamari S, Laidi S, Chadli A. Predicting poor glycemic control during Ramadan among non-fasting patients with diabetes using artificial intelligence based machine learning models. Diabetes Res Clin Pract 2022; 190:109982. [PMID: 35803316 DOI: 10.1016/j.diabres.2022.109982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 06/17/2022] [Accepted: 07/04/2022] [Indexed: 11/30/2022]
Abstract
AIMS This study aims to predict poor glycemic control during Ramadan among non-fasting patients with diabetes using machine learning models. METHODS First, we conducted three consultations, before, during, and after Ramadan, to assess demographics, diabetes history, caloric intake, and anthropometric and metabolic parameters. Second, machine learning techniques (Logistic Regression, Support Vector Machine, Naive Bayes, K-nearest neighbor, Decision Tree, Random Forest, Extra Trees Classifier and CatBoost) were trained using the data to predict poor glycemic control among patients. Then, we conducted several simulations with the best-performing machine learning model using the variables found to be the main predictors of poor glycemic control. RESULTS The prevalence of poor glycemic control among patients was 52.6%. The Extra Trees Classifier was the best-performing model for glycemic deterioration (accuracy = 0.87, AUC = 0.87). Caloric intake evolution, gender, baseline caloric intake, baseline weight, BMI variation, waist circumference evolution and total cholesterol serum level after Ramadan were selected as the most significant for the prediction of poor glycemic control. We determined, for each predictive factor, the thresholds at which this risk is present. CONCLUSIONS The clinical use of our findings may help to improve glycemic control during Ramadan among patients who do not fast by targeting risk factors of poor glycemic control.
Affiliation(s)
- Imane Motaib
- Department of Endocrinology Diabetology Metabolic Disease and Nutrition, Cheikh Khalifa International University Hospital, Faculty of Medicine, Mohammed VI University of Health Sciences (UM6SS), Casablanca, Morocco.
| | - Faiçal Aitlahbib
- Hassania School of Public Works, Casablanca, Morocco; Office Chérifien des Phosphates (OCP), Casablanca, Morocco
| | | | - Fatima Z Rhmari Tlemcani
- Department of Endocrinology Diabetology Metabolic Disease and Nutrition, Cheikh Khalifa International University Hospital, Faculty of Medicine, Mohammed VI University of Health Sciences (UM6SS), Casablanca, Morocco
| | - Saloua Elamari
- Department of Endocrinology Diabetology Metabolic Disease and Nutrition, Cheikh Khalifa International University Hospital, Faculty of Medicine, Mohammed VI University of Health Sciences (UM6SS), Casablanca, Morocco
| | - Soukaina Laidi
- Department of Endocrinology Diabetology Metabolic Disease and Nutrition, Cheikh Khalifa International University Hospital, Faculty of Medicine, Mohammed VI University of Health Sciences (UM6SS), Casablanca, Morocco
| | - Asma Chadli
- Department of Endocrinology Diabetology Metabolic Disease and Nutrition, Cheikh Khalifa International University Hospital, Faculty of Medicine, Mohammed VI University of Health Sciences (UM6SS), Casablanca, Morocco
| |
|
19
|
Liu Q, Zhou Q, He Y, Zou J, Guo Y, Yan Y. Predicting the 2-Year Risk of Progression from Prediabetes to Diabetes Using Machine Learning among Chinese Elderly Adults. J Pers Med 2022; 12:jpm12071055. [PMID: 35887552 PMCID: PMC9324396 DOI: 10.3390/jpm12071055] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/10/2022] [Revised: 06/06/2022] [Accepted: 06/23/2022] [Indexed: 11/18/2022] Open
Abstract
Identifying people with a high risk of developing diabetes among those with prediabetes may facilitate the implementation of targeted lifestyle and pharmacological interventions. We aimed to establish machine learning models based on demographic and clinical characteristics to predict the risk of incident diabetes. We used data from the free medical examination service project for elderly people who were 65 years or older to develop logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost) machine learning models for the follow-up results of 2019 and 2020 and performed internal validation. The receiver operating characteristic (ROC), sensitivity, specificity, accuracy, and F1 score were used to select the model with better performance. The average annual progression rate to diabetes in prediabetic elderly people was 14.21%. Each model was trained using eight features and one outcome variable from 9607 prediabetic individuals, and the performance of the models was assessed in 2402 prediabetes patients. The predictive ability of the four models in the first year was better than in the second year. The XGBoost model performed relatively well (ROC: 0.6742 for 2019 and 0.6707 for 2020). We established and compared four machine learning models to predict the risk of progression from prediabetes to diabetes. Although there was little difference in the performance of the four models, the XGBoost model had a relatively good ROC value and might perform well in future exploration in this field.
Affiliation(s)
- Qing Liu
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (Q.Z.)
| | - Qing Zhou
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (Q.Z.)
| | - Yifeng He
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
| | - Jingui Zou
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
| | - Yan Guo
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
| | - Yaqiong Yan
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
|
20
|
Liu Q, Zhang M, He Y, Zhang L, Zou J, Yan Y, Guo Y. Predicting the Risk of Incident Type 2 Diabetes Mellitus in Chinese Elderly Using Machine Learning Techniques. J Pers Med 2022; 12:jpm12060905. [PMID: 35743691 PMCID: PMC9224915 DOI: 10.3390/jpm12060905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/09/2022] [Revised: 05/21/2022] [Accepted: 05/27/2022] [Indexed: 02/04/2023] Open
Abstract
Early identification of individuals at high risk of diabetes is crucial for implementing early intervention strategies. However, algorithms specific to elderly Chinese adults are lacking. The aim of this study is to build effective prediction models based on machine learning (ML) for the risk of type 2 diabetes mellitus (T2DM) in elderly Chinese adults. A retrospective cohort study was conducted using the health screening data of adults older than 65 years in Wuhan, China from 2018 to 2020. After strict data filtering, 127,031 records from the eligible participants were utilized. Overall, 8298 participants were diagnosed with incident T2DM during the 2-year follow-up (2019–2020). The dataset was randomly split into a training set (n = 101,625) and a test set (n = 25,406). We developed prediction models based on four ML algorithms: logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost). Using LASSO regression, 21 prediction features were selected. Random under-sampling (RUS) was applied to address the class imbalance, and Shapley Additive Explanations (SHAP) were used to calculate and visualize feature importance. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The XGBoost model achieved the best performance (AUC = 0.7805, sensitivity = 0.6452, specificity = 0.7577, accuracy = 0.7503). Fasting plasma glucose (FPG), education, exercise, gender, and waist circumference (WC) were the top five important predictors. This study showed that the XGBoost model can be applied to screen individuals at high risk of T2DM in the early phase and has strong potential for intelligent prevention and control of diabetes. The key features could also be useful for developing targeted diabetes prevention interventions.
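The abstract mentions random under-sampling (RUS) as the remedy for class imbalance but gives no implementation details. The following Python sketch shows the general idea only: every class is randomly reduced to the size of the smallest class. The function name `random_undersample`, the seed, and the toy data are assumptions made for illustration and do not describe the authors' pipeline.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes are equal in size.

    rows:   list of feature records (any objects).
    labels: list of class labels aligned with rows.
    seed:   fixed seed so the illustration is reproducible (an assumption).
    """
    rng = random.Random(seed)

    # Group the records by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is reduced to the size of the rarest class.
    n_min = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        kept = rng.sample(members, n_min)  # sampling without replacement
        out_rows.extend(kept)
        out_labels.extend([label] * n_min)
    return out_rows, out_labels


# Toy data: 10 hypothetical incident cases among 100 records.
toy_rows = list(range(100))
toy_labels = [1] * 10 + [0] * 90
balanced_rows, balanced_labels = random_undersample(toy_rows, toy_labels)
```

Resampling of this kind is normally applied to the training split only, so that the test split keeps the real-world class ratio; whether the study did so is not stated in the abstract.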
Affiliation(s)
- Qing Liu
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (M.Z.)
| | - Miao Zhang
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (M.Z.)
| | - Yifeng He
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
| | - Lei Zhang
- School of Mathematics and Statistics, Wuhan University, Wuhan 430070, China;
| | - Jingui Zou
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
| | - Yaqiong Yan
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
| | - Yan Guo
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
|
21
|
Research Progress in the Early Warning of Chicken Diseases by Monitoring Clinical Symptoms. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12115601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
Global animal protein consumption has been steadily increasing as a result of population growth and the increasing demand for nutritious diets. The poultry industry provides a large portion of meat and eggs for human consumption. The early detection and warning of poultry infectious diseases play a critical role in the poultry breeding and production systems, improving animal welfare and reducing losses. However, inadequate methods for the early detection and prevention of infectious diseases in poultry farms sometimes fail to prevent decreased productivity and even widespread mortality. The health status of poultry is often reflected by its individual physiological, physical and behavioral clinical symptoms, such as higher body temperature resulting from fever, abnormal vocalization caused by respiratory disease and abnormal behaviors due to pathogenic infection. Therefore, the use of technologies for symptom detection can monitor the health status of broilers and laying hens in a continuous, noninvasive and automated way, and potentially assist in the early warning decision-making process. This review summarized recent literature on poultry disease detection and highlighted clinical symptom-monitoring technologies for sick poultry. The review concluded that current technologies are already showing their superiority to manual inspection, but the clinical symptom-based monitoring systems have not been fully utilized for on-farm early detection.
|
22
|
Zhang L, Niu M, Zhang H, Wang Y, Zhang H, Mao Z, Zhang X, He M, Wu T, Wang Z, Wang C. Nonlaboratory-based risk assessment model for coronary heart disease screening: Model development and validation. Int J Med Inform 2022; 162:104746. [PMID: 35325662 DOI: 10.1016/j.ijmedinf.2022.104746] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Revised: 03/14/2022] [Accepted: 03/15/2022] [Indexed: 12/11/2022]
Abstract
BACKGROUND Identifying groups at high risk of coronary heart disease (CHD) is important to reduce mortality due to CHD. Although machine learning methods have been introduced, many require laboratory or imaging parameters, which are not always readily available; thus, their wide applications are limited. OBJECTIVE The aim of this study was to develop and validate a simple, efficient, and joint machine learning model for identifying individuals at high risk of CHD using easily obtainable nonlaboratory parameters. METHODS This prospective study used data from the Henan Rural Cohort Study, which was conducted in rural areas of Henan Province, China, between July 2015 and September 2017. A joint machine learning model was developed by selecting and combining four base machine learning algorithms, including logistic regression (LR), artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM). We used readily accessible variables, including demographics, medical and family history, lifestyle and dietary factors, and anthropometric data, to inform the model. The model was also externally validated by a cohort of individuals from the Dongfeng-Tongji cohort study. Model discrimination was assessed by using the area under the receiver operating characteristic curve (AUC), and calibration was measured by using the Brier score (BS). RESULTS A total of 38,716 participants (mean [SD] age, 55.64 [12.19] years; 23,449 [60.6%] female) from the Henan Rural Cohort Study and 17,958 subjects (mean [SD] age, 62.74 [7.59] years; 10,076 [56.1%] female) from the Dongfeng-Tongji cohort study were included in the analysis. Age, waist circumference, pulse pressure, heart rate, family history of CHD, education level, family history of type 2 diabetes mellitus (T2DM), and family history of dyslipidaemia were strongly associated with the development of CHD. In internal validation, the model demonstrated good discrimination (AUC, 0.844 (95% CI 0.828-0.860)) and acceptable calibration (BS, 0.066). In external validation, the model performed well, with clearly useful discrimination (AUC, 0.792 (95% CI 0.774-0.810)) and robust calibration (BS, 0.069). CONCLUSIONS In this study, the novel, simple machine learning-based model comprising readily accessible variables accurately identified individuals at high risk of CHD. This model has the potential to be widely applied for large-scale screening of CHD populations, especially in medical resource-constrained settings. TRIAL REGISTRATION The Henan Rural Cohort Study has been registered at the Chinese Clinical Trial Register. (Trial registration: ChiCTR-OOC-15006699. Registered 6 July 2015 - Retrospectively registered) http://www.chictr.org.cn/showproj.aspx?proj=11375.
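The Brier score used above for calibration is the mean squared difference between each predicted probability and the observed binary outcome, so lower is better and 0 is a perfect forecast. A minimal Python sketch follows; the function name `brier_score` and the toy probabilities are illustrative assumptions and are unrelated to the cohort data.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Toy predictions, invented for illustration only.
perfect = brier_score([0.0, 1.0], [0, 1])       # confident and correct
uninformative = brier_score([0.5, 0.5], [0, 1])  # always predicts 50%
```

One caveat when reading reported values: for rare outcomes a model that always predicts the base rate already achieves a small Brier score, so the score is best interpreted alongside the outcome prevalence.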
Affiliation(s)
- Liying Zhang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, PR China; Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Miaomiao Niu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Haiyang Zhang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Yikang Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Haiqing Zhang
- Department of Occupational and Environmental Health, Key Laboratory of Environment and Health, Ministry of Education and State Key Laboratory of Environmental Health (Incubating) School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
| | - Zhenxing Mao
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Xiaomin Zhang
- Department of Occupational and Environmental Health, Key Laboratory of Environment and Health, Ministry of Education and State Key Laboratory of Environmental Health (Incubating) School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
| | - Meian He
- Department of Occupational and Environmental Health, Key Laboratory of Environment and Health, Ministry of Education and State Key Laboratory of Environmental Health (Incubating) School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
| | - Tangchun Wu
- Department of Occupational and Environmental Health, Key Laboratory of Environment and Health, Ministry of Education and State Key Laboratory of Environmental Health (Incubating) School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
| | - Zhenfei Wang
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan, PR China.
| | - Chongjian Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China.
|
23
|
Karaglani M, Panagopoulou M, Cheimonidi C, Tsamardinos I, Maltezos E, Papanas N, Papazoglou D, Mastorakos G, Chatzaki E. Liquid Biopsy in Type 2 Diabetes Mellitus Management: Building Specific Biosignatures via Machine Learning. J Clin Med 2022; 11:1045. [PMID: 35207316 PMCID: PMC8876363 DOI: 10.3390/jcm11041045] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/18/2022] [Revised: 02/09/2022] [Accepted: 02/15/2022] [Indexed: 02/05/2023] Open
Abstract
BACKGROUND The need for minimally invasive biomarkers for the early diagnosis of type 2 diabetes (T2DM) prior to clinical onset and for monitoring pancreatic β-cell loss is emerging. Here, we focused on studying circulating cell-free DNA (ccfDNA) as a liquid biopsy biomaterial for accurate diagnosis/monitoring of T2DM. METHODS ccfDNA levels were directly quantified in sera from 96 T2DM patients and 71 healthy individuals via fluorometry, and then DNA fragment size profiling was performed by capillary electrophoresis. Following this, ccfDNA methylation levels of five β-cell-related genes were measured via qPCR. Data were analyzed by automated machine learning to build classifying predictive models. RESULTS ccfDNA levels were found to be similar between groups but indicative of apoptosis in T2DM. INS (Insulin), IAPP (Islet Amyloid Polypeptide-Amylin), GCK (Glucokinase), and KCNJ11 (Potassium Inwardly Rectifying Channel Subfamily J member 11) levels differed significantly between groups. AutoML analysis delivered biosignatures including GCK, IAPP and KCNJ11 methylation, with the highest ever reported discriminating performance of T2DM from healthy individuals (AUC 0.927). CONCLUSIONS Our data reveal the value of ccfDNA as a minimally invasive biomaterial carrying important clinical information for T2DM. Upon prospective clinical evaluation, the built biosignature can be disruptive for T2DM clinical management.
Affiliation(s)
- Makrina Karaglani
- Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (M.K.); (M.P.); (C.C.)
| | - Maria Panagopoulou
- Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (M.K.); (M.P.); (C.C.)
| | - Christina Cheimonidi
- Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (M.K.); (M.P.); (C.C.)
| | - Ioannis Tsamardinos
- JADBio Gnosis DA, Science and Technology Park of Crete, 71500 Heraklion, Greece;
| | - Efstratios Maltezos
- Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece; (E.M.); (N.P.); (D.P.)
| | - Nikolaos Papanas
- Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece; (E.M.); (N.P.); (D.P.)
| | - Dimitrios Papazoglou
- Diabetes Centre, 2nd Department of Internal Medicine, Democritus University of Thrace, University Hospital of Alexandroupolis, 68100 Alexandroupolis, Greece; (E.M.); (N.P.); (D.P.)
| | - George Mastorakos
- Endocrine Unit, 2nd Department of Obstetrics and Gynecology, National and Kapodistrian University of Athens, “Aretaieion” University Hospital, 11528 Athens, Greece;
| | - Ekaterini Chatzaki
- Laboratory of Pharmacology, Department of Medicine, Democritus University of Thrace, 68100 Alexandroupolis, Greece; (M.K.); (M.P.); (C.C.)
- Institute of Agri-Food and Life Sciences, Hellenic Mediterranean University Research Centre, 71003 Heraklion, Greece
|
24
|
Machine learning-based diagnosis and risk factor analysis of cardiocerebrovascular disease based on KNHANES. Sci Rep 2022; 12:2250. [PMID: 35145205 PMCID: PMC8831514 DOI: 10.1038/s41598-022-06333-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/23/2021] [Accepted: 01/25/2022] [Indexed: 12/31/2022] Open
Abstract
The prevalence of cardiocerebrovascular disease (CVD) is continuously increasing, and it is the leading cause of human death. Since it is difficult for physicians to screen thousands of people, high-accuracy and interpretable methods are needed. We developed four machine learning-based CVD classifiers (i.e., multi-layer perceptron, support vector machine, random forest, and light gradient boosting) based on the Korea National Health and Nutrition Examination Survey (KNHANES). We resampled and rebalanced KNHANES data using complex sampling weights such that the rebalanced dataset mimics a uniformly sampled dataset from the overall population. For clear risk factor analysis, we removed multicollinearity and CVD-irrelevant variables using VIF-based filtering and the Boruta algorithm. We applied the synthetic minority oversampling technique and random undersampling before ML training. We demonstrated that the proposed classifiers achieved excellent performance with AUCs over 0.853. Using Shapley value-based risk factor analysis, we identified that the most significant risk factors of CVD were age, sex, and the prevalence of hypertension. Additionally, we identified that age, hypertension, and BMI were positively correlated with CVD prevalence, while sex (female), alcohol consumption, and monthly income were negatively correlated. The results showed that the feature selection and the class balancing technique effectively improved the interpretability of the models.
|
25
|
Haneef R, Tijhuis M, Thiébaut R, Májek O, Pristaš I, Tolenan H, Gallay A. Methodological guidelines to estimate population-based health indicators using linked data and/or machine learning techniques. Arch Public Health 2022; 80:9. [PMID: 34983651 PMCID: PMC8725299 DOI: 10.1186/s13690-021-00770-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2021] [Accepted: 12/17/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The capacity to use data linkage and artificial intelligence to estimate and predict health indicators varies across European countries. However, the estimation of health indicators from linked administrative data is challenging for several reasons, such as variability in data sources and data collection methods resulting in reduced interoperability at various levels and timeliness, availability of a large number of variables, and lack of skills and capacity to link and analyze big data. The main objective of this study is to develop methodological guidelines for calculating population-based health indicators, to guide European countries in using linked data and/or machine learning (ML) techniques with new methods. METHOD We systematically followed a step-wise approach to develop the methodological guidelines: i. Scientific literature review, ii. Identification of inspiring examples from European countries, and iii. Developing the checklist of guidelines contents. RESULTS We have developed the methodological guidelines, which provide a systematic approach for studies using linked data and/or ML-techniques to produce population-based health indicators. These guidelines include a detailed checklist of the following items: rationale and objective of the study (i.e., research question), study design, linked data sources, study population/sample size, study outcomes, data preparation, data analysis (i.e., statistical techniques, sensitivity analysis and potential issues during data analysis) and study limitations. CONCLUSIONS This is the first study to develop methodological guidelines for studies focused on population health using linked data and/or machine learning techniques. These guidelines would support researchers in adopting and developing a systematic approach for high-quality research methods. There is a need for high-quality research methodologies using more linked data and ML-techniques to develop a structured cross-disciplinary approach for improving the population health information and thereby the population health.
Affiliation(s)
- Romana Haneef
- Department of Non-Communicable Diseases and Injuries, Santé Publique France, Saint-Maurice, France.
| | - Mariken Tijhuis
- National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
| | - Rodolphe Thiébaut
- Bordeaux University, Bordeaux School of Public Health, Bordeaux, France; INSERM / INRIA SISTM team, Bordeaux Population health, Bordeaux, France; Medical Information Department, Bordeaux University Hospital, Bordeaux, France
| | - Ondřej Májek
- Institute of Health Information and Statistics of the Czech Republic, Prague, Czech Republic; Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Brno, Czech Republic
| | - Ivan Pristaš
- National Institute of public health, division of health informatics and biostatistics, Zagreb, Croatia
| | - Hanna Tolenan
- Finnish Institute for Health and Welfare (THL), Helsinki, Finland
| | - Anne Gallay
- Department of Non-Communicable Diseases and Injuries, Santé Publique France, Saint-Maurice, France
|
26
|
AIM in Endocrinology. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2022]
|
27
|
Liu X, Zhang W, Zhang Q, Chen L, Zeng T, Zhang J, Min J, Tian S, Zhang H, Huang H, Wang P, Hu X, Chen L. Development and validation of a machine learning-augmented algorithm for diabetes screening in community and primary care settings: A population-based study. Front Endocrinol (Lausanne) 2022; 13:1043919. [PMID: 36518245 PMCID: PMC9742532 DOI: 10.3389/fendo.2022.1043919] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2022] [Accepted: 11/11/2022] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Timely screening for diabetes is crucial to reduce its related morbidity, mortality, and socioeconomic burden. Machine learning (ML) has excellent capability to maximize predictive accuracy. We aim to develop ML-augmented models for diabetes screening in community and primary care settings. METHODS A total of 8425 participants from a population-based study conducted in Hubei, China since 2011 were included. The dataset was split into a development set and a testing set. Seven different ML algorithms were compared to generate predictive models. Non-laboratory features were employed in the ML model for community settings, and laboratory test features were further introduced in the ML+lab models for primary care. The area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (auPR), and the average detection costs per participant of these models were compared with their counterparts based on the New China Diabetes Risk Score (NCDRS) currently recommended for diabetes screening. RESULTS The AUC and auPR of the ML model were 0.697 and 0.303 in the testing set, seemingly outperforming those of NCDRS by 10.99% and 64.67%, respectively. The average detection cost of the ML model was 12.81% lower than that of NCDRS with the same sensitivity (0.72). Moreover, the average detection cost of the ML+FPG model was the lowest among the ML+lab models and less than that of the ML model and NCDRS+FPG model. CONCLUSION The ML model and the ML+FPG model achieved higher predictive accuracy and lower detection costs than their counterparts based on NCDRS. Thus, the ML-augmented algorithm has the potential to be employed for diabetes screening in community and primary care settings.
Affiliation(s)
- XiaoHuan Liu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - Weiyue Zhang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - Qiao Zhang
- Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Long Chen
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
| | - TianShu Zeng
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - JiaoYue Zhang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - Jie Min
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - ShengHua Tian
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | - Hao Zhang
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
| | | | - Ping Wang
- Precision Health Program, Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, United States
| | - Xiang Hu
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- *Correspondence: LuLu Chen; Xiang Hu
| | - LuLu Chen
- Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hubei provincial Clinical Research Center for Diabetes and Metabolic Disorders, Wuhan, China
- *Correspondence: LuLu Chen; Xiang Hu
|
28
|
Fregoso-Aparicio L, Noguez J, Montesinos L, García-García JA. Machine learning and deep learning predictive models for type 2 diabetes: a systematic review. Diabetol Metab Syndr 2021; 13:148. [PMID: 34930452 PMCID: PMC8686642 DOI: 10.1186/s13098-021-00767-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 07/06/2021] [Accepted: 12/07/2021] [Indexed: 12/12/2022] Open
Abstract
Diabetes Mellitus is a severe, chronic disease that occurs when blood glucose levels rise above certain limits. In recent years, machine and deep learning techniques have been used to predict diabetes and its complications. However, researchers and developers still face two main challenges when building type 2 diabetes predictive models. First, there is considerable heterogeneity in previous studies regarding the techniques used, making it challenging to identify the optimal one. Second, there is a lack of transparency about the features used in the models, which reduces their interpretability. This systematic review aimed to address these challenges. The review primarily followed the PRISMA methodology, enriched with the one proposed by Keele and Durham Universities. Ninety studies were included, and the type of model, complementary techniques, dataset, and performance parameters reported were extracted. Eighteen different types of models were compared, with tree-based algorithms showing top performances. Deep Neural Networks proved suboptimal, despite their ability to deal with big and dirty data. Data balancing and feature selection techniques proved helpful in increasing the model's efficiency. Models trained on tidy datasets achieved almost perfect performance.
Affiliation(s)
- Luis Fregoso-Aparicio
- School of Engineering and Sciences, Tecnologico de Monterrey, Av Lago de Guadalupe KM 3.5, Margarita Maza de Juarez, 52926 Cd Lopez Mateos, Mexico
| | - Julieta Noguez
- School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon Mexico
| | - Luis Montesinos
- School of Engineering and Sciences, Tecnologico de Monterrey, Ave. Eugenio Garza Sada 2501, 64849 Monterrey, Nuevo Leon Mexico
| | - José A. García-García
- Hospital General de Mexico Dr. Eduardo Liceaga, Dr. Balmis 148, Doctores, Cuauhtemoc, 06720 Mexico City, Mexico
|
29
|
Nomura A, Noguchi M, Kometani M, Furukawa K, Yoneda T. Artificial Intelligence in Current Diabetes Management and Prediction. Curr Diab Rep 2021; 21:61. [PMID: 34902070 PMCID: PMC8668843 DOI: 10.1007/s11892-021-01423-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Accepted: 07/13/2021] [Indexed: 10/28/2022]
Abstract
PURPOSE OF REVIEW Artificial intelligence (AI) can make advanced inferences based on a large amount of data. The mainstream technologies of the AI boom in 2021 are machine learning (ML) and deep learning, which have made significant progress due to the increase in computational resources accompanied by the dramatic improvement in computer performance. In this review, we introduce AI/ML-based medical devices and prediction models regarding diabetes. RECENT FINDINGS In the field of diabetes, several AI/ML-based medical devices for automatic retinal screening, clinical diagnosis support, and patient self-management have already been approved by the US Food and Drug Administration. As for new-onset diabetes prediction using ML methods, its performance has so far not been superior to that of conventional risk stratification models that use statistical approaches. Despite the current situation, it is expected that the predictive performance of AI will soon be maximized by a large amount of organized data and abundant computational resources, which will contribute to a dramatic improvement in the accuracy of disease prediction models for diabetes.
Affiliation(s)
- Akihiro Nomura
- Department of Biomedical Informatics, CureApp Institute, Karuizawa, Japan.
- Innovative Clinical Research Center, Kanazawa University, 13-1 Takaramachi, Kanazawa, 9208641, Japan.
- Department of Cardiovascular Medicine, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan.
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan.
| | - Masahiro Noguchi
- Department of Cardiovascular Medicine, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
| | - Mitsuhiro Kometani
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
| | - Kenji Furukawa
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
- Health Care Center, Japan Advanced Institute of Science and Technology, Nomi, Japan
| | - Takashi Yoneda
- Department of Health Promotion and Medicine of the Future, Kanazawa University Graduate School of Medical Sciences, Kanazawa, Japan
|
30
|
Makino K, Lee S, Bae S, Chiba I, Harada K, Katayama O, Tomida K, Morikawa M, Shimada H. Simplified Decision-Tree Algorithm to Predict Falls for Community-Dwelling Older Adults. J Clin Med 2021; 10:jcm10215184. [PMID: 34768703 PMCID: PMC8585075 DOI: 10.3390/jcm10215184] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/26/2021] [Revised: 10/26/2021] [Accepted: 11/03/2021] [Indexed: 11/16/2022] Open
Abstract
The present study developed a simplified decision-tree algorithm for fall prediction with easily measurable predictors using data from a longitudinal cohort study: 2520 community-dwelling older adults aged 65 years or older participated. Fall history, age, sex, fear of falling, prescribed medication, knee osteoarthritis, lower limb pain, gait speed, and timed up and go test were assessed in the baseline survey as fall predictors. Moreover, recent falls were assessed in the follow-up survey. We created a fall-prediction algorithm using decision-tree analysis (C5.0) that included 14 nodes with six predictors, and the model could stratify the probabilities of fall incidence ranging from 30.4% to 71.9%. Additionally, the decision-tree model outperformed a logistic regression model with respect to the area under the curve (0.70 vs. 0.64), accuracy (0.65 vs. 0.62), sensitivity (0.62 vs. 0.50), positive predictive value (0.66 vs. 0.65), and negative predictive value (0.64 vs. 0.59). Our decision-tree model consists of common and easily measurable fall predictors, and its white-box algorithm can explain the reasons for risk stratification; therefore, it can be implemented in clinical practice. Our findings provide useful information for the early screening of fall risk and the promotion of timely strategies for fall prevention in community and clinical settings.
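The accuracy, sensitivity, and predictive values compared above all derive from a 2x2 confusion matrix. A minimal sketch in Python follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return standard screening metrics from 2x2 confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # correct predictions / all predictions
        "sensitivity": tp / (tp + fn),      # true positives / all actual positives
        "specificity": tn / (tn + fp),      # true negatives / all actual negatives
        "ppv": tp / (tp + fp),              # true positives / all predicted positives
        "npv": tn / (tn + fn),              # true negatives / all predicted negatives
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = confusion_metrics(tp=60, fp=30, fn=40, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because the predictive values depend on how common the outcome is in the sample, they are best compared between models evaluated on the same cohort, as in the abstract.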
Affiliation(s)
- Keitaro Makino
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
- Research Fellowship for Young Scientists, Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo 102-0083, Japan
- Correspondence: Tel.: +81-562-44-5651
| | - Sangyoon Lee
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Seongryu Bae
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Ippei Chiba
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Kenji Harada
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Osamu Katayama
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Kouki Tomida
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Masanori Morikawa
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan; (S.L.); (S.B.); (I.C.); (K.H.); (O.K.); (K.T.); (M.M.)
| | - Hiroyuki Shimada
- Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu City 474-8511, Japan
| |
|
31
|
Makino K, Lee S, Bae S, Chiba I, Harada K, Katayama O, Shinkai Y, Shimada H. Development and validation of new screening tool for predicting dementia risk in community-dwelling older Japanese adults. J Transl Med 2021; 19:448. [PMID: 34702306 PMCID: PMC8549197 DOI: 10.1186/s12967-021-03121-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/27/2021] [Accepted: 10/16/2021] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Established clinical assessments for detecting dementia risk often require time, cost, and face-to-face meetings. We aimed to develop a Simplified Telephone Assessment for Dementia risk (STAD) (a new screening tool utilizing telephonic interviews to predict dementia risk) and examine the predictive validity of the STAD for the incidence of dementia. METHODS We developed STAD based on a combination of literature review, statistical analysis, and expert opinion. We selected 12 binary questions on subjective cognitive complaints, depressive symptoms, and lifestyle activities. In the validation study, we used STAD for 4298 community-dwelling older adults and observed the incidence of dementia during the 24-month follow-up period. The total score of STAD ranging from 0 to 12 was calculated, and the cut-off point for dementia incidence was determined using the Youden index. The survival rate of dementia incidence according to the cut-off points was determined. Furthermore, we used a decision-tree model (classification and regression tree, CART) to enhance the predictive ability of STAD for dementia risk screening. RESULTS The cut-off point of STAD was set at 4/5. Participants scoring ≥ 5 points showed a significantly higher risk of dementia than those scoring ≤ 4 points, even after adjusting for covariates (hazard ratio [95% confidence interval], 2.67 [1.40-5.08]). A decision tree model using the CART algorithm was constructed using 12 nodes with three STAD items. It showed better performance for dementia prediction in terms of accuracy and specificity as compared to the logistic regression model, although its sensitivity was lower than that of the logistic regression model. CONCLUSIONS We developed a 12-item questionnaire, STAD, as a screening tool to predict dementia risk utilizing telephonic interviews and confirmed its predictive validity.
Our findings might provide useful information for early screening of dementia risk and enable bridging between community and clinical settings. Additionally, STAD could be employed without face-to-face meetings in a short time; therefore, it may be a suitable screening tool for community-dwelling older adults who have negative attitudes toward clinical examination or are non-adherent to follow-up assessments in clinical trials.
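The Youden-index cut-off selection mentioned in the Methods can be illustrated with a short Python sketch. The index is J = sensitivity + specificity - 1, and the chosen cut-off is the score threshold that maximizes J. The function name, the scores, and the outcomes below are hypothetical; this is not the authors' code or data.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A participant is treated as screen-positive when score >= cutoff.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical total scores (0-12) and outcomes (1 = event); not study data.
scores = [1, 2, 3, 4, 4, 5, 6, 7, 8, 9]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(scores, outcomes)
print(f"cutoff >= {cutoff}, J = {j:.2f}")
```

The Youden index weights sensitivity and specificity equally; a screening programme that prioritizes one over the other might reasonably choose a different threshold.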
Affiliation(s)
- Keitaro Makino
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan.
- Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo, Japan.
| | - Sangyoon Lee
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Seongryu Bae
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Ippei Chiba
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Kenji Harada
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Osamu Katayama
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Yohei Shinkai
- Department of Preventive Gerontology, Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
| | - Hiroyuki Shimada
- Center for Gerontology and Social Science, National Center for Geriatrics and Gerontology, Obu, Aichi, Japan
| |
|
32
|
Nguyen P, Ohnmacht AJ, Galhoz A, Büttner M, Theis F, Menden MP. Künstliche Intelligenz und maschinelles Lernen in der Diabetesforschung [Artificial intelligence and machine learning in diabetes research]. DIABETOLOGE 2021. [DOI: 10.1007/s11428-021-00817-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
33
|
Niu M, Wang Y, Zhang L, Tu R, Liu X, Hou J, Huo W, Mao Z, Wang C, Bie R. Identifying the predictive effectiveness of a genetic risk score for incident hypertension using machine learning methods among populations in rural China. Hypertens Res 2021; 44:1483-1491. [PMID: 34480134 DOI: 10.1038/s41440-021-00738-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Revised: 07/31/2021] [Accepted: 08/04/2021] [Indexed: 12/17/2022]
Abstract
Current studies have shown the controversial effect of genetic risk scores (GRSs) in hypertension prediction. Machine learning methods are used extensively in the medical field but rarely in the mining of genetic information. This study aims to determine whether genetic information can improve the prediction of incident hypertension using machine learning approaches in a prospective study. The study recruited 4592 subjects without hypertension at baseline from a cohort study conducted in rural China. A polygenic risk score (PGGRS) was calculated using 13 SNPs. According to a ratio of 7:3, subjects were randomly allocated to the train and test datasets. Models with and without the PGGRS were established using the train dataset with Cox regression, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) methods. The discrimination and reclassification of models were estimated using the test dataset. The PGGRS showed a significant association with the risk of incident hypertension (HR (95% CI), 1.046 (1.004, 1.090), P = 0.031) irrespective of baseline blood pressure. Models that did not include the PGGRS achieved AUCs (95% CI) of 0.785 (0.763, 0.807), 0.790 (0.768, 0.811), 0.838 (0.817, 0.857), and 0.854 (0.835, 0.873) for the Cox, ANN, RF, and GBM methods, respectively. The addition of the PGGRS led to the improvement of the AUC by 0.001, 0.008, 0.023, and 0.017; IDI by 1.39%, 2.86%, 4.73%, and 4.68%; and NRI by 25.05%, 13.01%, 44.87%, and 22.94%, respectively. Incident hypertension risk was better predicted by the traditional+PGGRS model, especially when machine learning approaches were used, suggesting that genetic information may have the potential to identify new hypertension cases using machine learning methods in resource-limited areas. CLINICAL TRIAL REGISTRATION: The Henan Rural Cohort Study has been registered at the Chinese Clinical Trial Register (Registration number: ChiCTR-OOC-15006699). 
http://www.chictr.org.cn/showproj.aspx?proj=11375.
Affiliation(s)
- Miaomiao Niu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Yikang Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Liying Zhang
- School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Runqi Tu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Xiaotian Liu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Jian Hou
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Wenqian Huo
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Zhenxing Mao
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China
| | - Chongjian Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China.
| | - Ronghai Bie
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China.
| |
|
34
|
Lee S, Zhou J, Leung KSK, Wu WKK, Wong WT, Liu T, Wong ICK, Jeevaratnam K, Zhang Q, Tse G. Development of a predictive risk model for all-cause mortality in patients with diabetes in Hong Kong. BMJ Open Diabetes Res Care 2021; 9:9/1/e001950. [PMID: 34117050 PMCID: PMC8201981 DOI: 10.1136/bmjdrc-2020-001950] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/15/2020] [Accepted: 05/09/2021] [Indexed: 01/14/2023] Open
Abstract
INTRODUCTION Patients with diabetes mellitus are at risk of premature death. In this study, we developed a machine learning-driven predictive risk model for all-cause mortality among patients with type 2 diabetes mellitus using a multiparametric approach with data from different domains. RESEARCH DESIGN AND METHODS This study used territory-wide data of patients with type 2 diabetes attending public hospitals or their associated ambulatory/outpatient facilities in Hong Kong between January 1, 2009 and December 31, 2009. The primary outcome is all-cause mortality. The association of risk variables and all-cause mortality was assessed using Cox proportional hazards models. Machine and deep learning approaches were used to improve overall survival prediction and were evaluated with a fivefold cross-validation method. RESULTS A total of 273 678 patients (mean age: 65.4±12.7 years, male: 48.2%, median follow-up: 142 (IQR=106-142) months) were included, with 91 155 deaths occurring on follow-up (33.3%; annualized mortality rate: 3.4%/year; 2.7 million patient-years). Multivariate Cox regression found the following significant predictors of all-cause mortality: age, male gender, baseline comorbidities, anemia, mean values of neutrophil-to-lymphocyte ratio, high-density lipoprotein-cholesterol, total cholesterol, triglyceride, HbA1c and fasting blood glucose (FBG), and measures of variability of both HbA1c and FBG. The above parameters were incorporated into a score-based predictive risk model that had a c-statistic of 0.73 (95% CI 0.66 to 0.77), which was improved to 0.86 (0.81 to 0.90) and 0.87 (0.84 to 0.91) using random survival forests and deep survival learning models, respectively. CONCLUSIONS A multiparametric model incorporating variables from different domains predicted all-cause mortality accurately in type 2 diabetes mellitus. The predictive and modeling capabilities of machine/deep learning survival analysis achieved more accurate predictions.
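The crude and annualized mortality figures in the Results can be cross-checked with simple arithmetic: deaths divided by patients, and deaths divided by total patient-years. The sketch below uses only numbers quoted in the abstract; the patient-years value is the rounded 2.7 million, so the annualized rate is approximate, and the variable names are illustrative.

```python
deaths = 91_155
patients = 273_678
patient_years = 2.7e6  # rounded figure quoted in the abstract

crude_mortality = deaths / patients        # proportion who died during follow-up
annualized_rate = deaths / patient_years   # deaths per patient-year of follow-up

print(f"crude mortality: {crude_mortality:.1%}")
print(f"annualized rate: {annualized_rate:.1%} per year")
```

Both values round to the percentages reported in the abstract (33.3% and 3.4% per year).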
Affiliation(s)
- Sharen Lee
- Cardiovascular Analytics Group, Laboratory of Cardiovascular Physiology, Hong Kong
| | - Jiandong Zhou
- School of Data Science, City University of Hong Kong, Kowloon, Hong Kong
| | | | - William Ka Kei Wu
- Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China
| | - Wing Tak Wong
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong, China
| | - Tong Liu
- Department of Cardiology, The Second Hospital of Tianjin Medical University, Tianjin, China
| | - Ian Chi Kei Wong
- Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Kamalan Jeevaratnam
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, UK
| | - Qingpeng Zhang
- School of Data Science, City University of Hong Kong, Kowloon, Hong Kong
| | - Gary Tse
- Cardiovascular Analytics Group, Laboratory of Cardiovascular Physiology, Hong Kong
- Department of Cardiology, The Second Hospital of Tianjin Medical University, Tianjin, China
- Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, UK
- Kent and Medway Medical School, Canterbury, UK
| |
|
35
|
Liao Q, Zhang Q, Feng X, Huang H, Xu H, Tian B, Liu J, Yu Q, Guo N, Liu Q, Huang B, Ma D, Ai J, Xu S, Li K. Development of deep learning algorithms for predicting blastocyst formation and quality by time-lapse monitoring. Commun Biol 2021; 4:415. [PMID: 33772211 PMCID: PMC7998018 DOI: 10.1038/s42003-021-01937-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 09/03/2020] [Accepted: 02/24/2021] [Indexed: 12/24/2022] Open
Abstract
Approaches to reliably predict the developmental potential of embryos and select suitable embryos for blastocyst culture are needed. The development of time-lapse monitoring (TLM) and artificial intelligence (AI) may help solve this problem. Here, we report deep learning models that can accurately predict blastocyst formation and usable blastocysts using TLM videos of the embryo’s first three days. The DenseNet201 network, focal loss, long short-term memory (LSTM) network and gradient boosting classifier were mainly employed, and video preparation algorithms, spatial stream and temporal stream models were developed into ensemble prediction models called STEM and STEM+. STEM exhibited 78.2% accuracy and 0.82 AUC in predicting blastocyst formation, and STEM+ achieved 71.9% accuracy and 0.79 AUC in predicting usable blastocysts. We believe the models are beneficial for blastocyst formation prediction and embryo selection in clinical practice, and our modeling methods will provide valuable information for analyzing medical videos with continuous appearance variation. Liao et al. propose a deep learning model to predict blastocyst formation using TLM videos following the first three days of embryogenesis. The authors develop an ensemble prediction model, STEM and STEM+, which were found to exhibit 78.2% and 71.9% accuracy at predicting blastocyst formation and useable blastocysts respectively.
Affiliation(s)
- Qiuyue Liao
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Qi Zhang
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Xue Feng
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Haibo Huang
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Haohao Xu
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Baoyuan Tian
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Jihao Liu
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Qihui Yu
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China
| | - Na Guo
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Qun Liu
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Bo Huang
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Ding Ma
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
| | - Jihui Ai
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
| | - Shugong Xu
- Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai, China.
| | - Kezhen Li
- Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
| |
|
36
|
Wang Y, Zhang L, Niu M, Li R, Tu R, Liu X, Hou J, Mao Z, Wang Z, Wang C. Genetic Risk Score Increased Discriminant Efficiency of Predictive Models for Type 2 Diabetes Mellitus Using Machine Learning: Cohort Study. Front Public Health 2021; 9:606711. [PMID: 33681127 PMCID: PMC7925839 DOI: 10.3389/fpubh.2021.606711] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/15/2020] [Accepted: 01/25/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Previous studies have constructed prediction models for type 2 diabetes mellitus (T2DM), but machine learning was rarely used and few focused on genetic prediction. This study aimed to establish an effective T2DM prediction tool and to further explore the potential of genetic risk scores (GRS) via various classifiers among rural adults. Methods: In this prospective study, the GRS for a total of 5,712 participants from the Henan Rural Cohort Study was calculated. Cox proportional hazards (CPH) regression was used to analyze the associations between GRS and T2DM. CPH, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) were used to establish prediction models, respectively. The area under the receiver operating characteristic curve (AUC) and net reclassification index (NRI) were used to assess the discrimination ability of the models. The decision curve was plotted to determine the clinical utility of the prediction models. Results: Compared with the individuals in the lowest quintile of the GRS, the HR (95% CI) was 2.06 (1.40 to 3.03) for those in the highest quintile of GRS (Ptrend < 0.05). Based on conventional predictors, the AUCs of the prediction model were 0.815, 0.816, 0.843, and 0.851 via CPH, ANN, RF, and GBM, respectively. Changes with the integration of GRS for CPH, ANN, RF, and GBM were 0.001, 0.002, 0.018, and 0.033, respectively. The reclassifications were significantly improved for all classifiers when adding GRS (NRI: 41.2% for CPH; 41.0% for ANN; 46.4% for RF; 45.1% for GBM). Decision curve analysis indicated the clinical benefits of the model combined with GRS. Conclusion: The prediction model combined with GRS may provide incremental predictive performance beyond conventional factors for T2DM, which demonstrated the potential clinical use of genetic markers to screen vulnerable populations.
Clinical Trial Registration: The Henan Rural Cohort Study is registered in the Chinese Clinical Trial Register (Registration number: ChiCTR-OOC-15006699). http://www.chictr.org.cn/showproj.aspx?proj=11375.
Affiliation(s)
- Yikang Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Liying Zhang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China; School of Information Engineering, Zhengzhou University, Zhengzhou, China
| | - Miaomiao Niu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Ruiying Li
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Runqi Tu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Xiaotian Liu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Jian Hou
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Zhenxing Mao
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| | - Zhenfei Wang
- School of Information Engineering, Zhengzhou University, Zhengzhou, China
| | - Chongjian Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
| |
|
37
|
Niu M, Zhang L, Wang Y, Tu R, Liu X, Hou J, Huo W, Mao Z, Wang Z, Wang C. Genetic factors increase the identification efficiency of predictive models for dyslipidaemia: a prospective cohort study. Lipids Health Dis 2021; 20:11. [PMID: 33579296 PMCID: PMC7881493 DOI: 10.1186/s12944-021-01439-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/15/2020] [Accepted: 01/27/2021] [Indexed: 11/10/2022] Open
Abstract
Background Few studies have developed risk models for dyslipidaemia, especially for rural populations. Furthermore, the performance of genetic factors in predicting dyslipidaemia has not been explored. The purpose of this study is to develop and evaluate prediction models with and without genetic factors for dyslipidaemia in rural populations. Methods A total of 3596 individuals from the Henan Rural Cohort Study were included in this study. According to the ratio of 7:3, all individuals were divided into a training set and a testing set. The conventional models and conventional+GRS (genetic risk score) models were developed with Cox regression, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) classifiers in the training set. The area under the receiver operating characteristic curve (AUC), net reclassification index (NRI), and integrated discrimination index (IDI) were used to assess the discrimination ability of the models, and the calibration curve was used to show calibration ability in the testing set. Results Compared to the lowest quartile of GRS, the hazard ratio (HR) (95% confidence interval (CI)) of individuals in the highest quartile of GRS was 1.23 (1.07, 1.41) in the total population. Age, family history of diabetes, physical activity, body mass index (BMI), triglycerides (TGs), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C) were used to develop the conventional models, and the AUCs of the Cox, ANN, RF, and GBM classifiers were 0.702 (0.673, 0.729), 0.736 (0.708, 0.762), 0.787 (0.762, 0.811), and 0.816 (0.792, 0.839), respectively. After adding GRS, the AUCs increased by 0.005, 0.018, 0.023, and 0.015 with the Cox, ANN, RF, and GBM classifiers, respectively. The corresponding NRIs were 25.6%, 7.8%, 14.1%, and 18.1%, and the corresponding IDIs were 2.3%, 1.0%, 2.5%, and 1.8%, respectively.
Conclusion Genetic factors could improve the predictive ability of the dyslipidaemia risk model, suggesting that genetic information could be provided as a potential predictor to screen for clinical dyslipidaemia. Trial registration The Henan Rural Cohort Study has been registered at the Chinese Clinical Trial Register. (Trial registration: ChiCTR-OOC-15006699. Registered 6 July 2015 - Retrospectively registered). Supplementary Information The online version contains supplementary material available at 10.1186/s12944-021-01439-3.
Affiliation(s)
- Miaomiao Niu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Liying Zhang
- School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, People's Republic of China
| | - Yikang Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Runqi Tu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Xiaotian Liu
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Jian Hou
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Wenqian Huo
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Zhenxing Mao
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China
| | - Zhenfei Wang
- School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, People's Republic of China.
| | - Chongjian Wang
- Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, People's Republic of China.
| |
|
38
|
Wu Y, Hu H, Cai J, Chen R, Zuo X, Cheng H, Yan D. Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults. Front Public Health 2021; 9:626331. [PMID: 34268283 PMCID: PMC8275929 DOI: 10.3389/fpubh.2021.626331] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/16/2020] [Accepted: 05/21/2021] [Indexed: 02/05/2023] Open
Abstract
Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults. Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. All participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). The XGBoost method, an effective machine learning technique, was used to select the most important variables from the candidate variables. We further established a stepwise model based on the predictors chosen by the XGBoost model. The area under the receiver operating characteristic curve (AUC), decision curve and calibration analysis were used to assess discrimination, clinical use and calibration of the model, respectively. The external validation was performed on a cohort of 11,113 Japanese participants. Result: In the training and validation sets, 148 and 145 incident diabetes cases occurred, respectively. The XGBoost method selected the 10 most important variables from 15 candidate variables. Fasting plasma glucose (FPG), body mass index (BMI) and age were the top 3 important variables. We further established a stepwise model and a prediction nomogram. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed no statistically significant difference between the predicted diabetes risk and the observed diabetes risk (p = 0.068 for the training set, p = 0.165 for the validation set). Decision curve analysis supported the clinical usefulness of the stepwise model across a wide range of threshold probabilities. There were almost no interactions between these predictors (most P-values for interaction >0.05).
Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted diabetes risk and observed diabetes risk (P = 0.824). Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
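The AUC values reported above can be read as the probability that a randomly chosen participant who developed diabetes received a higher predicted risk than a randomly chosen participant who did not. A minimal Python sketch of that pairwise definition follows; the function name and the data are hypothetical, and this is not the authors' implementation.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes; not study data.
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.80]
observed = [0, 0, 1, 0, 1, 1]
value = auc(risks, observed)
print(f"AUC = {value:.3f}")
```

The pairwise loop is quadratic in sample size, which is acceptable for illustration; for cohorts of the size described here, a rank-based computation would be the more practical choice.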
Affiliation(s)
- Yang Wu
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
- Haofei Hu
  - Shenzhen University Health Science Center, Shenzhen, China
  - Department of Nephrology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Nephrology, Shenzhen Second People's Hospital, Shenzhen, China
- Jinlin Cai
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shantou University Medical College, Shantou, China
- Runtian Chen
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
- Xin Zuo
  - Department of Endocrinology, The Third People's Hospital of Shenzhen, Shenzhen, China
- Heng Cheng
  - Department of Endocrinology, The Third People's Hospital of Shenzhen, Shenzhen, China
- Dewen Yan
  - Department of Endocrinology, The First Affiliated Hospital of Shenzhen University, Shenzhen, China
  - Department of Endocrinology, Shenzhen Second People's Hospital, Shenzhen, China
  - Shenzhen University Health Science Center, Shenzhen, China
  - *Correspondence: Dewen Yan
|
39
|
Hong N, Park Y, You SC, Rhee Y. AIM in Endocrinology. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_328-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2022]
|
40
|
Artificial Neural Networks Model for Predicting Type 2 Diabetes Mellitus Based on VDR Gene FokI Polymorphism, Lipid Profile and Demographic Data. Biology 2020; 9:biology9080222. [PMID: 32823649 PMCID: PMC7465516 DOI: 10.3390/biology9080222] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/12/2020] [Revised: 08/04/2020] [Accepted: 08/10/2020] [Indexed: 01/06/2023]
Abstract
Type 2 diabetes mellitus (T2DM) is a multifactorial disease associated with many genetic polymorphisms; among them is the FokI polymorphism in the vitamin D receptor (VDR) gene. In this case-control study, samples from 82 T2DM patients and 82 healthy controls were examined to investigate the association of the FokI polymorphism and lipid profile with T2DM in the Jordanian population. DNA was extracted from blood and genotyped for the FokI polymorphism by polymerase chain reaction (PCR) and DNA sequencing. Lipid profile and fasting blood sugar were also measured. There were significant differences in high-density lipoprotein (HDL) cholesterol and triglyceride levels between T2DM and control samples. Frequencies of the FokI genotypes (CC, CT and TT) were determined in T2DM and control samples and were not significantly different. Furthermore, there was no significant association between the FokI polymorphism and T2DM or lipid profile. A feed-forward neural network (FNN) was used as a computational platform to predict which persons had diabetes based on the FokI polymorphism, lipid profile, gender and age. The accuracy of prediction reached 88% when all parameters were included, 81% when the FokI polymorphism was excluded, and 72% when only lipids were included. This is the first study investigating the association of the VDR gene FokI polymorphism with T2DM in the Jordanian population, and it found no significant association. Diabetes was predicted with high accuracy based on medical data using an FNN. This highlights the value of incorporating neural network tools into large medical databases and their ability to predict patient susceptibility to diabetes.
|
41
|
Reed RA, Morgan AS, Zeitlin J, Jarreau PH, Torchin H, Pierrat V, Ancel PY, Khoshnood B. Machine-Learning vs. Expert-Opinion Driven Logistic Regression Modelling for Predicting 30-Day Unplanned Rehospitalisation in Preterm Babies: A Prospective, Population-Based Study (EPIPAGE 2). Front Pediatr 2020; 8:585868. [PMID: 33614539 PMCID: PMC7886676 DOI: 10.3389/fped.2020.585868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2020] [Accepted: 12/29/2020] [Indexed: 11/28/2022] Open
Abstract
Introduction: Preterm babies are a vulnerable population that experience significant short- and long-term morbidity. Rehospitalisations constitute an important, potentially modifiable adverse event in this population. Improving the ability of clinicians to identify those patients at the greatest risk of rehospitalisation has the potential to improve outcomes and reduce costs. Machine-learning algorithms can provide potentially advantageous methods of prediction compared to conventional approaches like logistic regression. Objective: To compare two machine-learning methods (least absolute shrinkage and selection operator (LASSO) and random forest) to expert-opinion driven logistic regression modelling for predicting unplanned rehospitalisation within 30 days in a large French cohort of preterm babies. Design, Setting and Participants: This study used data derived exclusively from the population-based prospective cohort study of French preterm babies, EPIPAGE 2. Only those babies discharged home alive and whose parents completed the 1-year survey were eligible for inclusion in our study. All predictive models used a binary outcome, denoting a baby's status for an unplanned rehospitalisation within 30 days of discharge. Predictors included those quantifying clinical, treatment, maternal and socio-demographic factors. The predictive abilities of models constructed using LASSO and random forest algorithms were compared with a traditional logistic regression model. The logistic regression model comprised 10 predictors, selected by expert clinicians, while the LASSO and random forest included 75 predictors. Performance measures were derived using 10-fold cross-validation. Performance was quantified using area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, Tjur's coefficient of determination and calibration measures.
Results: The rate of 30-day unplanned rehospitalisation in the eligible population used to construct the models was 9.1% (95% CI 8.2-10.1) (350/3,841). The random forest model demonstrated both an improved AUROC (0.65; 95% CI 0.59-0.7; p = 0.03) and specificity vs. logistic regression (AUROC 0.57; 95% CI 0.51-0.62, p = 0.04). The LASSO performed similarly (AUROC 0.59; 95% CI 0.53-0.65; p = 0.68) to logistic regression. Conclusions: Compared to an expert-specified logistic regression model, random forest offered improved prediction of 30-day unplanned rehospitalisation in preterm babies. However, all models offered relatively low levels of predictive ability, regardless of modelling method.
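The rehospitalisation rate and its confidence interval quoted in the Results can be checked with a few lines of arithmetic. The sketch below uses the Wilson score interval; the abstract does not state which interval method the authors used, so that choice, like the function name, is an assumption made here for illustration.

```python
import math


def wilson_interval(events, total, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to 95%)."""
    p = events / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Counts taken from the abstract: 350 rehospitalisations among 3,841 babies.
rate = 350 / 3841
low, high = wilson_interval(350, 3841)
print(f"rate = {rate:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

With these counts the rate rounds to 9.1% and the Wilson bounds round to 8.2% and 10.1%, which agrees with the figures reported in the abstract.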
Affiliation(s)
- Robert A Reed
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
- Andrei S Morgan
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
  - Elizabeth Garrett Anderson Institute for Women's Health, University College London (UCL), London, United Kingdom
  - SAMU 93, SMUR Pédiatrique, CHI André Gregoire, Groupe Hospitalier Universitaire Paris Seine-Saint-Denis, Assistance Publique des Hôpitaux de Paris, Paris, France
- Jennifer Zeitlin
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
- Pierre-Henri Jarreau
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
  - APHP.5, Service de Médecine et Réanimation Néonatales de Port-Royal, Paris, France
- Héloïse Torchin
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
  - APHP.5, Service de Médecine et Réanimation Néonatales de Port-Royal, Paris, France
- Véronique Pierrat
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
  - CHU Lille, Department of Neonatal Medicine, Jeanne de Flandre Lille, France
- Pierre-Yves Ancel
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
  - Clinical Research Unit, Center for Clinical Investigation P1419, APHP.5, Paris, France
- Babak Khoshnood
  - Université de Paris, Epidemiology and Statistics Research Center/CRESS, INSERM, INRA, Paris, France
|