1
|
Wang SCY, Nickel G, Venkatesh KP, Raza MM, Kvedar JC. AI-based diabetes care: risk prediction models and implementation concerns. NPJ Digit Med 2024; 7:36. [PMID: 38361152 PMCID: PMC10869708 DOI: 10.1038/s41746-024-01034-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 02/05/2024] [Indexed: 02/17/2024]
|
2
|
Chen Q, Hu H, She Y, He Q, Huang X, Shi H, Cao X, Zhang X, Xu Y. An artificial neural network model for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus. Sci Rep 2024; 14:2197. [PMID: 38273015 PMCID: PMC10810925 DOI: 10.1038/s41598-024-52550-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 01/19/2024] [Indexed: 01/27/2024]
Abstract
Type 2 diabetes with hyperuricaemia may lead to gout, kidney damage, hypertension, coronary heart disease, etc., further aggravating diabetes and adding to the medical and financial burden. This study aimed to construct an artificial neural network risk model for hyperuricaemia in patients with type 2 diabetes mellitus and to evaluate its effectiveness, in order to provide directions for the prevention and control of the disease in this population. From June to December 2022, 8243 patients with type 2 diabetes were recruited from six community service centers for questionnaire and physical examination. The collected data were then used to select suitable variables and, based on the comparison results, logistic regression was used to screen the variable characteristics. Finally, three risk models for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus were developed using an artificial neural network algorithm and evaluated for performance. A total of eleven factors were found to affect the development of hyperuricaemia in patients with type 2 diabetes mellitus in this study: gender, waist circumference, diabetes medication use, diastolic blood pressure, γ-glutamyl transferase, blood urea nitrogen, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, fasting glucose and estimated glomerular filtration rate. Among the generated models, the baseline & biochemical risk model had the best performance, with a cutoff, area under the curve, accuracy, recall, specificity, positive likelihood ratio, negative likelihood ratio, precision, negative predictive value, kappa and F1-score of 0.488, 0.744, 0.689, 0.625, 0.749, 2.489, 0.501, 0.697, 0.684, 0.375 and 0.659, respectively. In addition, its Brier score was 0.169 and the calibration curve showed good agreement between fitted and observed values. The constructed artificial neural network model has better efficacy and facilitates the reduction of the harm caused by type 2 diabetes mellitus combined with hyperuricaemia.
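Several of the metrics reported in this abstract are functions of one another: the likelihood ratios follow from recall and specificity, and the F1-score from precision and recall. The sketch below recomputes them from the rounded values quoted above. The function names are illustrative and not taken from the paper, and because the inputs are already rounded, agreement can only be expected to about two decimal places.

```python
def positive_likelihood_ratio(recall, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return recall / (1.0 - specificity)


def negative_likelihood_ratio(recall, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - recall) / specificity


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Rounded values as quoted in the abstract for the
# baseline & biochemical risk model.
recall, specificity, precision = 0.625, 0.749, 0.697

lr_pos = positive_likelihood_ratio(recall, specificity)
lr_neg = negative_likelihood_ratio(recall, specificity)
f1 = f1_score(precision, recall)

print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}, F1 = {f1:.3f}")
```

The recomputed values land within rounding error of the reported 2.489, 0.501 and 0.659, which suggests the published metrics are internally consistent.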
Affiliation(s)
- Qingquan Chen
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Haiping Hu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Yuanyu She
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Qing He
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xinfeng Huang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Huanhuan Shi
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xiangyu Cao
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xiaoyang Zhang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Youqiong Xu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
|
3
|
Liao X, Yao C, Zhang J, Liu LZ. Recent advancement in integrating artificial intelligence and information technology with real-world data for clinical decision-making in China: A scoping review. J Evid Based Med 2023; 16:534-546. [PMID: 37772921 DOI: 10.1111/jebm.12549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 08/31/2023] [Indexed: 09/30/2023]
Abstract
OBJECTIVE Striking innovations and advancements have been achieved by integrating artificial intelligence and healthcare information technology with clinical real-world data. The current scoping review aimed to provide an overview of the current status of artificial intelligence-/information technology-based clinical decision support tools in China. METHODS PubMed/MEDLINE, Embase, China National Knowledge Infrastructure, and Wanfang data were searched for both English and Chinese literature. A gray literature search was conducted for commercially available tools. Original studies that focused on clinical decision support tools driven by artificial intelligence or information technology in China and were published between 2010 and February 2022 were included. Information extracted from each article was further synthesized by themes based on three types of clinical decision-making. RESULTS A total of 37 peer-reviewed publications and 13 commercially available tools were included in the final analysis. Among them, 32.0% were developed for disease diagnosis, 54.0% for risk prediction and classification, and 14.0% for disease management. Chronic diseases were the most popular therapeutic areas of exploration, with particular emphasis on cardiovascular and cerebrovascular diseases. Single-center electronic medical records were the mainstream data sources leveraged to inform clinical decision-making, with internal validation being predominantly used for model evaluation. CONCLUSIONS To effectively promote the extensive use of real-world data and drive a paradigm shift in clinical decision-making in China, multidisciplinary collaboration of key stakeholders is urgently needed.
Affiliation(s)
- Xiwen Liao
  - Peking University Clinical Research Institute, Peking University First Hospital, Beijing, China
- Chen Yao
  - Peking University Clinical Research Institute, Peking University First Hospital, Beijing, China
  - Hainan Institute of Real World Data, Qionghai, Hainan, China
- Jun Zhang
  - Center for Observational and Real-world Evidence (CORE), MSD R&D (China) Co., Ltd., Beijing, China
- Larry Z Liu
  - Center for Observational and Real-world Evidence (CORE), Merck & Co Inc, Rahway, New Jersey, USA
  - Department of Population Health Sciences, Weill Cornell Medical College, New York City, New York, USA
|
4
|
Mohsen F, Al-Absi HRH, Yousri NA, El Hajj N, Shah Z. A scoping review of artificial intelligence-based methods for diabetes risk prediction. NPJ Digit Med 2023; 6:197. [PMID: 37880301 PMCID: PMC10600138 DOI: 10.1038/s41746-023-00933-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/25/2023] [Accepted: 09/25/2023] [Indexed: 10/27/2023]
Abstract
The increasing prevalence of type 2 diabetes mellitus (T2DM) and its associated health complications highlight the need to develop predictive models for early diagnosis and intervention. While many artificial intelligence (AI) models for T2DM risk prediction have emerged, a comprehensive review of their advancements and challenges is currently lacking. This scoping review maps out the existing literature on AI-based models for T2DM prediction, adhering to the PRISMA extension for Scoping Reviews guidelines. A systematic search of longitudinal studies was conducted across four databases, including PubMed, Scopus, IEEE-Xplore, and Google Scholar. Forty studies that met our inclusion criteria were reviewed. Classical machine learning (ML) models dominated these studies, with electronic health records (EHR) being the predominant data modality, followed by multi-omics, while medical imaging was the least utilized. Most studies employed unimodal AI models, with only ten adopting multimodal approaches. Both unimodal and multimodal models showed promising results, with the latter being superior. Almost all studies performed internal validation, but only five conducted external validation. Most studies utilized the area under the curve (AUC) for discrimination measures. Notably, only five studies provided insights into the calibration of their models. Half of the studies used interpretability methods to identify key risk predictors revealed by their models. Although a minority highlighted novel risk predictors, the majority reported commonly known ones. Our review provides valuable insights into the current state and limitations of AI-based models for T2DM prediction and highlights the challenges associated with their development and clinical integration.
Affiliation(s)
- Farida Mohsen
  - College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Hamada R H Al-Absi
  - College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Noha A Yousri
  - Genetic Medicine, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar
  - College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
  - Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
- Nady El Hajj
  - College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
  - College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
- Zubair Shah
  - College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
|
5
|
Tsai MH, Jhou MJ, Liu TC, Fang YW, Lu CJ. An integrated machine learning predictive scheme for longitudinal laboratory data to evaluate the factors determining renal function changes in patients with different chronic kidney disease stages. Front Med (Lausanne) 2023; 10:1155426. [PMID: 37859858 PMCID: PMC10582636 DOI: 10.3389/fmed.2023.1155426] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/03/2023] [Accepted: 09/19/2023] [Indexed: 10/21/2023]
Abstract
Background and objectives Chronic kidney disease (CKD) is a global health concern. This study aims to identify key factors associated with renal function changes using the proposed machine learning and important variable selection (ML&IVS) scheme on longitudinal laboratory data. The goal is to predict changes in the estimated glomerular filtration rate (eGFR) in a cohort of patients with CKD stages 3-5. Design A retrospective cohort study. Setting and participants A total of 710 outpatients who presented with stable nondialysis-dependent CKD stages 3-5 at the Shin-Kong Wu Ho-Su Memorial Hospital Medical Center from 2016 to 2021. Methods This study analyzed trimonthly laboratory data including 47 indicators. The proposed scheme used stochastic gradient boosting, multivariate adaptive regression splines, random forest, eXtreme gradient boosting, and light gradient boosting machine algorithms to evaluate the important factors for predicting the results of the fourth eGFR examination, especially in patients with CKD stage 3 and those with CKD stages 4-5, with or without diabetes mellitus (DM). Main outcome measurement Subsequent eGFR level after three consecutive laboratory data assessments. Results Our ML&IVS scheme demonstrated superior predictive capabilities and identified significant factors contributing to renal function changes in various CKD groups. The latest levels of eGFR, blood urea nitrogen (BUN), proteinuria, sodium, and systolic blood pressure as well as mean levels of eGFR, BUN, proteinuria, and triglyceride were the top 10 significantly important factors for predicting the subsequent eGFR level in patients with CKD stages 3-5. In individuals with DM, the latest levels of BUN and proteinuria, mean levels of phosphate and proteinuria, and variations in diastolic blood pressure levels emerged as important factors for predicting the decline of renal function. 
In individuals without DM, all phosphate patterns and latest albumin levels were found to be key factors in the advanced CKD group. Moreover, proteinuria was identified as an important factor in the CKD stage 3 group without DM and CKD stages 4-5 group with DM. Conclusion The proposed scheme highlighted factors associated with renal function changes in different CKD conditions, offering valuable insights to physicians for raising awareness about renal function changes.
Affiliation(s)
- Ming-Hsien Tsai
  - Division of Nephrology, Department of Medicine, Shin Kong Wu Ho-Su Memorial Hospital, Taipei, Taiwan
  - Department of Medicine, School of Medicine, Fu Jen Catholic University, New Taipei City, Taiwan
- Mao-Jhen Jhou
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City, Taiwan
- Tzu-Chi Liu
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City, Taiwan
- Yu-Wei Fang
  - Division of Nephrology, Department of Medicine, Shin Kong Wu Ho-Su Memorial Hospital, Taipei, Taiwan
  - Department of Medicine, School of Medicine, Fu Jen Catholic University, New Taipei City, Taiwan
- Chi-Jie Lu
  - Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City, Taiwan
  - Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City, Taiwan
  - Department of Information Management, Fu Jen Catholic University, New Taipei City, Taiwan
|
6
|
Bin C, Li Q, Tang J, Dai C, Jiang T, Xie X, Qiu M, Chen L, Yang S. Machine learning models for predicting the risk factor of carotid plaque in cardiovascular disease. Front Cardiovasc Med 2023; 10:1178782. [PMID: 37808888 PMCID: PMC10556651 DOI: 10.3389/fcvm.2023.1178782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 09/12/2023] [Indexed: 10/10/2023]
Abstract
Introduction Cardiovascular disease (CVD) is a group of diseases involving the heart or blood vessels and represents a leading cause of death and disability worldwide. Carotid plaque is an important risk factor for CVD that can reflect the severity of atherosclerosis. Accordingly, developing a prediction model for carotid plaque formation is essential to assist in the early prevention and management of CVD. Methods In this study, eight machine learning algorithms were established, and their performance in predicting carotid plaque risk was compared. Physical examination data were collected from 4,659 patients and used for model training and validation. The eight predictive models based on machine learning algorithms were optimized using the above dataset and 10-fold cross-validation. The Shapley Additive Explanations (SHAP) tool was used to compute and visualize feature importance. Then, the performance of the models was evaluated according to the area under the receiver operating characteristic curve (AUC), feature importance, accuracy and specificity. Results The experimental results indicated that the XGBoost algorithm outperformed the other machine learning algorithms, with an AUC, accuracy and specificity of 0.808, 0.749 and 0.762, respectively. Moreover, age, smoking, alcohol consumption and BMI were the top four predictors of carotid plaque formation. It is feasible to predict carotid plaque risk using machine learning algorithms. Conclusions This study indicates that our models can be applied to routine chronic disease management procedures to enable more preemptive, broad-based screening for carotid plaque and improve the prognosis of CVD patients.
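The 10-fold cross-validation named in the Methods can be sketched as an index split over the 4,659 records. The abstract does not say whether the folds were shuffled or stratified, so the shuffling and the fixed seed below are assumptions, and `k_fold_indices` is a hypothetical helper rather than anything from the study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Shuffling with a fixed seed is an assumption made for this sketch.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# 4,659 is the sample size quoted in the abstract.
folds = list(k_fold_indices(4659, k=10))
for fold_number, (train_idx, val_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(val_idx))
```

Each record appears in exactly one validation fold, so every model is scored on data it was not fitted to in that round. A stratified split would additionally keep the plaque prevalence similar across folds.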
Affiliation(s)
- Chengling Bin
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
- Qin Li
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
- Jing Tang
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
- Chaorong Dai
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
- Ting Jiang
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
- Xiufang Xie
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Neijiang, Neijiang, China
- Min Qiu
  - Special Inspection Department, The First People’s Hospital of Neijiang, Neijiang, China
- Lumiao Chen
  - Laboratory Department, The First People’s Hospital of Neijiang, Neijiang, China
- Shaorong Yang
  - Health Management Section, The First People’s Hospital of Neijiang, Neijiang, China
|
7
|
Li L, Cheng Y, Ji W, Liu M, Hu Z, Yang Y, Wang Y, Zhou Y. Machine learning for predicting diabetes risk in western China adults. Diabetol Metab Syndr 2023; 15:165. [PMID: 37501094 PMCID: PMC10373320 DOI: 10.1186/s13098-023-01112-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 06/15/2023] [Indexed: 07/29/2023]
Abstract
OBJECTIVE Diabetes mellitus is a global epidemic disease. Long-term exposure of patients to hyperglycemia can lead to various types of chronic tissue damage. Early diagnosis of and screening for diabetes are crucial to population health. METHODS We collected the national physical examination data in Xinjiang, China, in 2020 (a total of more than 4 million people). Three types of physical examination indices were analyzed: questionnaire, routine physical examination and laboratory values. Integrated learning, deep learning and logistic regression methods were used to establish a risk model for type-2 diabetes mellitus. In addition, to improve the convenience and flexibility of the model, a diabetes risk score card was established based on logistic regression to assess the risk of the population. RESULTS An XGBoost-based risk prediction model outperformed the other five risk assessment algorithms. The AUC of the model was 0.9122. Based on the feature importance ranking map, we found that hypertension, fasting blood glucose, age, coronary heart disease, ethnicity, parental diabetes mellitus, triglycerides, waist circumference, total cholesterol, and body mass index were the most important features of the risk prediction model for type-2 diabetes. CONCLUSIONS This study established a diabetes risk assessment model based on multiple ethnicities, a large sample and many indices, and classified the diabetes risk of the population, thus providing a new forecast tool for the screening of patients and providing information on diabetes prevention for healthy populations.
Affiliation(s)
- Lin Li
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
- Yinlin Cheng
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
- Weidong Ji
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
- Mimi Liu
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
- Zhensheng Hu
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
- Yining Yang
  - People's Hospital of Xinjiang Uygur Autonomous Region, No. 91 Tianchi Road, Tianshan District, Urumqi, 830001, Xinjiang, China
- Yushan Wang
  - Center of Health Management, The First Affiliated Hospital of Xinjiang Medical University, No. 393, Xinyi Road, Xinshi District, Urumqi, 830054, Xinjiang, China
- Yi Zhou
  - Zhongshan School of Medicine, Sun Yat-sen University, No. 74, Zhongshan Second Road, Yuexiu District, Guangzhou, 510080, Guangdong, China
|
8
|
Tang X, Wang T, Shi H, Zhang M, Yin R, Wu Q, Pan C. Artificial Intelligence and Big Data Technologies in the Construction of Surgical Risk Prediction Model for Patients with Coronary Artery Bypass Grafting. Comput Intell Neurosci 2023; 2023:9575553. [PMID: 37455771 PMCID: PMC10348861 DOI: 10.1155/2023/9575553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 06/06/2022] [Accepted: 06/14/2022] [Indexed: 07/18/2023]
Abstract
The objective of this work was to predict the mortality risk of patients undergoing coronary artery bypass grafting (CABG) using a CABG risk prediction model built with artificial intelligence (AI) and big data technologies. The clinical data of 2,364 patients undergoing CABG in our hospital from January 2019 to August 2021 were collected. Based on AI and big data technology, business requirement analysis, system requirement analysis, complication prediction module, big data mining technology, and model building were carried out in turn; the resulting CABG risk prediction system includes a case feature analysis service, a risk warning service, and a case retrieval service. The commonly used precision, recall, and F1-score were adopted to evaluate the quality of the gradient-boosted tree (GBT) model. The analysis showed that the GBT model was the best in terms of precision, F1-score, and area under the receiver operating characteristic (ROC) curve. According to the CABG risk prediction model, 1,382 patients had a score of <0, 463 patients had a score of 0 ≤ score ≤ 2, 252 patients had a score of 2 < score ≤ 5, and 267 patients had a score of >5; these patients were stratified into four groups: A, B, C, and D. The actual number of in-hospital deaths was 25, and the in-hospital mortality rate was 1.05%. The mortality rate predicted by the CABG risk prediction model was 2.67 ± 1.82% (95% confidence interval (CI) (2.87-2.98)), which was higher than the actual value. The CABG risk prediction model showed credible results only in group B, with AUC = 0.763 > 0.7. In group B, 3 patients actually died, the actual mortality rate was 0.33%, and the predicted mortality rate was 0.96 ± 0.78 (95% CI (0.82-0.87)), which overestimated the mortality rate of patients in group B. In summary, a CABG risk prediction model was successfully constructed based on AI and big data technologies. The model overestimates the mortality of patients with intermediate risk, may be made suitable for different types of heart disease through continued research, development and innovation, and provides clinical guidance value.
Affiliation(s)
- Xiaoqiang Tang
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- Tao Wang
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- Haifeng Shi
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- Ming Zhang
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- RuoHan Yin
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- Qiyong Wu
  - Cardio Thoracic Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
- Changjie Pan
  - Radiology Department, the Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, Jiangsu, China
|
9
|
Afsaneh E, Sharifdini A, Ghazzaghi H, Ghobadi MZ. Recent applications of machine learning and deep learning models in the prediction, diagnosis, and management of diabetes: a comprehensive review. Diabetol Metab Syndr 2022; 14:196. [PMID: 36572938 PMCID: PMC9793536 DOI: 10.1186/s13098-022-00969-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/06/2022] [Accepted: 12/16/2022] [Indexed: 12/28/2022]
Abstract
Diabetes is a metabolic illness characterized by elevated blood glucose. This abnormal increase can seriously damage other organs such as the kidneys, eyes, heart, nerves, and blood vessels. Therefore, its prediction, prognosis, and management are essential to prevent harmful effects and to recommend more useful treatments. For these goals, machine learning algorithms have attracted considerable attention and have been developed successfully. This review surveys the recently proposed machine learning (ML) and deep learning (DL) models for the objectives mentioned earlier. The reported results indicate that ML and DL algorithms are promising approaches for controlling blood glucose and diabetes. However, they should be improved and applied to large datasets to confirm their applicability.
|
10
|
Xu S, Coleman RL, Wan Q, Gu Y, Meng G, Song K, Shi Z, Xie Q, Tuomilehto J, Holman RR, Niu K, Tong N. Risk prediction models for incident type 2 diabetes in Chinese people with intermediate hyperglycemia: a systematic literature review and external validation study. Cardiovasc Diabetol 2022; 21:182. [PMID: 36100925 PMCID: PMC9472437 DOI: 10.1186/s12933-022-01622-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Accepted: 09/07/2022] [Indexed: 11/23/2022]
Abstract
Background People with intermediate hyperglycemia (IH), including impaired fasting glucose and/or impaired glucose tolerance, are at higher risk of developing type 2 diabetes (T2D) than those with normoglycemia. We aimed to evaluate the performance of published T2D risk prediction models in Chinese people with IH to inform them about the choice of primary diabetes prevention measures. Methods A systematic literature search was conducted to identify Asian-derived T2D risk prediction models, which were eligible if they were built on a prospective cohort of Asian adults without diabetes at baseline and utilized routinely-available variables to predict future risk of T2D. These Asian-derived and five prespecified non-Asian derived T2D risk prediction models were divided into BASIC (clinical variables only) and EXTENDED (plus laboratory variables) versions, with validation performed on them in three prospective Chinese IH cohorts: ACE (n = 3241), Luzhou (n = 1333), and TCLSIH (n = 1702). Model performance was assessed in terms of discrimination (C-statistic) and calibration (Hosmer–Lemeshow test). Results Forty-four Asian and five non-Asian studies comprising 21 BASIC and 46 EXTENDED T2D risk prediction models for validation were identified. The majority were at high (n = 43, 87.8%) or unclear (n = 3, 6.1%) risk of bias, while only three studies (6.1%) were scored at low risk of bias. BASIC models showed poor-to-moderate discrimination with C-statistics 0.52–0.60, 0.50–0.59, and 0.50–0.64 in the ACE, Luzhou, and TCLSIH cohorts respectively. EXTENDED models showed poor-to-acceptable discrimination with C-statistics 0.54–0.73, 0.52–0.67, and 0.59–0.78 respectively. Fifteen BASIC and 40 EXTENDED models showed poor calibration (P < 0.05), overpredicting or underestimating the observed diabetes risk. Most recalibrated models showed improved calibration but modestly-to-severely overestimated diabetes risk in the three cohorts. 
The NAVIGATOR model showed the best discrimination in the three cohorts but had poor calibration (P < 0.05). Conclusions In Chinese people with IH, previously published BASIC models to predict T2D did not exhibit good discrimination or calibration. Several EXTENDED models performed better, but a robust Chinese T2D risk prediction tool in people with IH remains a major unmet need. Supplementary Information The online version contains supplementary material available at 10.1186/s12933-022-01622-5.
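Discrimination in this study is reported as the C-statistic. For a binary outcome it is the fraction of (event, non-event) pairs in which the person who developed the outcome was assigned the higher predicted risk, with ties counted as one half. The sketch below illustrates that definition only: the toy risks and outcomes are made up, `c_statistic` is an illustrative name, and a cohort analysis with censored follow-up would need a time-to-event version instead.

```python
def c_statistic(risks, outcomes):
    """Concordance for a binary outcome (equivalent to the ROC AUC).

    risks    -- predicted probabilities, one per person
    outcomes -- 1 if the person developed the outcome, else 0
    """
    event_risks = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevent_risks = [r for r, y in zip(risks, outcomes) if y == 0]
    if not event_risks or not nonevent_risks:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_risk in event_risks:
        for nonevent_risk in nonevent_risks:
            if event_risk > nonevent_risk:
                concordant += 1.0   # correctly ranked pair
            elif event_risk == nonevent_risk:
                concordant += 0.5   # tie counts as half
    return concordant / (len(event_risks) * len(nonevent_risks))


# Made-up toy data, not taken from any of the cohorts.
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
toy_c = c_statistic(toy_risks, toy_outcomes)
print(toy_c)
```

A value of 0.5 corresponds to ranking no better than chance, which is why the BASIC models' range of 0.50 to 0.64 is described as poor-to-moderate. The C-statistic says nothing about calibration, so a model can rank people well and still overestimate absolute risk, as the abstract reports for the NAVIGATOR model.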
Collapse
Affiliation(s)
- Shishi Xu
  - Division of Endocrinology and Metabolism, Center for Diabetes and Metabolism Research, Laboratory of Diabetes and Islet Transplantation Research, West China Medical School, West China Hospital, Sichuan University, Guo Xue Lane 37, Chengdu, China; Diabetes Trials Unit, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Ruth L Coleman
  - Diabetes Trials Unit, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Qin Wan
  - Department of Endocrine and Metabolic Diseases, The Affiliated Hospital of Southwest Medical University, Luzhou, China
- Yeqing Gu
  - Nutrition and Radiation Epidemiology Research Center, Institute of Radiation Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, China
- Ge Meng
  - Nutritional Epidemiology Institute and School of Public Health, Tianjin Medical University, Tianjin, China
- Kun Song
  - Health Management Centre, Tianjin Medical University General Hospital, Tianjin, China
- Zumin Shi
  - Human Nutrition Department, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
- Qian Xie
  - Department of General Practice, People's Hospital of LeShan, LeShan, China
- Jaakko Tuomilehto
  - Department of Public Health, University of Helsinki, Helsinki, Finland; Population Health Unit, Finnish Institute for Health and Welfare, Helsinki, Finland; Saudi Diabetes Research Group, King Abdulaziz University, Jeddah, Saudi Arabia
- Rury R Holman
  - Diabetes Trials Unit, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Kaijun Niu
  - Nutrition and Radiation Epidemiology Research Center, Institute of Radiation Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, China; Nutritional Epidemiology Institute and School of Public Health, Tianjin Medical University, Tianjin, China
- Nanwei Tong
  - Division of Endocrinology and Metabolism, Center for Diabetes and Metabolism Research, Laboratory of Diabetes and Islet Transplantation Research, West China Medical School, West China Hospital, Sichuan University, Guo Xue Lane 37, Chengdu, China
|
11
|
Liu Q, Zhou Q, He Y, Zou J, Guo Y, Yan Y. Predicting the 2-Year Risk of Progression from Prediabetes to Diabetes Using Machine Learning among Chinese Elderly Adults. J Pers Med 2022; 12:jpm12071055. [PMID: 35887552 PMCID: PMC9324396 DOI: 10.3390/jpm12071055] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/10/2022] [Revised: 06/06/2022] [Accepted: 06/23/2022] [Indexed: 11/18/2022] Open
Abstract
Identifying people at high risk of developing diabetes among those with prediabetes may facilitate the implementation of targeted lifestyle and pharmacological interventions. We aimed to establish machine learning models based on demographic and clinical characteristics to predict the risk of incident diabetes. We used data from the free medical examination service project for elderly people aged 65 years or older to develop logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost) machine learning models for the follow-up results of 2019 and 2020 and performed internal validation. The receiver operating characteristic (ROC) curve, sensitivity, specificity, accuracy, and F1 score were used to select the model with better performance. The average annual progression rate to diabetes in prediabetic elderly people was 14.21%. Each model was trained using eight features and one outcome variable from 9607 prediabetic individuals, and the performance of the models was assessed in 2402 prediabetes patients. The predictive ability of the four models was better in the first year than in the second year. The XGBoost model performed relatively well (ROC: 0.6742 for 2019 and 0.6707 for 2020). We established and compared four machine learning models to predict the risk of progression from prediabetes to diabetes. Although there was little difference in the performance of the four models, the XGBoost model had a relatively good ROC value and might perform well in future exploration in this field.
Affiliation(s)
- Qing Liu
  - Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China
- Qing Zhou
  - Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China
- Yifeng He
  - School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
- Jingui Zou
  - School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
- Yan Guo
  - Wuhan Center for Disease Control and Prevention, Wuhan 430015, China
- Yaqiong Yan
  - Wuhan Center for Disease Control and Prevention, Wuhan 430015, China
|
12
|
Mu D, Li H, Wang D, Yang X, Wang S. Analysis of Environmental and Social Significant Factors Affecting the Flow of Maternal Patients in Jilin, China. Front Public Health 2022; 10:780452. [PMID: 35669749 PMCID: PMC9164295 DOI: 10.3389/fpubh.2022.780452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Accepted: 03/28/2022] [Indexed: 11/23/2022] Open
Abstract
Background With the implementation of China's two-child policy, the number of pregnant women has been increasing year by year. However, the pregnancy success rate is declining year by year, and almost all older mothers need pregnancy protection. Objective The purpose of this study is to analyze the social and environmental factors that affect the patient flow of pregnant women in the Jilin area of China, and to make use of the favorable factors while avoiding the negative effects of adverse factors, so as to improve the pregnancy success rate and eugenics level. Methods Monthly patient flow data from 2018 to 2020 were collected in the obstetrics department of the First Hospital of Jilin University. The decompose function in R was used to decompose the time series data and obtain the seasonal and trend patterns of the data; the significant factors influencing patient flow were analyzed using a Poisson regression model, and the prediction model was verified by checking assumptions such as the normal distribution and constant variance of the residuals. Results Among the environmental factors, temperature (P = 4.00E−08) had a significant impact on the flow of obstetric patients. The flow of patients was also significantly affected by the busy farming season (P = 0.0013), entrance (P = 3.51E−10) and festivals (P = 0.00299). The patient flow had a random component but also showed trend and seasonal changes. The trend has been increasing year by year. The seasonal pattern is that the flow of patients reaches a trough in February every year and peaks in July. Conclusion In this article, a Poisson regression model is used to identify the significant social and environmental factors of obstetric patient flow. These significant factors should be fully exploited to further improve the level of eugenics. Using a time series decomposition model, we can obtain the rising trend and seasonal pattern of patient flow and provide management with decision support, which is conducive to providing pregnant women with a higher level of medical services and a more comfortable medical experience.
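The trend-plus-seasonal decomposition described in this abstract can be illustrated with a minimal sketch of a classical additive decomposition: a centred moving average estimates the trend, and the detrended values are averaged by calendar month. The function name, the seasonal pattern, and the synthetic series below are hypothetical and are not the study's data; the pattern is only chosen so that its trough falls in February and its peak in July, as the abstract reports.

```python
def decompose_additive(series, period=12):
    """Classical additive decomposition (sketch, assumes an even period).

    Returns (trend, seasonal): trend has None where the centred moving
    average is undefined; seasonal has one value per position in the cycle.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        # 2 x period centred moving average: half weight on the two end points
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period

    # Average the detrended values for each position in the seasonal cycle
    sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += series[t] - trend[t]
            counts[t % period] += 1
    raw = [s / c for s, c in zip(sums, counts)]

    # Centre the seasonal effects so that they sum to zero
    mean_raw = sum(raw) / period
    seasonal = [r - mean_raw for r in raw]
    return trend, seasonal


# Hypothetical monthly pattern (January first), summing to zero
pattern = [-5, -12, -6, -2, 3, 8, 14, 9, 4, 0, -6, -7]
# Three synthetic years: linear upward trend plus the seasonal pattern
series = [200 + 1.5 * t + pattern[t % 12] for t in range(36)]

trend, seasonal = decompose_additive(series)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
trough_month = months[seasonal.index(min(seasonal))]
peak_month = months[seasonal.index(max(seasonal))]
print(trough_month, peak_month)
```

With a linear trend and a zero-sum seasonal pattern, the moving average recovers the trend exactly, so the estimated seasonal effects match the pattern that was put in.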
Affiliation(s)
- Dongmei Mu
  - Department of Clinical Research, The First Hospital of Jilin University, Changchun, China; School of Public Health, Jilin University, Changchun, China
- Hua Li
  - Department of Abdominal Ultrasound, The First Hospital of Jilin University, Changchun, China; School of Public Health, Jilin University, Changchun, China
- Dongxuan Wang
  - Department of Abdominal Ultrasound, The First Hospital of Jilin University, Changchun, China
- Xinyu Yang
  - School of Public Health, Jilin University, Changchun, China
- Shutong Wang
  - School of Public Health, Jilin University, Changchun, China
|
13
|
Matabuena M, Félix P, García-Meixide C, Gude F. Kernel machine learning methods to handle missing responses with complex predictors. Application in modelling five-year glucose changes using distributional representations. Computer Methods and Programs in Biomedicine 2022; 221:106905. [PMID: 35649295 DOI: 10.1016/j.cmpb.2022.106905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Revised: 05/11/2022] [Accepted: 05/22/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVES Missing data is a ubiquitous problem in longitudinal studies due to the number of patients lost to follow-up. Kernel methods have enriched the machine learning field by successfully managing non-vectorial predictors, such as graphs, strings, and probability distributions, and have emerged as a promising tool for the analysis of complex data stemming from modern healthcare. This paper proposes a new set of kernel methods to handle missing data in the response variables. These methods are applied to predict long-term changes in glycated haemoglobin (A1c), the primary biomarker used to diagnose and monitor the progression of diabetes mellitus, with an emphasis on exploring the predictive potential of continuous glucose monitoring (CGM). METHODS We propose a new framework of non-linear kernel methods for testing statistical independence, selecting relevant predictors, and quantifying the uncertainty of the resultant predictive models. As a novelty in the clinical analysis, we used a distributional representation of CGM as a predictor and compared its performance with that of traditional diabetes biomarkers. RESULTS The results show that, after the incorporation of CGM information, predictive ability increases from R² = 0.61 to R² = 0.71. In addition, uncertainty analysis is useful for characterising some subpopulations where predictivity is worsened, and for which a more personalised clinical follow-up is advisable according to expected patient uncertainty in glucose values. CONCLUSIONS The proposed methods have proven to deal effectively with missing data. They also have the potential to improve the results of predictive tasks by including new complex objects as explanatory variables and modelling arbitrary dependence relations. The application of these methods to a longitudinal study of diabetes showed that the inclusion of a distributional representation of CGM data provides greater sensitivity in predicting five-year A1c changes than classical diabetes biomarkers and traditional CGM metrics.
Affiliation(s)
- Marcos Matabuena
  - CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes), Universidade de Santiago de Compostela, Santiago de Compostela 15782, Spain
- Paulo Félix
  - CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes), Universidade de Santiago de Compostela, Santiago de Compostela 15782, Spain
- Francisco Gude
  - Unidade de Epidemioloxía Clínica, Complexo Hospitalario Universidade de Santiago (CHUS), Travesía da Choupana, Santiago de Compostela 15706, Spain
|
14
|
Liu Q, Zhang M, He Y, Zhang L, Zou J, Yan Y, Guo Y. Predicting the Risk of Incident Type 2 Diabetes Mellitus in Chinese Elderly Using Machine Learning Techniques. J Pers Med 2022; 12:jpm12060905. [PMID: 35743691 PMCID: PMC9224915 DOI: 10.3390/jpm12060905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/09/2022] [Revised: 05/21/2022] [Accepted: 05/27/2022] [Indexed: 02/04/2023] Open
Abstract
Early identification of individuals at high risk of diabetes is crucial for implementing early intervention strategies. However, algorithms specific to elderly Chinese adults are lacking. The aim of this study is to build effective prediction models based on machine learning (ML) for the risk of type 2 diabetes mellitus (T2DM) in the Chinese elderly. A retrospective cohort study was conducted using the health screening data of adults older than 65 years in Wuhan, China from 2018 to 2020. After strict data filtration, 127,031 records from the eligible participants were utilized. Overall, 8298 participants were diagnosed with incident T2DM during the 2-year follow-up (2019–2020). The dataset was randomly split into a training set (n = 101,625) and a test set (n = 25,406). We developed prediction models based on four ML algorithms: logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost). Using LASSO regression, 21 prediction features were selected. Random under-sampling (RUS) was applied to address the class imbalance, and Shapley Additive Explanations (SHAP) was used to calculate and visualize feature importance. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The XGBoost model achieved the best performance (AUC = 0.7805, sensitivity = 0.6452, specificity = 0.7577, accuracy = 0.7503). Fasting plasma glucose (FPG), education, exercise, gender, and waist circumference (WC) were the five most important predictors. This study showed that the XGBoost model can be applied to screen individuals at high risk of T2DM in the early phase, which has strong potential for intelligent prevention and control of diabetes. The key features could also be useful for developing targeted diabetes prevention interventions.
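As a quick consistency check on the figures reported above, accuracy can be written as a prevalence-weighted average of sensitivity and specificity. The sketch below assumes that the test set has the same incidence as the full cohort (8298 of 127,031), which the abstract does not state; the function name is illustrative only.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as a prevalence-weighted mix of sensitivity and specificity.

    accuracy = P(positive) * sensitivity + P(negative) * specificity
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Assumption: test-set incidence equals the overall cohort incidence
prevalence = 8298 / 127031

# Sensitivity and specificity reported for the XGBoost model
implied_accuracy = accuracy_from_rates(0.6452, 0.7577, prevalence)
reported_accuracy = 0.7503

print(f"prevalence = {prevalence:.4f}")
print(f"implied accuracy = {implied_accuracy:.4f}")
print(f"difference from reported = {abs(implied_accuracy - reported_accuracy):.4f}")
```

Under that assumption the implied accuracy lands very close to the reported 0.7503. Because non-cases make up roughly 93% of the cohort, overall accuracy is driven mostly by specificity.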
Affiliation(s)
- Qing Liu
  - Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China
- Miao Zhang
  - Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China
- Yifeng He
  - School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
- Lei Zhang
  - School of Mathematics and Statistics, Wuhan University, Wuhan 430070, China
- Jingui Zou
  - School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
- Yaqiong Yan
  - Wuhan Center for Disease Control and Prevention, Wuhan 430015, China
- Yan Guo
  - Wuhan Center for Disease Control and Prevention, Wuhan 430015, China
|
15
|
Chen J, Guo C, Lu M, Ding S. Unifying Diagnosis Identification and Prediction Method Embedding the Disease Ontology Structure From Electronic Medical Records. Front Public Health 2022; 9:793801. [PMID: 35127624 PMCID: PMC8811031 DOI: 10.3389/fpubh.2021.793801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 12/21/2021] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE The reasonable classification of a large number of distinct diagnosis codes can clarify patient diagnostic information and help clinicians to improve their ability to assign and target treatment for primary diseases. Our objective is to identify and predict a unifying diagnosis (UD) from electronic medical records (EMRs). METHODS We screened 4,418 sepsis patients from the public MIMIC-III database and extracted their diagnostic information for UD identification, and their demographic information, laboratory examination information, chief complaint, and history of present illness information for UD prediction. We proposed a data-driven UD identification and prediction method (UDIPM) embedding the disease ontology structure. First, we designed a set similarity measure embedding the disease ontology structure to generate a patient similarity matrix. Second, we applied affinity propagation clustering to divide patients into different clusters, and extracted a typical diagnosis code co-occurrence pattern from each cluster. Furthermore, we identified a UD by fusing visual analysis and a conditional co-occurrence matrix. Finally, we trained five classifiers in combination with feature fusion and feature selection methods to predict the UD. RESULTS The experimental results on a public electronic medical record dataset showed that the UDIPM could extract a typical diagnosis code co-occurrence pattern effectively, identify and predict a UD based on patients' diagnostic and admission information, and outperform other fusion methods overall. CONCLUSIONS The accurate identification and prediction of the UD from a large number of distinct diagnosis codes and multi-source heterogeneous patient admission information in EMRs can provide a data-driven approach to assist better coding integration of diagnosis.
Affiliation(s)
- Jingfeng Chen
  - Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Chonghui Guo
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Menglin Lu
  - School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China
- Suying Ding
  - Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
|
16
|
Li H, Mu D, Wang P, Li Y, Wang D. Prediction of Obstetric Patient Flow and Horizontal Allocation of Medical Resources Based on Time Series Analysis. Front Public Health 2021; 9:646157. [PMID: 34738002 PMCID: PMC8562385 DOI: 10.3389/fpubh.2021.646157] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/14/2020] [Accepted: 09/08/2021] [Indexed: 11/26/2022] Open
Abstract
Objective: Given the ever-changing flow of obstetric patients in the hospital, how the government and hospital management plan and allocate medical resources has become an important problem that urgently needs to be solved. In this study, a time series-based method for predicting the monthly and daily flow of patients is proposed to provide decision support for government and hospital management. Methods: The historical patient flow data from the Department of Obstetrics and Gynecology of the First Hospital of Jilin University, China, from January 1, 2018, to February 29, 2020, were used as the training set. Seven models, including XGBoost, SVM, RF, and NNAR, were used to predict the daily patient flow in the next 14 days. The HoltWinters model was then used to predict the monthly flow of patients over the next year. Results: The results of this analysis and prediction model showed that the obstetric inpatient flow was not a purely random process: besides a random component, patient flow also showed a trend and a seasonal pattern. ACF, PACF, the Ljung-Box test, and a residual histogram were then used to verify the accuracy of the prediction model, and the results showed that the HoltWinters model was optimal. R2, MAPE, and other indicators were used to measure the accuracy of the 14-day prediction model, and the results showed that the HoltWinters and STL prediction models achieved high accuracy. Conclusion: In this paper, time series models were used to analyze the trend and seasonal changes of obstetric patient flow and to predict the patient flow in the next 14 days and 12 months. On this basis, a more reasonable and fair horizontal allocation scheme of medical resources is proposed that combines the trend and seasonal changes of obstetric patient flow with the patient flow predictions.
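The abstract names R2 and MAPE as accuracy indicators without defining them. A minimal sketch of their usual definitions follows; the function names and the sample values are hypothetical and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily patient counts and forecasts, for illustration only
actual_flow = [100.0, 200.0]
forecast_flow = [110.0, 180.0]

example_mape = mape(actual_flow, forecast_flow)      # each forecast is off by 10%
example_r2 = r_squared(actual_flow, forecast_flow)   # 1 - 500 / 5000
print(example_mape, example_r2)
```

A lower MAPE and an R2 closer to 1 both indicate forecasts that track the observed series more closely. MAPE is scale-free, which makes it convenient for comparing daily and monthly forecasts.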
Affiliation(s)
- Hua Li
  - Department of Abdominal Ultrasound, First Affiliated Hospital of Jilin University, Changchun, China; School of Public Health, Jilin University, Changchun, China
- Dongmei Mu
  - School of Public Health, Jilin University, Changchun, China; Department of Clinical Laboratory, First Affiliated Hospital of Jilin University, Changchun, China
- Ping Wang
  - School of Public Health, Jilin University, Changchun, China
- Yin Li
  - School of Public Health, Jilin University, Changchun, China
- Dongxuan Wang
  - Department of Abdominal Ultrasound, First Affiliated Hospital of Jilin University, Changchun, China
|