1
Zhang Y, Zhang L, Lv H, Zhang G. Ensemble machine learning prediction of hyperuricemia based on a prospective health checkup population. Front Physiol 2024; 15:1357404. [PMID: 38665596 PMCID: PMC11043598 DOI: 10.3389/fphys.2024.1357404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 03/11/2024] [Indexed: 04/28/2024]
Abstract
Objectives: An accurate prediction model for hyperuricemia (HUA) in adults remains unavailable. This study aimed to develop a stacking ensemble prediction model for HUA to identify high-risk groups and explore risk factors. Methods: A prospective health checkup cohort of 40899 subjects was examined and randomly divided into training and validation sets at a ratio of 7:3. LASSO regression was employed to screen for important features, and ROSE sampling was then used to handle the class imbalance. An ensemble model using a stacking strategy was constructed from three individual models: support vector machine, decision tree C5.0, and eXtreme gradient boosting. The models were validated using the area under the receiver operating characteristic curve (AUC) and the calibration curve, as well as accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. A model-agnostic, instance-level variable attribution technique (iBreakdown) was used to explain the black-box ensemble model and to identify contributing risk factors. Results: Fifteen important features were selected from 23 clinical variables. The stacking ensemble model, with an AUC of 0.854, outperformed the three individual models (support vector machine, decision tree C5.0, and eXtreme gradient boosting), which had AUCs of 0.848, 0.851, and 0.849, respectively. Calibration, together with accuracy, specificity, negative predictive value, and F1 score, also supported the ensemble model's superiority. The contributing risk factors were estimated using six randomly selected subjects, which showed that being female and relatively younger, together with having higher baseline uric acid, body mass index, γ-glutamyl transpeptidase, total protein, triglycerides, creatinine, and fasting blood glucose, can increase the risk of HUA.
To further validate the model's applicability in the health checkup population, another cohort of 8559 subjects was used, in which the ensemble prediction model also performed favorably, with an AUC of 0.846. Conclusion: In this study, a stacking ensemble prediction model for HUA was developed, and it outperformed the three individual models that compose it (support vector machine, decision tree C5.0, and eXtreme gradient boosting). The contributing risk factors were also identified.
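The 7:3 random division described in the Methods above can be illustrated with a minimal sketch. The function name `split_cohort`, the fixed seed, and the use of integer subject IDs are assumptions made for illustration; the abstract does not say how the split was implemented or whether it was stratified by outcome.

```python
import random

def split_cohort(subject_ids, train_parts=7, total_parts=10, seed=0):
    """Randomly partition subject IDs into training and validation sets."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    rng.shuffle(ids)
    # Integer arithmetic avoids floating-point rounding in the cut point.
    n_train = len(ids) * train_parts // total_parts
    return ids[:n_train], ids[n_train:]

# Hypothetical IDs 0..40898 stand in for the 40899 cohort subjects.
train_ids, valid_ids = split_cohort(range(40899))
print(len(train_ids), len(valid_ids))
```

With floor rounding, a cohort of 40899 gives 28629 training and 12270 validation subjects; the counts used in the paper itself may differ slightly depending on the rounding rule.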
Affiliation(s)
- Yongsheng Zhang
  - Health Management Center, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Institute of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Shandong Engineering Laboratory of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
- Li Zhang
  - Department of Pharmacology, Jinan Central Hospital Affiliated to Shandong First Medical University, Jinan, China
- Haoyue Lv
  - Health Management Center, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Institute of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Shandong Engineering Laboratory of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
- Guang Zhang
  - Health Management Center, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Institute of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
  - Shandong Engineering Laboratory of Health Management, The First Affiliated Hospital of Shandong First Medical University and Shandong Provincial Qianfoshan Hospital, Jinan, China
2
Chen Q, Hu H, She Y, He Q, Huang X, Shi H, Cao X, Zhang X, Xu Y. An artificial neural network model for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus. Sci Rep 2024; 14:2197. [PMID: 38273015 PMCID: PMC10810925 DOI: 10.1038/s41598-024-52550-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 01/19/2024] [Indexed: 01/27/2024]
Abstract
Type 2 diabetes with hyperuricaemia may lead to gout, kidney damage, hypertension, coronary heart disease, etc., further aggravating diabetes and adding to the medical and financial burden. This study aimed to construct a risk model for hyperuricaemia in patients with type 2 diabetes mellitus based on an artificial neural network, and to evaluate the effectiveness of the risk model in order to provide directions for the prevention and control of the disease in this population. From June to December 2022, 8243 patients with type 2 diabetes were recruited from six community service centers for a questionnaire and physical examination. The collected data were then used to select suitable variables and, based on the comparison results, logistic regression was used to screen the variable characteristics. Finally, three risk models for evaluating the risk of hyperuricaemia in type 2 diabetes mellitus were developed using an artificial neural network algorithm and evaluated for performance. A total of eleven factors affecting the development of hyperuricaemia in patients with type 2 diabetes mellitus were identified in this study: gender, waist circumference, diabetes medication use, diastolic blood pressure, γ-glutamyl transferase, blood urea nitrogen, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, fasting glucose, and estimated glomerular filtration rate. Among the generated models, the baseline & biochemical risk model had the best performance; its cutoff, area under the curve, accuracy, recall, specificity, positive likelihood ratio, negative likelihood ratio, precision, negative predictive value, kappa, and F1-score were 0.488, 0.744, 0.689, 0.625, 0.749, 2.489, 0.501, 0.697, 0.684, 0.375, and 0.659, respectively. In addition, its Brier score was 0.169, and the calibration curve showed good agreement between fitted and observed values.
The constructed artificial neural network model performs well and may help reduce the harm caused by type 2 diabetes mellitus combined with hyperuricaemia.
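Several of the metrics listed above are functions of one another, so the reported figures can be cross-checked. The sketch below uses the standard definitions of the likelihood ratios and the F1-score; the function names are illustrative, and the inputs are the rounded values quoted in the abstract, so small last-digit differences are expected.

```python
def likelihood_ratios(recall, specificity):
    """Return (LR+, LR-) from recall (sensitivity) and specificity."""
    lr_pos = recall / (1.0 - specificity)   # true-positive rate / false-positive rate
    lr_neg = (1.0 - recall) / specificity   # false-negative rate / true-negative rate
    return lr_pos, lr_neg

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Rounded values as reported for the baseline & biochemical risk model.
lr_pos, lr_neg = likelihood_ratios(recall=0.625, specificity=0.749)
f1 = f1_score(precision=0.697, recall=0.625)
print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}, F1 = {f1:.3f}")
```

The recomputed values agree with the reported 2.489, 0.501, and 0.659 to within rounding of the inputs.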
Affiliation(s)
- Qingquan Chen
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Haiping Hu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Yuanyu She
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Qing He
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xinfeng Huang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Huanhuan Shi
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xiangyu Cao
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xiaoyang Zhang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Youqiong Xu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
3
Jin W, Jin H, Su X, Che M, Wang Q, Gu L, Ni Z. Development and validation of the prediction model for mortality in patients with diabetic kidney disease in intensive care unit: a study based on medical information Mart for intensive care. Ren Fail 2023; 45:2257808. [PMID: 37724537 PMCID: PMC10512753 DOI: 10.1080/0886022x.2023.2257808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 09/06/2023] [Indexed: 09/21/2023]
Abstract
We aimed to explore factors associated with mortality in diabetic kidney disease (DKD) and to establish a model for predicting the mortality of DKD. This was a cohort study. In total, 1,357 DKD patients were identified from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, and 505 DKD patients identified from MIMIC-III served as the testing set. The outcome of the study was 1-year mortality. Cox proportional hazards models were applied to screen the predictive factors, and the prediction model was constructed from these factors. A receiver operating characteristic (ROC) curve with the area under the curve (AUC) was calculated to evaluate the performance of the prediction model. The median follow-up time was 365.00 (54.50, 365.00) days, and 586 patients (43.18%) died within 1 year. The predictive factors for 1-year mortality in DKD included age, weight, sepsis, heart rate, temperature, Charlson Comorbidity Index (CCI), Simplified Acute Physiology Score (SAPS) II, Sequential Organ Failure Assessment (SOFA), lymphocytes, red cell distribution width (RDW), serum albumin, and metformin. The AUC of the prediction model for 1-year mortality was 0.771 [95% confidence interval (CI): 0.746-0.795] in the training set and 0.795 (95% CI: 0.756-0.834) in the testing set. This study establishes a model for predicting mortality in DKD, providing a basis for timely clinical intervention and decision-making.
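The AUC used above to evaluate the model can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The following is a minimal sketch of that pairwise definition; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death within 1 year).
    scores: predicted risks, higher meaning more likely to have the event.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy example with made-up values, for illustration only.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

This quadratic-time form is chosen for readability; for cohorts of the size described above, a rank-based computation would be the more usual choice.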
Affiliation(s)
- Wei Jin
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Haijiao Jin
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Xinyu Su
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Miaolin Che
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Qin Wang
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Leyi Gu
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
- Zhaohui Ni
  - Department of Nephrology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, P. R. China
4
Huang G, Jin Q, Mao Y. Predicting the 5-Year Risk of Nonalcoholic Fatty Liver Disease Using Machine Learning Models: Prospective Cohort Study. J Med Internet Res 2023; 25:e46891. [PMID: 37698911 PMCID: PMC10523217 DOI: 10.2196/46891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 08/02/2023] [Accepted: 08/16/2023] [Indexed: 09/13/2023]
Abstract
BACKGROUND Nonalcoholic fatty liver disease (NAFLD) has emerged as a worldwide public health issue. Identifying and targeting populations at a heightened risk of developing NAFLD over a 5-year period can help reduce and delay adverse hepatic prognostic events. OBJECTIVE This study aimed to investigate the 5-year incidence of NAFLD in the Chinese population. It also aimed to establish and validate a machine learning model for predicting the 5-year NAFLD risk. METHODS The study population was derived from a 5-year prospective cohort study. A total of 6196 individuals without NAFLD who underwent health checkups in 2010 at Zhenhai Lianhua Hospital in Ningbo, China, were enrolled in this study. Extreme gradient boosting (XGBoost)-recursive feature elimination, combined with the least absolute shrinkage and selection operator (LASSO), was used to screen for characteristic predictors. A total of 6 machine learning models, namely logistic regression, decision tree, support vector machine, random forest, categorical boosting, and XGBoost, were utilized in the construction of a 5-year risk model for NAFLD. Hyperparameter optimization of the predictive model was performed in the training set, and a further evaluation of the model performance was carried out in the internal and external validation sets. RESULTS The 5-year incidence of NAFLD was 18.64% (n=1155) in the study population. We screened 11 predictors for risk prediction model construction. After the hyperparameter optimization, CatBoost demonstrated the best prediction performance in the training set, with an area under the receiver operating characteristic (AUROC) curve of 0.810 (95% CI 0.768-0.852). Logistic regression showed the best prediction performance in the internal and external validation sets, with AUROC curves of 0.778 (95% CI 0.759-0.794) and 0.806 (95% CI 0.788-0.821), respectively. The development of web-based calculators has enhanced the clinical feasibility of the risk prediction model. 
CONCLUSIONS Developing and validating machine learning models can aid in predicting which populations are at the highest risk of developing NAFLD over a 5-year period, thereby helping delay and reduce the occurrence of adverse liver prognostic events.
Affiliation(s)
- Guoqing Huang
  - Department of Endocrinology, The First Affiliated Hospital of Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Qiankai Jin
  - Department of Endocrinology, The First Affiliated Hospital of Ningbo University, Ningbo, China
  - Health Science Center, Ningbo University, Ningbo, China
- Yushan Mao
  - Department of Endocrinology, The First Affiliated Hospital of Ningbo University, Ningbo, China
5
Abudureyimu P, Pang Y, Huang L, Luo Q, Zhang X, Xu Y, Jiang L, Mohemaiti P. A predictive model for hyperuricemia among type 2 diabetes mellitus patients in Urumqi, China. BMC Public Health 2023; 23:1740. [PMID: 37679683 PMCID: PMC10483783 DOI: 10.1186/s12889-023-16669-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/26/2022] [Accepted: 08/31/2023] [Indexed: 09/09/2023]
Abstract
BACKGROUND Patients with type 2 diabetes mellitus (T2DM) are more likely to have an elevated blood uric acid level, i.e., hyperuricemia (HUA). No conclusive studies have been done to predict HUA among T2DM patients. Therefore, this study aims to explore the risk factors of HUA among T2DM patients and to propose a model to help with its prediction. METHOD In this retrospective study, all data were collected between March 2017 and October 2019 in the Medical Laboratory Center of the First Affiliated Hospital of Xinjiang Medical University. The information included sociodemographic factors, routine blood indices, thyroid function indicators, and serum biochemical markers. The least absolute shrinkage and selection operator (LASSO) and multivariate binary logistic regression were used to screen blood-test risk factors for HUA among T2DM patients, and a nomogram was used to present and visualise the predictive model. The receiver operating characteristic (ROC) curve, internal validation, and clinical decision curve analysis (DCA) were applied to evaluate the prediction performance of the model. RESULTS We collected clinical data from 841 T2DM patients, whose ages ranged from 19 to 86 years. In this study, the overall prevalence of HUA in T2DM patients was 12.6%. According to the LASSO-logistic regression analysis, sex, ethnicity, serum albumin (ALB), serum cystatin C (CysC), serum inorganic phosphorus (IPHOS), alkaline phosphatase (ALP), serum bicarbonate (CO2), and high-density lipoprotein (HDLC) were included in the HUA risk prediction model. The nomogram confirmed that the prediction model fits well (χ2 = 5.4952, P = 0.704), and the calibration curve indicates that the model is well calibrated. ROC analysis indicates that the predictive model has good discrimination ability (AUC = 0.827; 95% CI: 0.78-0.874), with a specificity of 0.885 and a sensitivity of 0.602.
CONCLUSION Our study identified 8 variables that can be considered independent risk factors for HUA among T2DM patients. In light of these findings, a predictive model was developed and clinical advice was given on its use.
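To show how the reported sensitivity, specificity, and prevalence interact, the sketch below applies Bayes' rule to obtain positive and negative predictive values. These predictive values are not reported in the abstract; they are illustrative arithmetic only, and they assume that the quoted sensitivity (0.602) and specificity (0.885) apply to a population with the study's overall HUA prevalence (12.6%). The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)   # P(HUA | predicted positive)
    npv = true_neg / (true_neg + false_neg)   # P(no HUA | predicted negative)
    return ppv, npv

# Inputs are the rounded figures quoted in the abstract; outputs are illustrative only.
ppv, npv = predictive_values(sensitivity=0.602, specificity=0.885, prevalence=0.126)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumptions the negative predictive value is high and the positive predictive value is moderate, which is typical when prevalence is low; any clinical reading should rely on the values reported in the full paper.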
Affiliation(s)
- Palizhati Abudureyimu
  - Medical Laboratory Center, First Affiliated Hospital of Xinjiang Medical University, No.137, Liyushan South Road, Xinshi District, Urumqi, 830001, China
- Yuesheng Pang
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Lirun Huang
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Qianqian Luo
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Xiaozheng Zhang
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Yifan Xu
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Liang Jiang
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China
- Patamu Mohemaiti
  - Xinjiang Uygur Autonomous Region, Xinjiang Medical University, No.567, North Shangde Road, Shuimogou District, Urumqi, 830017, China