1. Goldman O, Ben-Assuli O, Ababa S, Rogowski O, Berliner S. Predicting metabolic syndrome: Machine learning techniques for improved preventive medicine. Health Informatics J 2025; 31:14604582251315602. [PMID: 39819060 DOI: 10.1177/14604582251315602] [Indexed: 01/19/2025]
Abstract
Objectives: Metabolic syndrome (MetS) has a significant impact on health. MetS is the umbrella term for a group of interdependent metabolic threats that contribute to the emergence of diseases that can lead to death. This study was designed to better predict the risks associated with MetS to enable medical personnel to make more optimal preventive medical decisions. Study design: Data from a large hospital survey database was used to train data mining classification techniques to predict patient-level risk subsequent to extensive data engineering that included aggregating predictors from multiple visits. Methods: A prospective group of seemingly healthy volunteers from the database was studied based on data obtained during their regular annual health checkups. Results: After aggregating the variables over time, the findings indicated that the predictive power of our model outperformed methods presented in other studies (AUC = 0.947). Specific lifestyle factors were identified as contributing to MetS. Conclusion: Involvement to avoid recurring diseases can significantly decrease medical problems and treatment expenses. The findings emphasize the importance of using predictive tools in healthcare and preventive medicine. The results can be used for future prevention strategies that encourage lifestyle changes and implement directed medical treatment protocols to decrease the burden of illness.
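The abstract above mentions aggregating predictors from multiple visits and reports an AUC. The sketch below illustrates both ideas in a minimal form; it is not the authors' pipeline. The summary features (`mean`, `last`, `change`), the function names, and the glucose readings and labels are all hypothetical, and AUC is computed with the pairwise-ranking definition rather than any particular library.

```python
from statistics import mean


def aggregate_visits(values):
    """Collapse one patient's repeated measurements into summary features."""
    return {
        "mean": mean(values),
        "last": values[-1],
        "change": values[-1] - values[0],  # last visit minus first visit
    }


def auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical fasting glucose readings (mg/dL) from three annual checkups,
# paired with a made-up outcome label (1 = later classified as MetS).
patients = {
    "a": ([92, 95, 101], 1),
    "b": ([88, 87, 89], 0),
    "c": ([99, 104, 110], 1),
    "d": ([94, 93, 96], 0),
}

features = {pid: aggregate_visits(visits) for pid, (visits, _) in patients.items()}
scores = [features[pid]["mean"] for pid in patients]
labels = [label for _, label in patients.values()]
print(auc(scores, labels))
```

Using the aggregated mean directly as a score is only a stand-in for a trained classifier; in practice the aggregated features would feed a model whose predicted probabilities are then ranked.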
Affiliation(s)
- Orit Goldman
  - Faculty of Business Administration, Ono Academic College, Kiryat Ono, Israel
- Ofir Ben-Assuli
  - Faculty of Business Administration, Ono Academic College, Kiryat Ono, Israel
- Shimon Ababa
  - Faculty of Business Administration, Ono Academic College, Kiryat Ono, Israel
- Ori Rogowski
  - Departments of Internal Medicine "C", "D" and "E", Tel-Aviv Sourasky Medical Center, Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
- Shlomo Berliner
  - Departments of Internal Medicine "C", "D" and "E", Tel-Aviv Sourasky Medical Center, Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
2. Hossain MF, Hossain S, Akter MN, Nahar A, Liu B, Faruque MO. Metabolic syndrome predictive modelling in Bangladesh applying machine learning approach. PLoS One 2024; 19:e0309869. [PMID: 39236041 PMCID: PMC11376561 DOI: 10.1371/journal.pone.0309869] [Received: 02/15/2024] [Accepted: 08/12/2024] [Indexed: 09/07/2024] Open
Abstract
Metabolic syndrome (MetS) is a cluster of interconnected metabolic risk factors, including abdominal obesity, high blood pressure, and elevated fasting blood glucose levels, that result in an increased risk of heart disease and stroke. In this research, we aim to identify the risk factors that have an impact on MetS in the Bangladeshi population. Subsequently, we intend to construct predictive machine learning (ML) models and ultimately, assess the accuracy and reliability of these models. In this particular study, we utilized the ATP III criteria as the basis for evaluating various health parameters from a dataset comprising 8185 participants in Bangladesh. After employing multiple ML algorithms, we identified that 27.8% of the population exhibited a prevalence of MetS. The prevalence of MetS was higher among females, accounting for 58.3% of the cases, compared to males with a prevalence of 41.7%. Initially, we identified the crucial variables using Chi-Square and Random Forest techniques. Subsequently, the obtained optimal variables are employed to train various models including Decision Trees, Random Forests, Support Vector Machines, Extreme Gradient Boosting, K-nearest neighbors, and Logistic Regression. Particularly we employed the ATP III criteria, which utilizes the Waist-to-Height Ratio (WHtR) as an anthropometric index for diagnosing abdominal obesity. Our analysis indicated that Age, SBP, WHtR, FBG, WC, DBP, marital status, HC, TGs, and smoking emerged as the most significant factors when using Chi-Square and Random Forest analyses. However, further investigation is necessary to evaluate its precision as a classification tool and to improve the accuracy of all classifiers for MetS prediction.
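The abstract above names Chi-Square screening and the Waist-to-Height Ratio (WHtR). As a minimal sketch of those two calculations, and not the study's actual code, the following computes the Pearson chi-square statistic for a contingency table and a WHtR value. The function names and the counts in the example table are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def whtr(waist_cm, height_cm):
    """Waist-to-height ratio; both measurements must use the same unit."""
    return waist_cm / height_cm


# Hypothetical counts: rows = smoker / non-smoker, columns = MetS / no MetS.
smoking_vs_mets = [[30, 10], [20, 40]]
print(round(chi_square(smoking_vs_mets), 3))
print(round(whtr(90, 180), 2))
```

A larger statistic suggests a stronger association between a categorical predictor and MetS status, which is one way candidate variables could be ranked before model training.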
Affiliation(s)
- Md Farhad Hossain
  - Division of Computing, Analytics and Mathematics, Department of Mathematics and Statistics, School of Science and Engineering, University of Missouri, Kansas City, MO, United States of America
  - Department of Statistics, Comilla University, Cumilla, Bangladesh
- Shaheed Hossain
  - Department of Statistics, Comilla University, Cumilla, Bangladesh
- Mst Nira Akter
  - Department of Statistics, Comilla University, Cumilla, Bangladesh
- Ainur Nahar
  - Department of Statistics, Comilla University, Cumilla, Bangladesh
- Bowen Liu
  - Division of Computing, Analytics and Mathematics, Department of Mathematics and Statistics, School of Science and Engineering, University of Missouri, Kansas City, MO, United States of America
- Md Omar Faruque
  - Division of Energy, Matter and Sciences, School of Science and Engineering, University of Missouri, Kansas City, MO, United States of America
3. Shin D. Prediction of metabolic syndrome using machine learning approaches based on genetic and nutritional factors: a 14-year prospective-based cohort study. BMC Med Genomics 2024; 17:224. [PMID: 39232768 PMCID: PMC11373243 DOI: 10.1186/s12920-024-01998-1] [Received: 07/12/2024] [Accepted: 08/28/2024] [Indexed: 09/06/2024] Open
Abstract
INTRODUCTION Metabolic syndrome is a chronic disease associated with multiple comorbidities. Over the last few years, machine learning techniques have been used to predict metabolic syndrome. However, studies incorporating demographic, clinical, laboratory, dietary, and genetic factors to predict the incidence of metabolic syndrome in Koreans are limited. In the present study, we propose a genome-wide polygenic risk score for the prediction of metabolic syndrome, along with other factors, to improve the prediction accuracy of metabolic syndrome. METHODS We developed 7 machine learning-based models and used Cox multivariable regression, deep neural network (DNN), support vector machine (SVM), stochastic gradient descent (SGD), random forest (RAF), Naïve Bayes (NBA) classifier, and AdaBoost (ADB) to predict the incidence of metabolic syndrome at year 14 using the dataset from the Korean Genome and Epidemiology Study (KoGES) Ansan and Ansung. RESULTS Of the 5440 patients, 2,120 were considered to have new-onset metabolic syndrome. The AUC values of model, which included sex, age, alcohol intake, energy intake, marital status, education status, income status, smoking status, dried laver intake, and genome-wide polygenic risk score (gPRS) Z-score based on 344,447 SNPs (p-value < 1.0), were the highest for RAF (0.994 [95% CI 0.985, 1.000]) and ADB (0.994 [95% CI 0.986, 1.000]). CONCLUSIONS Incorporating both gPRS and demographic, clinical, laboratory, and seaweed data led to enhanced metabolic syndrome risk prediction by capturing the distinct etiologies of metabolic syndrome development. The RAF- and ADB-based models predicted metabolic syndrome more accurately than the NBA-based model for the Korean population.
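The abstract above refers to a genome-wide polygenic risk score expressed as a Z-score. A common way to build such a score is a weighted sum of allele dosages that is then standardized across the cohort; the sketch below assumes that construction, which may differ from the one used in the study. The SNP weights, dosages, and function names are hypothetical, and the population standard deviation is an arbitrary choice.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    return sum(d * w for d, w in zip(dosages, weights))


def z_scores(values):
    """Standardize raw scores to mean 0 and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical per-SNP effect weights and dosages for three individuals.
weights = [0.2, -0.1, 0.05]
cohort = [
    [2, 0, 1],
    [0, 2, 0],
    [1, 1, 1],
]

raw = [polygenic_score(person, weights) for person in cohort]
standardized = z_scores(raw)
print(standardized)
```

The standardized score can then sit alongside demographic and dietary variables as one more input column to a classifier.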
Affiliation(s)
- Dayeon Shin
  - Department of Food and Nutrition, Inha University, Incheon, 22212, Republic of Korea
4. Lee M, Park T, Shin JY, Park M. A comprehensive multi-task deep learning approach for predicting metabolic syndrome with genetic, nutritional, and clinical data. Sci Rep 2024; 14:17851. [PMID: 39090161 PMCID: PMC11294629 DOI: 10.1038/s41598-024-68541-1] [Received: 02/26/2024] [Accepted: 07/24/2024] [Indexed: 08/04/2024] Open
Abstract
Metabolic syndrome (MetS) is a complex disorder characterized by a cluster of metabolic abnormalities, including abdominal obesity, hypertension, elevated triglycerides, reduced high-density lipoprotein cholesterol, and impaired glucose tolerance. It poses a significant public health concern, as individuals with MetS are at an increased risk of developing cardiovascular diseases and type 2 diabetes. Early and accurate identification of individuals at risk for MetS is essential. Various machine learning approaches have been employed to predict MetS, such as logistic regression, support vector machines, and several boosting techniques. However, these methods use MetS as a binary status and do not consider that MetS comprises five components. Therefore, a method that focuses on these characteristics of MetS is needed. In this study, we propose a multi-task deep learning model designed to predict MetS and its five components simultaneously. The benefit of multi-task learning is that it can manage multiple tasks with a single model, and learning related tasks may enhance the model's predictive performance. To assess the efficacy of our proposed method, we compared its performance with that of several single-task approaches, including logistic regression, support vector machine, CatBoost, LightGBM, XGBoost and one-dimensional convolutional neural network. For the construction of our multi-task deep learning model, we utilized data from the Korean Association Resource (KARE) project, which includes 352,228 single nucleotide polymorphisms (SNPs) from 7729 individuals. We also considered lifestyle, dietary, and socio-economic factors that affect chronic diseases, in addition to genomic data. By evaluating metrics such as accuracy, precision, F1-score, and the area under the receiver operating characteristic curve, we demonstrate that our multi-task learning model surpasses traditional single-task machine learning models in predicting MetS.
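The abstract above contrasts a single binary MetS label with a multi-task setup that predicts the five components and the overall status together. The sketch below only illustrates what such a six-element target vector could look like; it does not reproduce the paper's model. The `Exam` record and function names are hypothetical, and the cutoffs are illustrative values in the style of the revised NCEP ATP III definition (medication criteria are ignored, and a Korean cohort would likely use different waist cutoffs).

```python
from dataclasses import dataclass


@dataclass
class Exam:
    sex: str              # "M" or "F"
    waist_cm: float
    triglycerides: float  # mg/dL
    hdl: float            # mg/dL
    systolic: float       # mmHg
    diastolic: float      # mmHg
    glucose: float        # fasting, mg/dL


def component_labels(e):
    """One binary label per MetS component (illustrative cutoffs, assumed)."""
    waist_cut = 102 if e.sex == "M" else 88
    hdl_cut = 40 if e.sex == "M" else 50
    return [
        int(e.waist_cm > waist_cut),                   # abdominal obesity
        int(e.triglycerides >= 150),                   # elevated triglycerides
        int(e.hdl < hdl_cut),                          # reduced HDL cholesterol
        int(e.systolic >= 130 or e.diastolic >= 85),   # elevated blood pressure
        int(e.glucose >= 100),                         # elevated fasting glucose
    ]


def multitask_targets(e):
    """Five component labels followed by the overall label (3 or more components)."""
    components = component_labels(e)
    return components + [int(sum(components) >= 3)]


print(multitask_targets(Exam("M", 105, 160, 45, 128, 80, 102)))
print(multitask_targets(Exam("F", 80, 120, 55, 120, 75, 90)))
```

A multi-task network would typically share hidden layers across all six outputs and sum the per-task losses, so that signal from each component can inform the overall prediction.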
Affiliation(s)
- Minhyuk Lee
  - Department of Statistics, Korea University, Seoul, Republic of Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, Republic of Korea
- Ji-Yeon Shin
  - Department of Preventive Medicine, School of Medicine, Kyungpook National University, Daegu, Republic of Korea
- Mira Park
  - Department of Preventive Medicine, School of Medicine, Eulji University, Daejeon, Republic of Korea
5. Huang AA, Huang SY. Application of a transparent artificial intelligence algorithm for US adults in the obese category of weight. PLoS One 2024; 19:e0304509. [PMID: 38820332 PMCID: PMC11142543 DOI: 10.1371/journal.pone.0304509] [Received: 08/02/2023] [Accepted: 05/13/2024] [Indexed: 06/02/2024] Open
Abstract
OBJECTIVE AND AIMS Identification of associations with the obese category of weight in the general US population will continue to advance our understanding of the condition and allow clinicians, providers, communities, families, and individuals to make more informed decisions. This study aims to improve the prediction of the obese category of weight and investigate its relationships with factors, ultimately contributing to healthier lifestyle choices and timely management of obesity. METHODS Questionnaires that included demographic, dietary, exercise and health information from the US National Health and Nutrition Examination Survey (NHANES 2017-2020) were utilized with BMI 30 or higher defined as obesity. A machine learning model, XGBoost, predicted the obese category of weight, and Shapley Additive Explanations (SHAP) visualized the various covariates and their feature importance. Model statistics including Area under the receiver operator curve (AUROC), sensitivity, specificity, positive predictive value, negative predictive value and feature properties such as gain, cover, and frequency were measured. SHAP explanations were created for transparent and interpretable analysis. RESULTS There were 6,146 adults (age > 18) that were included in the study with average age 58.39 (SD = 12.94) and 3122 (51%) females. The machine learning model had an Area under the receiver operator curve of 0.8295. The top covariates by gain include waist circumference (gain = 0.185), GGT (gain = 0.101), platelet count (gain = 0.059), AST (gain = 0.057), weight (gain = 0.049), HDL cholesterol (gain = 0.032), and ferritin (gain = 0.034). CONCLUSION In conclusion, the utilization of machine learning models proves to be highly effective in accurately predicting the obese category of weight. By considering various factors such as demographic information, laboratory results, physical examination findings, and lifestyle factors, these models successfully identify crucial risk factors associated with the obese category of weight.
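The abstract above lists sensitivity, specificity, positive predictive value, and negative predictive value among its model statistics. These four all derive from the confusion matrix of a binary classifier, as the minimal sketch below shows. The function names and the example labels are hypothetical and unrelated to the NHANES data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic ratios; assumes every denominator is non-zero."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that are detected
        "specificity": tn / (tn + fp),  # share of true negatives that are cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Hypothetical labels: 1 = BMI of 30 or higher, 0 = otherwise.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = diagnostic_metrics(*confusion_counts(y_true, y_pred))
print(metrics)
```

Unlike sensitivity and specificity, the two predictive values depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.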
Affiliation(s)
- Alexander A. Huang
  - Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Samuel Y. Huang
  - Virginia Commonwealth University School of Medicine, Richmond, Virginia, United States of America
6. Huang X, He Q, Hu H, Shi H, Zhang X, Xu Y. Integrating machine learning and nontargeted plasma lipidomics to explore lipid characteristics of premetabolic syndrome and metabolic syndrome. Front Endocrinol (Lausanne) 2024; 15:1335269. [PMID: 38559697 PMCID: PMC10979736 DOI: 10.3389/fendo.2024.1335269] [Received: 11/08/2023] [Accepted: 02/14/2024] [Indexed: 04/04/2024] Open
Abstract
Objective To identify plasma lipid characteristics associated with premetabolic syndrome (pre-MetS) and metabolic syndrome (MetS) and provide biomarkers through machine learning methods. Methods Plasma lipidomics profiling was conducted using samples from healthy individuals, pre-MetS patients, and MetS patients. Orthogonal partial least squares-discriminant analysis (OPLS-DA) models were employed to identify dysregulated lipids in the comparative groups. Biomarkers were selected using support vector machine recursive feature elimination (SVM-RFE), random forest (rf), and least absolute shrinkage and selection operator (LASSO) regression, and the performance of two biomarker panels was compared across five machine learning models. Results In the OPLS-DA models, 50 and 89 lipid metabolites were associated with pre-MetS and MetS patients, respectively. Further machine learning identified two sets of plasma metabolites composed of PS(38:3), DG(16:0/18:1), and TG(16:0/14:1/22:6), TG(16:0/18:2/20:4), and TG(14:0/18:2/18:3), which were used as biomarkers for the pre-MetS and MetS discrimination models in this study. Conclusion In the initial lipidomics analysis of pre-MetS and MetS, we identified relevant lipid features primarily linked to insulin resistance in key biochemical pathways. Biomarker panels composed of lipidomics components can reflect metabolic changes across different stages of MetS, offering valuable insights for the differential diagnosis of pre-MetS and MetS.
Affiliation(s)
- Xinfeng Huang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Qing He
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
- Haiping Hu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Huanhuan Shi
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Xiaoyang Zhang
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
- Youqiong Xu
  - The Affiliated Fuzhou Center for Disease Control and Prevention of Fujian Medical University, Fuzhou, China
  - School of Public Health, Fujian Medical University, Fuzhou, China
7. Jeong S, Choi YJ. Investigating the Influence of Heavy Metals and Environmental Factors on Metabolic Syndrome Risk Based on Nutrient Intake: Machine Learning Analysis of Data from the Eighth Korea National Health and Nutrition Examination Survey (KNHANES). Nutrients 2024; 16:724. [PMID: 38474852 DOI: 10.3390/nu16050724] [Received: 01/28/2024] [Revised: 02/27/2024] [Accepted: 02/28/2024] [Indexed: 03/14/2024] Open
Abstract
This study delves into the complex interrelations among nutrient intake, environmental exposures (particularly to heavy metals), and metabolic syndrome. Utilizing data from the Korea National Health and Nutrition Examination Survey (KNHANES), machine learning techniques were applied to analyze associations in a cohort of 5719 participants, categorized into four distinct nutrient intake phenotypes. Our findings reveal that different nutrient intake patterns are associated with varying levels of heavy metal exposure and metabolic health outcomes. Key findings include significant variations in metal levels (Pb, Hg, Cd, Ni) across the clusters, with certain clusters showing heightened levels of specific metals. These variations were associated with distinct metabolic health profiles, including differences in obesity, diabetes prevalence, hypertension, and cholesterol levels. Notably, Cluster 3, characterized by high-energy and nutrient-rich diets, showed the highest levels of Pb and Hg exposure and had the most concerning metabolic health indicators. Moreover, the study highlights the significant impact of lifestyle habits, such as smoking and eating out, on nutrient intake phenotypes and associated health risks. Physical activity emerged as a critical factor, with its absence linked to imbalanced nutrient intake in certain clusters. In conclusion, our research underscores the intricate connections among diet, environmental factors, and metabolic health. The findings emphasize the need for tailored health interventions and policies that consider these complex interplays, potentially informing future strategies to combat metabolic syndrome and related health issues.
Affiliation(s)
- Seungpil Jeong
  - Department of Medical Informatics, College of Medicine, Catholic University of Korea, Seoul 06591, Republic of Korea
- Yean-Jung Choi
  - Department of Food and Nutrition, Sahmyook University, Seoul 01795, Republic of Korea
8. Boitor O, Stoica F, Mihăilă R, Stoica LF, Stef L. Automated Machine Learning to Develop Predictive Models of Metabolic Syndrome in Patients with Periodontal Disease. Diagnostics (Basel) 2023; 13:3631. [PMID: 38132215 PMCID: PMC10743072 DOI: 10.3390/diagnostics13243631] [Received: 10/23/2023] [Revised: 12/04/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023] Open
Abstract
Metabolic syndrome is experiencing a concerning and escalating rise in prevalence today. The link between metabolic syndrome and periodontal disease is a highly relevant area of research. Some studies have suggested a bidirectional relationship between metabolic syndrome and periodontal disease, where one condition may exacerbate the other. Furthermore, the existence of periodontal disease among these individuals significantly impacts overall health management. This research focuses on the relationship between periodontal disease and metabolic syndrome, while also incorporating data on general health status and overall well-being. We aimed to develop advanced machine learning models that efficiently identify key predictors of metabolic syndrome, a significant emphasis being placed on thoroughly explaining the predictions generated by the models. We studied a group of 296 patients, hospitalized in SCJU Sibiu, aged between 45-79 years, of which 57% had metabolic syndrome. The patients underwent dental consultations and subsequently responded to a dedicated questionnaire, along with a standard EuroQol 5-Dimensions 5-Levels (EQ-5D-5L) questionnaire. The following data were recorded: DMFT (Decayed, Missing due to caries, and Filled Teeth), CPI (Community Periodontal Index), periodontal pockets depth, loss of epithelial insertion, bleeding after probing, frequency of tooth brushing, regular dental control, cardiovascular risk, carotid atherosclerosis, and EQ-5D-5L score. We used Automated Machine Learning (AutoML) frameworks to build predictive models in order to determine which of these risk factors exhibits the most robust association with metabolic syndrome. To gain confidence in the results provided by the machine learning models provided by the AutoML pipelines, we used SHapley Additive exPlanations (SHAP) values for the interpretability of these models, from a global and local perspective. The obtained results confirm that the severity of periodontal disease, high cardiovascular risk, and low EQ-5D-5L score have the greatest impact in the occurrence of metabolic syndrome.
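The abstract above relies on SHAP values to explain individual predictions. The idea underneath is the Shapley value: each feature's contribution is its average marginal effect over all orders in which features can be switched from a baseline to the patient's value. The sketch below computes this exactly by brute force for a tiny model. It is not the SHAP library and not the study's models: the scoring function, its coefficients, the single all-zero baseline, and the function names are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance against a single baseline.

    Enumerates every feature ordering, so it is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]           # switch feature i to the patient's value
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orders) for total in phi]


def toy_risk(features):
    """Hypothetical risk score with one interaction term (not a fitted model)."""
    perio, cv_risk, eq5d = features
    return 0.5 * perio + 0.3 * cv_risk + 0.2 * perio * cv_risk - 0.1 * eq5d


patient = [1.0, 1.0, 1.0]
reference = [0.0, 0.0, 0.0]
contributions = shapley_values(toy_risk, patient, reference)
print(contributions)
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split evenly between the two features involved. Practical SHAP implementations approximate this quantity and usually average over a background dataset instead of one reference point.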
Affiliation(s)
- Ovidiu Boitor
  - Dental Medicine Research Center, Faculty of Medicine, “Lucian Blaga” University, 550024 Sibiu, Romania
- Florin Stoica
  - Department of Mathematics and Informatics, Research Center in Informatics and Information Technology, Faculty of Sciences, “Lucian Blaga” University, 550024 Sibiu, Romania
- Romeo Mihăilă
  - Department of Internal Medicine, Faculty of Medicine, “Lucian Blaga” University, 550024 Sibiu, Romania
- Laura Florentina Stoica
  - Department of Mathematics and Informatics, Research Center in Informatics and Information Technology, Faculty of Sciences, “Lucian Blaga” University, 550024 Sibiu, Romania
- Laura Stef
  - Department of Oral Health, Dental Medicine Research Center, Faculty of Medicine, “Lucian Blaga” University, 550024 Sibiu, Romania