1
Martin-Morales A, Yamamoto M, Inoue M, Vu T, Dawadi R, Araki M. Predicting Cardiovascular Disease Mortality: Leveraging Machine Learning for Comprehensive Assessment of Health and Nutrition Variables. Nutrients 2023; 15:3937. [PMID: 37764721 PMCID: PMC10534618 DOI: 10.3390/nu15183937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/06/2023] [Accepted: 09/08/2023] [Indexed: 09/29/2023] Open
Abstract
Cardiovascular disease (CVD) is one of the primary causes of death around the world. This study aimed to identify risk factors associated with CVD mortality using data from the National Health and Nutrition Examination Survey (NHANES). We created three models focusing on dietary data, non-diet-related health data, and a combination of both. Machine learning (ML) models, particularly the random forest algorithm, demonstrated robust consistency across health, nutrition, and mixed categories in predicting death from CVD. Shapley additive explanation (SHAP) values showed age, systolic blood pressure, and several other health factors as crucial variables, while fiber, calcium, and vitamin E, among others, were significant nutritional variables. Our research emphasizes the importance of comprehensive health evaluation and dietary intake in predicting CVD mortality. The inclusion of nutrition variables improved the performance of our models, underscoring the utility of dietary intake in ML-based data analysis. Further investigation using large datasets with recurring dietary recalls is necessary to enhance the effectiveness and interpretability of such models.
Affiliation(s)
- Agustin Martin-Morales
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Masaki Yamamoto
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Mai Inoue
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Thien Vu
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Research Dawadi
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
- Michihiro Araki
  - Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
  - National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
  - Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
  - Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai, Nada-ku, Kobe 657-8501, Japan
2
Li JX, Li L, Zhong X, Fan SJ, Cen T, Wang J, He C, Zhang Z, Luo YN, Liu XX, Hu LX, Zhang YD, Qiu HL, Dong GH, Zou XG, Yang BY. Machine learning identifies prominent factors associated with cardiovascular disease: findings from two million adults in the Kashgar Prospective Cohort Study (KPCS). Glob Health Res Policy 2022; 7:48. [PMID: 36474302 PMCID: PMC9724436 DOI: 10.1186/s41256-022-00282-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Accepted: 11/18/2022] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Identifying factors associated with cardiovascular disease (CVD) is critical for its prevention, but this topic has scarcely been investigated in Kashgar prefecture, Xinjiang, northwestern China. We thus explored the CVD epidemiology and identified prominent factors associated with CVD in this region. METHODS A total of 1,887,710 adults at baseline (in 2017) of the Kashgar Prospective Cohort Study were included in the analysis. Sixteen candidate factors, including seven demographic factors, four lifestyle factors, and five clinical factors, were collected from a questionnaire and health examination records. CVD was defined according to International Classification of Diseases, 10th Revision (ICD-10) codes. We first used logistic regression models to investigate the association between each of the candidate factors and CVD. Then, we employed three machine learning methods (Random Forest, Random Ferns, and Extreme Gradient Boosting) to rank and identify prominent factors associated with CVD. Stratification analyses by sex, ethnicity, education level, economic status, and residential setting were also performed to test the consistency of the ranking. RESULTS The prevalence of CVD in Kashgar prefecture was 8.1%. All 16 candidate factors were confirmed to be significantly associated with CVD (odds ratios ranged from 1.03 to 2.99, all p values < 0.05) in logistic regression models. Further machine learning-based analysis suggested that age, occupation, hypertension, exercise frequency, and dietary pattern were the five most prominent factors associated with CVD. The ranking of relative importance for prominent factors in stratification analyses showed that the factor importance generally followed the same pattern as that in the overall sample. CONCLUSIONS CVD is a major public health concern in Kashgar prefecture. Age, occupation, hypertension, exercise frequency, and dietary pattern might be the prominent factors associated with CVD in this region. These factors should be given priority in future CVD prevention.
Affiliation(s)
- Jia-Xin Li
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Li Li
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University), No.66, Yingbin Avenue, Kashgar City, 844000 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Xuemei Zhong
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University), No.66, Yingbin Avenue, Kashgar City, 844000 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Shu-Jun Fan
  - Guangzhou Center for Disease Control and Prevention, Guangzhou, 510440 China (GRID: grid.508371.8; ISNI: 0000 0004 1774 3337)
- Tao Cen
  - Department of Research and Development, Nanfang Hospital, Southern Medical University, Guangzhou, 510515 China (GRID: grid.284723.8; ISNI: 0000 0000 8877 7471)
- Jianquan Wang
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University), No.66, Yingbin Avenue, Kashgar City, 844000 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Chuanjiang He
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University), No.66, Yingbin Avenue, Kashgar City, 844000 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Zhoubin Zhang
  - Guangzhou Center for Disease Control and Prevention, Guangzhou, 510440 China (GRID: grid.508371.8; ISNI: 0000 0004 1774 3337)
- Ya-Na Luo
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Xiao-Xuan Liu
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Li-Xin Hu
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Yi-Dan Zhang
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Hui-Ling Qiu
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Guang-Hui Dong
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Xiao-Guang Zou
  - Department of Respiratory and Critical Care Medicine, The First People’s Hospital of Kashi (The Affiliated Kashi Hospital of Sun Yat-Sen University), No.66, Yingbin Avenue, Kashgar City, 844000 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
- Bo-Yi Yang
  - Guangdong Provincial Engineering Technology Research Center of Environmental Pollution and Health Risk Assessment, Department of Occupational and Environmental Health, School of Public Health, Sun Yat-Sen University, 74 Zhongshan 2nd Road, Yuexiu District, Guangzhou, 510080 China (GRID: grid.12981.33; ISNI: 0000 0001 2360 039X)
3
Mavrogiorgou A, Kiourtis A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Kyriazis D. A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions. Sensors (Basel, Switzerland) 2022; 22:8615. [PMID: 36433212 PMCID: PMC9695983 DOI: 10.3390/s22228615] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/01/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 05/27/2023]
Abstract
Extracting useful knowledge through proper data analysis is a challenging task for efficient and timely decision-making. A plethora of machine learning (ML) algorithms exists for this purpose, and in healthcare the complexity increases because of the domain's requirements for analytics-based risk predictions. This manuscript proposes a data analysis mechanism, tested in diverse healthcare scenarios, for constructing a catalogue of the most efficient ML algorithms to use, depending on each healthcare scenario's requirements and datasets, for efficiently predicting the onset of a disease. In this context, seven ML algorithms (Naïve Bayes, K-Nearest Neighbors, Decision Tree, Logistic Regression, Random Forest, Neural Networks, Stochastic Gradient Descent) were executed on diverse healthcare scenarios (stroke, COVID-19, diabetes, breast cancer, kidney disease, heart failure). Based on a variety of performance metrics (accuracy, recall, precision, F1-score, specificity, confusion matrix), a subset of ML algorithms was found to be more efficient for timely predictions under specific healthcare scenarios, which is why the envisioned ML catalogue prioritizes the ML algorithms to be used depending on each scenario's nature and needed metrics. Further evaluation must be performed considering additional scenarios and involving state-of-the-art techniques (e.g., cloud deployment, federated ML) to improve the mechanism's efficiency.
Affiliation(s)
- Argyro Mavrogiorgou
  - Department of Digital Systems, University of Piraeus, 185 34 Piraeus, Greece
4
Russo S, Bonassi S. Prospects and Pitfalls of Machine Learning in Nutritional Epidemiology. Nutrients 2022; 14:1705. [PMID: 35565673 PMCID: PMC9105182 DOI: 10.3390/nu14091705] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 04/13/2022] [Accepted: 04/14/2022] [Indexed: 02/06/2023] Open
Abstract
Nutritional epidemiology employs observational data to discover associations between diet and disease risk. However, existing analytic methods of dietary data are often sub-optimal, with limited incorporation and analysis of the correlations between the studied variables and nonlinear behaviours in the data. Machine learning (ML) is an area of artificial intelligence that has the potential to improve modelling of nonlinear associations and confounding which are found in nutritional data. These opportunities notwithstanding, the applications of ML in nutritional epidemiology must be approached cautiously to safeguard the scientific quality of the results and provide accurate interpretations. Given the complex scenario around ML, judicious application of such tools is necessary to offer nutritional epidemiology a novel analytical resource for dietary measurement and assessment and a tool to model the complexity of dietary intake and its relation to health. This work describes the applications of ML in nutritional epidemiology and provides guidelines to avoid common pitfalls encountered in applying predictive statistical models to nutritional data. Furthermore, it helps unfamiliar readers better assess the significance of their results and provides new possible future directions in the field of ML in nutritional epidemiology.
Affiliation(s)
- Stefania Russo
  - EcoVision Lab, Photogrammetry and Remote Sensing Group, ETH Zürich, 8092 Zurich, Switzerland
- Stefano Bonassi
  - Department of Human Sciences and Quality of Life Promotion, San Raffaele University, 00166 Rome, Italy
  - Unit of Clinical and Molecular Epidemiology, IRCCS San Raffaele Roma, 00163 Rome, Italy