1
Leiherer A, Muendlein A, Mink S, Mader A, Saely CH, Festa A, Fraunberger P, Drexel H. Machine Learning Approach to Metabolomic Data Predicts Type 2 Diabetes Mellitus Incidence. Int J Mol Sci 2024; 25:5331. [PMID: 38791370 PMCID: PMC11120685 DOI: 10.3390/ijms25105331] [Received: 03/20/2024] [Revised: 04/30/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024]
Abstract
Metabolomics, with its wealth of data, offers a valuable avenue for enhancing predictions and decision-making in diabetes. This observational study aimed to leverage machine learning (ML) algorithms to predict the 4-year risk of developing type 2 diabetes mellitus (T2DM) using targeted quantitative metabolomics data. A cohort of 279 cardiovascular risk patients who underwent coronary angiography and who were initially free of T2DM according to American Diabetes Association (ADA) criteria was analyzed at baseline, including anthropometric data and targeted metabolomics measured by liquid chromatography (LC)-mass spectrometry (MS) and flow injection analysis (FIA)-MS. All patients were followed for four years. During this time, 11.5% of the patients developed T2DM. After data preprocessing, 362 variables were used for ML, employing the caret package in R. The dataset was divided into training and test sets (75:25 ratio), and an oversampling approach was used to address the class imbalance of T2DM incidence. After an additional recursive feature elimination step, which identified a set of 77 variables that were the most valuable for model generation, a Support Vector Machine (SVM) model with a linear kernel demonstrated the most promising predictive capabilities, exhibiting an F1 score of 50%, a specificity of 93%, and balanced and unbalanced accuracies of 72% and 88%, respectively. The top-ranked features were bile acids, ceramides, amino acids, and hexoses, whereas anthropometric features such as age, sex, waist circumference, or body mass index did not contribute. In conclusion, ML analysis of metabolomics data is a promising tool for identifying individuals at risk of developing T2DM and opens avenues for personalized and early intervention strategies.
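The metrics quoted in this abstract (F1 score, specificity, balanced and unbalanced accuracy) can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the authors' code (the study itself used R). The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper; they only illustrate how the two accuracy figures can diverge when the positive class is rare.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive cases at all).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on the positive class
    specificity = ratio(tn, tn + fp)   # recall on the negative class
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)        # "unbalanced" accuracy
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts (NOT from the paper): 10 positives and 90 negatives.
m = binary_metrics(tp=5, fp=10, tn=80, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, plain accuracy is 0.85 while balanced accuracy is about 0.69, because the majority (negative) class dominates the unweighted figure.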
Affiliation(s)
- Andreas Leiherer
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
  - Central Medical Laboratories, A-6800 Feldkirch, Austria
  - Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein
- Axel Muendlein
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
- Sylvia Mink
  - Central Medical Laboratories, A-6800 Feldkirch, Austria
  - Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein
- Arthur Mader
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
  - Department of Internal Medicine III, Academic Teaching Hospital Feldkirch, A-6800 Feldkirch, Austria
- Christoph H. Saely
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
  - Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein
  - Department of Internal Medicine III, Academic Teaching Hospital Feldkirch, A-6800 Feldkirch, Austria
- Andreas Festa
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
- Peter Fraunberger
  - Central Medical Laboratories, A-6800 Feldkirch, Austria
  - Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein
- Heinz Drexel
  - Vorarlberg Institute for Vascular Investigation and Treatment (VIVIT), A-6800 Feldkirch, Austria
  - Faculty of Medical Sciences, Private University of the Principality of Liechtenstein, FL-9495 Triesen, Liechtenstein
  - Vorarlberger Landeskrankenhausbetriebsgesellschaft, Academic Teaching Hospital Feldkirch, A-6800 Feldkirch, Austria
  - Drexel University College of Medicine, Philadelphia, PA 19129, USA
2
Alghamdi S, Turki T. A novel interpretable deep transfer learning combining diverse learnable parameters for improved T2D prediction based on single-cell gene regulatory networks. Sci Rep 2024; 14:4491. [PMID: 38396138 PMCID: PMC10891129 DOI: 10.1038/s41598-024-54923-y] [Received: 09/12/2023] [Accepted: 02/18/2024] [Indexed: 02/25/2024]
Abstract
Accurate deep learning (DL) models to predict type 2 diabetes (T2D) are concerned not only with the discrimination task but also with learning useful feature representations. However, existing DL tools are far from perfect and do not provide an appropriate interpretation that explains and promotes superior performance in the target task. Therefore, we provide an interpretable approach for our presented deep transfer learning (DTL) models to overcome such drawbacks, working as follows. We utilize several pre-trained models, including SEResNet152 and SEResNeXT101. Then, we transfer knowledge from the pre-trained models by keeping the weights in the convolutional base (i.e., the feature extraction part) while modifying the classification part, using the Adam optimizer, to classify healthy controls and T2D based on single-cell gene regulatory network (SCGRN) images. Other DTL models work in a similar manner but keep only the weights of the bottom layers in the feature extraction part unaltered, while updating the weights of subsequent layers through training from scratch. Experimental results on all 224 SCGRN images using five-fold cross-validation show that our model (TFeSEResNeXT101) achieves the highest average balanced accuracy (BAC) of 0.97, thereby significantly outperforming the baseline, which resulted in an average BAC of 0.86. Moreover, the simulation study demonstrated that the superiority is attributed to the distributional conformance of model weight parameters obtained with the Adam optimizer when coupled with weights from a pre-trained model.
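The evaluation protocol named in this abstract, five-fold cross-validation with an averaged balanced accuracy (BAC), can be sketched in plain Python as follows. This is an illustration only, not the authors' pipeline: the stratified fold assignment, the 100/124 class split, and the majority-class stand-in "model" are all assumptions made here so that the example runs without a deep learning framework. The helper names `stratified_kfold` and `balanced_accuracy` are likewise hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class ratios per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 224 samples; the 100/124 class split is an assumption.
labels = [0] * 100 + [1] * 124
fold_scores = []
for train, test in stratified_kfold(labels, k=5):
    train_labels = [labels[i] for i in train]
    # Stand-in "model": always predict the majority class of the training fold.
    majority = max(set(train_labels), key=train_labels.count)
    y_true = [labels[i] for i in test]
    y_pred = [majority] * len(test)
    fold_scores.append(balanced_accuracy(y_true, y_pred))

mean_bac = sum(fold_scores) / len(fold_scores)
print(f"mean BAC over {len(fold_scores)} folds: {mean_bac:.2f}")
```

The majority-class stand-in scores a BAC of 0.5 on every fold, which is the chance level against which figures such as 0.86 or 0.97 are read; a real classifier would replace the two lines that compute `majority` and `y_pred`.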
Affiliation(s)
- Sumaya Alghamdi
  - Department of Computer Science, King Abdulaziz University, 21589, Jeddah, Saudi Arabia
  - Department of Computer Science, Albaha University, 65799, Albaha, Saudi Arabia
- Turki Turki
  - Department of Computer Science, King Abdulaziz University, 21589, Jeddah, Saudi Arabia