1
Traquete F, Sousa Silva M, Ferreira AEN. Enhancing supervised analysis of imbalanced untargeted metabolomics datasets using a CWGAN-GP framework for data augmentation. Comput Biol Med 2024; 184:109414. [PMID: 39546879 DOI: 10.1016/j.compbiomed.2024.109414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 10/21/2024] [Accepted: 11/08/2024] [Indexed: 11/17/2024]
Abstract
Untargeted metabolomics is an extremely useful approach for the discrimination of biological systems and biomarker identification. However, data analysis workflows are complex and face many challenges. Two of these challenges are the demand of high sample size and the possibility of severe class imbalance, which is particularly common in clinical studies. The latter can make statistical models less generalizable, increase the risk of overfitting and skew the analysis in favour of the majority class. One possible approach to mitigate this problem is data augmentation. However, the use of artificial data requires adequate data augmentation methods and criteria for assessing the quality of the generated data. In this work, we used Conditional Wasserstein Generative Adversarial Networks with Gradient Penalty (CWGAN-GPs) for data augmentation of metabolomics data. Using a set of benchmark datasets, we applied several criteria for the evaluation of the quality of generated data and assessed the performance of supervised predictive models trained with datasets that included such data. CWGAN-GP models generated realistic data with identical characteristics to real samples, mostly avoiding mode collapse. Furthermore, in cases of class imbalance, the performance of predictive models improved by supplementing the minority class with generated samples. This is evident for high quality datasets with well separated classes. Conversely, model improvements were quite modest for high class overlap datasets. This trend was confirmed by using synthetic datasets with different class separation levels. Data augmentation is a viable procedure to alleviate class imbalance problems but is not universally beneficial in metabolomics.
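For orientation, the critic objective usually associated with a conditional WGAN-GP can be written as below. This is the standard gradient-penalty formulation with a class label added as the condition. The abstract does not give the authors' exact loss, so the symbols (critic D, generator G, noise z, label y, penalty weight λ, interpolation factor ε) are assumptions made for illustration.

```latex
\[
L_D = \mathbb{E}_{z,\,y}\big[D(G(z \mid y) \mid y)\big]
    - \mathbb{E}_{x,\,y}\big[D(x \mid y)\big]
    + \lambda\, \mathbb{E}_{\hat{x},\,y}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x} \mid y) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1-\epsilon)\, G(z \mid y),\quad \epsilon \sim U[0,1]
\]
```

The last term penalizes critic gradients whose norm departs from 1 at points interpolated between real and generated samples, which is what distinguishes the gradient-penalty variant from weight clipping.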
Affiliation(s)
- Francisco Traquete
- FT-ICR and Structural Mass Spectrometry Laboratory, Faculdade de Ciências, Universidade de Lisboa, Portugal; Biosystems and Integrative Sciences Institute (BioISI), Faculdade de Ciências, Universidade de Lisboa, Campo Grande, 1749-016, Lisboa, Portugal.
- Marta Sousa Silva
- FT-ICR and Structural Mass Spectrometry Laboratory, Faculdade de Ciências, Universidade de Lisboa, Portugal; Biosystems and Integrative Sciences Institute (BioISI), Faculdade de Ciências, Universidade de Lisboa, Campo Grande, 1749-016, Lisboa, Portugal.
- António E N Ferreira
- FT-ICR and Structural Mass Spectrometry Laboratory, Faculdade de Ciências, Universidade de Lisboa, Portugal; Biosystems and Integrative Sciences Institute (BioISI), Faculdade de Ciências, Universidade de Lisboa, Campo Grande, 1749-016, Lisboa, Portugal.
2
Cobre ADF, Fachi MM, Domingues KZA, Lazo REL, Ferreira LM, Tonin FS, Pontarolo R. Accuracy of COVID-19 diagnostic tests via infrared spectroscopy: A systematic review and meta-analysis. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2024; 327:125337. [PMID: 39481165 DOI: 10.1016/j.saa.2024.125337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 10/19/2024] [Accepted: 10/22/2024] [Indexed: 11/02/2024]
Abstract
This study aims to synthesize the evidence on the accuracy parameters of COVID-19 diagnostic methods based on Fourier-transform infrared (FTIR) spectroscopy. A systematic review with searches in PubMed and Embase was performed (September 2023). Studies reporting data on test specificity, sensitivity, true positives, true negatives, false positives, and false negatives using different human samples were included. Meta-analyses of accuracy estimates with 95% confidence intervals (CI) and of the area under the ROC curve (AUC) were conducted (Meta-Disc 1.4.7). Seventeen studies were included; all of them highlighted the regions 650-1800 cm⁻¹ and 2300-3900 cm⁻¹ as most important for diagnosing COVID-19. The FTIR technique presented high sensitivity [0.912 (95% CI, 0.878-0.939)], especially in vaccinated [0.959 (95% CI, 0.908-0.987)] compared to unvaccinated [0.625 (95% CI, 0.584-0.664)] individuals. Overall specificity was also high [0.886 (95% CI, 0.855-0.912)], with higher rates in vaccinated [0.884 (95% CI, 0.819-0.932)] than in unvaccinated [0.667 (95% CI, 0.629-0.704)] patients. These findings reveal that FTIR is an accurate technique for detecting SARS-CoV-2 infection in different biological matrices, with advantages including low cost, speed, environmental friendliness, and minimal sample preparation. This could lead to easy implementation of this technique in practice as a screening tool for patients with suspected COVID-19, especially in low-income countries.
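The accuracy parameters named in this abstract follow directly from the four confusion-matrix counts. A minimal sketch is given below; the counts are hypothetical and are not taken from any study in the review, and the pooling performed by Meta-Disc is not reproduced here.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly infected samples that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of truly uninfected samples that the test flags as negative."""
    return tn / (tn + fp)


# Hypothetical counts for a single diagnostic study (illustration only).
tp, fn, tn, fp = 90, 10, 85, 15

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"specificity = {specificity(tn, fp):.3f}")
```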
Affiliation(s)
- Alexandre de Fátima Cobre
- Pharmaceutical Sciences Postgraduate Program, Universidade Federal do Paraná, Curitiba, Brazil; School of Biosciences and Medicine, Faculty of Health and Medical Sciences, University of Surrey, United Kingdom
- Mariana Millan Fachi
- Pharmaceutical Sciences Postgraduate Program, Universidade Federal do Paraná, Curitiba, Brazil
- Raul Edison Luna Lazo
- Pharmaceutical Sciences Postgraduate Program, Universidade Federal do Paraná, Curitiba, Brazil
- Luana Mota Ferreira
- Department of Pharmacy, Pharmaceutical Sciences Postgraduate Program, Universidade Federal do Paraná, Curitiba, Brazil
- Fernanda Stumpf Tonin
- H&TRC - Health & Technology Research Centre, ESTeSL, Escola Superior de Tecnologia da Saúde, Instituto Politécnico de Lisboa, Lisbon, Portugal
- Roberto Pontarolo
- Department of Pharmacy, Pharmaceutical Sciences Postgraduate Program, Universidade Federal do Paraná, Curitiba, Brazil.
3
de Fátima Cobre A, Alves AC, Gotine ARM, Domingues KZA, Lazo REL, Ferreira LM, Tonin FS, Pontarolo R. Novel COVID-19 biomarkers identified through multi-omics data analysis: N-acetyl-4-O-acetylneuraminic acid, N-acetyl-L-alanine, N-acetyltriptophan, palmitoylcarnitine, and glycerol 1-myristate. Intern Emerg Med 2024; 19:1439-1458. [PMID: 38416303 DOI: 10.1007/s11739-024-03547-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 01/16/2024] [Indexed: 02/29/2024]
Abstract
This study aims to apply machine learning models to identify new biomarkers associated with the early diagnosis and prognosis of SARS-CoV-2 infection. Plasma and serum samples from COVID-19 patients (mild, moderate, and severe), patients with other pneumonia (but with negative COVID-19 RT-PCR), and healthy volunteers (control) from hospitals in four different countries (China, Spain, France, and Italy) were analyzed by GC-MS, LC-MS, and NMR. Machine learning models (PCA and PLS-DA) were developed to predict the diagnosis and prognosis of COVID-19 and identify biomarkers associated with these outcomes. A total of 1410 patient samples were analyzed. The PLS-DA model presented a diagnostic and prognostic accuracy of around 95% across all analyzed data. A total of 23 biomarkers (e.g., spermidine, taurine, L-aspartic, L-glutamic, L-phenylalanine and xanthine, ornithine, and ribothimidine) have been identified as being associated with the diagnosis and prognosis of COVID-19. Additionally, we identified for the first time five new biomarkers (N-Acetyl-4-O-acetylneuraminic acid, N-Acetyl-L-Alanine, N-Acetyltriptophan, palmitoylcarnitine, and glycerol 1-myristate) that are also associated with the severity and diagnosis of COVID-19. These five new biomarkers were elevated in severe COVID-19 patients compared to patients with mild disease or healthy volunteers. The PLS-DA model was able to predict the diagnosis and prognosis of COVID-19 with an accuracy of around 95%. Additionally, our investigation pinpointed five novel potential biomarkers linked to the diagnosis and prognosis of COVID-19: N-Acetyl-4-O-acetylneuraminic acid, N-Acetyl-L-Alanine, N-Acetyltriptophan, palmitoylcarnitine, and glycerol 1-myristate. These biomarkers exhibited heightened levels in severe COVID-19 patients compared to those with mild COVID-19 or healthy volunteers.
Affiliation(s)
- Alexessander Couto Alves
- School of Biosciences and Medicine, Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Luana Mota Ferreira
- Department of Pharmacy, Universidade Federal do Paraná, Campus III, Av. Pref. Lothário Meissner, 632, Jardim Botânico, Curitiba, PR, 80210-170, Brazil
- Fernanda Stumpf Tonin
- H&TRC - Health & Technology Research Centre, ESTeSL, Escola Superior de Tecnologia da Saúde, Instituto Politécnico de Lisboa, Lisbon, Portugal
- Roberto Pontarolo
- Department of Pharmacy, Universidade Federal do Paraná, Campus III, Av. Pref. Lothário Meissner, 632, Jardim Botânico, Curitiba, PR, 80210-170, Brazil.
4
Zhang C, Jiang X, Wu S, Zhang J, Wang Y, Li Z, Yao J. Dietary fat and carbohydrate-balancing the lactation performance and methane emissions in the dairy cow industry: A meta-analysis. ANIMAL NUTRITION (ZHONGGUO XU MU SHOU YI XUE HUI) 2024; 17:347-357. [PMID: 38800741 PMCID: PMC11127094 DOI: 10.1016/j.aninu.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 01/11/2024] [Accepted: 02/20/2024] [Indexed: 05/29/2024]
Abstract
For the agroecosystems of the dairy cow industry, dietary carbohydrate (starch, neutral detergent fiber [NDF]) and fat could directly affect rumen methane emissions and host energy utilization. However, the relationships among diet, lactation performance, and methane emissions need to be further determined to help dairy farms adjust diet formulations and feeding strategies for environmental and production management. A meta-analysis was conducted in the current study to explore quantitative patterns of dietary fat and carbohydrate at different levels in balancing lactation performance and environmental sustainability of dairy cows, and to establish a methane emission prediction model using an artificial neural network (ANN). The results showed that the regression relationships between dietary fat, carbohydrate and methane emissions could be described by the following models: methane = 106.78 + (14.86 × DMI), R² = 0.80; methane = 443.17 - (46.41 × starch/NDF), R² = 0.76; and methane = 388.91 + (31.40 × fat) - (5.42 × fat²), R² = 0.80. The regression relationships between dietary fat, carbohydrate and lactation performance could be described by the following models: milk fat yield = 1.08 + (0.43 × starch/NDF) - [0.34 × (starch/NDF)²], R² = 0.79; milk protein yield = 0.68 + (0.15 × fat) - (0.016 × fat²), R² = 0.82. In the structural equation model, we found that when formulating dietary carbohydrates and fats, it was necessary to balance the relationship between methane emissions and lactation performance. Specifically, when dietary starch/NDF was lower than 0.63 (extremum point) and dietary fat was between 2.89% and 4.69% (extremum points), methane emission reduction (methane emissions decrease with increasing dietary starch/NDF and fat) could be achieved without losing lactation performance of dairy cows (lactation performance increases with increasing dietary starch/NDF and fat). Finally, we established the ANN model to predict methane emissions (training set: R² = 0.62; validation set: R² = 0.61).
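The extremum points quoted in this abstract can be checked against the reported quadratic models, since a model of the form y = a + b·x + c·x² has its extremum at x = -b / (2c). The sketch below uses only the coefficients printed in the abstract; the function names are illustrative, and the units of each variable are not stated in the abstract, so none are assumed here.

```python
def quadratic_extremum(linear: float, quadratic: float) -> float:
    """Location of the extremum of y = a + linear*x + quadratic*x**2."""
    return -linear / (2.0 * quadratic)


def methane_from_fat(fat: float) -> float:
    """Reported model: methane = 388.91 + 31.40*fat - 5.42*fat^2."""
    return 388.91 + 31.40 * fat - 5.42 * fat ** 2


# Coefficients copied from the abstract's regression models.
starch_ndf_peak = quadratic_extremum(0.43, -0.34)    # milk fat yield vs starch/NDF
fat_methane_peak = quadratic_extremum(31.40, -5.42)  # methane vs dietary fat
fat_protein_peak = quadratic_extremum(0.15, -0.016)  # milk protein yield vs fat

print(f"starch/NDF extremum: {starch_ndf_peak:.3f}")
print(f"fat extremum (methane model): {fat_methane_peak:.3f}")
print(f"fat extremum (milk protein model): {fat_protein_peak:.3f}")
```

The three values land close to the 0.63, 2.89 and 4.69 given in the abstract; small differences are expected because the published coefficients are rounded.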
Affiliation(s)
- Shengru Wu
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, Shaanxi, China
- Jun Zhang
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, Shaanxi, China
- Yue Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, Shaanxi, China
- Zongjun Li
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, Shaanxi, China
- Junhu Yao
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, Shaanxi, China
5
Santilli G, Vetrano M, Mangone M, Agostini F, Bernetti A, Coraci D, Paoloni M, de Sire A, Paolucci T, Latini E, Santoboni F, Nusca SM, Vulpiani MC. Predictive Prognostic Factors in Non-Calcific Supraspinatus Tendinopathy Treated with Focused Extracorporeal Shock Wave Therapy: An Artificial Neural Network Approach. Life (Basel) 2024; 14:681. [PMID: 38929665 PMCID: PMC11205102 DOI: 10.3390/life14060681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 05/22/2024] [Accepted: 05/23/2024] [Indexed: 06/28/2024] Open
Abstract
The supraspinatus tendon is one of the tendons most often involved in the development of shoulder pain. Extracorporeal shockwave therapy (ESWT) has been recognized as a valid and safe treatment. Sometimes the symptoms cannot be relieved, or a relapse develops, affecting the patient's quality of life. Therefore, a prediction protocol could be a powerful tool to aid clinical decisions. An artificial neural network was run, specifically a multilayer perceptron model incorporating input information such as the VAS and Constant-Murley score, administered at T0 and at T1 after six months. It showed a model sensitivity of 80.7%, and the area under the ROC curve was 0.701, which demonstrates good discrimination. The aim of our study was to identify predictive factors for minimal clinically successful therapy (MCST), defined as a reduction of ≥40% in VAS score at T1 following ESWT for chronic non-calcific supraspinatus tendinopathy (SNCCT). Greater and more frequent clinical success is expected in male patients. The more severe the patient's initial condition, the less likely clinical success becomes. The Constant and Murley score, Roles and Maudsley score, and VAS are not just evaluation tools to verify an improvement; they are also prognostic factors to be taken into consideration when assessing the likelihood of clinical success. Because of the lower clinical improvement observed in older patients and in those with worse clinical and functional scores, it would be preferable to also offer these patients combined treatments. The ANN predictive model is reasonable and accurate in studying the influence of prognostic factors on achieving clinical success in patients with chronic non-calcific tendinopathy of the supraspinatus treated with ESWT.
Affiliation(s)
- Gabriele Santilli
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
- Mario Vetrano
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
- Massimiliano Mangone
- Department of Anatomical and Histological Sciences, Legal Medicine and Orthopedics, Sapienza University, 00185 Rome, Italy
- Francesco Agostini
- Department of Anatomical and Histological Sciences, Legal Medicine and Orthopedics, Sapienza University, 00185 Rome, Italy
- Andrea Bernetti
- Department of Biological and Environmental Science and Technologies, University of Salento, 73100 Lecce, Italy
- Daniele Coraci
- Department of Neuroscience, Section of Rehabilitation, University of Padua, 35122 Padua, Italy
- Marco Paoloni
- Department of Anatomical and Histological Sciences, Legal Medicine and Orthopedics, Sapienza University, 00185 Rome, Italy
- Alessandro de Sire
- Physical and Rehabilitative Medicine, Department of Medical and Surgical Sciences, University of Catanzaro “Magna Graecia”, 88100 Catanzaro, Italy
- Research Center on Musculoskeletal Health, MusculoSkeletalHealth@UMG, University of Catanzaro “Magna Graecia”, 88100 Catanzaro, Italy
- Teresa Paolucci
- Department of Oral Medical Science and Biotechnology, G. D’Annunzio University of Chieti-Pescara, 66100 Chieti, Italy
- Eleonora Latini
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
- Flavia Santoboni
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
- Sveva Maria Nusca
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
- Maria Chiara Vulpiani
- Physical Medicine and Rehabilitation Unit, Sant’Andrea Hospital, Sapienza University of Rome, 00189 Rome, Italy
6
He Y, Sun Q, Matsunaga M, Ota A. Can feature structure improve model's precision? A novel prediction method using artificial image and image identification. JAMIA Open 2024; 7:ooae012. [PMID: 38348347 PMCID: PMC10860535 DOI: 10.1093/jamiaopen/ooae012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/03/2024] [Accepted: 02/01/2024] [Indexed: 02/15/2024] Open
Abstract
Objectives: This study aimed to develop an approach that enhances model precision using artificial images. Materials and Methods: Given an epidemiological study designed to predict 1 response using f features with M samples, each feature was converted into a pixel with a certain value. These pixels were permuted into F orders, resulting in F distinct artificial image sample sets. Based on experience with image recognition techniques, appropriate training images result in a higher-precision model. In the preliminary experiment, a binary response was predicted from 76 features; the sample set included 223 patients and 1776 healthy controls. Results: We randomly selected 10 000 artificial sample sets to train the model. The models' performance (area under the receiver operating characteristic curve values) followed a bell-shaped distribution. Conclusion: The model construction strategy developed in this research has the potential to capture feature-order-related information and enhance model predictability.
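The feature-to-pixel step described in this abstract can be sketched roughly as follows. The abstract does not say how the pixels are laid out, so the square grid, the zero padding, and the helper names `features_to_image` and `random_orders` are assumptions made purely for illustration.

```python
import math
import random


def features_to_image(features, order):
    """Place feature values on a square grid, following the given feature order.

    Assumption: unused grid cells are zero-padded; the source does not specify this.
    """
    side = math.ceil(math.sqrt(len(features)))
    pixels = [features[i] for i in order]
    pixels += [0.0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def random_orders(n_features, n_orders, seed=0):
    """Draw n_orders random permutations of the feature indices."""
    rng = random.Random(seed)
    orders = []
    for _ in range(n_orders):
        order = list(range(n_features))
        rng.shuffle(order)
        orders.append(order)
    return orders


# One hypothetical sample with 76 features already scaled to [0, 1].
sample = [i / 75 for i in range(76)]
orders = random_orders(len(sample), n_orders=3)
images = [features_to_image(sample, order) for order in orders]

print(len(images), len(images[0]), len(images[0][0]))  # prints "3 9 9"
```

Each permutation yields a different artificial image of the same sample, so applying one permutation to every sample gives one artificial image sample set.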
Affiliation(s)
- Yupeng He
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Aichi 4701192, Japan
- Qiwen Sun
- Independent scholar, Nagoya, Aichi 4640831, Japan
- Masaaki Matsunaga
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Aichi 4701192, Japan
- Atsuhiko Ota
- Department of Public Health, Fujita Health University School of Medicine, Toyoake, Aichi 4701192, Japan
7
Kim SY, Shin SY, Saeed M, Ryu JE, Kim JS, Ahn J, Jung Y, Moon JM, Choi CH, Choi HK. Prediction of Clinical Remission with Adalimumab Therapy in Patients with Ulcerative Colitis by Fourier Transform-Infrared Spectroscopy Coupled with Machine Learning Algorithms. Metabolites 2023; 14:2. [PMID: 38276292 PMCID: PMC10818421 DOI: 10.3390/metabo14010002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/06/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
We aimed to develop prediction models for clinical remission associated with adalimumab treatment in patients with ulcerative colitis (UC) using Fourier transform-infrared (FT-IR) spectroscopy coupled with machine learning (ML) algorithms. This prospective, observational, multicenter study enrolled 62 UC patients and 30 healthy controls. The patients were treated with adalimumab for 56 weeks, and clinical remission was evaluated using the Mayo score. Baseline fecal samples were collected and analyzed using FT-IR spectroscopy. Various data preprocessing methods were applied, and prediction models were established by 10-fold cross-validation using various ML methods. Orthogonal partial least squares-discriminant analysis (OPLS-DA) showed a clear separation of healthy controls and UC patients, applying area normalization and Pareto scaling. OPLS-DA models predicting short- and long-term remission (8 and 56 weeks) yielded area-under-the-curve values of 0.76 and 0.75, respectively. Logistic regression and a nonlinear support vector machine were selected as the best prediction models for short- and long-term remission, respectively (accuracy of 0.99). In external validation, prediction models for short-term (logistic regression) and long-term (decision tree) remission performed well, with accuracy values of 0.73 and 0.82, respectively. This was the first study to develop prediction models for clinical remission associated with adalimumab treatment in UC patients by fecal analysis using FT-IR spectroscopy coupled with ML algorithms. Logistic regression, nonlinear support vector machines, and decision tree were suggested as the optimal prediction models for remission, and these were noninvasive, simple, inexpensive, and fast analyses that could be applied to personalized treatments.
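Area normalization and Pareto scaling, the two preprocessing steps named in this abstract, are commonly defined as in the sketch below. The abstract gives no implementation details, so the definitions used here (area taken as the sum of absolute intensities on an evenly spaced axis, Pareto scaling as mean-centering divided by the square root of the sample standard deviation) and the toy spectra are assumptions.

```python
import math
import statistics


def area_normalize(spectrum):
    """Divide each intensity by the total absolute area of the spectrum."""
    total = sum(abs(v) for v in spectrum)
    return [v / total for v in spectrum]


def pareto_scale(matrix):
    """Column-wise Pareto scaling: (x - mean) / sqrt(standard deviation)."""
    columns = list(zip(*matrix))
    means = [statistics.fmean(col) for col in columns]
    # Constant columns get a scale of 1.0 to avoid division by zero.
    scales = [math.sqrt(statistics.stdev(col)) or 1.0 for col in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, scales)]
        for row in matrix
    ]


# Hypothetical spectra: 3 samples x 3 wavenumber bins (illustration only).
spectra = [
    [1.0, 2.0, 1.0],
    [2.0, 2.0, 4.0],
    [3.0, 5.0, 2.0],
]

normalized = [area_normalize(s) for s in spectra]
scaled = pareto_scale(normalized)

for row in scaled:
    print([round(v, 3) for v in row])
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation shrinks large peaks less aggressively, so intense bands keep more of their weight in the model.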
Affiliation(s)
- Seok-Young Kim
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Seung Yong Shin
- Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea; (S.Y.S.); (J.M.M.)
- Maham Saeed
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Ji Eun Ryu
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Jung-Seop Kim
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Junyoung Ahn
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Youngmi Jung
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
- Jung Min Moon
- Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea; (S.Y.S.); (J.M.M.)
- Chang Hwan Choi
- Department of Internal Medicine, College of Medicine, Chung-Ang University, Seoul 06973, Republic of Korea; (S.Y.S.); (J.M.M.)
- Hyung-Kyoon Choi
- College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea; (S.-Y.K.); (M.S.); (J.E.R.); (J.-S.K.); (J.A.); (Y.J.)
8
Wang Y, Wei W, Du W, Cai J, Liao Y, Lu H, Kong B, Zhang Z. Deep-Learning-Based Mixture Identification for Nuclear Magnetic Resonance Spectroscopy Applied to Plant Flavors. Molecules 2023; 28:7380. [PMID: 37959799 PMCID: PMC10648966 DOI: 10.3390/molecules28217380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/25/2023] [Accepted: 10/30/2023] [Indexed: 11/15/2023] Open
Abstract
Nuclear magnetic resonance (NMR) is a crucial technique for analyzing mixtures consisting of small molecules, providing non-destructive, fast, reproducible, and unbiased benefits. However, it is challenging to perform mixture identification because of the offset of chemical shifts and peak overlaps that often exist in mixtures such as plant flavors. Here, we propose a deep-learning-based mixture identification method (DeepMID) that can be used to identify plant flavors (mixtures) in a formulated flavor (mixture consisting of several plant flavors) without the need to know the specific components in the plant flavors. A pseudo-Siamese convolutional neural network (pSCNN) and a spatial pyramid pooling (SPP) layer were used to solve the problems due to their high accuracy and robustness. The DeepMID model is trained, validated, and tested on an augmented data set containing 50,000 pairs of formulated and plant flavors. We demonstrate that DeepMID can achieve excellent prediction results in the augmented test set: ACC = 99.58%, TPR = 99.48%, FPR = 0.32%; and two experimentally obtained data sets: one shows ACC = 97.60%, TPR = 92.81%, FPR = 0.78% and the other shows ACC = 92.31%, TPR = 80.00%, FPR = 0.00%. In conclusion, DeepMID is a reliable method for identifying plant flavors in formulated flavors based on NMR spectroscopy, which can assist researchers in accelerating the design of flavor formulations.
Affiliation(s)
- Yufei Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Weiwei Wei
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Wen Du
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Jiaxiao Cai
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Yuxuan Liao
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
- Bo Kong
- Technology Center, China Tobacco Hunan Industrial Co., Ltd., Changsha 410014, China; (W.W.); (W.D.); (J.C.)
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; (Y.W.); (Y.L.); (H.L.)
9
Wenck S, Mix T, Fischer M, Hackl T, Seifert S. Opening the Random Forest Black Box of 1H NMR Metabolomics Data by the Exploitation of Surrogate Variables. Metabolites 2023; 13:1075. [PMID: 37887402 PMCID: PMC10608983 DOI: 10.3390/metabo13101075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 10/05/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023] Open
Abstract
The untargeted metabolomics analysis of biological samples with nuclear magnetic resonance (NMR) provides highly complex data containing various signals from different molecules. To use these data for classification, e.g., in the context of food authentication, machine learning methods are used. These methods are usually applied as a black box, which means that no information about the complex relationships between the variables and the outcome is obtained. In this study, we show that the random forest-based approach surrogate minimal depth (SMD) can be applied for a comprehensive analysis of class-specific differences by selecting relevant variables and analyzing their mutual impact on the classification model of different truffle species. SMD allows the assignment of variables from the same metabolites as well as the detection of interactions between different metabolites that can be attributed to known biological relationships.
Affiliation(s)
- Soeren Wenck
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
- Thorsten Mix
- Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany;
- Markus Fischer
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
- Thomas Hackl
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
- Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany;
- Stephan Seifert
- Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
10
Li W, Shao C, Li C, Zhou H, Yu L, Yang J, Wan H, He Y. Metabolomics: A useful tool for ischemic stroke research. J Pharm Anal 2023; 13:968-983. [PMID: 37842657 PMCID: PMC10568109 DOI: 10.1016/j.jpha.2023.05.015] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/17/2023] [Revised: 05/14/2023] [Accepted: 05/29/2023] [Indexed: 10/17/2023] Open
Abstract
Ischemic stroke (IS) is a multifactorial and heterogeneous disease. Despite years of studies, effective strategies for the diagnosis, management and treatment of stroke are still lacking in clinical practice. Metabolomics is a growing field in systems biology. It is starting to show promise in the identification of biomarkers and in the use of pharmacometabolomics to help patients with certain disorders choose their course of treatment. The development of metabolomics has enabled further and more biological applications. Particularly, metabolomics is increasingly being used to diagnose diseases, discover new drug targets, elucidate mechanisms, and monitor therapeutic outcomes and its potential effect on precision medicine. In this review, we reviewed some recent advances in the study of metabolomics as well as how metabolomics might be used to identify novel biomarkers and understand the mechanisms of IS. Then, the use of metabolomics approaches to investigate the molecular processes and active ingredients of Chinese herbal formulations with anti-IS capabilities is summarized. We finally summarized recent developments in single cell metabolomics for exploring the metabolic profiles of single cells. Although the field is relatively young, the development of single cell metabolomics promises to provide a powerful tool for unraveling the pathogenesis of IS.
Affiliation(s)
- Wentao Li
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Chongyu Shao
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Chang Li
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Huifen Zhou
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Li Yu
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Jiehong Yang
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Haitong Wan
- School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
- Yu He
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou, 310053, China
11
Bartmanski BJ, Rocha M, Zimmermann-Kogadeeva M. Recent advances in data- and knowledge-driven approaches to explore primary microbial metabolism. Curr Opin Chem Biol 2023; 75:102324. [PMID: 37207402 PMCID: PMC10410306 DOI: 10.1016/j.cbpa.2023.102324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 05/21/2023]
Abstract
With the rapid progress in metabolomics and sequencing technologies, more data on the metabolome of single microbes and their communities become available, revealing the potential of microorganisms to metabolize a broad range of chemical compounds. The analysis of microbial metabolomics datasets remains challenging since it inherits the technical challenges of metabolomics analysis, such as compound identification and annotation, while harboring challenges in data interpretation, such as distinguishing metabolite sources in mixed samples. This review outlines the recent advances in computational methods to analyze primary microbial metabolism: knowledge-based approaches that take advantage of metabolic and molecular networks and data-driven approaches that employ machine/deep learning algorithms in combination with large-scale datasets. These methods aim at improving metabolite identification and disentangling reciprocal interactions between microbes and metabolites. We also discuss the perspective of combining these approaches and further developments required to advance the investigation of primary metabolism in mixed microbial samples.
Affiliation(s)
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, Campus of Gualtar, Braga, Portugal
|
12
|
Alatrany AS, Khan W, Hussain A, Al-Jumeily D. Wide and deep learning based approaches for classification of Alzheimer's disease using genome-wide association studies. PLoS One 2023; 18:e0283712. [PMID: 37126509 PMCID: PMC10150974 DOI: 10.1371/journal.pone.0283712] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/04/2022] [Accepted: 03/15/2023] [Indexed: 05/02/2023] Open
Abstract
The increasing incidence of Alzheimer's disease (AD) has led to significant growth in socioeconomic challenges. A reliable prediction of AD might help to mitigate or at least slow its progression, for which identification of the factors affecting AD and its accurate diagnosis are vital. In this study, we use a Genome-Wide Association Studies (GWAS) dataset which comprises significant genetic markers of complex diseases. The original dataset contains a large number of attributes (620901), for which we propose a hybrid feature selection approach based on association test, principal component analysis, and the Boruta algorithm, to identify the most promising predictors of AD. The selected features are then forwarded to wide and deep neural network models to classify the AD cases and healthy controls. The experimental outcomes indicate that our approach outperformed the existing methods when evaluated on a standard dataset, producing an accuracy and F1-score of 99%. The outcomes of this study are impactful, particularly the identified features comprising AD-associated genes and a reliable classification model that might be useful for other chronic diseases.
Affiliation(s)
- Abbas Saad Alatrany
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom
  - University of Information Technology and Communications, Baghdad, Iraq
  - Imam Ja’afar Al-Sadiq University, Baghdad, Iraq
- Wasiq Khan
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom
- Abir Hussain
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom
  - Department of Electrical Engineering, University of Sharjah, Sharjah, UAE
- Dhiya Al-Jumeily
  - School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool, United Kingdom
|
13
|
Liu W, Zhang L, Bao L, Shen G, Feng J. Accurate Classification and Prediction of Acute Myocardial Infarction through an ARMD Procedure. J Proteome Res 2023; 22:758-767. [PMID: 36710647 DOI: 10.1021/acs.jproteome.2c00488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Abstract
The risk stratification of acute myocardial infarction (AMI) patients is of prime importance for clinical management and prognosis assessment. Thus, we propose an ensemble machine learning analysis procedure named ADASYN-RFECV-MDA-DNN (ARMD) to address sample-unbalanced problems and enable stratification and prediction of AMI outcomes. The ARMD analysis procedure was applied to the NMR data of sera from 534 AMI-related subjects in four categories with an extremely imbalanced sample proportion. Firstly, the adaptive synthetic sampling (ADASYN) algorithm was used to address the issue of the original sample imbalance. Secondly, recursive feature elimination with cross-validation (RFECV) and the random forest mean decrease accuracy (RF-MDA) algorithm were performed to identify the differential metabolites corresponding to each AMI outcome. Finally, a deep neural network (DNN) was employed to classify and predict AMI events, and its performance was evaluated by comparison with four traditional machine learning methods. Compared with these four machine learning models, the DNN presented consistent superiority in almost all of the model parameters including precision, F1-score, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and classification accuracy, highlighting the potential of deep learning in classification and stratification of clinical diseases. The ARMD analysis procedure is a practical analysis tool for supervised classification and regression modeling of clinical diseases.
Affiliation(s)
- Wuping Liu
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Lirong Zhang
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Lijun Bao
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Guiping Shen
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
- Jianghua Feng
  - Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, 422 Siming South Road, Siming District, Xiamen, Fujian 361005, China
|
14
|
Mohammed MA, Abdulkareem KH, Dinar AM, Zapirain BG. Rise of Deep Learning Clinical Applications and Challenges in Omics Data: A Systematic Review. Diagnostics (Basel) 2023; 13:664. [PMID: 36832152 PMCID: PMC9955380 DOI: 10.3390/diagnostics13040664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Revised: 02/05/2023] [Accepted: 02/07/2023] [Indexed: 02/12/2023] Open
Abstract
This research aims to review and evaluate the most relevant scientific studies about deep learning (DL) models in the omics field. It also aims to fully realize the potential of DL techniques in omics data analysis by demonstrating this potential and identifying the key challenges that must be addressed. Several elements, such as the clinical applications and datasets reported in the literature, are essential for comprehending the surveyed studies. The published literature highlights the difficulties encountered by other researchers. In addition to looking for other studies, such as guidelines, comparative studies, and review papers, a systematic approach is used to search all relevant publications on omics and DL using different keyword variants. From 2018 to 2022, the search procedure was conducted on four Internet search engines: IEEE Xplore, Web of Science, ScienceDirect, and PubMed. These indexes were chosen because they offer enough coverage and linkages to numerous papers in the biological field. A total of 65 articles were added to the final list. The inclusion and exclusion criteria were specified. Of the 65 publications, 42 are clinical applications of DL in omics data. Furthermore, 16 out of 65 articles are review publications based on single- and multi-omics data from the proposed taxonomy. Finally, only a small number of articles (7/65) focus on comparative analysis and guidelines. The use of DL in studying omics data presented several obstacles related to DL itself, preprocessing procedures, datasets, model validation, and testbed applications. Numerous relevant investigations were performed to address these issues. Unlike other review papers, our study distinctly reflects different observations on omics with DL model areas. We believe that the result of this study can be a useful guideline for practitioners who look for a comprehensive view of the role of DL in omics data analysis.
Affiliation(s)
- Mazin Abed Mohammed
  - College of Computer Science and Information Technology, University of Anbar, Anbar 31001, Iraq
  - eVIDA Lab, University of Deusto, 48007 Bilbao, Spain
  - Correspondence: (M.A.M.); (B.G.Z.)
- Karrar Hameed Abdulkareem
  - College of Agriculture, Al-Muthanna University, Samawah 66001, Iraq
  - College of Engineering, University of Warith Al-Anbiyaa, Karbala 56001, Iraq
- Ahmed M. Dinar
  - Computer Engineering Department, University of Technology- Iraq, Baghdad 19006, Iraq
|
15
|
Liu X, Wang Z, Gmitter FG, Grosser JW, Wang Y. Effects of Different Rootstocks on the Metabolites of Huanglongbing-Affected Sweet Orange Juices Using a Novel Combined Strategy of Untargeted Metabolomics and Machine Learning. J Agric Food Chem 2023; 71:1246-1257. [PMID: 36606748 DOI: 10.1021/acs.jafc.2c07456] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 06/17/2023]
Abstract
Huanglongbing (HLB) is one of the most destructive citrus diseases, mainly caused by the Gram-negative bacteria Candidatus Liberibacter asiaticus. Aiming at unraveling the mechanisms of different scion/rootstock combinations on improving HLB-affected orange juice quality, the effects of rootstocks on the metabolites of HLB-affected sweet orange juices were investigated using a combined strategy of untargeted metabolomics and machine learning. A total of 2531 ion features were detected using UHPLC-Q-Orbitrap high-resolution electrospray ionization mass spectrometry, and 54 metabolites including amino acids, amines, flavonoids, coumarins, fatty acids, and glycosides were definitely or tentatively identified as the differential markers based on the random forest algorithm. Furthermore, 24 metabolites were verified and semi-quantified using authentic standards. Notably, the presence of specific amino acids and amines, especially polyamines, indicated that different rootstocks might affect glutamate, aspartate, proline, and arginine metabolism to regulate the physiological response against HLB. Meanwhile, the production of flavonoids and prenylated coumarins suggested that rootstocks influenced phenylalanine and phenylpropanoid metabolism. The possible metabolic pathways were proposed, and the important intermediates were verified by authentic standards. These results provide new insights on the effects of rootstocks on the metabolites of HLB-affected sweet orange juices.
Affiliation(s)
- Xin Liu
  - Citrus Research and Education Center, University of Florida, Lake Alfred, Florida 33850, United States
  - Department of Food Science and Human Nutrition, University of Florida, Gainesville, Florida 32611, United States
- Zhixin Wang
  - Citrus Research and Education Center, University of Florida, Lake Alfred, Florida 33850, United States
  - Department of Food Science and Human Nutrition, University of Florida, Gainesville, Florida 32611, United States
- Frederick G Gmitter
  - Citrus Research and Education Center, University of Florida, Lake Alfred, Florida 33850, United States
- Jude W Grosser
  - Citrus Research and Education Center, University of Florida, Lake Alfred, Florida 33850, United States
- Yu Wang
  - Citrus Research and Education Center, University of Florida, Lake Alfred, Florida 33850, United States
  - Department of Food Science and Human Nutrition, University of Florida, Gainesville, Florida 32611, United States
|
16
|
Tashakor S, Chamani A, Moshtaghie M. Noise pollution prediction and seasonal comparison in urban parks using a coupled GIS-artificial neural network model. Environ Monit Assess 2023; 195:303. [PMID: 36646942 DOI: 10.1007/s10661-022-10858-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Accepted: 12/15/2022] [Indexed: 06/17/2023]
Abstract
Noise pollution is a challenging environmental issue in densely built urban areas and requires a holistic understanding of its sources and alleviation processes. Taking Isfahan City in Iran as a typical case, this study developed a combined GIS-artificial neural network (ANN) model to predict the spatio-temporal contribution of low-width parks to noise pollution mitigation. The 30-min equivalent sound level was measured at 100 stations in six urban parks (with a total area of 55.84 ha) under stable and controlled winter and summer conditions. The noise level predicting variables were hypothesized to be the area of vegetation cover; NDVI-based vegetation density and standard deviation (std); vegetation height; and road coverage measured within 100-, 200-, and 300-m radius buffer rings drawn around each noise sampling station. These predictors were introduced to a multi-layer perceptron ANN model to identify and compare the most important noise alleviation variables among the selected predictors. The mean noise levels ranged from 67.23 to 70.57 dB. The number of vehicles showed an insignificant temporal difference, indicating that the noise source was relatively constant between the seasons. The ANN model performed satisfactorily in both seasons with SSE values of < 0.03. The Mann-Whitney U test showed a significant difference in the predicted noise levels between summer and winter. This study highlighted the efficiency of the combined GIS-ANN model in predicting distance-dependent urban processes, especially noise pollution, whose levels and variability are essential in formulating urban land-use management.
Affiliation(s)
- Shahla Tashakor
  - Environmental Science and Engineering Department, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
- Atefeh Chamani
  - Environmental Science and Engineering Department, Waste and Wastewater Research Center, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
- Minoo Moshtaghie
  - Environmental Science and Engineering Department, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
|
17
|
Nichani K, Uhlig S, Colson B, Hettwer K, Simon K, Bönick J, Uhlig C, Kemmlein S, Stoyke M, Gowik P, Huschek G, Rawel HM. Development of Non-Targeted Mass Spectrometry Method for Distinguishing Spelt and Wheat. Foods 2022; 12:141. [PMID: 36613357 PMCID: PMC9818861 DOI: 10.3390/foods12010141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/21/2022] [Revised: 12/13/2022] [Accepted: 12/21/2022] [Indexed: 12/29/2022] Open
Abstract
Food fraud, even when not in the news, is ubiquitous and demands the development of innovative strategies to combat it. A new non-targeted method (NTM) for distinguishing spelt and wheat is described, which aids in food fraud detection and authenticity testing. A highly resolved fingerprint in the form of spectra is obtained for several cultivars of spelt and wheat using liquid chromatography coupled high-resolution mass spectrometry (LC-HRMS). Convolutional neural network (CNN) models are built using a nested cross validation (NCV) approach by appropriately training them using a calibration set comprising duplicate measurements of eleven cultivars of wheat and spelt, each. The results reveal that the CNNs automatically learn patterns and representations to best discriminate tested samples into spelt or wheat. This is further investigated using an external validation set comprising artificially mixed spectra, samples for processed goods (spelt bread and flour), eleven untypical spelt, and six old wheat cultivars. These cultivars were not part of model building. We introduce a metric called the D score to quantitatively evaluate and compare the classification decisions. Our results demonstrate that NTMs based on NCV and CNNs trained using appropriately chosen spectral data can be reliable enough to be used on a wider range of cultivars and their mixes.
Affiliation(s)
- Kapil Nichani
  - QuoData GmbH, Prellerstr. 14, D-01309 Dresden, Germany
  - Institute of Nutritional Science, University of Potsdam, Arthur-Scheunert-Allee 114-116, D-14558 Nuthetal, Germany
- Steffen Uhlig
  - QuoData GmbH, Fabeckstr. 43, D-14195 Berlin, Germany
- Kirsten Simon
  - QuoData GmbH, Prellerstr. 14, D-01309 Dresden, Germany
- Josephine Bönick
  - Bundesinstitut für Risikobewertung, Max-Dohrn-Str. 8-10, D-10589 Berlin, Germany
- Carsten Uhlig
  - Akees GmbH, Ansbacher Str. 11, D-10787 Berlin, Germany
- Sabine Kemmlein
  - Bundesamt für Verbraucherschutz und Lebensmittelsicherheit, Diedersdorfer Weg. 1, D-12277 Berlin, Germany
- Manfred Stoyke
  - Bundesamt für Verbraucherschutz und Lebensmittelsicherheit, Diedersdorfer Weg. 1, D-12277 Berlin, Germany
- Petra Gowik
  - Bundesamt für Verbraucherschutz und Lebensmittelsicherheit, Diedersdorfer Weg. 1, D-12277 Berlin, Germany
- Gerd Huschek
  - IGV-Institut für Getreideverarbeitung GmbH, Arthur-Scheunert-Allee 40/41, D-14558 Nuthetal, Germany
- Harshadrai M. Rawel
  - Institute of Nutritional Science, University of Potsdam, Arthur-Scheunert-Allee 114-116, D-14558 Nuthetal, Germany
|
18
|
Galal A, Talal M, Moustafa A. Applications of machine learning in metabolomics: Disease modeling and classification. Front Genet 2022; 13:1017340. [PMID: 36506316 PMCID: PMC9730048 DOI: 10.3389/fgene.2022.1017340] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/11/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022] Open
Abstract
Metabolomics research has recently gained popularity because it enables the study of biological traits at the biochemical level and, as a result, can directly reveal what occurs in a cell or a tissue based on health or disease status, complementing other omics such as genomics and transcriptomics. Like other high-throughput biological experiments, metabolomics produces vast volumes of complex data. The application of machine learning (ML) to analyze data, recognize patterns, and build models is expanding across multiple fields. In the same way, ML methods are utilized for the classification, regression, or clustering of highly complex metabolomic data. This review discusses how disease modeling and diagnosis can be enhanced via deep and comprehensive metabolomic profiling using ML. We discuss the general layout of a metabolic workflow and the fundamental ML techniques used to analyze metabolomic data, including support vector machines (SVM), decision trees, random forests (RF), neural networks (NN), and deep learning (DL). Finally, we present the advantages and disadvantages of various ML methods and provide suggestions for different metabolic data analysis scenarios.
Affiliation(s)
- Aya Galal
  - Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt
  - Institute of Global Health and Human Ecology, American University in Cairo, New Cairo, Egypt
- Marwa Talal
  - Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt
  - Biotechnology Graduate Program, American University in Cairo, New Cairo, Egypt
- Ahmed Moustafa
  - Systems Genomics Laboratory, American University in Cairo, New Cairo, Egypt
  - Biotechnology Graduate Program, American University in Cairo, New Cairo, Egypt
  - Department of Biology, American University in Cairo, New Cairo, Egypt
  - Correspondence: Ahmed Moustafa
|
19
|
Kuhn S, Tumer E, Colreavy-Donnelly S, Moreira Borges R. A pilot study for fragment identification using 2D NMR and deep learning. Magn Reson Chem 2022; 60:1052-1060. [PMID: 34480494 DOI: 10.1002/mrc.5212] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Revised: 06/05/2021] [Accepted: 08/23/2021] [Indexed: 06/13/2023]
Abstract
This paper presents a proof of concept of a method to identify substructures in 2D NMR spectra of mixtures using a bespoke image-based convolutional neural network application. This is done using HSQC and HMBC spectra separately and in combination. The application can reliably detect substructures in pure compounds, using a simple network. Results indicate that it can work for mixtures when trained on pure compounds only. HMBC data and the combination of HMBC and HSQC show better results than HSQC alone in this pilot study.
Affiliation(s)
- Stefan Kuhn
  - School of Computer Science and Informatics, De Montfort University, Leicester, UK
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Ricardo Moreira Borges
  - Instituto de Pesquisas de Produtos Naturais Walter Mors, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
|
20
|
Panyard DJ, Yu B, Snyder MP. The metabolomics of human aging: Advances, challenges, and opportunities. Sci Adv 2022; 8:eadd6155. [PMID: 36260671 PMCID: PMC9581477 DOI: 10.1126/sciadv.add6155] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 06/22/2022] [Accepted: 09/01/2022] [Indexed: 05/02/2023]
Abstract
As the global population becomes older, understanding the impact of aging on health and disease becomes paramount. Recent advancements in multiomic technology have allowed for the high-throughput molecular characterization of aging at the population level. Metabolomics studies that analyze the small molecules in the body can provide biological information across a diversity of aging processes. Here, we review the growing body of population-scale metabolomics research on aging in humans, identifying the major trends in the field, implicated biological pathways, and how these pathways relate to health and aging. We conclude by assessing the main challenges in the research to date, opportunities for advancing the field, and the outlook for precision health applications.
Affiliation(s)
- Daniel J. Panyard
  - Department of Genetics, Stanford University School of Medicine, Stanford University, Stanford, CA 94305, USA
- Bing Yu
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Michael P. Snyder
  - Department of Genetics, Stanford University School of Medicine, Stanford University, Stanford, CA 94305, USA
|
21
|
Zhang X, Kupczyk E, Schmitt-Kopplin P, Mueller C. Current and future approaches for in vitro hit discovery in diabetes mellitus. Drug Discov Today 2022; 27:103331. [PMID: 35926826 DOI: 10.1016/j.drudis.2022.07.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Revised: 06/10/2022] [Accepted: 07/26/2022] [Indexed: 12/15/2022]
Abstract
Type 2 diabetes mellitus (T2DM) is a serious public health problem. In this review, we discuss current and promising future drugs, targets, in vitro assays and emerging omics technologies in T2DM. Importantly, we open the perspective to image-based high-content screening (HCS), with the focus of combining it with metabolomics or lipidomics. HCS has become a strong technology in phenotypic screens because it allows comprehensive screening for the cell-modulatory activity of small molecules. Metabolomics and lipidomics screen for perturbations at the molecular level. The combination of these data-intensive comprehensive technologies is enabled by the rapid development of artificial intelligence. It promises a deep cellular and molecular phenotyping directly linked to chemical information about the applied drug candidates or complex mixtures.
Affiliation(s)
- Xin Zhang
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
- Erwin Kupczyk
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
  - Comprehensive Foodomics Platform, Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 2, 85354 Freising, Germany
- Philippe Schmitt-Kopplin
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
  - Comprehensive Foodomics Platform, Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 2, 85354 Freising, Germany
- Constanze Mueller
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
|
22
|
Kupczyk E, Schorpp K, Hadian K, Lin S, Tziotis D, Schmitt-Kopplin P, Mueller C. Unleashing high content screening in hit detection - Benchmarking AI workflows including novelty detection. Comput Struct Biotechnol J 2022; 20:5453-5465. [PMID: 36212538 PMCID: PMC9530837 DOI: 10.1016/j.csbj.2022.09.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 09/16/2022] [Accepted: 09/16/2022] [Indexed: 11/22/2022] Open
Abstract
Complex mixtures containing natural products are still an interesting source of novel drug candidates. High content screening (HCS) is a popular tool to screen for such candidates. In particular, multiplexed HCS assays promise comprehensive bioactivity profiles, but also generate large amounts of data. Yet, only some machine learning (ML) applications for data analysis are available, and these usually require a profound knowledge of the underlying cell biology. Unfortunately, there are no applications that simply predict if samples are biologically active or not (any kind of bioactivity). Within this work, we benchmark ML algorithms for binary classification, starting with classical ML models, which are the standard classifiers of the scikit-learn library or ensemble models of these classifiers (a total of 92 models tested), followed by a partial least squares regression (PLSR)-based classification (44 tested models in total) and simple artificial neural networks (ANNs) with dense layers (72 tested models in total). In addition, a novelty detection (ND) was examined, which is supposed to handle unknown patterns. For the final analysis, the models, with and without upstream ND, were tested with two independent data sets. In our analysis, a stacking model, an ensemble model of classical ML algorithms, performed best in predicting new and unknown data. ND improved the predictions of the models and was useful for handling unknown patterns. Importantly, the classifier presented here can be easily rebuilt and adapted to the data and demands of other groups. The hit detector (ND + stacking model) is universal and suitable for a broader application to support the search for new drug candidates.
Affiliation(s)
- Erwin Kupczyk
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
  - Comprehensive Foodomics Platform, Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 2, 85354 Freising, Germany
- Kenji Schorpp
  - Institute for Molecular Toxicology and Pharmacology, Cell Signaling and Chemical Biology, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
- Kamyar Hadian
  - Institute for Molecular Toxicology and Pharmacology, Cell Signaling and Chemical Biology, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
- Sean Lin
  - Institute for Molecular Toxicology and Pharmacology, Cell Signaling and Chemical Biology, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
- Dimitrios Tziotis
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
- Philippe Schmitt-Kopplin
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
  - Comprehensive Foodomics Platform, Chair of Analytical Food Chemistry, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 2, 85354 Freising, Germany
- Constanze Mueller
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Ingolstaedter Landstr. 1, 85764 Neuherberg, Germany
|
23
|
Chardin D, Gille C, Pourcher T, Humbert O, Barlaud M. Learning a confidence score and the latent space of a new supervised autoencoder for diagnosis and prognosis in clinical metabolomic studies. BMC Bioinformatics 2022; 23:361. [PMID: 36050631 PMCID: PMC9434875 DOI: 10.1186/s12859-022-04900-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Accepted: 07/27/2022] [Indexed: 11/15/2022] Open
Abstract
Background Presently, there is a wide variety of classification methods and deep neural network approaches in bioinformatics. Deep neural networks have proven their effectiveness for classification tasks, and have outperformed classical methods, but they suffer from a lack of interpretability. Therefore, these innovative methods are not appropriate for decision support systems in healthcare. Indeed, to allow clinicians to make informed and well thought out decisions, the algorithm should provide the main pieces of information used to compute the predicted diagnosis and/or prognosis, as well as a confidence score for this prediction. Methods Herein, we used a new supervised autoencoder (SAE) approach for classification of clinical metabolomic data. This new method has the advantage of providing a confidence score for each prediction thanks to a softmax classifier and a meaningful latent space visualization and to include a new efficient feature selection method, with a structured constraint, which allows for biologically interpretable results. Results Experimental results on three metabolomics datasets of clinical samples illustrate the effectiveness of our SAE and its confidence score. The supervised autoencoder provides an accurate localization of the patients in the latent space, and an efficient confidence score. Experiments show that the SAE outperforms classical methods (PLS-DA, Random Forests, SVM, and neural networks (NN)). Furthermore, the metabolites selected by the SAE were found to be biologically relevant. Conclusion In this paper, we describe a new efficient SAE method to support diagnostic or prognostic evaluation based on metabolomics analyses.
Affiliation(s)
- David Chardin
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Frédéric Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France; Centre Antoine Lacassagne, Université Côte d'Azur (UCA), Nice, France
- Cyprien Gille
- Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis (I3S), Centre de Recherche Scientifique (CNRS), Université Côte d'Azur (UCA), Sophia Antipolis, France
- Thierry Pourcher
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Frédéric Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France
- Olivier Humbert
- Transporters in Imaging and Radiotherapy in Oncology (TIRO), Direction de la Recherche Fondamentale (DRF), Institut des sciences du vivant Frédéric Joliot, Commissariat à l'Energie Atomique et aux énergies alternatives (CEA), Université Côte d'Azur (UCA), Nice, France; Centre Antoine Lacassagne, Université Côte d'Azur (UCA), Nice, France
- Michel Barlaud
- Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis (I3S), Centre de Recherche Scientifique (CNRS), Université Côte d'Azur (UCA), Sophia Antipolis, France.
|
24
|
Diagnosis and prognosis of COVID-19 employing analysis of patients' plasma and serum via LC-MS and machine learning. Comput Biol Med 2022; 146:105659. [PMID: 35751188 PMCID: PMC9123826 DOI: 10.1016/j.compbiomed.2022.105659] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/20/2022] [Revised: 05/11/2022] [Accepted: 05/18/2022] [Indexed: 01/11/2023]
Abstract
OBJECTIVE: To implement and evaluate machine learning (ML) algorithms for the prediction of COVID-19 diagnosis, severity, and fatality and to assess biomarkers potentially associated with these outcomes. MATERIAL AND METHODS: Serum (n = 96) and plasma (n = 96) samples from patients with COVID-19 (acute, severe and fatal illness) from two independent hospitals in China were analyzed by LC-MS. Samples from healthy volunteers and from patients with pneumonia caused by other viruses (i.e. negative RT-PCR for COVID-19) were used as controls. Seven different ML-based models were built: PLS-DA, ANNDA, XGBoostDA, SIMCA, SVM, LREG and KNN. RESULTS: The PLS-DA model presented the best performance for both datasets, with accuracy rates to predict the diagnosis, severity and fatality of COVID-19 of 93%, 94% and 97%, respectively. Low levels of the metabolites ribothymidine, 4-hydroxyphenylacetoylcarnitine and uridine were associated with COVID-19 positivity, whereas high levels of N-acetyl-glucosamine-1-phosphate, cysteinylglycine, methyl isobutyrate, l-ornithine and 5,6-dihydro-5-methyluracil were significantly related to greater severity and fatality from COVID-19. CONCLUSION: The PLS-DA model can help to predict SARS-CoV-2 diagnosis, severity and fatality in daily practice. Some biomarkers typically increased in COVID-19 patients' serum or plasma (i.e. ribothymidine, N-acetyl-glucosamine-1-phosphate, l-ornithine, 5,6-dihydro-5-methyluracil) should be further evaluated as prognostic indicators of the disease.
|
25
|
Wang L, Hu Q, Wang L, Shi H, Lai C, Zhang S. Predicting the growth performance of growing-finishing pigs based on net energy and digestible lysine intake using multiple regression and artificial neural networks models. J Anim Sci Biotechnol 2022; 13:57. [PMID: 35550214 PMCID: PMC9102637 DOI: 10.1186/s40104-022-00707-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/16/2021] [Accepted: 03/13/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Evaluating the growth performance of pigs in real time is laborious and expensive, so mathematical models based on easily accessible variables have been developed. Multiple regression (MR) is the most widely used tool to build prediction models in swine nutrition, while artificial neural network (ANN) models are reported to be more accurate than MR models in prediction performance. Therefore, the potential of ANN models in predicting the growth performance of pigs was evaluated and compared with MR models in this study. RESULTS: Body weight (BW), net energy (NE) intake, standardized ileal digestible lysine (SID Lys) intake, and their quadratic terms were selected from 10 candidate variables as input variables to predict average daily gain (ADG) and feed-to-gain ratio (F/G). In the training phase, MR models showed high accuracy in both ADG and F/G prediction (R² = 0.929 for ADG, R² = 0.886 for F/G), while ANN models with 4 and 6 neurons and a radial basis activation function yielded the best performance in ADG and F/G prediction (R² = 0.964 for ADG, R² = 0.932 for F/G). In the testing phase, these ANN models showed better accuracy than the MR models in ADG prediction (CCC: 0.976 vs. 0.861, R²: 0.951 vs. 0.584) and F/G prediction (CCC: 0.952 vs. 0.900, R²: 0.905 vs. 0.821). Overfitting occurred in MR models but not in ANN models. On validation data from the animal trial, ANN models exhibited superiority over MR models in both ADG and F/G prediction (P < 0.01). Moreover, the growth stages have a significant effect on the prediction accuracy of the models. CONCLUSION: Body weight, NE intake and SID Lys intake can be used as input variables to predict the growth performance of growing-finishing pigs, and trained ANN models are more flexible and accurate than MR models. Therefore, it is promising to use ANN models in related swine nutrition studies in the future.
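The abstract above compares models using the concordance correlation coefficient (CCC) and the coefficient of determination. The sketch below is illustrative only: it assumes Lin's definition of the CCC with population (1/n) moments and the 1 - SS_res/SS_tot form of the coefficient of determination, neither of which the abstract spells out. The sample values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(observed)
    mean_o, mean_p = fmean(observed), fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # The squared mean difference penalises systematic bias,
    # which a plain Pearson correlation would ignore.
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_o = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical daily-gain values (kg/day) and predictions shifted by +0.10.
observed = [0.80, 0.85, 0.90, 0.95, 1.00]
biased = [o + 0.10 for o in observed]

print(concordance_correlation(observed, biased))
print(r_squared(observed, biased))
```

In this toy case the shifted predictions are perfectly correlated with the observations, yet the constant offset pulls the CCC down to about 0.5 and the coefficient of determination to about -1. Both metrics reward agreement with the observed values rather than correlation alone.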
Affiliation(s)
- Li Wang
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China
- Qile Hu
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China
- Lu Wang
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China
- Huangwei Shi
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China
- Changhua Lai
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China.
- Shuai Zhang
- State Key Laboratory of Animal Nutrition, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, P. R. China.
|
26
|
Debik J, Sangermani M, Wang F, Madssen TS, Giskeødegård GF. Multivariate analysis of NMR-based metabolomic data. NMR in Biomedicine 2022; 35:e4638. [PMID: 34738674 DOI: 10.1002/nbm.4638] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/03/2021] [Revised: 09/08/2021] [Accepted: 09/29/2021] [Indexed: 06/13/2023]
Abstract
Nuclear magnetic resonance (NMR) spectroscopy allows for simultaneous detection of a wide range of metabolites and lipids. As metabolites act together in complex metabolic networks, they are often highly correlated, and optimal biological insight is achieved when using methods that take the correlation into account. For this reason, latent-variable-based methods, such as principal component analysis and partial least-squares discriminant analysis, are widely used in metabolomic studies. However, with increasing availability of larger population cohorts, and a shift from analysis of spectral data to using quantified metabolite levels, both more traditional statistical approaches and alternative machine learning methods have become more widely used. This review aims at providing an overview of the current state-of-the-art multivariate methods for the analysis of NMR-based metabolomic data as well as alternative methods, highlighting their strengths and limitations.
Affiliation(s)
- Julia Debik
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Matteo Sangermani
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Feng Wang
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Clinic of Surgery, St. Olavs Hospital HF, Trondheim, Norway
- Torfinn S Madssen
- Department of Circulation and Medical Imaging, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
- Guro F Giskeødegård
- Clinic of Surgery, St. Olavs Hospital HF, Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, Norwegian University of Science and Technology-NTNU, Trondheim, Norway
|
27
|
Defining Blood Plasma and Serum Metabolome by GC-MS. Metabolites 2021; 12:metabo12010015. [PMID: 35050137 PMCID: PMC8779220 DOI: 10.3390/metabo12010015] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/24/2021] [Revised: 12/18/2021] [Accepted: 12/21/2021] [Indexed: 01/04/2023] Open
Abstract
Metabolomics uses advanced analytical chemistry methods to analyze metabolites in biological samples. The most intensively studied samples are blood and its liquid components: plasma and serum. Armed with advanced equipment and progressive software solutions, the scientific community has shown that small molecules’ roles in living systems are not limited to traditional “building blocks” or “just fuel” for cellular energy. As a result, the conclusions based on studying the metabolome are finding practical reflection in molecular medicine and a better understanding of fundamental biochemical processes in living systems. This review is not a detailed protocol of metabolomic analysis. However, it should support the reader with information about the achievements in the whole process of metabolic exploration of human plasma and serum using mass spectrometry combined with gas chromatography.
|
28
|
Abraham EJ, Kellogg JJ. Chemometric-Guided Approaches for Profiling and Authenticating Botanical Materials. Front Nutr 2021; 8:780228. [PMID: 34901127 PMCID: PMC8663772 DOI: 10.3389/fnut.2021.780228] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/20/2021] [Accepted: 10/31/2021] [Indexed: 01/08/2023] Open
Abstract
Botanical supplements with broad traditional and medicinal uses represent an area of growing importance for American health management; 25% of U.S. adults use dietary supplements daily and collectively spent over $9.5 billion in 2019 on herbal and botanical supplements alone. To understand how natural products benefit human health and determine potential safety concerns, careful in vitro, in vivo, and clinical studies are required. However, botanicals are innately complex systems, with complicated compositions that defy many standard analytical approaches and fluctuate based upon a plethora of factors, including genetics, growth conditions, and harvesting/processing procedures. Robust studies rely upon accurate identification of the plant material, and botanicals' increasing economic and health importance demand reproducible sourcing, as well as assessment of contamination or adulteration. These quality control needs for botanical products remain a significant problem plaguing researchers in academia as well as the supplement industry, thus posing a risk to consumers and possibly rendering clinical data irreproducible and/or irrelevant. Chemometric approaches that analyze the small molecule composition of materials provide a reliable and high-throughput avenue for botanical authentication. This review emphasizes the need for consistent material and provides insight into the roles of various modern chemometric analyses in evaluating and authenticating botanicals, focusing on advanced methodologies, including targeted and untargeted metabolite analysis, as well as the role of multivariate statistical modeling and machine learning in phytochemical characterization. Furthermore, we will discuss how chemometric approaches can be integrated with orthogonal techniques to provide a more robust approach to authentication, and provide directions for future research.
Affiliation(s)
- Evelyn J Abraham
- Intercollege Graduate Degree Program in Plant Biology, The Pennsylvania State University (PSU), University Park, PA, United States
- Joshua J Kellogg
- Intercollege Graduate Degree Program in Plant Biology, The Pennsylvania State University (PSU), University Park, PA, United States; Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA, United States
|
29
|
Shrivastava AD, Swainston N, Samanta S, Roberts I, Wright Muelas M, Kell DB. MassGenie: A Transformer-Based Deep Learning Method for Identifying Small Molecules from Their Mass Spectra. Biomolecules 2021; 11:1793. [PMID: 34944436 PMCID: PMC8699281 DOI: 10.3390/biom11121793] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/24/2021] [Revised: 11/14/2021] [Accepted: 11/27/2021] [Indexed: 12/15/2022] Open
Abstract
The 'inverse problem' of mass spectrometric molecular identification ('given a mass spectrum, calculate/predict the 2D structure of the molecule whence it came') is largely unsolved, and is especially acute in metabolomics where many small molecules remain unidentified. This is largely because the number of experimentally available electrospray mass spectra of small molecules is quite limited. However, the forward problem ('calculate a small molecule's likely fragmentation and hence at least some of its mass spectrum from its structure alone') is much more tractable, because the strengths of different chemical bonds are roughly known. This kind of molecular identification problem may be cast as a language translation problem in which the source language is a list of high-resolution mass spectral peaks and the 'translation' a representation (for instance in SMILES) of the molecule. It is thus suitable for attack using the deep neural networks known as transformers. We here present MassGenie, a method that uses a transformer-based deep neural network, trained on ~6 million chemical structures with augmented SMILES encoding and their paired molecular fragments as generated in silico, explicitly including the protonated molecular ion. This architecture (containing some 400 million elements) is used to predict the structure of a molecule from the various fragments that may be expected to be observed when some of its bonds are broken. Despite being given essentially no detailed nor explicit rules about molecular fragmentation methods, isotope patterns, rearrangements, neutral losses, and the like, MassGenie learns the effective properties of the mass spectral fragment and valency space, and can generate candidate molecular structures that are very close or identical to those of the 'true' molecules. We also use VAE-Sim, a previously published variational autoencoder, to generate candidate molecules that are 'similar' to the top hit. 
In addition to using the 'top hits' directly, we can produce a rank order of these by 'round-tripping' candidate molecules and comparing them with the true molecules, where known. As a proof of principle, we confine ourselves to positive electrospray mass spectra from molecules with a molecular mass of 500 Da or lower, including those in the last CASMI challenge (for which the results are known), getting 49/93 (53%) precisely correct. The transformer method, applied here for the first time to mass spectral interpretation, works extremely effectively both for mass spectra generated in silico and on experimentally obtained mass spectra from pure compounds. It seems to act as a Las Vegas algorithm, in that it either gives the correct answer or simply states that it cannot find one. The ability to create and to 'learn' millions of fragmentation patterns in silico, and therefrom generate candidate structures (that do not have to be in existing libraries) directly, thus opens up entirely the field of de novo small molecule structure prediction from experimental mass spectra.
Affiliation(s)
- Aditya Divyakant Shrivastava
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Department of Computer Science and Engineering, Nirma University, Ahmedabad 382481, India
- Neil Swainston
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Mellizyme Biotechnology Ltd., Liverpool Science Park IC1, 131 Mount Pleasant, Liverpool L3 5TF, UK
- Soumitra Samanta
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Ivayla Roberts
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Marina Wright Muelas
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Douglas B. Kell
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, UK
- Mellizyme Biotechnology Ltd., Liverpool Science Park IC1, 131 Mount Pleasant, Liverpool L3 5TF, UK
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kongens Lyngby, Denmark
|
30
|
Côté M, Lamarche B. Artificial intelligence in nutrition research: perspectives on current and future applications. Appl Physiol Nutr Metab 2021; 47:1-8. [PMID: 34525321 DOI: 10.1139/apnm-2021-0448] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/22/2022]
Abstract
Artificial intelligence (AI) is a rapidly evolving area that offers unparalleled opportunities of progress and applications in many healthcare fields. In this review, we provide an overview of the main and latest applications of AI in nutrition research and identify gaps to address to potentialize this emerging field. AI algorithms may help better understand and predict the complex and non-linear interactions between nutrition-related data and health outcomes, particularly when large amounts of data need to be structured and integrated, such as in metabolomics. AI-based approaches, including image recognition, may also improve dietary assessment by maximizing efficiency and addressing systematic and random errors associated with self-reported measurements of dietary intakes. Finally, AI applications can extract, structure and analyze large amounts of data from social media platforms to better understand dietary behaviours and perceptions among the population. In summary, AI-based approaches will likely improve and advance nutrition research as well as help explore new applications. However, further research is needed to identify areas where AI does deliver added value compared with traditional approaches, and other areas where AI is simply not likely to advance the field. Novelty: Artificial intelligence offers unparalleled opportunities of progress and applications in nutrition. There remain gaps to address to potentialize this emerging field.
Affiliation(s)
- Mélina Côté
- Centre de recherche Nutrition, santé et société (NUTRISS), INAF, Université Laval, Québec, QC, Canada
- School of Nutrition, Université Laval, Québec, QC, Canada
- Benoît Lamarche
- Centre de recherche Nutrition, santé et société (NUTRISS), INAF, Université Laval, Québec, QC, Canada
- School of Nutrition, Université Laval, Québec, QC, Canada
|
31
|
Kikuchi J, Yamada S. The exposome paradigm to predict environmental health in terms of systemic homeostasis and resource balance based on NMR data science. RSC Adv 2021; 11:30426-30447. [PMID: 35480260 PMCID: PMC9041152 DOI: 10.1039/d1ra03008f] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/18/2021] [Accepted: 08/31/2021] [Indexed: 12/22/2022] Open
Abstract
The environment, from microbial ecosystems to recycled resources, fluctuates dynamically due to many physical, chemical and biological factors, the profile of which reflects changes in overall state, such as environmental illness caused by a collapse of homeostasis. To evaluate and predict environmental health in terms of systemic homeostasis and resource balance, a comprehensive understanding of these factors requires an approach based on the "exposome paradigm", namely the totality of exposure to all substances. Furthermore, in considering sustainable development to meet global population growth, it is important to gain an understanding of both the circulation of biological resources and waste recycling in human society. From this perspective, natural environment, agriculture, aquaculture, wastewater treatment in industry, biomass degradation and biodegradable materials design are at the forefront of current research. In this respect, nuclear magnetic resonance (NMR) offers tremendous advantages in the analysis of samples of molecular complexity, such as crude bio-extracts, intact cells and tissues, fibres, foods, feeds, fertilizers and environmental samples. Here we outline examples to promote an understanding of recent applications of solution-state, solid-state, time-domain NMR and magnetic resonance imaging (MRI) to the complex evaluation of organisms, materials and the environment. We also describe useful databases and informatics tools, as well as machine learning techniques for NMR analysis, demonstrating that NMR data science can be used to evaluate the exposome in both the natural environment and human society towards a sustainable future.
Affiliation(s)
- Jun Kikuchi
- Environmental Metabolic Analysis Research Team, RIKEN Center for Sustainable Resource Science 1-7-22 Suehiro-cho, Tsurumi-ku Yokohama 230-0045 Japan
- Graduate School of Bioagricultural Sciences, Nagoya University Furo-cho, Chikusa-ku Nagoya 464-8601 Japan
- Graduate School of Medical Life Science, Yokohama City University 1-7-29 Suehiro-cho, Tsurumi-ku Yokohama 230-0045 Japan
- Shunji Yamada
- Environmental Metabolic Analysis Research Team, RIKEN Center for Sustainable Resource Science 1-7-22 Suehiro-cho, Tsurumi-ku Yokohama 230-0045 Japan
- Prediction Science Laboratory, RIKEN Cluster for Pioneering Research 7-1-26 Minatojima-minami-machi, Chuo-ku Kobe 650-0047 Japan
- Data Assimilation Research Team, RIKEN Center for Computational Science 7-1-26 Minatojima-minami-machi, Chuo-ku Kobe 650-0047 Japan
|
32
|
Why Has Metabolomics So Far Not Managed to Efficiently Contribute to the Improvement of Assisted Reproduction Outcomes? The Answer through a Review of the Best Available Current Evidence. Diagnostics (Basel) 2021; 11:diagnostics11091602. [PMID: 34573944 PMCID: PMC8469471 DOI: 10.3390/diagnostics11091602] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/18/2021] [Revised: 08/24/2021] [Accepted: 08/31/2021] [Indexed: 12/31/2022] Open
Abstract
Metabolomics emerged to give clinicians the necessary information on the competence, in terms of physiology and function, of gametes, embryos, and the endometrium towards a targeted infertility treatment, namely, assisted reproduction techniques (ART). Our minireview aims to investigate the current status of the use of metabolomics in assisted reproduction, the potential flaws in its use, and to propose specific solutions towards the improvement of ART outcomes through the use of the intervention. We used published reports assessing the role of metabolomic investigation of the endometrium, oocytes, and embryos in improving clinical outcomes in women undergoing ART. We initially found that there is no evidence to support that fertility outcomes can be improved through metabolomics profiling. In contrast, it may be helpful for understanding and appraising the nutritional environment of oocytes and embryos. The causes include the different infertility populations, the difference between animals and humans, technical limitations, and the great heterogeneity in the variables employed. Suggested steps include the standardization of variables of the method itself, the universal creation of a panel where all biomarkers are stored concerning specific infertile populations with different phenotypes or etiologies, specific bioinformatics contribution, significant computing power for data processing, and importantly, properly conducted trials.
|
33
|
An Improved Stacked Autoencoder for Metabolomic Data Classification. Computational Intelligence and Neuroscience 2021; 2021:1051172. [PMID: 34434226 PMCID: PMC8382558 DOI: 10.1155/2021/1051172] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Revised: 06/28/2021] [Accepted: 07/28/2021] [Indexed: 12/24/2022]
Abstract
Naru3 (NR) is a traditional Mongolian medicine with high clinical efficacy and low incidence of side effects. Metabolomics is an approach that can facilitate the development of traditional drugs. However, metabolomic data have a high throughput, sparse, high-dimensional, and small sample nature, and their classification is challenging. Although deep learning methods have a wide range of applications, deep learning-based metabolomic studies have not been widely performed. We aimed to develop an improved stacked autoencoder (SAE) for metabolomic data classification. We established an NR-treated rheumatoid arthritis (RA) mouse model and classified the obtained metabolomic data using the Hessian-free SAE (HF-SAE) algorithm. During training, the unlabeled data were used for pretraining, and the labeled data were used for fine-tuning based on the HF algorithm for gradient descent optimization. The hybrid algorithm successfully classified the data. The results were compared with those of the support vector machine (SVM), k-nearest neighbor (KNN), and gradient descent SAE (GD-SAE) algorithms. A five-fold cross-validation was used to complete the classification experiment. In each fine-tuning process, the mean square error (MSE) and misclassification rates of the training and test data were recorded. We successfully established an NR animal model and an improved SAE for metabolomic data classification.
|
34
|
Wang J, Abdella Kemal M. Comparison of the Metabolites of Water Polo Players before and after Competition by the Metabolomic Approach. Journal of Healthcare Engineering 2021; 2021:7600835. [PMID: 34336166 PMCID: PMC8318763 DOI: 10.1155/2021/7600835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2021] [Revised: 07/02/2021] [Accepted: 07/11/2021] [Indexed: 11/25/2022]
Abstract
Background: The metabolic characteristics of body fluids of excellent water polo players before and after competition have not been reported. The purpose of the study was to compare the metabolites in the urine of water polo players before and after competition by a 1H-NMR-based metabolomic approach. Methods: Twenty-six young water polo players participated in the study voluntarily. The urine and blood samples of players were collected one week before competition (A), immediately after competition (B), and one week after competition (C). Metabolomic analysis was conducted on the urine samples. Urine routine items and biochemical indicators in blood samples were detected. Results: Metabolomic results showed that the contents of eleven metabolites including lactic acid, acetoacetate, and succinic acid in the urine of the subjects increased and four metabolites such as dimethylamine, choline, and glucose decreased at stage B. Most metabolites at stage C had basically returned to the levels at stage A. Pyruvate metabolism, pantothenate and CoA biosynthesis, and synthesis and degradation of ketone bodies were mainly involved in the above process. Urine conventional analysis results showed that the urine pH decreased dramatically and the levels of PRO and URO significantly increased at stage B, and the three indicators had similar values between stages A and C. The other indicators did not differ obviously among the three stages. Analysis of blood biochemical indicators showed that the levels of LDH, BUN, CK, and AST significantly increased at stage B and did not show an obvious difference between stages A and C. The results are helpful for coaches to arrange the athletes' diet reasonably and to conduct scientific training for athletes.
Affiliation(s)
- Jingjing Wang
- School of Physical Education, Shanxi University, Taiyuan 030006, China
|
35
|
Tinte MM, Chele KH, van der Hooft JJJ, Tugizimana F. Metabolomics-Guided Elucidation of Plant Abiotic Stress Responses in the 4IR Era: An Overview. Metabolites 2021; 11:445. [PMID: 34357339 PMCID: PMC8305945 DOI: 10.3390/metabo11070445] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/25/2021] [Revised: 06/30/2021] [Accepted: 07/03/2021] [Indexed: 12/27/2022] Open
Abstract
Plants are constantly challenged by changing environmental conditions that include abiotic stresses. These are limiting their development and productivity and are subsequently threatening our food security, especially when considering the pressure of the increasing global population. Thus, there is an urgent need for the next generation of crops with high productivity and resilience to climate change. The dawn of a new era characterized by the emergence of fourth industrial revolution (4IR) technologies has redefined the ideological boundaries of research and applications in plant sciences. Recent technological advances and machine learning (ML)-based computational tools and omics data analysis approaches are allowing scientists to derive comprehensive metabolic descriptions and models for the target plant species under specific conditions. Such accurate metabolic descriptions are imperatively essential for devising a roadmap for the next generation of crops that are resilient to environmental deterioration. By synthesizing the recent literature and collating data on metabolomics studies on plant responses to abiotic stresses, in the context of the 4IR era, we point out the opportunities and challenges offered by omics science, analytical intelligence, computational tools and big data analytics. Specifically, we highlight technological advancements in (plant) metabolomics workflows and the use of machine learning and computational tools to decipher the dynamics in the chemical space that define plant responses to abiotic stress conditions.
Affiliation(s)
- Morena M. Tinte
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa; (M.M.T.); (K.H.C.)
- Kekeletso H. Chele
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa; (M.M.T.); (K.H.C.)
- Fidele Tugizimana
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa; (M.M.T.); (K.H.C.)
  - International Research and Development Division, Omnia Group, Ltd., Johannesburg 2021, South Africa
|
36
|
Pérez-Jiménez M, Sherman E, Pozo-Bayón MA, Pinu FR. Application of untargeted volatile profiling and data driven approaches in wine flavoromics research. Food Res Int 2021; 145:110392. [PMID: 34112395 DOI: 10.1016/j.foodres.2021.110392] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/04/2021] [Revised: 03/31/2021] [Accepted: 05/04/2021] [Indexed: 11/28/2022]
Abstract
Traditional flavor chemistry research usually makes use of targeted approaches by focusing on the detection and quantification of key flavor active metabolites that are present in food and beverages. In the last decade, flavoromics has emerged as an alternative to targeted methods where non-targeted and data driven approaches have been used to determine as many metabolites as possible with the aim to establish relationships among the chemical composition of foods and their sensory properties. Flavoromics has been successfully applied in wine research to gain more insights into the impact of a wide range of flavor active metabolites on wine quality. In this review, we aim to provide an overview of the applications of flavoromics approaches in wine research based on existing literature mainly by focusing on untargeted volatile profiling of wines and how this can be used as a powerful tool to generate novel insights. We highlight the fact that untargeted volatile profiling used in flavoromics approaches ultimately can assist the wine industry to produce different wine styles and to market existing wines appropriately based on consumer preference. In addition to summarizing the main steps involved in untargeted volatile profiling, we also provide an outlook about future perspectives and challenges of wine flavoromics research.
Affiliation(s)
- Maria Pérez-Jiménez
  - Institute of Food Science Research (CIAL), CSIC-UAM, C/Nicolás Cabrera, 28049 Madrid, Spain
- Emma Sherman
  - The New Zealand Institute for Plant and Food Research Limited, Private Bag 92169, Auckland 1142, New Zealand
- M A Pozo-Bayón
  - Institute of Food Science Research (CIAL), CSIC-UAM, C/Nicolás Cabrera, 28049 Madrid, Spain
- Farhana R Pinu
  - The New Zealand Institute for Plant and Food Research Limited, Private Bag 92169, Auckland 1142, New Zealand
|
37
|
Feizi N, Hashemi-Nasab FS, Golpelichi F, Saburouh N, Parastar H. Recent trends in application of chemometric methods for GC-MS and GC×GC-MS-based metabolomic studies. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116239] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 02/08/2023]
|
38
|
Chemometric applications in metabolomic studies using chromatography-mass spectrometry. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2020.116165] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/14/2022]
|
39
|
Jimenez-Carvelo AM, Cuadros-Rodríguez L. Data mining/machine learning methods in foodomics. Curr Opin Food Sci 2021. [DOI: 10.1016/j.cofs.2020.09.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/21/2022]
|
40
|
Pomyen Y, Wanichthanarak K, Poungsombat P, Fahrmann J, Grapov D, Khoomrung S. Deep metabolome: Applications of deep learning in metabolomics. Comput Struct Biotechnol J 2020; 18:2818-2825. [PMID: 33133423 PMCID: PMC7575644 DOI: 10.1016/j.csbj.2020.09.033] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 07/24/2020] [Revised: 09/21/2020] [Accepted: 09/21/2020] [Indexed: 01/11/2023]
Abstract
In the past few years, deep learning has been successfully applied to various omics data. However, applications of deep learning in metabolomics are still relatively few compared with other omics. Currently, data pre-processing using convolutional neural network architecture appears to benefit the most from deep learning. Compound/structure identification and quantification using artificial neural network/deep learning performed relatively better than traditional machine learning techniques, whereas only marginally better results are observed in biological interpretations. Before deep learning can be effectively applied to metabolomics, several challenges should be addressed, including metabolome-specific deep learning architectures, dimensionality problems, and model evaluation regimes.
Key Words
- AI, Artificial Intelligence
- ANN, Artificial Neural Network
- AUC, Area Under the receiver-operating characteristic Curve
- Artificial neural network
- CCS value, Collision Cross Section value
- CFM-EI, Competitive Fragmentation Modeling-Electron Ionization
- CNN, Convolutional Neural Network
- DL, Deep Learning
- DNN, Deep Neural Network
- Deep learning
- ECFP, Extended Circular Fingerprint
- ER, Estrogen Receptor
- FID, Free Induction Decay
- FP score, Fingerprint correlation score
- FTIR, Fourier Transform Infrared
- GC–MS, Gas Chromatography-Mass Spectrometry
- HDLSS data, High Dimensional Low Sample Size data
- IST, Iterative Soft Thresholding
- LC-MS, Liquid Chromatography-Mass Spectrometry
- LSTM, Long Short-Term Memory
- ML, Machine Learning
- MLP, Multi-layered Perceptron
- MS, Mass Spectrometry
- Mass spectrometry
- Metabolomics
- NEIMS, Neural Electron-Ionization Mass Spectrometry
- NMR
- NMR, Nuclear Magnetic Resonance
- NUS, Non-Uniformly Sampling
- PARAFAC2, Parallel Factor Analysis 2
- RF, Random Forest
- RNN, Recurrent Neural Network
- ReLU, Rectified Linear Unit
- SMARTS, SMILES arbitrary target specification
- SMILE, Sparse Multidimensional Iterative Lineshape-enhanced
- SMILES, Simplified Molecular-Input Line-Entry System
- SRA, Sequence Read Archive
- VAE, Variational Autoencoder
- istHMS, Implementation of IST at Harvard Medical School
- m/z, mass/charge ratio
Affiliation(s)
- Yotsawat Pomyen
  - Translational Research Unit, Chulabhorn Research Institute, Bangkok, Thailand
- Kwanjeera Wanichthanarak
  - Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Patcha Poungsombat
  - Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Center for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
- Johannes Fahrmann
  - Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030, USA
- Dmitry Grapov
  - CDS- Creative Data Solutions LLC, https://creative-data.solutions, USA
- Sakda Khoomrung
  - Metabolomics and Systems Biology, Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Center for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
|
41
|
Sen P, Lamichhane S, Mathema VB, McGlinchey A, Dickens AM, Khoomrung S, Orešič M. Deep learning meets metabolomics: a methodological perspective. Brief Bioinform 2020; 22:1531-1542. [PMID: 32940335 DOI: 10.1093/bib/bbaa204] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 05/04/2020] [Revised: 08/08/2020] [Accepted: 08/10/2020] [Indexed: 12/15/2022]
Abstract
Deep learning (DL), an emerging area of investigation in the fields of machine learning and artificial intelligence, has markedly advanced over the past years. DL techniques are being applied to assist medical professionals and researchers in improving clinical diagnosis, disease prediction and drug discovery. It is expected that DL will help to provide actionable knowledge from a variety of 'big data', including metabolomics data. In this review, we discuss the applicability of DL to metabolomics, while presenting and discussing several examples from recent research. We emphasize the use of DL in tackling bottlenecks in metabolomics data acquisition, processing, metabolite identification, as well as in metabolic phenotyping and biomarker discovery. Finally, we discuss how DL is used in genome-scale metabolic modelling and in interpretation of metabolomics data. The DL-based approaches discussed here may assist computational biologists with the integration, prediction and drawing of statistical inference about biological outcomes, based on metabolomics data.
Affiliation(s)
- Partho Sen
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Santosh Lamichhane
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Vivek B Mathema
  - Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Aidan McGlinchey
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Alex M Dickens
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Sakda Khoomrung
  - Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Center for Innovation in Chemistry (PERCH), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
- Matej Orešič
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
  - School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
|
42
|
Roscini L, Conti A, Casagrande Pierantoni D, Robert V, Corte L, Cardinali G. Do Metabolomics and Taxonomic Barcode Markers Tell the Same Story about the Evolution of Saccharomyces sensu stricto Complex in Fermentative Environments? Microorganisms 2020; 8:1242. [PMID: 32824262 PMCID: PMC7463906 DOI: 10.3390/microorganisms8081242] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/18/2020] [Revised: 08/08/2020] [Accepted: 08/13/2020] [Indexed: 01/07/2023]
Abstract
Yeast taxonomy was introduced based on the idea that physiological properties would help discriminate species, thus assuming a strong link between physiology and taxonomy. However, the instability of physiological characteristics within species configured them as not ideal markers for species delimitation, shading the importance of physiology and paving the way to the DNA-based taxonomy. The hypothesis of reconnecting taxonomy with specific traits from phylogenies has been successfully explored for Bacteria and Archaea, suggesting that a similar route can be traveled for yeasts. In this framework, thirteen single copy loci were used to investigate the predictability of complex Fourier Transform InfraRed spectroscopy (FTIR) and High-performance Liquid Chromatography–Mass Spectrometry (LC-MS) profiles of the four historical species of the Saccharomyces sensu stricto group, both on resting cells and under short-term ethanol stress. Our data show a significant connection between the taxonomy and physiology of these strains. Eight markers out of the thirteen tested displayed high correlation values with LC-MS profiles of cells in resting condition, confirming the low efficacy of FTIR in the identification of strains of closely related species. Conversely, most genetic markers displayed increasing trends of correlation with FTIR profiles as the ethanol concentration increased, according to their role in the cellular response to different types of stress.
Affiliation(s)
- Luca Roscini
  - Department of Pharmaceutical Sciences, University of Perugia, 06121 Perugia, Italy; (L.R.); (A.C.); (D.C.P.); (G.C.)
- Angela Conti
  - Department of Pharmaceutical Sciences, University of Perugia, 06121 Perugia, Italy; (L.R.); (A.C.); (D.C.P.); (G.C.)
- Debora Casagrande Pierantoni
  - Department of Pharmaceutical Sciences, University of Perugia, 06121 Perugia, Italy; (L.R.); (A.C.); (D.C.P.); (G.C.)
- Vincent Robert
  - Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands
- Laura Corte
  - Department of Pharmaceutical Sciences, University of Perugia, 06121 Perugia, Italy; (L.R.); (A.C.); (D.C.P.); (G.C.)
  - Correspondence: Tel.: +39-0755856478
- Gianluigi Cardinali
  - Department of Pharmaceutical Sciences, University of Perugia, 06121 Perugia, Italy; (L.R.); (A.C.); (D.C.P.); (G.C.)
|
43
|
Liebal UW, Phan ANT, Sudhakar M, Raman K, Blank LM. Machine Learning Applications for Mass Spectrometry-Based Metabolomics. Metabolites 2020; 10:E243. [PMID: 32545768 PMCID: PMC7345470 DOI: 10.3390/metabo10060243] [Citation(s) in RCA: 143] [Impact Index Per Article: 35.8] [Received: 04/30/2020] [Revised: 06/09/2020] [Accepted: 06/11/2020] [Indexed: 12/20/2022]
Abstract
The metabolome of an organism depends on environmental factors and intracellular regulation and provides information about the physiological conditions. Metabolomics helps to understand disease progression in clinical settings or estimate metabolite overproduction for metabolic engineering. The most popular analytical metabolomics platform is mass spectrometry (MS). However, MS metabolome data analysis is complicated, since metabolites interact nonlinearly, and the data structures themselves are complex. Machine learning methods have become immensely popular for statistical analysis due to the inherent nonlinear data representation and the ability to process large and heterogeneous data rapidly. In this review, we address recent developments in using machine learning for processing MS spectra and show how machine learning generates new biological insights. In particular, supervised machine learning has great potential in metabolomics research because of the ability to supply quantitative predictions. We review here commonly used tools, such as random forest, support vector machines, artificial neural networks, and genetic algorithms. During processing steps, the supervised machine learning methods help peak picking, normalization, and missing data imputation. For knowledge-driven analysis, machine learning contributes to biomarker detection, classification and regression, biochemical pathway identification, and carbon flux determination. Of important relevance is the combination of different omics data to identify the contributions of the various regulatory levels. Our overview of the recent publications also highlights that data quality determines analysis quality, but also adds to the challenge of choosing the right model for the data. Machine learning methods applied to MS-based metabolomics ease data analysis and can support clinical decisions, guide metabolic engineering, and stimulate fundamental biological discoveries.
Affiliation(s)
- Ulf W. Liebal
  - Institute of Applied Microbiology, Aachen Biology and Biotechnology, RWTH Aachen University, Worringer Weg 1, 52074 Aachen, Germany
- An N. T. Phan
  - Institute of Applied Microbiology, Aachen Biology and Biotechnology, RWTH Aachen University, Worringer Weg 1, 52074 Aachen, Germany
- Malvika Sudhakar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India; (M.S.); (K.R.)
  - Initiative for Biological Systems Engineering, IIT Madras, Chennai 600 036, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Karthik Raman
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India; (M.S.); (K.R.)
  - Initiative for Biological Systems Engineering, IIT Madras, Chennai 600 036, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Lars M. Blank
  - Institute of Applied Microbiology, Aachen Biology and Biotechnology, RWTH Aachen University, Worringer Weg 1, 52074 Aachen, Germany
|
44
|
Cobas C. NMR signal processing, prediction, and structure verification with machine learning techniques. Magnetic Resonance in Chemistry 2020; 58:512-519. [PMID: 31912547 DOI: 10.1002/mrc.4989] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 12/10/2019] [Revised: 01/02/2020] [Accepted: 01/03/2020] [Indexed: 05/25/2023]
Abstract
Machine learning (ML) methods have been present in the field of NMR for decades, but they have experienced tremendous growth in the last few years, especially thanks to the emergence of deep learning (DL) techniques taking advantage of the increased amounts of data and available computer power. These algorithms are successfully employed for classification, regression, clustering, or dimensionality reduction tasks of large data sets and have been intensively applied in different areas of NMR including metabonomics, clinical diagnosis, or relaxometry. In this article, we concentrate on the various applications of ML/DL in the areas of NMR signal processing and analysis of small molecules, including automatic structure verification and prediction of NMR observables in solution.
Affiliation(s)
- Carlos Cobas
  - Mestrelab Research, Santiago de Compostela, Spain
|
45
|
Advances in Liquid Chromatography–Mass Spectrometry-Based Lipidomics: A Look Ahead. Journal of Analysis and Testing 2020. [DOI: 10.1007/s41664-020-00135-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/07/2023]
|
46
|
Bos TS, Knol WC, Molenaar SR, Niezen LE, Schoenmakers PJ, Somsen GW, Pirok BW. Recent applications of chemometrics in one- and two-dimensional chromatography. J Sep Sci 2020; 43:1678-1727. [PMID: 32096604 PMCID: PMC7317490 DOI: 10.1002/jssc.202000011] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 01/07/2020] [Revised: 02/20/2020] [Accepted: 02/21/2020] [Indexed: 12/28/2022]
Abstract
The proliferation of increasingly more sophisticated analytical separation systems, often incorporating increasingly more powerful detection techniques, such as high-resolution mass spectrometry, causes an urgent need for highly efficient data-analysis and optimization strategies. This is especially true for comprehensive two-dimensional chromatography applied to the separation of very complex samples. In this contribution, the requirement for chemometric tools is explained and the latest developments in approaches for (pre-)processing and analyzing data arising from one- and two-dimensional chromatography systems are reviewed. The final part of this review focuses on the application of chemometrics for method development and optimization.
Affiliation(s)
- Tijmen S. Bos
  - Division of Bioanalytical Chemistry, Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Wouter C. Knol
  - Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Stef R.A. Molenaar
  - Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Leon E. Niezen
  - Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Peter J. Schoenmakers
  - Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Govert W. Somsen
  - Division of Bioanalytical Chemistry, Amsterdam Institute for Molecules, Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
- Bob W.J. Pirok
  - Analytical Chemistry Group, van ’t Hoff Institute for Molecular Sciences, Faculty of Science, University of Amsterdam, Amsterdam, The Netherlands
  - Centre for Analytical Sciences Amsterdam (CASA), Amsterdam, The Netherlands
|
47
|
Mendez KM, Broadhurst DI, Reinke SN. Migrating from partial least squares discriminant analysis to artificial neural networks: a comparison of functionally equivalent visualisation and feature contribution tools using jupyter notebooks. Metabolomics 2020; 16:17. [PMID: 31965332 PMCID: PMC6974504 DOI: 10.1007/s11306-020-1640-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/30/2019] [Accepted: 01/13/2020] [Indexed: 01/25/2023]
Abstract
INTRODUCTION: Metabolomics data is commonly modelled multivariately using partial least squares discriminant analysis (PLS-DA). Its success is primarily due to ease of interpretation, through projection to latent structures, and transparent assessment of feature importance using regression coefficients and Variable Importance in Projection scores. In recent years several non-linear machine learning (ML) methods have grown in popularity but with limited uptake essentially due to convoluted optimisation and interpretation. Artificial neural networks (ANNs) are a non-linear projection-based ML method that shares a structural equivalence with PLS, and as such should be amenable to equivalent optimisation and interpretation methods.
OBJECTIVES: We hypothesise that standardised optimisation, visualisation, evaluation and statistical inference techniques commonly used by metabolomics researchers for PLS-DA can be migrated to a non-linear, single hidden layer, ANN.
METHODS: We compared a standardised optimisation, visualisation, evaluation and statistical inference techniques workflow for PLS with the proposed ANN workflow. Both workflows were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks on GitHub.
RESULTS: The migration of the PLS workflow to a non-linear, single hidden layer, ANN was successful. There was a similarity in significant metabolites determined using PLS model coefficients and the ANN Connection Weight Approach.
CONCLUSION: We have shown that it is possible to migrate the standardised PLS-DA workflow to simple non-linear ANNs. This result opens the door for more widespread use and to the investigation of transparent interpretation of more complex ANN architectures.
Affiliation(s)
- Kevin M Mendez
  - Centre for Integrative Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
- David I Broadhurst
  - Centre for Integrative Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
- Stacey N Reinke
  - Centre for Integrative Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
|
48
|
Mendez KM, Reinke SN, Broadhurst DI. A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification. Metabolomics 2019; 15:150. [PMID: 31728648 PMCID: PMC6856029 DOI: 10.1007/s11306-019-1612-4] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Received: 09/22/2019] [Accepted: 11/05/2019] [Indexed: 12/18/2022]
Abstract
INTRODUCTION: Metabolomics is increasingly being used in the clinical setting for disease diagnosis, prognosis and risk prediction. Machine learning algorithms are particularly important in the construction of multivariate metabolite prediction. Historically, partial least squares (PLS) regression has been the gold standard for binary classification. Nonlinear machine learning methods such as random forests (RF), kernel support vector machines (SVM) and artificial neural networks (ANN) may be more suited to modelling possible nonlinear metabolite covariance, and thus provide better predictive models.
OBJECTIVES: We hypothesise that for binary classification using metabolomics data, non-linear machine learning methods will provide superior generalised predictive ability when compared to linear alternatives, in particular when compared with the current gold standard PLS discriminant analysis.
METHODS: We compared the general predictive performance of eight archetypal machine learning algorithms across ten publicly available clinical metabolomics data sets. The algorithms were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks.
RESULTS: There was only marginal improvement in predictive ability for SVM and ANN over PLS across all data sets. RF performance was comparatively poor. The use of out-of-bag bootstrap confidence intervals provided a measure of uncertainty of model prediction such that the quality of metabolomics data was observed to be a bigger influence on generalised performance than model choice.
CONCLUSION: The size of the data set, and choice of performance metric, had a greater influence on generalised predictive performance than the choice of machine learning algorithm.
Affiliation(s)
- Kevin M Mendez
  - Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
- Stacey N Reinke
  - Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
- David I Broadhurst
  - Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
|