1. Batty CA, Pearson VK, Olsson-Francis K, Morgan G. Volatile organic compounds (VOCs) in terrestrial extreme environments: implications for life detection beyond Earth. Nat Prod Rep 2025; 42:93-112. [PMID: 39431456 DOI: 10.1039/d4np00037d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2024]
Abstract
Covering: 1961 to 2024
Discovering and identifying unique natural products/biosignatures (signatures that can be used as evidence for past or present life) that are abundant and complex enough to provide robust evidence of life is a multifaceted process. One distinct category of biosignatures being explored is organic compounds. A subdivision of these compounds not yet readily investigated is volatile organic compounds (VOCs). When these VOCs are assessed as a group (the volatilome), a fingerprint of all VOCs within an environment allows the complex patterns in metabolic data to be unravelled. As a technique already successfully applied to many biological and ecological fields, this paper explores how analysis of volatilomes in terrestrial extreme environments could be used to enhance processes (such as metabolomics and metagenomics) already utilised in life detection beyond Earth. By overcoming some of the complexities of collecting VOCs in remote field sites, a variety of lab-based analytical equipment and techniques can then be utilised. Researching volatilomics in astrobiology requires time to characterise the patterns of VOCs. They must then be differentiated from abiotic (non-living) signals within extreme environments similar to those found on other planetary bodies (analogue sites) or in lab-based simulated environments or microcosms. Such an effort is critical for understanding data returned from past or upcoming missions, but it requires a step change in approach which explores the volatilome as a vital additional tool to current 'Omics techniques.
Affiliation(s)
- Claire A Batty
  - The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK.
- Geraint Morgan
  - The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK.
2. Matyushin DD, Burov IA, Sholokhova AY. Uncertainty Quantification and Flagging of Unreliable Predictions in Predicting Mass Spectrometry-Related Properties of Small Molecules Using Machine Learning. Int J Mol Sci 2024; 25:13077. [PMID: 39684785 DOI: 10.3390/ijms252313077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2024] [Revised: 11/28/2024] [Accepted: 12/04/2024] [Indexed: 12/18/2024] [Open Access]
Abstract
Mass spectral identification (in particular, in metabolomics) can be refined by comparing the observed and predicted properties of molecules, such as chromatographic retention. Significant advancements have been made in predicting these values using machine learning and deep learning. Usually, model predictions do not contain any indication of the possible error (uncertainty) or only one criterion is used for this purpose. The spread of predictions of several models included in the ensemble, and the molecular similarity of the considered molecule and the most "similar" molecule from the training set, are values that allow us to estimate the uncertainty. The Euclidean distance between vectors, calculated based on real-valued molecular descriptors, can be used for the assessment of molecular similarity. Another factor indicating uncertainty is the molecule's belonging to one of the clusters (data set clustering). Together, all three factors can be used as features for the uncertainty assessment model. Classification models that predict whether a prediction belongs to the worst 15% were obtained. The area under the receiver operating curve value is in the range of 0.73-0.82 for the considered tasks: the prediction of retention indices in gas chromatography, retention times in liquid chromatography, and collision cross-sections in ion mobility spectroscopy.
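As a rough illustration of two of the uncertainty factors named in this abstract (the spread of ensemble predictions and the Euclidean distance to the nearest training-set molecule), a minimal Python sketch might look like the following. It is not the authors' implementation: the function names, the use of the population standard deviation as the spread measure, and the tie-handling in the "worst 15%" labelling are all assumptions made for illustration, and the third factor (cluster membership) is omitted.

```python
import math
import statistics


def uncertainty_features(ensemble_predictions, query_descriptors, training_descriptors):
    """Return (spread, nearest_distance) for one molecule.

    spread: population standard deviation of the ensemble's predictions
            (assumed spread measure; the abstract does not specify one).
    nearest_distance: Euclidean distance from the query descriptor vector
            to the closest descriptor vector in the training set.
    """
    spread = statistics.pstdev(ensemble_predictions)
    nearest_distance = min(
        math.dist(query_descriptors, train_vec) for train_vec in training_descriptors
    )
    return spread, nearest_distance


def worst_fraction_labels(abs_errors, fraction=0.15):
    """Label each prediction True if its absolute error is among the worst `fraction`.

    Ties at the threshold are all labelled True, so slightly more than
    `fraction` of the items can be flagged (hypothetical design choice).
    """
    n_worst = max(1, round(fraction * len(abs_errors)))
    threshold = sorted(abs_errors, reverse=True)[n_worst - 1]
    return [err >= threshold for err in abs_errors]


if __name__ == "__main__":
    # Hypothetical toy data: three ensemble members, 2-D descriptor vectors.
    features = uncertainty_features(
        ensemble_predictions=[1510.0, 1495.0, 1522.0],
        query_descriptors=[0.2, 1.1],
        training_descriptors=[[0.0, 1.0], [2.0, 3.0]],
    )
    print(features)
```

Features of this kind, paired with the binary labels, could then be fed to any standard classifier; which classifier the authors used is not stated in the abstract.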
Affiliation(s)
- Dmitriy D Matyushin
  - A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, 119071 Moscow, Russia
- Ivan A Burov
  - A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, 119071 Moscow, Russia
- Anastasia Yu Sholokhova
  - A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, 119071 Moscow, Russia
3. Feng Y, Soni A, Brightwell G, M Reis M, Wang Z, Wang J, Wu Q, Ding Y. The potential new microbial hazard monitoring tool in food safety: Integration of metabolomics and artificial intelligence. Trends Food Sci Technol 2024; 149:104555. [DOI: 10.1016/j.tifs.2024.104555] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/04/2025]
4. Raki H, Aalaila Y, Taktour A, Peluffo-Ordóñez DH. Combining AI Tools with Non-Destructive Technologies for Crop-Based Food Safety: A Comprehensive Review. Foods 2023; 13:11. [PMID: 38201039 PMCID: PMC10777928 DOI: 10.3390/foods13010011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 11/27/2023] [Accepted: 12/06/2023] [Indexed: 01/12/2024] [Open Access]
Abstract
On a global scale, food safety and security aspects entail consideration throughout the farm-to-fork continuum, considering food's supply chain. Generally, the agrifood system is a multiplex network of interconnected features and processes, with a hard predictive rate, where maintaining the food's safety is an indispensable element and is part of the Sustainable Development Goals (SDGs). It has led the scientific community to develop advanced applied analytical methods, such as machine learning (ML) and deep learning (DL) techniques applied for assessing foodborne diseases. The main objective of this paper is to contribute to the development of the consensus version of ongoing research about the application of Artificial Intelligence (AI) tools in the domain of food-crop safety from an analytical point of view. Writing a comprehensive review for a more specific topic can also be challenging, especially when searching within the literature. To our knowledge, this review is the first to address this issue. This work consisted of conducting a unique and exhaustive study of the literature, using our TriScope Keywords-based Synthesis methodology. All available literature related to our topic was investigated according to our criteria of inclusion and exclusion. The final count of data papers was subject to deep reading and analysis to extract the necessary information to answer our research questions. Although many studies have been conducted, limited attention has been paid to outlining the applications of AI tools combined with analytical strategies for crop-based food safety specifically.
Affiliation(s)
- Hind Raki
  - College of Computing, University Mohammed VI Polytechnic, Ben Guerir 43150, Morocco
- Yahya Aalaila
  - College of Computing, University Mohammed VI Polytechnic, Ben Guerir 43150, Morocco
- Ayoub Taktour
  - Materials Sciences and Nanotechnology (MSN), University Mohammed VI Polytechnic, Ben Guerir 43150, Morocco
- Diego H. Peluffo-Ordóñez
  - College of Computing, University Mohammed VI Polytechnic, Ben Guerir 43150, Morocco
5. Wang K, Theeke LA, Liao C, Wang N, Lu Y, Xiao D, Xu C. Deep learning analysis of UPLC-MS/MS-based metabolomics data to predict Alzheimer's disease. J Neurol Sci 2023; 453:120812. [PMID: 37776718 DOI: 10.1016/j.jns.2023.120812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/22/2023] [Accepted: 09/14/2023] [Indexed: 10/02/2023]
Abstract
OBJECTIVE: Metabolic biomarkers can potentially inform disease progression in Alzheimer's disease (AD). The purpose of this study is to identify and describe a new set of diagnostic biomarkers for developing deep learning (DL) tools to predict AD using Ultra Performance Liquid Chromatography Mass Spectrometry (UPLC-MS/MS)-based metabolomics data.
METHODS: A total of 177 individuals, including 78 with AD and 99 with cognitive normal (CN), were selected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort along with 150 metabolomic biomarkers. We performed feature selection using the Least Absolute Shrinkage and Selection Operator (LASSO). The H2O DL function was used to build multilayer feedforward neural networks to predict AD.
RESULTS: The LASSO selected 21 metabolic biomarkers. To develop DL models, the 21 biomarkers identified by LASSO were imported into the H2O package. The data was split into 70% for training and 30% for validation. The best DL model with two layers and 18 neurons achieved an accuracy of 0.881, F1-score of 0.892, and AUC of 0.873. Several metabolomic biomarkers involved in glucose and lipid metabolism, in particular bile acid metabolites, were associated with APOE-ε4 allele and clinical biomarkers (Aβ42, tTau, pTau), cognitive assessments [the Alzheimer's Disease Assessment Scale-cognitive subscale 13 (ADAS13), the Mini-Mental State Examination (MMSE)], and hippocampus volume.
CONCLUSIONS: This study identified a new set of diagnostic metabolomic biomarkers for developing DL tools to predict AD. These biomarkers may help with early diagnosis, prognostic risk stratification, and/or early treatment interventions for patients at risk for AD.
Affiliation(s)
- Kesheng Wang
  - School of Nursing, Health Sciences Center, West Virginia University, Morgantown, WV 26506, USA.
- Laurie A Theeke
  - School of Nursing, The George Washington University, Ashburn, VA 20147, USA
- Christopher Liao
  - Department of Electrical and Computer Engineering, Boston University, MA 02215, USA
- Nianyang Wang
  - Department of Health Policy and Management, School of Public Health, University of Maryland, College Park, MD 20742, USA
- Yongke Lu
  - Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, WV 25755, USA
- Danqing Xiao
  - Department of STEM, School of Arts and Sciences, Regis College, Weston, MA 02493, USA
- Chun Xu
  - Department of Health and Biomedical Sciences, College of Health Professions, University of Texas Rio Grande Valley, Brownsville, TX 78520, USA.
6. Wang D, Greenwood P, Klein MS. Feature impact assessment: a new score to identify relevant metabolomics features in artificial neural networks using validated labels. Metabolomics 2023; 19:22. [PMID: 36964272 DOI: 10.1007/s11306-023-01996-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Accepted: 03/14/2023] [Indexed: 03/26/2023]
Abstract
INTRODUCTION: Artificial Neural Networks (ANN) are increasingly used in metabolomics but are hard to interpret.
OBJECTIVES: We aimed at developing a feature impact score that is model-agnostic, simple, and interpretable.
METHODS: Feature Impact Assessment (FIA) is calculated by varying combinations of features within their observed value range and checking for changes in prediction outcomes. FIA was implemented in R and tested on metabolomics datasets.
RESULTS: FIA exceeded LIME and SHAP in selecting biologically meaningful features. Values were comparable across different ANN architectures.
CONCLUSION: FIA is a novel score ranking feature impact, helping interpreting ANN in the metabolomics field.
Affiliation(s)
- Danhui Wang
  - Department of Food Science and Technology, The Ohio State University, Columbus, OH, 43210, USA
  - Department of Nutrition and Food Sciences, Texas Woman's University, Denton, TX, 76204, USA
- Peyton Greenwood
  - Department of Food Science and Technology, The Ohio State University, Columbus, OH, 43210, USA
- Matthias S Klein
  - Department of Food Science and Technology, The Ohio State University, Columbus, OH, 43210, USA.
7. Barberis E, Khoso S, Sica A, Falasca M, Gennari A, Dondero F, Afantitis A, Manfredi M. Precision Medicine Approaches with Metabolomics and Artificial Intelligence. Int J Mol Sci 2022; 23:11269. [PMID: 36232571 PMCID: PMC9569627 DOI: 10.3390/ijms231911269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 09/14/2022] [Accepted: 09/20/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Recent technological innovations in the field of mass spectrometry have supported the use of metabolomics analysis for precision medicine. This growth has been allowed also by the application of algorithms to data analysis, including multivariate and machine learning methods, which are fundamental to managing large number of variables and samples. In the present review, we reported and discussed the application of artificial intelligence (AI) strategies for metabolomics data analysis. Particularly, we focused on widely used non-linear machine learning classifiers, such as ANN, random forest, and support vector machine (SVM) algorithms. A discussion of recent studies and research focused on disease classification, biomarker identification and early diagnosis is presented. Challenges in the implementation of metabolomics-AI systems, limitations thereof and recent tools were also discussed.
Affiliation(s)
- Elettra Barberis
  - Department of Translational Medicine, University of Piemonte Orientale, 28100 Novara, Italy
  - Center for Translational Research on Autoimmune and Allergic Diseases, University of Piemonte Orientale, 28100 Novara, Italy
- Shahzaib Khoso
  - Department of Translational Medicine, University of Piemonte Orientale, 28100 Novara, Italy
  - Center for Translational Research on Autoimmune and Allergic Diseases, University of Piemonte Orientale, 28100 Novara, Italy
- Antonio Sica
  - Department of Pharmaceutical Sciences, University of Piemonte Orientale, 28100 Novara, Italy
  - Humanitas Clinical and Research Center, IRCCS, 20089 Rozzano, Italy
- Marco Falasca
  - Metabolic Signaling Group, Curtin Medical School, Curtin University, Perth 6845, Australia
- Alessandra Gennari
  - Department of Translational Medicine, University of Piemonte Orientale, 28100 Novara, Italy
- Francesco Dondero
  - Department of Sciences and Technological Innovation, University of Piemonte Orientale, 15100 Alessandria, Italy
- Marcello Manfredi
  - Department of Translational Medicine, University of Piemonte Orientale, 28100 Novara, Italy
  - Center for Translational Research on Autoimmune and Allergic Diseases, University of Piemonte Orientale, 28100 Novara, Italy
8. NMR in Metabolomics: From Conventional Statistics to Machine Learning and Neural Network Approaches. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12062824] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/17/2022]
Abstract
NMR measurements combined with chemometrics allow achieving a great amount of information for the identification of potential biomarkers responsible for a precise metabolic pathway. These kinds of data are useful in different fields, ranging from food to biomedical fields, including health science. The investigation of the whole set of metabolites in a sample, representing its fingerprint in the considered condition, is known as metabolomics and may take advantage of different statistical tools. The new frontier is to adopt self-learning techniques to enhance clustering or classification actions that can improve the predictive power over large amounts of data. Although machine learning is already employed in metabolomics, deep learning and artificial neural networks approaches were only recently successfully applied. In this work, we give an overview of the statistical approaches underlying the wide range of opportunities that machine learning and neural networks allow to perform with accurate metabolites assignment and quantification. Various actual challenges are discussed, such as proper metabolomics, deep learning architectures and model accuracy.