1
Deng Y, Yao Y, Wang Y, Yu T, Cai W, Zhou D, Yin F, Liu W, Liu Y, Xie C, Guan J, Hu Y, Huang P, Li W. An end-to-end deep learning method for mass spectrometry data analysis to reveal disease-specific metabolic profiles. Nat Commun 2024; 15:7136. [PMID: 39164279 PMCID: PMC11335749 DOI: 10.1038/s41467-024-51433-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 08/07/2024] [Indexed: 08/22/2024]
Abstract
Untargeted metabolomic analysis using mass spectrometry provides comprehensive metabolic profiling, but its medical application faces challenges of complex data processing, high inter-batch variability, and unidentified metabolites. Here, we present DeepMSProfiler, an explainable deep-learning-based method, enabling end-to-end analysis on raw metabolic signals with output of high accuracy and reliability. Using cross-hospital 859 human serum samples from lung adenocarcinoma, benign lung nodules, and healthy individuals, DeepMSProfiler successfully differentiates the metabolomic profiles of different groups (AUC 0.99) and detects early-stage lung adenocarcinoma (accuracy 0.961). Model flow and ablation experiments demonstrate that DeepMSProfiler overcomes inter-hospital variability and effects of unknown metabolites signals. Our ensemble strategy removes background-category phenomena in multi-classification deep-learning models, and the novel interpretability enables direct access to disease-related metabolite-protein networks. Further applying to lipid metabolomic data unveils correlations of important metabolites and proteins. Overall, DeepMSProfiler offers a straightforward and reliable method for disease diagnosis and mechanism discovery, enhancing its broad applicability.
Affiliation(s)
- Yongjie Deng
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Yao Yao
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Yanni Wang
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Tiantian Yu
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Wenhao Cai
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Dingli Zhou
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Feng Yin
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Wanli Liu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Yuying Liu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Chuanbo Xie
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Jian Guan
  - Department of Radiology, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yumin Hu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China.
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China.
- Peng Huang
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China.
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China.
- Weizhong Li
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China.
  - Sun Yat-Sen University School of Medicine, Sun Yat-Sen University, Shenzhen, China.
  - Key Laboratory of Tropical Disease Control of Ministry of Education, Sun Yat-sen University, Guangzhou, China.
2
O’Sullivan JF, Li M, Koay YC, Wang XS, Guglielmi G, Marques FZ, Nanayakkara S, Mariani J, Slaughter E, Kaye DM. Cardiac Substrate Utilization and Relationship to Invasive Exercise Hemodynamic Parameters in HFpEF. JACC Basic Transl Sci 2024; 9:281-299. [PMID: 38559626 PMCID: PMC10978404 DOI: 10.1016/j.jacbts.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/02/2023] [Accepted: 11/02/2023] [Indexed: 04/04/2024]
Abstract
The authors conducted transcardiac blood sampling in healthy subjects and subjects with heart failure with preserved ejection fraction (HFpEF) to compare cardiac metabolite and lipid substrate use. We demonstrate that fatty acids are less used by HFpEF hearts and that lipid extraction is influenced by hemodynamic factors including pulmonary pressures and cardiac index. The release of many products of protein catabolism is apparent in HFpEF compared to healthy myocardium. In subgroup analyses, differences in energy substrate use between female and male hearts were identified.
Affiliation(s)
- John F. O’Sullivan
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
  - Department of Cardiology, Royal Prince Alfred Hospital, Sydney, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, Australia
  - Department of Medicine, TU Dresden, Dresden, Germany
- Mengbo Li
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Yen Chin Koay
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, Australia
- Xiao Suo Wang
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
- Giovanni Guglielmi
  - Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia
  - School of Mathematics, University of Birmingham, Birmingham, United Kingdom
- Francine Z. Marques
  - Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Melbourne, Australia
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Victorian Heart Institute, Monash University, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
- Shane Nanayakkara
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
- Justin Mariani
  - Victorian Heart Institute, Monash University, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
- Eugene Slaughter
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
- David M. Kaye
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
3
Roach J, Mital R, Haffner JJ, Colwell N, Coats R, Palacios HM, Liu Z, Godinho JLP, Ness M, Peramuna T, McCall LI. Microbiome metabolite quantification methods enabling insights into human health and disease. Methods 2024; 222:81-99. [PMID: 38185226 DOI: 10.1016/j.ymeth.2023.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/27/2023] [Accepted: 12/13/2023] [Indexed: 01/09/2024]
Abstract
Many of the health-associated impacts of the microbiome are mediated by its chemical activity, producing and modifying small molecules (metabolites). Thus, microbiome metabolite quantification has a central role in efforts to elucidate and measure microbiome function. In this review, we cover general considerations when designing experiments to quantify microbiome metabolites, including sample preparation, data acquisition and data processing, since these are critical to downstream data quality. We then discuss data analysis and experimental steps to demonstrate that a given metabolite feature is of microbial origin. We further discuss techniques used to quantify common microbial metabolites, including short-chain fatty acids (SCFA), secondary bile acids (BAs), tryptophan derivatives, N-acyl amides and trimethylamine N-oxide (TMAO). Lastly, we conclude with challenges and future directions for the field.
Affiliation(s)
- Jarrod Roach
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Rohit Mital
  - Department of Biology, University of Oklahoma
- Jacob J Haffner
  - Department of Anthropology, University of Oklahoma; Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma
- Nathan Colwell
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Randy Coats
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Horvey M Palacios
  - Department of Anthropology, University of Oklahoma; Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma
- Zongyuan Liu
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Monica Ness
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Thilini Peramuna
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Laura-Isobel McCall
  - Department of Chemistry and Biochemistry, University of Oklahoma; Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma; Department of Chemistry and Biochemistry, San Diego State University.
4
Zhang N, Chen Q, Zhang P, Zhou K, Liu Y, Wang H, Duan S, Xie Y, Yu W, Kong Z, Ren L, Hou W, Yang J, Gong X, Dong L, Fang X, Shi L, Yu Y, Zheng Y. Quartet metabolite reference materials for inter-laboratory proficiency test and data integration of metabolomics profiling. Genome Biol 2024; 25:34. [PMID: 38268000 PMCID: PMC10809448 DOI: 10.1186/s13059-024-03168-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 01/09/2024] [Indexed: 01/26/2024]
Abstract
BACKGROUND: Various laboratory-developed metabolomic methods lead to big challenges in inter-laboratory comparability and effective integration of diverse datasets.
RESULTS: As part of the Quartet Project, we establish a publicly available suite of four metabolite reference materials derived from B lymphoblastoid cell lines from a family of parents and monozygotic twin daughters. We generate comprehensive LC-MS-based metabolomic data from the Quartet reference materials using targeted and untargeted strategies in different laboratories. The Quartet multi-sample-based signal-to-noise ratio enables objective assessment of the reliability of intra-batch and cross-batch metabolomics profiling in detecting intrinsic biological differences among the four groups of samples. Significant variations in the reliability of the metabolomics profiling are identified across laboratories. Importantly, ratio-based metabolomics profiling, by scaling the absolute values of a study sample relative to those of a common reference sample, enables cross-laboratory quantitative data integration. Thus, we construct the ratio-based high-confidence reference datasets between two reference samples, providing "ground truth" for inter-laboratory accuracy assessment, which enables objective evaluation of quantitative metabolomics profiling using various instruments and protocols.
CONCLUSIONS: Our study provides the community with rich resources and best practices for inter-laboratory proficiency tests and data integration, ensuring reliability of large-scale and longitudinal metabolomic studies.
Affiliation(s)
- Naixin Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Peipei Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Kejun Zhou
  - Human Metabolomics Institute, Inc., Shenzhen, Guangdong, China
- Yaqing Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Haiyan Wang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Shumeng Duan
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yongming Xie
  - Shanghai Applied Protein Technology Co. Ltd, Shanghai, China
- Wenxiang Yu
  - Novogene Bioinformatics Institute, Beijing, China
- Ziqing Kong
  - Calibra Diagnostics, Hangzhou, Zhejiang, China
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Xiang Fang
  - National Institute of Metrology, Beijing, China
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - International Human Phenome Institute, Shanghai, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
5
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023]
Abstract
BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios.
RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability of accurately clustering cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interests. We further provide practical guidelines for implementing the ratio based approach in increasingly large-scale multiomics studies.
CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
Affiliation(s)
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Naixin Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuanbang Mai
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Zehui Cao
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qingwang Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yaqing Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
  - International Human Phenome Institutes, Shanghai, China.
- Xiang Fang
  - National Institute of Metrology, Beijing, China.
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
6
Lin Y, Cao Y, Willie E, Patrick E, Yang JYH. Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2. Nat Commun 2023; 14:4272. [PMID: 37460600 DOI: 10.1038/s41467-023-39923-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/09/2022] [Accepted: 07/04/2023] [Indexed: 07/20/2023]
Abstract
The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies. We have generalized scMerge2 to enable the merging of millions of cells from single-cell studies generated by various single-cell technologies. Using a large COVID-19 data collection with over five million cells from 1000+ individuals, we demonstrate that scMerge2 enables multi-sample multi-condition scRNA-seq data integration from multiple cohorts and reveals signatures derived from cell-type expression that are more accurate in discriminating disease progression. Further, we demonstrate that scMerge2 can remove dataset variability in CyTOF, imaging mass cytometry and CITE-seq experiments, demonstrating its applicability to a broad spectrum of single-cell profiling technologies.
Affiliation(s)
- Yingxin Lin
  - Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
  - Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Yue Cao
  - Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
  - Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Elijah Willie
  - Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Ellis Patrick
  - Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
  - Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
  - The Westmead Institute for Medical Research, The University of Sydney, Sydney, NSW, 2006, Australia
- Jean Y H Yang
  - Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia.
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia.
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia.
  - Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.
7
Mattoli L, Gianni M, Burico M. Mass spectrometry-based metabolomic analysis as a tool for quality control of natural complex products. MASS SPECTROMETRY REVIEWS 2023; 42:1358-1396. [PMID: 35238411 DOI: 10.1002/mas.21773] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/11/2021] [Revised: 11/16/2021] [Accepted: 02/11/2022] [Indexed: 06/07/2023]
Abstract
Metabolomics is an area of intriguing and growing interest. Since the late 1990s, when the first Omic applications appeared to study metabolite's pool ("metabolome"), to understand new aspects of the global regulation of cellular metabolism in biology, there have been many evolutions. Currently, there are many applications in different fields such as clinical, medical, agricultural, and food. In our opinion, it is clear that developments in metabolomics analysis have also been driven by advances in mass spectrometry (MS) technology. As natural complex products (NCPs) are increasingly used around the world as medicines, food supplements, and substance-based medical devices, their analysis using metabolomic approaches will help to bring more and more rigor to scientific studies and industrial production monitoring. This review is intended to emphasize the importance of metabolomics as a powerful tool for studying NCPs, by which significant advantages can be obtained in terms of elucidation of their composition, biological effects, and quality control. The different approaches of metabolomic analysis, the main and basic techniques of multivariate statistical analysis are also briefly illustrated, to allow an overview of the workflow associated with the metabolomic studies of NCPs. Therefore, various articles and reviews are illustrated and commented as examples of the application of MS-based metabolomics to NCPs.
Affiliation(s)
- Luisa Mattoli
  - Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
- Mattia Gianni
  - Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
- Michela Burico
  - Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
8
Zhang Y, Sylvester KG, Jin B, Wong RJ, Schilling J, Chou CJ, Han Z, Luo RY, Tian L, Ladella S, Mo L, Marić I, Blumenfeld YJ, Darmstadt GL, Shaw GM, Stevenson DK, Whitin JC, Cohen HJ, McElhinney DB, Ling XB. Development of a Urine Metabolomics Biomarker-Based Prediction Model for Preeclampsia during Early Pregnancy. Metabolites 2023; 13:715. [PMID: 37367874 DOI: 10.3390/metabo13060715] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/21/2023] [Revised: 05/21/2023] [Accepted: 05/25/2023] [Indexed: 06/28/2023]
Abstract
Preeclampsia (PE) is a condition that poses a significant risk of maternal mortality and multiple organ failure during pregnancy. Early prediction of PE can enable timely surveillance and interventions, such as low-dose aspirin administration. In this study, conducted at Stanford Health Care, we examined a cohort of 60 pregnant women and collected 478 urine samples between gestational weeks 8 and 20 for comprehensive metabolomic profiling. By employing liquid chromatography mass spectrometry (LCMS/MS), we identified the structures of seven out of 26 metabolomics biomarkers detected. Utilizing the XGBoost algorithm, we developed a predictive model based on these seven metabolomics biomarkers to identify individuals at risk of developing PE. The performance of the model was evaluated using 10-fold cross-validation, yielding an area under the receiver operating characteristic curve of 0.856. Our findings suggest that measuring urinary metabolomics biomarkers offers a noninvasive approach to assess the risk of PE prior to its onset.
Affiliation(s)
- Yaqi Zhang
  - College of Automation, Guangdong Polytechnic Normal University, Guangzhou 510665, China
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Karl G Sylvester
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Bo Jin
  - mProbe Inc., Palo Alto, CA 94303, USA
- Ronald J Wong
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- C James Chou
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Zhi Han
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ruben Y Luo
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lu Tian
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lihong Mo
  - UC Davis Health, Sacramento, CA 95817, USA
- Ivana Marić
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Yair J Blumenfeld
  - Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Gary L Darmstadt
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Gary M Shaw
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- David K Stevenson
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- John C Whitin
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Harvey J Cohen
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Doff B McElhinney
  - Departments of Cardiothoracic Surgery and Pediatrics (Cardiology), Stanford University School of Medicine, Stanford, CA 94305, USA
- Xuefeng B Ling
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
9
Chan AS, Wu S, Vernon ST, Tang O, Figtree GA, Liu T, Yang JY, Patrick E. Overcoming cohort heterogeneity for the prediction of subclinical cardiovascular disease risk. iScience 2023; 26:106633. [PMID: 37192969 PMCID: PMC10182278 DOI: 10.1016/j.isci.2023.106633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 02/03/2023] [Accepted: 04/04/2023] [Indexed: 05/18/2023]
Abstract
Cardiovascular disease remains a leading cause of mortality with an estimated half a billion people affected in 2019. However, detecting signals between specific pathophysiology and coronary plaque phenotypes using complex multi-omic discovery datasets remains challenging due to the diversity of individuals and their risk factors. Given the complex cohort heterogeneity present in those with coronary artery disease (CAD), we illustrate several different methods, both knowledge-guided and data-driven approaches, for identifying subcohorts of individuals with subclinical CAD and distinct metabolomic signatures. We then demonstrate that utilizing these subcohorts can improve the prediction of subclinical CAD and can facilitate the discovery of novel biomarkers of subclinical disease. Analyses acknowledging cohort heterogeneity through identifying and utilizing these subcohorts may be able to advance our understanding of CVD and provide more effective preventative treatments to reduce the burden of this disease in individuals and in society as a whole.
Affiliation(s)
- Adam S. Chan
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
| | - Songhua Wu
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia
| | - Stephen T. Vernon
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
| | - Owen Tang
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
| | - Gemma A. Figtree
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
| | - Tongliang Liu
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia
| | - Jean Y.H. Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Corresponding author
| | - Ellis Patrick
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Westmead Medical Institute, Sydney, NSW, Australia
- Corresponding author
| |
|
10
|
Guo F, Lin G, Dong L, Cheng KK, Deng L, Xu X, Raftery D, Dong J. Concordance-Based Batch Effect Correction for Large-Scale Metabolomics. Anal Chem 2023; 95:7220-7228. [PMID: 37115661 DOI: 10.1021/acs.analchem.2c05748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
For a large-scale metabolomics study, sample collection, preparation, and analysis may last several days, months, or even (intermittently) over years. This may lead to apparent batch effects in the acquired metabolomics data due to variability in instrument status, environmental conditions, or experimental operators. Batch effects may confound the true biological relationships among metabolites and thus obscure real metabolic changes. At present, most of the commonly used batch effect correction (BEC) methods are based on quality control (QC) samples, which require sufficient and stable QC samples. However, the quality of the QC samples may deteriorate if the experiment lasts for a long time. Alternatively, isotope-labeled internal standards have been used, but they generally do not provide good coverage of the metabolome. On the other hand, BEC can also be conducted through a data-driven method, in which no QC sample is needed. Here, we propose a novel data-driven BEC method, namely, CordBat, to achieve concordance between each batch of samples. In the proposed CordBat method, a reference batch is first selected from all batches of data, and the remaining batches are referred to as "other batches." The reference batch serves as the baseline for the batch adjustment by providing a coordinate of correlation between metabolites. Next, a Gaussian graphical model is built on the combined dataset of reference and other batches, and finally, BEC is achieved by optimizing the correction coefficients in the other batches so that the correlation between metabolites of each batch and their combinations are in concordance with that of the reference batch. Three real-world metabolomics datasets are used to evaluate the performance of CordBat by comparing it with five commonly used BEC methods. The present experimental results showed the effectiveness of CordBat in batch effect removal and the concordance of correlation between metabolites after BEC. CordBat was found to be comparable to the QC-based methods and achieved better performance in the preservation of biological effects. The proposed CordBat method may serve as an alternative BEC method for large-scale metabolomics that lack proper QC samples.
Affiliation(s)
- Fanjing Guo
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
| | - Genjin Lin
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
| | - Liheng Dong
- School of Computer Science and Technology, Xiamen University Malaysia, Sepang 43600, Malaysia
| | - Kian-Kai Cheng
- Faculty of Chemical and Energy Engineering, Universiti Teknologi Malaysia, Johor 81310, Malaysia
| | - Lingli Deng
- Department of Information Engineering, East China University of Technology, Nanchang 330013, China
| | - Xiangnan Xu
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales 2006, Australia
| | - Daniel Raftery
- Northwest Metabolomics Research Center, University of Washington, Seattle, Washington 98109, United States
| | - Jiyang Dong
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
| |
|
11
|
Quantitative challenges and their bioinformatic solutions in mass spectrometry-based metabolomics. Trends Analyt Chem 2023. [DOI: 10.1016/j.trac.2023.117009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
|
12
|
Ding J, Feng YQ. Mass spectrometry-based metabolomics for clinical study: Recent progresses and applications. Trends Analyt Chem 2022. [DOI: 10.1016/j.trac.2022.116896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
|
13
|
Yang Q, Li B, Wang P, Xie J, Feng Y, Liu Z, Zhu F. LargeMetabo: an out-of-the-box tool for processing and analyzing large-scale metabolomic data. Brief Bioinform 2022; 23:6768054. [PMID: 36274234 DOI: 10.1093/bib/bbac455] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/04/2022] [Revised: 09/06/2022] [Accepted: 09/24/2022] [Indexed: 12/14/2022] Open
Abstract
Large-scale metabolomics is a powerful technique that has attracted widespread attention in biomedical studies focused on identifying biomarkers and interpreting the mechanisms of complex diseases. Despite a rapid increase in the number of large-scale metabolomic studies, the analysis of metabolomic data remains a key challenge. Specifically, diverse unwanted variations and batch effects in processing many samples have a substantial impact on identifying true biological markers, and it is a daunting challenge to annotate a plethora of peaks as metabolites in untargeted mass spectrometry-based metabolomics. Therefore, the development of an out-of-the-box tool is urgently needed to realize data integration and to accurately annotate metabolites with enhanced functions. In this study, the LargeMetabo package based on R code was developed for processing and analyzing large-scale metabolomic data. This package is unique because it is capable of (1) integrating multiple analytical experiments to effectively boost the power of statistical analysis; (2) selecting the appropriate biomarker identification method by intelligent assessment for large-scale metabolic data and (3) providing metabolite annotation and enrichment analysis based on an enhanced metabolite database. The LargeMetabo package can facilitate flexibility and reproducibility in large-scale metabolomics. The package is freely available from https://github.com/LargeMetabo/LargeMetabo.
Affiliation(s)
- Qingxia Yang
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, Chongqing 401331, China
| | - Panpan Wang
- College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
| | - Jicheng Xie
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
| | - Yuhao Feng
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
| | - Ziqiang Liu
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| |
|
14
|
Shaver AO, Garcia BM, Gouveia GJ, Morse AM, Liu Z, Asef CK, Borges RM, Leach FE, Andersen EC, Amster IJ, Fernández FM, Edison AS, McIntyre LM. An anchored experimental design and meta-analysis approach to address batch effects in large-scale metabolomics. Front Mol Biosci 2022; 9:930204. [PMID: 36438654 PMCID: PMC9682135 DOI: 10.3389/fmolb.2022.930204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 10/10/2022] [Indexed: 11/27/2022] Open
Abstract
Untargeted metabolomics studies are unbiased but identifying the same feature across studies is complicated by environmental variation, batch effects, and instrument variability. Ideally, several studies that assay the same set of metabolic features would be used to select recurring features to pursue for identification. Here, we developed an anchored experimental design. This generalizable approach enabled us to integrate three genetic studies consisting of 14 test strains of Caenorhabditis elegans prior to the compound identification process. An anchor strain, PD1074, was included in every sample collection, resulting in a large set of biological replicates of a genetically identical strain that anchored each study. This enables us to estimate treatment effects within each batch and apply straightforward meta-analytic approaches to combine treatment effects across batches without the need for estimation of batch effects and complex normalization strategies. We collected 104 test samples for three genetic studies across six batches to produce five analytical datasets from two complementary technologies commonly used in untargeted metabolomics. Here, we use the model system C. elegans to demonstrate that an augmented design combined with experimental blocks and other metabolomic QC approaches can be used to anchor studies and enable comparisons of stable spectral features across time without the need for compound identification. This approach is generalizable to systems where the same genotype can be assayed in multiple environments and provides biologically relevant features for downstream compound identification efforts. All methods are included in the newest release of the publicly available SECIMTools based on the open-source Galaxy platform.
Affiliation(s)
- Amanda O. Shaver
- Department of Genetics, University of Georgia, Athens, GA, United States
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
| | - Brianna M. Garcia
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Chemistry, University of Georgia, Athens, GA, United States
| | - Goncalo J. Gouveia
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Biochemistry, University of Georgia, Athens, GA, United States
| | - Alison M. Morse
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
| | - Zihao Liu
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
| | - Carter K. Asef
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
| | - Ricardo M. Borges
- Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Franklin E. Leach
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Environmental Health Science, University of Georgia, Athens, GA, United States
| | - Erik C. Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, United States
| | - I. Jonathan Amster
- Department of Chemistry, University of Georgia, Athens, GA, United States
| | - Facundo M. Fernández
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
| | - Arthur S. Edison
- Department of Genetics, University of Georgia, Athens, GA, United States
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Biochemistry, University of Georgia, Athens, GA, United States
| | - Lauren M. McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- University of Florida Genetics Institute, University of Florida, Gainesville, FL, United States
- Corresponding author
| |
|