1. Sun B, Fang Y, Yang H, Meng F, He C, Zhao Y, Zhao K, Zhang H. The combination of deep learning and pseudo-MS image improves the applicability of metabolomics to congenital heart defect prenatal screening. Talanta 2024; 275:126109. [PMID: 38648686 DOI: 10.1016/j.talanta.2024.126109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/09/2024] [Accepted: 04/12/2024] [Indexed: 04/25/2024]
Abstract
To investigate the metabolic alterations in maternal individuals with fetal congenital heart disease (FCHD), establish FCHD diagnostic models, and assess the performance of these models, we recruited two batches of pregnant women. Metabolomics analysis by ultra-high-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) identified a total of 36 significantly altered metabolites (VIP > 1.0) between the FCHD and non-FCHD groups. Two logistic regression models and four support vector machine (SVM) models exhibited strong performance and clinical utility in the training set (area under the curve (AUC) = 1.00). The convolutional neural network (CNN) model also demonstrated commendable performance and clinical utility (AUC = 0.89 in the training set). Notably, in the validation set, the CNN model (AUC = 0.66, precision = 0.714) was more robust than the six models above (AUC ≤ 0.50). In conclusion, the CNN model based on pseudo-MS images holds promise for real-world and clinical applications due to its better repeatability.
Affiliation(s)
- Borui Sun
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Yiwei Fang
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Center for Reproductive Medicine, Department of Obstetrics and Gynecology, Peking University Third Hospital, No. 49, North Garden Road, Haidian district, Beijing, 100191, China; National Clinical Research Center for Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; State Key Laboratory of Female Fertility Promotion, Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing, 100191, China; Key Laboratory of Assisted Reproduction, Ministry of Education, Peking University, Beijing, 100191, China; Beijing Key Laboratory of Reproductive Endocrinology and Assisted Reproductive Technology, Beijing, 100191, China
- Hui Yang
  - Department of Obstetrics, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430070, China
- Fan Meng
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Chao He
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Yun Zhao
  - Department of Obstetrics, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430070, China
- Kai Zhao
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Huiping Zhang
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
2. Li Y, Xu Y, Le Sayec M, Yan X, Spector TD, Steves CJ, Bell JT, Small KS, Menni C, Gibson R, Rodriguez-Mateos A. Development of a (Poly)phenol Metabolic Signature for Assessing (Poly)phenol-Rich Dietary Patterns. J Agric Food Chem 2024; 72:13439-13450. [PMID: 38829321 PMCID: PMC11181312 DOI: 10.1021/acs.jafc.4c00959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/14/2024] [Accepted: 04/30/2024] [Indexed: 06/05/2024]
Abstract
The objective assessment of habitual (poly)phenol-rich diets in nutritional epidemiology studies remains challenging. This study developed and evaluated the metabolic signature of a (poly)phenol-rich dietary score (PPS) using a targeted metabolomics method comprising 105 representative (poly)phenol metabolites, analyzed in 24 h urine samples collected from healthy volunteers. The metabolites that were significantly associated with PPS after adjusting for energy intake were selected to establish a metabolic signature using a combination of linear regression followed by ridge regression to estimate penalized weights for each metabolite. A metabolic signature comprising 51 metabolites was significantly associated with adherence to PPS in 24 h urine samples, as well as with (poly)phenol intake estimated from food frequency questionnaires and diaries. Internal and external data sets were used for validation, and plasma, spot urine, and 24 h urine samples were compared. The metabolic signature proposed here has the potential to accurately reflect adherence to (poly)phenol-rich diets, and may be used as an objective tool for the assessment of (poly)phenol intake.
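As an illustration of the ridge step named in the abstract above, the following is a minimal sketch of closed-form ridge weights and a weighted signature score. It assumes the metabolite levels and the dietary score have already been centred and that the screening by linear regression has been done; the function names, the penalty `lam`, and the toy inputs are hypothetical and are not taken from the study.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_weights(x, y, lam):
    """Penalized weights (X'X + lam * I)^-1 X'y for centred data.

    x: list of samples, each a list of metabolite levels (hypothetical input)
    y: list of dietary scores, one per sample (hypothetical input)
    lam: ridge penalty; larger values shrink the weights toward zero
    """
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def signature_score(sample, weights):
    """Weighted sum of one sample's metabolite levels."""
    return sum(v * w for v, w in zip(sample, weights))
```

With `lam = 0` this reduces to ordinary least squares; increasing `lam` shrinks every weight, which is the penalization the abstract refers to. How the study chose its penalty is not stated here.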
Affiliation(s)
- Yong Li
  - Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9NH, U.K.
- Yifan Xu
  - Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9NH, U.K.
- Melanie Le Sayec
  - Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9NH, U.K.
- Xinyu Yan
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Tim D. Spector
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Claire J. Steves
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Jordana T. Bell
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Kerrin S. Small
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Cristina Menni
  - Department of Twin Research & Genetic Epidemiology, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, U.K.
- Rachel Gibson
  - Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9NH, U.K.
- Ana Rodriguez-Mateos
  - Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9NH, U.K.
3. Haridas PC, Ravichandran R, Shaikh N, Kishore P, Kumar Panda S, Banerjee K, Sekhar Chatterjee N. Authentication of the species identity of squid rings using UHPLC-Q-Orbitrap MS/MS-based lipidome fingerprinting and chemoinformatics. Food Chem 2024; 442:138525. [PMID: 38271906 DOI: 10.1016/j.foodchem.2024.138525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 12/20/2023] [Accepted: 01/18/2024] [Indexed: 01/27/2024]
Abstract
Species mislabeling of commercial Loliginidae squid can undermine important conservation efforts and prevent consumers from making informed decisions. A comprehensive lipidomic fingerprint of Uroteuthis singhalensis, Uroteuthis edulis, and Uroteuthis duvauceli rings was established using high-resolution mass spectrometry-based lipidomics and chemoinformatics analysis. The principal component analysis showed a clear separation of sample groups, with R2X and Q2 values of 0.97 and 0.85 for ESI+ and 0.96 and 0.86 for ESI-, indicating a good model fit. The optimized OPLS-DA and PLS-DA models could discriminate the species identity of validation samples with 100% accuracy. A total of 67 and 90 lipid molecules were putatively identified as biomarkers in ESI+ and ESI-, respectively. Identified lipids, including PC(40:6), C14 sphingomyelin, PS(O-36:0), and PE(41:4), played an important role in species discrimination. For the first time, this study provides a detailed lipidomics profile of commercially important Loliginidae squid and establishes a faster workflow for species authentication.
Affiliation(s)
- Pranamya C Haridas
  - National Reference Laboratory, ICAR-Central Institute of Fisheries Technology, Matsyapuri P.O., W. Island, Cochin 682029, India; Department of Chemical Oceanography, School of Marine Sciences, Cochin University of Science and Technology, Cochin 682016, India
- Rajesh Ravichandran
  - National Reference Laboratory, ICAR-Central Institute of Fisheries Technology, Matsyapuri P.O., W. Island, Cochin 682029, India
- Nasiruddin Shaikh
  - National Referral Laboratory, ICAR-National Research Centre for Grapes, Manjri Farm, Pune 412307, India
- Pankaj Kishore
  - National Reference Laboratory, ICAR-Central Institute of Fisheries Technology, Matsyapuri P.O., W. Island, Cochin 682029, India
- Satyen Kumar Panda
  - National Reference Laboratory, ICAR-Central Institute of Fisheries Technology, Matsyapuri P.O., W. Island, Cochin 682029, India; Food Safety and Standards Authority of India, FDA Bhawan, Kotla Road, New Delhi 110002, India
- Kaushik Banerjee
  - National Referral Laboratory, ICAR-National Research Centre for Grapes, Manjri Farm, Pune 412307, India
- Niladri Sekhar Chatterjee
  - National Reference Laboratory, ICAR-Central Institute of Fisheries Technology, Matsyapuri P.O., W. Island, Cochin 682029, India
4. Pelletier SJ, Leclercq M, Roux-Dalvai F, de Geus MB, Leslie S, Wang W, Lam TT, Nairn AC, Arnold SE, Carlyle BC, Precioso F, Droit A. BERNN: Enhancing classification of Liquid Chromatography Mass Spectrometry data with batch effect removal neural networks. Nat Commun 2024; 15:3777. [PMID: 38710683 PMCID: PMC11074280 DOI: 10.1038/s41467-024-48177-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 04/24/2024] [Indexed: 05/08/2024] Open
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions, and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of omics research, but current methods are not optimal for the removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. A comparison of batch effect correction methods across five diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that the overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.
Affiliation(s)
- Simon J Pelletier
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Mickaël Leclercq
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Florence Roux-Dalvai
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Matthijs B de Geus
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Leiden University Medical Center, Leiden, The Netherlands
- Shannon Leslie
  - Yale Department of Psychiatry, New Haven, CT, USA
  - Janssen Pharmaceuticals, San Diego, CA, USA
- Weiwei Wang
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
- TuKiet T Lam
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
  - Yale School of Medicine, Department of Molecular Biophysics and Biochemistry, New Haven, CT, USA
- Steven E Arnold
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
- Becky C Carlyle
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Oxford University Department of Physiology Anatomy and Genetics, Oxford, UK
  - Kavli Institute for Nanoscience Discovery, Oxford, UK
- Frédéric Precioso
  - Université Côte d'Azur, CNRS, INRIA, I3S, Sophia Antipolis, Nice, France
- Arnaud Droit
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
5. Fino NF, Adingwupu OM, Coresh J, Greene T, Haaland B, Shlipak MG, Costa E Silva VT, Kalil R, Mindikoglu AL, Furth SL, Seegmiller JC, Levey AS, Inker LA. Evaluation of novel candidate filtration markers from a global metabolomic discovery for glomerular filtration rate estimation. Kidney Int 2024; 105:582-592. [PMID: 38006943 PMCID: PMC10932836 DOI: 10.1016/j.kint.2023.11.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 10/31/2023] [Accepted: 11/10/2023] [Indexed: 11/27/2023]
Abstract
Creatinine and cystatin-C are recommended for estimating glomerular filtration rate (eGFR), but accuracy is suboptimal. Here, using untargeted metabolomics data, we sought to identify candidate filtration markers for a new targeted assay using a novel approach based on their maximal joint association with measured GFR (mGFR) and with flexibility to consider their biological properties. We analyzed metabolites measured in seven diverse studies encompassing 2,851 participants on the Metabolon H4 platform that had Pearson correlations with log mGFR < -0.5 and used a stepwise approach to develop models to estimate mGFR with and without inclusion of creatinine that enabled selection of candidate markers. In total, 456 identified metabolites were present in all studies, and 36 had correlations with mGFR < -0.5. A total of 2,225 models were developed that included these metabolites, all with lower root mean square errors and smaller coefficients for demographic variables compared to estimates using untargeted creatinine. Seventeen metabolites were chosen, including 12 new candidate filtration markers. The selected metabolites had strong associations with mGFR and little dependence on demographic factors. Candidate metabolites were identified with maximal joint association with mGFR and minimal dependence on demographic variables across many varied clinical settings. These metabolites are excreted in urine and represent diverse metabolic pathways and tubular handling. Thus, our data can be used to select metabolites for a multi-analyte eGFR determination assay using mass spectrometry that potentially offers better accuracy and is less prone to non-GFR determinants than the current eGFR biomarkers.
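The correlation screen described in the abstract above (keep metabolites whose Pearson correlation with log mGFR is below -0.5) can be sketched as follows. This is only an illustration: the function names, the dictionary layout, and the example values are hypothetical, and whether the study transformed the metabolite levels before correlating is not stated here, so the levels are used as given.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_candidates(metabolites, mgfr, threshold=-0.5):
    """Return names of metabolites whose correlation with log(mGFR) is below threshold.

    metabolites: dict mapping a metabolite name to its per-participant levels
    mgfr: measured GFR per participant, in the same participant order
    threshold: the -0.5 cut-off comes from the abstract; everything else is illustrative
    """
    log_mgfr = [math.log(g) for g in mgfr]
    return sorted(
        name
        for name, levels in metabolites.items()
        if pearson(levels, log_mgfr) < threshold
    )
```

A filtration marker accumulates in blood as GFR falls, which is why the screen keeps strongly negative correlations rather than large absolute ones.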
Affiliation(s)
- Nora F Fino
  - Division of Biostatistics, Department of Population Health Sciences, University of Utah Health, Salt Lake City, Utah, USA
- Ogechi M Adingwupu
  - Division of Nephrology, Tufts Medical Center, Boston, Massachusetts, USA
- Josef Coresh
  - Department of Population Health, NYU Langone, New York, New York, USA
- Tom Greene
  - Division of Biostatistics, Department of Population Health Sciences, University of Utah Health, Salt Lake City, Utah, USA
- Ben Haaland
  - Division of Biostatistics, Department of Population Health Sciences, University of Utah Health, Salt Lake City, Utah, USA
- Michael G Shlipak
  - Kidney Health Research Collaborative, San Francisco Veterans Affair Medical Center and University of California, San Francisco, San Francisco, California, USA
- Veronica T Costa E Silva
  - Serviço de Nefrologia, Instituto do Câncer do Estado de São Paulo, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil; Laboratório de Investigação Médica 16, Faculdade de Medicina da Universidade de São Paulo, São Paulo, Brazil
- Roberto Kalil
  - Division of Nephrology, Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland, USA
- Ayse L Mindikoglu
  - Margaret M. and Albert B. Alkek Department of Medicine, Section of Gastroenterology and Hepatology, Baylor College of Medicine, Houston, Texas, USA; Michael E. DeBakey Department of Surgery, Division of Abdominal Transplantation, Baylor College of Medicine, Houston, Texas, USA
- Susan L Furth
  - Department of Pediatrics, Children's Hospital of Philadelphia, and the Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jesse C Seegmiller
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota, USA
- Andrew S Levey
  - Division of Nephrology, Tufts Medical Center, Boston, Massachusetts, USA
- Lesley A Inker
  - Division of Nephrology, Tufts Medical Center, Boston, Massachusetts, USA
6. Fuller N, Kimbrough KL, Davenport E, Edwards ME, Jacob A, Chandramouli B, Johnson WE. Contaminants of Concern and Spatiotemporal Metabolomic Changes in Quagga Mussels (Dreissena bugensis rostriformis) from the Milwaukee Estuary (Wisconsin, USA). Environ Toxicol Chem 2024; 43:307-323. [PMID: 37877769 DOI: 10.1002/etc.5776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 07/31/2023] [Accepted: 10/23/2023] [Indexed: 10/26/2023]
Abstract
Environmental metabolomics has emerged as a promising technique in the field of biomonitoring and as an indicator of aquatic ecosystem health. In the Milwaukee Estuary (Wisconsin, USA), previous studies have used a nontargeted metabolomic approach to distinguish between zebra mussels (Dreissena polymorpha) collected from sites of varying contamination. To further elucidate the potential effects of contaminants on bivalve health in the Milwaukee Estuary, the present study adopted a caging approach to study the metabolome of quagga mussels (Dreissena bugensis rostriformis) deployed at six sites of varying contamination for 2, 5, or 55 days. Caged mussels were co-deployed with two types of passive samplers (polar organic chemical integrative samplers and semipermeable membrane devices) and data loggers. In conjunction, in situ quagga mussels were collected from the four sites studied previously and analyzed for residues of contaminants and metabolomics using a targeted approach. For the caging study, temporal differences in the metabolomic response were observed, with few significant changes after 2 and 5 days but larger differences (up to 97 significantly different metabolites) in the metabolome at all sites after 55 days. A suite of metabolic pathways was altered, including biosynthesis and metabolism of amino acids, and upmodulation of phospholipids at all sites, suggesting a potential biological influence such as gametogenesis. In the caging study, average temperatures appeared to have a greater effect on the metabolome than contaminants, despite a large concentration gradient in polycyclic aromatic hydrocarbon residues measured in passive samplers and mussel tissue. Conversely, significant differences between the metabolome of mussels collected in situ from all three contaminated sites and the offshore reference site were observed.
Overall, these findings highlight the importance of contextualizing the effects of environmental conditions and reproductive processes on the metabolome of model organisms to facilitate the wider use of this technique for biomonitoring and environmental health assessments. Environ Toxicol Chem 2024;43:307-323. © 2023 The Authors. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.
Affiliation(s)
- Kimani L Kimbrough
  - National Centers for Coastal Ocean Science, National Oceanic and Atmospheric Administration National Ocean Service, Silver Spring, Maryland, USA
- Erik Davenport
  - National Centers for Coastal Ocean Science, National Oceanic and Atmospheric Administration National Ocean Service, Silver Spring, Maryland, USA
- Michael E Edwards
  - National Centers for Coastal Ocean Science, National Oceanic and Atmospheric Administration National Ocean Service, Silver Spring, Maryland, USA
- W Edward Johnson
  - National Centers for Coastal Ocean Science, National Oceanic and Atmospheric Administration National Ocean Service, Silver Spring, Maryland, USA
7. Miyake A, Harada S, Sugiyama D, Matsumoto M, Hirata A, Miyagawa N, Toki R, Edagawa S, Kuwabara K, Okamura T, Sato A, Amano K, Hirayama A, Sugimoto M, Soga T, Tomita M, Arakawa K, Takebayashi T, Iida M. Reliability of Time-Series Plasma Metabolome Data over 6 Years in a Large-Scale Cohort Study. Metabolites 2024; 14:77. [PMID: 38276312 PMCID: PMC10819202 DOI: 10.3390/metabo14010077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 01/27/2024] Open
Abstract
Studies examining long-term longitudinal metabolomic data and their reliability in large-scale populations are limited. Therefore, we aimed to evaluate the reliability of repeated measurements of plasma metabolites in a prospective cohort setting and to explore intra-individual concentration changes at three time points over a 6-year period. The study participants included 2999 individuals (1317 men and 1682 women) from the Tsuruoka Metabolomics Cohort Study, who participated in all three surveys (at baseline, 3 years, and 6 years). In each survey, 94 plasma metabolites were quantified for each individual and quality control (QC) sample. The coefficients of variation of QC, intraclass correlation coefficients, and change rates of QC were calculated for each metabolite, and their reliability was classified into three categories: excellent, fair to good, and poor. Seventy-six percent (71/94) of metabolites were classified as fair to good or better. Of the 39 metabolites grouped as excellent, 29 (74%) in men and 26 (67%) in women showed significant intra-individual changes over 6 years. Overall, our study demonstrated a high degree of reliability for repeated metabolome measurements. Many highly reliable metabolites showed significant changes over the 6-year period, suggesting that repeated longitudinal metabolome measurements are useful for epidemiological studies.
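Two of the reliability statistics named in the abstract above, the coefficient of variation of QC samples and the intraclass correlation coefficient, can be sketched as below. The abstract does not say which ICC form or which category cut-offs the study used, so this sketch assumes a one-way random-effects ICC and stops short of classifying; the function names and input layout are hypothetical.

```python
import statistics


def coefficient_of_variation(qc_values):
    """Sample standard deviation divided by the mean of repeated QC measurements."""
    return statistics.stdev(qc_values) / statistics.mean(qc_values)


def icc_oneway(rows):
    """One-way random-effects ICC for a subjects-by-time-points table.

    rows: one list per participant, holding that participant's value at each
    of the k surveys (every participant must have the same k >= 2 values).
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    means = [sum(r) / k for r in rows]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for r, m in zip(rows, means) for x in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

An ICC near 1 means most of the variance lies between participants rather than within a participant across surveys, which is the sense of "reliable" used for repeated measurements.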
Affiliation(s)
- Atsuko Miyake
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
  - Department of Obstetrics and Gynecology, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Sei Harada
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Daisuke Sugiyama
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
  - Faculty of Nursing and Medical Care, Keio University, Kanagawa, Fujisawa 252-0883, Japan
  - Graduate School of Health Management, Keio University, Kanagawa, Fujisawa 252-0883, Japan
- Minako Matsumoto
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Aya Hirata
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Naoko Miyagawa
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Ryota Toki
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Shun Edagawa
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Kazuyo Kuwabara
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
- Tomonori Okamura
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
  - Graduate School of Health Management, Keio University, Kanagawa, Fujisawa 252-0883, Japan
- Asako Sato
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Kaori Amano
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Akiyoshi Hirayama
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Masahiro Sugimoto
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Tomoyoshi Soga
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Masaru Tomita
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Kazuharu Arakawa
  - Institute for Advanced Biosciences, Keio University, Yamagata, Tsuruoka 997-0052, Japan
- Toru Takebayashi
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
  - Graduate School of Health Management, Keio University, Kanagawa, Fujisawa 252-0883, Japan
- Miho Iida
  - Department of Preventive Medicine and Public Health, Keio University School of Medicine, Shinjuku, Tokyo 160-8582, Japan
8. Jeppesen MJ, Powers R. Multiplatform untargeted metabolomics. Magn Reson Chem 2023; 61:628-653. [PMID: 37005774 PMCID: PMC10948111 DOI: 10.1002/mrc.5350] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/16/2023] [Revised: 03/29/2023] [Accepted: 03/30/2023] [Indexed: 06/23/2024]
Abstract
Metabolomics samples like human urine or serum contain upwards of a few thousand metabolites, but individual analytical techniques can only characterize a few hundred metabolites at best. The uncertainty in metabolite identification commonly encountered in untargeted metabolomics adds to this low coverage problem. A multiplatform (multiple analytical techniques) approach can improve upon the number of metabolites reliably detected and correctly assigned. This can be further improved by applying synergistic sample preparation along with the use of combinatorial or sequential non-destructive and destructive techniques. Similarly, peak detection and metabolite identification strategies that employ multiple probabilistic approaches have led to better annotation decisions. Applying these techniques also addresses the issues of reproducibility found in single platform methods. Nevertheless, the analysis of large data sets from disparate analytical techniques presents unique challenges. While the general data processing workflow is similar across multiple platforms, many software packages are only fully capable of processing data types from a single analytical instrument. Traditional statistical methods such as principal component analysis were not designed to handle multiple, distinct data sets. Instead, multivariate analysis requires multiblock or other model types for understanding the contribution from multiple instruments. This review summarizes the advantages, limitations, and recent achievements of a multiplatform approach to untargeted metabolomics.
Affiliation(s)
- Micah J. Jeppesen
  - Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE 68588-0304, United States
  - Nebraska Center for Integrated Biomolecular Communication, University of Nebraska-Lincoln, Lincoln, NE 68588-0304, United States
- Robert Powers
  - Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE 68588-0304, United States
  - Nebraska Center for Integrated Biomolecular Communication, University of Nebraska-Lincoln, Lincoln, NE 68588-0304, United States
Collapse
|
9
|
Jeppesen MJ, Powers R. Multiplatform untargeted metabolomics. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2023; 61:628-653. [PMID: 37005774 PMCID: PMC10948111 DOI: 10.1002/mrc.5350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/16/2023] [Revised: 03/29/2023] [Accepted: 03/30/2023] [Indexed: 06/19/2023]
|
10
|
Xu Y, Li Y, Hu J, Gibson R, Rodriguez-Mateos A. Development of a novel (poly)phenol-rich diet score and its association with urinary (poly)phenol metabolites. Food Funct 2023; 14:9635-9649. [PMID: 37840467 DOI: 10.1039/d3fo01982a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Background: Estimating (poly)phenol intake is challenging due to inadequate dietary assessment tools and limited food content data. Currently, a priori diet scores to characterise (poly)phenol-rich diets are lacking. This study aimed to develop a novel (poly)phenol-rich diet score (PPS) and explore its relationship with circulating (poly)phenol metabolites. Methods: A total of 543 healthy free-living participants aged 18-80 years completed a food frequency questionnaire (FFQ) (EPIC-Norfolk) and provided 24 h urine samples. The PPS was developed based on the relative intake (quintiles) of 20 selected (poly)phenol-rich food items abundant in the UK diet, including tea, coffee, red wine, whole grains, chocolate and cocoa products, berries, apples and juice, pears, grapes, plums, citrus fruits and juice, potatoes and carrots, onions, peppers, garlic, green vegetables, pulses, soy and soy products, nuts, and olive oil. Foods included in the PPS were chosen based on their (poly)phenol content, main sources of (poly)phenols, and consumption frequencies in the UK population. Associations between the PPS and urinary phenolic metabolites were investigated using linear models adjusted for energy intake and corrected for multiple testing (FDR-adjusted p < 0.05). Results: The total PPS ranged from 25 to 88, with a mean score of 54. A total of 51 individual urinary metabolites were significantly associated with the PPS, including 39 phenolic acids, 5 flavonoids, 3 lignans, 2 resveratrol and 2 other (poly)phenol metabolites. The total (poly)phenol intake derived from FFQs also showed a positive association with PPS (stdBeta 0.32, 95% CI (0.24, 0.40), p < 0.01). Significant positive associations were observed in 24 of 27 classes and subclasses of estimated (poly)phenol intake and PPS, with stdBeta values ranging from 0.12 (0.04, 0.20) for theaflavins/thearubigins to 0.43 (0.34, 0.51) for flavonols (p < 0.01).
Conclusion: High adherence to the PPS diet is associated with (poly)phenol intake and urinary biomarkers, indicating the utility of the PPS to characterise diets rich in (poly)phenols at a population level.
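The quintile-based scoring described in the Methods can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the code below assumes that each food item contributes 1 point (lowest fifth of intake) to 5 points (highest fifth) and that the item scores are summed; with 20 items this would give a possible range of 20 to 100, which is compatible with the reported 25 to 88 but is not confirmed by the source. The function names and toy intakes are hypothetical.

```python
def quintile_scores(intakes):
    """Score each participant from 1 (lowest fifth) to 5 (highest fifth).

    Assumption: scores follow the rank of the intake within the cohort,
    and tied intakes share the lowest rank among the tied values.
    """
    n = len(intakes)
    ordered = sorted(intakes)
    scores = []
    for value in intakes:
        rank = ordered.index(value)      # 0-based rank of this intake
        scores.append(rank * 5 // n + 1)  # map rank to a fifth, 1..5
    return scores


def diet_score(intake_table):
    """Sum the per-item quintile scores for every participant.

    intake_table maps a food item to a list of intakes, one per participant.
    """
    per_item = [quintile_scores(values) for values in intake_table.values()]
    return [sum(column) for column in zip(*per_item)]


# Hypothetical toy cohort: five participants, two food items
toy_intakes = {
    "tea": [0, 1, 2, 3, 4],
    "coffee": [4, 3, 2, 1, 0],
}
tea_scores = quintile_scores(toy_intakes["tea"])
totals = diet_score(toy_intakes)
```

In this toy cohort the tea and coffee rankings are mirror images, so every participant ends with the same total. A real implementation would also need a rule for non-consumers, which the abstract does not describe.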
Affiliation(s)
- Yifan Xu
- Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK.
| | - Yong Li
- Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK.
| | - Jiaying Hu
- Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK.
| | - Rachel Gibson
- Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK.
| | - Ana Rodriguez-Mateos
- Department of Nutritional Sciences, School of Life Course and Population Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK.
|
11
|
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios. RESULTS As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability of accurately clustering cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interests. We further provide practical guidelines for implementing the ratio based approach in increasingly large-scale multiomics studies. CONCLUSIONS Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
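The ratio-based idea in this abstract, scaling study-sample values relative to reference material profiled in the same batch, can be sketched as follows. This is an illustrative reading of that one sentence, not the authors' implementation: the function name, the use of the mean over reference replicates, and the toy numbers are all assumptions, and published pipelines may work on a log scale instead.

```python
from statistics import mean


def ratio_scale(study, reference):
    """Divide each study-sample feature by the mean of that feature across
    the reference-material replicates profiled in the same batch.

    study and reference are lists of samples; each sample is a list of
    feature values in the same feature order.
    """
    n_features = len(reference[0])
    ref_mean = [mean(rep[j] for rep in reference) for j in range(n_features)]
    return [[x / r for x, r in zip(sample, ref_mean)] for sample in study]


# Hypothetical toy data: batch 2 reads twice as high as batch 1
# for purely technical reasons.
batch1_ref = [[10.0, 100.0], [10.0, 100.0]]
batch1_study = [[20.0, 50.0]]
batch2_ref = [[20.0, 200.0], [20.0, 200.0]]
batch2_study = [[40.0, 100.0]]

corrected1 = ratio_scale(batch1_study, batch1_ref)
corrected2 = ratio_scale(batch2_study, batch2_ref)
```

Because the reference material carries the same technical shift as the study samples in its batch, the two toy samples become identical after scaling even though their raw values differ by a factor of two.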
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | | | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Xiang Fang
- National Institute of Metrology, Beijing, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
|
12
|
Goh WWB, Hui HWH, Wong L. How missing value imputation is confounded with batch effects and what you can do about it. Drug Discov Today 2023; 28:103661. [PMID: 37301250 DOI: 10.1016/j.drudis.2023.103661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 05/31/2023] [Accepted: 06/05/2023] [Indexed: 06/12/2023]
Abstract
In data-processing pipelines, upstream steps can influence downstream processes because of their sequential nature. Among these data-processing steps, batch effect (BE) correction (BEC) and missing value imputation (MVI) are crucial for ensuring data suitability for advanced modeling and reducing the likelihood of false discoveries. Although BEC-MVI interactions are not well studied, they are ultimately interdependent. Batch sensitization can improve the quality of MVI. Conversely, accounting for missingness also improves proper BE estimation in BEC. Here, we discuss how BEC and MVI are interconnected and interdependent. We show how batch sensitization can improve any MVI and bring attention to the idea of BE-associated missing values (BEAMs). Finally, we discuss how batch-class imbalance problems can be mitigated by borrowing ideas from machine learning.
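One way to picture the interaction between imputation and batch effects is to compare a global fill value with a batch-matched one. The sketch below assumes that "batch sensitization" can be read as imputing each missing value only from samples in the same batch; the abstract does not define the term in detail, so this reading, the function name, and the toy values are assumptions.

```python
from statistics import mean


def impute_within_batch(values, batches):
    """Replace each None with the mean of the observed values in its batch.

    values and batches are parallel lists. A batch with no observed value
    raises KeyError, which this sketch deliberately leaves unhandled.
    """
    observed = {}
    for value, batch in zip(values, batches):
        if value is not None:
            observed.setdefault(batch, []).append(value)
    batch_mean = {batch: mean(vals) for batch, vals in observed.items()}
    return [
        batch_mean[batch] if value is None else value
        for value, batch in zip(values, batches)
    ]


# Hypothetical feature measured in two batches with a large offset
values = [1.0, None, 3.0, 10.0, None, 12.0]
batches = ["A", "A", "A", "B", "B", "B"]

imputed = impute_within_batch(values, batches)
global_fill = mean(v for v in values if v is not None)
```

Here the global mean would place the same value, far from either batch, into both gaps, whereas the batch-matched fill stays within the range of each batch and does not add to the apparent batch difference.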
Affiliation(s)
- Wilson Wen Bin Goh
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore; Center for Biomedical Informatics, Nanyang Technological University, Singapore.
| | - Harvard Wai Hann Hui
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore
| | - Limsoon Wong
- Department of Computer Science, National University of Singapore, Singapore; Department of Pathology, National University of Singapore, Singapore.
|
13
|
Krasnovsky L, Crowley AP, Naeem F, Wang LS, Wu GD, Chao AM. A Scoping Review of Nutritional Biomarkers Associated with Food Security. Nutrients 2023; 15:3576. [PMID: 37630766 PMCID: PMC10459650 DOI: 10.3390/nu15163576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 08/05/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
Food insecurity affects more than 40 million individuals in the United States and is linked to negative health outcomes due, in part, to poor dietary quality. Despite the emergence of metabolomics as a modality to objectively characterize nutritional biomarkers, it is unclear whether food security is associated with any biomarkers of dietary quality. This scoping review aims to summarize studies that examined associations between nutritional biomarkers and food security, as well as studies that investigated metabolomic differences between people with and without food insecurity. PubMed, Embase, Scopus, and AGRICOLA were searched through August 2022 for studies describing food insecurity and metabolic markers in blood, urine, plasma, hair, or nails. The 78 studies included consisted of targeted assays quantifying lipids, dietary nutrients, heavy metals, and environmental xenobiotics as biochemical features associated with food insecurity. Among those biomarkers which were quantified in at least five studies, none showed a consistent association with food insecurity. Although three biomarkers of dietary quality have been assessed between food-insecure versus food-secure populations, no studies have utilized untargeted metabolomics to characterize patterns of small molecules that distinguish between these two populations. Further studies are needed to characterize the dietary quality profiles of individuals with and without food insecurity.
Affiliation(s)
- Lev Krasnovsky
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; (A.P.C.); (F.N.); (L.S.W.)
| | - Aidan P. Crowley
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; (A.P.C.); (F.N.); (L.S.W.)
| | - Fawaz Naeem
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; (A.P.C.); (F.N.); (L.S.W.)
| | - Lucy S. Wang
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; (A.P.C.); (F.N.); (L.S.W.)
| | - Gary D. Wu
- Division of Gastroenterology and Hepatology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA;
| | - Ariana M. Chao
- Johns Hopkins School of Nursing, Johns Hopkins University, Baltimore, MD 21205, USA;
|
14
|
Badillo-Sanchez D, Serrano Ruber M, Davies-Barrett A, Jones DJ, Hansen M, Inskip S. Metabolomics in archaeological science: A review of their advances and present requirements. SCIENCE ADVANCES 2023; 9:eadh0485. [PMID: 37566664 PMCID: PMC10421062 DOI: 10.1126/sciadv.adh0485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 07/11/2023] [Indexed: 08/13/2023]
Abstract
Metabolomics, the study of metabolites (small molecules of <1500 daltons), has been posited as a potential tool to explore the past in a comparable manner to other omics, e.g., genomics or proteomics. Archaeologists have used metabolomic approaches for a decade or so, mainly applied to organic residues adhering to archaeological materials. Because of advances in sensitivity, resolution, and the increased availability of different analytical platforms, combined with the low mass/volume required for analysis, metabolomics is now becoming a more feasible choice in the archaeological sector. Additional approaches, as presented by our group, show the versatility of metabolomics as a source of knowledge about the human past when using human osteoarchaeological remains. There is tremendous potential for metabolomics within archaeology, but further efforts are required to position it as a routine technique.
Affiliation(s)
| | - Maria Serrano Ruber
- School of Archaeology and Ancient History, University of Leicester, Leicester, UK
| | - Anna Davies-Barrett
- School of Archaeology and Ancient History, University of Leicester, Leicester, UK
| | - Donald J. L. Jones
- Leicester Cancer Research Centre, RKCSB, University of Leicester, Leicester, UK
- The Leicester van Geest MultiOmics Facility, University of Leicester, Leicester, UK
| | - Martin Hansen
- Environmental Metabolomics Lab, Department of Environmental Science, Aarhus University, Roskilde, Denmark
| | - Sarah Inskip
- School of Archaeology and Ancient History, University of Leicester, Leicester, UK
|
15
|
Droit A, Pelletier S, Leclerq M, Roux-Dalvai F, de Geus M, Leslie S, Wang W, Lam T, Nairn A, Arnold S, Carlyle B, Precioso F. Enhancing Classification of liquid chromatography mass spectrometry data with Batch Effect Removal Neural Networks (BERNN). RESEARCH SQUARE 2023:rs.3.rs-3112514. [PMID: 37461653 PMCID: PMC10350225 DOI: 10.21203/rs.3.rs-3112514/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of proteomics research, but current methods are not optimal for removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. Comparison of batch effect correction methods across three diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.
Affiliation(s)
- Arnaud Droit
- Centre de Recherche du CHU de Québec - Université Laval, Axe Endocrinologie et Néphrologie, Québec, Canada
| | | | | | | | | | | | - Weiwei Wang
- 7. Keck MS & Proteomics Resource, Yale School of Medicine
| | - TuKiet Lam
- 7. Keck MS & Proteomics Resource, Yale School of Medicine
| | | | - Steven Arnold
- 3. Massachusetts General Hospital Department of Neurology
| | - Becky Carlyle
- 3. Massachusetts General Hospital Department of Neurology
|
16
|
Morton JT, Jin DM, Mills RH, Shao Y, Rahman G, McDonald D, Zhu Q, Balaban M, Jiang Y, Cantrell K, Gonzalez A, Carmel J, Frankiensztajn LM, Martin-Brevet S, Berding K, Needham BD, Zurita MF, David M, Averina OV, Kovtun AS, Noto A, Mussap M, Wang M, Frank DN, Li E, Zhou W, Fanos V, Danilenko VN, Wall DP, Cárdenas P, Baldeón ME, Jacquemont S, Koren O, Elliott E, Xavier RJ, Mazmanian SK, Knight R, Gilbert JA, Donovan SM, Lawley TD, Carpenter B, Bonneau R, Taroncher-Oldenburg G. Multi-level analysis of the gut-brain axis shows autism spectrum disorder-associated molecular and microbial profiles. Nat Neurosci 2023:10.1038/s41593-023-01361-0. [PMID: 37365313 DOI: 10.1038/s41593-023-01361-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Received: 11/11/2022] [Accepted: 05/13/2023] [Indexed: 06/28/2023]
Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by heterogeneous cognitive, behavioral and communication impairments. Disruption of the gut-brain axis (GBA) has been implicated in ASD although with limited reproducibility across studies. In this study, we developed a Bayesian differential ranking algorithm to identify ASD-associated molecular and taxa profiles across 10 cross-sectional microbiome datasets and 15 other datasets, including dietary patterns, metabolomics, cytokine profiles and human brain gene expression profiles. We found a functional architecture along the GBA that correlates with heterogeneity of ASD phenotypes, and it is characterized by ASD-associated amino acid, carbohydrate and lipid profiles predominantly encoded by microbial species in the genera Prevotella, Bifidobacterium, Desulfovibrio and Bacteroides and correlates with brain gene expression changes, restrictive dietary patterns and pro-inflammatory cytokine profiles. The functional architecture revealed in age-matched and sex-matched cohorts is not present in sibling-matched cohorts. We also show a strong association between temporal changes in microbiome composition and ASD phenotypes. In summary, we propose a framework to leverage multi-omic datasets from well-defined cohorts and investigate how the GBA influences ASD.
Affiliation(s)
- James T Morton
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Biostatistics & Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| | - Dong-Min Jin
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
| | | | - Yan Shao
- Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK
| | - Gibraan Rahman
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Daniel McDonald
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Qiyun Zhu
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Biodesign Center for Fundamental and Applied Microbiomics, Arizona State University, Tempe, AZ, USA
| | - Metin Balaban
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
| | - Yueyu Jiang
- Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Kalen Cantrell
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Department of Computer Science and Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, USA
| | - Antonio Gonzalez
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Julie Carmel
- Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
| | | | - Sandra Martin-Brevet
- Laboratory for Research in Neuroimaging, Centre for Research in Neurosciences, Department of Clinical Neurosciences, Centre Hospitalier Universitaire Vaudois, University of Lausanne, Lausanne, Switzerland
| | - Kirsten Berding
- Division of Nutritional Sciences, University of Illinois, Urbana, IL, USA
| | - Brittany D Needham
- Stark Neurosciences Research Institute, Indiana University School of Medicine, Indianapolis, IN, USA
- Department of Anatomy, Cell Biology and Physiology, Indiana University School of Medicine, Indianapolis, IN, USA
| | - María Fernanda Zurita
- Microbiology Institute and Health Science College, Universidad San Francisco de Quito, Quito, Ecuador
| | - Maude David
- Departments of Microbiology & Pharmaceutical Sciences, Oregon State University, Corvallis, OR, USA
| | - Olga V Averina
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
| | - Alexey S Kovtun
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Russia
| | - Antonio Noto
- Department of Biomedical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
| | - Michele Mussap
- Laboratory Medicine, Department of Surgical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
| | - Mingbang Wang
- Shanghai Key Laboratory of Birth Defects, Division of Neonatology, Children's Hospital of Fudan University, National Center for Children's Health, Shanghai, China
- Microbiome Therapy Center, South China Hospital, Health Science Center, Shenzhen University, Shenzhen, China
| | - Daniel N Frank
- Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Ellen Li
- Department of Medicine, Division of Gastroenterology and Hepatology, Stony Brook University, Stony Brook, NY, USA
| | - Wenhao Zhou
- Shanghai Key Laboratory of Birth Defects, Division of Neonatology, Children's Hospital of Fudan University, National Center for Children's Health, Shanghai, China
| | - Vassilios Fanos
- Neonatal Intensive Care Unit and Neonatal Pathology, Department of Surgical Sciences, School of Medicine, University of Cagliari, Cagliari, Italy
| | - Valery N Danilenko
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
| | - Dennis P Wall
- Pediatrics (Systems Medicine), Biomedical Data Science, and Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Paúl Cárdenas
- Institute of Microbiology, COCIBA, Universidad San Francisco de Quito, Quito, Ecuador
| | - Manuel E Baldeón
- Facultad de Ciencias Médicas, de la Salud y la Vida, Universidad Internacional del Ecuador, Quito, Ecuador
| | - Sébastien Jacquemont
- Sainte Justine Hospital Research Center, Montréal, QC, Canada
- Department of Pediatrics, Université de Montréal, Montréal, QC, Canada
| | - Omry Koren
- Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
| | - Evan Elliott
- Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan, Israel
| | - Ramnik J Xavier
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA
- Center for the Study of Inflammatory Bowel Disease, Massachusetts General Hospital, Boston, MA, USA
| | - Sarkis K Mazmanian
- Division of Biology & Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Rob Knight
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Department of Computer Science and Engineering, Jacobs School of Engineering, University of California, San Diego, La Jolla, CA, USA
- Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
- Center for Microbiome Innovation, University of California, San Diego, La Jolla, California, USA
| | - Jack A Gilbert
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, University of California, San Diego, La Jolla, California, USA
- Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA
| | - Sharon M Donovan
- Division of Nutritional Sciences, University of Illinois, Urbana, IL, USA
| | - Trevor D Lawley
- Host-Microbiota Interactions Laboratory, Wellcome Sanger Institute, Hinxton, UK
| | - Bob Carpenter
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Richard Bonneau
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY, USA
- Prescient Design, a Genentech Accelerator, New York, NY, USA
| | - Gaspar Taroncher-Oldenburg
- Gaspar Taroncher Consulting, Philadelphia, PA, USA.
- Simons Foundation Autism Research Initiative, Simons Foundation, New York, NY, USA.
|
17
|
Li Y, Jiang G, Wu W, Yang H, Jin Y, Wu M, Liu W, Yang A, Chervova O, Zhang S, Zheng L, Zhang X, Du F, Kanu N, Wu L, Yang F, Wang J, Chen K. Multi-omics integrated circulating cell-free DNA genomic signatures enhanced the diagnostic performance of early-stage lung cancer and postoperative minimal residual disease. EBioMedicine 2023; 91:104553. [PMID: 37027928 PMCID: PMC10102814 DOI: 10.1016/j.ebiom.2023.104553] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/06/2022] [Revised: 03/17/2023] [Accepted: 03/19/2023] [Indexed: 04/08/2023] Open
Abstract
BACKGROUND Liquid biopsy is a promising non-invasive alternative for cancer screening and minimal residual disease (MRD) detection, although there are some concerns regarding its clinical applications. We aimed to develop an accurate detection platform based on liquid biopsy for both cancer screening and MRD detection in patients with lung cancer (LC), which is also applicable to clinical use. METHODS We applied a modified whole-genome sequencing (WGS)-based High-performance Infrastructure For MultIomics (HIFI) method for LC screening and postoperative MRD detection by combining the hyper-co-methylated read approach and the circulating single-molecule amplification and resequencing technology (cSMART2.0). FINDINGS For early screening of LC, the LC score model was constructed using the support vector machine, which showed sensitivity (51.8%) at high specificity (96.3%) and achieved an AUC of 0.912 in the validation set prospectively enrolled from multiple centers. The screening model achieved detection efficiency with an AUC of 0.906 in patients with lung adenocarcinoma and outperformed other clinical models in the solid nodule cohort. When the HIFI model was applied to a real-world population, a negative predictive value (NPV) of 99.92% was achieved in the Chinese population. Additionally, the MRD detection rate improved significantly by combining results from WGS and cSMART2.0, with sensitivity of 73.7% at specificity of 97.3%. INTERPRETATION In conclusion, the HIFI method is promising for diagnosis and postoperative monitoring of LC. FUNDING This study was supported by CAMS Innovation Fund for Medical Sciences, Chinese Academy of Medical Sciences, National Natural Science Foundation of China, Beijing Natural Science Foundation and Peking University People's Hospital.
|
18
|
Guo F, Lin G, Dong L, Cheng KK, Deng L, Xu X, Raftery D, Dong J. Concordance-Based Batch Effect Correction for Large-Scale Metabolomics. Anal Chem 2023; 95:7220-7228. [PMID: 37115661 DOI: 10.1021/acs.analchem.2c05748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
For a large-scale metabolomics study, sample collection, preparation, and analysis may last several days, months, or even (intermittently) years. This may lead to apparent batch effects in the acquired metabolomics data due to variability in instrument status, environmental conditions, or experimental operators. Batch effects may confound the true biological relationships among metabolites and thus obscure real metabolic changes. At present, most of the commonly used batch effect correction (BEC) methods are based on quality control (QC) samples, which require sufficient and stable QC samples. However, the quality of the QC samples may deteriorate if the experiment lasts for a long time. Alternatively, isotope-labeled internal standards have been used, but they generally do not provide good coverage of the metabolome. On the other hand, BEC can also be conducted through a data-driven method, in which no QC sample is needed. Here, we propose a novel data-driven BEC method, namely, CordBat, to achieve concordance between each batch of samples. In the proposed CordBat method, a reference batch is first selected from all batches of data, and the remaining batches are referred to as "other batches." The reference batch serves as the baseline for the batch adjustment by providing a coordinate of correlation between metabolites. Next, a Gaussian graphical model is built on the combined dataset of reference and other batches, and finally, BEC is achieved by optimizing the correction coefficients in the other batches so that the correlations between metabolites of each batch and their combinations are in concordance with those of the reference batch. Three real-world metabolomics datasets are used to evaluate the performance of CordBat by comparing it with five commonly used BEC methods. The experimental results showed the effectiveness of CordBat in batch effect removal and the concordance of correlation between metabolites after BEC. CordBat was found to be comparable to the QC-based methods and achieved better performance in the preservation of biological effects. The proposed CordBat method may serve as an alternative BEC method for large-scale metabolomics studies that lack proper QC samples.
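The idea of judging a correction by how well metabolite correlations in the combined data agree with those of a reference batch can be illustrated with a toy example. This sketch is not CordBat: it substitutes a plain Pearson correlation for the Gaussian graphical model and a simple mean shift for the optimized correction coefficients, and all data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def shift_to_reference(values, reference):
    """Stand-in correction: move a batch so its mean matches the reference mean."""
    offset = mean(reference) - mean(values)
    return [v + offset for v in values]


# Hypothetical intensities of two metabolites in a reference batch and another batch.
ref_m1 = [1.0, 2.0, 3.0, 4.0, 5.0]
ref_m2 = [5.2, 3.9, 3.1, 2.0, 0.8]          # negatively correlated with m1
other_m1 = [11.0, 12.0, 13.0, 14.0, 15.0]   # same pattern plus a batch offset
other_m2 = [15.1, 14.0, 12.9, 12.1, 11.0]

r_reference = pearson(ref_m1, ref_m2)
# Pooling uncorrected batches distorts the correlation seen in the reference batch.
r_pooled_raw = pearson(ref_m1 + other_m1, ref_m2 + other_m2)
# After the stand-in correction the pooled correlation agrees with the reference again.
r_pooled_corrected = pearson(
    ref_m1 + shift_to_reference(other_m1, ref_m1),
    ref_m2 + shift_to_reference(other_m2, ref_m2),
)
print(r_reference, r_pooled_raw, r_pooled_corrected)
```

In this toy case the batch offset flips the sign of the pooled correlation, and removing it restores agreement with the reference batch, which is the kind of concordance the abstract describes as the optimization target.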
Affiliation(s)
- Fanjing Guo
  - Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Genjin Lin
  - Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Liheng Dong
  - School of Computer Science and Technology, Xiamen University Malaysia, Sepang 43600, Malaysia
- Kian-Kai Cheng
  - Faculty of Chemical and Energy Engineering, Universiti Teknologi Malaysia, Johor 81310, Malaysia
- Lingli Deng
  - Department of Information Engineering, East China University of Technology, Nanchang 330013, China
- Xiangnan Xu
  - School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales 2006, Australia
- Daniel Raftery
  - Northwest Metabolomics Research Center, University of Washington, Seattle, Washington 98109, United States
- Jiyang Dong
  - Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
19
Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023; 6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has created the need for integration approaches that can capture the complex, often non-linear, interactions that define these biological systems and that are adapted to the challenges of combining heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Due to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analysis of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporate mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
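The abstract does not name the traditional workflows it reviews. As an illustration only, the sketch below shows two conventional options that are commonly used when some measurements are absent: complete-case analysis and feature-wise mean imputation. The data layout and function names are assumptions made for this example.

```python
from statistics import mean

# Hypothetical samples (rows) by features (columns) drawn from two 'omic views;
# None marks a value that was not measured.
samples = [
    [1.0, 2.0, 10.0],
    [2.0, None, 12.0],
    [3.0, 4.0, None],
    [4.0, 6.0, 14.0],
]


def complete_cases(rows):
    """Keep only samples observed for every feature (discards partial samples)."""
    return [row for row in rows if all(value is not None for value in row)]


def mean_impute(rows):
    """Fill each missing value with the mean of the observed values of that feature."""
    n_features = len(rows[0])
    column_means = [
        mean(row[j] for row in rows if row[j] is not None) for j in range(n_features)
    ]
    return [
        [column_means[j] if row[j] is None else row[j] for j in range(n_features)]
        for row in rows
    ]


kept = complete_cases(samples)
imputed = mean_impute(samples)
print(len(kept), imputed)
```

Complete-case analysis halves the sample size in this toy example, and mean imputation ignores relationships between features, which hints at the limitations that motivate the methods covered by the review.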
Affiliation(s)
- Javier E. Flores
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Daniel M. Claborne
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Zachary D. Weller
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Bobbie-Jo M. Webb-Robertson
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Katrina M. Waters
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Lisa M. Bramer (corresponding author)
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
20
Quantitative challenges and their bioinformatic solutions in mass spectrometry-based metabolomics. Trends Analyt Chem 2023. [DOI: 10.1016/j.trac.2023.117009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
21
Jacobs JP, Lagishetty V, Hauer MC, Labus JS, Dong TS, Toma R, Vuyisich M, Naliboff BD, Lackner JM, Gupta A, Tillisch K, Mayer EA. Multi-omics profiles of the intestinal microbiome in irritable bowel syndrome and its bowel habit subtypes. MICROBIOME 2023; 11:5. [PMID: 36624530 PMCID: PMC9830758 DOI: 10.1186/s40168-022-01450-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 03/23/2022] [Accepted: 12/14/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND Irritable bowel syndrome (IBS) is a common gastrointestinal disorder that is thought to involve alterations in the gut microbiome, but robust microbial signatures have been challenging to identify. As prior studies have primarily focused on composition, we hypothesized that multi-omics assessment of microbial function incorporating both metatranscriptomics and metabolomics would further delineate microbial profiles of IBS and its subtypes. METHODS Fecal samples were collected from a racially/ethnically diverse cohort of 495 subjects, including 318 IBS patients and 177 healthy controls, for analysis by 16S rRNA gene sequencing (n = 486), metatranscriptomics (n = 327), and untargeted metabolomics (n = 368). Differentially abundant microbes, predicted genes, transcripts, and metabolites in IBS were identified by multivariate models incorporating age, sex, race/ethnicity, BMI, diet, and HAD-Anxiety. Inter-omic functional relationships were assessed by transcript/gene ratios and microbial metabolic modeling. Differential features were used to construct random forests classifiers. RESULTS IBS was associated with global alterations in microbiome composition by 16S rRNA sequencing and metatranscriptomics, and in microbiome function by predicted metagenomics, metatranscriptomics, and metabolomics. After adjusting for age, sex, race/ethnicity, BMI, diet, and anxiety, IBS was associated with differential abundance of bacterial taxa such as Bacteroides dorei; metabolites including increased tyramine and decreased gentisate and hydrocinnamate; and transcripts related to fructooligosaccharide and polyol utilization. IBS further showed transcriptional upregulation of enzymes involved in fructose and glucan metabolism as well as the succinate pathway of carbohydrate fermentation. A multi-omics classifier for IBS had significantly higher accuracy (AUC 0.82) than classifiers using individual datasets. 
Diarrhea-predominant IBS (IBS-D) demonstrated shifts in the metatranscriptome and metabolome including increased bile acids, polyamines, succinate pathway intermediates (malate, fumarate), and transcripts involved in fructose, mannose, and polyol metabolism compared to constipation-predominant IBS (IBS-C). A classifier incorporating metabolites and gene-normalized transcripts differentiated IBS-D from IBS-C with high accuracy (AUC 0.86). CONCLUSIONS IBS is characterized by a multi-omics microbial signature indicating increased capacity to utilize fermentable carbohydrates (consistent with the clinical benefit of diets restricting this energy source) that also includes multiple previously unrecognized metabolites and metabolic pathways. These findings support the need for integrative assessment of microbial function to investigate the microbiome in IBS and identify novel microbiome-related therapeutic targets.
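The AUC values quoted for the classifiers can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. A minimal sketch of that rank-based definition follows; the labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count half."""
    case_scores = [s for label, s in zip(labels, scores) if label == 1]
    control_scores = [s for label, s in zip(labels, scores) if label == 0]
    correctly_ranked = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correctly_ranked += 1.0
            elif case == control:
                correctly_ranked += 0.5
    return correctly_ranked / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores for three cases (1) and three controls (0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; larger datasets would normally use a sorting-based implementation.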
Affiliation(s)
- Jonathan P Jacobs
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Division of Gastroenterology, Hepatology and Parenteral Nutrition, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
- Venu Lagishetty
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Megan C Hauer
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Jennifer S Labus
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Tien S Dong
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Division of Gastroenterology, Hepatology and Parenteral Nutrition, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
- Ryan Toma
  - Viome Life Sciences, Bellevue, WA, USA
- Bruce D Naliboff
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Jeffrey M Lackner
  - Division of Behavioral Medicine, Department of Medicine, Jacobs School of Medicine, University at Buffalo, SUNY, Buffalo, NY, USA
- Arpana Gupta
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Kirsten Tillisch
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Integrative Medicine, VA Greater Los Angeles Healthcare System, Los Angeles, CA, USA
- Emeran A Mayer
  - Vatche and Tamar Manoukian Division of Digestive Diseases, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - G. Oppenheimer Center for Neurobiology of Stress and Resilience, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
22
Hattaway ME, Black GP, Young TM. Batch correction methods for nontarget chemical analysis data: application to a municipal wastewater collection system. Anal Bioanal Chem 2023; 415:1321-1331. [PMID: 36627378 PMCID: PMC9928919 DOI: 10.1007/s00216-023-04511-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 12/08/2022] [Accepted: 01/02/2023] [Indexed: 01/12/2023]
Abstract
Nontarget chemical analysis using high-resolution mass spectrometry has increasingly been used to discern spatial patterns and temporal trends in anthropogenic chemical abundance in natural and engineered systems. A critical experimental design consideration in such applications, especially those monitoring complex matrices over long time periods, is a choice between analyzing samples in multiple batches as they are collected, or in one batch after all samples have been processed. While datasets acquired in multiple analytical batches can include the effects of instrumental variability over time, datasets acquired in a single batch risk compound degradation during sample storage. To assess the influence of batch effects on the analysis and interpretation of nontarget data, this study examined a set of 56 samples collected from a municipal wastewater system over 7 months. Each month's samples included 6 from sites within the collection system, one combined influent, and one treated effluent sample. Samples were analyzed using liquid chromatography high-resolution mass spectrometry in positive electrospray ionization mode in multiple batches as the samples were collected and in a single batch at the conclusion of the study. Data were aligned and normalized using internal standard scaling and ComBat, an empirical Bayes method developed for estimating and removing batch effects in microarrays. As judged by multiple lines of evidence, including comparing principal variance component analysis between single and multi-batch datasets and through patterns in principal components and hierarchical clustering analyses, ComBat appeared to significantly reduce the influence of batch effects. For this reason, we recommend the use of more, small batches with an appropriate batch correction step rather than acquisition in one large batch.
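To make the notion of removing batch effects concrete, the sketch below applies a per-batch location and scale adjustment to a single feature. This is a deliberately simplified stand-in and not ComBat itself: it omits the empirical Bayes shrinkage and covariate handling that ComBat provides, and the data, the choice of target SD, and the function name are hypothetical.

```python
from statistics import mean, pstdev


def location_scale_adjust(values, batches):
    """Standardize one feature within each batch, then rescale to shared targets.

    Targets are the grand mean and the average within-batch SD (an assumption
    of this sketch). No empirical Bayes shrinkage is applied.
    """
    stats = {}
    for label in set(batches):
        batch_values = [v for v, b in zip(values, batches) if b == label]
        stats[label] = (mean(batch_values), pstdev(batch_values))
    grand_mean = mean(values)
    target_sd = mean(sd for _, sd in stats.values())
    adjusted = []
    for value, label in zip(values, batches):
        batch_mean, batch_sd = stats[label]
        # Guard against constant batches, which have zero spread.
        z = (value - batch_mean) / batch_sd if batch_sd > 0 else 0.0
        adjusted.append(grand_mean + z * target_sd)
    return adjusted


# Hypothetical intensities of one feature acquired in two analytical batches.
intensities = [10.0, 12.0, 14.0, 20.0, 24.0, 28.0]
batch_labels = ["A", "A", "A", "B", "B", "B"]
adjusted = location_scale_adjust(intensities, batch_labels)
print(adjusted)
```

A plain adjustment like this also removes any real biological difference that happens to coincide with batch, which is one reason methods such as ComBat estimate batch parameters more carefully.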
Affiliation(s)
- Madison E. Hattaway
  - Department of Civil and Environmental Engineering, University of California, Davis, Davis, CA 95616 USA (GRID grid.27860.3b; ISNI 0000 0004 1936 9684)
- Gabrielle P. Black
  - Department of Civil and Environmental Engineering, University of California, Davis, Davis, CA 95616 USA (GRID grid.27860.3b; ISNI 0000 0004 1936 9684)
- Thomas M. Young
  - Department of Civil and Environmental Engineering, University of California, Davis, Davis, CA 95616 USA (GRID grid.27860.3b; ISNI 0000 0004 1936 9684)
23
Mäkinen VP, Karsikas M, Kettunen J, Lehtimäki T, Kähönen M, Viikari J, Perola M, Salomaa V, Järvelin MR, Raitakari OT, Ala-Korpela M. Longitudinal profiling of metabolic ageing trends in two population cohorts of young adults. Int J Epidemiol 2022; 51:1970-1983. [PMID: 35441226 DOI: 10.1093/ije/dyac062] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/21/2021] [Accepted: 03/20/2022] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Quantification of metabolic changes over the human life course is essential to understanding ageing processes. Yet longitudinal metabolomics data are rare and long gaps between visits can introduce biases that mask true trends. We introduce new ways to process quantitative time-series population data and elucidate metabolic ageing trends in two large cohorts. METHODS Eligible participants included 1672 individuals from the Cardiovascular Risk in Young Finns Study and 3117 from the Northern Finland Birth Cohort 1966. Up to three time points (ages 24-49 years) were analysed by nuclear magnetic resonance metabolomics and clinical biochemistry (236 measures). Temporal trends were quantified as median change per decade. Sample quality was verified by consistency of shared biomarkers between metabolomics and clinical assays. Batch effects between visits were mitigated by a new algorithm introduced in this report. The results below satisfy multiple testing threshold of P < 0.0006. RESULTS Women gained more weight than men (+6.5% vs +5.0%) but showed milder metabolic changes overall. Temporal sex differences were observed for C-reactive protein (women +5.1%, men +21.1%), glycine (women +5.2%, men +1.9%) and phenylalanine (women +0.6%, men +3.5%). In 566 individuals with ≥+3% weight gain vs 561 with weight change ≤-3%, divergent patterns were observed for insulin (+24% vs -10%), very-low-density-lipoprotein triglycerides (+32% vs -6%), high-density-lipoprotein2 cholesterol (-6.5% vs +4.7%), isoleucine (+5.7% vs -6.0%) and C-reactive protein (+25% vs -22%). CONCLUSION We report absolute and proportional trends for 236 metabolic measures as new reference material for overall age-associated and specific weight-driven changes in real-world populations.
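The abstract states that temporal trends were quantified as median change per decade but does not give the exact computation. One plausible reading, shown here purely as an assumption, is to convert each participant's change between the first and last visit into a rate per ten years and take the median across participants. The visit data and function name are hypothetical.

```python
from statistics import median


def median_change_per_decade(visits):
    """visits: (age_first, value_first, age_last, value_last) for each participant."""
    rates_per_decade = [
        (value_last - value_first) / (age_last - age_first) * 10.0
        for age_first, value_first, age_last, value_last in visits
    ]
    return median(rates_per_decade)


# Hypothetical measurements of one biomarker at two ages for three participants.
visits = [
    (24, 4.0, 34, 4.5),
    (30, 5.0, 40, 5.2),
    (27, 4.2, 47, 5.2),
]
trend = median_change_per_decade(visits)
print(f"Median change per decade: {trend:.2f}")
```

Using the median rather than the mean keeps the summary robust to a few participants with extreme changes, which matters in population cohorts.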
Affiliation(s)
- Ville-Petteri Mäkinen
  - Computational and Systems Biology Program, Precision Medicine Theme, South Australian Health and Medical Research Institute, Adelaide, Australia
  - Australian Centre for Precision Health, University of South Australia, Adelaide, Australia
  - Computational Medicine, Faculty of Medicine, University of Oulu, Oulu, Finland
- Mari Karsikas
  - Computational Medicine, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Biocenter Oulu, Oulu, Finland
  - Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
- Johannes Kettunen
  - Computational Medicine, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Biocenter Oulu, Oulu, Finland
  - Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
- Terho Lehtimäki
  - Department of Clinical Chemistry, Fimlab Laboratories, and Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Mika Kähönen
  - Department of Clinical Physiology, Tampere University Hospital, and Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Jorma Viikari
  - Department of Medicine, University of Turku, Turku, Finland
  - Division of Medicine, Turku University Hospital, Turku, Finland
- Markus Perola
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Institute for Molecular Medicine (FIMM), University of Helsinki, Helsinki, Finland
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Veikko Salomaa
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
- Marjo-Riitta Järvelin
  - Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Unit of Primary Health Care, Oulu University Hospital, OYS, Oulu, Finland
  - Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
  - Department of Life Sciences, College of Health and Life Sciences, Brunel University London, UK
- Olli T Raitakari
  - Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku, Finland
  - Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland
  - Centre for Population Health Research, University of Turku and Turku University Hospital
- Mika Ala-Korpela
  - Computational Medicine, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Biocenter Oulu, Oulu, Finland
  - Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
  - NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
24
Shaver AO, Garcia BM, Gouveia GJ, Morse AM, Liu Z, Asef CK, Borges RM, Leach FE, Andersen EC, Amster IJ, Fernández FM, Edison AS, McIntyre LM. An anchored experimental design and meta-analysis approach to address batch effects in large-scale metabolomics. Front Mol Biosci 2022; 9:930204. [PMID: 36438654 PMCID: PMC9682135 DOI: 10.3389/fmolb.2022.930204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 10/10/2022] [Indexed: 11/27/2022] Open
Abstract
Untargeted metabolomics studies are unbiased but identifying the same feature across studies is complicated by environmental variation, batch effects, and instrument variability. Ideally, several studies that assay the same set of metabolic features would be used to select recurring features to pursue for identification. Here, we developed an anchored experimental design. This generalizable approach enabled us to integrate three genetic studies consisting of 14 test strains of Caenorhabditis elegans prior to the compound identification process. An anchor strain, PD1074, was included in every sample collection, resulting in a large set of biological replicates of a genetically identical strain that anchored each study. This enables us to estimate treatment effects within each batch and apply straightforward meta-analytic approaches to combine treatment effects across batches without the need for estimation of batch effects and complex normalization strategies. We collected 104 test samples for three genetic studies across six batches to produce five analytical datasets from two complementary technologies commonly used in untargeted metabolomics. Here, we use the model system C. elegans to demonstrate that an augmented design combined with experimental blocks and other metabolomic QC approaches can be used to anchor studies and enable comparisons of stable spectral features across time without the need for compound identification. This approach is generalizable to systems where the same genotype can be assayed in multiple environments and provides biologically relevant features for downstream compound identification efforts. All methods are included in the newest release of the publicly available SECIMTools based on the open-source Galaxy platform.
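The strategy of estimating a treatment effect within each batch against the anchor strain, then combining the estimates across batches, can be sketched as follows. The abstract only mentions "straightforward meta-analytic approaches"; the inverse-variance fixed-effect model used here is an assumption chosen for illustration, as are the data and function names.

```python
from math import sqrt
from statistics import mean, variance


def within_batch_effect(test_values, anchor_values):
    """Mean difference (test strain minus anchor strain) and its standard error."""
    effect = mean(test_values) - mean(anchor_values)
    std_error = sqrt(
        variance(test_values) / len(test_values)
        + variance(anchor_values) / len(anchor_values)
    )
    return effect, std_error


def fixed_effect_meta(effects, std_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical intensities of one spectral feature: (test strain, anchor strain) per batch.
# Batch 2 is shifted by roughly +10 units, mimicking a batch effect.
batches = [
    ([12.0, 13.0, 14.0], [10.0, 11.0, 12.0]),
    ([22.5, 23.5, 24.5], [20.0, 21.0, 22.0]),
]
per_batch = [within_batch_effect(test, anchor) for test, anchor in batches]
pooled_effect, pooled_se = fixed_effect_meta(
    [effect for effect, _ in per_batch],
    [se for _, se in per_batch],
)
print(pooled_effect, pooled_se)
```

Because each contrast is formed inside a batch, the batch shift cancels without ever being estimated, which mirrors the rationale given in the abstract.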
Affiliation(s)
- Amanda O. Shaver
  - Department of Genetics, University of Georgia, Athens, GA, United States
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Brianna M. Garcia
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Chemistry, University of Georgia, Athens, GA, United States
- Goncalo J. Gouveia
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Biochemistry, University of Georgia, Athens, GA, United States
- Alison M. Morse
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Zihao Liu
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Carter K. Asef
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Ricardo M. Borges
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Franklin E. Leach
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Environmental Health Science, University of Georgia, Athens, GA, United States
- Erik C. Andersen
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL, United States
- I. Jonathan Amster
  - Department of Chemistry, University of Georgia, Athens, GA, United States
- Facundo M. Fernández
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Arthur S. Edison
  - Department of Genetics, University of Georgia, Athens, GA, United States
  - Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
  - Department of Biochemistry, University of Georgia, Athens, GA, United States
- Lauren M. McIntyre (corresponding author)
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, United States
25
Armbruster M, Grady SF, Arnatt CK, Edwards JL. Isobaric 4-Plex Tagging for Absolute Quantitation of Biological Acids in Diabetic Urine Using Capillary LC-MS/MS. ACS MEASUREMENT SCIENCE AU 2022; 2:287-295. [PMID: 35726255 PMCID: PMC9204807 DOI: 10.1021/acsmeasuresciau.1c00061] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/22/2021] [Revised: 02/21/2022] [Accepted: 02/22/2022] [Indexed: 06/15/2023]
Abstract
Isobaric labeling in mass spectrometry enables multiplexed absolute quantitation and high throughput, while minimizing full scan spectral complexity. Here, we use 4-plex isobaric labeling with a fixed positive charge tag to improve quantitation and throughput for polar carboxylic acid metabolites. The isobaric tag uses an isotope-encoded neutral loss to create mass-dependent reporters spaced 2 Da apart and was validated for both single- and double-tagged analytes. Tags were synthesized in-house using deuterated formaldehyde and methyl iodide in a total of four steps, producing cost-effective multiplexing. No chromatographic deuterium shifts were observed for single- or double-tagged analytes, producing consistent reporter ratios across each peak. Perfluoropentanoic acid was added to the sample to drastically increase retention of double-tagged analytes on a C18 column. Excess tag was scavenged and extracted using hexadecyl chloroformate after reaction completion. This allowed for removal of excess tag that typically causes ion suppression and column overloading. A total of 54 organic acids were investigated, producing an average linearity of 0.993, retention time relative standard deviation (RSD) of 0.58%, and intensity RSD of 12.1%. This method was used for absolute quantitation of acid metabolites comparing control and type 1 diabetic urine. Absolute quantitation of organic acids was achieved by using one isobaric lane for standards, thereby allowing for analysis of six urine samples in two injections. Quantified acids showed good agreement with previous work, and six significant changes were found. Overall, this method demonstrated 4-plex absolute quantitation of acids in a complex biological sample.
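The retention time and intensity figures of merit quoted above are relative standard deviations, that is, the sample standard deviation expressed as a percentage of the mean. A minimal sketch of the calculation follows; the replicate retention times are hypothetical and not taken from the study.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation: sample SD as a percentage of the mean."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical retention times (minutes) of one analyte across three injections.
retention_times = [10.00, 10.05, 9.95]
rsd = rsd_percent(retention_times)
print(f"Retention time RSD: {rsd:.2f}%")
```

Expressing the spread relative to the mean allows retention time and intensity reproducibility to be compared on the same percentage scale.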
26
Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data. Metabolites 2022; 12:metabo12060519. [PMID: 35736452 PMCID: PMC9227835 DOI: 10.3390/metabo12060519] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/04/2022] [Revised: 05/26/2022] [Accepted: 05/27/2022] [Indexed: 01/26/2023] Open
Abstract
Emerging technologies now allow for mass spectrometry-based profiling of thousands of small molecule metabolites ('metabolomics') in an increasing number of biosamples. While offering great promise for insight into the pathogenesis of human disease, standard approaches have not yet been established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes, including disease outcomes. To determine optimal approaches for analysis, we formally compare traditional and newer statistical learning methods across a range of metabolomics dataset types. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observe that with an increasing number of study subjects, univariate compared to multivariate methods result in an apparently higher false discovery rate as represented by substantial correlation between metabolites directly associated with the outcome and metabolites not associated with the outcome. Although the higher frequency of such associations would not be considered false in the strict statistical sense, it may be considered biologically less informative. In scenarios wherein the number of assayed metabolites increases, as in measures of nontargeted versus targeted metabolomics, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most-robust statistical power with more consistent results. These findings have important implications for metabolomics analysis in human disease.
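The abstract does not say which sparse multivariate models were compared. As one well-known example of the class, the sketch below fits a lasso regression by coordinate descent on a tiny toy dataset, showing how the L1 penalty sets a weak coefficient exactly to zero. The choice of lasso, the data, and the function names are assumptions made for illustration.

```python
def soft_threshold(value, penalty):
    """Shrink a value toward zero by `penalty`, returning 0.0 inside the dead zone."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + penalty * ||b||_1 (no intercept).

    Assumes the columns of X and the response y are already centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, penalty) / scale
    return beta


# Hypothetical centered design: the outcome depends strongly on the first
# metabolite (coefficient 2.0) and only weakly on the second (coefficient 0.1).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]
coefficients = lasso_coordinate_descent(X, y, penalty=0.5)
print(coefficients)
```

The weakly associated metabolite is dropped from the model while the strong one is retained with a shrunken coefficient, which is the selectivity that the abstract attributes to sparse models.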
27
Ding X, Yang F, Chen Y, Xu J, He J, Zhang R, Abliz Z. Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics. Anal Chem 2022; 94:7500-7509. [PMID: 35584098 DOI: 10.1021/acs.analchem.1c05502] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Abstract
Large-scale, long-term metabolomics studies are more susceptible to various sources of systematic error, resulting in nonreproducibility and poor data quality. A reliable and robust batch correction method removes unwanted systematic variation and improves the statistical power of metabolomics data, which makes it an important part of quality control in metabolomics. This study proposed a novel data normalization and integration method, Norm ISWSVR. It is a two-step approach that combines the best-performing internal standard correction with support vector regression normalization, comprehensively removing systematic and random errors and matrix effects. The method was investigated in three untargeted lipidomics or metabolomics datasets, and its performance was further evaluated systematically in comparison with that of 11 other normalization methods. Norm ISWSVR decreased the data's median cross-validated relative standard deviation (cvRSD), increased the correlation between QCs, improved the classification accuracy of biomarkers, and was compatible with quantitative data. More importantly, Norm ISWSVR also allows a low frequency of QCs, which could significantly decrease the burden of a large-scale experiment. Accordingly, Norm ISWSVR improves the data quality of large-scale metabolomics data.
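The two-step structure described above (internal standard correction followed by a QC-based regression on injection order) can be sketched in a simplified form. This is not Norm ISWSVR: an ordinary least-squares line replaces the support vector regression, a single internal standard replaces the best-performing one selected by the method, and all data and function names are hypothetical.

```python
from statistics import mean


def internal_standard_correct(feature, internal_standard):
    """Step 1: express each feature intensity relative to the internal standard."""
    return [f / s for f, s in zip(feature, internal_standard)]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for SVR)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def qc_drift_correct(values, injection_order, is_qc):
    """Step 2: model signal drift on QC injections and divide it out of every sample."""
    qc_order = [o for o, q in zip(injection_order, is_qc) if q]
    qc_values = [v for v, q in zip(values, is_qc) if q]
    slope, intercept = fit_line(qc_order, qc_values)
    qc_level = mean(qc_values)
    return [v * qc_level / (slope * o + intercept) for v, o in zip(values, injection_order)]


# Hypothetical run of six injections with a steadily falling signal.
# Injections 1, 4, and 6 are pooled QC samples; injection 2 truly has twice the QC level.
raw_feature = [200.0, 380.0, 180.0, 170.0, 160.0, 150.0]
internal_standard = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
injection_order = [1, 2, 3, 4, 5, 6]
is_qc = [True, False, False, True, False, True]

ratios = internal_standard_correct(raw_feature, internal_standard)
corrected = qc_drift_correct(ratios, injection_order, is_qc)
print(corrected)
```

After correction the QC injections share the same level and the doubled sample remains at twice that level, so the drift is removed while the real difference is preserved.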
Affiliation(s)
- Xian Ding
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 100050 Beijing, China
| | - Fen Yang
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Center of Drug Clinical Trial, Peking University Cancer Hospital and Institute, Beijing 100142, China
| | - Yanhua Chen
- Key Laboratory of Mass Spectrometry Imaging and Metabolomics, Minzu University of China, State Ethnic Affairs Commission, 100081 Beijing, China.,Center for Imaging and Systems Biology, College of Life and Environmental Sciences, Minzu University of China, 100081 Beijing, China.,Key Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Minzu University of China, Beijing 100081, China
| | - Jing Xu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 100050 Beijing, China
| | - Jiuming He
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 100050 Beijing, China
| | - Ruiping Zhang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 100050 Beijing, China
| | - Zeper Abliz
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, 100050 Beijing, China.,Key Laboratory of Mass Spectrometry Imaging and Metabolomics, Minzu University of China, State Ethnic Affairs Commission, 100081 Beijing, China.,Center for Imaging and Systems Biology, College of Life and Environmental Sciences, Minzu University of China, 100081 Beijing, China.,Key Laboratory of Ethnomedicine of Ministry of Education, School of Pharmacy, Minzu University of China, Beijing 100081, China
|
28
|
Rodriguez J, Gomez-Cano L, Grotewold E, de Leon N. Normalizing and Correcting Variable and Complex LC-MS Metabolomic Data with the R Package pseudoDrift. Metabolites 2022; 12:435. [PMID: 35629939 PMCID: PMC9144304 DOI: 10.3390/metabo12050435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 05/09/2022] [Accepted: 05/10/2022] [Indexed: 01/27/2023] Open
Abstract
In biological research domains, liquid chromatography-mass spectroscopy (LC-MS) has prevailed as the preferred technique for generating high quality metabolomic data. However, even with advanced instrumentation and established data acquisition protocols, technical errors are still routinely encountered and can pose a significant challenge to unveiling biologically relevant information. In large-scale studies, signal drift and batch effects are how technical errors are most commonly manifested. We developed pseudoDrift, an R package with capabilities for data simulation and outlier detection, and a new training and testing approach that is implemented to capture and to optionally correct for technical errors in LC-MS metabolomic data. Using data simulation, we demonstrate here that our approach performs equally as well as existing methods and offers increased flexibility to the researcher. As part of our study, we generated a targeted LC-MS dataset that profiled 33 phenolic compounds from seedling stem tissue in 602 genetically diverse non-transgenic maize inbred lines. This dataset provides a unique opportunity to investigate the dynamics of specialized metabolism in plants.
Affiliation(s)
- Jonas Rodriguez
- Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA;
| | - Lina Gomez-Cano
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA; (L.G.-C.); (E.G.)
| | - Erich Grotewold
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA; (L.G.-C.); (E.G.)
| | - Natalia de Leon
- Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA;
|
29
|
Bai Y, Zhang H, Wu Z, Huang S, Luo Z, Wu K, Hu L, Chen C. Use of Ultra High Performance Liquid Chromatography with High Resolution Mass Spectrometry to Analyze Urinary Metabolome Alterations Following Acute Kidney Injury in Post-Cardiac Surgery Patients. J Mass Spectrom Adv Clin Lab 2022; 24:31-40. [PMID: 35252948 PMCID: PMC8892161 DOI: 10.1016/j.jmsacl.2022.02.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/19/2021] [Revised: 02/08/2022] [Accepted: 02/17/2022] [Indexed: 12/20/2022] Open
Abstract
Cardiac surgery-associated AKI results in dramatic changes in the urinary metabolome. Urinary metabolite disorder was observed in patients with cardiac surgery-associated AKI. When the metabolite disorder was due to ischaemia and medical treatment, the kidneys could return to normal. This work provides data on urinary metabolic profiles and resources for further research on AKI.
Background Cardiac surgery-associated acute kidney injury (AKI) can increase the mortality and morbidity, and the incidence of chronic kidney disease, in critically ill survivors. The purpose of this research was to investigate possible links between urinary metabolic changes and cardiac surgery-associated AKI. Methods Using ultra-high-performance liquid chromatography coupled with Q-Exactive Orbitrap mass spectrometry, non-targeted metabolomics was performed on urinary samples collected from groups of patients with cardiac surgery-associated AKI at different time points, including Before_AKI (uninjured kidney), AKI_Day1 (injured kidney) and AKI_Day14 (recovered kidney) groups. The data among the three groups were analyzed by combining multivariate and univariate statistical methods, and urine metabolites related to AKI in patients after cardiac surgery were screened. Altered metabolic pathways associated with cardiac surgery-induced AKI were identified by examining the Kyoto Encyclopedia of Genes and Genomes database. Results The secreted urinary metabolome of the injured kidney can be well separated from the urine metabolomes of uninjured or recovered patients using multivariate and univariate statistical analyses. However, urine samples from the AKI_Day14 and Before_AKI groups cannot be distinguished using either of the two statistical analyses. Nearly 4000 urinary metabolites were identified through bioinformatics methods at Annotation Levels 1–4. Several of these differential metabolites may also perform essential biological functions. Differential analysis of the urinary metabolome among groups was also performed to provide potential prognostic indicators and changes in signalling pathways. Compared with the uninjured kidney group, the patients with cardiac surgery-associated AKI displayed dramatic changes in renal metabolism, including sulphur metabolism and amino acid metabolism. 
Conclusions Urinary metabolite disorder was observed in patients with cardiac surgery-associated AKI due to ischaemia and medical treatment, and the recovered patients’ kidneys were able to return to normal. This work provides data on urine metabolite markers and essential resources for further research on AKI.
Affiliation(s)
- Yunpeng Bai
- Center of Scientific Research, Maoming People’s Hospital, Maoming 525000, China
- Department of Critical Care Medicine, Maoming People’s Hospital, Maoming 525000, China
| | - Huidan Zhang
- Department of Intensive Care Unit of Cardiovascular Surgery, Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China
- Department of Critical Care Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China
- School of Medicine, South China University of Technology, Guangzhou 510006, China
| | - Zheng Wu
- Department of Critical Care Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
| | - Sumei Huang
- Center of Scientific Research, Maoming People’s Hospital, Maoming 525000, China
- Biological Resource Center of Maoming People’s Hospital, Maoming 525000, China
| | - Zhidan Luo
- Center of Scientific Research, Maoming People’s Hospital, Maoming 525000, China
| | - Kunyong Wu
- Center of Scientific Research, Maoming People’s Hospital, Maoming 525000, China
- Biological Resource Center of Maoming People’s Hospital, Maoming 525000, China
| | - Linhui Hu
- Center of Scientific Research, Maoming People’s Hospital, Maoming 525000, China
- Department of Critical Care Medicine, Maoming People’s Hospital, Maoming 525000, China
| | - Chunbo Chen
- Department of Critical Care Medicine, Maoming People’s Hospital, Maoming 525000, China
- Corresponding author at: Department of Critical Care Medicine, Maoming People’s Hospital, Maoming 525000, China.
|
30
|
Ishikawa S, Sugimoto M, Konta T, Kitabatake K, Ueda S, Edamatsu K, Okuyama N, Yusa K, Iino M. Salivary Metabolomics for Prognosis of Oral Squamous Cell Carcinoma. Front Oncol 2022; 11:789248. [PMID: 35070995 PMCID: PMC8769065 DOI: 10.3389/fonc.2021.789248] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/04/2021] [Accepted: 12/14/2021] [Indexed: 12/22/2022] Open
Abstract
This study aimed to identify salivary metabolomic biomarkers for predicting the prognosis of oral squamous cell carcinoma (OSCC) based on comprehensive metabolomic analyses. Quantified metabolomics data of unstimulated saliva samples collected from patients with OSCC (n = 72) were randomly divided into the training (n = 35) and validation groups (n = 37). The training data were used to develop a Cox proportional hazards regression model for identifying significant metabolites as prognostic factors for overall survival (OS) and disease-free survival. Moreover, the validation group was used to develop another Cox proportional hazards regression model using the previously identified metabolites. There were no significant between-group differences in the participants’ characteristics, including age, sex, and the median follow-up periods (55 months [range: 3–100] vs. 43 months [range: 0–97]). The concentrations of 5-hydroxylysine (p = 0.009) and 3-methylhistidine (p = 0.012) were identified as significant prognostic factors for OS in the training group. Among them, the concentration of 3-methylhistidine was a significant prognostic factor for OS in the validation group (p = 0.048). Our findings revealed that salivary 3-methylhistidine is a prognostic factor for OS in patients with OSCC.
Affiliation(s)
- Shigeo Ishikawa
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Masahiro Sugimoto
- Health Promotion and Pre-emptive Medicine, Research and Development Center for Minimally Invasive Therapies, Tokyo Medical University, Shinjuku, Japan
| | - Tsuneo Konta
- Department of Public Health and Hygiene, Yamagata University Graduate School of Medicine, Iida-nishi, Japan
| | - Kenichiro Kitabatake
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Shohei Ueda
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Kaoru Edamatsu
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Naoki Okuyama
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Kazuyuki Yusa
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
| | - Mitsuyoshi Iino
- Department of Dentistry, Oral and Maxillofacial Plastic and Reconstructive Surgery, Faculty of Medicine, Yamagata University, Iida-nishi, Japan
|
31
|
Neutron encoded derivatization of endothelial cell lysates for quantitation of aldehyde metabolites using nESI-LC-HRMS. Anal Chim Acta 2022; 1190:339260. [PMID: 34857138 PMCID: PMC8646956 DOI: 10.1016/j.aca.2021.339260] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/13/2021] [Revised: 10/27/2021] [Accepted: 11/06/2021] [Indexed: 01/17/2023]
Abstract
Biological aldehydes are difficult to analyze by electrospray ionization mass spectrometry due to their poor proton affinity and low biological concentrations. Chemical derivatization with stable isotope tags is used here for sample multiplexing, increased throughput, improved signal intensity, and quantitation. Nine quaternary amine tags with mass differences as low as 0.0058 Da had no observable chromatographic shifts, small amounts of ion suppression, and minimal matrix effects. Low concentration perfluoropentanoic acid was used as an ion pairing reagent to improve the retention of derivatized aldehydes. Perfluoropentanoic acid addition showed an average of three-fold improvement in limits of detection, 50% reduction in peak width, and 2.5 fold increase in analyte retention. Analysis of fifteen tagged aldehydes yielded an average of 13 nM limit of detection, 9 %RSD, R2 of 0.995, and linear dynamic range of 40-1000 nM. In a single 20 min separation, absolute quantitative data was obtained for 11 reactive aldehydes across 8 aortic endothelial cell samples. High glucose treatment produced significant changes to malondialdehyde, decanal, and (2E)-hexadecenal. These changes are consistent with glucose-induced oxidative stress. This method demonstrates that neutron encoded tagging of aldehydes is suitable for the analysis of complex samples.
|
32
|
Zhu F, Fernie AR, Scossa F. Preparation and Curation of Omics Data for Genome-Wide Association Studies. Methods Mol Biol 2022; 2481:127-150. [PMID: 35641762 DOI: 10.1007/978-1-0716-2237-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
With the development of large-scale molecular phenotyping platforms, genome-wide association studies have greatly developed, being no longer limited to the analysis of classical agronomic traits, such as yield or flowering time, but also embracing the dissection of the genetic basis of molecular traits. Data generated by omics platforms, however, pose some technical and statistical challenges to the classical methodology and assumptions of an association study. Although genotyping data are subject to strict filtering procedures, and several advanced statistical approaches are now available to adjust for population structure, less attention has been instead devoted to the preparation of omics data prior to GWAS. In the present chapter, we briefly present the methods to acquire profiling data from transcripts, proteins, and small molecules, and discuss the tools and possibilities to clean, normalize, and remove the unwanted variation from large datasets of molecular phenotypic traits prior to their use in GWAS.
Affiliation(s)
- Feng Zhu
- National R&D Center for Citrus Preservation, Key Laboratory of Horticultural Plant Biology, Ministry of Education, Huazhong Agricultural University, Wuhan, China
- Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
| | - Federico Scossa
- Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.
- Council for Agricultural Research and Economics (CREA), Research Centre for Genomics and Bioinformatics (CREA-GB), Rome, Italy.
|
33
|
1H HR-MAS NMR Based Metabolic Profiling of Lung Cancer Cells with Induced and De-Induced Cisplatin Resistance to Reveal Metabolic Resistance Adaptations. Molecules 2021; 26:6766. [PMID: 34833859 PMCID: PMC8625954 DOI: 10.3390/molecules26226766] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/24/2021] [Revised: 10/27/2021] [Accepted: 11/03/2021] [Indexed: 12/01/2022] Open
Abstract
Cisplatin (cisPt) is an important drug that is used against various cancers, including advanced lung cancer. However, drug resistance is still a major ongoing problem and its investigation is of paramount interest. Here, a high-resolution magic angle spinning (HR-MAS) NMR study is presented deciphering the metabolic profile of non-small cell lung cancer (NSCLC) cells and metabolic adaptations at different levels of induced cisPt-resistance, as well as in their de-induced counterparts (cells cultivated in absence of cisPt). In total, fifty-three metabolites were identified and quantified in the 1H-HR-MAS NMR cell spectra. Metabolic adaptations to cisPt-resistance were detected, which correlated with the degree of resistance. Importantly, de-induced cell lines demonstrated similar metabolic adaptations as the corresponding cisPt-resistant cell lines. Metabolites predominantly changed in cisPt resistant cells and their de-induced counterparts include glutathione and taurine. Characteristic metabolic patterns for cisPt resistance may become relevant as biomarkers in cancer medicine.
|
34
|
Gouveia GJ, Shaver AO, Garcia BM, Morse AM, Andersen EC, Edison AS, McIntyre LM. Long-Term Metabolomics Reference Material. Anal Chem 2021; 93:9193-9199. [PMID: 34156835 PMCID: PMC8996483 DOI: 10.1021/acs.analchem.1c01294] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/17/2022]
Abstract
The use of quality control samples in metabolomics ensures data quality, reproducibility, and comparability between studies, analytical platforms, and laboratories. Long-term, stable, and sustainable reference materials (RMs) are a critical component of the quality assurance/quality control (QA/QC) system; however, the limited selection of currently available matrix-matched RMs reduces their applicability for widespread use. To produce an RM in any context, for any matrix that is robust to changes over the course of time, we developed the iterative batch averaging method (IBAT). To illustrate this method, we generated 11 independently grown Escherichia coli batches and made an RM over the course of 10 IBAT iterations. We measured the variance of these materials by nuclear magnetic resonance (NMR) and showed that IBAT produces a stable and sustainable RM over time. This E. coli RM was then used as a food source to produce a Caenorhabditis elegans RM for a metabolomics experiment. The metabolite extraction of this material, alongside 41 independently grown individual C. elegans samples of the same genotype, allowed us to estimate the proportion of sample variation in preanalytical steps. From the NMR data, we found that 40% of the metabolite variance is due to the metabolite extraction process and analysis and 60% is due to sample-to-sample variance. The availability of RMs in untargeted metabolomics is one of the predominant needs of the metabolomics community that reach beyond quality control practices. IBAT addresses this need by facilitating the production of biologically relevant RMs and increasing their widespread use.
Affiliation(s)
- Goncalo J Gouveia
- Department of Biochemistry & Molecular Biology, University of Georgia, Green Street, Athens, Georgia 30602, United States
- Complex Carbohydrate Research Center, University of Georgia, 315, Riverbend Road, Athens, Georgia 30602, United States
| | - Amanda O Shaver
- Department of Genetics, University of Georgia, Green Street, Athens, Georgia 30602, United States
- Complex Carbohydrate Research Center, University of Georgia, 315, Riverbend Road, Athens, Georgia 30602, United States
| | - Brianna M Garcia
- Department of Chemistry, University of Georgia, 140, Cedar Street, Athens, Georgia 30602, United States
- Complex Carbohydrate Research Center, University of Georgia, 315, Riverbend Road, Athens, Georgia 30602, United States
| | - Alison M Morse
- Department of Molecular Genetics and Microbiology, University of Florida Genetics Institute, University of Florida, Mowry Road, Gainesville, Florida 32610, United States
| | - Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, 2205, Tech Drive, Evanston, Illinois 60208, United States
| | - Arthur S Edison
- Department of Biochemistry & Molecular Biology, University of Georgia, Green Street, Athens, Georgia 30602, United States
- Department of Genetics, University of Georgia, Green Street, Athens, Georgia 30602, United States
- Complex Carbohydrate Research Center, University of Georgia, 315, Riverbend Road, Athens, Georgia 30602, United States
| | - Lauren M McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida Genetics Institute, University of Florida, Mowry Road, Gainesville, Florida 32610, United States
|
35
|
Yamamoto H, Suzuki M, Matsuta R, Sasaki K, Kang MI, Kami K, Tatara Y, Itoh K, Nakaji S. Capillary Electrophoresis Mass Spectrometry-Based Metabolomics of Plasma Samples from Healthy Subjects in a Cross-Sectional Japanese Population Study. Metabolites 2021; 11:314. [PMID: 34068294 PMCID: PMC8153282 DOI: 10.3390/metabo11050314] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/06/2021] [Revised: 05/10/2021] [Accepted: 05/11/2021] [Indexed: 12/27/2022] Open
Abstract
For large-scale metabolomics, such as in cohort studies, normalization protocols using quality control (QC) samples have been established when using data from gas chromatography and liquid chromatography coupled to mass spectrometry. However, normalization protocols have not been established for capillary electrophoresis-mass spectrometry metabolomics. In this study, we performed metabolome analysis of 314 human plasma samples using capillary electrophoresis-mass spectrometry. QC samples were analyzed every 10 samples. The results of principal component analysis for the metabolome data from only the QC samples showed variations caused by capillary replacement in the first principal component score and linear variation with continuous measurement in the second principal component score. Correlation analysis between diagnostic blood tests and plasma metabolites normalized by the QC samples was performed for samples from 188 healthy subjects who participated in a Japanese population study. Five highly correlated pairs were identified, including two previously unidentified pairs in normal healthy subjects of blood urea nitrogen and guanidinosuccinic acid, and gamma-glutamyl transferase and cysteine glutathione disulfide. These results confirmed the validity of normalization protocols in capillary electrophoresis-mass spectrometry using large-scale metabolomics and comprehensive analysis.
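The abstract above mentions QC samples injected at regular intervals and a roughly linear signal variation over continuous measurement. One common way to use such QC injections is to interpolate the QC signal along the run order and rescale each study sample by it. The sketch below illustrates that idea under stated assumptions: it is not the protocol used in the study, and the function names, injection orders, and intensities are all hypothetical.

```python
from bisect import bisect_right

def drift_factor(order, qc_orders, qc_values):
    """Linearly interpolate the QC signal at a given injection order.

    qc_orders must be sorted ascending; orders outside the QC range
    are clamped to the nearest QC value.
    """
    if order <= qc_orders[0]:
        return qc_values[0]
    if order >= qc_orders[-1]:
        return qc_values[-1]
    i = bisect_right(qc_orders, order)
    x0, x1 = qc_orders[i - 1], qc_orders[i]
    y0, y1 = qc_values[i - 1], qc_values[i]
    return y0 + (y1 - y0) * (order - x0) / (x1 - x0)

def correct_drift(samples, qc_orders, qc_values):
    """Rescale each (order, intensity) pair to the mean QC level."""
    target = sum(qc_values) / len(qc_values)
    return [
        (order, value * target / drift_factor(order, qc_orders, qc_values))
        for order, value in samples
    ]

# Hypothetical single-metabolite run: the QC signal decays linearly.
qc_orders = [0, 10, 20]
qc_values = [100.0, 90.0, 80.0]
samples = [(5, 95.0), (15, 85.0)]

for order, value in correct_drift(samples, qc_orders, qc_values):
    print(order, round(value, 3))
```

In this toy run both samples follow the QC trend exactly, so after correction they land on the same level; real data would also need per-metabolite handling and a decision on abrupt shifts such as the capillary replacement noted in the abstract.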
Affiliation(s)
- Hiroyuki Yamamoto
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
- Department of Metabolomics Innovation, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan;
- Correspondence: (H.Y.); (K.I.)
| | - Makoto Suzuki
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
| | - Rira Matsuta
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
| | - Kazunori Sasaki
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
| | - Moon-Il Kang
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
| | - Kenjiro Kami
- Human Metabolome Technologies, Inc., 246-2 Mizukami, Kakuganji, Tsuruoka, Yamagata 997-0052, Japan; (M.S.); (R.M.); (K.S.); (M.-I.K.); (K.K.)
| | - Yota Tatara
- Center for Advanced Medical Research, Department of Stress Response Science, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan;
| | - Ken Itoh
- Department of Metabolomics Innovation, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan;
- Center for Advanced Medical Research, Department of Stress Response Science, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan;
- Correspondence: (H.Y.); (K.I.)
| | - Shigeyuki Nakaji
- Department of Metabolomics Innovation, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan;
- Department of Social Health, Hirosaki University Graduate School of Medicine, 5 Zaifu-cho, Hirosaki 036-8562, Japan
|
36
|
Abstract
BACKGROUND From precision medicine, space exploration, and drug discovery to the characterization of the dark chemical space of habitats and organisms, metabolomics takes centre stage in providing answers to diverse biological, biomedical, and environmental questions. With technological advances in mass spectrometry and spectroscopy platforms that aid the generation of information-rich datasets that are complex big data, data analytics tends to co-evolve to match the pace of analytical instrumentation. Software tools, resources, databases, and solutions help in harnessing the concealed information in the generated data for eventual translational success. AIM OF THE REVIEW In this review, ~85 metabolomics software resources, packages, tools, databases, and other utilities that appeared in 2020 are introduced to the research community. KEY SCIENTIFIC CONCEPTS OF REVIEW In Table 1, the computational dependencies and downloadable links of the tools are provided, and the resources are categorized based on their utility. The review aims to keep the community of metabolomics researchers updated with all the resources developed in 2020 at a collated avenue, in line with efforts from 2015 onwards to help them find these in one place for further referencing and use.
|
37
|
Quality Assessment of Untargeted Analytical Data in a Large-Scale Metabolomic Study. J Clin Med 2021; 10:1826. [PMID: 33922230 PMCID: PMC8122759 DOI: 10.3390/jcm10091826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/09/2021] [Revised: 04/12/2021] [Accepted: 04/19/2021] [Indexed: 12/12/2022] Open
Abstract
Large-scale metabolomic studies have become common, and the reliability of the peak data produced by the various instruments is an important issue. However, less attention has been paid to the large number of uncharacterized peaks in untargeted metabolomics data. In this study, we tested various criteria to assess the reliability of 276 and 202 uncharacterized peaks that were detected in a gathered set of 30 plasma and urine quality control samples, respectively, using capillary electrophoresis-time-of-flight mass spectrometry (CE-TOFMS). The linear relationship between the amounts of pooled samples and the corresponding peak areas was one of the criteria used to select reliable peaks. We used samples from approximately 3000 participants in the Tsuruoka Metabolome Cohort Study to investigate patterns of the areas of these uncharacterized peaks among the samples and clustered the peaks by combining the patterns and differences in the migration times. Our assessment pipeline removed substantial numbers of unreliable or redundant peaks and detected 35 and 74 reliable uncharacterized peaks in plasma and urine, respectively, some of which may correspond to metabolites involved in important physiological processes such as disease progression. We propose that our assessment pipeline can be used to help establish large-scale untargeted clinical metabolomic studies.
|
38
|
DBnorm as an R package for the comparison and selection of appropriate statistical methods for batch effect correction in metabolomic studies. Sci Rep 2021; 11:5657. [PMID: 33707505 PMCID: PMC7952378 DOI: 10.1038/s41598-021-84824-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/01/2020] [Accepted: 02/19/2021] [Indexed: 02/07/2023] Open
Abstract
As a powerful phenotyping technology, metabolomics provides new opportunities in biomarker discovery through metabolome-wide association studies (MWAS) and the identification of metabolites having a regulatory effect in various biological processes. While mass spectrometry-based (MS) metabolomics assays are endowed with high throughput and sensitivity, MWAS are bound to long-term data acquisition, which generates an analytical signal drift over time that can hinder the uncovering of real biologically relevant changes. We developed “dbnorm”, a package in the R environment, which allows for an easy comparison of the model performance of advanced statistical tools commonly used in metabolomics to remove batch effects from large metabolomics datasets. “dbnorm” integrates advanced statistical tools to inspect the dataset structure not only at the macroscopic (sample batches) scale, but also at the microscopic (metabolic features) level. To compare the model performance on data correction, “dbnorm” assigns a score that helps users identify the best fitting model for each dataset. In this study, we applied “dbnorm” to two large-scale metabolomics datasets as a proof of concept. We demonstrate that “dbnorm” allows for the accurate selection of the most appropriate statistical tool to efficiently remove the signal drift over time and to focus on the relevant biological components of complex datasets.
|