1
|
Pakkir Shah AK, Walter A, Ottosson F, Russo F, Navarro-Diaz M, Boldt J, Kalinski JCJ, Kontou EE, Elofson J, Polyzois A, González-Marín C, Farrell S, Aggerbeck MR, Pruksatrakul T, Chan N, Wang Y, Pöchhacker M, Brungs C, Cámara B, Caraballo-Rodríguez AM, Cumsille A, de Oliveira F, Dührkop K, El Abiead Y, Geibel C, Graves LG, Hansen M, Heuckeroth S, Knoblauch S, Kostenko A, Kuijpers MCM, Mildau K, Papadopoulos Lambidis S, Portal Gomes PW, Schramm T, Steuer-Lodd K, Stincone P, Tayyab S, Vitale GA, Wagner BC, Xing S, Yazzie MT, Zuffa S, de Kruijff M, Beemelmanns C, Link H, Mayer C, van der Hooft JJJ, Damiani T, Pluskal T, Dorrestein P, Stanstrup J, Schmid R, Wang M, Aron A, Ernst M, Petras D. Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data. Nat Protoc 2024:10.1038/s41596-024-01046-3. [PMID: 39304763 DOI: 10.1038/s41596-024-01046-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 07/02/2024] [Indexed: 09/22/2024]
Abstract
Feature-based molecular networking (FBMN) is a popular analysis approach for liquid chromatography-tandem mass spectrometry-based non-targeted metabolomics data. While processing liquid chromatography-tandem mass spectrometry data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. Users new to statistical analysis, in particular, struggle to effectively handle and analyze complex data matrices. Here we provide a comprehensive guide for the statistical analysis of FBMN results, focusing on the downstream analysis of the FBMN output table. We explain the data structure and principles of data cleanup and normalization, as well as uni- and multivariate statistical analysis of FBMN results. We provide explanations and code in two scripting languages (R and Python) as well as the QIIME2 framework for all protocol steps, from data cleanup to statistical analysis. All code is shared in the form of Jupyter Notebooks (https://github.com/Functional-Metabolomics-Lab/FBMN-STATS). Additionally, the protocol is accompanied by a web application with a graphical user interface (https://fbmn-statsguide.gnps2.org/) to lower the barrier of entry for new users and for educational purposes. Finally, we also show users how to integrate their statistical results into the molecular network using the Cytoscape visualization tool. Throughout the protocol, we use a previously published environmental metabolomics dataset for demonstration purposes. Together, the protocol, code and web application provide a complete guide and toolbox for FBMN data integration, cleanup and advanced statistical analysis, enabling new users to uncover molecular insights from their non-targeted metabolomics data. Our protocol is tailored for the seamless analysis of FBMN results from Global Natural Products Social Molecular Networking and can be easily adapted to other mass spectrometry feature detection, annotation and networking tools.
Affiliation(s)
- Abzer K Pakkir Shah
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Axel Walter
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany
| | - Filip Ottosson
- Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
| | - Francesco Russo
- Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark
| | - Marcelo Navarro-Diaz
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Judith Boldt
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- German Center for Infection Research, Partner Site Braunschweig-Hannover, Braunschweig, Germany
| | - Jarmo-Charles J Kalinski
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
| | - Eftychia Eva Kontou
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- The Novo Nordisk Foundation for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
| | - James Elofson
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
| | - Alexandros Polyzois
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
| | - Carolina González-Marín
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Universidad EAFIT, Medellín, Antioquia, Colombia
| | - Shane Farrell
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, USA
- School of Marine Sciences, Darling Marine Center, University of Maine, Walpole, ME, USA
| | - Marie R Aggerbeck
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Environmental Science, Aarhus University, Roskilde, Denmark
| | - Thapanee Pruksatrakul
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Thailand Science Park, Pathum Thani, Thailand
| | - Nathan Chan
- Department of Computer Science, University of California Riverside, Riverside, CA, USA
| | - Yunshu Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, USA
| | - Magdalena Pöchhacker
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria
| | - Corinna Brungs
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
| | - Beatriz Cámara
- Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
| | | | - Andres Cumsille
- Laboratorio de Microbiología Molecular y Biotecnología Ambiental, Centro de Biotecnología DAL, Universidad Técnica Federico Santa María, Valparaíso, Chile
| | - Fernanda de Oliveira
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Department of Biotechnology, Engineering School of Lorena, University of São Paulo, Lorena, São Paulo, Brazil
| | - Kai Dührkop
- Department of Bioinformatics, University of Jena, Jena, Germany
| | - Yasin El Abiead
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
| | - Christian Geibel
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Lana G Graves
- Department of Environmental Systems Analysis, University of Tübingen, Tübingen, Germany
- Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
| | - Martin Hansen
- Department of Environmental Science, Aarhus University, Roskilde, Denmark
| | - Steffen Heuckeroth
- Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany
| | - Simon Knoblauch
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Anastasiia Kostenko
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
| | - Mirte C M Kuijpers
- Department of Ecology, Behavior and Evolution, University of California San Diego, San Diego, CA, USA
| | - Kevin Mildau
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Analytical Chemistry, University of Vienna, Vienna, Austria
- Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
| | | | - Paulo Wender Portal Gomes
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
| | - Tilman Schramm
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Department of Biochemistry, University of California Riverside, Riverside, CA, USA
| | - Karoline Steuer-Lodd
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
- Department of Biochemistry, University of California Riverside, Riverside, CA, USA
| | - Paolo Stincone
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Sibgha Tayyab
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Giovanni Andrea Vitale
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Berenike C Wagner
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Shipei Xing
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
| | - Marquis T Yazzie
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
| | - Simone Zuffa
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
| | - Martinus de Kruijff
- Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
| | - Christine Beemelmanns
- Helmholtz Institute for Pharmaceutical Research Saarland, Helmholtz Centre for Infection Research, Saarbrücken, Germany
- Saarland University, Saarbrücken, Germany
| | - Hannes Link
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Christoph Mayer
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany
| | - Justin J J van der Hooft
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Bioinformatics Group, Wageningen University and Research, Wageningen, the Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
| | - Tito Damiani
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
| | - Tomáš Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
| | - Pieter Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
| | - Jan Stanstrup
- Department of Nutrition, Exercise and Sports, University of Copenhagen, Frederiksberg C, Denmark
| | - Robin Schmid
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czech Republic
| | - Mingxun Wang
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Computer Science, University of California Riverside, Riverside, CA, USA
| | - Allegra Aron
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA
- Department of Chemistry and Biochemistry, University of Denver, Denver, CO, USA
| | - Madeleine Ernst
- Section for Clinical Mass Spectrometry, Danish Center for Neonatal Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen S, Denmark.
| | - Daniel Petras
- Virtual Multi-Omics Laboratory, The Internet, Riverside, CA, USA.
- University of Tübingen, Interfaculty Institute of Microbiology and Infection Medicine, Tübingen, Germany.
- Department of Biochemistry, University of California Riverside, Riverside, CA, USA.
|
2
|
Ferrario PG, Bub A, Frommherz L, Krüger R, Rist MJ, Watzl B. A new statistical workflow (R-packages based) to investigate associations between one variable of interest and the metabolome. Metabolomics 2023; 20:2. [PMID: 38036896 PMCID: PMC10689553 DOI: 10.1007/s11306-023-02065-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 11/09/2023] [Indexed: 12/02/2023]
Abstract
INTRODUCTION: In metabolomics, the investigation of associations between the metabolome and one trait of interest is a key research question. However, statistical analyses of such associations are often challenging. Statistical tools enabling resilient verification and clear presentation are therefore highly desired.
OBJECTIVES: Our aim is to contribute to the statistical analysis of metabolomics data by offering a widely applicable open-source statistical workflow that considers the intrinsic complexity of metabolomics data.
METHODS: We combined selected R packages tailored for all properties of heterogeneous metabolomics datasets, where metabolite parameters typically (i) are analyzed in different matrices, (ii) are measured on different analytical platforms with different precision, (iii) are analyzed by targeted as well as non-targeted methods, (iv) are scaled variously, (v) reveal heterogeneous variances, (vi) may be correlated, (vii) may have only few values or values below a detection limit, or (viii) may be incomplete.
RESULTS: The code is shared entirely and is freely available. The workflow output is a table of metabolites associated with a trait of interest and a compact plot for high-quality results visualization. The workflow output and its utility are presented by applying it to two previously published datasets: one dataset from our own lab and another dataset taken from the repository MetaboLights.
CONCLUSION: The robustness and benefits of the statistical workflow were clearly demonstrated, and anyone can directly reuse it for the analysis of their own data.
Affiliation(s)
- Paola G Ferrario
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany.
| | - Achim Bub
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany
| | - Lara Frommherz
- Department of Safety and Quality of Fruit and Vegetables, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany
| | - Ralf Krüger
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany
| | - Manuela J Rist
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany
| | - Bernhard Watzl
- Department of Physiology and Biochemistry of Nutrition, Max Rubner-Institut, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany
|
3
|
Du X, Dastmalchi F, Ye H, Garrett TJ, Diller MA, Liu M, Hogan WR, Brochhausen M, Lemas DJ. Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software. Metabolomics 2023; 19:11. [PMID: 36745241 DOI: 10.1007/s11306-023-01974-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/08/2022] [Accepted: 01/20/2023] [Indexed: 02/07/2023]
Abstract
BACKGROUND: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles (Findability, Accessibility, Interoperability, and Reusability) were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS).
AIM OF REVIEW: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software.
KEY SCIENTIFIC CONCEPTS OF REVIEW: We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 of them for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature along with internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfillment of software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that covered multiple FAIR4RS categories but had a low percentage of fulfillment: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on these results, we discussed improvement strategies and future directions.
Affiliation(s)
- Xinsong Du
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Farhad Dastmalchi
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Hao Ye
- Health Science Center Libraries, University of Florida, Florida, USA
| | - Timothy J Garrett
- Department of Pathology, Immunology and Laboratory Medicine, College of Medicine, University of Florida, Florida, USA
| | - Matthew A Diller
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Mei Liu
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - William R Hogan
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA
| | - Mathias Brochhausen
- Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, USA
| | - Dominick J Lemas
- Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA.
- Department of Obstetrics and Gynecology, University of Florida College of Medicine, Florida, Gainesville, United States.
- Center for Perinatal Outcomes Research, University of Florida College of Medicine, Gainesville, United States.
|
4
|
Peralbo-Molina Á, Solà-Santos P, Perera-Lluna A, Chicano-Gálvez E. Data Processing and Analysis in Mass Spectrometry-Based Metabolomics. Methods Mol Biol 2023; 2571:207-239. [PMID: 36152164 DOI: 10.1007/978-1-0716-2699-3_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Metabolomics is the latest of the omics sciences. It attempts to measure and characterize metabolites (small chemical compounds <1500 Da, usually products of biological reactions) in cells, tissue, or biofluids. As metabolic reactions are closer to the phenotype, metabolomics has emerged as an attractive science for various areas of research, including personalized medicine. However, due to the complexity of the data obtained and the absence of curated databases for metabolite identification, data processing is the major bottleneck in this area, since most technicians lack the bioinformatics expertise required to process datasets in a reliable and fast manner. The aim of this chapter is to describe the available data processing tools that enable an inexperienced researcher to obtain reliable results without having to go through extensive parametrization steps.
Affiliation(s)
- Ángela Peralbo-Molina
- IMIBIC Mass Spectrometry and Molecular Imaging Unit, Maimonides, Biomedical Research Institute of Cordoba (IMIBIC), Reina Sofia University Hospital, University of Cordoba (UCO), Córdoba, Spain.
| | - Pol Solà-Santos
- B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, Barcelona, Spain
- Networking Biomedical Research Centre in the Subject Area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Sant Joan de Déu, Barcelona, Spain
| | - Alexandre Perera-Lluna
- B2SLab, Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, Barcelona, Spain
- Networking Biomedical Research Centre in the Subject Area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Sant Joan de Déu, Barcelona, Spain
| | - Eduardo Chicano-Gálvez
- IMIBIC Mass Spectrometry and Molecular Imaging Unit, Maimonides, Biomedical Research Institute of Cordoba (IMIBIC), Reina Sofia University Hospital, University of Cordoba (UCO), Córdoba, Spain
|
5
|
Shen X, Yan H, Wang C, Gao P, Johnson CH, Snyder MP. TidyMass an object-oriented reproducible analysis framework for LC-MS data. Nat Commun 2022; 13:4365. [PMID: 35902589 PMCID: PMC9334349 DOI: 10.1038/s41467-022-32155-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 05/10/2022] [Accepted: 07/15/2022] [Indexed: 02/05/2023]
Abstract
Reproducibility, traceability, and transparency have been long-standing issues for metabolomics data analysis. Multiple tools have been developed, but limitations still exist. Here, we present the tidyMass project (https://www.tidymass.org/), a comprehensive R-based computational framework that can achieve the traceable, shareable, and reproducible workflow needs of data processing and analysis for LC-MS-based untargeted metabolomics. TidyMass is an ecosystem of R packages that share an underlying design philosophy, grammar, and data structure, which provides a comprehensive, reproducible, and object-oriented computational framework. The modular architecture makes tidyMass a highly flexible and extensible tool, which other users can improve and integrate with other tools to customize their own pipeline.
Affiliation(s)
- Xiaotao Shen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Hong Yan
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, CT, USA
| | - Chuchu Wang
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Peng Gao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Caroline H Johnson
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, CT, USA.
| | - Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
|
6
|
Pinter N, Glätzer D, Fahrner M, Fröhlich K, Johnson J, Grüning BA, Warscheid B, Drepper F, Schilling O, Föll MC. MaxQuant and MSstats in Galaxy Enable Reproducible Cloud-Based Analysis of Quantitative Proteomics Experiments for Everyone. J Proteome Res 2022; 21:1558-1565. [PMID: 35503992 DOI: 10.1021/acs.jproteome.2c00051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy's graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.
Affiliation(s)
- Niko Pinter
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
| | - Damian Glätzer
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
| | - Matthias Fahrner
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
| | - Klemens Fröhlich
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), Albert-Ludwigs-University Freiburg, 79104 Freiburg, Germany
| | - James Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | | | - Bettina Warscheid
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany; Faculty of Chemistry and Pharmacy, Department of Biochemistry, Julius Maximilian University of Würzburg, 97074 Würzburg, Germany
| | - Friedel Drepper
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
| | - Oliver Schilling
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), 79106 Freiburg, Germany
| | - Melanie Christine Föll
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts 02115, United States
|
7
|
Ye D, Li X, Shen J, Xia X. Microbial metabolomics: From novel technologies to diversified applications. Trends Analyt Chem 2022. [DOI: 10.1016/j.trac.2022.116540] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
|
8
|
Niemuth NJ, Curtis BJ, Laudadio ED, Sostare E, Bennett EA, Neureuther NJ, Mohaimani AA, Schmoldt A, Ostovich ED, Viant MR, Hamers RJ, Klaper RD. Energy Starvation in Daphnia magna from Exposure to a Lithium Cobalt Oxide Nanomaterial. Chem Res Toxicol 2021; 34:2287-2297. [PMID: 34724609 DOI: 10.1021/acs.chemrestox.1c00189] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
Growing evidence across organisms points to altered energy metabolism as an adverse outcome of metal oxide nanomaterial toxicity, with a mechanism of toxicity potentially related to the redox chemistry of processes involved in energy production. Despite this evidence, the significance of this mechanism has gone unrecognized in nanotoxicology due to the field's focus on oxidative stress as a universal, but nonspecific, nanotoxicity mechanism. To further explore metabolic impacts, we determined lithium cobalt oxide's (LCO's) effects on these pathways in the model organism Daphnia magna through global gene-expression analysis using RNA-Seq and untargeted metabolomics by direct-injection mass spectrometry. Our results show that a sublethal 1 mg/L 48 h exposure of D. magna to LCO nanosheets causes significant impacts on metabolic pathways versus untreated controls, while exposure to ions released over 48 h does not. Specifically, transcriptomic analysis using DAVID indicated significant enrichment (Benjamini-adjusted p ≤ 0.05) in LCO-exposed animals for changes in pathways involved in the cellular response to starvation (25 genes), mitochondrial function (70 genes), ATP-binding (70 genes), oxidative phosphorylation (53 genes), NADH dehydrogenase activity (12 genes), and protein biosynthesis (40 genes). Metabolomic analysis using MetaboAnalyst indicated significant enrichment (γ-adjusted p < 0.1) for changes in amino acid metabolism (19 metabolites) and starch, sucrose, and galactose metabolism (7 metabolites). Overlap of significantly impacted pathways by RNA-Seq and metabolomics suggests amino acid breakdown and increased sugar import for energy production. Results indicate that LCO-exposed Daphnia respond to energy starvation by altering metabolic pathways, both at the gene expression and metabolite levels. These results support altered energy production as a sensitive nanotoxicity adverse outcome for LCO exposure and suggest negative impacts on energy metabolism as an important avenue for future studies of nanotoxicity, including for other biological systems and for metal oxide nanomaterials more broadly.
Affiliation(s)
- Nicholas J Niemuth
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Becky J Curtis
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Elizabeth D Laudadio
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, Wisconsin 53706, United States
- Elena Sostare
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.
- Evan A Bennett
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Nicklaus J Neureuther
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Aurash A Mohaimani
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Angela Schmoldt
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Eric D Ostovich
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
- Mark R Viant
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.
- Robert J Hamers
- Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, Wisconsin 53706, United States
- Rebecca D Klaper
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, 600 E Greenfield Ave., Milwaukee, Wisconsin 53204, United States
9
Pandohee J, Kyereh E, Kulshrestha S, Xu B, Mahomoodally MF. Review of the recent developments in metabolomics-based phytochemical research. Crit Rev Food Sci Nutr 2021:1-16. [PMID: 34672234 DOI: 10.1080/10408398.2021.1993127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
Abstract
Phytochemicals are important bioactive components present in natural products. Although the health benefits of many food products are well known and accepted as common knowledge, the identity of the main bioactive molecules and the mechanism by which they interact in the human body are often unknown. Only in the last 30 years, as the field of metabolomics matured, has the identification of such bioactive molecules been made possible through the development of instruments to separate, and computational techniques to characterize, complex samples. This in turn has enabled in vitro studies to quantify the biological activity of the respective phytochemical either in mice models or in humans. In this review, the importance of key dietary phytochemicals such as phenolic acids, flavonoids, carotenoids, resveratrol, curcumin, and capsaicinoids is discussed together with their potential functions for human health. Untargeted metabolomics, in particular liquid chromatography mass spectrometry, is the most used method to isolate, identify and profile bioactive compounds in the study of phytochemicals in foods. The application of metabolomics in drug discovery is a common practice nowadays and has boosted the drug and/or supplement manufacturing sector.
Highlights
- Phytochemicals are beneficial compounds for human health
- Phytochemicals are plant-based bioactives obtainable from natural products
- Untargeted metabolomics has boosted the discovery of phytochemicals from food
- Targeted metabolomics is key in the authentication and screening of phytochemicals
- Metabolomics of phytochemicals is reshaping the road to drug and supplement manufacture
Affiliation(s)
- Jessica Pandohee
- Centre for Crop and Disease Management, Curtin University, Perth, Western Australia, Australia; Department of Health Sciences, Faculty of Science, University of Mauritius, Réduit, Mauritius
- Saurabh Kulshrestha
- School of Biotechnology, Faculty of Applied Sciences and Biotechnology, Shoolini University, Solan, Himachal Pradesh, India
- Baojun Xu
- Food Science and Technology Program, BNU-HKBU United International College, Zhuhai, Guangdong, China
10
The Hitchhiker's Guide to Untargeted Lipidomics Analysis: Practical Guidelines. Metabolites 2021; 11:metabo11110713. [PMID: 34822371 PMCID: PMC8624948 DOI: 10.3390/metabo11110713] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/08/2021] [Revised: 10/13/2021] [Accepted: 10/16/2021] [Indexed: 11/30/2022]
Abstract
Lipidomics is a newly emerged discipline involving the identification and quantification of thousands of lipids. As a part of the omics field, lipidomics has shown rapid growth both in the number of studies and in the size of lipidome datasets, thus, requiring specific and efficient data analysis approaches. This paper aims to provide guidelines for analyzing and interpreting lipidome data obtained using untargeted methods that rely on liquid chromatography coupled with mass spectrometry (LC-MS) to detect and measure the intensities of lipid compounds. We present a state-of-the-art untargeted LC-MS workflow for lipidomics, from study design to annotation of lipid features, focusing on practical, rather than theoretical, approaches for data analysis, and we outline possible applications of untargeted lipidomics for biological studies. We provide a detailed R notebook designed specifically for untargeted lipidome LC-MS data analysis, which is based on xcms software.
11
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022]
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever, that cancer should not only be viewed as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches in which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations and network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
- Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
- Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
12
Castellano-Escuder P, González-Domínguez R, Carmona-Pontaque F, Andrés-Lacueva C, Sánchez-Pla A. POMAShiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis. PLoS Comput Biol 2021; 17:e1009148. [PMID: 34197462 PMCID: PMC8279420 DOI: 10.1371/journal.pcbi.1009148] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/11/2021] [Revised: 07/14/2021] [Accepted: 06/05/2021] [Indexed: 12/14/2022]
Abstract
Metabolomics and proteomics, like other omics domains, usually face a data mining challenge in providing an understandable output to advance in biomarker discovery and precision medicine. Often, statistical analysis is one of the most difficult challenges and it is critical in the subsequent biological interpretation of the results. Because of this, combined with the computational programming skills needed for this type of analysis, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, sometimes the analysis is still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively. Metabolomics and proteomics are two growing areas in human health and personalized medicine fields. Often, one of the main applications of metabolomics and proteomics is the discovery of novel biomarkers and new therapeutic targets in these areas. However, these data are extremely complex and hard to analyse, since they have a large number of features, several missing values, and often important clinical variables to consider in the analyses. Therefore, powerful and versatile tools are needed to provide efficient methods for data visualization and exploration, as well as a wide range of robust statistical methods to meet all data and users requirements. 
Although powerful tools do exist for the analysis of these data, many of them are still limiting the analyses in terms of visualization and statistical analysis. To address this limitation and complement the existing tools, we have developed a web-based application, named POMAShiny, for the data analysis of metabolomics and proteomics. This novel and versatile tool offers a wholly interactive and easy-to-use environment for the analysis of these data, including numerous methods for preprocessing, data visualization and statistical analysis. The POMAShiny open-source tool is extremely flexible and portable, as it can be installed locally and freely accessed online at https://webapps.nutrimetabolomics.com/POMAShiny.
Affiliation(s)
- Pol Castellano-Escuder
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- * E-mail: (PC-E); (AS-P)
- Raúl González-Domínguez
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Francesc Carmona-Pontaque
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Cristina Andrés-Lacueva
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Alex Sánchez-Pla
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- * E-mail: (PC-E); (AS-P)
13
Zhou D, Zhu W, Sun T, Wang Y, Chi Y, Chen T, Lin J. iMAP: A Web Server for Metabolomics Data Integrative Analysis. Front Chem 2021; 9:659656. [PMID: 34026726 PMCID: PMC8133432 DOI: 10.3389/fchem.2021.659656] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2021] [Accepted: 04/06/2021] [Indexed: 12/11/2022]
Abstract
Metabolomics data analysis depends on the utilization of bioinformatics tools. To meet the evolving needs of metabolomics research, several integrated platforms have been developed. Our group has developed a desktop platform IP4M (integrated Platform for Metabolomics Data Analysis) which allows users to perform a nearly complete metabolomics data analysis in one-stop. With the extensive usage of IP4M, more and more demands were raised from users worldwide for a web version and a more customized workflow. Thus, iMAP (integrated Metabolomics Analysis Platform) was developed with extended functions, improved performances, and redesigned structures. Compared with existing platforms, iMAP has more methods and usage modes. A new module was developed with an automatic pipeline for train-test set separation, feature selection, and predictive model construction and validation. A new module was incorporated with sufficient editable parameters for network construction, visualization, and analysis. Moreover, plenty of plotting tools have been upgraded for highly customized publication-ready figures. Overall, iMAP is a good alternative tool with complementary functions to existing metabolomics data analysis platforms. iMAP is freely available for academic usage at https://imap.metaboprofile.cloud/ (License MPL 2.0).
Affiliation(s)
- Di Zhou
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
- Wenjia Zhu
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
- Tao Sun
- Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Yang Wang
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
- Yi Chi
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
- Tianlu Chen
- Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Jingchao Lin
- Metabo-Profile Biotechnology (Shanghai) Co. Ltd., Shanghai, China
14
Chang HY, Colby SM, Du X, Gomez JD, Helf MJ, Kechris K, Kirkpatrick CR, Li S, Patti GJ, Renslow RS, Subramaniam S, Verma M, Xia J, Young JD. A Practical Guide to Metabolomics Software Development. Anal Chem 2021; 93:1912-1923. [PMID: 33467846 PMCID: PMC7859930 DOI: 10.1021/acs.analchem.0c03581] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Indexed: 12/22/2022]
Abstract
A growing number of software tools have been developed for metabolomics data processing and analysis. Many new tools are contributed by metabolomics practitioners who have limited prior experience with software development, and the tools are subsequently implemented by users with expertise that ranges from basic point-and-click data analysis to advanced coding. This Perspective is intended to introduce metabolomics software users and developers to important considerations that determine the overall impact of a publicly available tool within the scientific community. The recommendations reflect the collective experience of an NIH-sponsored Metabolomics Consortium working group that was formed with the goal of researching guidelines and best practices for metabolomics tool development. The recommendations are aimed at metabolomics researchers with little formal background in programming and are organized into three stages: (i) preparation, (ii) tool development, and (iii) distribution and maintenance.
Affiliation(s)
- Hui-Yin Chang
- Department of Pathology, University of Michigan, 1301 Catherine Street, Ann Arbor, Michigan 48109, United States; Department of Biomedical Sciences and Engineering, National Central University, No. 300, Zhongda Road, Zhongli District, Taoyuan City 320, Taiwan
- Sean M Colby
- Biological Sciences Division, Pacific Northwest National Laboratory, P.O. Box 999, MSIN: K8-98, Richland, Washington 99352, United States
- Xiuxia Du
- Department of Bioinformatics & Genomics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, North Carolina 28223, United States
- Javier D Gomez
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, PMB 351604, 2301 Vanderbilt Place, Nashville, Tennessee 37235, United States
- Maximilian J Helf
- Boyce Thompson Institute and Department of Chemistry and Chemical Biology, Cornell University, 533 Tower Road, Ithaca, New York 14853, United States
- Katerina Kechris
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, 13001 East 17th Place B119, Aurora, Colorado 80045, United States
- Christine R Kirkpatrick
- San Diego Supercomputer Center, University of California San Diego, MC 0505, 9500 Gilman Drive, La Jolla, California 92093, United States
- Shuzhao Li
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, Connecticut 06032, United States
- Gary J Patti
- Department of Chemistry, Department of Medicine, and Siteman Cancer Center, Washington University in St. Louis, CB 1134, One Brookings Drive, St. Louis, Missouri 63130, United States
- Ryan S Renslow
- Biological Sciences Division, Pacific Northwest National Laboratory, P.O. Box 999, MSIN: K8-98, Richland, Washington 99352, United States; Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, P.O. Box 646515, Pullman, Washington 99164, United States
- Shankar Subramaniam
- San Diego Supercomputer Center, University of California San Diego, MC 0505, 9500 Gilman Drive, La Jolla, California 92093, United States; Department of Bioengineering, Department of Computer Science and Engineering, Department of Cellular and Molecular Medicine, and Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive #0412, La Jolla, California 92093, United States
- Mukesh Verma
- Epidemiology and Genomics Research Program, National Cancer Institute, National Institutes of Health, Suite 4E102, 9609 Medical Center Drive, MSC 9763, Rockville, Maryland 20850, United States
- Jianguo Xia
- Faculty of Agricultural and Environmental Sciences, McGill University, 21111 Lakeshore Road, Ste. Anne de Bellevue, Quebec H9X 3V9, Canada
- Jamey D Young
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, PMB 351604, 2301 Vanderbilt Place, Nashville, Tennessee 37235, United States; Department of Molecular Physiology and Biophysics, Vanderbilt University, PMB 351604, 2301 Vanderbilt Place, Nashville, Tennessee 37235, United States
15
Southam AD, Pursell H, Frigerio G, Jankevics A, Weber RJM, Dunn WB. Characterization of Monophasic Solvent-Based Tissue Extractions for the Detection of Polar Metabolites and Lipids Applying Ultrahigh-Performance Liquid Chromatography-Mass Spectrometry Clinical Metabolic Phenotyping Assays. J Proteome Res 2020; 20:831-840. [PMID: 33236910 DOI: 10.1021/acs.jproteome.0c00660] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]
Abstract
Metabolic phenotyping of tissues uses metabolomics and lipidomics to measure the relative polar and nonpolar (lipid) metabolite levels in biological samples. This approach aims to understand disease biochemistry and identify biochemical markers of disease. Sample preparation methods must be reproducible, sensitive (high metabolite and lipid yield), and ideally rapid. We evaluated three biphasic methods for polar and nonpolar compound extraction (chloroform/methanol/water, dichloromethane/methanol/water, and methyl tert-butyl ether [MTBE]/methanol/water), a monophasic method for polar compound extraction (acetonitrile/methanol/water), and a monophasic method for nonpolar compound extraction (isopropanol/water). All methods were applied to mammalian heart, kidney, and liver tissues. Polar extracts were analyzed by hydrophilic interaction chromatography (HILIC) ultrahigh-performance liquid chromatography-mass spectrometry (UHPLC-MS) and nonpolar extracts by C18 reversed-phase UHPLC-MS. Method reproducibility and yield were assessed using multiple annotated endogenous compounds (putatively and MS/MS annotated). Monophasic methods had the highest yield and high reproducibility for both polar (positive ion: median relative standard deviation (RSD) < 18%; negative ion: median RSD < 28%) and nonpolar (positive and negative ion: median RSD < 15%) extractions for heart, kidneys, and liver. The polar monophasic method extracted higher levels of lipid than biphasic polar extractions, and these lipids caused minimal detection suppression for other compounds during HILIC UHPLC-MS. The nonpolar monophasic method had similar or greater detection responses of all detected lipid classes compared to biphasic methods (including increased phosphatidylinositol, phosphatidylserine, and cardiolipin responses). Monophasic methods are quicker and simpler than biphasic methods and are therefore most suited for future automation.
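The reproducibility metric quoted in this abstract, the median relative standard deviation (RSD) across annotated compounds, can be sketched as follows. The compound names and intensities are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not state which estimator the authors used.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation (%): sample SD divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak intensities: compound -> replicate extractions.
replicates = {
    "compound_a": [100.0, 110.0, 90.0],
    "compound_b": [200.0, 260.0, 140.0],
    "compound_c": [50.0, 55.0, 45.0],
}

# RSD per compound, then the median over all compounds.
per_compound = {name: rsd_percent(v) for name, v in replicates.items()}
median_rsd = median(per_compound.values())
```

Reporting the median rather than the mean keeps a single poorly reproduced compound (here `compound_b`) from dominating the summary.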
Affiliation(s)
- Andrew D Southam
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom; Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Harriet Pursell
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Gianfranco Frigerio
- Department of Clinical Sciences and Community Health, Università degli Studi di Milano, Milan 20122, Italy
- Andris Jankevics
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom; Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Ralf J M Weber
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom; Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
- Warwick B Dunn
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom; Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom; Institute of Metabolism and Systems Research, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom
16
Choudhury R, Beezley J, Davis B, Tomeck J, Gratzl S, Golzarri-Arroyo L, Wan J, Raftery D, Baumes J, O'Connell TM. Viime: Visualization and Integration of Metabolomics Experiments. Journal of Open Source Software 2020; 5:2410. [PMID: 33768193 PMCID: PMC7990241 DOI: 10.21105/joss.02410] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/12/2023]
Abstract
Metabolomics involves the comprehensive measurement of metabolites from a biological system. The resulting metabolite profiles are influenced by genetics, lifestyle, biological stresses, disease, diet and the environment and therefore provide a more holistic biological readout of the pathological condition of the organism (Beger et al., 2016; Wishart, 2016). The challenge for metabolomics is that no single analytical platform can provide truly comprehensive coverage of the metabolome. The most commonly used platforms are based on mass spectrometry (MS) and nuclear magnetic resonance (NMR). Investigators are increasingly using both methods to increase the metabolite coverage. The challenge for this type of multi-platform approach is that the data structure may be very different in these two platforms. For example, NMR data may be reported as a list of spectral features, e.g., bins or peaks with arbitrary intensity units, or more directly as named metabolites reported in concentration units ranging from micromolar to millimolar. Some MS approaches can also provide data in the form of identified metabolite concentrations, but given the superior sensitivity of MS, the concentrations can be several orders of magnitude lower than for NMR. Other MS approaches yield data in the form of arbitrary response units where the dynamic range can be more than 6 orders of magnitude. Importantly, the variability and reproducibility of the data may differ across platforms. Given the diversity of data structures (i.e., magnitude and dynamic range), integrating the data from multiple platforms can be challenging. This often leads investigators to analyze the datasets separately, which prevents the observation of potentially interesting relationships and correlations between metabolites detected on different platforms.
Viime (VIsualization and Integration of Metabolomics Experiments) is an open-source, web-based application designed to integrate metabolomics data from multiple platforms. The workflow of Viime for data integration and visualization is shown in Figure 1.
Affiliation(s)
- Lilian Golzarri-Arroyo
- Department of Epidemiology and Biostatistics, Indiana University School of Public Health
- Jun Wan
- Department of Medical and Molecular Genetics, Indiana University School of Medicine
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine
- Department of BioHealth Informatics, Indiana University School of Informatics and Computing
- Daniel Raftery
- Department of Anesthesiology and Pain Medicine, University of Washington
- Thomas M O'Connell
- Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine
17
Barnett CB, Senapathi T, Naidoo KJ. Comparative ligand structural analytics illustrated on variably glycosylated MUC1 antigen-antibody binding. Beilstein J Org Chem 2020; 16:2540-2550. [PMID: 33133286 PMCID: PMC7590620 DOI: 10.3762/bjoc.16.206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/09/2020] [Accepted: 09/30/2020] [Indexed: 01/03/2023]
Abstract
When faced with the investigation of the preferential binding of a series of ligands against a known target, the solution is not always evident from single structure analysis. An ensemble of structures generated from computer simulations is valuable; however, visual analysis of the extensive structural data can be overwhelming. Rapid analysis of trajectory data, with tools available in the Galaxy platform, can be used to understand key features and compare differences that inform the preferential ligand structure that favors binding. We illustrate this informatics approach by investigating the in-silico binding of a peptide and glycopeptide epitope of the glycoprotein Mucin 1 (MUC1) with the antibody AR20.5. To study the binding, we performed molecular dynamics simulations using OpenMM and then used the Galaxy platform for data analysis. The same analysis tools were applied to each of the simulation trajectories, and this process was streamlined by using Galaxy workflows. The conformations of the antigens were analyzed using root-mean-square deviation, end-to-end distance, Ramachandran plots, and hydrogen bonding analysis. Additionally, root-mean-square fluctuation (RMSF) and clustering analysis were carried out. These analyses were used to rapidly assess key features of the system, interrogate the dynamic structure of the ligand, and determine the role of glycosylation on the conformational equilibrium. The glycopeptide conformations in solution change relative to the peptide; thus, partial pre-structuring is seen prior to binding. Although the bound conformation of peptide and glycopeptide is similar, the glycopeptide fluctuates less and resides in specific conformers for more extended periods. This structural analysis, which gives a high-level view of the features in the system under observation, could be readily applied to other binding problems as part of a general strategy in drug design or mechanistic analysis.
Affiliation(s)
- Christopher B Barnett
- Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa
- Tharindu Senapathi
- Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa
- Kevin J Naidoo
- Scientific Computing Research Unit and Department of Chemistry, University of Cape Town, Rondebosch, 7701, South Africa; Infectious Disease and Molecular Medicine, Faculty of Health Science, University of Cape Town, Rondebosch, 7701, South Africa
18
Liang D, Liu Q, Zhou K, Jia W, Xie G, Chen T. IP4M: an integrated platform for mass spectrometry-based metabolomics data mining. BMC Bioinformatics 2020; 21:444. [PMID: 33028191 PMCID: PMC7542974 DOI: 10.1186/s12859-020-03786-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 02/21/2020] [Accepted: 09/28/2020] [Indexed: 12/15/2022]
Abstract
Background: Metabolomics data analyses rely on the use of bioinformatics tools. Many integrated multi-functional tools have been developed for untargeted metabolomics data processing and have been widely used. More alternative platforms are expected for both basic and advanced users.
Results: Integrated mass spectrometry-based untargeted metabolomics data mining (IP4M) software was designed and developed. IP4M has 62 functions categorized into 8 modules, covering all the steps of metabolomics data mining, including raw data preprocessing (alignment, peak de-convolution, peak picking, and isotope filtering), peak annotation, peak table preprocessing, basic statistical description, classification and biomarker detection, correlation analysis, cluster and sub-cluster analysis, regression analysis, ROC analysis, pathway and enrichment analysis, and sample size and power analysis. Additionally, a KEGG-derived metabolic reaction database was embedded, and a series of ratio variables (product/substrate) can be generated with enlarged information on enzyme activity. A new method, GRaMM, for correlation analysis between metabolome and microbiome data was also provided. IP4M provides both a number of parameters for customized and refined analysis (for expert users) and 4 simplified workflows with few key parameters (for beginners who are unfamiliar with computational metabolomics). The performance of IP4M was evaluated and compared with existing computational platforms using 2 data sets derived from standards mixture and 2 data sets derived from serum samples, from GC–MS and LC–MS respectively.
Conclusion: IP4M is powerful, modularized, customizable and easy to use. It is a good choice for metabolomics data processing and analysis. Free versions for Windows, macOS, and Linux systems are provided.
Affiliation(s)
- Dandan Liang
- Shanghai Key Laboratory of Diabetes Mellitus and Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
- Quan Liu
- Human Metabolomics Institute, Inc., Shenzhen, 518109, Guangdong, China
- Kejun Zhou
- Human Metabolomics Institute, Inc., Shenzhen, 518109, Guangdong, China
- Wei Jia
- Shanghai Key Laboratory of Diabetes Mellitus and Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
- Guoxiang Xie
- Human Metabolomics Institute, Inc., Shenzhen, 518109, Guangdong, China
- Tianlu Chen
- Shanghai Key Laboratory of Diabetes Mellitus and Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
19
|
Creydt M, Fischer M. Food Phenotyping: Recording and Processing of Non-Targeted Liquid Chromatography Mass Spectrometry Data for Verifying Food Authenticity. Molecules 2020; 25:E3972. [PMID: 32878155 PMCID: PMC7504784 DOI: 10.3390/molecules25173972] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/06/2020] [Revised: 08/27/2020] [Accepted: 08/28/2020] [Indexed: 12/11/2022] Open
Abstract
Experiments based on metabolomics represent powerful approaches to the experimental verification of the integrity of food. In particular, high-resolution non-targeted analyses, which are carried out by means of liquid chromatography-mass spectrometry systems (LC-MS), offer a variety of options. However, an enormous amount of data is recorded, which must be processed in a correspondingly complex manner. The evaluation of LC-MS based non-targeted data is not entirely trivial and a wide variety of strategies have been developed that can be used in this regard. In this paper, an overview of the mandatory steps regarding data acquisition is given first, followed by a presentation of the required preprocessing steps for data evaluation. Then some multivariate analysis methods are discussed, which have proven to be particularly suitable in this context in recent years. The publication closes with information on the identification of marker compounds.
Affiliation(s)
- Marina Creydt
  - Hamburg School of Food Science-Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
  - Center for Hybrid Nanostructures (CHyN), Department of Physics, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- Markus Fischer
  - Hamburg School of Food Science-Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
  - Center for Hybrid Nanostructures (CHyN), Department of Physics, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
|
20
|
Plyushchenko I, Shakhmatov D, Bolotnik T, Baygildiev T, Nesterenko PN, Rodin I. An approach for feature selection with data modelling in LC-MS metabolomics. Analytical Methods: Advancing Methods and Applications 2020; 12:3582-3591. [PMID: 32701078 DOI: 10.1039/d0ay00204f] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
A data processing workflow for LC-MS-based metabolomics studies is suggested, comprising signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The proposed approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features, together with validation via Receiver Operating Characteristic analysis for selected predictors, cross-validation and unsupervised projection. The workflow was initially optimised on the authors' own experimental dataset and then successfully tested on 36 datasets from 21 publicly available metabolomics projects. The suggested workflow can be used for classification purposes in high-dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion.
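Validating selected predictors by Receiver Operating Characteristic analysis can be sketched in a few lines of Python. This is a generic illustration, not the authors' implementation: the feature names, intensities and class labels are made up, and the AUC is computed by plain pairwise comparison (the Mann-Whitney formulation) rather than with any particular library.

```python
def roc_auc(scores, labels):
    """AUC as P(case score > control score) + 0.5 * P(tie), by pairwise comparison."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical annotation-free peak table: feature -> intensity per sample.
labels = [0, 0, 1, 1]  # 0 = control, 1 = case
features = {
    "feature_A": [1.0, 2.0, 3.0, 4.0],
    "feature_B": [1.0, 3.0, 2.0, 4.0],
    "feature_C": [4.0, 3.0, 2.0, 1.0],
}

auc = {name: roc_auc(values, labels) for name, values in features.items()}

# Rank features by how far their AUC lies from 0.5 (no discrimination).
ranked = sorted(auc, key=lambda name: abs(auc[name] - 0.5), reverse=True)
```

An AUC near 0 is as informative as one near 1 (the feature is simply lower in cases), which is why the ranking uses the distance from 0.5.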
Affiliation(s)
- Ivan Plyushchenko
  - Lomonosov Moscow State University, Chemistry Department, 119992, GSP-2, Lenin Hills, 1b3, Moscow, Russia
|
21
|
McGee EE, Kiblawi R, Playdon MC, Eliassen AH. Nutritional Metabolomics in Cancer Epidemiology: Current Trends, Challenges, and Future Directions. Curr Nutr Rep 2020; 8:187-201. [PMID: 31129888 DOI: 10.1007/s13668-019-00279-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Abstract
PURPOSE OF REVIEW Metabolomics offers several opportunities for advancement in nutritional cancer epidemiology; however, numerous research gaps and challenges remain. This narrative review summarizes current research, challenges, and future directions for epidemiologic studies of nutritional metabolomics and cancer. RECENT FINDINGS Although many studies have used metabolomics to investigate either dietary exposures or cancer, few studies have explicitly investigated diet-cancer relationships using metabolomics. Most studies have been relatively small (≤ ~ 250 cases) or have assessed a limited number of nutritional metabolites (e.g., coffee or alcohol-related metabolites). Nutritional metabolomic investigations of cancer face several challenges in study design; biospecimen selection, handling, and processing; diet and metabolite measurement; statistical analyses; and data sharing and synthesis. More metabolomics studies linking dietary exposures to cancer risk, prognosis, and survival are needed, as are biomarker validation studies, longitudinal analyses, and methodological studies. Despite the remaining challenges, metabolomics offers a promising avenue for future dietary cancer research.
Affiliation(s)
- Emma E McGee
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rama Kiblawi
  - Division of Cancer Population Sciences, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
- Mary C Playdon
  - Division of Cancer Population Sciences, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
  - Department of Nutrition and Integrative Physiology, University of Utah, Salt Lake City, UT, USA
- A Heather Eliassen
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
|
22
|
Tarazona S, Balzano-Nogueira L, Gómez-Cabrero D, Schmidt A, Imhof A, Hankemeier T, Tegnér J, Westerhuis JA, Conesa A. Harmonization of quality metrics and power calculation in multi-omic studies. Nat Commun 2020; 11:3092. [PMID: 32555183 PMCID: PMC7303201 DOI: 10.1038/s41467-020-16937-8] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 07/27/2018] [Accepted: 05/29/2020] [Indexed: 12/20/2022] Open
Abstract
Multi-omic studies combine measurements at different molecular levels to build comprehensive models of cellular systems. The success of a multi-omic data analysis strategy depends largely on the adoption of adequate experimental designs, and on the quality of the measurements provided by the different omic platforms. However, the field lacks a comparative description of performance parameters across omic technologies and a formulation for experimental design in multi-omic data scenarios. Here, we propose a set of harmonized Figures of Merit (FoM) as quality descriptors applicable to different omic data types. Employing this information, we formulate the MultiPower method to estimate and assess the optimal sample size in a multi-omics experiment. MultiPower supports different experimental settings, data types and sample sizes, and includes graphical tools for experimental design decision-making. MultiPower is complemented with MultiML, an algorithm to estimate sample size for machine learning classification problems based on multi-omic data.
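For orientation, the basic idea of a sample size calculation can be sketched with the textbook normal-approximation formula for comparing two group means. This is explicitly not the MultiPower method, which the abstract describes as handling several omic data types and experimental settings; the function name `samples_per_group` and its defaults are assumptions made for this illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (delta / sigma). Uses the
    normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_large_effect = samples_per_group(1.0)   # well-separated groups
n_medium_effect = samples_per_group(0.5)  # halving the effect roughly quadruples n
```

In an omics setting this per-feature calculation would still need to account for multiple testing and for feature-specific variability, which is part of what dedicated tools address.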
Affiliation(s)
- Sonia Tarazona
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Leandro Balzano-Nogueira
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
- David Gómez-Cabrero
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Solna, Sweden
  - Mucosal & Salivary Biology Division, King's College London Dental Institute, London, UK
  - Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
- Andreas Schmidt
  - Protein Analysis Unit, Biomedical Center, Faculty of Medicine, LMU Munich, Planegg-Martinsried, Germany
- Axel Imhof
  - Protein Analysis Unit, Biomedical Center, Faculty of Medicine, LMU Munich, Planegg-Martinsried, Germany
  - Munich Center of Integrated Protein Science LMU Munich, Planegg-Martinsried, Germany
- Thomas Hankemeier
  - Division Analytical Biosciences, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands
- Jesper Tegnér
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Solna, Sweden
  - Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Johan A Westerhuis
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Statistics, Faculty of Natural Sciences, North-West University (Potchefstroom Campus), Potchefstroom, South Africa
- Ana Conesa
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
|
23
|
Perez-Riverol Y, Moreno P. Scalable Data Analysis in Proteomics and Metabolomics Using BioContainers and Workflows Engines. Proteomics 2020; 20:e1900147. [PMID: 31657527 PMCID: PMC7613303 DOI: 10.1002/pmic.201900147] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/12/2019] [Revised: 09/30/2019] [Indexed: 12/29/2022]
Abstract
The recent improvements in mass spectrometry instruments and new analytical methods are increasing the intersection between proteomics and big data science. In addition, bioinformatics analysis is becoming increasingly complex and convoluted, involving multiple algorithms and tools. A wide variety of methods and software tools have been developed for computational proteomics and metabolomics during recent years, and this trend is likely to continue. However, most of the computational proteomics and metabolomics tools are designed as single-tiered software applications where the analytics tasks cannot be distributed, limiting the scalability and reproducibility of the data analysis. In this paper the key steps of metabolomics and proteomics data processing, including the main tools and software used to perform the data analysis, are summarized. The combination of software containers with workflow environments for large-scale metabolomics and proteomics analysis is discussed. Finally, a new approach for reproducible and large-scale data analysis based on BioContainers and two of the most popular workflow environments, Galaxy and Nextflow, is introduced to the proteomics and metabolomics communities.
Affiliation(s)
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Pablo Moreno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
24
|
McLean C, Kujawinski EB. AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing. Anal Chem 2020; 92:5724-5732. [PMID: 32212641 PMCID: PMC7310949 DOI: 10.1021/acs.analchem.9b04804] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 12/15/2022]
Abstract
Untargeted metabolomics experiments provide a snapshot of cellular metabolism but remain challenging to interpret due to the computational complexity involved in data processing and analysis. Prior to any interpretation, raw data must be processed to remove noise and to align mass-spectral peaks across samples. This step requires selection of dataset-specific parameters, as erroneous parameters can result in noise inflation. While several algorithms exist to automate parameter selection, each depends on gradient descent optimization functions. In contrast, our new parameter optimization algorithm, AutoTuner, obtains parameter estimates from raw data in a single step as opposed to many iterations. Here, we tested the accuracy and the run-time of AutoTuner in comparison to isotopologue parameter optimization (IPO), the most commonly used parameter selection tool, and compared the resulting parameters’ influence on the properties of feature tables after processing. We performed a Monte Carlo experiment to test the robustness of AutoTuner parameter selection and found that AutoTuner generated similar parameter estimates from random subsets of samples. We conclude that AutoTuner is a desirable alternative to existing tools, because it is scalable, highly robust, and very fast (∼100–1000× speed improvement from other algorithms going from days to minutes). AutoTuner is freely available as an R package through BioConductor.
Affiliation(s)
- Craig McLean
  - Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
  - MIT/WHOI Joint Program in Oceanography/Applied Ocean Science and Engineering, Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Elizabeth B Kujawinski
  - Department of Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
|
25
|
Stuart KA, Welsh K, Walker MC, Edrada-Ebel R. Metabolomic tools used in marine natural product drug discovery. Expert Opin Drug Discov 2020; 15:499-522. [PMID: 32026730 DOI: 10.1080/17460441.2020.1722636] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 02/07/2023]
Abstract
Introduction: The marine environment is a very promising resource for natural product research, with many marine natural products reaching the market as new drugs, especially in the field of cancer therapy, as well as entering the drug discovery pipeline for new antimicrobials. Exploitation of bioactive marine compounds with unique structures and novel bioactivity, such as the isoquinoline alkaloid trabectedin, the polyether macrolide halichondrin B, and the peptide dolastatin 10, requires the use of analytical techniques that can generate unbiased, quantitative, and qualitative data to benefit the biodiscovery process. Metabolomics has been shown to bridge this understanding and facilitate the development of new potential drugs from marine sources and particularly their microbial symbionts. Areas covered: In this review, articles on applied secondary metabolomics from 1990 to 2018, as well as up to the last quarter of 2019, were probed to investigate the impact of metabolomics on drug discovery for new antibiotics and cancer treatment. Expert opinion: The current literature review highlighted the effectiveness of metabolomics in targeting biologically active secondary metabolites from marine sources for optimized discovery of potential new natural products to be made accessible to an R&D pipeline.
Affiliation(s)
- Kevin Andrew Stuart
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
- Keira Welsh
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
- Molly Clare Walker
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
- RuAngelie Edrada-Ebel
  - Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
|
26
|
Verhoeven A, Giera M, Mayboroda OA. Scientific workflow managers in metabolomics: an overview. Analyst 2020; 145:3801-3808. [DOI: 10.1039/d0an00272k] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Metabolomics workflows for data processing reproducibility and accelerated clinical deployment.
Affiliation(s)
- Aswin Verhoeven
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, The Netherlands
- Martin Giera
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, The Netherlands
- Oleg A. Mayboroda
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, Leiden, The Netherlands
|
27
|
Open-Source Software Tools, Databases, and Resources for Single-Cell and Single-Cell-Type Metabolomics. Methods Mol Biol 2020; 2064:191-217. [PMID: 31565776 DOI: 10.1007/978-1-4939-9831-9_15] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023]
Abstract
In this age of -omics data-guided big data revolution, metabolomics has received significant attention as compared to genomics, transcriptomics, and proteomics for its proximity to the phenotype, the promises it makes and the challenges it throws. Although metabolomes of entire organisms, organs, biofluids, and tissues are of immense interest, a cell-specific resolution is deemed critical for biomedical applications where a granular understanding of cellular metabolism at cell-type and subcellular resolution is desirable. Mass spectrometry (MS) is a versatile technique that is used to analyze a broad range of compounds from different species and cell-types, with high accuracy, resolution, sensitivity, selectivity, and fast data acquisition speeds. With recent advances in MS and spectroscopy-based platforms, the research community is able to generate high-throughput data sets from single cells. However, it is challenging to handle, store, process, analyze, and interpret data in a routine manner. In this treatise, I present a workflow of metabolomics data generation from single cells and single-cell types to their analysis, visualization, and interpretation for obtaining biological insights.
|
28
|
Southam AD, Haglington LD, Najdekr L, Jankevics A, Weber RJM, Dunn WB. Assessment of human plasma and urine sample preparation for reproducible and high-throughput UHPLC-MS clinical metabolic phenotyping. Analyst 2020; 145:6511-6523. [DOI: 10.1039/d0an01319f] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/27/2022]
Abstract
In this study we assess multiple sample preparation methods for UHPLC-MS metabolic phenotyping analysis of human urine and plasma. All methods are discussed in terms of metabolite and lipid coverage and reproducibility.
Affiliation(s)
- Andrew D. Southam
  - School of Biosciences, University of Birmingham, Birmingham, UK
  - Phenome Centre Birmingham
- Lukáš Najdekr
  - School of Biosciences, University of Birmingham, Birmingham, UK
  - Phenome Centre Birmingham
- Andris Jankevics
  - School of Biosciences, University of Birmingham, Birmingham, UK
  - Phenome Centre Birmingham
- Ralf J. M. Weber
  - School of Biosciences, University of Birmingham, Birmingham, UK
  - Phenome Centre Birmingham
- Warwick B. Dunn
  - School of Biosciences, University of Birmingham, Birmingham, UK
  - Phenome Centre Birmingham
|
29
|
Razzaq A, Sadia B, Raza A, Khalid Hameed M, Saleem F. Metabolomics: A Way Forward for Crop Improvement. Metabolites 2019; 9:E303. [PMID: 31847393 PMCID: PMC6969922 DOI: 10.3390/metabo9120303] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 09/13/2019] [Revised: 12/02/2019] [Accepted: 12/11/2019] [Indexed: 12/15/2022] Open
Abstract
Metabolomics is an emerging branch of "omics" and it involves identification and quantification of metabolites and chemical footprints of cellular regulatory processes in different biological species. The metabolome is the total metabolite pool in an organism, which can be measured to characterize genetic or environmental variations. Metabolomics plays a significant role in exploring environment-gene interactions, mutant characterization, phenotyping, identification of biomarkers, and drug discovery. Metabolomics is a promising approach to decipher various metabolic networks that are linked with biotic and abiotic stress tolerance in plants. In this context, metabolomics-assisted breeding enables efficient screening for yield and stress tolerance of crops at the metabolic level. Advanced metabolomics analytical tools, like non-destructive nuclear magnetic resonance spectroscopy (NMR), liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), high performance liquid chromatography (HPLC), and direct flow injection (DFI) mass spectrometry, have sped up metabolic profiling. Presently, integrating metabolomics with post-genomics tools has enabled efficient dissection of genetic and phenotypic association in crop plants. This review provides insight into the state-of-the-art plant metabolomics tools for crop improvement. Here, we describe the workflow of plant metabolomics research focusing on the elucidation of biotic and abiotic stress tolerance mechanisms in plants. Furthermore, the potential of metabolomics-assisted breeding for crop improvement and its future applications in speed breeding are also discussed. Possible bottlenecks and future prospects of plant metabolomics are also mentioned.
Affiliation(s)
- Ali Razzaq
  - Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad 38040, Pakistan
- Bushra Sadia
  - Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad 38040, Pakistan
- Ali Raza
  - Oil Crops Research Institute, Chinese Academy of Agricultural Sciences (CAAS), Wuhan 430062, China
- Muhammad Khalid Hameed
  - School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Fozia Saleem
  - Centre of Agricultural Biochemistry and Biotechnology (CABB), University of Agriculture, Faisalabad 38040, Pakistan
|
30
|
Multi-Omics Integration Reveals Short and Long-Term Effects of Gestational Hypoxia on the Heart Development. Cells 2019; 8:cells8121608. [PMID: 31835778 PMCID: PMC6952773 DOI: 10.3390/cells8121608] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/01/2019] [Revised: 12/09/2019] [Accepted: 12/10/2019] [Indexed: 12/13/2022] Open
Abstract
Antenatal hypoxia caused epigenetic reprogramming of the methylome and transcriptome in the developing heart and increased the risk of heart disease later in life. Herein, we investigated the impact of gestational hypoxia on the proteome and metabolome in the hearts of fetuses and adult offspring. Pregnant rats were treated with normoxia or hypoxia (10.5% O2) from day 15 to 21 of gestation. Hearts were isolated from near-term fetuses and 5-month-old offspring, and proteomic and metabolomic profiles were determined. The data demonstrated that antenatal hypoxia altered proteomic and metabolomic profiles in the heart, impacting energy metabolism, lipid metabolism, oxidative stress, and inflammation-related pathways in a development- and sex-dependent manner. Of importance, integrating multi-omics data from transcriptomics, proteomics, and metabolomics profiling revealed reprogramming of the mitochondrion, especially in two clusters: (a) the cluster associated with "mitochondrial translation"/"aminoacyl t-RNA biosynthesis"/"one-carbon pool of folate"/"DNA methylation"; and (b) the cluster with "mitochondrion"/"TCA cycle and respiratory electron transfer"/"acyl-CoA dehydrogenase"/"oxidative phosphorylation"/"complex I"/"troponin myosin cardiac complex". Our study provides a powerful means of multi-omics data integration and reveals new insights into phenotypic reprogramming of the mitochondrion in the developing heart by fetal hypoxia, contributing to an increase in the heart's vulnerability to disease later in life.
|
31
|
Föll MC, Moritz L, Wollmann T, Stillger MN, Vockert N, Werner M, Bronsert P, Rohr K, Grüning BA, Schilling O. Accessible and reproducible mass spectrometry imaging data analysis in Galaxy. Gigascience 2019; 8:giz143. [PMID: 31816088 PMCID: PMC6901077 DOI: 10.1093/gigascience/giz143] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/05/2019] [Revised: 09/10/2019] [Accepted: 11/10/2019] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Mass spectrometry imaging is increasingly used in biological and translational research because it has the ability to determine the spatial distribution of hundreds of analytes in a sample. Being at the interface of proteomics/metabolomics and imaging, the acquired datasets are large and complex and often analyzed with proprietary software or in-house scripts, which hinders reproducibility. Open source software solutions that enable reproducible data analysis often require programming skills and are therefore not accessible to many mass spectrometry imaging (MSI) researchers. FINDINGS We have integrated 18 dedicated mass spectrometry imaging tools into the Galaxy framework to allow accessible, reproducible, and transparent data analysis. Our tools are based on Cardinal, MALDIquant, and scikit-image and enable all major MSI analysis steps such as quality control, visualization, preprocessing, statistical analysis, and image co-registration. Furthermore, we created hands-on training material for use cases in proteomics and metabolomics. To demonstrate the utility of our tools, we re-analyzed a publicly available N-linked glycan imaging dataset. By providing the entire analysis history online, we highlight how the Galaxy framework fosters transparent and reproducible research. CONCLUSION The Galaxy framework has emerged as a powerful analysis platform for the analysis of MSI data with ease of use and access, together with high levels of reproducibility and transparency.
Affiliation(s)
- Melanie Christine Föll
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
- Lennart Moritz
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
- Thomas Wollmann
  - Biomedical Computer Vision Group, BioQuant, IPMB, Heidelberg University, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Maren Nicole Stillger
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - Faculty of Biology, University of Freiburg, Schänzlestraße 1, 79104 Freiburg, Germany
  - Institute of Molecular Medicine and Cell Research, Faculty of Medicine, University of Freiburg, Stefan-Meier-Straße 17, 79104 Freiburg, Germany
- Niklas Vockert
  - Biomedical Computer Vision Group, BioQuant, IPMB, Heidelberg University, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Martin Werner
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - Faculty of Medicine - University of Freiburg, Breisacher Straße 153, 79110 Freiburg, Germany
  - Tumorbank Comprehensive Cancer Center Freiburg, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Hugstetter Straße 55, 79106 Freiburg, Germany
- Peter Bronsert
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - Faculty of Medicine - University of Freiburg, Breisacher Straße 153, 79110 Freiburg, Germany
  - Tumorbank Comprehensive Cancer Center Freiburg, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Hugstetter Straße 55, 79106 Freiburg, Germany
- Karl Rohr
  - Biomedical Computer Vision Group, BioQuant, IPMB, Heidelberg University, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Björn Andreas Grüning
  - Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Oliver Schilling
  - Institute of Surgical Pathology, Medical Center – University of Freiburg, Breisacher Straße 115a, 79106 Freiburg, Germany
  - Faculty of Medicine - University of Freiburg, Breisacher Straße 153, 79110 Freiburg, Germany
  - German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), Hugstetter Straße 55, 79106 Freiburg, Germany
|
32
|
Ali A, Abouleila Y, Shimizu Y, Hiyama E, Emara S, Mashaghi A, Hankemeier T. Single-cell metabolomics by mass spectrometry: Advances, challenges, and future applications. Trends Analyt Chem 2019. [DOI: 10.1016/j.trac.2019.02.033] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 02/08/2023]
|
33
|
Chong J, Xia J. MetaboAnalystR: an R package for flexible and reproducible analysis of metabolomics data. Bioinformatics 2019; 34:4313-4314. [PMID: 29955821 PMCID: PMC6289126 DOI: 10.1093/bioinformatics/bty528] [Citation(s) in RCA: 412] [Impact Index Per Article: 82.4] [Received: 11/20/2017] [Accepted: 06/27/2018] [Indexed: 11/24/2022] Open
Abstract
Summary: The MetaboAnalyst web application has been widely used for metabolomics data analysis and interpretation. Despite its user-friendliness, the web interface has inherent limitations (especially for advanced users) with regard to flexibility in creating customized workflows, support for reproducible analysis, and capacity for dealing with large data. To address these limitations, we have developed a companion R package (MetaboAnalystR) based on the R code base of the web server. The package has been thoroughly tested to ensure that the same R commands will produce identical results from both interfaces. MetaboAnalystR complements the MetaboAnalyst web server to facilitate transparent, flexible and reproducible analysis of metabolomics data. Availability and implementation: MetaboAnalystR is freely available from https://github.com/xia-lab/MetaboAnalystR.
Affiliation(s)
| | - Jianguo Xia
- To whom correspondence should be addressed. E-mail:
34
Cardoso S, Afonso T, Maraschin M, Rocha M. WebSpecmine: A Website for Metabolomics Data Analysis and Mining. Metabolites 2019; 9:metabo9100237. [PMID: 31635085 PMCID: PMC6835413 DOI: 10.3390/metabo9100237] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/08/2019] [Revised: 10/09/2019] [Accepted: 10/15/2019] [Indexed: 11/16/2022] Open
Abstract
Metabolomics data analysis is an important task in biomedical research. The available tools do not provide a wide variety of methods and data types, nor ways to store and share data and results generated. Thus, we have developed WebSpecmine to overcome the aforementioned limitations. WebSpecmine is a web-based application designed to perform the analysis of metabolomics data based on spectroscopic and chromatographic techniques (NMR, Infrared, UV-visible, and Raman, and LC/GC-MS) and compound concentrations. Users, even those not possessing programming skills, can access several analysis methods including univariate, unsupervised and supervised multivariate statistical analysis, as well as metabolite identification and pathway analysis, and can create accounts to store their data and results, either privately or publicly. The tool's implementation is based on the R project, including its shiny web-based framework. WebSpecmine is freely available, supporting all major browsers. We provide abundant documentation, including tutorials and a user guide with case studies.
Affiliation(s)
- Sara Cardoso
- CEB-Centre Biological Engineering, University of Minho, 4710-057 Braga, Portugal.
| | - Telma Afonso
- CEB-Centre Biological Engineering, University of Minho, 4710-057 Braga, Portugal.
| | - Marcelo Maraschin
- Plant Morphogenesis and Biochemistry Laboratory, Federal University of Santa Catarina, Florianópolis SC 88040-900, Brazil.
| | - Miguel Rocha
- CEB-Centre Biological Engineering, University of Minho, 4710-057 Braga, Portugal.
35
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022] Open
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
- Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
| | - Corey D Broeckling
- Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
| | - Rick Helmus
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
| | - Nils Hoffmann
- Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
| | - Ewy Mathé
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
| | - Thomas Naake
- Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
| | - Luca Nicolotti
- The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
| | - Kristian Peters
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Johannes Rainer
- Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
| | - Reza M Salek
- The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
| | - Tobias Schulze
- Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
| | - Emma L Schymanski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
| | - Michael A Stravs
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dubendorf, Switzerland.
| | - Etienne A Thévenot
- CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
| | - Hendrik Treutler
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
| | - Ralf J M Weber
- Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
| | - Egon Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
| | - Michael Witting
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
- Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
| | - Steffen Neumann
- Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
36
Mendez KM, Pritchard L, Reinke SN, Broadhurst DI. Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing. Metabolomics 2019; 15:125. [PMID: 31522294 PMCID: PMC6745024 DOI: 10.1007/s11306-019-1588-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 05/30/2019] [Accepted: 09/07/2019] [Indexed: 12/20/2022]
Abstract
BACKGROUND: A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike. AIM OF REVIEW: To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science. KEY SCIENTIFIC CONCEPTS OF REVIEW: This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.
Affiliation(s)
- Kevin M Mendez
- Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia
| | - Leighton Pritchard
- Strathclyde Institute of Pharmacy & Biomedical Sciences, University of Strathclyde, Cathedral Street, Glasgow, G1 1XQ, Scotland, UK
| | - Stacey N Reinke
- Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia.
| | - David I Broadhurst
- Centre for Metabolomics & Computational Biology, School of Science, Edith Cowan University, Joondalup, 6027, Australia.
37
Chong J, Soufan O, Li C, Caraus I, Li S, Bourque G, Wishart DS, Xia J. MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Res 2019; 46:W486-W494. [PMID: 29762782 PMCID: PMC6030889 DOI: 10.1093/nar/gky310] [Citation(s) in RCA: 2604] [Impact Index Per Article: 520.8] [Received: 01/31/2018] [Accepted: 04/13/2018] [Indexed: 02/06/2023] Open
Abstract
We present a new update to MetaboAnalyst (version 4.0) for comprehensive metabolomic data analysis, interpretation, and integration with other omics data. Since the last major update in 2015, MetaboAnalyst has continued to evolve based on user feedback and technological advancements in the field. For this year's update, four new key features have been added to MetaboAnalyst 4.0, including: (1) real-time R command tracking and display coupled with the release of a companion MetaboAnalystR package; (2) a MS Peaks to Pathways module for prediction of pathway activity from untargeted mass spectral data using the mummichog algorithm; (3) a Biomarker Meta-analysis module for robust biomarker identification through the combination of multiple metabolomic datasets and (4) a Network Explorer module for integrative analysis of metabolomics, metagenomics, and/or transcriptomics data. The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions. The underlying knowledgebases (compound libraries, metabolite sets, and metabolic pathways) have also been updated based on the latest data from the Human Metabolome Database (HMDB). A Docker image of MetaboAnalyst is also available to facilitate download and local installation of MetaboAnalyst. MetaboAnalyst 4.0 is freely available at http://metaboanalyst.ca.
Affiliation(s)
- Jasmine Chong
- Institute of Parasitology, McGill University, Montreal, Québec, Canada
| | - Othman Soufan
- Institute of Parasitology, McGill University, Montreal, Québec, Canada
| | - Carin Li
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Iurie Caraus
- Institute of Parasitology, McGill University, Montreal, Québec, Canada; Canadian Center for Computational Genomics, McGill University, Montreal, Québec, Canada
| | - Shuzhao Li
- Department of Medicine, Emory University School of Medicine, Atlanta, Georgia, USA
| | - Guillaume Bourque
- Canadian Center for Computational Genomics, McGill University, Montreal, Québec, Canada; Department of Human Genetics, McGill University, Montreal, Québec, Canada
| | - David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada; Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
| | - Jianguo Xia
- Institute of Parasitology, McGill University, Montreal, Québec, Canada; Canadian Center for Computational Genomics, McGill University, Montreal, Québec, Canada; Department of Animal Science, McGill University, Montreal, Québec, Canada
38
Bell M, Blais JM. "-Omics" workflow for paleolimnological and geological archives: A review. Sci Total Environ 2019; 672:438-455. [PMID: 30965259 DOI: 10.1016/j.scitotenv.2019.03.477] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/01/2018] [Revised: 03/29/2019] [Accepted: 03/30/2019] [Indexed: 06/09/2023]
Abstract
"-Omics" is a powerful screening method with applications in molecular biology, toxicology, wildlife biology, natural product discovery, and many other fields. Genomics, proteomics, metabolomics, and lipidomics are common examples included under the "-omics" umbrella. This screening method uses combinations of untargeted, semi-targeted, and targeted analyses paired with data mining to facilitate researchers' understanding of the genome, proteins, and small organic molecules in biological systems. Recently, however, the use of "-omics" has expanded into the fields of geology, specifically petrology, and paleolimnology. In particular, untargeted analyses stand to transform these fields as petroleomics and sediment-"omics" become more prevalent. "-Omics" facilitates the visualization of small-molecule profiles from environmental matrices (i.e. oil and sediment). Small-molecule profiles can provide improved understanding of small-molecule distributions throughout the environment, and of how those compositions can change depending on conditions (e.g. climate change, weathering). "-Omics" also facilitates the discovery of next-generation biomarkers that can be used for oil source identification and as proxies for reconstructing past environmental changes. Untargeted analyses paired with data mining and multivariate statistical analyses represent a powerful suite of tools for hypothesis generation and new method development for environmental reconstructions. Here we present an introduction to "-omics" methodology, technical terms, and examples of applications to paleolimnology and petrology. The purpose of this review is to highlight the important considerations at each step in the "-omics" workflow to produce high-quality and statistically powerful data for petrological and paleolimnological applications.
Affiliation(s)
- Madison Bell
- Laboratory for the Analysis of Natural and Synthetic Environmental Toxicants, Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada
| | - Jules M Blais
- Laboratory for the Analysis of Natural and Synthetic Environmental Toxicants, Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada.
39
Rinschen MM, Ivanisevic J, Giera M, Siuzdak G. Identification of bioactive metabolites using activity metabolomics. Nat Rev Mol Cell Biol 2019; 20:353-367. [PMID: 30814649 PMCID: PMC6613555 DOI: 10.1038/s41580-019-0108-4] [Citation(s) in RCA: 572] [Impact Index Per Article: 114.4] [Indexed: 12/11/2022]
Abstract
The metabolome, the collection of small-molecule chemical entities involved in metabolism, has traditionally been studied with the aim of identifying biomarkers in the diagnosis and prediction of disease. However, the value of metabolome analysis (metabolomics) has been redefined from a simple biomarker identification tool to a technology for the discovery of active drivers of biological processes. It is now clear that the metabolome affects cellular physiology through modulation of other 'omics' levels, including the genome, epigenome, transcriptome and proteome. In this Review, we focus on recent progress in using metabolomics to understand how the metabolome influences other omics and, by extension, to reveal the active role of metabolites in physiology and disease. This concept of utilizing metabolomics to perform activity screens to identify biologically active metabolites - which we term activity metabolomics - is already having a broad impact on biology.
Affiliation(s)
- Markus M Rinschen
- The Scripps Research Institute, Center for Metabolomics and Mass Spectrometry, La Jolla, CA, USA
| | - Julijana Ivanisevic
- Metabolomics Platform, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland
| | - Martin Giera
- Leiden University Medical Center, Center for Proteomics & Metabolomics, Leiden, Netherlands.
| | - Gary Siuzdak
- The Scripps Research Institute, Center for Metabolomics and Mass Spectrometry, La Jolla, CA, USA.
40
Johnson JE, Kumar P, Easterly C, Esler M, Mehta S, Eschenlauer AC, Hegeman AD, Jagtap PD, Griffin TJ. Improve your Galaxy text life: The Query Tabular Tool. F1000Res 2018; 7:1604. [PMID: 30519459 PMCID: PMC6248266 DOI: 10.12688/f1000research.16450.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 01/02/2019] [Indexed: 11/20/2022] Open
Abstract
Galaxy provides an accessible platform where multi-step data analysis workflows integrating disparate software can be run, even by researchers with limited programming expertise. Applications of such sophisticated workflows are many, including those which integrate software from different 'omic domains (e.g. genomics, proteomics, metabolomics). In these complex workflows, intermediate outputs are often generated as tabular text files, which must be transformed into customized formats that are compatible with the next software tools in the pipeline. Consequently, many text manipulation steps are added to an already complex workflow, overly complicating the process. In some cases, limitations to existing text manipulation are such that desired analyses can only be carried out using highly sophisticated processing steps beyond the reach of even advanced users and developers. For users with some SQL knowledge, these text operations could be combined into a single, concise query on a relational database. As a solution, we have developed the Query Tabular Galaxy tool, which leverages a SQLite database generated from tabular input data. This database can be queried and manipulated to produce transformed and customized tabular outputs compatible with downstream processing steps. Regular expressions can also be utilized for even more sophisticated manipulations, such as find and replace and other filtering actions. Using several Galaxy-based multi-omic workflows as an example, we demonstrate how the Query Tabular tool dramatically streamlines and simplifies the creation of multi-step analyses, efficiently enabling complicated textual manipulations and processing. This tool should find broad utility for users of the Galaxy platform seeking to develop and use sophisticated workflows involving text manipulation on tabular outputs.
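The general idea described in this abstract (load tabular text into SQLite, then transform it with a single SQL query and a regular-expression filter) can be sketched with Python's standard library. This is only an illustration under stated assumptions, not the Query Tabular tool's own implementation: the sample data, the table name `t1`, and the helper `load_table` are hypothetical.

```python
import csv
import io
import re
import sqlite3

# Hypothetical tab-separated input standing in for an intermediate workflow output.
TSV = "protein\tpeptide\tscore\nP1\tAAK\t0.91\nP1\tGGR\t0.40\nP2\tLLK\t0.75\n"


def load_table(conn, name, text):
    """Create a table from tab-separated text; the first row supplies column names."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', data)


conn = sqlite3.connect(":memory:")
# SQLite ships no REGEXP implementation; "X REGEXP Y" calls regexp(Y, X),
# so the registered function receives the pattern first and the value second.
conn.create_function(
    "regexp", 2, lambda pattern, value: re.search(pattern, value) is not None
)
load_table(conn, "t1", TSV)

# One query replaces several text-manipulation steps:
# filter by regex, group by protein, and aggregate.
result = conn.execute(
    "SELECT protein, COUNT(*) AS n, MAX(CAST(score AS REAL)) AS best "
    "FROM t1 WHERE peptide REGEXP ? GROUP BY protein ORDER BY protein",
    ("K$",),
).fetchall()
print(result)
```

Here the filter, grouping and aggregation that would otherwise need separate cut, grep and sort steps are expressed as one query; the values are stored as text, hence the explicit `CAST` before taking the maximum.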
Affiliation(s)
- James E Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
| | - Praveen Kumar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA; Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, MN, 55904, USA
| | - Caleb Easterly
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Mark Esler
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Arthur C Eschenlauer
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA; Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Adrian D Hegeman
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Pratik D Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Timothy J Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
41
Johnson JE, Kumar P, Easterly C, Esler M, Mehta S, Eschenlauer AC, Hegeman AD, Jagtap PD, Griffin TJ. Improve your Galaxy text life: The Query Tabular Tool. F1000Res 2018; 7:1604. [PMID: 30519459 PMCID: PMC6248266 DOI: 10.12688/f1000research.16450.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 01/02/2019] [Indexed: 10/04/2023] Open
Abstract
Galaxy provides an accessible platform where multi-step data analysis workflows integrating disparate software can be run, even by researchers with limited programming expertise. Applications of such sophisticated workflows are many, including those which integrate software from different 'omic domains (e.g. genomics, proteomics, metabolomics). In these complex workflows, intermediate outputs are often generated as tabular text files, which must be transformed into customized formats that are compatible with the next software tools in the pipeline. Consequently, many text manipulation steps are added to an already complex workflow, overly complicating the process. In some cases, limitations to existing text manipulation are such that desired analyses can only be carried out using highly sophisticated processing steps beyond the reach of even advanced users and developers. For users with some SQL knowledge, these text operations could be combined into a single, concise query on a relational database. As a solution, we have developed the Query Tabular Galaxy tool, which leverages a SQLite database generated from tabular input data. This database can be queried and manipulated to produce transformed and customized tabular outputs compatible with downstream processing steps. Regular expressions can also be utilized for even more sophisticated manipulations, such as find and replace and other filtering actions. Using several Galaxy-based multi-omic workflows as an example, we demonstrate how the Query Tabular tool dramatically streamlines and simplifies the creation of multi-step analyses, efficiently enabling complicated textual manipulations and processing. This tool should find broad utility for users of the Galaxy platform seeking to develop and use sophisticated workflows involving text manipulation on tabular outputs.
Affiliation(s)
- James E. Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
| | - Praveen Kumar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, MN, 55904, USA
| | - Caleb Easterly
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Mark Esler
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Arthur C. Eschenlauer
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Adrian D. Hegeman
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
42
Sato S, Horikawa M, Kondo T, Sato T, Setou M. A power law distribution of metabolite abundance levels in mice regardless of the time and spatial scale of analysis. Sci Rep 2018; 8:10315. [PMID: 29985415 PMCID: PMC6037760 DOI: 10.1038/s41598-018-28667-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/30/2017] [Accepted: 06/26/2018] [Indexed: 11/29/2022] Open
Abstract
Biomolecule abundance levels change with the environment and enable a living system to adapt to new conditions. Nevertheless, the living system maintains at least some characteristics, e.g. homeostasis. One of the characteristics maintained by a living system is a power law distribution of biomolecule abundance levels. Previous studies have pointed to a universal characteristic of biochemical reaction networks, with data obtained from lysates of multiple cells. As a result, the spatial scale of the data related to the power law distribution of biomolecule abundance levels is not clear. In this study, we investigated the scaling law of metabolites in mouse tissue with a spatial scale of quantification that was changed stepwise between a whole-tissue section and a single-point analysis (25 μm). Metabolites in mouse tissues were found to follow the power law distribution independently of the spatial scale of analysis. Additionally, we tested for temporal changes by comparing data from younger and older mice. Both followed similar power law distributions, indicating that aging does not diversify metabolite composition enough to disrupt the power law distribution. The power law distribution of metabolite abundance is thus a robust characteristic of a living system regardless of time and space.
Affiliation(s)
- Shumpei Sato
- Department of Cellular and Molecular Anatomy, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
| | - Makoto Horikawa
- Department of Cellular and Molecular Anatomy, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
- International Mass Imaging Center, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
| | - Takeshi Kondo
- Department of Cellular and Molecular Anatomy, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
- International Mass Imaging Center, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
| | - Tomohito Sato
- Department of Cellular and Molecular Anatomy, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan
| | - Mitsutoshi Setou
- Department of Cellular and Molecular Anatomy, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan.
- International Mass Imaging Center, Hamamatsu University School of Medicine, 1-20-1 Handayama Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan.
- Preeminent Medical Photonics Education & Research Center, 1-20-1 Handayama, Higashi-ku, Hamamatsu, Shizuoka, 431-3192, Japan.
- Department of Anatomy, The University of Hong Kong, 6/F, William MW Mong Block 21 Sassoon Road, Pokfulam, Hong Kong SAR, China.
43
Peters K, Worrich A, Weinhold A, Alka O, Balcke G, Birkemeyer C, Bruelheide H, Calf OW, Dietz S, Dührkop K, Gaquerel E, Heinig U, Kücklich M, Macel M, Müller C, Poeschl Y, Pohnert G, Ristok C, Rodríguez VM, Ruttkies C, Schuman M, Schweiger R, Shahaf N, Steinbeck C, Tortosa M, Treutler H, Ueberschaar N, Velasco P, Weiß BM, Widdig A, Neumann S, Dam NMV. Current Challenges in Plant Eco-Metabolomics. Int J Mol Sci 2018; 19:E1385. [PMID: 29734799 PMCID: PMC5983679 DOI: 10.3390/ijms19051385] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 02/28/2018] [Revised: 04/24/2018] [Accepted: 04/25/2018] [Indexed: 12/22/2022] Open
Abstract
The relatively new research discipline of Eco-Metabolomics is the application of metabolomics techniques to ecology with the aim to characterise biochemical interactions of organisms across different spatial and temporal scales. Metabolomics is an untargeted biochemical approach to measure many thousands of metabolites in different species, including plants and animals. Changes in metabolite concentrations can provide mechanistic evidence for biochemical processes that are relevant at ecological scales. These include physiological, phenotypic and morphological responses of plants and communities to environmental changes and also interactions with other organisms. Traditionally, research in biochemistry and ecology comes from two different directions and is performed at distinct spatiotemporal scales. Biochemical studies most often focus on intrinsic processes in individuals at physiological and cellular scales. Generally, they take a bottom-up approach scaling up cellular processes from spatiotemporally fine to coarser scales. Ecological studies usually focus on extrinsic processes acting upon organisms at population and community scales and typically study top-down and bottom-up processes in combination. Eco-Metabolomics is a transdisciplinary research discipline that links biochemistry and ecology and connects the distinct spatiotemporal scales. In this review, we focus on approaches to study chemical and biochemical interactions of plants at various ecological levels, mainly plant-organismal interactions, and discuss related examples from other domains. We present recent developments and highlight advancements in Eco-Metabolomics over the last decade from various angles. We further address the five key challenges: (1) complex experimental designs and large variation of metabolite profiles; (2) feature extraction; (3) metabolite identification; (4) statistical analyses; and (5) bioinformatics software tools and workflows. The presented solutions to these challenges will advance connecting the distinct spatiotemporal scales and bridging biochemistry and ecology.
Affiliation(s)
- Kristian Peters
  - Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
- Anja Worrich
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Biodiversity, Friedrich Schiller University Jena, Dornburger-Str. 159, 07743 Jena, Germany.
  - UFZ-Helmholtz-Centre for Environmental Research, Department Environmental Microbiology, Permoserstraße 15, 04318 Leipzig, Germany.
- Alexander Weinhold
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Biodiversity, Friedrich Schiller University Jena, Dornburger-Str. 159, 07743 Jena, Germany.
- Oliver Alka
  - Applied Bioinformatics Group, Center for Bioinformatics, University of Tübingen, Sand 14, 72076 Tübingen, Germany.
- Gerd Balcke
  - Leibniz Institute of Plant Biochemistry, Cell and Metabolic Biology, Weinberg 3, 06120 Halle (Saale), Germany.
- Claudia Birkemeyer
  - Institute of Analytical Chemistry, University of Leipzig, Linnéstr. 3, 04103 Leipzig, Germany.
- Helge Bruelheide
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Biology/Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg, Am Kirchtor 1, 06108 Halle (Saale), Germany.
- Onno W Calf
  - Molecular Interaction Ecology, Institute for Water and Wetland Research (IWWR), Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands.
- Sophie Dietz
  - Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
- Kai Dührkop
  - Department of Bioinformatics, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743 Jena, Germany.
- Emmanuel Gaquerel
  - Centre for Organismal Studies, Heidelberg University, Im Neuenheimer Feld 360, 69120 Heidelberg, Germany.
- Uwe Heinig
  - Weizmann Institute of Science, Faculty of Biochemistry, Department of Plant Sciences, 234 Herzl St., P.O. Box 26, Rehovot 7610001, Israel.
- Marlen Kücklich
  - Institute of Biology, University of Leipzig, Talstraße 33, 04109 Leipzig, Germany.
- Mirka Macel
  - Molecular Interaction Ecology, Institute for Water and Wetland Research (IWWR), Radboud University, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands.
- Caroline Müller
  - Chemical Ecology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany.
- Yvonne Poeschl
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Informatics, Martin Luther University Halle-Wittenberg, Von-Seckendorff-Platz 1, 06120 Halle (Saale), Germany.
- Georg Pohnert
  - Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743 Jena, Germany.
- Christian Ristok
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
- Victor Manuel Rodríguez
  - Group of Genetics, Breeding and Biochemistry of Brassica, Misión Biológica de Galicia (CSIC), Apartado 28, 36080 Pontevedra, Spain.
- Christoph Ruttkies
  - Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
- Meredith Schuman
  - Department of Molecular Ecology, Max Planck Institute for Chemical Ecology, Hans-Knöll-Straße 8, 07745 Jena, Germany.
- Rabea Schweiger
  - Chemical Ecology, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany.
- Nir Shahaf
  - Weizmann Institute of Science, Faculty of Biochemistry, Department of Plant Sciences, 234 Herzl St., P.O. Box 26, Rehovot 7610001, Israel.
- Christoph Steinbeck
  - Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743 Jena, Germany.
- Maria Tortosa
  - Group of Genetics, Breeding and Biochemistry of Brassica, Misión Biológica de Galicia (CSIC), Apartado 28, 36080 Pontevedra, Spain.
- Hendrik Treutler
  - Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
- Nico Ueberschaar
  - Institute of Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743 Jena, Germany.
- Pablo Velasco
  - Group of Genetics, Breeding and Biochemistry of Brassica, Misión Biológica de Galicia (CSIC), Apartado 28, 36080 Pontevedra, Spain.
- Brigitte M Weiß
  - Institute of Biology, University of Leipzig, Talstraße 33, 04109 Leipzig, Germany.
- Anja Widdig
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Biology, University of Leipzig, Talstraße 33, 04109 Leipzig, Germany.
  - Research Group of Primate Kin Selection, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany.
- Steffen Neumann
  - Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany.
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
- Nicole M van Dam
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
  - Institute of Biodiversity, Friedrich Schiller University Jena, Dornburger-Str. 159, 07743 Jena, Germany.
|
44
|
Kirpich AS, Ibarra M, Moskalenko O, Fear JM, Gerken J, Mi X, Ashrafi A, Morse AM, McIntyre LM. SECIMTools: a suite of metabolomics data analysis tools. BMC Bioinformatics 2018; 19:151. [PMID: 29678131 PMCID: PMC5910624 DOI: 10.1186/s12859-018-2134-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 03/09/2017] [Accepted: 03/26/2018] [Indexed: 11/28/2022] [Open Access]
Abstract
Background: Metabolomics has the promise to transform the area of personalized medicine with the rapid development of high-throughput technology for untargeted analysis of metabolites. Open-access, easy-to-use analytic tools that are broadly accessible to the biological community need to be developed. While technology used in metabolomics varies, most metabolomics studies have a set of features identified. Galaxy is an open-access platform that enables scientists at all levels to interact with big data. Galaxy promotes reproducibility by saving histories and enabling the sharing of workflows among scientists. Results: SECIMTools (SouthEast Center for Integrated Metabolomics) is a set of Python applications that are available both as standalone tools and wrapped for use in Galaxy. The suite includes a comprehensive set of quality control metrics (retention time window evaluation and various peak evaluation tools), visualization techniques (hierarchical cluster heatmap, principal component analysis, modular modularity clustering), basic statistical analysis methods (partial least squares - discriminant analysis, analysis of variance, t-test, Kruskal-Wallis non-parametric test), advanced classification methods (random forest, support vector machines), and advanced variable selection tools (least absolute shrinkage and selection operator LASSO and Elastic Net). Conclusions: SECIMTools leverages the Galaxy platform and enables integrated workflows for metabolomics data analysis made from building blocks designed for easy use and interpretability. Standard data formats and a set of utilities allow arbitrary linkages between tools to encourage novel workflow designs. The Galaxy framework enables future data integration for metabolomics studies with other omics data. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2134-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alexander S Kirpich
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Informatics Institute, University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32611, USA
- Miguel Ibarra
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Informatics Institute, University of Florida, Gainesville, FL, 32611, USA
- Oleksandr Moskalenko
  - University of Florida Research Computing, University of Florida, Gainesville, FL, 32611, USA
- Justin M Fear
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32611, USA
  - National Institute of Health, Washington, DC, USA
- Joseph Gerken
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
- Xinlei Mi
  - Department of Biostatistics, University of Florida, Gainesville, FL, 32611, USA
- Ali Ashrafi
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
- Alison M Morse
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32611, USA
- Lauren M McIntyre
  - Southeast Center for Integrated Metabolomics (SECIM), University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Informatics Institute, University of Florida, Gainesville, FL, 32611, USA
  - University of Florida Genetics Institute, University of Florida, Gainesville, FL, 32611, USA
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32611, USA
|
45
|
Sostare J, Di Guida R, Kirwan J, Chalal K, Palmer E, Dunn WB, Viant MR. Comparison of modified Matyash method to conventional solvent systems for polar metabolite and lipid extractions. Anal Chim Acta 2018; 1037:301-315. [PMID: 30292307 DOI: 10.1016/j.aca.2018.03.019] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 10/31/2017] [Revised: 03/13/2018] [Accepted: 03/18/2018] [Indexed: 02/02/2023]
Abstract
In the last decade, metabolomics has experienced significant advances in the throughput and robustness of analytical methodologies. Yet the preparation of biofluids and low-mass tissue samples remains a laborious and potentially inconsistent manual process, and a significant bottleneck for high-throughput metabolomics. To address this, we have compared three different sample extraction solvent systems in three diverse sample types with the purpose of selecting an optimum protocol for subsequent automation of sample preparation. We have investigated and re-optimised the solvent ratios in the recently introduced methyl tert-butyl ether (MTBE)/methanol/water solvent system (here termed modified Matyash; 2.6/2.0/2.4, v/v/v) and compared it to the original Matyash method (10/3/2.5, v/v/v) and the conventional chloroform/methanol/water (stepwise Bligh and Dyer, 2.0/2.0/1.8, v/v/v) using two biofluids (human serum and urine) and one tissue (whole Daphnia magna). This is the first report of the use of the Matyash method for extracting metabolites from the US National Institutes of Health (NIH) model organism D. magna. Extracted samples were analysed by non-targeted direct infusion mass spectrometry metabolomics or LC-MS metabolomics. Overall, the modified Matyash method yielded a higher number of peaks and putatively annotated metabolites compared to the original Matyash method (1-29% more peaks and 1-30% more metabolites) and the Bligh and Dyer method (4-20% more peaks and 1-41% more metabolites). Additionally, the modified Matyash method was superior when considering metabolite intensities. The reproducibility of the modified Matyash method was higher than other methods (in 10 out of 12 datasets, compared to the original Matyash method; and in 8 out of 12 datasets, compared to the Bligh and Dyer method), based upon the observation of a lower mRSD of peak intensities. In conclusion, the modified Matyash method tended to provide a higher yield and reproducibility for most sample types in this study compared to two widely used methods.
Affiliation(s)
- Jelena Sostare
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Riccardo Di Guida
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Jennifer Kirwan
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Karnpreet Chalal
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Elliott Palmer
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Warwick B Dunn
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
- Mark R Viant
  - School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
|
46
|
Forsberg EM, Huan T, Rinehart D, Benton HP, Warth B, Hilmers B, Siuzdak G. Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online. Nat Protoc 2018; 13:633-651. [PMID: 29494574 PMCID: PMC5937130 DOI: 10.1038/nprot.2017.151] [Citation(s) in RCA: 164] [Impact Index Per Article: 27.3] [Indexed: 12/12/2022]
Abstract
Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)-mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5-10 min, depending on user experience; data processing typically takes 1-3 h, and data analysis takes ∼30 min.
Affiliation(s)
- Erica M Forsberg
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
  - Department of Chemistry and Biochemistry, San Diego State University, San Diego, California, USA
- Tao Huan
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
- Duane Rinehart
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
- H Paul Benton
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
- Benedikt Warth
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
  - Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria
- Brian Hilmers
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
- Gary Siuzdak
  - Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
|
47
|
Guitton Y, Tremblay-Franco M, Le Corguillé G, Martin JF, Pétéra M, Roger-Mele P, Delabrière A, Goulitquer S, Monsoor M, Duperier C, Canlet C, Servien R, Tardivel P, Caron C, Giacomoni F, Thévenot EA. Create, run, share, publish, and reference your LC–MS, FIA–MS, GC–MS, and NMR data analysis workflows with the Workflow4Metabolomics 3.0 Galaxy online infrastructure for metabolomics. Int J Biochem Cell Biol 2017; 93:89-101. [DOI: 10.1016/j.biocel.2017.07.002] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 01/28/2017] [Revised: 06/14/2017] [Accepted: 07/10/2017] [Indexed: 12/11/2022]
|
48
|
|
49
|
van Rijswijk M, Beirnaert C, Caron C, Cascante M, Dominguez V, Dunn WB, Ebbels TMD, Giacomoni F, Gonzalez-Beltran A, Hankemeier T, Haug K, Izquierdo-Garcia JL, Jimenez RC, Jourdan F, Kale N, Klapa MI, Kohlbacher O, Koort K, Kultima K, Le Corguillé G, Moreno P, Moschonas NK, Neumann S, O'Donovan C, Reczko M, Rocca-Serra P, Rosato A, Salek RM, Sansone SA, Satagopam V, Schober D, Shimmo R, Spicer RA, Spjuth O, Thévenot EA, Viant MR, Weber RJM, Willighagen EL, Zanetti G, Steinbeck C. The future of metabolomics in ELIXIR. F1000Res 2017; 6. [PMID: 29043062 PMCID: PMC5627583 DOI: 10.12688/f1000research.12342.2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Accepted: 10/31/2017] [Indexed: 01/11/2023] [Open Access]
Abstract
Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the "Future of metabolomics in ELIXIR" was organised at Frankfurt Airport in Germany. This one-day strategic workshop involved representatives of ELIXIR Nodes, members of the PhenoMeNal consortium developing an e-infrastructure that supports workflow-based metabolomics analysis pipelines, and experts from the international metabolomics community. The workshop established metabolite identification as the critical area, where a maximal impact of computational metabolomics and data management on other fields could be achieved. In particular, the existing four ELIXIR Use Cases, where the metabolomics community - both industry and academia - would benefit most, and which could be exhaustively mapped onto the current five ELIXIR Platforms were discussed. This opinion article is a call for support for a new ELIXIR metabolomics Use Case, which aligns with and complements the existing and planned ELIXIR Platforms and Use Cases.
Affiliation(s)
- Merlijn van Rijswijk
  - ELIXIR-NL, Dutch Techcentre for Life Sciences, Utrecht, 3503 RM, Netherlands
  - Netherlands Metabolomics Center, Leiden, 2333 CC, Netherlands
- Charlie Beirnaert
  - ADReM, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, 2020, Belgium
- Christophe Caron
  - ELIXIR-FR, French Institute of Bioinformatics, Gif-sur-Yvette, F-91198, France
- Marta Cascante
  - Department of Biochemistry and Molecular Biomedicine, Faculty of Biology, Universitat de Barcelona, Barcelona, 08028, Spain
- Victoria Dominguez
  - ELIXIR-FR, French Institute of Bioinformatics, Gif-sur-Yvette, F-91198, France
- Warwick B Dunn
  - School of Biosciences, Phenome Centre Birmingham and Birmingham Metabolomics Training Centre, University of Birmingham, Birmingham, B15 2TT, UK
- Timothy M D Ebbels
  - Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, UK
- Franck Giacomoni
  - INRA, UNH, Human Nutrition Unit, PFEM, Metabolism Exploration Platform, MetaboHUB-Clermont, Clermont Auvergne University, Clermont-Ferrand, F-63000, France
- Thomas Hankemeier
  - Netherlands Metabolomics Center, Leiden, 2333 CC, Netherlands
  - Leiden Academic Centre for Drug Research, Leiden University, Leiden, 2300 RA, Netherlands
- Kenneth Haug
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Jose L Izquierdo-Garcia
  - Centro Nacional Investigaciones Cardiovasculares, Madrid, 28029, Spain
  - CIBER de Enfermedades Respiratorias, Madrid, 28029, Spain
- Fabien Jourdan
  - Toxalim, UMR 1331, Université de Toulouse, Toulouse, F-31300, France
- Namrata Kale
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Maria I Klapa
  - Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research & Technology - Hellas (FORTH/ICE-HT), Patras, GR-26504, Greece
- Oliver Kohlbacher
  - Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, 72076, Germany
  - Center for Bioinformatics, University of Tübingen, Tübingen, 72076, Germany
- Kairi Koort
  - The Centre of Excellence in Neural and Behavioural Sciences, Tallinn, Tallinn, 10120, Estonia
  - School of Natural Sciences and Health, Tallinn University, 10120, 10120, Estonia
- Kim Kultima
  - Department of Medical Sciences, Uppsala University, Uppsala, 752 36, Sweden
- Gildas Le Corguillé
  - ELIXIR-FR, French Institute of Bioinformatics, Gif-sur-Yvette, F-91198, France
  - UPMC, CNRS, FR2424, ABiMS, Station Biologique, Roscoff, F-29680, France
- Pablo Moreno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Nicholas K Moschonas
  - Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research & Technology - Hellas (FORTH/ICE-HT), Patras, GR-26504, Greece
  - Department of General Biology, School of Medicine, University of Patras, Patras, GR-26504, Greece
- Steffen Neumann
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, 06120, Germany
- Claire O'Donovan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Philippe Rocca-Serra
  - Oxford e-Research Centre, Engineering Science Department, University of Oxford, Oxford, OX1 3QG, UK
- Antonio Rosato
  - Magnetic Resonance Center, Interuniversity Consortium for Magnetic Resonance on MetalloProteins, University of Florence, Florence, 50121, Italy
- Reza M Salek
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Susanna-Assunta Sansone
  - Oxford e-Research Centre, Engineering Science Department, University of Oxford, Oxford, OX1 3QG, UK
- Venkata Satagopam
  - Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Belvaux, L-4367, Luxembourg
- Daniel Schober
  - Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, 06120, Germany
- Ruth Shimmo
  - The Centre of Excellence in Neural and Behavioural Sciences, Tallinn, Tallinn, 10120, Estonia
  - School of Natural Sciences and Health, Tallinn University, 10120, 10120, Estonia
- Rachel A Spicer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, CB10 1SD, UK
- Ola Spjuth
  - Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, 752 36, Sweden
- Etienne A Thévenot
  - CEA, LIST, Laboratory for Data Analysis and Systems' Intelligence, MetaboHUB, Gif-sur-Yvette, F-91191, France
- Mark R Viant
  - School of Biosciences, Phenome Centre Birmingham and Birmingham Metabolomics Training Centre, University of Birmingham, Birmingham, B15 2TT, UK
- Ralf J M Weber
  - School of Biosciences, Phenome Centre Birmingham and Birmingham Metabolomics Training Centre, University of Birmingham, Birmingham, B15 2TT, UK
- Egon L Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, NL-6200, Netherlands
- Gianluigi Zanetti
  - CRS4, Data Intensive Computing Group, Ed.1 POLARIS, Pula, 09010, Italy
|
50
|
Abstract
Data processing and analysis are major bottlenecks in high-throughput metabolomic experiments. Recent advancements in data acquisition platforms are driving trends toward increasing data size (e.g., petabyte scale) and complexity (multiple omic platforms). Improvements in data analysis software and in silico methods are similarly required to effectively utilize these advancements and link the acquired data with biological interpretations. Herein, we provide an overview of recently developed and freely available metabolomic tools, algorithms, databases, and data analysis frameworks. This overview of popular tools for MS and NMR-based metabolomics is organized into the following sections: data processing, annotation, analysis, and visualization. The following overview of newly developed tools helps to better inform researchers to support the emergence of metabolomics as an integral tool for the study of biochemistry, systems biology, environmental analysis, health, and personalized medicine.
Affiliation(s)
- Biswapriya B Misra
  - Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA
- Johannes F Fahrmann
  - Department of Clinical Cancer Prevention, University of Texas MD Anderson Cancer Center, TX, USA
|