1
Tkalec Ž, Antignac JP, Bandow N, Béen FM, Belova L, Bessems J, Le Bizec B, Brack W, Cano-Sancho G, Chaker J, Covaci A, Creusot N, David A, Debrauwer L, Dervilly G, Duca RC, Fessard V, Grimalt JO, Guerin T, Habchi B, Hecht H, Hollender J, Jamin EL, Klánová J, Kosjek T, Krauss M, Lamoree M, Lavison-Bompard G, Meijer J, Moeller R, Mol H, Mompelat S, Van Nieuwenhuyse A, Oberacher H, Parinet J, Van Poucke C, Roškar R, Togola A, Trontelj J, Price EJ. Innovative analytical methodologies for characterizing chemical exposure with a view to next-generation risk assessment. Environment International 2024; 186:108585. [PMID: 38521044 DOI: 10.1016/j.envint.2024.108585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 03/14/2024] [Accepted: 03/15/2024] [Indexed: 03/25/2024]
Abstract
The chemical burden on the environment and human population is increasing. Consequently, regulatory risk assessment must keep pace to manage, reduce, and prevent adverse impacts on human and environmental health associated with hazardous chemicals. Surveillance of chemicals of known, emerging, or potential future concern entering the environment-food-human continuum is needed to document the reality of the risks that chemicals pose to ecosystem and human health from a One Health perspective, to feed into early warning systems, and to support public policies for exposure mitigation provisions and safe-and-sustainable-by-design strategies. The use of less-conventional sampling strategies and the integration of full-scan, high-resolution mass spectrometry and effect-directed analysis in environmental and human monitoring programmes have the potential to enhance the screening and identification of a wider range of chemicals of known, emerging, or potential future concern. Here, we outline the key needs and recommendations identified within the European Partnership for Assessment of Risks from Chemicals (PARC) project for leveraging these innovative methodologies to support the development of next-generation chemical risk assessment.
Affiliation(s)
- Žiga Tkalec
  - RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Jožef Stefan Institute, Department of Environmental Sciences, Ljubljana, Slovenia.
- Nicole Bandow
  - German Environment Agency, Laboratory for Water Analysis, Colditzstraße 34, 12099 Berlin, Germany.
- Frederic M Béen
  - Vrije Universiteit Amsterdam, Amsterdam Institute for Life and Environment (A-LIFE), Section Chemistry for Environment and Health, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands; KWR Water Research Institute, Nieuwegein, The Netherlands.
- Lidia Belova
  - Toxicological Center, University of Antwerp, 2610 Wilrijk, Belgium.
- Jos Bessems
  - Flemish Institute for Technological Research (VITO), Mol, Belgium.
- Werner Brack
  - Helmholtz Centre for Environmental Research GmbH - UFZ, Department of Effect-Directed Analysis, Permoserstraße 15, 04318 Leipzig, Germany; Goethe University Frankfurt, Department of Evolutionary Ecology and Environmental Toxicology, Max-von-Laue-Strasse 13, 60438 Frankfurt, Germany.
- Jade Chaker
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, Rennes, France.
- Adrian Covaci
  - Toxicological Center, University of Antwerp, 2610 Wilrijk, Belgium.
- Nicolas Creusot
  - INRAE, French National Research Institute For Agriculture, Food & Environment, UR1454 EABX, Bordeaux Metabolome, MetaboHub, Gazinet Cestas, France.
- Arthur David
  - Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, Rennes, France.
- Laurent Debrauwer
  - Toxalim (Research Centre in Food Toxicology), INRAE UMR 1331, ENVT, INP-Purpan, Paul Sabatier University (UPS), Toulouse, France.
- Radu Corneliu Duca
  - Unit Environmental Hygiene and Human Biological Monitoring, Department of Health Protection, Laboratoire National de Santé (LNS), 1 Rue Louis Rech, L-3555 Dudelange, Luxembourg; Environment and Health, Department of Public Health and Primary Care, Katholieke Universiteit of Leuven (KU Leuven), 3000 Leuven, Belgium.
- Valérie Fessard
  - ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Laboratory of Fougères, Toxicology of Contaminants Unit, 35306 Fougères, France.
- Joan O Grimalt
  - Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Catalonia, Spain.
- Thierry Guerin
  - ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Strategy and Programs Department, F-94701 Maisons-Alfort, France.
- Baninia Habchi
  - INRS, Département Toxicologie et Biométrologie Laboratoire Biométrologie 1, rue du Morvan - CS 60027 - 54519, Vandoeuvre Cedex, France.
- Helge Hecht
  - RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic.
- Juliane Hollender
  - Swiss Federal Institute of Aquatic Science and Technology - Eawag, 8600 Dübendorf, Switzerland; Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, 8092 Zürich, Switzerland.
- Emilien L Jamin
  - Toxalim (Research Centre in Food Toxicology), INRAE UMR 1331, ENVT, INP-Purpan, Paul Sabatier University (UPS), Toulouse, France.
- Jana Klánová
  - RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic.
- Tina Kosjek
  - Jožef Stefan Institute, Department of Environmental Sciences, Ljubljana, Slovenia.
- Martin Krauss
  - Helmholtz Centre for Environmental Research GmbH - UFZ, Department of Effect-Directed Analysis, Permoserstraße 15, 04318 Leipzig, Germany.
- Marja Lamoree
  - Vrije Universiteit Amsterdam, Amsterdam Institute for Life and Environment (A-LIFE), Section Chemistry for Environment and Health, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands.
- Gwenaelle Lavison-Bompard
  - ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Laboratory for Food Safety, Pesticides and Marine Biotoxins Unit, F-94701 Maisons-Alfort, France.
- Jeroen Meijer
  - Vrije Universiteit Amsterdam, Amsterdam Institute for Life and Environment (A-LIFE), Section Chemistry for Environment and Health, De Boelelaan 1085, 1081 HV Amsterdam, The Netherlands.
- Ruth Moeller
  - Unit Medical Expertise and Data Intelligence, Department of Health Protection, Laboratoire National de Santé (LNS), 1 Rue Louis Rech, L-3555 Dudelange, Luxembourg.
- Hans Mol
  - Wageningen Food Safety Research - Part of Wageningen University and Research, Akkermaalsbos 2, 6708 WB, Wageningen, The Netherlands.
- Sophie Mompelat
  - ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Laboratory of Fougères, Toxicology of Contaminants Unit, 35306 Fougères, France.
- An Van Nieuwenhuyse
  - Environment and Health, Department of Public Health and Primary Care, Katholieke Universiteit of Leuven (KU Leuven), 3000 Leuven, Belgium; Department of Health Protection, Laboratoire National de Santé (LNS), 1 Rue Louis Rech, L-3555 Dudelange, Luxembourg.
- Herbert Oberacher
  - Institute of Legal Medicine and Core Facility Metabolomics, Medical University of Innsbruck, 6020 Innsbruck, Austria.
- Julien Parinet
  - ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Laboratory for Food Safety, Pesticides and Marine Biotoxins Unit, F-94701 Maisons-Alfort, France.
- Christof Van Poucke
  - Flanders Research Institute for Agriculture, Fisheries And Food (ILVO), Brusselsesteenweg 370, 9090 Melle, Belgium.
- Robert Roškar
  - University of Ljubljana, Faculty of Pharmacy, Slovenia.
- Anne Togola
  - BRGM, 3 avenue Claude Guillemin, 45060 Orléans, France.
- Elliott J Price
  - RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic.
2
Walzer M, Jeong K, Tabb DL, Vizcaíno JA. TopDownApp: An open and modular platform for analysis and visualisation of top-down proteomics data. Proteomics 2024; 24:e2200403. [PMID: 37787899 DOI: 10.1002/pmic.202200403] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/02/2023] [Revised: 09/13/2023] [Accepted: 09/13/2023] [Indexed: 10/04/2023]
Abstract
Although Top-down (TD) proteomics techniques, aimed at the analysis of intact proteins and proteoforms, are becoming increasingly popular, efforts are needed at different levels to generalise their adoption. In this context, there are numerous improvements that are possible in the area of open science practices, including a greater application of the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. These include, for example, increased data sharing practices and readily available open data standards. Additionally, the field would benefit from the development of open data analysis workflows that can enable data reuse of public datasets, something that is increasingly common in other proteomics fields.
Affiliation(s)
- Mathias Walzer
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Kyowon Jeong
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
- David L Tabb
  - Institut Pasteur, Université Paris Cité, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris, France
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
3
Wang H, Dai C, Pfeuffer J, Sachsenberg T, Sanchez A, Bai M, Perez-Riverol Y. Tissue-based absolute quantification using large-scale TMT and LFQ experiments. Proteomics 2023; 23:e2300188. [PMID: 37488995 DOI: 10.1002/pmic.202300188] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/16/2023] [Revised: 07/04/2023] [Accepted: 07/05/2023] [Indexed: 07/26/2023]
Abstract
Relative and absolute intensity-based protein quantification across cell lines, tissue atlases and tumour datasets is increasingly available in public datasets. These atlases enable researchers to explore fundamental biological questions, such as protein existence, expression location, quantity and correlation with RNA expression. Most studies provide MS1 feature-based label-free quantitative (LFQ) datasets; however, growing numbers of isobaric tandem mass tags (TMT) datasets remain unexplored. Here, we compare traditional intensity-based absolute quantification (iBAQ) proteome abundance ranking to an analogous method using reporter ion proteome abundance ranking with data from an experiment where LFQ and TMT were measured on the same samples. This new TMT method substitutes reporter ion intensities for MS1 feature intensities in the iBAQ framework. Additionally, we compared LFQ-iBAQ values to TMT-iBAQ values from two independent large-scale tissue atlas datasets (one LFQ and one TMT) using robust bottom-up proteomic identification, normalisation and quantitation workflows.
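The iBAQ calculation referred to in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' workflow: the function names, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and the 6 to 30 residue window for "theoretically observable" peptides are assumptions made for the example. The same function accepts either MS1 feature intensities (LFQ) or reporter ion intensities (TMT), which is the substitution the abstract describes.

```python
import re


def tryptic_peptides(sequence, min_len=6, max_len=30):
    """In-silico digest (assumed rule): cleave after K/R unless followed by P.

    Only peptides within the assumed observable length window are returned.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def ibaq(peptide_intensities, protein_sequence):
    """Summed peptide intensity divided by the number of observable peptides.

    `peptide_intensities` may hold MS1 feature intensities (LFQ) or
    reporter ion intensities for one channel (TMT).
    """
    n_observable = len(tryptic_peptides(protein_sequence))
    if n_observable == 0:
        raise ValueError("protein has no theoretically observable peptides")
    return sum(peptide_intensities) / n_observable


if __name__ == "__main__":
    # Hypothetical toy protein: yields two observable tryptic peptides.
    toy_protein = "AAAAAAKCCCCCCCRPDDDDDDK"
    print(tryptic_peptides(toy_protein))
    print(ibaq([100.0, 300.0], toy_protein))  # 200.0
```

Dividing by the number of observable peptides is what lets the value be compared across proteins of different lengths, so it can be used for abundance ranking within a sample.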
Affiliation(s)
- Hong Wang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Chengxin Dai
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing, China
- Julianus Pfeuffer
  - Algorithmic Bioinformatics, Freie Universität Berlin, Berlin, Germany
- Timo Sachsenberg
  - Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany
  - Institute for Biological and Medical Informatics, University of Tübingen, Tübingen, Germany
- Aniel Sanchez
  - Section for Clinical Chemistry, Department of Translational Medicine, Lund University, Skåne University Hospital Malmö, Malmö, Sweden
- Mingze Bai
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing, China
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
4
Kontou EE, Walter A, Alka O, Pfeuffer J, Sachsenberg T, Mohite OS, Nuhamunada M, Kohlbacher O, Weber T. UmetaFlow: an untargeted metabolomics workflow for high-throughput data processing and analysis. J Cheminform 2023; 15:52. [PMID: 37173725 PMCID: PMC10176759 DOI: 10.1186/s13321-023-00724-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/19/2022] [Accepted: 04/27/2023] [Indexed: 05/15/2023] [Open Access]
Abstract
Metabolomics experiments generate highly complex datasets whose manual inspection is time- and labor-intensive and sometimes error-prone. Therefore, new methods for automated, fast, reproducible, and accurate data processing and dereplication are required. Here, we present UmetaFlow, a computational workflow for untargeted metabolomics that combines algorithms for data pre-processing, spectral matching, and molecular formula and structural predictions with an integration into the GNPS workflows Feature-Based Molecular Networking and Ion Identity Molecular Networking for downstream analysis. UmetaFlow is implemented as a Snakemake workflow, making it easy to use, scalable, and reproducible. For more interactive computing, visualization, and development, the workflow is also implemented in Jupyter notebooks using the Python programming language and a set of Python bindings to the OpenMS algorithms (pyOpenMS). Finally, UmetaFlow is also offered as a web-based graphical user interface for parameter optimization and processing of smaller datasets. UmetaFlow was validated with in-house LC-MS/MS datasets of actinomycetes producing known secondary metabolites, as well as commercial standards; it detected all expected features and accurately annotated 76% of the molecular formulas and 65% of the structures. As a more generic validation, the publicly available MTBLS733 and MTBLS736 datasets were used for benchmarking; UmetaFlow detected more than 90% of all ground truth features and performed exceptionally well in quantification and discriminating marker selection. We anticipate that UmetaFlow will provide a useful platform for the interpretation of large metabolomics datasets.
Affiliation(s)
- Eftychia E Kontou
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet Building 220, 2800, Kgs. Lyngby, Denmark
- Axel Walter
  - Applied Bioinformatics, Department of Computer Science, Eberhard Karls University Tübingen, Sand 14, 72076, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Sand 14, 72076, Tübingen, Germany
- Oliver Alka
  - Applied Bioinformatics, Department of Computer Science, Eberhard Karls University Tübingen, Sand 14, 72076, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Sand 14, 72076, Tübingen, Germany
- Julianus Pfeuffer
  - Visual and Data-Centric Computing, Zuse Institute Berlin, Takustr. 7, 14195, Berlin, Germany
  - Algorithmic Bioinformatics, Freie Universität Berlin, Takustr. 9, 14195, Berlin, Germany
- Timo Sachsenberg
  - Applied Bioinformatics, Department of Computer Science, Eberhard Karls University Tübingen, Sand 14, 72076, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Sand 14, 72076, Tübingen, Germany
- Omkar S Mohite
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet Building 220, 2800, Kgs. Lyngby, Denmark
- Matin Nuhamunada
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet Building 220, 2800, Kgs. Lyngby, Denmark
- Oliver Kohlbacher
  - Applied Bioinformatics, Department of Computer Science, Eberhard Karls University Tübingen, Sand 14, 72076, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Sand 14, 72076, Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tübingen, Schaffhausenstr. 77, 72072, Tübingen, Germany
- Tilmann Weber
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet Building 220, 2800, Kgs. Lyngby, Denmark.
5
Ofek P, Yeini E, Arad G, Danilevsky A, Pozzi S, Luna CB, Dangoor SI, Grossman R, Ram Z, Shomron N, Brem H, Hyde TM, Geiger T, Satchi-Fainaro R. Deoxyhypusine hydroxylase: A novel therapeutic target differentially expressed in short-term vs long-term survivors of glioblastoma. Int J Cancer 2023. [PMID: 37141410 DOI: 10.1002/ijc.34545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 02/13/2023] [Accepted: 03/10/2023] [Indexed: 05/06/2023]
Abstract
Glioblastoma (GB) is the most aggressive neoplasm of the brain. Poor prognosis is mainly attributed to tumor heterogeneity, invasiveness and drug resistance. Only a small fraction of GB patients survive longer than 24 months from the time of diagnosis (i.e., long-term survivors [LTS]). In our study, we aimed to identify molecular markers associated with favorable GB prognosis as a basis for developing therapeutic applications that improve patient outcomes. We have recently assembled a proteogenomic dataset of 87 GB clinical samples of varying survival rates. Following RNA-seq and mass spectrometry (MS)-based proteomics analysis, we identified several differentially expressed genes and proteins, some from known cancer-related pathways and some from less established ones, that showed higher expression in short-term (<6 months) survivors (STS) compared to LTS. One such target was deoxyhypusine hydroxylase (DOHH), which is known to be involved in the biosynthesis of hypusine, an unusual amino acid essential for the function of the eukaryotic translation initiation factor 5A (eIF5A), which promotes tumor growth. We consequently validated DOHH overexpression in STS samples by quantitative polymerase chain reaction (qPCR) and immunohistochemistry. We further showed robust inhibition of proliferation, migration and invasion of GB cells following silencing of DOHH with short hairpin RNA (shRNA) or inhibition of its activity with the small molecules ciclopirox and deferiprone. Moreover, DOHH silencing led to significant inhibition of tumor progression and prolonged survival in GB mouse models. Searching for a potential mechanism by which DOHH promotes tumor aggressiveness, we found that it supports the transition of GB cells to a more invasive phenotype via epithelial-mesenchymal transition (EMT)-related pathways.
Affiliation(s)
- Paula Ofek
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Eilam Yeini
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Gali Arad
  - Department of Molecular Genetics, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Artem Danilevsky
  - Department of Cell and Developmental Biology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Edmond J Safra Center for Bioinformatics, Tel Aviv University, Tel Aviv, Israel
- Sabina Pozzi
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Christian Burgos Luna
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Sahar Israeli Dangoor
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Rachel Grossman
  - Department of Neurosurgery, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Zvi Ram
  - Department of Neurosurgery, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Noam Shomron
  - Department of Cell and Developmental Biology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Edmond J Safra Center for Bioinformatics, Tel Aviv University, Tel Aviv, Israel
  - Sagol School of Neurosciences, Tel Aviv University, Tel Aviv, Israel
- Henry Brem
  - Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Thomas M Hyde
  - Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland, USA
  - Department of Psychiatry & Behavioral Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  - Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Tamar Geiger
  - Department of Molecular Genetics, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ronit Satchi-Fainaro
  - Department of Physiology and Pharmacology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Sagol School of Neurosciences, Tel Aviv University, Tel Aviv, Israel
6
A peptide-centric approach to analyse quantitative proteomics data- an application to prostate cancer biomarker discovery. J Proteomics 2023; 272:104774. [PMID: 36427804 DOI: 10.1016/j.jprot.2022.104774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 09/23/2022] [Accepted: 11/01/2022] [Indexed: 11/25/2022]
Abstract
Bottom-up proteomics is a popular approach in molecular biomarker research. However, protein analysts have realized the limitations of protein-based approaches for identifying and quantifying proteins in complex samples, such as the identification of peptide sequences shared by multiple proteins and the difficulty in identifying modified peptides. Thus, there are many opportunities to improve analysis methods. Here, an alternative method focused on peptide analysis is proposed as a complement to conventional proteomics data analysis. To investigate this hypothesis, a peptide-centric approach was applied to reanalyse a urine proteome dataset of samples from prostate cancer patients and controls. The results were compared with the conventional protein-centric approach. The proteins/peptides relevant for discriminating the groups were detected using two criteria: p-values and VIP values obtained from a PLS-DA model. A comparison of the two strategies revealed high inconsistency between protein and peptide information and greater involvement of peptides in key PCa processes. This peptide analysis unveiled discriminative features that are lost when proteins are analyzed as homogeneous entities. This type of analysis is innovative in PCa and, integrated with the widely used protein-centric approach, might provide a more comprehensive view of this disease and revolutionize biomarker discovery.
SIGNIFICANCE: In this study, the application of protein- and peptide-centric approaches to reanalyse a urine proteome dataset from prostate cancer (PCa) patients and controls showed that many relevant proteins/peptides are missed because of the conservative nature of p-values in statistical tests; therefore, including variable selection methods in the analysis of the dataset reported in this work is fruitful. Comparison of protein- and peptide-based approaches revealed a high inconsistency between protein and peptide information and a greater involvement of peptides in key PCa processes. These results provide a new perspective for analysing proteomics data and detecting relevant targets based on the integration of peptide and protein information. This data integration makes it possible to unravel discriminative features that normally go unnoticed, to gain a more comprehensive view of the disease pathophysiology, and to open new avenues for the discovery of biomarkers.
7
Gabriels R, Declercq A, Bouwmeester R, Degroeve S, Martens L. psm_utils: A High-Level Python API for Parsing and Handling Peptide-Spectrum Matches and Proteomics Search Results. J Proteome Res 2023; 22:557-560. [PMID: 36508242 DOI: 10.1021/acs.jproteome.2c00609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Abstract
A plethora of proteomics search engine output file formats are in circulation. This lack of standardized output files greatly complicates generic downstream processing of peptide-spectrum matches (PSMs) and PSM files. While standards exist to solve this problem, these are far from universally supported by search engines. Moreover, software libraries are available to read a selection of PSM file formats, but a package to parse PSM files into a unified data structure has been missing. Here, we present psm_utils, a Python package to read and write various PSM file formats and to handle peptidoforms, PSMs, and PSM lists through unified and user-friendly Python, command-line, and web interfaces. psm_utils was developed with pragmatism and maintainability in mind, adhering to community standards and relying on existing packages where possible. The Python API and command-line interface greatly facilitate handling various PSM file formats. Moreover, a user-friendly web application was built using psm_utils that allows anyone to interconvert PSM files and retrieve basic PSM statistics. psm_utils is freely available under the permissive Apache 2.0 license at https://github.com/compomics/psm_utils.
Affiliation(s)
- Ralf Gabriels
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
- Arthur Declercq
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
- Robbin Bouwmeester
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
- Sven Degroeve
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
8
Player RA, Aguinaldo AM, Merritt BB, Maszkiewicz LN, Adeyemo OE, Forsyth ER, Verratti KJ, Chee BW, Grady SL, Bradburne CE. The META tool optimizes metagenomic analyses across sequencing platforms and classifiers. Frontiers in Bioinformatics 2023; 2:969247. [PMID: 36685333 PMCID: PMC9852826 DOI: 10.3389/fbinf.2022.969247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Accepted: 12/14/2022] [Indexed: 01/09/2023] [Open Access]
Abstract
A major challenge in the field of metagenomics is the selection of the correct combination of sequencing platform and downstream metagenomic analysis algorithm, or "classifier". Here, we present the Metagenomic Evaluation Tool Analyzer (META), which produces simulated data and facilitates platform and algorithm selection for any given metagenomic use case. META-generated in silico read data are modular, scalable, and reflect user-defined community profiles, while the downstream analysis is done using a variety of metagenomic classifiers. Reported results include information on resource utilization, time-to-answer, and performance. Real-world data can also be analyzed using selected classifiers and results benchmarked against simulations. To test the utility of the META software, simulated data was compared to real-world viral and bacterial metagenomic samples run on four different sequencers and analyzed using 12 metagenomic classifiers. Lastly, we introduce "META Score": a unified, quantitative value which rates an analytic classifier's ability to both identify and count taxa in a representative sample.
Affiliation(s)
- Robert A. Player
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States
- Brian B. Merritt
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States
- Lisa N. Maszkiewicz
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States
- Ellen R. Forsyth
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States
- Brant W. Chee
  - Division of General Internal Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States; Armstrong Institute for Patient Safety and Quality, Johns Hopkins School of Medicine, Baltimore, MD, United States
- Sarah L. Grady
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States
- Christopher E. Bradburne
  - Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States; McKusick-Nathans Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States; *Correspondence: Christopher E. Bradburne
9
Perez-Riverol Y. Proteomic repository data submission, dissemination, and reuse: key messages. Expert Rev Proteomics 2022; 19:297-310. [PMID: 36529941 PMCID: PMC7614296 DOI: 10.1080/14789450.2022.2160324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
INTRODUCTION: The creation of ProteomeXchange data workflows in 2012 transformed the field of proteomics by standardizing data submission and dissemination and enabling the widespread reanalysis of public MS proteomics data worldwide. ProteomeXchange has triggered a growing trend toward public dissemination of proteomics data, facilitating the assessment, reuse, comparative analyses, and extraction of new findings from public datasets. As of 2022, the consortium comprises PRIDE, PeptideAtlas, MassIVE, jPOST, iProX, and Panorama Public. AREAS COVERED: Here, we review and discuss the current ecosystem of resources, guidelines, and file formats for proteomics data dissemination and reanalysis. Special attention is drawn to new quantitative and post-translational modification-oriented resources. The challenges and future directions for data deposition, including the lack of metadata and of cloud-based and high-performance software solutions for fast and reproducible reanalysis of the available data, are discussed. EXPERT OPINION: The success of ProteomeXchange and the amount of proteomics data available in the public domain have triggered the creation and/or growth of other protein knowledgebase resources. Data reuse is a leading, active, and evolving field that supports the creation of new formats, tools, and workflows to rediscover and reshape public proteomics data.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
10
Pinter N, Glätzer D, Fahrner M, Fröhlich K, Johnson J, Grüning BA, Warscheid B, Drepper F, Schilling O, Föll MC. MaxQuant and MSstats in Galaxy Enable Reproducible Cloud-Based Analysis of Quantitative Proteomics Experiments for Everyone. J Proteome Res 2022; 21:1558-1565. [PMID: 35503992 DOI: 10.1021/acs.jproteome.2c00051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Quantitative mass spectrometry-based proteomics has become a high-throughput technology for the identification and quantification of thousands of proteins in complex biological samples. Two frequently used tools, MaxQuant and MSstats, allow for the analysis of raw data and finding proteins with differential abundance between conditions of interest. To enable accessible and reproducible quantitative proteomics analyses in a cloud environment, we have integrated MaxQuant (including TMTpro 16/18plex), Proteomics Quality Control (PTXQC), MSstats, and MSstatsTMT into the open-source Galaxy framework. This enables the web-based analysis of label-free and isobaric labeling proteomics experiments via Galaxy's graphical user interface on public clouds. MaxQuant and MSstats in Galaxy can be applied in conjunction with thousands of existing Galaxy tools and integrated into standardized, sharable workflows. Galaxy tracks all metadata and intermediate results in analysis histories, which can be shared privately for collaborations or publicly, allowing full reproducibility and transparency of published analysis. To further increase accessibility, we provide detailed hands-on training materials. The integration of MaxQuant and MSstats into the Galaxy framework enables their usage in a reproducible way on accessible large computational infrastructures, hence realizing the foundation for high-throughput proteomics data science for everyone.
Affiliation(s)
- Niko Pinter
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany
- Damian Glätzer
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Matthias Fahrner
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Klemens Fröhlich
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany; Spemann Graduate School of Biology and Medicine (SGBM), Albert-Ludwigs-University Freiburg, 79104 Freiburg, Germany
- James Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Bettina Warscheid
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany; Faculty of Chemistry and Pharmacy, Department of Biochemistry, Julius Maximilian University of Würzburg, 97074 Würzburg, Germany
- Friedel Drepper
- Biochemistry and Functional Proteomics, Institute of Biology II, Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Oliver Schilling
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; German Cancer Consortium (DKTK) and Cancer Research Center (DKFZ), 79106 Freiburg, Germany
- Melanie Christine Föll
- Institute for Surgical Pathology, Medical Center, University of Freiburg, 79106 Freiburg, Germany; Faculty of Medicine, University of Freiburg, 79110 Freiburg, Germany; Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts 02115, United States
11
Perez-Riverol Y, Bai J, Bandla C, García-Seisdedos D, Hewapathirana S, Kamatchinathan S, Kundu D, Prakash A, Frericks-Zipper A, Eisenacher M, Walzer M, Wang S, Brazma A, Vizcaíno J. The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res 2022; 50:D543-D552. [PMID: 34723319 PMCID: PMC8728295 DOI: 10.1093/nar/gkab1038] [Citation(s) in RCA: 2686] [Impact Index Per Article: 1343.0] [Received: 09/11/2021] [Revised: 10/12/2021] [Accepted: 10/14/2021] [Indexed: 12/12/2022] Open
Abstract
The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data. PRIDE is one of the founding members of the global ProteomeXchange (PX) consortium and an ELIXIR core data resource. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2019. The number of submitted datasets to PRIDE Archive (the archival component of PRIDE) has reached on average around 500 datasets per month during 2021. In addition to continuous improvements in PRIDE Archive data pipelines and infrastructure, the PRIDE Spectra Archive has been developed to provide direct access to the submitted mass spectra using Universal Spectrum Identifiers. As a key point, the file format MAGE-TAB for proteomics has been developed to enable the improvement of sample metadata annotation. Additionally, the resource PRIDE Peptidome provides access to aggregated peptide/protein evidences across PRIDE Archive. Furthermore, we will describe how PRIDE has increased its efforts to reuse and disseminate high-quality proteomics data into other added-value resources such as UniProt, Ensembl and Expression Atlas.
Affiliation(s)
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jingwen Bai
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Chakradhar Bandla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David García-Seisdedos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suresh Hewapathirana
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Selvakumar Kamatchinathan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Deepti J Kundu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ananth Prakash
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anika Frericks-Zipper
- Ruhr University Bochum, Medical Faculty, Medizinisches Proteom-Center, D-44801 Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (PRODI), Medical Proteome Analysis, 44801 Bochum, Germany
- Martin Eisenacher
- Ruhr University Bochum, Medical Faculty, Medizinisches Proteom-Center, D-44801 Bochum, Germany
- Ruhr University Bochum, Center for Protein Diagnostics (PRODI), Medical Proteome Analysis, 44801 Bochum, Germany
- Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Shengbo Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alvis Brazma
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
12
Imbert A, Rompais M, Selloum M, Castelli F, Mouton-Barbosa E, Brandolini-Bunlon M, Chu-Van E, Joly C, Hirschler A, Roger P, Burger T, Leblanc S, Sorg T, Ouzia S, Vandenbrouck Y, Médigue C, Junot C, Ferro M, Pujos-Guillot E, de Peredo AG, Fenaille F, Carapito C, Herault Y, Thévenot EA. ProMetIS, deep phenotyping of mouse models by combined proteomics and metabolomics analysis. Sci Data 2021; 8:311. [PMID: 34862403 PMCID: PMC8642540 DOI: 10.1038/s41597-021-01095-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/22/2021] [Accepted: 11/02/2021] [Indexed: 01/20/2023] Open
Abstract
Genes are pleiotropic, and gaining a better knowledge of their function requires a comprehensive characterization of their mutants. Here, we generated multi-level data combining phenomic, proteomic and metabolomic acquisitions from plasma and liver tissues of two C57BL/6N mouse models lacking the Lat (linker for activation of T cells) and the Mx2 (MX dynamin-like GTPase 2) genes, respectively. Our dataset consists of 9 assays (1 preclinical, 2 proteomics and 6 metabolomics) generated with a fully non-targeted and standardized approach. The data and processing code are publicly available in the ProMetIS R package to ensure accessibility, interoperability, and reusability. The dataset thus provides unique molecular information about the physiological role of the Lat and Mx2 genes. Furthermore, the protocols described herein can be easily extended to a larger number of individuals and tissues. Finally, this resource will be of great interest for developing new bioinformatic and biostatistical methods for multi-omics data integration.
Affiliation(s)
- Alyssa Imbert
- CEA, LIST, Laboratoire Sciences des Données et de la Décision, IFB, MetaboHUB, Gif-sur-Yvette, France
- IFB-core, UMS3601, Genoscope, Evry, France
- Magali Rompais
- Laboratoire de Spectrométrie de Masse BioOrganique, Université de Strasbourg, CNRS, IPHC UMR 7178, ProFI, Strasbourg, France
- Mohammed Selloum
- Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, Phenomin-ICS, Illkirch, France
- Florence Castelli
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
- Emmanuelle Mouton-Barbosa
- Institut de Pharmacologie et Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, ProFI, Toulouse, France
- Marion Brandolini-Bunlon
- Université Clermont Auvergne, INRAE, UNH, Plateforme d'Exploration du Métabolisme, MetaboHUB, Clermont-Ferrand, France
- Emeline Chu-Van
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
- Charlotte Joly
- Université Clermont Auvergne, INRAE, UNH, Plateforme d'Exploration du Métabolisme, MetaboHUB, Clermont-Ferrand, France
- Aurélie Hirschler
- Laboratoire de Spectrométrie de Masse BioOrganique, Université de Strasbourg, CNRS, IPHC UMR 7178, ProFI, Strasbourg, France
- Pierrick Roger
- CEA, LIST, Laboratoire Intelligence Artificielle et Apprentissage Automatique, MetaboHUB, Gif-sur-Yvette, France
- Thomas Burger
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, FR2048, ProFI, Grenoble, France
- Sophie Leblanc
- Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, Phenomin-ICS, Illkirch, France
- Tania Sorg
- Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, Phenomin-ICS, Illkirch, France
- Sadia Ouzia
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
- Yves Vandenbrouck
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, FR2048, ProFI, Grenoble, France
- Claudine Médigue
- IFB-core, UMS3601, Genoscope, Evry, France
- Laboratoire d'Analyses Bioinformatique en Génomique et Métabolisme (LABGeM), CNRS & CEA/DRF/IFJ, UMR8030, Evry, France
- Christophe Junot
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
- Myriam Ferro
- Université Grenoble Alpes, INSERM, CEA, UMR BioSanté U1292, FR2048, ProFI, Grenoble, France
- Estelle Pujos-Guillot
- Université Clermont Auvergne, INRAE, UNH, Plateforme d'Exploration du Métabolisme, MetaboHUB, Clermont-Ferrand, France
- Anne Gonzalez de Peredo
- Institut de Pharmacologie et Biologie Structurale (IPBS), Université de Toulouse, CNRS, UPS, ProFI, Toulouse, France
- François Fenaille
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
- Christine Carapito
- Laboratoire de Spectrométrie de Masse BioOrganique, Université de Strasbourg, CNRS, IPHC UMR 7178, ProFI, Strasbourg, France
- Yann Herault
- Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, Phenomin-ICS, Illkirch, France
- Université de Strasbourg, CNRS, INSERM, Institut de Génétique Biologie Moléculaire et Cellulaire, IGBMC, Illkirch, France
- Etienne A Thévenot
- Université Paris Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (MTS), MetaboHUB, Gif-sur-Yvette, France
13
Wratten L, Wilm A, Göke J. Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nat Methods 2021; 18:1161-1168. [PMID: 34556866 DOI: 10.1038/s41592-021-01254-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 10/20/2020] [Accepted: 07/29/2021] [Indexed: 02/08/2023]
Abstract
The rapid growth of high-throughput technologies has transformed biomedical research. With the increasing amount and complexity of data, scalability and reproducibility have become essential not just for experiments, but also for computational analysis. However, transforming data into information involves running a large number of tools, optimizing parameters, and integrating dynamically changing reference data. Workflow managers were developed in response to such challenges. They simplify pipeline development, optimize resource usage, handle software installation and versions, and run on different compute platforms, enabling workflow portability and sharing. In this Perspective, we highlight key features of workflow managers, compare commonly used approaches for bioinformatics workflows, and provide a guide for computational and noncomputational users. We outline community-curated pipeline initiatives that enable novice and experienced users to perform complex, best-practice analyses without having to manually assemble workflows. In sum, we illustrate how workflow managers contribute to making computational analysis in biomedical research shareable, scalable, and reproducible.
Affiliation(s)
- Jonathan Göke
- Genome Institute of Singapore, Singapore, Singapore.
14
Abstract
PURPOSE: The aim of this article is to describe the technical development in proteomics during the last two decades, with a focus on its use in radiation biology. It is written from a subjective point of view and does not aim to be a scientific review of the subject. CONCLUSION: Proteomics is a fast-developing technique that has already contributed greatly to our understanding of biological mechanisms following radiation exposure. In the future, novel proteomics approaches can be used in adequately designed cellular and animal experiments and, above all, in large clinical trials to investigate the effects of ionizing radiation.
Affiliation(s)
- Soile Tapio
- Institute of Radiation Biology and Institute for Biological and Medical Imaging, Helmholtz Center Munich, German Research Center for Environmental Health GmbH, Neuherberg, Germany
15
Cartelier K, Aimé D, Ly Vu J, Combes-Soia L, Labas V, Prosperi JM, Buitink J, Gallardo K, Le Signor C. Genetic determinants of seed protein plasticity in response to the environment in Medicago truncatula. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2021; 106:1298-1311. [PMID: 33733554 DOI: 10.1111/tpj.15236] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/26/2020] [Revised: 03/04/2021] [Accepted: 03/09/2021] [Indexed: 06/12/2023]
Abstract
As the frequency of extreme environmental events is expected to increase with climate change, identifying candidate genes for stabilizing the protein composition of legume seeds or optimizing this in a given environment is increasingly important. To elucidate the genetic determinants of seed protein plasticity, major seed proteins from 200 ecotypes of Medicago truncatula grown in four contrasting environments were quantified after one-dimensional electrophoresis. The plasticity index of these proteins was recorded for each genotype as the slope of Finlay and Wilkinson's regression and then used for genome-wide association studies (GWASs), enabling the identification of candidate genes for determining this plasticity. This list was enriched in genes related to transcription, DNA repair and signal transduction, with many of them being stress responsive. Other over-represented genes were related to sulfur and aspartate family pathways leading to the synthesis of the nutritionally essential amino acids methionine and lysine. By placing these genes in metabolic pathways, and using a M. truncatula mutant impaired in regenerating methionine from S-methylmethionine, we discovered that methionine recycling pathways are major contributors to globulin composition establishment and plasticity. These data provide a unique resource of genes that can be targeted to mitigate negative impacts of environmental stresses on seed protein composition.
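The plasticity index mentioned in this abstract, the slope of a Finlay and Wilkinson regression, can be illustrated with a minimal Python sketch. It assumes the usual formulation in which each genotype's trait values are regressed on an environmental index taken as the mean of all genotypes in each environment. The function name, ecotype labels, and abundance values are hypothetical and are not taken from the study.

```python
from statistics import mean


def plasticity_index(abundance):
    """Return the Finlay-Wilkinson regression slope for each genotype.

    `abundance` maps a genotype name to a list of trait values, one per
    environment, with the same environment order for every genotype.
    """
    genotypes = list(abundance)
    n_env = len(abundance[genotypes[0]])

    # Environmental index: mean trait value over all genotypes per environment.
    env_index = [mean(abundance[g][e] for g in genotypes) for e in range(n_env)]
    env_mean = mean(env_index)
    sxx = sum((x - env_mean) ** 2 for x in env_index)

    slopes = {}
    for g in genotypes:
        y = abundance[g]
        y_mean = mean(y)
        # Ordinary least-squares slope of y on the environmental index.
        sxy = sum((x - env_mean) * (yi - y_mean) for x, yi in zip(env_index, y))
        slopes[g] = sxy / sxx
    return slopes


# Hypothetical abundances of one seed protein in four environments.
example = {
    "ecotype_A": [10.0, 12.0, 14.0, 16.0],
    "ecotype_B": [10.0, 10.0, 10.0, 10.0],
    "ecotype_C": [10.0, 14.0, 18.0, 22.0],
}
slopes = plasticity_index(example)
```

In this toy data set, a slope near 0 marks a stable genotype and a slope above 1 marks a genotype that responds more strongly than average to the environment. Per-genotype slopes of this kind are the sort of phenotype that could then be passed to an association analysis.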
Affiliation(s)
- Kevin Cartelier
- Agroécologie, AgroSup Dijon, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), Université de Bourgogne, Université Bourgogne Franche-Comté, Dijon, France
- Delphine Aimé
- Agroécologie, AgroSup Dijon, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), Université de Bourgogne, Université Bourgogne Franche-Comté, Dijon, France
- Joseph Ly Vu
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, F-49000, France
- Lucie Combes-Soia
- Physiologie de la Reproduction et des Comportements (PRC) UMR85, INRAE, CNRS, Université de Tours, IFCE, Nouzilly, France
- Valérie Labas
- Physiologie de la Reproduction et des Comportements (PRC) UMR85, INRAE, CNRS, Université de Tours, IFCE, Nouzilly, France
- Jean-Marie Prosperi
- Genetic Improvement and Adaptation of Mediterranean and Tropical Plants (AGAP), INRAE, Centre de coopération internationale en recherche agronomique pour le développement (CIRAD), Montpellier SupAgro, Montpellier, France
- Julia Buitink
- Univ Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, F-49000, France
- Karine Gallardo
- Agroécologie, AgroSup Dijon, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), Université de Bourgogne, Université Bourgogne Franche-Comté, Dijon, France
- Christine Le Signor
- Agroécologie, AgroSup Dijon, Institut national de recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), Université de Bourgogne, Université Bourgogne Franche-Comté, Dijon, France
16
Tolani P, Gupta S, Yadav K, Aggarwal S, Yadav AK. Big data, integrative omics and network biology. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2021; 127:127-160. [PMID: 34340766 DOI: 10.1016/bs.apcsb.2021.03.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/12/2022]
Abstract
A cell integrates various signals through a network of biomolecules that crosstalk to synergistically regulate replication, transcription, translation and other metabolic activities. These networks regulate the signal perception and processing that drive biological functions. This biological complexity cannot be fully captured by a single -omics discipline. The holistic study of an organism (in health, perturbation, exposure to the environment, and disease) falls under systems biology. Bottom-up molecular approaches (genes, mRNA, protein, metabolite, etc.) have laid the foundation of current biological knowledge, spanning viruses, bacteria, fungi, plants and animals. Yet, these techniques provide a rather myopic view of biology at the molecular level. Understanding how the interconnected molecular components are formed and rewired in disease or on exposure to environmental stimuli is the holy grail of modern biology. The omics era was heralded by the genomics revolution, but advanced sequencing techniques are now also ubiquitous in transcriptomics, proteomics, metabolomics and lipidomics. Multi-omics data analysis and integration techniques are driving the quest for deeper insights into how the different layers of biomolecules talk to each other in diverse contexts.
Affiliation(s)
- Priya Tolani
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India
- Srishti Gupta
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; School of Biosciences and Technology, Vellore Institute of Technology, Vellore, India
- Kirti Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; Department of Pharmaceutical Biotechnology, Delhi Pharmaceutical Sciences and Research University, New Delhi, India
- Suruchi Aggarwal
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India; Department of Molecular Biology and Biotechnology, Cotton University, Guwahati, Assam, India
- Amit Kumar Yadav
- Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad, Haryana, India
17
Neely BA. Cloudy with a Chance of Peptides: Accessibility, Scalability, and Reproducibility with Cloud-Hosted Environments. J Proteome Res 2021; 20:2076-2082. [PMID: 33513299 PMCID: PMC8637422 DOI: 10.1021/acs.jproteome.0c00920] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
Cloud-hosted environments offer known benefits when computational needs outstrip affordable local workstations, enabling high-performance computation without a physical cluster. What has been less apparent, especially to novice users, is the transformative potential for cloud-hosted environments to bridge the digital divide that exists between poorly funded and well-resourced laboratories, and to empower modern research groups with remote personnel and trainees. Using cloud-based proteomic bioinformatic pipelines is not predicated on analyzing thousands of files, but instead can be used to improve accessibility during remote work, extreme weather, or working with under-resourced remote trainees. The general benefits of cloud-hosted environments also allow for scalability and encourage reproducibility. Since one possible hurdle to adoption is awareness, this paper is written with the nonexpert in mind. The benefits and possibilities of using a cloud-hosted environment are emphasized by describing how to set up an example workflow to analyze a previously published label-free data-dependent acquisition mass spectrometry data set of mammalian urine. Cost and time of analysis are compared using different computational tiers, and important practical considerations are described. Overall, cloud-hosted environments offer the potential to solve large computational problems, but more importantly can enable and accelerate research in smaller research groups with inadequate infrastructure and suboptimal local computational resources.
Affiliation(s)
- Benjamin A Neely
- Chemical Sciences Division, National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
18
Bai J, Bandla C, Guo J, Alvarez RV, Bai M, Vizcaíno JA, Moreno P, Grüning B, Sallou O, Perez-Riverol Y. BioContainers Registry: Searching Bioinformatics and Proteomics Tools, Packages, and Containers. J Proteome Res 2021; 20:2056-2061. [PMID: 33625229 PMCID: PMC7611561 DOI: 10.1021/acs.jproteome.0c00904] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 02/06/2023]
Abstract
BioContainers is an open-source project that aims to create, store, and distribute bioinformatics software containers and packages. The BioContainers community has developed a set of guidelines to standardize software containers including the metadata, versions, licenses, and software dependencies. BioContainers supports multiple packaging and container technologies such as Conda, Docker, and Singularity. BioContainers provides over 9000 bioinformatics tools, including more than 200 proteomics and mass spectrometry tools. Here we introduce the BioContainers Registry and RESTful API to make containerized bioinformatics tools more findable, accessible, interoperable, and reusable (FAIR). The BioContainers Registry provides a fast and convenient way to find and retrieve bioinformatics tool packages and containers. By doing so, it will increase the use of bioinformatics packages and containers while promoting replicability and reproducibility in research.
Affiliation(s)
- Jingwen Bai
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Chakradhar Bandla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Jiaxin Guo
- College of Bioinformation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Roberto Vera Alvarez
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Mingze Bai
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing, 400065, China
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Pablo Moreno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Björn Grüning
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, 79110, Germany
- Olivier Sallou
- Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA/INRIA) - GenOuest Platform, Université de Rennes, Rennes, France
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
19
Boekweg H, McCown MA, Payne SH. Simple and Efficient Data Analysis Dissemination for Individual Laboratories. J Proteome Res 2020; 19:4191-4195. [PMID: 32790999 DOI: 10.1021/acs.jproteome.0c00454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Scientific progress comes as we build upon the work of others. Implicit in this advance is that we have access to and can thoroughly examine the work of others. It is important to recognize that our scholarly work as scientists encompasses not only experimental design and data collection but also our analytical methods. Thus when communicating biology experiments, especially those that utilize molecular omics data, the analysis methods that connect raw data to scientific conclusions must be presented with sufficient clarity that others can reproduce our exact work. Although there are many resources for sharing raw data files, there is currently not a widely utilized method for sharing analysis methods. We present a semistructured pattern for sharing analysis methods that is simple and efficient and can be implemented by individual laboratories using existing software. This pattern requires three types of files in a publicly accessible repository, such as GitHub: (1) data files, (2) a universal I/O script that parses all data files, and (3) analysis scripts creating figures and metrics reported in the manuscript. We suggest additional conventions to improve the readability and provide a template repository for the pattern. Sharing our exact analysis methods as software, in addition to their narrative description in a manuscript, will ensure reproducibility and transparency. Importantly, the pattern we present does not require new infrastructure and can be achieved without advanced computing skills.
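The three-part pattern this abstract describes (data files, one universal I/O script, and analysis scripts) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' template repository: the directory layout, file name, column names, and both function names are hypothetical, and a temporary directory stands in for a public repository.

```python
import csv
import tempfile
from pathlib import Path


def load_all(data_dir):
    """Universal I/O step: parse every CSV file in data_dir.

    Returns a dict mapping each file's stem to its rows (list of dicts).
    """
    tables = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            tables[path.stem] = list(csv.DictReader(handle))
    return tables


def mean_intensity(tables, name):
    """Analysis step: compute one metric that a manuscript might report."""
    values = [float(row["intensity"]) for row in tables[name]]
    return sum(values) / len(values)


# (1) Data files: a temporary directory stands in for the repository here.
repo = Path(tempfile.mkdtemp())
(repo / "data").mkdir()
(repo / "data" / "sample1.csv").write_text("protein,intensity\nP1,2.0\nP2,4.0\n")

# (2) The single I/O entry point, then (3) an analysis that depends only on it.
tables = load_all(repo / "data")
result = mean_intensity(tables, "sample1")
```

Routing every analysis through one loader means a reader who wants to check a figure or metric only has to understand a single parsing step before reading the analysis code.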
Affiliation(s)
- Hannah Boekweg
- Biology Department, Brigham Young University, Provo, Utah 84602, United States
- Michaela A McCown
- Biology Department, Brigham Young University, Provo, Utah 84602, United States
- Samuel H Payne
- Biology Department, Brigham Young University, Provo, Utah 84602, United States