1
|
Webel H, Niu L, Nielsen AB, Locard-Paulet M, Mann M, Jensen LJ, Rasmussen S. Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning. Nat Commun 2024; 15:5405. [PMID: 38926340 PMCID: PMC11208500 DOI: 10.1038/s41467-024-48711-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Imputation techniques provide means to replace missing measurements with a value and are used in almost all downstream analysis of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Here we demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can impute missing values in the context of LFQ at different levels. We applied our method, proteomics imputation modeling mass spectrometry (PIMMS), to an alcohol-related liver disease (ALD) cohort with blood plasma proteomics data available for 358 individuals. Removing 20 percent of the intensities we were able to recover 15 out of 17 significant abundant protein groups using PIMMS-VAE imputations. When analyzing the full dataset we identified 30 additional proteins (+13.2%) that were significantly differentially abundant across disease stages compared to no imputation and found that some of these were predictive of ALD progression in machine learning models. We, therefore, suggest the use of deep learning approaches for imputing missing values in MS-based proteomics on larger datasets and provide workflows for these.
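The evaluation idea described in this abstract (remove a share of the measured intensities, impute them, and compare against the held-out values) can be sketched as follows. This is a minimal illustration, not the PIMMS implementation: the function name `mask_and_score`, the toy matrix, and the per-feature median imputer are all assumptions chosen to keep the example self-contained, and a real workflow would substitute a learned model for the median step.

```python
import random
import statistics


def mask_and_score(matrix, frac=0.2, seed=0):
    """Hide a fraction of the observed values, impute them, return (MAE, n_hidden).

    `matrix` is a list of samples (rows) by features (columns); None marks a
    value that was already missing. The name and signature are hypothetical.
    """
    rng = random.Random(seed)
    # Only cells that were actually measured can serve as ground truth.
    observed = [(i, j) for i, row in enumerate(matrix)
                for j, v in enumerate(row) if v is not None]
    n_hidden = max(1, int(len(observed) * frac))
    hidden = set(rng.sample(observed, n_hidden))

    # Copy of the data with the held-out cells set to missing.
    masked = [[None if (i, j) in hidden else v for j, v in enumerate(row)]
              for i, row in enumerate(matrix)]

    # Stand-in imputer (assumption): per-feature median of the remaining values.
    # Falls back to 0.0 if a feature has no remaining values at all.
    medians = []
    for j in range(len(matrix[0])):
        col = [row[j] for row in masked if row[j] is not None]
        medians.append(statistics.median(col) if col else 0.0)

    # Mean absolute error between imputed and true values on held-out cells.
    errors = [abs(matrix[i][j] - medians[j]) for (i, j) in hidden]
    return sum(errors) / len(errors), len(hidden)


if __name__ == "__main__":
    # Toy log-intensities: 4 samples x 3 protein groups, one value missing.
    toy = [
        [20.1, 25.3, 18.0],
        [19.8, 25.9, None],
        [20.5, 24.7, 18.4],
        [20.0, 25.1, 17.9],
    ]
    mae, n_hidden = mask_and_score(toy, frac=0.2, seed=42)
    print(f"held-out cells: {n_hidden}, MAE of median imputation: {mae:.3f}")
```

Because the error is computed only on cells whose true values are known, the same harness can compare different imputers on identical held-out cells by fixing the seed.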
Affiliation(s)
- Henry Webel
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
| | - Lili Niu
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
| | - Annelaura Bach Nielsen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
| | - Marie Locard-Paulet
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, Université Toulouse III - Paul Sabatier (UT3), Toulouse, France
| | - Matthias Mann
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
| | - Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark.
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark.
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
|
2
|
Lange E, Kranert L, Krüger J, Benndorf D, Heyer R. Microbiome modeling: a beginner's guide. Front Microbiol 2024; 15:1368377. [PMID: 38962127 PMCID: PMC11220171 DOI: 10.3389/fmicb.2024.1368377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 05/27/2024] [Indexed: 07/05/2024] Open
Abstract
Microbiomes, comprised of diverse microbial species and viruses, play pivotal roles in human health, environmental processes, and biotechnological applications and interact with each other, their environment, and hosts via ecological interactions. Our understanding of microbiomes is still limited and hampered by their complexity. A concept improving this understanding is systems biology, which focuses on the holistic description of biological systems utilizing experimental and computational methods. An important set of such experimental methods are metaomics methods which analyze microbiomes and output lists of molecular features. These lists of data are integrated, interpreted, and compiled into computational microbiome models, to predict, optimize, and control microbiome behavior. There exists a gap in understanding between microbiologists and modelers/bioinformaticians, stemming from a lack of interdisciplinary knowledge. This knowledge gap hinders the establishment of computational models in microbiome analysis. This review aims to bridge this gap and is tailored for microbiologists, researchers new to microbiome modeling, and bioinformaticians. To achieve this goal, it provides an interdisciplinary overview of microbiome modeling, starting with fundamental knowledge of microbiomes, metaomics methods, common modeling formalisms, and how models facilitate microbiome control. It concludes with guidelines and repositories for modeling. Each section provides entry-level information, example applications, and important references, serving as a valuable resource for comprehending and navigating the complex landscape of microbiome research and modeling.
Affiliation(s)
- Emanuel Lange
- Multidimensional Omics Data Analysis, Department for Bioanalytics, Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
- Graduate School Digital Infrastructure for the Life Sciences, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, Bielefeld, Germany
| | - Lena Kranert
- Institute for Automation Engineering, Otto von Guericke University Magdeburg, Magdeburg, Germany
| | - Jacob Krüger
- Engineering of Software-Intensive Systems, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Netherlands
| | - Dirk Benndorf
- Applied Biosciences and Bioprocess Engineering, Anhalt University of Applied Sciences, Köthen, Germany
| | - Robert Heyer
- Multidimensional Omics Data Analysis, Department for Bioanalytics, Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
- Graduate School Digital Infrastructure for the Life Sciences, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, Bielefeld, Germany
- Multidimensional Omics Data Analysis, Faculty of Technology, Bielefeld University, Bielefeld, Germany
|
3
|
Weaver C, Nam A, Settle C, Overton M, Giddens M, Richardson KP, Piver R, Mysona DP, Rungruang B, Ghamande S, McIndoe R, Purohit S. Serum Proteomic Signatures in Cervical Cancer: Current Status and Future Directions. Cancers (Basel) 2024; 16:1629. [PMID: 38730581 PMCID: PMC11083044 DOI: 10.3390/cancers16091629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/18/2024] [Accepted: 04/19/2024] [Indexed: 05/13/2024] Open
Abstract
In 2020, the World Health Organization (WHO) reported 604,000 new diagnoses of cervical cancer (CC) worldwide, and over 300,000 CC-related fatalities. The vast majority of CC cases are caused by persistent human papillomavirus (HPV) infections. HPV-related CC incidence and mortality rates have declined worldwide because of increased HPV vaccination and CC screening with the Papanicolaou test (PAP test). Despite these significant improvements, developing countries face difficulty implementing these programs, while developed nations are challenged with identifying HPV-independent cases. Molecular and proteomic information obtained from blood or tumor samples have a strong potential to provide information on malignancy progression and response to therapy in CC. There is a large amount of published biomarker data related to CC available but the extensive validation required by the FDA approval for clinical use is lacking. The ability of researchers to use the big data obtained from clinical studies and to draw meaningful relationships from these data are two obstacles that must be overcome for implementation into clinical practice. We report on identified multimarker panels of serum proteomic studies in CC for the past 5 years, the potential for modern computational biology efforts, and the utilization of nationwide biobanks to bridge the gap between multivariate protein signature development and the prediction of clinically relevant CC patient outcomes.
Affiliation(s)
- Chaston Weaver
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
| | - Alisha Nam
- Department of Undergraduate Health Professions, College of Allied Health Sciences, Augusta University, Augusta, GA 30912, USA; (A.N.); (C.S.); (M.O.); (M.G.)
| | - Caitlin Settle
- Department of Undergraduate Health Professions, College of Allied Health Sciences, Augusta University, Augusta, GA 30912, USA; (A.N.); (C.S.); (M.O.); (M.G.)
| | - Madelyn Overton
- Department of Undergraduate Health Professions, College of Allied Health Sciences, Augusta University, Augusta, GA 30912, USA; (A.N.); (C.S.); (M.O.); (M.G.)
| | - Maya Giddens
- Department of Undergraduate Health Professions, College of Allied Health Sciences, Augusta University, Augusta, GA 30912, USA; (A.N.); (C.S.); (M.O.); (M.G.)
| | - Katherine P. Richardson
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
| | - Rachael Piver
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
| | - David P. Mysona
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
| | - Bunja Rungruang
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
| | - Sharad Ghamande
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
| | - Richard McIndoe
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
| | - Sharad Purohit
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (C.W.); (K.P.R.); (R.P.); (D.P.M.); (R.M.)
- Department of Undergraduate Health Professions, College of Allied Health Sciences, Augusta University, Augusta, GA 30912, USA; (A.N.); (C.S.); (M.O.); (M.G.)
- Department of Obstetrics and Gynecology, Medical College of Georgia, Augusta University, Augusta, GA 30912, USA; (B.R.); (S.G.)
|
4
|
Danaeifar M, Najafi A. Artificial Intelligence and Computational Biology in Gene Therapy: A Review. Biochem Genet 2024:10.1007/s10528-024-10799-1. [PMID: 38635012 DOI: 10.1007/s10528-024-10799-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 04/02/2024] [Indexed: 04/19/2024]
Abstract
One of the trending fields in almost all areas of science and technology is artificial intelligence. Computational biology and artificial intelligence can help gene therapy in many steps including: gene identification, gene editing, vector design, development of new macromolecules and modeling of gene delivery. There are various tools used by computational biology and artificial intelligence in this field, such as genomics, transcriptomic and proteomics data analysis, machine learning algorithms and molecular interaction studies. These tools can introduce new gene targets, novel vectors, optimized experiment conditions, predict the outcomes and suggest the best solutions to avoid undesired immune responses following gene therapy treatment.
Affiliation(s)
- Mohsen Danaeifar
- Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Science, P.O. Box 19395-5487, Tehran, Iran
| | - Ali Najafi
- Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Science, P.O. Box 19395-5487, Tehran, Iran.
|
5
|
Perron N, Kirst M, Chen S. Bringing CAM photosynthesis to the table: Paving the way for resilient and productive agricultural systems in a changing climate. PLANT COMMUNICATIONS 2024; 5:100772. [PMID: 37990498 PMCID: PMC10943566 DOI: 10.1016/j.xplc.2023.100772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/27/2023] [Accepted: 11/20/2023] [Indexed: 11/23/2023]
Abstract
Modern agricultural systems are directly threatened by global climate change and the resulting freshwater crisis. A considerable challenge in the coming years will be to develop crops that can cope with the consequences of declining freshwater resources and changing temperatures. One approach to meeting this challenge may lie in our understanding of plant photosynthetic adaptations and water use efficiency. Plants from various taxa have evolved crassulacean acid metabolism (CAM), a water-conserving adaptation of photosynthetic carbon dioxide fixation that enables plants to thrive under semi-arid or seasonally drought-prone conditions. Although past research on CAM has led to a better understanding of the inner workings of plant resilience and adaptation to stress, successful introduction of this pathway into C3 or C4 plants has not been reported. The recent revolution in molecular, systems, and synthetic biology, as well as innovations in high-throughput data generation and mining, creates new opportunities to uncover the minimum genetic tool kit required to introduce CAM traits into drought-sensitive crops. Here, we propose four complementary research avenues to uncover this tool kit. First, genomes and computational methods should be used to improve understanding of the nature of variations that drive CAM evolution. Second, single-cell 'omics technologies offer the possibility for in-depth characterization of the mechanisms that trigger environmentally controlled CAM induction. Third, the rapid increase in new 'omics data enables a comprehensive, multimodal exploration of CAM. Finally, the expansion of functional genomics methods is paving the way for integration of CAM into farming systems.
Affiliation(s)
- Noé Perron
- Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL 32608, USA
| | - Matias Kirst
- Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL 32608, USA; School of Forest, Fisheries and Geomatics Sciences, University of Florida, Gainesville, FL 32603, USA.
| | - Sixue Chen
- Department of Biology, University of Mississippi, Oxford, MS 38677-1848, USA.
|
6
|
Kitata RB, Yang JC, Chen YJ. Advances in data-independent acquisition mass spectrometry towards comprehensive digital proteome landscape. MASS SPECTROMETRY REVIEWS 2023; 42:2324-2348. [PMID: 35645145 DOI: 10.1002/mas.21781] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 10/09/2021] [Revised: 12/17/2021] [Accepted: 01/21/2022] [Indexed: 06/15/2023]
Abstract
The data-independent acquisition mass spectrometry (DIA-MS) has rapidly evolved as a powerful alternative for highly reproducible proteome profiling with a unique strength of generating permanent digital maps for retrospective analysis of biological systems. Recent advancements in data analysis software tools for the complex DIA-MS/MS spectra coupled to fast MS scanning speed and high mass accuracy have greatly expanded the sensitivity and coverage of DIA-based proteomics profiling. Here, we review the evolution of the DIA-MS techniques, from earlier proof-of-principle of parallel fragmentation of all-ions or ions in selected m/z range, the sequential window acquisition of all theoretical mass spectra (SWATH-MS) to latest innovations, recent development in computation algorithms for data informatics, and auxiliary tools and advanced instrumentation to enhance the performance of DIA-MS. We further summarize recent applications of DIA-MS and experimentally-derived as well as in silico spectra library resources for large-scale profiling to facilitate biomarker discovery and drug development in human diseases with emphasis on the proteomic profiling coverage. Toward next-generation DIA-MS for clinical proteomics, we outline the challenges in processing multi-dimensional DIA data set and large-scale clinical proteomics, and continuing need in higher profiling coverage and sensitivity.
Affiliation(s)
| | - Jhih-Ci Yang
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Applied Chemistry, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
- Sustainable Chemical Science and Technology, Taiwan International Graduate Program, Academia Sinica and National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
|
7
|
Declercq A, Bouwmeester R, Chiva C, Sabidó E, Hirschler A, Carapito C, Martens L, Degroeve S, Gabriels R. Updated MS²PIP web server supports cutting-edge proteomics applications. Nucleic Acids Res 2023:7151340. [PMID: 37140039 DOI: 10.1093/nar/gkad335] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/26/2023] [Revised: 04/04/2023] [Accepted: 04/25/2023] [Indexed: 05/05/2023] Open
Abstract
Interest in the use of machine learning for peptide fragmentation spectrum prediction has been strongly on the rise over the past years, especially for applications in challenging proteomics identification workflows such as immunopeptidomics and the full-proteome identification of data independent acquisition spectra. Since its inception, the MS²PIP peptide spectrum predictor has been widely used for various downstream applications, mostly thanks to its accuracy, ease-of-use, and broad applicability. We here present a thoroughly updated version of the MS²PIP web server, which includes new and more performant prediction models for both tryptic- and non-tryptic peptides, for immunopeptides, and for CID-fragmented TMT-labeled peptides. Additionally, we have also added new functionality to greatly facilitate the generation of proteome-wide predicted spectral libraries, requiring only a FASTA protein file as input. These libraries also include retention time predictions from DeepLC. Moreover, we now provide pre-built and ready-to-download spectral libraries for various model organisms in multiple DIA-compatible spectral library formats. Besides upgrading the back-end models, the user experience on the MS²PIP web server is thus also greatly enhanced, extending its applicability to new domains, including immunopeptidomics and MS3-based TMT quantification experiments. MS²PIP is freely available at https://iomics.ugent.be/ms2pip/.
Affiliation(s)
- Arthur Declercq
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
| | - Cristina Chiva
- Proteomics Unit, Universitat Pompeu Fabra, 08003, Barcelona, Spain
- Proteomics Unit, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003, Barcelona, Spain
| | - Eduard Sabidó
- Proteomics Unit, Universitat Pompeu Fabra, 08003, Barcelona, Spain
- Proteomics Unit, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003, Barcelona, Spain
| | - Aurélie Hirschler
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), Université de Strasbourg, CNRS, France
| | - Christine Carapito
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), Université de Strasbourg, CNRS, France
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Belgium
- Department of Biomolecular Medicine, Ghent University, Belgium
|
8
|
Claeys T, Menu M, Bouwmeester R, Gevaert K, Martens L. Machine Learning on Large-Scale Proteomics Data Identifies Tissue and Cell-Type Specific Proteins. J Proteome Res 2023; 22:1181-1192. [PMID: 36963412 PMCID: PMC10088018 DOI: 10.1021/acs.jproteome.2c00644] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/26/2023]
Abstract
Using data from 183 public human data sets from PRIDE, a machine learning model was trained to identify tissue and cell-type specific protein patterns. PRIDE projects were searched with ionbot and tissue/cell type annotation was manually added. Data from physiological samples were used to train a Random Forest model on protein abundances to classify samples into tissues and cell types. Subsequently, a one-vs-all classification and feature importance were used to analyze the most discriminating protein abundances per class. Based on protein abundance alone, the model was able to predict tissues with 98% accuracy, and cell types with 99% accuracy. The F-scores describe a clear view on tissue-specific proteins and tissue-specific protein expression patterns. In-depth feature analysis shows slight confusion between physiologically similar tissues, demonstrating the capacity of the algorithm to detect biologically relevant patterns. These results can in turn inform downstream uses, from identification of the tissue of origin of proteins in complex samples such as liquid biopsies, to studying the proteome of tissue-like samples such as organoids and cell lines.
Affiliation(s)
- Tine Claeys
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
| | - Maxime Menu
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
| | - Kris Gevaert
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9052 Ghent, Belgium
|
9
|
Letunica N, McCafferty C, Swaney E, Cai T, Monagle P, Ignjatovic V, Attard C. Proteomic Applications and Considerations: From Research to Patient Care. Methods Mol Biol 2023; 2628:181-192. [PMID: 36781786 DOI: 10.1007/978-1-0716-2978-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Despite technological advancements in the field of proteomics, the rate at which serum and plasma biomarkers identified using proteomic approaches are translated into clinical use remains extremely low. In this chapter, we describe recent technological advancements and analytical strategies in proteomic methods. We also describe the progress of proteomic blood-based biomarkers to date and discuss what the future of proteomics might entail with the use of multi-omic approaches and implementing machine learning on large proteomic datasets. Lastly, we provide several key considerations for biomarker studies, ranging from sample type to the use of reference samples, in order to achieve progress from bench to bedside, ultimately improving patient diagnosis, disease, and/or therapeutic monitoring and care.
Affiliation(s)
- Natasha Letunica
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Conor McCafferty
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
| | - Ella Swaney
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
| | - Tengyi Cai
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia
| | - Paul Monagle
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; Department of Clinical Haematology, Royal Children's Hospital, Melbourne, VIC, Australia; Kids Cancer Centre, Sydney Children's Hospital, Randwick, NSW, Australia
| | - Vera Ignjatovic
- Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; Institute for Clinical and Translational Research, Johns Hopkins All Children's Hospital, St. Petersburg, USA; Department of Pediatrics, Johns Hopkins University, Baltimore, USA
| | - Chantal Attard
- Haematology Research, Murdoch Children's Research Institute, Melbourne, VIC, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, VIC, Australia; The Royal Children's Hospital, Parkville, VIC, Australia
|
10
|
Rehfeldt T, Gabriels R, Bouwmeester R, Gessulat S, Neely BA, Palmblad M, Perez-Riverol Y, Schmidt T, Vizcaíno JA, Deutsch EW. ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics. J Proteome Res 2023; 22:632-636. [PMID: 36693629 PMCID: PMC9903315 DOI: 10.1021/acs.jproteome.2c00629] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/26/2023]
Abstract
Data set acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) data sets, due to the high levels of data reduction that occur between raw data and machine learning-ready data. Since predictive proteomics is an emerging field, when predicting peptide behavior in LC-MS setups, each lab often uses unique and complex data processing pipelines in order to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based data sets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides data sets that are useful for comparing state-of-the-art machine learning algorithms, as well as providing introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/ProteomicsML/ProteomicsML.
Affiliation(s)
- Tobias G. Rehfeldt
- Institute for Mathematics and Computer Science, University of Southern Denmark, 5000 Odense, Denmark
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9052, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent 9052, Belgium
| | | | - Benjamin A. Neely
- National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; Phone: +44 (0) 1223 492686
| | - Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States; Phone: 206-732-1200, Fax: 206-732-1299
|
11
|
Illing PT, Ramarathinam SH, Purcell AW. New insights and approaches for analyses of immunopeptidomes. Curr Opin Immunol 2022; 77:102216. [PMID: 35716458 DOI: 10.1016/j.coi.2022.102216] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/11/2022] [Accepted: 05/10/2022] [Indexed: 11/03/2022]
Abstract
Human leucocyte antigen (HLA) molecules play a key role in health and disease by presenting antigen to T-lymphocytes for immunosurveillance. Immunopeptidomics involves the study of the collection of peptides presented within the antigen-binding groove of HLA molecules. Identifying their nature and diversity is crucial to understanding immunosurveillance especially during infection or for the recognition and potential eradication of tumours. This review discusses recent advances in the isolation, identification, and quantitation of these peptide antigens. New informatics approaches and databases have shed light on the extent of peptide antigens derived from unconventional sources including peptides derived from transcripts associated with frame shifts, long noncoding RNA, incorrectly annotated untranslated regions, post-translational modifications, and proteasomal splicing. Several challenges remain in successful analysis of immunopeptides, yet recent developments point to unexplored biology waiting to be unravelled.
Affiliation(s)
- Patricia T Illing
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
| | - Sri H Ramarathinam
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
| | - Anthony W Purcell
- Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
| |
|
12
|
Walzer M, García-Seisdedos D, Prakash A, Brack P, Crowther P, Graham RL, George N, Mohammed S, Moreno P, Papatheodorou I, Hubbard SJ, Vizcaíno JA. Implementing the reuse of public DIA proteomics datasets: from the PRIDE database to Expression Atlas. Sci Data 2022; 9:335. [PMID: 35701420 PMCID: PMC9197839 DOI: 10.1038/s41597-022-01380-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/15/2021] [Accepted: 05/12/2022] [Indexed: 11/14/2022] Open
Abstract
The number of mass spectrometry (MS)-based proteomics datasets in the public domain keeps increasing, particularly those generated by Data Independent Acquisition (DIA) approaches such as SWATH-MS. Unlike Data Dependent Acquisition datasets, the re-use of DIA datasets has been rather limited to date, despite its high potential, due to the technical challenges involved. We introduce a (re-)analysis pipeline for public SWATH-MS datasets which includes a combination of metadata annotation protocols, automated workflows for MS data analysis, statistical analysis, and the integration of the results into the Expression Atlas resource. Automation is orchestrated with Nextflow, using containerised open analysis software tools, rendering the pipeline readily available and reproducible. To demonstrate its utility, we reanalysed 10 public DIA datasets from the PRIDE database, comprising 1,278 SWATH-MS runs. The robustness of the analysis was evaluated, and the results compared to those obtained in the original publications. The final expression values were integrated into Expression Atlas, making SWATH-MS experiments more widely available and combining them with expression data originating from other proteomics and transcriptomics datasets.
Affiliation(s)
- Mathias Walzer
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - David García-Seisdedos
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Ananth Prakash
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Paul Brack
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Oxford Road, Manchester, M13 9PT, United Kingdom
| | - Peter Crowther
- Melandra Limited, 16 Brook Road, Urmston, Manchester, M41 5RY, United Kingdom
| | - Robert L Graham
- School of Biological Sciences, Chlorine Gardens, Queen's University Belfast, Belfast, BT9 5DL, United Kingdom
| | - Nancy George
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Suhaib Mohammed
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Pablo Moreno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Simon J Hubbard
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Oxford Road, Manchester, M13 9PT, United Kingdom
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, United Kingdom
| |
|
13
|
A comprehensive LFQ benchmark dataset on modern day acquisition strategies in proteomics. Sci Data 2022; 9:126. [PMID: 35354825 PMCID: PMC8967878 DOI: 10.1038/s41597-022-01216-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/23/2021] [Accepted: 02/23/2022] [Indexed: 12/23/2022] Open
Abstract
In the last decade, a revolution in liquid chromatography-mass spectrometry (LC-MS) based proteomics has unfolded with the introduction of dozens of novel instruments that incorporate additional data dimensions through innovative acquisition methodologies, in turn inspiring specialized data analysis pipelines. Simultaneously, a growing number of proteomics datasets have been made publicly available through data repositories such as ProteomeXchange, Zenodo and Skyline Panorama. However, developing algorithms to mine this data and assessing the performance on different platforms is currently hampered by the lack of a single benchmark experimental design. Therefore, we acquired a hybrid proteome mixture on different instrument platforms and in all currently available families of data acquisition. Here, we present a comprehensive Data-Dependent and Data-Independent Acquisition (DDA/DIA) dataset acquired using several of the most commonly used current-day instrumental platforms. The dataset consists of over 700 LC-MS runs, including adequate replicates allowing robust statistics and covering nearly 10 different data formats, including scanning quadrupole and ion mobility enabled acquisitions. Datasets are available via ProteomeXchange (PXD028735). Measurement(s): Digital Data Repository. Technology Type(s): Digital Data Repository.
|
14
|
Urban J. A review on recent trends in the phosphoproteomics workflow. From sample preparation to data analysis. Anal Chim Acta 2022; 1199:338857. [DOI: 10.1016/j.aca.2021.338857] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/07/2021] [Revised: 07/14/2021] [Accepted: 07/15/2021] [Indexed: 12/12/2022]
|
15
|
Schallert K, Verschaffelt P, Mesuere B, Benndorf D, Martens L, Van Den Bossche T. Pout2Prot: An Efficient Tool to Create Protein (Sub)groups from Percolator Output Files. J Proteome Res 2022; 21:1175-1180. [PMID: 35143215 DOI: 10.1021/acs.jproteome.1c00685] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
In metaproteomics, the study of the collective proteome of microbial communities, the protein inference problem is more challenging than in single-species proteomics. Indeed, a peptide sequence can be present not only in multiple proteins or protein isoforms of the same species, but also in homologous proteins from closely related species. To assign the taxonomy and functions of the microbial species, specialized tools have been developed, such as Prophane. This tool, however, is not directly compatible with post-processing tools such as Percolator. In this manuscript we therefore present Pout2Prot, which takes Percolator Output (.pout) files from multiple experiments and creates protein group and protein subgroup output files (.tsv) that can be used directly with Prophane. We investigated different grouping strategies and compared existing protein grouping tools to develop an advanced protein grouping algorithm that offers a variety of different approaches, allows grouping for multiple files, and uses a weighted spectral count for protein (sub)groups to reflect abundance. Pout2Prot is available as a web application at https://pout2prot.ugent.be and is installable via pip as a standalone command line tool and reusable software library. All code is open source under the Apache License 2.0 and is available at https://github.com/compomics/pout2prot.
Affiliation(s)
- Kay Schallert
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, 39104 Magdeburg, Germany.,Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, 39104 Magdeburg, Germany
| | - Pieter Verschaffelt
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium.,VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
| | - Bart Mesuere
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium.,VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
| | - Dirk Benndorf
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, 39104 Magdeburg, Germany.,Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, 39104 Magdeburg, Germany.,Microbiology, Department of Applied Biosciences and Process Technology, Anhalt University of Applied Sciences, 06366 Köthen, Germany
| | - Lennart Martens
- VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Tim Van Den Bossche
- VIB - UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| |
|
16
|
Van Den Bossche T, Kunath BJ, Schallert K, Schäpe SS, Abraham PE, Armengaud J, Arntzen MØ, Bassignani A, Benndorf D, Fuchs S, Giannone RJ, Griffin TJ, Hagen LH, Halder R, Henry C, Hettich RL, Heyer R, Jagtap P, Jehmlich N, Jensen M, Juste C, Kleiner M, Langella O, Lehmann T, Leith E, May P, Mesuere B, Miotello G, Peters SL, Pible O, Queiros PT, Reichl U, Renard BY, Schiebenhoefer H, Sczyrba A, Tanca A, Trappe K, Trezzi JP, Uzzau S, Verschaffelt P, von Bergen M, Wilmes P, Wolf M, Martens L, Muth T. Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 PMCID: PMC8674281 DOI: 10.1038/s41467-021-27542-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 12/17/2022] Open
Abstract
Metaproteomics has matured into a powerful tool to assess functional interactions in microbial communities. While many metaproteomic workflows are available, the impact of method choice on results remains unclear. Here, we carry out a community-driven, multi-laboratory comparison in metaproteomics: the critical assessment of metaproteome investigation study (CAMPI). Based on well-established workflows, we evaluate the effect of sample preparation, mass spectrometry, and bioinformatic analysis using two samples: a simplified, laboratory-assembled human intestinal model and a human fecal sample. We observe that variability at the peptide level is predominantly due to sample processing workflows, with a smaller contribution of bioinformatic pipelines. These peptide-level differences largely disappear at the protein group level. While differences are observed for predicted community composition, similar functional profiles are obtained across workflows. CAMPI demonstrates the robustness of present-day metaproteomics research, serves as a template for multi-laboratory studies in metaproteomics, and provides publicly available data sets for benchmarking future developments.
Affiliation(s)
- Tim Van Den Bossche
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Benoit J Kunath
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Kay Schallert
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Stephanie S Schäpe
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
| | - Paul E Abraham
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Jean Armengaud
- Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
| | - Magnus Ø Arntzen
- Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås, Norway
| | - Ariane Bassignani
- INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
| | - Dirk Benndorf
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Microbiology, Department of Applied Biosciences and Process Technology, Anhalt University of Applied Sciences, Köthen, Germany
- Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
| | - Stephan Fuchs
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
| | | | - Timothy J Griffin
- Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Live H Hagen
- Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås, Norway
| | - Rashi Halder
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Céline Henry
- INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
| | - Robert L Hettich
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Robert Heyer
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Pratik Jagtap
- Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Nico Jehmlich
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
| | - Marlene Jensen
- Department of Plant & Microbial Biology, North Carolina State University, Raleigh, USA
| | - Catherine Juste
- INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
| | - Manuel Kleiner
- Department of Plant & Microbial Biology, North Carolina State University, Raleigh, USA
| | - Olivier Langella
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
| | - Theresa Lehmann
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Emma Leith
- Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
| | - Patrick May
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Bart Mesuere
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Guylaine Miotello
- Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
| | - Samantha L Peters
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Olivier Pible
- Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
| | - Pedro T Queiros
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Udo Reichl
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
| | - Bernhard Y Renard
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Data Analytics and Computational Statistics, Hasso-Plattner-Institute, Faculty of Digital Engineering, University of Potsdam, Potsdam, Germany
| | - Henning Schiebenhoefer
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Data Analytics and Computational Statistics, Hasso-Plattner-Institute, Faculty of Digital Engineering, University of Potsdam, Potsdam, Germany
| | | | - Alessandro Tanca
- Department of Biomedical Sciences, University of Sassari, Sassari, Italy
| | - Kathrin Trappe
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
| | - Jean-Pierre Trezzi
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Integrated Biobank of Luxembourg, Luxembourg Institute of Health, 1, rue Louis Rech, L-3555, Dudelange, Luxembourg
| | - Sergio Uzzau
- Department of Biomedical Sciences, University of Sassari, Sassari, Italy
| | - Pieter Verschaffelt
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Martin von Bergen
- Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
| | - Paul Wilmes
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Department of Life Sciences and Medicine, Faculty of Science, Technology and Medicine, University of Luxembourg, 6 avenue du Swing, L-4367, Belvaux, Luxembourg
| | - Maximilian Wolf
- Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Lennart Martens
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Thilo Muth
- Section eScience (S.3), Federal Institute for Materials Research and Testing, Berlin, Germany
| |
|
17
|
Van Den Bossche T, Kunath BJ, Schallert K, Schäpe SS, Abraham PE, Armengaud J, Arntzen MØ, Bassignani A, Benndorf D, Fuchs S, Giannone RJ, Griffin TJ, Hagen LH, Halder R, Henry C, Hettich RL, Heyer R, Jagtap P, Jehmlich N, Jensen M, Juste C, Kleiner M, Langella O, Lehmann T, Leith E, May P, Mesuere B, Miotello G, Peters SL, Pible O, Queiros PT, Reichl U, Renard BY, Schiebenhoefer H, Sczyrba A, Tanca A, Trappe K, Trezzi JP, Uzzau S, Verschaffelt P, von Bergen M, Wilmes P, Wolf M, Martens L, Muth T. Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 DOI: 10.1101/2021.03.05.433915] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 05/21/2023] Open
|
18
|
Aumailley L, Bourassa S, Gotti C, Droit A, Lebel M. Vitamin C Differentially Impacts the Serum Proteome Profile in Female and Male Mice. J Proteome Res 2021; 20:5036-5053. [PMID: 34643398 DOI: 10.1021/acs.jproteome.1c00542] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022]
Abstract
A suboptimal blood vitamin C (ascorbate) level increases the risk of several chronic diseases. However, the detection of hypovitaminosis C is not a simple task, as ascorbate is unstable in blood samples. In this study, we examined the serum proteome of mice lacking the gulonolactone oxidase (Gulo) required for ascorbate biosynthesis. Gulo-/- mice were supplemented with different concentrations of ascorbate in drinking water, and serum was collected to identify proteins correlating with serum ascorbate levels using an unbiased label-free liquid chromatography-tandem mass spectrometry global quantitative proteomic approach. Parallel reaction monitoring was performed to validate the correlations. We uncovered that the serum proteome profiles differ significantly between male and female mice. Also, unlike in Gulo-/- males, a four-week ascorbate treatment did not entirely re-establish the serum proteome profile of ascorbate-deficient Gulo-/- females to the optimal profile exhibited by Gulo-/- females that never experienced an ascorbate deficiency. Finally, the serum proteins involved in retinoid metabolism, cholesterol, and lipid transport were similarly affected by ascorbate levels in males and females. In contrast, the proteins regulating serum peptidases and the proteins of the acute phase response were different between males and females. These proteins are potential biomarkers correlating with blood ascorbate levels and require further study in standard clinical settings. The complete proteomics data set generated in this study has been deposited to the public repository ProteomeXchange with the data set identifier: PXD027019.
Affiliation(s)
- Lucie Aumailley
- Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada
| | - Sylvie Bourassa
- Proteomics Platform, Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada
| | - Clarisse Gotti
- Proteomics Platform, Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada
| | - Arnaud Droit
- Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada.,Proteomics Platform, Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada
| | - Michel Lebel
- Centre de recherche du CHU de Québec, Faculty of Medicine, Université Laval, Québec City, Québec G1V 4G2, Canada
| |
|
19
|
Bouwmeester R, Gabriels R, Hulstaert N, Martens L, Degroeve S. DeepLC can predict retention times for peptides that carry as-yet unseen modifications. Nat Methods 2021; 18:1363-1369. [PMID: 34711972 DOI: 10.1038/s41592-021-01301-5] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 04/15/2020] [Accepted: 09/13/2021] [Indexed: 11/09/2022]
Abstract
The inclusion of peptide retention time prediction promises to remove peptide identification ambiguity in complex liquid chromatography-mass spectrometry identification workflows. However, due to the way peptides are encoded in current prediction models, accurate retention times cannot be predicted for modified peptides. This is especially problematic for fledgling open searches, which will benefit from accurate retention time prediction for modified peptides to reduce identification ambiguity. We present DeepLC, a deep learning peptide retention time predictor using peptide encoding based on atomic composition that allows the retention time of (previously unseen) modified peptides to be predicted accurately. We show that DeepLC performs similarly to current state-of-the-art approaches for unmodified peptides and, more importantly, accurately predicts retention times for modifications not seen during training. Moreover, we show that DeepLC's ability to predict retention times for any modification enables potentially incorrect identifications to be flagged in an open search of a wide variety of proteome data.
Affiliation(s)
- Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Niels Hulstaert
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| |
|
20
|
Parmar BS, Peeters MKR, Boonen K, Clark EC, Baggerman G, Menschaert G, Temmerman L. Identification of Non-Canonical Translation Products in C. elegans Using Tandem Mass Spectrometry. Front Genet 2021; 12:728900. [PMID: 34759956 PMCID: PMC8575065 DOI: 10.3389/fgene.2021.728900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Accepted: 09/16/2021] [Indexed: 11/22/2022] Open
Abstract
Transcriptome and ribosome sequencing have revealed the existence of many non-canonical transcripts, mainly containing splice variants, ncRNA, sORFs and altORFs. However, identification and characterization of products that may be translated out of these remains a challenge. Addressing this, we here report on 552 non-canonical proteins and splice variants in the model organism C. elegans using tandem mass spectrometry. Aided by sequencing-based prediction, we generated a custom proteome database tailored to search for non-canonical translation products of C. elegans. Using this database, we mined available mass spectrometric resources of C. elegans, from which 51 novel, non-canonical proteins could be identified. Furthermore, we utilized diverse proteomic and peptidomic strategies to detect 40 novel non-canonical proteins in C. elegans by LC-TIMS-MS/MS, of which 6 were common with our meta-analysis of existing resources. Together, this permits us to provide a resource with detailed annotation of 467 splice variants and 85 novel proteins mapped onto UTRs, non-coding regions and alternative open reading frames of the C. elegans genome.
Affiliation(s)
- Bhavesh S. Parmar
- Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
| | - Marlies K. R. Peeters
- Laboratory of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Kurt Boonen
- Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
| | - Ellie C. Clark
- Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
| | - Geert Baggerman
- Centre for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
| | - Gerben Menschaert
- Laboratory of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Liesbet Temmerman
- Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
| |
|
21
|
Peeters MKR, Baggerman G, Gabriels R, Pepermans E, Menschaert G, Boonen K. Ion Mobility Coupled to a Time-of-Flight Mass Analyzer Combined With Fragment Intensity Predictions Improves Identification of Classical Bioactive Peptides and Small Open Reading Frame-Encoded Peptides. Front Cell Dev Biol 2021; 9:720570. [PMID: 34604223 PMCID: PMC8484717 DOI: 10.3389/fcell.2021.720570] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/04/2021] [Accepted: 08/25/2021] [Indexed: 12/29/2022] Open
Abstract
Bioactive peptides play key roles in a wide variety of complex processes, such as regulation of body weight, learning, aging, and innate immune response. Next to the classical bioactive peptides, which emerge from larger precursor proteins by specific proteolytic processing, a new class of peptides originating from small open reading frames (sORFs) has been recognized as important biological regulators. But their intrinsic properties, specific expression pattern and location on presumed non-coding regions have hindered the full characterization of the repertoire of bioactive peptides, despite their predominant role in various pathways. Although the development of peptidomics has offered the opportunity to study these peptides in vivo, it remains challenging to identify the full peptidome, as the lack of cleavage enzyme specification and the large search space complicate conventional database search approaches. In this study, we introduce a proteogenomics methodology using a new type of mass spectrometry instrument and the implementation of machine learning tools toward improved identification of potential bioactive peptides in the mouse brain. The application of trapped ion mobility spectrometry (TIMS) coupled to a time-of-flight mass analyzer (TOF) offers improved sensitivity, enhanced peptide coverage, a reduction in chemical noise and a reduced occurrence of chimeric spectra. The machine learning tools MS2PIP, which predicts fragment ion intensities, and DeepLC, which predicts retention times, then improve database searching based on a large and comprehensive custom database containing both sORFs and alternative ORFs. Finally, the identification of peptides is further enhanced by applying the post-processing semi-supervised learning tool Percolator.
Applying this workflow, the first peptidomics workflow combined with spectral intensity and retention time predictions, we identified a total of 167 predicted sORF-encoded peptides (SEPs), of which 48 originate from presumed non-coding locations, next to 401 peptides from known neuropeptide precursors, linked to 66 annotated bioactive neuropeptides from within 22 different families. Additional PEAKS analysis expanded the pool of SEPs on presumed non-coding locations to 84, while an additional 204 peptides completed the list of peptides from neuropeptide precursors. Altogether, this study provides insights into a new robust pipeline that fuses technological advancements from different fields, ensuring improved coverage of the neuropeptidome in the mouse brain.
Affiliation(s)
- Marlies K. R. Peeters
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
| | - Geert Baggerman
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
| | - Ralf Gabriels
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- VIB-UGent Center for Medical Biotechnology, Flanders Institute for Biotechnology, Ghent, Belgium
| | - Elise Pepermans
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
| | - Gerben Menschaert
- BioBix, Department of Data Analysis and Mathematical Modelling, Ghent University, Ghent, Belgium
- OHMX.bio, Ghent, Belgium
| | - Kurt Boonen
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Unit Environmental Risk and Health, Flemish Institute for Technological Research, Mol, Belgium
| |
|
22
|
Van Puyvelde B, Van Uytfanghe K, Tytgat O, Van Oudenhove L, Gabriels R, Bouwmeester R, Daled S, Van Den Bossche T, Ramasamy P, Verhelst S, De Clerck L, Corveleyn L, Willems S, Debunne N, Wynendaele E, De Spiegeleer B, Judak P, Roels K, De Wilde L, Van Eenoo P, Reyns T, Cherlet M, Dumont E, Debyser G, t'Kindt R, Sandra K, Gupta S, Drouin N, Harms A, Hankemeier T, Jones DJL, Gupta P, Lane D, Lane CS, El Ouadi S, Vincendet JB, Morrice N, Oehrle S, Tanna N, Silvester S, Hannam S, Sigloch FC, Bhangu-Uhlmann A, Claereboudt J, Anderson NL, Razavi M, Degroeve S, Cuypers L, Stove C, Lagrou K, Martens GA, Deforce D, Martens L, Vissers JPC, Dhaenens M. Cov-MS: A Community-Based Template Assay for Mass-Spectrometry-Based Protein Detection in SARS-CoV-2 Patients. JACS AU 2021. [PMID: 34254058 DOI: 10.1101/2020.11.18.20231688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Rising population density and global mobility are among the reasons why pathogens such as SARS-CoV-2, the virus that causes COVID-19, spread so rapidly across the globe. The policy response to such pandemics will always have to include accurate monitoring of the spread, as this provides one of the few alternatives to total lockdown. However, COVID-19 diagnosis is currently performed almost exclusively by reverse transcription polymerase chain reaction (RT-PCR). Although this is efficient, automatable, and acceptably cheap, reliance on one type of technology comes with serious caveats, as illustrated by recurring reagent and test shortages. We therefore developed an alternative diagnostic test that detects proteolytically digested SARS-CoV-2 proteins using mass spectrometry (MS). We established the Cov-MS consortium, consisting of 15 academic laboratories and several industrial partners to increase applicability, accessibility, sensitivity, and robustness of this kind of SARS-CoV-2 detection. This, in turn, gave rise to the Cov-MS Digital Incubator that allows other laboratories to join the effort, navigate, and share their optimizations and translate the assay into their clinic. As this test relies on viral proteins instead of RNA, it provides an orthogonal and complementary approach to RT-PCR using other reagents that are relatively inexpensive and widely available, as well as orthogonally skilled personnel and different instruments. Data are available via ProteomeXchange with identifier PXD022550.
Affiliation(s)
- Bart Van Puyvelde
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Katleen Van Uytfanghe
- Laboratory of Toxicology, Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Olivier Tytgat
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
- Department of Life Science Technologies, Imec, 3000 Leuven, Belgium
| | | | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Simon Daled
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Pathmanaban Ramasamy
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, 1050 Brussels, Belgium
| | - Sigrid Verhelst
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Laura De Clerck
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Laura Corveleyn
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Sander Willems
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Nathan Debunne
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Evelien Wynendaele
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Bart De Spiegeleer
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Peter Judak
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Kris Roels
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Laurie De Wilde
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Peter Van Eenoo
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Tim Reyns
- Department of Clinical Chemistry, Ghent University Hospital, 9000 Ghent, Belgium
| | - Marc Cherlet
- Department of Pharmacology, Toxicology, and Biochemistry, Faculty of Veterinary Medicine, Ghent University 9000 Ghent, Belgium
| | - Emmie Dumont
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Griet Debyser
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Ruben t'Kindt
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Koen Sandra
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Surya Gupta
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Nicolas Drouin
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Amy Harms
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Thomas Hankemeier
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Donald J L Jones
- Leicester Cancer Research Centre, RKCSB, University of Leicester, U.K., and John and Lucille van Geest Biomarker Facility, Cardiovascular Research Centre, Glenfield Hospital, Leicester LE1 7RH, United Kingdom
| | - Pankaj Gupta
- The Department of Chemical Pathology and Metabolic Diseases, Level 4, Sandringham Building, Leicester Royal Infirmary, Leicester LE1 7RH, United Kingdom
| | - Dan Lane
- The Department of Chemical Pathology and Metabolic Diseases, Level 4, Sandringham Building, Leicester Royal Infirmary, Leicester LE1 7RH, United Kingdom
| | | | - Said El Ouadi
- AB Sciex, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | | | - Nick Morrice
- AB Sciex, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | - Stuart Oehrle
- Waters Corporation, Milford, Massachusetts 01757, United States
| | - Nikunj Tanna
- Waters Corporation, Milford, Massachusetts 01757, United States
| | - Steve Silvester
- Alderley Analytical, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | - Sally Hannam
- Alderley Analytical, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | | | | | | | - N Leigh Anderson
- SISCAPA Assay Technologies, Inc., Washington, D.C. 20009, United States
| | - Morteza Razavi
- SISCAPA Assay Technologies, Inc., Washington, D.C. 20009, United States
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Lize Cuypers
- Clinical Department of Laboratory Medicine, UZ Leuven, KU Leuven, 3000 Leuven, Belgium
| | - Christophe Stove
- Laboratory of Toxicology, Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Katrien Lagrou
- Clinical Department of Laboratory Medicine, UZ Leuven, KU Leuven, 3000 Leuven, Belgium
| | - Geert A Martens
- AZ Delta Medical Laboratories, AZ Delta General Hospital, 8800 Roeselare, Belgium
| | - Dieter Deforce
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | | | - Maarten Dhaenens
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| |
|
23
|
Van Puyvelde B, Van Uytfanghe K, Tytgat O, Van Oudenhove L, Gabriels R, Bouwmeester R, Daled S, Van Den Bossche T, Ramasamy P, Verhelst S, De Clerck L, Corveleyn L, Willems S, Debunne N, Wynendaele E, De Spiegeleer B, Judak P, Roels K, De Wilde L, Van Eenoo P, Reyns T, Cherlet M, Dumont E, Debyser G, t’Kindt R, Sandra K, Gupta S, Drouin N, Harms A, Hankemeier T, Jones DJL, Gupta P, Lane D, Lane CS, El Ouadi S, Vincendet JB, Morrice N, Oehrle S, Tanna N, Silvester S, Hannam S, Sigloch FC, Bhangu-Uhlmann A, Claereboudt J, Anderson NL, Razavi M, Degroeve S, Cuypers L, Stove C, Lagrou K, Martens GA, Deforce D, Martens L, Vissers JPC, Dhaenens M. Cov-MS: A Community-Based Template Assay for Mass-Spectrometry-Based Protein Detection in SARS-CoV-2 Patients. JACS AU 2021; 1:750-765. [PMID: 34254058 PMCID: PMC8230961 DOI: 10.1021/jacsau.1c00048] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/09/2021] [Indexed: 05/03/2023]
Abstract
Rising population density and global mobility are among the reasons why pathogens such as SARS-CoV-2, the virus that causes COVID-19, spread so rapidly across the globe. The policy response to such pandemics will always have to include accurate monitoring of the spread, as this provides one of the few alternatives to total lockdown. However, COVID-19 diagnosis is currently performed almost exclusively by reverse transcription polymerase chain reaction (RT-PCR). Although this is efficient, automatable, and acceptably cheap, reliance on one type of technology comes with serious caveats, as illustrated by recurring reagent and test shortages. We therefore developed an alternative diagnostic test that detects proteolytically digested SARS-CoV-2 proteins using mass spectrometry (MS). We established the Cov-MS consortium, consisting of 15 academic laboratories and several industrial partners to increase applicability, accessibility, sensitivity, and robustness of this kind of SARS-CoV-2 detection. This, in turn, gave rise to the Cov-MS Digital Incubator that allows other laboratories to join the effort, navigate, and share their optimizations and translate the assay into their clinic. As this test relies on viral proteins instead of RNA, it provides an orthogonal and complementary approach to RT-PCR using other reagents that are relatively inexpensive and widely available, as well as orthogonally skilled personnel and different instruments. Data are available via ProteomeXchange with identifier PXD022550.
Affiliation(s)
- Bart Van Puyvelde
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Katleen Van Uytfanghe
- Laboratory of Toxicology, Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Olivier Tytgat
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
- Department of Life Science Technologies, Imec, 3000 Leuven, Belgium
| | | | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Simon Daled
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Pathmanaban Ramasamy
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, 1050 Brussels, Belgium
| | - Sigrid Verhelst
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Laura De Clerck
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Laura Corveleyn
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Sander Willems
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Nathan Debunne
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Evelien Wynendaele
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Bart De Spiegeleer
- Drug Quality and Registration Group, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Peter Judak
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Kris Roels
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Laurie De Wilde
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Peter Van Eenoo
- Doping Control Laboratory, Department of Diagnostic Sciences, Ghent University, 9000 Ghent, Belgium
| | - Tim Reyns
- Department of Clinical Chemistry, Ghent University Hospital, 9000 Ghent, Belgium
| | - Marc Cherlet
- Department of Pharmacology, Toxicology, and Biochemistry, Faculty of Veterinary Medicine, Ghent University 9000 Ghent, Belgium
| | - Emmie Dumont
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Griet Debyser
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Ruben t’Kindt
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Koen Sandra
- Research Institute for Chromatography (RIC), 8500 Kortrijk, Belgium
| | - Surya Gupta
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Nicolas Drouin
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Amy Harms
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Thomas Hankemeier
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, 2311 G Leiden, The Netherlands
| | - Donald J. L. Jones
- Leicester Cancer Research Centre, RKCSB, University of Leicester, U.K., and John and Lucille van Geest Biomarker Facility, Cardiovascular Research Centre, Glenfield Hospital, Leicester LE1 7RH, United Kingdom
| | - Pankaj Gupta
- The Department of Chemical Pathology and Metabolic Diseases, Level 4, Sandringham Building, Leicester Royal Infirmary, Leicester LE1 7RH, United Kingdom
| | - Dan Lane
- The Department of Chemical Pathology and Metabolic Diseases, Level 4, Sandringham Building, Leicester Royal Infirmary, Leicester LE1 7RH, United Kingdom
| | | | - Said El Ouadi
- AB Sciex, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | | | - Nick Morrice
- AB Sciex, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | - Stuart Oehrle
- Waters Corporation, Milford, Massachusetts 01757, United States
| | - Nikunj Tanna
- Waters Corporation, Milford, Massachusetts 01757, United States
| | - Steve Silvester
- Alderley Analytical, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | - Sally Hannam
- Alderley Analytical, Alderley Park, Macclesfield SK10 4TG, United Kingdom
| | | | | | | | - N. Leigh Anderson
- SISCAPA Assay Technologies, Inc., Washington, D.C. 20009, United States
| | - Morteza Razavi
- SISCAPA Assay Technologies, Inc., Washington, D.C. 20009, United States
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | - Lize Cuypers
- Clinical Department of Laboratory Medicine, UZ Leuven, KU Leuven, 3000 Leuven, Belgium
| | - Christophe Stove
- Laboratory of Toxicology, Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium
| | - Katrien Lagrou
- Clinical Department of Laboratory Medicine, UZ Leuven, KU Leuven, 3000 Leuven, Belgium
| | - Geert A. Martens
- AZ Delta Medical Laboratories, AZ Delta General Hospital, 8800 Roeselare, Belgium
| | - Dieter Deforce
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, 9000 Ghent Belgium
| | | | - Maarten Dhaenens
- ProGenTomics, Laboratory of Pharmaceutical Biotechnology, Ghent University, 9000 Ghent, Belgium
| |
|
24
|
Abstract
Mass-spectrometry-based proteomics enables quantitative analysis of thousands of human proteins. However, experimental and computational challenges restrict progress in the field. This review summarizes the recent flurry of machine-learning strategies using artificial deep neural networks (or "deep learning") that have started to break barriers and accelerate progress in the field of shotgun proteomics. Deep learning now accurately predicts physicochemical properties of peptides from their sequence, including tandem mass spectra and retention time. Furthermore, deep learning methods exist for nearly every aspect of the modern proteomics workflow, enabling improved feature selection, peptide identification, and protein inference.
Affiliation(s)
- Jesse G. Meyer
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| |
|
25
|
Salz R, Bouwmeester R, Gabriels R, Degroeve S, Martens L, Volders PJ, 't Hoen PAC. Personalized Proteome: Comparing Proteogenomics and Open Variant Search Approaches for Single Amino Acid Variant Detection. J Proteome Res 2021; 20:3353-3364. [PMID: 33998808 PMCID: PMC8280751 DOI: 10.1021/acs.jproteome.1c00264] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/02/2021] [Indexed: 12/30/2022]
Abstract
Discovery of variant peptides such as a single amino acid variant (SAAV) in shotgun proteomics data is essential for personalized proteomics. Both the resolution of shotgun proteomics methods and the search engines have improved dramatically, allowing for confident identification of SAAV peptides. However, it is not yet known if these methods are truly successful in accurately identifying SAAV peptides without prior genomic information in the search database. We studied this in unprecedented detail by exploiting publicly available long-read RNA sequences and shotgun proteomics data from the gold standard reference cell line NA12878. Searching spectra from this cell line with the state-of-the-art open modification search engine ionbot against carefully curated search databases resulted in 96.7% false-positive SAAVs and an 85% lower true positive rate than searching with peptide search databases that incorporate prior genetic information. While adding genetic variants to the search database remains indispensable for correct peptide identification, inclusion of long-read RNA sequences in the search database contributes only 0.3% new peptide identifications. These findings reveal the differences in SAAV detection that result from various approaches, providing guidance to researchers studying SAAV peptides and developers of peptide spectrum identification tools.
Affiliation(s)
- Renee Salz
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Sven Degroeve
- VIB-UGent Center for Medical Biotechnology VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Pieter-Jan Volders
- VIB-UGent Center for Medical Biotechnology VIB, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
| | - Peter A C 't Hoen
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands
| |
|
26
|
Bandeira N, Deutsch EW, Kohlbacher O, Martens L, Vizcaíno JA. Data Management of Sensitive Human Proteomics Data: Current Practices, Recommendations, and Perspectives for the Future. Mol Cell Proteomics 2021; 20:100071. [PMID: 33711481 PMCID: PMC8056256 DOI: 10.1016/j.mcpro.2021.100071] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 12/21/2020] [Revised: 03/01/2021] [Accepted: 03/02/2021] [Indexed: 12/12/2022] Open
Abstract
Today it is the norm that all relevant proteomics data that support the conclusions in scientific publications are made available in public proteomics data repositories. However, given the increase in the number of clinical proteomics studies, an important emerging topic is the management and dissemination of clinical, and thus potentially sensitive, human proteomics data. Both in the United States and in the European Union, there are legal frameworks protecting the privacy of individuals. Implementing privacy standards for publicly released research data in genomics and transcriptomics has led to processes to control who may access the data, so-called "controlled access" data. In parallel with the technological developments in the field, it is clear that the privacy risks of sharing proteomics data need to be properly assessed and managed. In our view, the proteomics community must be proactive in addressing these issues. Yet a careful balance must be kept. On the one hand, neglecting to address the potential of identifiability in human proteomics data could lead to reputational damage of the field, while on the other hand, erecting barriers to open access to clinical proteomics data will inevitably reduce reuse of proteomics data and could substantially delay critical discoveries in biomedical research. In order to balance these apparently conflicting requirements for data privacy and efficient use and reuse of research efforts through the sharing of clinical proteomics data, development efforts will be needed at different levels including bioinformatics infrastructure, policymaking, and mechanisms of oversight.
Affiliation(s)
- Nuno Bandeira
- Center for Computational Mass Spectrometry, University of California, San Diego (UCSD), La Jolla, California, USA; Department Computer Science and Engineering, University of California, San Diego (UCSD), La Jolla, California, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego (UCSD), La Jolla, California, USA
- Oliver Kohlbacher
- Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany; Quantitative Biology Center, University of Tübingen, Tübingen, Germany; Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, Germany; Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
27
Wen B, Zhang B. Computational Proteomics: Focus on Deep Learning. Proteomics 2020; 20:e2000258. [PMID: 33210458 DOI: 10.1002/pmic.202000258] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/14/2020] [Revised: 10/14/2020] [Indexed: 11/09/2022]
Affiliation(s)
- Bo Wen
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
28
Wen B, Zeng W, Liao Y, Shi Z, Savage SR, Jiang W, Zhang B. Deep Learning in Proteomics. Proteomics 2020; 20:e1900335. [PMID: 32939979 PMCID: PMC7757195 DOI: 10.1002/pmic.201900335] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 05/27/2020] [Revised: 09/14/2020] [Indexed: 12/17/2022]
Abstract
Proteomics, the study of all the proteins in biological systems, is becoming a data-rich science. Protein sequences and structures are comprehensively catalogued in online databases. With recent advancements in tandem mass spectrometry (MS) technology, protein expression and post-translational modifications (PTMs) can be studied in a variety of biological systems at the global scale. Sophisticated computational algorithms are needed to translate the vast amount of data into novel biological insights. Deep learning automatically extracts data representations at high levels of abstraction from data, and it thrives in data-rich scientific research domains. Here, a comprehensive overview of deep learning applications in proteomics, including retention time prediction, MS/MS spectrum prediction, de novo peptide sequencing, PTM prediction, major histocompatibility complex-peptide binding prediction, and protein structure prediction, is provided. Limitations and the future directions of deep learning in proteomics are also discussed. This review will provide readers an overview of deep learning and how it can be used to analyze proteomics data.
Affiliation(s)
- Bo Wen
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen‐Feng Zeng
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Chinese Academy of Sciences, Institute of Computing Technology, Beijing 100190, China
- Yuxing Liao
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Zhiao Shi
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sara R. Savage
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wen Jiang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA