151
Elguoshy A, Hirao Y, Xu B, Saito S, Quadery AF, Yamamoto K, Mitsui T, Yamamoto T. Identification and Validation of Human Missing Proteins and Peptides in Public Proteome Databases: Data Mining Strategy. J Proteome Res 2017; 16:4403-4414. [PMID: 28980472 DOI: 10.1021/acs.jproteome.7b00423] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/18/2022]
Abstract
In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the investigation of missing proteins (MPs) in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of the RNA expression pattern for missing proteins in the Human Protein Atlas showed that 28% of them, such as Olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity to consider uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had an elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in the tissues where they are elevated. Additionally, the analysis of RNA expression level for missing proteins showed that 95% had no or low expression (0-10 transcripts per million), indicating that low abundance is one of the major obstacles facing the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than the identified proteins. Searching for the predicted unique tryptic peptides that correspond to missing and uncertain proteins in the experimental peptide lists of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra of the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.
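The prediction of unique tryptic peptides mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, and the simple cleavage rule (cut after K or R unless the next residue is P, no missed cleavages) are assumptions made for illustration only.

```python
from collections import defaultdict


def tryptic_peptides(sequence, min_len=9):
    """Return fully tryptic peptides (no missed cleavages) of at least min_len residues.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    peptides, start = set(), 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        cleave = residue in "KR" and not at_end and sequence[i + 1] != "P"
        if cleave or at_end:
            peptide = sequence[start:i + 1]
            if len(peptide) >= min_len:
                peptides.add(peptide)
            start = i + 1
    return peptides


def unique_peptides(proteins, min_len=9):
    """Map each accession to the peptides that occur in no other protein."""
    owners = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, min_len):
            owners[peptide].add(accession)
    unique = defaultdict(set)
    for peptide, accessions in owners.items():
        if len(accessions) == 1:
            unique[next(iter(accessions))].add(peptide)
    return dict(unique)


# Hypothetical toy sequences, not real proteins.
toy = {
    "P1": "A" * 9 + "K" + "C" * 9 + "R",
    "P2": "A" * 9 + "K" + "D" * 9 + "R",
}
print(unique_peptides(toy))
```

In this toy input the first peptide is shared by both entries and is therefore discarded, leaving one unique peptide per protein. A real analysis would also need to handle missed cleavages and the isobaric residues leucine and isoleucine, which this sketch ignores.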
Affiliation(s)
- Amr Elguoshy
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
  - Graduate School of Science and Technology, Niigata University, Niigata 950-2181, Japan
  - Biotechnology Department - Faculty of Agriculture, Al-azhar University, Cairo 11651, Egypt
- Yoshitoshi Hirao
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
- Bo Xu
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
- Suguru Saito
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
- Ali F Quadery
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
- Keiko Yamamoto
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
- Toshiaki Mitsui
  - Graduate School of Science and Technology, Niigata University, Niigata 950-2181, Japan
- Tadashi Yamamoto
  - Biofluid and Biomarker Center, Niigata University, Niigata 950-2181, Japan
152
Dimitrakopoulos L, Prassas I, Diamandis EP, Charames GS. Onco-proteogenomics: Multi-omics level data integration for accurate phenotype prediction. Crit Rev Clin Lab Sci 2017; 54:414-432. [DOI: 10.1080/10408363.2017.1384446] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Affiliation(s)
- Lampros Dimitrakopoulos
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Ioannis Prassas
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Eleftherios P. Diamandis
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
  - Department of Clinical Biochemistry, University Health Network, Toronto, ON, Canada
- George S. Charames
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
  - Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
153
Next-Generation Proteomics and Its Application to Clinical Breast Cancer Research. THE AMERICAN JOURNAL OF PATHOLOGY 2017; 187:2175-2184. [DOI: 10.1016/j.ajpath.2017.07.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/07/2017] [Revised: 07/05/2017] [Accepted: 07/06/2017] [Indexed: 12/17/2022]
154
Rosenberger G, Bludau I, Schmitt U, Heusel M, Hunter CL, Liu Y, MacCoss MJ, MacLean BX, Nesvizhskii AI, Pedrioli PGA, Reiter L, Röst HL, Tate S, Ting YS, Collins BC, Aebersold R. Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses. Nat Methods 2017; 14:921-927. [PMID: 28825704 PMCID: PMC5581544 DOI: 10.1038/nmeth.4398] [Citation(s) in RCA: 145] [Impact Index Per Article: 20.7] [Received: 09/12/2016] [Accepted: 07/07/2017] [Indexed: 12/18/2022]
Abstract
Liquid chromatography coupled to tandem mass spectrometry is the main method for high-throughput identification and quantification of peptides and inferred proteins. Within this field, data-independent acquisition (DIA) combined with peptide-centric scoring, exemplified by SWATH-MS, emerged as a scalable method to achieve deep and consistent proteome coverage across large-scale datasets. Here we discuss the adaptation of statistical concepts developed for discovery proteomics based on spectrum-centric scoring to large-scale DIA experiments analyzed with peptide-centric scoring strategies and provide guidance on their application. We show that optimal tradeoffs between sensitivity and specificity require careful considerations of the relationship between proteins in the samples and proteins represented in the spectral library. We propose the application of a global analyte constraint to prevent accumulation of false positives across large-scale datasets. Furthermore, to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported for detected peptide queries, peptides and inferred proteins.
Affiliation(s)
- George Rosenberger
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
- Isabell Bludau
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD Program in Systems Biology, University of Zurich and ETH Zurich, Zurich, Switzerland
- Uwe Schmitt
  - ID Scientific IT Services, ETH Zurich, Zurich, Switzerland
- Moritz Heusel
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - PhD program in Molecular and Translational Biomedicine, Competence Center Personalized Medicine (CC-PM), ETH Zurich and University of Zurich, Zurich, Switzerland
- Yansheng Liu
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Michael J MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Brendan X MacLean
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Alexey I Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
- Patrick G A Pedrioli
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Hannes L Röst
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ying S Ting
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Ben C Collins
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
155
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. MASS SPECTROMETRY REVIEWS 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
  - Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Kshitij Khatri
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
- Joseph Zaia
  - Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
156
Barbieri R, Guryev V, Brandsma CA, Suits F, Bischoff R, Horvatovich P. Proteogenomics: Key Driver for Clinical Discovery and Personalized Medicine. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2017; 926:21-47. [PMID: 27686804 DOI: 10.1007/978-3-319-42316-6_3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 10/24/2022]
Abstract
Proteogenomics is a multi-omics research field that aims to efficiently integrate genomics, transcriptomics and proteomics. With this approach it is possible to identify new patient-specific proteoforms that may have implications in disease development, specifically in cancer. Understanding the impact of the large number of mutations detected at the genomics level is needed to assess their effects at the proteome level. Proteogenomics data integration would help in identifying molecular changes that are persistent across multiple molecular layers and, compared to mainstream proteomics approaches, enable better interpretation of molecular mechanisms of disease, such as the causal relationship between single nucleotide polymorphisms (SNPs) and the expression of transcripts and translation of proteins. Identifying patient-specific protein forms and getting a better picture of molecular mechanisms of disease open the avenue for precision and personalized medicine. Proteogenomics is, however, a challenging interdisciplinary science that requires an understanding of sample preparation, data acquisition and processing for genomics, transcriptomics and proteomics. This chapter aims to guide the reader through the technology and bioinformatics aspects of these multi-omics approaches, illustrated with proteogenomics applications having clinical or biological relevance.
Affiliation(s)
- Ruggero Barbieri
  - Department of Gastroenterology and Hepatology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
- Corry-Anke Brandsma
  - Department of Pathology & Medical Biology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Frank Suits
  - IBM T.J. Watson Research Centre, 1101 Kitchawan Road, Yorktown Heights, New York, 10598, NY, USA
- Rainer Bischoff
  - Department of Analytical Biochemistry, Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
- Peter Horvatovich
  - Department of Analytical Biochemistry, Research Institute of Pharmacy, University of Groningen, Antonius Deusinglaan 1, 9713 AV, Groningen, The Netherlands
157
Kroll JE, da Silva VL, de Souza SJ, de Souza GA. A tool for integrating genetic and mass spectrometry-based peptide data: Proteogenomics Viewer. Bioessays 2017; 39. [DOI: 10.1002/bies.201700015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/29/2022]
Affiliation(s)
- José Eduardo Kroll
  - Institute of Bioinformatics and Biotechnology; Natal − RN Brazil
  - Brain Institute; Universidade Federal do Rio Grande do Norte; Natal − RN Brazil
  - Bioinformatics Multidisciplinary Environment; Instituto Metrópole Digital; UFRN, Natal-RN Brazil
- Vandeclécio Lira da Silva
  - Brain Institute; Universidade Federal do Rio Grande do Norte; Natal − RN Brazil
  - Bioinformatics Multidisciplinary Environment; Instituto Metrópole Digital; UFRN, Natal-RN Brazil
- Sandro José de Souza
  - Brain Institute; Universidade Federal do Rio Grande do Norte; Natal − RN Brazil
  - Bioinformatics Multidisciplinary Environment; Instituto Metrópole Digital; UFRN, Natal-RN Brazil
- Gustavo Antonio de Souza
  - Brain Institute; Universidade Federal do Rio Grande do Norte; Natal − RN Brazil
  - Bioinformatics Multidisciplinary Environment; Instituto Metrópole Digital; UFRN, Natal-RN Brazil
  - Department of Immunology and Centre for Immune Regulation, Oslo University Hospital HF Rikshospitalet; University of Oslo; Oslo Norway
158
Zhang SR, Shan YC, Jiang H, Liu JH, Zhou Y, Zhang LH, Zhang YK. The Null-Test for peptide identification algorithm in Shotgun proteomics. J Proteomics 2017; 163:118-125. [DOI: 10.1016/j.jprot.2017.05.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/28/2016] [Revised: 05/09/2017] [Accepted: 05/11/2017] [Indexed: 12/24/2022]
159
Abstract
The interrogation of cell surface-presented immunogenic epitopes is of great importance to differentiate diseased cells in consequence to malignant transformation or viral infections. On the basis of this knowledge, next-generation immunotherapies against cancers, autoimmunity, or infectious diseases can be developed. The identification of altered peptide repertoires of transformed cells renders mass spectrometry-based analysis indispensable. This is evident considering the low correlation of gene or protein expression alterations, respectively, with changes in the peptide repertoire rendering those analyses less informative. Nevertheless, immunogenicity of peptides appearing to be exclusively found on diseased cells has to be finally proven in T cell-based assays. This review highlights the capabilities and limitations of mass spectrometry in the identification of entire immunopeptidomes, as well as individual potential immunogenic epitopes with a strong focus on cancer. Furthermore, an overview of state-of-the-art immunogenicity screens is presented.
160
Blue LE, Franklin EG, Godinho JM, Grinias JP, Grinias KM, Lunn DB, Moore SM. Recent advances in capillary ultrahigh pressure liquid chromatography. J Chromatogr A 2017; 1523:17-39. [PMID: 28599863 DOI: 10.1016/j.chroma.2017.05.039] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 03/15/2017] [Revised: 05/12/2017] [Accepted: 05/15/2017] [Indexed: 11/28/2022]
Abstract
In the twenty years since its initial demonstration, capillary ultrahigh pressure liquid chromatography (UHPLC) has proven to be one of the most powerful separation techniques for the analysis of complex mixtures. This review focuses on the most recent advances made since 2010 towards increasing the performance of such separations. Improvements in capillary column preparation techniques that have led to columns with unprecedented performance are described. New stationary phases and phase supports that have been reported over the past decade are detailed, with a focus on their use in capillary formats. A discussion of the instrument developments that have been required to ensure that extra-column effects do not diminish the intrinsic efficiency of these columns during analysis is also included. Finally, the impact of these capillary UHPLC topics on the field of proteomics and ways in which capillary UHPLC may continue to be applied to the separation of complex samples are addressed.
Affiliation(s)
- Laura E Blue
  - Process Development, Amgen Inc., Thousand Oaks, CA 91320, USA
- Edward G Franklin
  - HPLC Research & Development, Restek Corp., Bellefonte, PA 16823, USA
- Justin M Godinho
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- James P Grinias
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, NJ 08028, USA
- Kaitlin M Grinias
  - Department of Product Development & Supply, GlaxoSmithKline, King of Prussia, PA 19406, USA
- Daniel B Lunn
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
161
Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat Methods 2017; 14:513-520. [PMID: 28394336 PMCID: PMC5409104 DOI: 10.1038/nmeth.4256] [Citation(s) in RCA: 905] [Impact Index Per Article: 129.3] [Received: 09/30/2016] [Accepted: 03/06/2017] [Indexed: 12/22/2022]
Abstract
There is a need to better understand and handle the 'dark matter' of proteomics: the vast diversity of post-translational and chemical modifications that are unaccounted for in a typical mass spectrometry-based analysis and thus remain unidentified. We present a fragment-ion indexing method, and its implementation in the peptide identification tool MSFragger, that enables a more than 100-fold improvement in speed over most existing proteome database search tools. Using several large proteomic data sets, we demonstrate how MSFragger empowers the open database search concept for comprehensive identification of peptides and all their modified forms, uncovering dramatic differences in modification rates across experimental samples and conditions. We further illustrate its utility using protein-RNA cross-linked peptide data and using affinity purification experiments where we observe, on average, a 300% increase in the number of identified spectra for enriched proteins. We also discuss the benefits of open searching for improved false discovery rate estimation in proteomics.
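The general idea behind a fragment-ion index can be sketched as an inverted index from binned fragment m/z values to peptide identifiers, so that each observed peak looks up its candidate peptides directly instead of every peptide being scored against every spectrum. The sketch below is a simplified illustration and not MSFragger's implementation: the function names, the bin width, the restriction to singly charged b and y ions, and the small residue-mass table are all assumptions.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for the few amino acids used in this toy example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mzs(peptide):
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    mzs = []
    for i in range(1, len(peptide)):
        mzs.append(sum(masses[:i]) + PROTON)          # b ion (N-terminal prefix)
        mzs.append(sum(masses[i:]) + WATER + PROTON)  # y ion (C-terminal suffix)
    return mzs


def build_index(peptides, bin_width=0.02):
    """Inverted index: fragment m/z bin -> set of peptide ids with a fragment there."""
    index = defaultdict(set)
    for pid, peptide in enumerate(peptides):
        for mz in fragment_mzs(peptide):
            index[int(mz / bin_width)].add(pid)
    return index


def count_matches(index, peaks, n_peptides, bin_width=0.02):
    """Count, per peptide, how many observed peaks hit one of its indexed fragments."""
    counts = [0] * n_peptides
    for mz in peaks:
        b = int(mz / bin_width)
        hits = set()
        # Neighbouring bins are checked so a fragment near a bin edge is not missed.
        for neighbour in (b - 1, b, b + 1):
            hits |= index.get(neighbour, set())
        for pid in hits:
            counts[pid] += 1
    return counts


peptides = ["GASPK", "VLTEK"]           # hypothetical toy peptides
index = build_index(peptides)
spectrum = fragment_mzs("GASPK")        # idealized, noise-free spectrum
print(count_matches(index, spectrum, len(peptides)))
```

Because the lookup is driven by the observed peaks, no precursor-mass filter is needed before counting matches, which is the property that makes wide-tolerance (open) searches tractable in this kind of design.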
Affiliation(s)
- Andy T. Kong
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
- Alexey I. Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
162
May DH, Tamura K, Noble WS. Param-Medic: A Tool for Improving MS/MS Database Search Yield by Optimizing Parameter Settings. J Proteome Res 2017; 16:1817-1824. [PMID: 28263070 PMCID: PMC5738039 DOI: 10.1021/acs.jproteome.7b00028] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 01/07/2023]
Abstract
In shotgun proteomics analysis, user-specified parameters are critical to database search performance and therefore to the yield of confident peptide-spectrum matches (PSMs). Two of the most important parameters are related to the accuracy of the mass spectrometer. Precursor mass tolerance defines the peptide candidates considered for each spectrum. Fragment mass tolerance or bin size determines how close observed and theoretical fragments must be to be considered a match. For either of these two parameters, too wide a setting yields randomly high-scoring false PSMs, whereas too narrow a setting erroneously excludes true PSMs, in both cases, lowering the yield of peptides detected at a given false discovery rate. We describe a strategy for inferring optimal search parameters by assembling and analyzing pairs of spectra that are likely to have been generated by the same peptide ion to infer precursor and fragment mass error. This strategy does not rely on a database search, making it usable in a wide variety of settings. In our experiments on data from a variety of instruments including Orbitrap and Q-TOF acquisitions, this strategy yields more high-confidence PSMs than using settings based on instrument defaults or determined by experts. Param-Medic is open-source and cross-platform. It is available as a standalone tool ( http://noble.gs.washington.edu/proj/param-medic/ ) and has been integrated into the Crux proteomics toolkit ( http://crux.ms ), providing automatic parameter selection for the Comet and Tide search engines.
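The role of precursor mass tolerance described in the abstract can be made concrete with a short sketch. It is not Param-Medic's algorithm: the function names, the `n_sigma` multiplier, and the plain standard-deviation estimate from paired measurements are assumptions chosen to keep the example small.

```python
import math
import statistics


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within_tolerance(precursor_mass, candidate_masses, tol_ppm):
    """Keep the candidate peptide masses that lie within tol_ppm of the precursor."""
    return [m for m in candidate_masses
            if abs(ppm_error(precursor_mass, m)) <= tol_ppm]


def estimate_tolerance_ppm(paired_mzs, n_sigma=4.0):
    """Rough tolerance estimate from pairs of precursor m/z values assumed to
    come from the same peptide ion (hypothetical simplification).

    The difference of two independent measurements has twice the variance of
    a single one, hence the division by sqrt(2).
    """
    diffs = [ppm_error(b, a) for a, b in paired_mzs]
    sigma_single = statistics.pstdev(diffs) / math.sqrt(2)
    return n_sigma * sigma_single


# Toy values: a wider tolerance admits more candidates per spectrum.
masses = [999.99, 1000.005, 1000.05]
print(candidates_within_tolerance(1000.0, masses, tol_ppm=20.0))
print(candidates_within_tolerance(1000.0, masses, tol_ppm=100.0))
```

The two calls show the trade-off the abstract describes: a narrow window risks excluding the true peptide, while a wide one adds candidates that can only produce false matches.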
Affiliation(s)
- Damon H May
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Kaipo Tamura
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- William S Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
163
Titeca K, Van Quickelberghe E, Samyn N, De Sutter D, Verhee A, Gevaert K, Tavernier J, Eyckerman S. Analyzing trapped protein complexes by Virotrap and SFINX. Nat Protoc 2017; 12:881-898. [PMID: 28358392 DOI: 10.1038/nprot.2017.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/14/2022]
Abstract
The analysis of protein interaction networks is one of the key challenges in the study of biology. It connects genotypes to phenotypes, and disruption of such networks is associated with many pathologies. Virtually all the approaches to the study of protein complexes require cell lysis, a dramatic step that obliterates cellular integrity and profoundly affects protein interactions. This protocol starts with Virotrap, a novel approach that avoids the need for cell homogenization by fusing the protein of interest to the HIV-1 Gag protein, trapping protein complexes in virus-like particles. By using the straightforward filtering index (SFINX), which is a powerful and intuitive online tool (http://sfinx.ugent.be) that enables contaminant removal from candidate lists resulting from mass-spectrometry-based analysis, we provide a complete workflow for researchers interested in mammalian protein complexes. Given direct access to mass spectrometers, researchers can process up to 24 samples in 7 d.
Affiliation(s)
- Kevin Titeca
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Emmy Van Quickelberghe
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Noortje Samyn
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Delphine De Sutter
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Annick Verhee
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Kris Gevaert
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Jan Tavernier
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
- Sven Eyckerman
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
164
Luck K, Sheynkman GM, Zhang I, Vidal M. Proteome-Scale Human Interactomics. Trends Biochem Sci 2017; 42:342-354. [PMID: 28284537 DOI: 10.1016/j.tibs.2017.02.006] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Received: 08/26/2016] [Revised: 02/10/2017] [Accepted: 02/16/2017] [Indexed: 01/28/2023]
Abstract
Cellular functions are mediated by complex interactome networks of physical, biochemical, and functional interactions between DNA sequences, RNA molecules, proteins, lipids, and small metabolites. A thorough understanding of cellular organization requires accurate and relatively complete models of interactome networks at proteome scale. The recent publication of four human protein-protein interaction (PPI) maps represents a technological breakthrough and an unprecedented resource for the scientific community, heralding a new era of proteome-scale human interactomics. Our knowledge gained from these and complementary studies provides fresh insights into the opportunities and challenges when analyzing systematically generated interactome data, defines a clear roadmap towards the generation of a first reference interactome, and reveals new perspectives on the organization of cellular life.
Affiliation(s)
- Katja Luck
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Gloria M Sheynkman
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Ivy Zhang
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Marc Vidal
  - Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA 02215, USA
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
165
Fu S, Liu X, Luo M, Xie K, Nice EC, Zhang H, Huang C. Proteogenomic studies on cancer drug resistance: towards biomarker discovery and target identification. Expert Rev Proteomics 2017; 14:351-362. [PMID: 28276747 DOI: 10.1080/14789450.2017.1299006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 02/05/2023]
Abstract
Introduction: Chemoresistance is a major obstacle for current cancer treatment. Proteogenomics is a powerful multi-omics research field that uses customized protein sequence databases generated from genomic and transcriptomic information to identify novel genes (e.g. noncoding, mutation and fusion genes) from mass spectrometry-based proteomic data. By identifying aberrations that are differentially expressed between tumor and normal pairs, this approach can also be applied to validate protein variants in cancer, which may reveal the response to drug treatment. Areas covered: In this review, we present recent advances in proteogenomic investigations of cancer drug resistance, with an emphasis on integrative proteogenomic pipelines and on biomarker discovery, which contributes to achieving the goal of using precision/personalized medicine for cancer treatment. Expert commentary: The discovery and comprehensive understanding of potential biomarkers help identify the cohort of patients who may benefit from particular treatments, and will assist real-time clinical decision-making to maximize therapeutic efficacy and minimize adverse effects. With the development of MS-based proteomics and NGS-based sequencing, a growing number of proteogenomic tools are being developed specifically to investigate cancer drug resistance.
Affiliation(s)
- Shuyue Fu
- a State Key Laboratory of Biotherapy and Cancer Center , West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy , Chengdu , P.R. China
| | - Xiang Liu
- b Department of Pathology , Sichuan Academy of Medical Sciences, Sichuan Provincial People's Hospital , Chengdu , P.R. China
| | - Maochao Luo
- c West China School of Public Health, Sichuan University , Chengdu , P.R.China
| | - Ke Xie
- d Department of Oncology , Sichuan Academy of Medical Sciences, Sichuan Provincial People's Hospital , Chengdu , P.R. China
| | - Edouard C Nice
- e Department of Biochemistry and Molecular Biology , Monash University , Clayton , Australia
| | - Haiyuan Zhang
- f School of Medicine , Yangtze University , P. R. China
| | - Canhua Huang
- a State Key Laboratory of Biotherapy and Cancer Center , West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy , Chengdu , P.R. China
|
166
|
Audain E, Uszkoreit J, Sachsenberg T, Pfeuffer J, Liang X, Hermjakob H, Sanchez A, Eisenacher M, Reinert K, Tabb DL, Kohlbacher O, Perez-Riverol Y. In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics. J Proteomics 2017; 150:170-182. [DOI: 10.1016/j.jprot.2016.08.002] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 05/20/2016] [Revised: 07/30/2016] [Accepted: 08/02/2016] [Indexed: 12/24/2022]
|
167
|
Gómez-Gómez L, Parra-Vega V, Rivas-Sendra A, Seguí-Simarro JM, Molina RV, Pallotti C, Rubio-Moraga Á, Diretto G, Prieto A, Ahrazem O. Unraveling Massive Crocins Transport and Accumulation through Proteome and Microscopy Tools during the Development of Saffron Stigma. Int J Mol Sci 2017; 18:E76. [PMID: 28045431 PMCID: PMC5297711 DOI: 10.3390/ijms18010076] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 11/22/2016] [Revised: 12/23/2016] [Accepted: 12/24/2016] [Indexed: 11/18/2022]
Abstract
Crocins, the glucosides of crocetin, are present at high concentrations in saffron stigmas and accumulate in the vacuole. However, the biogenesis of the saffron chromoplast, the changes during the development of the stigma and the transport of crocins to the vacuole, are processes that remain poorly understood. We studied the process of chromoplast differentiation in saffron throughout stigma development by means of transmission electron microscopy. Our results provided an overview of a massive transport of crocins to the vacuole in the later developmental stages, when electron dense drops of a much greater size than plastoglobules (here defined "crocinoplast") were observed in the chromoplast, connected to the vacuole with a subsequent transfer of these large globules inside the vacuole. A proteome analysis of chromoplasts from saffron stigma allowed the identification of several well-known plastid proteins and new candidates involved in crocetin metabolism. Furthermore, expressions throughout five developmental stages of candidate genes responsible for carotenoid and apocarotenoid biogenesis, crocins transport to the vacuole and starch metabolism were analyzed. Correlation matrices and networks were exploited to identify a series of transcripts highly associated to crocetin (such as 1-Deoxy-d-xylulose 5-phosphate synthase (DXS), 1-Deoxy-d-xylulose 5-phosphate reductoisomerase (DXR), carotenoid isomerase (CRTISO), Crocetin glucosyltransferase 2 (UGT2), etc.) and crocin (e.g., ζ-carotene desaturase (ZDS) and plastid-lipid-associated proteins (PLAP2)) accumulation; in addition, candidate aldehyde dehydrogenase (ADH) genes were highlighted.
Affiliation(s)
- Lourdes Gómez-Gómez
- Botanical Institute, Department of Science Technology, Agroforestry and Genetics, Faculty of Pharmacy, University of Castilla-La Mancha, Campus Universitario s/n, 02071 Albacete, Spain.
| | - Verónica Parra-Vega
- Cell Biology Group, COMAV Institute, Polytechnic University of Valencia, 46071 Valencia, Spain.
| | - Alba Rivas-Sendra
- Cell Biology Group, COMAV Institute, Polytechnic University of Valencia, 46071 Valencia, Spain.
| | - Jose M Seguí-Simarro
- Cell Biology Group, COMAV Institute, Polytechnic University of Valencia, 46071 Valencia, Spain.
| | - Rosa Victoria Molina
- Department of Vegetal Biology, Polytechnic University of Valencia, 46071 Valencia, Spain.
| | - Claudia Pallotti
- Department of Vegetal Biology, Polytechnic University of Valencia, 46071 Valencia, Spain.
| | - Ángela Rubio-Moraga
- Botanical Institute, Department of Science Technology, Agroforestry and Genetics, Faculty of Pharmacy, University of Castilla-La Mancha, Campus Universitario s/n, 02071 Albacete, Spain.
| | - Gianfranco Diretto
- Italian National Agency for New Technologies, Energy, and Sustainable Development, Casaccia Research Centre, 00123 Rome, Italy.
| | - Alicia Prieto
- The Biological Research Center (CIB) Spanish National Research Council (CSIC), C/Ramiro de Maeztu 9, 28040 Madrid, Spain.
| | - Oussama Ahrazem
- Botanical Institute, Department of Science Technology, Agroforestry and Genetics, Faculty of Pharmacy, University of Castilla-La Mancha, Campus Universitario s/n, 02071 Albacete, Spain.
- Faculty of Environmental Sciences and Biochemistry Toledo, University of Castilla-La Mancha, Campus Tecnológico de la Fábrica de Armas, Avda, Carlos III, s/n, 45071 Toledo, Spain.
|
168
|
Crowgey EL, Matlock A, Fort-Bober J, Van Eyk JE. Mapping Biological Networks from Quantitative Data-Independent Acquisition Mass Spectrometry: Data to Knowledge Pipelines. Methods Mol Biol 2017; 1558:395-413. [PMID: 28150249 PMCID: PMC6844627 DOI: 10.1007/978-1-4939-6783-4_19] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/21/2023]
Abstract
Data-independent acquisition mass spectrometry (DIA-MS) strategies and applications provide unique advantages for qualitative and quantitative proteome probing of a biological sample allowing constant sensitivity and reproducibility across large sample sets. These advantages in LC-MS/MS are being realized in fundamental research laboratories and for clinical research applications. However, the ability to translate high-throughput raw LC-MS/MS proteomic data into biological knowledge is a complex and difficult task requiring the use of many algorithms and tools for which there is no widely accepted standard and best practices are slowly being implemented. Today a single tool or approach inherently fails to capture the full interpretation that proteomics uniquely supplies, including the dynamics of quickly reversible chemically modified states of proteins, irreversible amino acid modifications, signaling truncation events, and, finally, determining the presence of protein from allele-specific transcripts. This chapter highlights key steps and publicly available algorithms required to translate DIA-MS data into knowledge.
Affiliation(s)
- Andrea Matlock
- Advanced Clinical BioSystems Research Institute, Cedars Sinai Medical Center, Heart Institute, 127 S. San Vicente Blvd, Los Angeles, CA 90048
| | - Jennifer E Van Eyk
- Advanced Clinical BioSystems Research Institute, Cedars Sinai Medical Center, Heart Institute, 127 S. San Vicente Blvd, Los Angeles, CA 90048, USA
|
169
|
Li H, Joh YS, Kim H, Paek E, Lee SW, Hwang KB. Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification. BMC Genomics 2016; 17:1031. [PMID: 28155652 PMCID: PMC5259817 DOI: 10.1186/s12864-016-3327-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]
Abstract
Background: Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such inflation of databases has the potential to identify novel peptides. However, it also raises concerns about sensitive and reliable peptide identification. Spurious peptides included in target databases may result in underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. Results: To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring-metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability in proteogenomic search. However, no one method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. Conclusions: We propose to use a set of search result validation methods with separate filtering, for sensitive and reliable identification of peptides in proteogenomic search.
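The contrast between combined and separate filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `PSM` record, the score convention (higher is better), and the simple decoys/targets FDR estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (not from the paper)."""
    score: float    # search-engine score; higher is assumed to be better
    is_decoy: bool  # True if the match came from the decoy database
    is_novel: bool  # True if the peptide is absent from the reference proteome


def score_threshold(psms, max_fdr):
    """Lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr."""
    targets = decoys = 0
    best = None
    for psm in sorted(psms, key=lambda p: p.score, reverse=True):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = psm.score
    return best


def accepted_targets(psms, max_fdr):
    """Target PSMs passing a single FDR filter applied to the given list."""
    cutoff = score_threshold(psms, max_fdr)
    if cutoff is None:
        return []
    return [p for p in psms if not p.is_decoy and p.score >= cutoff]


def separate_filtering(psms, max_fdr):
    """Filter known and novel peptides as two independent classes."""
    known = [p for p in psms if not p.is_novel]
    novel = [p for p in psms if p.is_novel]
    return accepted_targets(known, max_fdr) + accepted_targets(novel, max_fdr)
```

With a combined filter, abundant high-scoring known peptides can pull the cutoff low enough to admit a novel peptide that would fail a filter computed on the novel class alone, which is the effect separate filtering is meant to guard against.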
Affiliation(s)
- Honglan Li
- School of Computer Science and Engineering, Soongsil University, Seoul, 06978, Republic of Korea
| | - Yoon Sung Joh
- Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
| | - Hyunwoo Kim
- Scientific Data Research Center, Korea Institute of Science and Technology Information, Daejeon, 34141, Republic of Korea
| | - Eunok Paek
- Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
| | - Sang-Won Lee
- Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul, 02841, Republic of Korea
| | - Kyu-Baek Hwang
- School of Computer Science and Engineering, Soongsil University, Seoul, 06978, Republic of Korea.
|
170
|
Willems S, Dhaenens M, Govaert E, De Clerck L, Meert P, Van Neste C, Van Nieuwerburgh F, Deforce D. Flagging False Positives Following Untargeted LC–MS Characterization of Histone Post-Translational Modification Combinations. J Proteome Res 2016; 16:655-664. [DOI: 10.1021/acs.jproteome.6b00724] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023]
Affiliation(s)
- Sander Willems
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Maarten Dhaenens
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Elisabeth Govaert
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Laura De Clerck
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Paulien Meert
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
| | - Christophe Van Neste
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, 9052, Belgium
- Center for Medical Genetics Ghent, Ghent University, Ghent, 9000, Belgium
| | - Dieter Deforce
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, 9000, Belgium
|
171
|
Lam MPY, Lau E, Ng DCM, Wang D, Ping P. Cardiovascular proteomics in the era of big data: experimental and computational advances. Clin Proteomics 2016; 13:23. [PMID: 27980500 PMCID: PMC5137214 DOI: 10.1186/s12014-016-9124-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/03/2016] [Accepted: 08/24/2016] [Indexed: 01/14/2023]
Abstract
Proteomics plays an increasingly important role in our quest to understand cardiovascular biology. Fueled by analytical and computational advances in the past decade, proteomics applications can now go beyond merely inventorying protein species, and address sophisticated questions on cardiac physiology. The advent of massive mass spectrometry datasets has in turn led to increasing intersection between proteomics and big data science. Here we review new frontiers in technological developments and their applications to cardiovascular medicine. The impact of big data science on cardiovascular proteomics investigations and translation to medicine is highlighted.
Affiliation(s)
- Maggie P Y Lam
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Edward Lau
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Dominic C M Ng
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Ding Wang
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Peipei Ping
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA ; Department of Medicine, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA ; Department of Bioinformatics, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
|
172
|
Critical decisions in metaproteomics: achieving high confidence protein annotations in a sea of unknowns. ISME J 2016; 11:309-314. [PMID: 27824341 PMCID: PMC5270573 DOI: 10.1038/ismej.2016.132] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Indexed: 01/23/2023]
|
173
|
Proteomics progresses in microbial physiology and clinical antimicrobial therapy. Eur J Clin Microbiol Infect Dis 2016; 36:403-413. [PMID: 27812806 PMCID: PMC5309286 DOI: 10.1007/s10096-016-2816-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/12/2016] [Accepted: 10/16/2016] [Indexed: 02/05/2023]
Abstract
Clinical microbial identification plays an important role in optimizing the management of infectious diseases and provides diagnostic and therapeutic support for clinical management. Microbial proteomic research is aimed at identifying proteins associated with microbial activity, which has facilitated the discovery of microbial physiology changes and host–pathogen interactions during bacterial infection and antimicrobial therapy. Here, we summarize proteomic-driven progresses of host–microbial pathogen interactions at multiple levels, mass spectrometry-based microbial proteome identification for clinical diagnosis, and antimicrobial therapy. Proteomic technique progresses pave new ways towards effective prevention and drug discovery for microbial-induced infectious diseases.
|
174
|
Abstract
In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allows a small number of PTM types in the database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and the PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has comparable sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
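The growth of the search space mentioned in this abstract can be made concrete with a small counting sketch in Python. It rests on a simplifying assumption that is not taken from the paper: every residue of a peptide can carry any one of the same number of PTM types. The function names are hypothetical.

```python
from math import comb


def ptm_pattern_count(peptide_length, ptm_types_per_residue, modified_residues):
    """Number of PTM patterns with exactly `modified_residues` modified sites.

    Assumption for illustration: each residue can carry any one of
    `ptm_types_per_residue` modification types.
    """
    sites = comb(peptide_length, modified_residues)          # which residues are modified
    choices = ptm_types_per_residue ** modified_residues     # which PTM sits on each of them
    return sites * choices


def total_pattern_count(peptide_length, ptm_types_per_residue, max_modified):
    """Patterns to enumerate when allowing 0..max_modified modified residues."""
    return sum(
        ptm_pattern_count(peptide_length, ptm_types_per_residue, k)
        for k in range(max_modified + 1)
    )
```

Under this assumption, a 10-residue peptide with 5 PTM types per residue has 50 patterns with one modified residue, 1,125 with two, and 15,000 with three, which shows why tools that allow only one modified amino acid per peptide face a far smaller search space.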
Affiliation(s)
- Fengchao Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology , Hong Kong, China
| | - Ning Li
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology , Hong Kong, China.,Division of Life Science, The Hong Kong University of Science and Technology , Hong Kong, China
| | - Weichuan Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology , Hong Kong, China.,Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology , Hong Kong, China
|
175
|
Keiblinger KM, Fuchs S, Zechmeister-Boltenstern S, Riedel K. Soil and leaf litter metaproteomics-a brief guideline from sampling to understanding. FEMS Microbiol Ecol 2016; 92:fiw180. [PMID: 27549116 PMCID: PMC5026301 DOI: 10.1093/femsec/fiw180] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Revised: 03/31/2016] [Accepted: 08/18/2016] [Indexed: 11/14/2022]
Abstract
The increasing application of soil metaproteomics is providing unprecedented, in-depth characterization of the composition and functionality of in situ microbial communities. Despite recent advances in high-resolution mass spectrometry, soil metaproteomics still suffers from a lack of effective and reproducible protein extraction protocols and standardized data analyses. This review discusses the opportunities and limitations of selected techniques in soil-, and leaf litter metaproteomics, and presents a step-by-step guideline on their application, covering sampling, sample preparation, extraction and data evaluation strategies. In addition, we present recent applications of soil metaproteomics and discuss how such approaches, linking phylogenetics and functionality, can help gain deeper insights into terrestrial microbial ecology. Finally, we strongly recommend that to maximize the insights environmental metaproteomics may provide, such methods should be employed within a holistic experimental approach considering relevant aboveground and belowground ecosystem parameters.
Affiliation(s)
- Katharina M Keiblinger
- Institute for Soil Research, Department of Forest and Soil Sciences, University of Natural Resources and Life Sciences Vienna (BOKU), Peter Jordan-Strasse 82, 1190 Vienna, Austria
| | - Stephan Fuchs
- Institute of Microbiology, University of Greifswald, Friedrich-Ludwig-Jahnstrasse 15, 17489 Greifswald, Germany
| | - Sophie Zechmeister-Boltenstern
- Institute for Soil Research, Department of Forest and Soil Sciences, University of Natural Resources and Life Sciences Vienna (BOKU), Peter Jordan-Strasse 82, 1190 Vienna, Austria
| | - Katharina Riedel
- Institute of Microbiology, University of Greifswald, Friedrich-Ludwig-Jahnstrasse 15, 17489 Greifswald, Germany
|
176
|
Caron E, Kowalewski DJ, Chiek Koh C, Sturm T, Schuster H, Aebersold R. Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry. Mol Cell Proteomics 2016; 14:3105-17. [PMID: 26628741 DOI: 10.1074/mcp.o115.052431] [Citation(s) in RCA: 164] [Impact Index Per Article: 20.5] [Indexed: 12/17/2022]
Abstract
The myriad of peptides presented at the cell surface by class I and class II major histocompatibility complex (MHC) molecules are referred to as the immunopeptidome and are of great importance for basic and translational science. For basic science, the immunopeptidome is a critical component for understanding the immune system; for translational science, exact knowledge of the immunopeptidome can directly fuel and guide the development of next-generation vaccines and immunotherapies against autoimmunity, infectious diseases, and cancers. In this mini-review, we summarize established isolation techniques as well as emerging mass spectrometry-based platforms (i.e. SWATH-MS) to identify and quantify MHC-associated peptides. We also highlight selected biological applications and discuss important current technical limitations that need to be solved to accelerate the development of this field.
Affiliation(s)
- Etienne Caron
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Daniel J Kowalewski
- Department of Immunology, Interfaculty Institute for Cell Biology, University of Tübingen, Tübingen, Germany
| | - Ching Chiek Koh
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Theo Sturm
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
| | - Heiko Schuster
- Department of Immunology, Interfaculty Institute for Cell Biology, University of Tübingen, Tübingen, Germany
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, Switzerland
|
177
|
Parker GJ, Leppert T, Anex DS, Hilmer JK, Matsunami N, Baird L, Stevens J, Parsawar K, Durbin-Johnson BP, Rocke DM, Nelson C, Fairbanks DJ, Wilson AS, Rice RH, Woodward SR, Bothner B, Hart BR, Leppert M. Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome. PLoS One 2016; 11:e0160653. [PMID: 27603779 PMCID: PMC5014411 DOI: 10.1371/journal.pone.0160653] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 09/24/2015] [Accepted: 07/21/2016] [Indexed: 12/28/2022]
Abstract
Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein however is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects' DNA and directly validated using Sanger sequencing. Estimates of the probability of resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European-American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). This study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts.
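The product rule named in this abstract multiplies per-locus genotype frequencies to estimate how common a whole profile is. A minimal Python sketch follows; it assumes the loci are statistically independent, and the function names and example frequencies are hypothetical rather than values from the study.

```python
from math import prod


def profile_probability(genotype_frequencies):
    """Product rule: probability of a multi-locus profile in a population.

    Assumes the loci are independent, so the per-locus genotype
    frequencies can simply be multiplied.
    """
    frequencies = list(genotype_frequencies)
    if not frequencies:
        raise ValueError("at least one locus is required")
    if any(not 0 < f <= 1 for f in frequencies):
        raise ValueError("each frequency must lie in (0, 1]")
    return prod(frequencies)


def power_of_discrimination(genotype_frequencies):
    """Express the profile probability as '1 in N' and return N."""
    return 1 / profile_probability(genotype_frequencies)
```

For example, hypothetical frequencies of 0.5, 0.2, 0.1 and 0.4 at four loci give a profile probability of about 0.004, that is, roughly 1 in 250. The figure of 1 in 12,500 reported in the abstract corresponds to a profile probability of 0.00008.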
Affiliation(s)
- Glendon J. Parker
- Department of Biology, Utah Valley University, Orem, Utah, United States of America
- Protein-Based Identification Technologies L.L.C., Orem, Utah, United States of America
- * E-mail: parker64@llnl;
| | - Tami Leppert
- Protein-Based Identification Technologies L.L.C., Orem, Utah, United States of America
- Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
| | - Deon S. Anex
- Forensic Science Center, Lawrence Livermore National Laboratory, Livermore, California, United States of America
| | - Jonathan K. Hilmer
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana, United States of America
| | - Nori Matsunami
- Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
| | - Lisa Baird
- Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
| | - Jeffery Stevens
- Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
| | - Krishna Parsawar
- Mass Spectrometry and Proteomics Core Facility, University of Utah, Salt Lake City, Utah, United States of America
| | - Blythe P. Durbin-Johnson
- Department of Public Health Sciences, University of California, Davis, California, United States of America
| | - David M. Rocke
- Department of Public Health Sciences, University of California, Davis, California, United States of America
| | - Chad Nelson
- Mass Spectrometry and Proteomics Core Facility, University of Utah, Salt Lake City, Utah, United States of America
| | - Daniel J. Fairbanks
- Department of Biology, Utah Valley University, Orem, Utah, United States of America
| | - Andrew S. Wilson
- School of Archaeological Sciences, University of Bradford, Bradford, United Kingdom
| | - Robert H. Rice
- Department of Environmental Toxicology, University of California, Davis, California, United States of America
| | - Scott R. Woodward
- Sorenson Molecular Genealogical Foundation, Salt Lake City, Utah, United States of America
| | - Brian Bothner
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana, United States of America
| | - Bradley R. Hart
- Forensic Science Center, Lawrence Livermore National Laboratory, Livermore, California, United States of America
| | - Mark Leppert
- Department of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
|
178
|
Ivanov MV, Levitsky LI, Gorshkov MV. Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows. J Am Soc Mass Spectrom 2016; 27:1579-1582. [PMID: 27349255 DOI: 10.1007/s13361-016-1436-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/17/2016] [Revised: 04/30/2016] [Accepted: 05/19/2016] [Indexed: 06/06/2023]
Abstract
A number of proteomic database search engines implement multi-stage strategies aiming at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if the target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, the multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
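The idea behind decoy fusion can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the method from the paper: it assumes decoys are made by reversing the target sequence and that each database entry is the target immediately followed by its decoy. All function names are hypothetical.

```python
def fuse_with_decoy(proteins):
    """Map each protein name to its sequence with a decoy appended.

    Assumption for illustration: the decoy is the reversed target sequence.
    """
    return {name: seq + seq[::-1] for name, seq in proteins.items()}


def second_stage_database(fused, selected_names):
    """Subset of the fused database used for a second-stage search."""
    return {name: fused[name] for name in selected_names if name in fused}


def target_decoy_lengths(fused):
    """Total residues in the target halves and in the decoy halves."""
    target = sum(len(seq) // 2 for seq in fused.values())
    decoy = sum(len(seq) - len(seq) // 2 for seq in fused.values())
    return target, decoy
```

Because every entry carries its own decoy, any subset of proteins chosen for the second stage keeps target and decoy content balanced, which is the condition the abstract identifies as necessary for unbiased TDA-based FDR estimation.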
Affiliation(s)
- Mark V Ivanov
- Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow, Russia
- Moscow Institute of Physics and Technology (State University), Moscow, Russia
| | - Lev I Levitsky
- Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow, Russia
- Moscow Institute of Physics and Technology (State University), Moscow, Russia
| | - Mikhail V Gorshkov
- Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow, Russia.
- Moscow Institute of Physics and Technology (State University), Moscow, Russia.
|
179
|
Deutsch EW, Overall CM, Van Eyk JE, Baker MS, Paik YK, Weintraub ST, Lane L, Martens L, Vandenbrouck Y, Kusebauch U, Hancock WS, Hermjakob H, Aebersold R, Moritz RL, Omenn GS. Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 2.1. J Proteome Res 2016; 15:3961-3970. [PMID: 27490519 DOI: 10.1021/acs.jproteome.6b00392] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Indexed: 01/02/2023]
Abstract
Every data-rich community research effort requires a clear plan for ensuring the quality of the data interpretation and comparability of analyses. To address this need within the Human Proteome Project (HPP) of the Human Proteome Organization (HUPO), we have developed through broad consultation a set of mass spectrometry data interpretation guidelines that should be applied to all HPP data contributions. For submission of manuscripts reporting HPP protein identification results, the guidelines are presented as a one-page checklist containing 15 essential points followed by two pages of expanded description of each. Here we present an overview of the guidelines and provide an in-depth description of each of the 15 elements to facilitate understanding of the intentions and rationale behind the guidelines, for both authors and reviewers. Broadly, these guidelines provide specific directions regarding how HPP data are to be submitted to mass spectrometry data repositories, how error analysis should be presented, and how detection of novel proteins should be supported with additional confirmatory evidence. These guidelines, developed by the HPP community, are presented to the broader scientific community for further discussion.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
| | - Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada
| | - Jennifer E Van Eyk
- Advanced Clinical Biosystems Research Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| | - Mark S Baker
- Department of Biomedical Sciences, Faculty of Medicine and Health Science, Macquarie University, Sydney, New South Wales 2109, Australia
| | - Young-Ki Paik
- Yonsei Proteome Research Center and Department of Biochemistry, Yonsei University, 50 Yonsei-ro, Sudaemoon-ku, Seoul 120-749, Korea
| | - Susan T Weintraub
- The University of Texas, Health Science Center at San Antonio, San Antonio, Texas 78229, United States
| | - Lydie Lane
- SIB Swiss Institute of Bioinformatics and Department of Human Protein Science, Faculty of Medicine, University of Geneva, CMU, Michel Servet 1, 1211 Geneva 4, Switzerland
| | - Lennart Martens
- Department of Medical Protein Research, VIB, Ghent 9052, Belgium; Department of Biochemistry, Ghent University, Ghent B-9000, Belgium
| | - Yves Vandenbrouck
- French Proteomics Infrastructure, Biosciences and Biotechnology Institute of Grenoble (BIG), Université Grenoble Alpes, CEA, INSERM, U1038 Grenoble, France
| | - Ulrike Kusebauch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
| | - William S Hancock
- Department of Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; National Center for Protein Sciences, Beijing 102206, China
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland; Faculty of Science, University of Zurich, 8006 Zurich, Switzerland
| | - Robert L Moritz
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
| | - Gilbert S Omenn
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States; Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
|
180
|
May DH, Timmins-Schiffman E, Mikan MP, Harvey HR, Borenstein E, Nunn BL, Noble WS. An Alignment-Free "Metapeptide" Strategy for Metaproteomic Characterization of Microbiome Samples Using Shotgun Metagenomic Sequencing. J Proteome Res 2016; 15:2697-705. [PMID: 27396978 PMCID: PMC5116374 DOI: 10.1021/acs.jproteome.6b00239] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 01/29/2023]
Abstract
In principle, tandem mass spectrometry can be used to detect and quantify the peptides present in a microbiome sample, enabling functional and taxonomic insight into microbiome metabolic activity. However, the phylogenetic diversity constituting a particular microbiome is often unknown, and many of the organisms present may not have assembled genomes. In ocean microbiome samples, with particularly diverse and uncultured bacterial communities, it is difficult to construct protein databases that contain the bulk of the peptides in the sample without losing detection sensitivity due to the overwhelming number of candidate peptides for each tandem mass spectrum. We describe a method for deriving "metapeptides" (short amino acid sequences that may be represented in multiple organisms) from shotgun metagenomic sequencing of microbiome samples. In two ocean microbiome samples, we constructed site-specific metapeptide databases to detect more than one and a half times as many peptides as by searching against predicted genes from an assembled metagenome and roughly three times as many peptides as by searching against the NCBI environmental proteome database. The increased peptide yield has the potential to enrich the taxonomic and functional characterization of sample metaproteomes.
Affiliation(s)
- Damon H May
- Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195-5065, United States
| | - Emma Timmins-Schiffman
- Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195-5065, United States
| | - Molly P Mikan
- Department of Ocean, Earth & Atmospheric Sciences, Old Dominion University, Norfolk, Virginia 23529, United States
| | - H Rodger Harvey
- Department of Ocean, Earth & Atmospheric Sciences, Old Dominion University, Norfolk, Virginia 23529, United States
| | - Elhanan Borenstein
- Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195-5065, United States
- Santa Fe Institute, Santa Fe, New Mexico 87501, United States
| | - Brook L Nunn
- Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195-5065, United States
| | - William S Noble
- Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195-5065, United States
|
181
|
Tsou CC, Tsai CF, Teo GC, Chen YJ, Nesvizhskii AI. Untargeted, spectral library-free analysis of data-independent acquisition proteomics data generated using Orbitrap mass spectrometers. Proteomics 2016; 16:2257-71. [PMID: 27246681 DOI: 10.1002/pmic.201500526] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 12/17/2015] [Revised: 04/11/2016] [Accepted: 05/30/2016] [Indexed: 12/12/2022]
Abstract
We describe an improved version of the data-independent acquisition (DIA) computational analysis tool DIA-Umpire, and show that it enables highly sensitive, untargeted, and direct (spectral library-free) analysis of DIA data obtained using the Orbitrap family of mass spectrometers. DIA-Umpire v2 implements an improved feature detection algorithm with two additional filters based on the isotope pattern and fractional peptide mass analysis. The targeted re-extraction step of DIA-Umpire is updated with an improved scoring function and a more robust, semiparametric mixture modeling of the resulting scores for computing posterior probabilities of correct peptide identification in a targeted setting. Using two publicly available Q Exactive DIA datasets generated using HEK-293 cells and human liver microtissues, we demonstrate that DIA-Umpire can identify a similar number of peptide ions as conventional data-dependent acquisition, but with better identification reproducibility between replicates and samples. We further demonstrate the utility of DIA-Umpire using a series of Orbitrap Fusion DIA experiments with HeLa cell lysates profiled using conventional data-dependent acquisition and using DIA with different isolation window widths.
Affiliation(s)
- Chih-Chiang Tsou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | | | - Guo Ci Teo
- Department of Pathology, University of Michigan, Ann Arbor, MI, USA
| | - Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei, Taiwan
| | - Alexey I Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Pathology, University of Michigan, Ann Arbor, MI, USA
|
182
|
Muth T, Renard BY, Martens L. Metaproteomic data analysis at a glance: advances in computational microbial community proteomics. Expert Rev Proteomics 2016; 13:757-69. [DOI: 10.1080/14789450.2016.1209418] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 02/07/2023]
|
183
|
Jian L, Xia Z, Niu X, Liang X, Samir P, Link AJ. l2 Multiple Kernel Fuzzy SVM-Based Data Fusion for Improving Peptide Identification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:804-809. [PMID: 26394437 DOI: 10.1109/tcbb.2015.2480084] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
SEQUEST is a database-searching engine that calculates the correlation score between an observed spectrum and a theoretical spectrum deduced from protein sequences stored in a flat text file, even though such a file is not a relational or object-oriented repository. Nevertheless, the SEQUEST score functions fail to discriminate accurately between true and false PSMs. Some approaches, such as PeptideProphet and Percolator, have been proposed to address the task of distinguishing true and false PSMs. However, most of these methods employ time-consuming learning algorithms to validate peptide assignments [1]. In this paper, we propose a fast algorithm for validating peptide identification by incorporating heterogeneous information from SEQUEST scores and knowledge of peptide digestion. To automate the peptide identification process and incorporate additional information, we employ l2 multiple kernel learning (MKL) to implement the current peptide identification task. Results on experimental datasets indicate that, compared with state-of-the-art methods, i.e., PeptideProphet and Percolator, our data fusion strategy has comparable performance but reduces the running time significantly.
|
184
|
Cooper B. Doubling down on phosphorylation as a variable peptide modification. Proteomics 2016; 16:2444-7. [DOI: 10.1002/pmic.201500440] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/01/2015] [Revised: 04/13/2016] [Accepted: 05/04/2016] [Indexed: 01/21/2023]
Affiliation(s)
- Bret Cooper
- Soybean Genomics and Improvement Laboratory; USDA-ARS; Beltsville MD USA
|
185
|
Wang S, Halloran JT, Bilmes JA, Noble WS. Faster and more accurate graphical model identification of tandem mass spectra using trellises. Bioinformatics 2016; 32:i322-i331. [PMID: 27307634 PMCID: PMC4908353 DOI: 10.1093/bioinformatics/btw269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Tandem mass spectrometry (MS/MS) is the dominant high throughput technology for identifying and quantifying proteins in complex biological samples. Analysis of the tens of thousands of fragmentation spectra produced by an MS/MS experiment begins by assigning to each observed spectrum the peptide that is hypothesized to be responsible for generating the spectrum. This assignment is typically done by searching each spectrum against a database of peptides. To our knowledge, all existing MS/MS search engines compute scores individually between a given observed spectrum and each possible candidate peptide from the database. In this work, we use a trellis, a data structure capable of jointly representing a large set of candidate peptides, to avoid redundantly recomputing common sub-computations among different candidates. We show how trellises may be used to significantly speed up existing scoring algorithms, and we theoretically quantify the expected speedup afforded by trellises. Furthermore, we demonstrate that compact trellis representations of whole sets of peptides enable efficient discriminative learning of a dynamic Bayesian network for spectrum identification, leading to greatly improved spectrum identification accuracy. Contact: bilmes@uw.edu or william-noble@uw.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
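The idea of sharing sub-computations among candidate peptides can be illustrated with a structure much simpler than the trellis the abstract describes. The sketch below is a toy under stated assumptions: it uses a prefix cache (a trie-like dictionary) rather than a trellis, it scores nothing, and it only shows that prefix masses shared by several candidates need to be computed once. The function names and the candidate list are hypothetical; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "K": 128.09496,
}

def prefix_masses_shared(peptides):
    """Return {prefix: mass} for every distinct prefix of the candidates.

    Each distinct prefix is computed once by extending its parent prefix,
    so candidates that share a prefix also share the work.
    """
    cache = {"": 0.0}
    for pep in peptides:
        for i in range(1, len(pep) + 1):
            prefix = pep[:i]
            if prefix not in cache:
                cache[prefix] = cache[prefix[:-1]] + RESIDUE_MASS[prefix[-1]]
    del cache[""]
    return cache

def naive_prefix_count(peptides):
    """Number of prefix masses computed when each candidate is handled alone."""
    return sum(len(p) for p in peptides)

# Hypothetical candidates that share the prefixes "G", "GA" and "GAS".
candidates = ["GASP", "GASK", "GAVK"]
shared = prefix_masses_shared(candidates)
print(naive_prefix_count(candidates), len(shared))
```

For these three candidates the per-candidate approach computes 12 prefix masses, while the shared cache holds 7 distinct ones; the saving grows with the amount of overlap among candidates.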
Affiliation(s)
| | | | - Jeff A Bilmes
- Department of Computer Science and Engineering; Department of Electrical Engineering
| | - William S Noble
- Department of Computer Science and Engineering; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
|
186
|
Li Y, Wang X, Cho JH, Shaw TI, Wu Z, Bai B, Wang H, Zhou S, Beach TG, Wu G, Zhang J, Peng J. JUMPg: An Integrative Proteogenomics Pipeline Identifying Unannotated Proteins in Human Brain and Cancer Cells. J Proteome Res 2016; 15:2309-20. [PMID: 27225868 DOI: 10.1021/acs.jproteome.6b00344] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Indexed: 01/18/2023]
Abstract
Proteogenomics is an emerging approach to improve gene annotation and interpretation of proteomics data. Here we present JUMPg, an integrative proteogenomics pipeline including customized database construction, tag-based database search, peptide-spectrum match filtering, and data visualization. JUMPg creates multiple databases of DNA polymorphisms, mutations, splice junctions, and partial trypticity, as well as protein fragments translated from the whole transcriptome in all six frames upon RNA-seq de novo assembly. We use a multistage strategy to search these databases sequentially, in which the performance is optimized by re-searching only unmatched high-quality spectra and reusing amino acid tags generated by the JUMP search engine. The identified peptides/proteins are displayed with gene loci using the UCSC genome browser. Then, the JUMPg program is applied to process a label-free mass spectrometry data set of Alzheimer's disease postmortem brain, uncovering 496 new peptides of amino acid substitutions, alternative splicing, frame shift, and "non-coding gene" translation. The novel protein PNMA6BL specifically expressed in the brain is highlighted. We also tested JUMPg to analyze a stable-isotope labeled data set of multiple myeloma cells, revealing 991 sample-specific peptides that include protein sequences in the immunoglobulin light chain variable region. Thus, the JUMPg program is an effective proteogenomics tool for multiomics data integration.
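The multistage strategy in this abstract, searching databases in sequence and passing only unmatched high-quality spectra to later stages, can be sketched as a simple loop. This is an illustrative toy and not the JUMPg implementation: the function name, the `min_quality` threshold, the database names and the use of a set lookup in place of a real spectrum search are all assumptions made for the example.

```python
def multistage_search(spectra, databases, min_quality):
    """Search databases in order; later stages see only unmatched spectra.

    spectra: {spectrum_id: (peptide, quality)}, a stand-in for real MS/MS
        data in which 'peptide' plays the role of the best search candidate.
    databases: ordered list of (name, set_of_peptides).
    """
    assigned = {}
    unmatched = dict(spectra)
    for stage, (db_name, peptides) in enumerate(databases):
        if stage > 0:
            # After the first pass, keep only high-quality unmatched spectra.
            unmatched = {sid: v for sid, v in unmatched.items()
                         if v[1] >= min_quality}
        for sid, (peptide, _quality) in list(unmatched.items()):
            if peptide in peptides:
                assigned[sid] = db_name
                del unmatched[sid]
    return assigned

# Toy input: s3 is unmatched in stage 1 and too low in quality for stage 2.
toy_spectra = {"s1": ("AAK", 0.9), "s2": ("VVR", 0.8), "s3": ("QQK", 0.2)}
toy_dbs = [("reference", {"AAK"}), ("splice_junction", {"VVR", "QQK"})]
print(multistage_search(toy_spectra, toy_dbs, min_quality=0.5))
```

Restricting later stages to the unmatched, high-quality subset keeps the larger customized databases from being searched with spectra that are already explained or unlikely to yield a confident match.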
Affiliation(s)
| | | | | | | | | | | | - Hong Wang
- Integrated Biomedical Sciences Program, University of Tennessee Health Science Center, 920 Madison Avenue, Memphis, Tennessee 38163, United States
| | | | - Thomas G Beach
- Banner Sun Health Research Institute, Sun City, Arizona 85351, United States
|
187
|
Khatri K, Klein JA, White MR, Grant OC, Leymarie N, Woods RJ, Hartshorn KL, Zaia J. Integrated Omics and Computational Glycobiology Reveal Structural Basis for Influenza A Virus Glycan Microheterogeneity and Host Interactions. Mol Cell Proteomics 2016; 15:1895-912. [PMID: 26984886 PMCID: PMC5083086 DOI: 10.1074/mcp.m116.058016] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 01/06/2016] [Revised: 03/04/2016] [Indexed: 02/04/2023] Open
Abstract
Despite sustained biomedical research effort, influenza A virus remains an imminent threat to the world population and a major healthcare burden. The challenge in developing vaccines against influenza is the ability of the virus to mutate rapidly in response to selective immune pressure. Hemagglutinin is the predominant surface glycoprotein and the primary determinant of antigenicity, virulence and zoonotic potential. Mutations leading to changes in the number of HA glycosylation sites are often reported. Such genetic sequencing studies predict at best the disruption or creation of sequons for N-linked glycosylation; they do not reflect actual phenotypic changes in HA structure. Therefore, combined analysis of glycan micro and macro-heterogeneity and bioassays will better define the relationships among glycosylation, viral bioactivity and evolution. We present a study that integrates proteomics, glycomics and glycoproteomics of HA before and after adaptation to innate immune system pressure. We combined this information with glycan array and immune lectin binding data to correlate the phenotypic changes with biological activity. Underprocessed glycoforms predominated at the glycosylation sites found to be involved in viral evolution in response to selection pressures and interactions with innate immune-lectins. To understand the structural basis for site-specific glycan microheterogeneity at these sites, we performed structural modeling and molecular dynamics simulations. We observed that the presence of immature, high-mannose type glycans at a particular site correlated with reduced accessibility to glycan remodeling enzymes. Further, the high mannose glycans at sites implicated in immune lectin recognition were predicted to be capable of forming trimeric interactions with the immune-lectin surfactant protein-D.
Affiliation(s)
- Kshitij Khatri
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118
| | - Joshua A Klein
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118; Bioinformatics Program, Boston University, Boston, Massachusetts 02215
| | - Mitchell R White
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118
| | - Oliver C Grant
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602
| | - Nancy Leymarie
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118
| | - Robert J Woods
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602
| | - Kevan L Hartshorn
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts 02118
| | - Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston, Massachusetts 02118; Bioinformatics Program, Boston University, Boston, Massachusetts 02215
|
188
|
Bogdanow B, Zauber H, Selbach M. Systematic Errors in Peptide and Protein Identification and Quantification by Modified Peptides. Mol Cell Proteomics 2016; 15:2791-801. [PMID: 27215553 DOI: 10.1074/mcp.m115.055103] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 08/28/2015] [Indexed: 01/17/2023] Open
Abstract
The principle of shotgun proteomics is to use peptide mass spectra in order to identify corresponding sequences in a protein database. The quality of peptide and protein identification and quantification critically depends on the sensitivity and specificity of this assignment process. Many peptides in proteomic samples carry biochemical modifications, and a large fraction of unassigned spectra arise from modified peptides. Spectra derived from modified peptides can erroneously be assigned to wrong amino acid sequences. However, the impact of this problem on proteomic data has not yet been investigated systematically. Here we use combinations of different database searches to show that modified peptides can be responsible for 20-50% of false positive identifications in deep proteomic data sets. These false positive hits are particularly problematic as they have significantly higher scores and higher intensities than other false positive matches. Furthermore, these wrong peptide assignments lead to hundreds of false protein identifications and systematic biases in protein quantification. We devise a "cleaned search" strategy to address this problem and show that this considerably improves the sensitivity and specificity of proteomic data. In summary, we show that modified peptides cause systematic errors in peptide and protein identification and quantification and should therefore be considered to further improve the quality of proteomic data annotation.
Affiliation(s)
- Boris Bogdanow
- Proteome Dynamics lab, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 13, 13092 Berlin, Germany
| | - Henrik Zauber
- Proteome Dynamics lab, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 13, 13092 Berlin, Germany
| | - Matthias Selbach
- Proteome Dynamics lab, Max Delbrück Center for Molecular Medicine, Robert-Rössle-Str. 13, 13092 Berlin, Germany
|
189
|
Yu F, Li N, Yu W. ECL: an exhaustive search tool for the identification of cross-linked peptides using whole database. BMC Bioinformatics 2016; 17:217. [PMID: 27206479 PMCID: PMC4874008 DOI: 10.1186/s12859-016-1073-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/20/2015] [Accepted: 05/07/2016] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Chemical cross-linking combined with mass spectrometry (CX-MS) is a high-throughput approach to studying protein-protein interactions. The number of peptide-peptide combinations grows quadratically with respect to the number of proteins, resulting in a high computational complexity. Widely used methods including xQuest (Rinner et al., Nat Methods 5(4):315-8, 2008; Walzthoeni et al., Nat Methods 9(9):901-3, 2012), pLink (Yang et al., Nat Methods 9(9):904-6, 2012), ProteinProspector (Chu et al., Mol Cell Proteomics 9:25-31, 2010; Trnka et al., 13(2):420-34, 2014) and Kojak (Hoopmann et al., J Proteome Res 14(5):2190-198, 2015) avoid searching all peptide-peptide combinations by pre-selecting peptides with heuristic approaches. However, pre-selection procedures may cause missing findings. The most intuitive approach is searching all possible candidates. A tool that can exhaustively search a whole database without any heuristic pre-selection procedure is therefore desirable. RESULTS We have developed a cross-linked peptides identification tool named ECL. It can exhaustively search a whole database in a reasonable period of time without any heuristic pre-selection procedure. Tests showed that searching a database containing 5200 proteins took 7 h. ECL identified more non-redundant cross-linked peptides than xQuest, pLink, and ProteinProspector. Experiments showed that about 30% of these additional identified peptides were not pre-selected by Kojak. We used protein crystal structures from the protein data bank to check the intra-protein cross-linked peptides. Most of the distances between cross-linking sites were smaller than 30 Å. CONCLUSIONS To the best of our knowledge, ECL is the first tool that can exhaustively search all candidates in cross-linked peptides identification. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. A further analysis indicated that some of the additional identified results were thanks to the exhaustive search.
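The quadratic growth mentioned in this abstract follows from counting unordered peptide pairs: n peptides give n(n+1)/2 pairs when a peptide may also be linked to a copy of itself. The sketch below is a naive illustration of what searching every pair means, not ECL's algorithm; the function names, the toy masses, the linker mass and the tolerance are all assumptions, and a real search would also score fragment spectra rather than filter on precursor mass alone.

```python
from itertools import combinations_with_replacement

def pair_count(n_peptides):
    """Unordered peptide pairs, self-pairs included: n(n+1)/2."""
    return n_peptides * (n_peptides + 1) // 2

def exhaustive_pairs(peptide_masses, precursor_mass, linker_mass, tol):
    """Return every pair whose summed mass plus linker matches the precursor."""
    hits = []
    for (a, ma), (b, mb) in combinations_with_replacement(
            sorted(peptide_masses.items()), 2):
        if abs(ma + mb + linker_mass - precursor_mass) <= tol:
            hits.append((a, b))
    return hits

# Toy masses (Da); not taken from any real data set.
toy_masses = {"PEP1": 1000.0, "PEP2": 1500.0, "PEP3": 2000.0}
print(pair_count(len(toy_masses)))
print(exhaustive_pairs(toy_masses, 2638.0, 138.0, 0.01))
```

With three peptides there are six pairs to test; with 10,000 peptides the same formula gives about 50 million, which is why the tools cited above pre-select candidates and why an exhaustive search has to be engineered for speed.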
Affiliation(s)
- Fengchao Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Ning Li
- Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China.
| | - Weichuan Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
|
190
|
Shen S, Jiang X, Li J, Straubinger RM, Suarez M, Tu C, Duan X, Thompson AC, Qu J. Large-Scale, Ion-Current-Based Proteomic Investigation of the Rat Striatal Proteome in a Model of Short- and Long-Term Cocaine Withdrawal. J Proteome Res 2016; 15:1702-16. [PMID: 27018876 DOI: 10.1021/acs.jproteome.6b00137] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Abstract
Given the tremendous detriments of cocaine dependence, effective diagnosis and patient stratification are critical for successful intervention yet difficult to achieve due to the largely unknown molecular mechanisms involved. To obtain new insights into cocaine dependence and withdrawal, we employed a reproducible, reliable, and large-scale proteomics approach to investigate the striatal proteomes of rats (n = 40, 10 per group) subjected to chronic cocaine exposure, followed by either short- (WD1) or long- (WD22) term withdrawal. By implementing a surfactant-aided precipitation/on-pellet digestion procedure, a reproducible and sensitive nanoLC-Orbitrap MS analysis, and an optimized ion-current-based MS1 quantification pipeline, >2000 nonredundant proteins were quantified confidently without missing data in any replicate. Although cocaine was cleared from the body, 129/37 altered proteins were observed in WD1/WD22 that are implicated in several biological processes related closely to drug-induced neuroplasticity. Although many of these changes recapitulate the findings from independent studies reported over the last two decades, some novel insights were obtained and further validated by immunoassays. For example, significantly elevated striatal protein kinase C activity persisted over the 22 day cocaine withdrawal. Cofilin-1 activity was up-regulated in WD1 and down-regulated in WD22. These discoveries suggest potentially distinct structural plasticity after short- and long-term cocaine withdrawal. In addition, this study provides compelling evidence that blood vessel narrowing, a long-known effect of cocaine use, occurred after long-term but not short-term withdrawal. In summary, this work developed a well-optimized paradigm for ion-current-based quantitative proteomics in brain tissues and obtained novel insights into molecular alterations in the striatum following cocaine exposure and withdrawal.
Affiliation(s)
- Shichen Shen
- New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York 14203, United States; Department of Biochemistry, School of Medicine and Biomedical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States
| | - Xiaosheng Jiang
- Department of Pharmaceutical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States; New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York 14203, United States
| | - Jun Li
- Department of Pharmaceutical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States; New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York 14203, United States
| | - Robert M Straubinger
- Department of Pharmaceutical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States
| | - Mauricio Suarez
- Department of Psychology, SUNY at Buffalo, Buffalo, New York 14260, United States; Research Institute on Addictions, SUNY at Buffalo, Buffalo, New York 14203, United States
| | - Chengjian Tu
- New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York 14203, United States; Department of Biochemistry, School of Medicine and Biomedical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States
| | - Xiaotao Duan
- State Key Laboratory of Toxicology and Medical Countermeasures, Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China
| | - Alexis C Thompson
- Department of Psychology, SUNY at Buffalo, Buffalo, New York 14260, United States; Research Institute on Addictions, SUNY at Buffalo, Buffalo, New York 14203, United States
| | - Jun Qu
- Department of Pharmaceutical Sciences, SUNY at Buffalo, Buffalo, New York 14214, United States; New York State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, New York 14203, United States
|
191
|
Zelanis A, Menezes MC, Kitano ES, Liberato T, Tashima AK, Pinto AF, Sherman NE, Ho PL, Fox JW, Serrano SM. Proteomic identification of gender molecular markers in Bothrops jararaca venom. J Proteomics 2016; 139:26-37. [DOI: 10.1016/j.jprot.2016.02.030] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 11/19/2015] [Revised: 02/10/2016] [Accepted: 02/24/2016] [Indexed: 01/13/2023]
|
192
|
Shortreed MR, Frey BL, Scalf M, Knoener RA, Cesnik AJ, Smith LM. Elucidating Proteoform Families from Proteoform Intact-Mass and Lysine-Count Measurements. J Proteome Res 2016; 15:1213-21. [PMID: 26941048 PMCID: PMC4917391 DOI: 10.1021/acs.jproteome.5b01090] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Indexed: 01/21/2023]
Abstract
Proteomics is presently dominated by the “bottom-up” strategy, in which proteins are enzymatically digested into peptides for mass spectrometric identification. Although this approach is highly effective at identifying large numbers of proteins present in complex samples, the digestion into peptides renders it impossible to identify the proteoforms from which they were derived. We present here a powerful new strategy for the identification of proteoforms and the elucidation of proteoform families (groups of related proteoforms) from the experimental determination of the accurate proteoform mass and number of lysine residues contained. Accurate proteoform masses are determined by standard LC–MS analysis of undigested protein mixtures in an Orbitrap mass spectrometer, and the lysine count is determined using the NeuCode isotopic tagging method. We demonstrate the approach in analysis of the yeast proteome, revealing 8637 unique proteoforms and 1178 proteoform families. The elucidation of proteoforms and proteoform families afforded here provides an unprecedented new perspective upon proteome complexity and dynamics.
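Using two measurements together, an accurate intact mass and a lysine count, narrows the set of candidate proteoforms more than either does alone. The sketch below shows only that filtering idea under simple assumptions: the function name, the 10 ppm default tolerance and the toy candidate list are hypothetical, and the published method additionally groups related proteoforms into families, which is not attempted here.

```python
def match_proteoforms(observed_mass, observed_lysines, candidates, ppm_tol=10.0):
    """Return names of candidates consistent with both measurements.

    candidates: iterable of (name, theoretical_mass, sequence) tuples.
    """
    matches = []
    for name, mass, sequence in candidates:
        ppm_error = abs(observed_mass - mass) / mass * 1e6
        if sequence.count("K") == observed_lysines and ppm_error <= ppm_tol:
            matches.append(name)
    return matches

# Toy candidates; masses and sequences are invented for illustration.
toy_db = [
    ("A", 10000.00, "MKKAAK"),  # 3 lysines
    ("B", 10000.05, "MKAAAK"),  # 2 lysines, mass within 10 ppm of A
    ("C", 10050.00, "MKKAAK"),  # 3 lysines, mass far off
]
print(match_proteoforms(10000.02, 3, toy_db))
```

Candidates A and B are indistinguishable by mass at this tolerance, and the lysine count is what separates them.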
Affiliation(s)
- Michael R Shortreed
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
| | - Brian L Frey
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
| | - Mark Scalf
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
| | - Rachel A Knoener
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
| | - Anthony J Cesnik
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
| | - Lloyd M Smith
- Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States; Genome Center of Wisconsin, University of Wisconsin, 425G Henry Mall, Room 3420, Madison, Wisconsin 53706, United States
|
193
|
Abstract
Plant-omics is rapidly becoming an important field of study in the scientific community due to the urgent need to address many of the most important questions facing humanity today with regard to agriculture, medicine, biofuels, environmental decontamination, ecological sustainability, etc. High-performance mass spectrometry is a dominant tool for interrogating the metabolomes, peptidomes, and proteomes of a diversity of plant species under various conditions, revealing key insights into the functions and mechanisms of plant biochemistry.
Affiliation(s)
- Erin Gemperline
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Caitlin Keller
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lingjun Li
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, Wisconsin 53705, United States
|
194
|
Blein-Nicolas M, Zivy M. Thousand and one ways to quantify and compare protein abundances in label-free bottom-up proteomics. Biochimica et Biophysica Acta-Proteins and Proteomics 2016; 1864:883-95. [PMID: 26947242 DOI: 10.1016/j.bbapap.2016.02.019] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 11/05/2015] [Revised: 01/21/2016] [Accepted: 02/24/2016] [Indexed: 11/18/2022]
Abstract
How to process and analyze MS data to quantify and statistically compare protein abundances in bottom-up proteomics has been an open debate for nearly fifteen years. Two main approaches are generally used: the first is based on spectral data generated during identification (e.g. peptide counting, spectral counting), while the second uses extracted ion currents to quantify chromatographic peaks and infer protein abundances from peptide quantification. Each of these two approaches covers multiple methods that were developed during the last decade but have been evaluated in depth only recently. In this paper, we compiled these different methods as exhaustively as possible. We also summarized how they address the different problems raised by bottom-up protein quantification, such as normalization, the presence of shared peptides, unequal peptide measurability and missing data. This article is part of a Special Issue entitled: Plant Proteomics--a bridge between fundamental processes and crop production, edited by Dr. Hans-Peter Mock.
Affiliation(s)
- Mélisande Blein-Nicolas
  - GQE-Le Moulon, INRA, Univ Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, F-91190 Gif-sur-Yvette, France
- Michel Zivy
  - GQE-Le Moulon, INRA, Univ Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, F-91190 Gif-sur-Yvette, France
|
195
|
Aamodt JM, Grainger DW. Extracellular matrix-based biomaterial scaffolds and the host response. Biomaterials 2016; 86:68-82. [PMID: 26890039 DOI: 10.1016/j.biomaterials.2016.02.003] [Citation(s) in RCA: 297] [Impact Index Per Article: 37.1] [Received: 12/16/2015] [Revised: 01/30/2016] [Accepted: 02/01/2016] [Indexed: 01/08/2023]
Abstract
Extracellular matrix (ECM) collectively represents a class of naturally derived proteinaceous biomaterials purified from harvested organs and tissues with increasing scientific focus and utility in tissue engineering and repair. This interest stems predominantly from the largely unproven concept that processed ECM biomaterials, as natural tissue-derived matrices, integrate better with host tissue than purely synthetic biomaterials. Nearly every tissue type has been decellularized and processed for re-use as tissue-derived ECM protein implants and scaffolds. To date, however, little consensus exists for defining ECM compositions or sources that best constitute decellularized biomaterials that might better heal, integrate with host tissues and avoid the foreign body response (FBR). Metrics used to assess ECM performance in biomaterial implants are arbitrary and contextually specific by convention. Few comparisons of in vivo host responses to ECM implants from different sources have been published. This review discusses current characterization methods for ECM-derived biomaterials, including relationships between ECM material compositions from different sources, their properties and the host tissue response to them as implants. Relevant preclinical in vivo models are compared along with their associated advantages and limitations, and the current state of various metrics used to define material integration and biocompatibility is discussed. Common applications of these ECM-derived biomaterials as stand-alone implanted matrices and devices are compared with respect to host tissue responses.
Affiliation(s)
- Joseph M Aamodt
  - Department of Bioengineering, University of Utah, Salt Lake City, UT, 84112-5820, USA
- David W Grainger
  - Department of Bioengineering, University of Utah, Salt Lake City, UT, 84112-5820, USA
  - Department of Pharmaceutics and Pharmaceutical Chemistry, University of Utah, Salt Lake City, UT, 84112-5820, USA
|
196
|
Bischoff R, Permentier H, Guryev V, Horvatovich P. Genomic variability and protein species — Improving sequence coverage for proteogenomics. J Proteomics 2016; 134:25-36. [DOI: 10.1016/j.jprot.2015.09.021] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2015] [Revised: 09/06/2015] [Accepted: 09/14/2015] [Indexed: 12/30/2022]
|
197
|
Ning Z, Zhang X, Mayne J, Figeys D. Peptide-Centric Approaches Provide an Alternative Perspective To Re-Examine Quantitative Proteomic Data. Anal Chem 2016; 88:1973-8. [DOI: 10.1021/acs.analchem.5b04148] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
Affiliation(s)
- Zhibin Ning
  - Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada, K1H8M5
- Xu Zhang
  - Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada, K1H8M5
- Janice Mayne
  - Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada, K1H8M5
- Daniel Figeys
  - Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada, K1H8M5
|
198
|
Global proteogenomic analysis of human MHC class I-associated peptides derived from non-canonical reading frames. Nat Commun 2016; 7:10238. [PMID: 26728094 PMCID: PMC4728431 DOI: 10.1038/ncomms10238] [Citation(s) in RCA: 170] [Impact Index Per Article: 21.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2015] [Accepted: 11/16/2015] [Indexed: 12/21/2022] Open
Abstract
In view of recent reports documenting pervasive translation outside of canonical protein-coding sequences, we wished to determine the proportion of major histocompatibility complex (MHC) class I-associated peptides (MAPs) derived from non-canonical reading frames. Here we perform proteogenomic analyses of MAPs eluted from human B cells using high-throughput mass spectrometry to probe the six-frame translation of the B-cell transcriptome. We report that ∼10% of MAPs originate from allegedly noncoding genomic sequences or exonic out-of-frame translation. The biogenesis and properties of these 'cryptic MAPs' differ from those of conventional MAPs. Cryptic MAPs come from very short proteins with atypical C termini, and are coded by transcripts bearing long 3′UTRs enriched in destabilizing elements. Relative to conventional MAPs, cryptic MAPs display different MHC class I-binding preferences and harbour more genomic polymorphisms, some of which are immunogenic. Cryptic MAPs increase the complexity of the MAP repertoire and enhance the scope of CD8 T-cell immunosurveillance. Cryptic translation of the 'non-coding' genome is increasingly recognised; however, its biological significance remains unclear. Laumont et al. employ proteogenomic techniques to map the human immunoproteome, and find that approximately 10% of MHC class I-associated peptides are cryptic.
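The six-frame translation mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and a toy nucleotide sequence; the function names are hypothetical and this is not the pipeline used by the authors.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate three forward and three reverse-complement reading frames."""
    seq = seq.upper()
    frames = {}
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")  # toy sequence, not real data
for name, protein in frames.items():
    print(name, protein)
```

A peptide identified by mass spectrometry could then be searched as a substring in each translated frame; a hit outside the annotated frame would be a candidate non-canonical product.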
|
199
|
Guldbrandsen A, Barsnes H, Kroksveen AC, Berven FS, Vaudel M. A Simple Workflow for Large Scale Shotgun Glycoproteomics. Methods Mol Biol 2016; 1394:275-286. [PMID: 26700056 DOI: 10.1007/978-1-4939-3341-9_20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Targeting subproteomes is a good strategy to decrease the complexity of a sample, for example in body fluid biomarker studies. Glycoproteins are proteins with carbohydrates of varying size and structure attached to the polypeptide chain, and it has been shown that glycosylation plays essential roles in several vital cellular processes, making glycosylation a particularly interesting field of study. Here, we describe a method for the enrichment of glycosylated peptides from trypsin-digested proteins in human cerebrospinal fluid. We also describe how to analyze the mass spectrometry data for such samples, focusing on site-specific identification of glycosylation sites, using user-friendly open-source software.
Affiliation(s)
- Astrid Guldbrandsen
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - KG Jebsen Centre for Multiple Sclerosis Research, Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Harald Barsnes
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - KG Jebsen Center for Diabetes Research, Department of Clinical Sciences, University of Bergen, Bergen, Norway
- Ann Cathrine Kroksveen
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - KG Jebsen Centre for Multiple Sclerosis Research, Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Frode S Berven
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - KG Jebsen Centre for Multiple Sclerosis Research, Department of Clinical Medicine, University of Bergen, Bergen, Norway
  - Norwegian Multiple Sclerosis Competence Centre, Department of Neurology, Haukeland University Hospital, Bergen, Norway
- Marc Vaudel
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
|
200
|
Olexiouk V, Menschaert G. Identification of Small Novel Coding Sequences, a Proteogenomics Endeavor. Advances in Experimental Medicine and Biology 2016; 926:49-64. [PMID: 27686805 DOI: 10.1007/978-3-319-42316-6_4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 12/15/2022]
Abstract
The identification of small proteins and peptides has consistently proven to be challenging. However, technological advances as well as multi-omics endeavors facilitate the identification of novel small coding sequences, leading to new insights. Specifically, the application of next generation sequencing (NGS) technologies, which provide accurate and sample-specific transcriptome/translatome information, to the proteomics field has led to more comprehensive results and new discoveries. This book chapter focuses on the inclusion of RNA-Seq and RIBO-Seq, also known as ribosome profiling, an RNA-Seq-based technique that sequences the roughly 30 bp long fragments captured by translating ribosomes. We emphasize the identification of micropeptides and neo-antigens, two distinct classes of small translation products, triggering our current understanding of biology. RNA-Seq is capable of capturing sample-specific genomic variations, enabling focused neo-antigen identification. RIBO-Seq can identify translation events in small open reading frames that are considered to be non-coding, leading to the discovery of micropeptides. The identification of small translation products requires the integration of multi-omics data, stressing the importance of proteogenomics in this novel research area.
Affiliation(s)
- Volodimir Olexiouk
  - Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, Building A, Ghent, 9000, Belgium
- Gerben Menschaert
  - Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Department of Mathematical Modelling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, Building A, Ghent, 9000, Belgium
|