1. Ivanov MV, Lobas AA, Karpov DS, Moshkovskii SA, Gorshkov MV. Comparison of False Discovery Rate Control Strategies for Variant Peptide Identifications in Shotgun Proteogenomics. J Proteome Res 2017; 16:1936-1943. [DOI: 10.1021/acs.jproteome.6b01014] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Affiliation(s)
- Mark V. Ivanov
  - Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology (State University), Moscow Region, Dolgoprudny 141700, Russia
- Anna A. Lobas
  - Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology (State University), Moscow Region, Dolgoprudny 141700, Russia
- Dmitry S. Karpov
  - Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
- Sergei A. Moshkovskii
  - Institute of Biomedical Chemistry, Moscow 119121, Russia
  - Pirogov Russian National Research Medical University, Moscow 117997, Russia
- Mikhail V. Gorshkov
  - Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology (State University), Moscow Region, Dolgoprudny 141700, Russia
2. SpotLight Proteomics: uncovering the hidden blood proteome improves diagnostic power of proteomics. Sci Rep 2017; 7:41929. [PMID: 28167817 PMCID: PMC5294601 DOI: 10.1038/srep41929] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/01/2016] [Accepted: 01/05/2017] [Indexed: 01/25/2023]
Abstract
The human blood proteome is frequently assessed by protein abundance profiling using a combination of liquid chromatography and tandem mass spectrometry (LC-MS/MS). In traditional sequence database search, many good-quality MS/MS data remain unassigned. Here we uncover the hidden part of the blood proteome via the novel SpotLight approach. This method combines de novo MS/MS sequencing of enriched antibodies and co-extracted proteins with subsequent label-free quantification of new and known peptides in both enriched and unfractionated samples. In a pilot study on differentiating early stages of Alzheimer’s disease (AD) from Dementia with Lewy Bodies (DLB), the hidden proteome contributed almost as much information to patient stratification at the peptide level as the apparent proteome. Intriguingly, many of the new peptide sequences are attributable to antibody variable regions and are potentially indicative of disease etiology. When the hidden and apparent proteomes are combined, the accuracy of differentiating AD (n = 97) and DLB (n = 47) increases from ≈85% to ≈95%. The low added burden of SpotLight proteome analysis makes it attractive for use in clinical settings.
3. Deutsch EW, Sun Z, Campbell DS, Binz PA, Farrah T, Shteynberg D, Mendoza L, Omenn GS, Moritz RL. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics. J Proteome Res 2016; 15:4091-4100. [PMID: 27577934 DOI: 10.1021/acs.jproteome.6b00445] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem that is especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted genes and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study.
We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month, automatically generates a new set of search databases, and makes them available for download at http://www.peptideatlas.org/thisp/.
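The two-step strategy in the abstract (search against a simpler database, then check peptide uniqueness against a more complex one) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy FASTA text, and the peptide sequences are hypothetical, and uniqueness is approximated by exact substring matching, without isoleucine/leucine equivalence or enzyme-specific digestion rules.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a larger-tier database (not real THISP content).
TOY_TIER_FASTA = """\
>P1 toy protein one
MKTAYIAKQR
>P2 toy protein two
GGMKTAYIAKLL
>P3 toy protein three
QQSLENDPEPTIDER
"""


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {accession: sequence}."""
    proteins: Dict[str, str] = {}
    accession = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the accession.
            accession = line[1:].split()[0]
            proteins[accession] = ""
        elif accession is not None:
            proteins[accession] += line
    return proteins


def matching_proteins(peptide: str, proteins: Dict[str, str]) -> List[str]:
    """Return the sorted accessions whose sequence contains the peptide."""
    return sorted(acc for acc, seq in proteins.items() if peptide in seq)


def unique_peptides(peptides: Iterable[str],
                    proteins: Dict[str, str]) -> Dict[str, str]:
    """Keep only peptides that map to exactly one protein in the database."""
    unique: Dict[str, str] = {}
    for peptide in peptides:
        hits = matching_proteins(peptide, proteins)
        if len(hits) == 1:
            unique[peptide] = hits[0]
    return unique


# Peptides assumed to have been identified in a search of the smaller database.
identified = ["TAYIAK", "SLENDPEPTIDER"]
larger_tier = parse_fasta(TOY_TIER_FASTA)
print(unique_peptides(identified, larger_tier))
```

In this toy case, `TAYIAK` occurs in two entries of the larger database and is therefore dropped, while `SLENDPEPTIDER` maps to a single entry and is retained. A real analysis would load the downloaded tier files from disk and would likely need a faster index than a linear substring scan.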
Affiliation(s)
- Eric W Deutsch
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Zhi Sun
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- David S Campbell
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Pierre-Alain Binz
  - CHUV Centre Universitaire Hospitalier Vaudois, 1011 Lausanne, Switzerland
- Terry Farrah
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- David Shteynberg
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Luis Mendoza
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Gilbert S Omenn
  - Institute for Systems Biology, Seattle, Washington 98109, United States
  - Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
- Robert L Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States