1
Geer LY, Lapin J, Slotta DJ, Mak TD, Stein SE. AIomics: Exploring More of the Proteome Using Mass Spectral Libraries Extended by Artificial Intelligence. J Proteome Res 2023; 22:2246-2255. [PMID: 37232537 PMCID: PMC10542943 DOI: 10.1021/acs.jproteome.2c00807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
The unbounded permutations of biological molecules, including proteins and their constituent peptides, present a dilemma in identifying the components of complex biosamples. Sequence search algorithms used to identify peptide spectra can be expanded to cover larger classes of molecules, including more modifications, isoforms, and atypical cleavage, but at the cost of false positives or false negatives due to the simplified spectra they compute from sequence records. Spectral library searching can help solve this issue by precisely matching experimental spectra to library spectra with excellent sensitivity and specificity. However, compiling spectral libraries that span entire proteomes is pragmatically difficult. Neural networks that predict complete spectra containing a full range of annotated and unannotated ions can be used to replace these simplified spectra with libraries of fully predicted spectra, including modified peptides. Using such a network, we created predicted spectral libraries that were used to rescore matches from a sequence search done over a large search space, including a large number of modifications. Rescoring improved the separation of true and false hits by 82%, yielding an 8% increase in peptide identifications, including a 21% increase in nonspecifically cleaved peptides and a 17% increase in phosphopeptides.
Affiliation(s)
- Lewis Y. Geer
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Joel Lapin
  - Department of Physics, Georgetown University, Washington, DC 20057, United States
  - Associate, Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Douglas J. Slotta
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Tytus D. Mak
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
- Stephen E. Stein
  - Mass Spectrometry Data Center, National Institute of Standards and Technology, Biomolecular Measurement Division, 100 Bureau Dr., Gaithersburg, Maryland 20899, United States
2
Fan Z, Jia W. Lactobacillus casei-Derived Postbiotics Elevate the Bioaccessibility of Proteins via Allosteric Regulation of Pepsin and Trypsin and Introduction of Endopeptidases. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2023. [PMID: 37410960 DOI: 10.1021/acs.jafc.3c02125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/08/2023]
Abstract
The potential of probiotics to benefit digestion has been widely reported, while its utilization in high-risk patients and potential adverse reactions have focused interest on postbiotics. A variable data-independent acquisition (vDIA)-based spatial-omics strategy integrated with unsupervised variational autoencoders was applied to profile the functional mechanism underlying the action of Lactobacillus casei-derived postbiotic supplementation in goat milk digestion in an infant digestive system, from a metabolomics-peptidomics-proteomics perspective. Amide and olefin derivatives were proved to elevate the activities of pepsin and trypsin through hydrogen bonding and hydrophobic forces based on allosteric effects, and recognition of nine endopeptidases and their cleavage to serine, proline, and aspartate were introduced by postbiotics, thereby promoting the generation of hydrophilic peptides and elevating the bioaccessibility of goat milk protein. The peptides originating from αs1-casein, β-casein, β-lactoglobulin, Ig-like domain-containing protein, κ-casein, and serum amyloid A protein, with multiple bioactivities including angiotensin I-converting enzyme (ACE)-inhibitory, osteoanabolic, dipeptidyl peptidase IV (DPP-IV) inhibitory, antimicrobial, bradykinin-potentiating, antioxidant, and anti-inflammatory activities, were significantly increased in the postbiotic supplementation group, which was also considered to potentially prevent necrotizing enterocolitis through inhibiting the multiplication of pathogenic bacteria and blocking signal transducer and activator of transcription 1 and nuclear factor kappa-light-chain-enhancer of activated B cells inflammatory pathways. This research deepened the understanding of the mechanism underlying the postbiotics affecting goat milk digestion, which established a critical groundwork for the clinical application of postbiotics in infant complementary foods.
Affiliation(s)
- Zibian Fan
  - School of Food and Biological Engineering, Shaanxi University of Science & Technology, Xi'an 710021, China
- Wei Jia
  - School of Food and Biological Engineering, Shaanxi University of Science & Technology, Xi'an 710021, China
  - Shaanxi Research Institute of Agricultural Products Processing Technology, Xi'an 710021, China
3
Searle BC, Shannon AE, Wilburn DB. Scribe: Next Generation Library Searching for DDA Experiments. J Proteome Res 2023; 22:482-490. [PMID: 36695531 DOI: 10.1021/acs.jproteome.2c00672] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 01/26/2023]
Abstract
Spectrum library searching is a powerful alternative to database searching for data dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.
Affiliation(s)
- Brian C Searle
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Proteome Software Inc., Portland, Oregon 97219, United States
- Ariana E Shannon
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
- Damien Beau Wilburn
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
4
Geiser DL, Li W, Pham DQD, Wysocki VH, Winzerling JJ. Shotgun and TMT-Labeled Proteomic Analysis of the Ovarian Proteins of an Insect Vector, Aedes aegypti (Diptera: Culicidae). JOURNAL OF INSECT SCIENCE (ONLINE) 2022; 22:7. [PMID: 35303100 PMCID: PMC8932505 DOI: 10.1093/jisesa/ieac018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/23/2021] [Indexed: 06/14/2023]
Abstract
Aedes aegypti [Linnaeus in Hasselquist; yellow fever mosquito] transmits several viruses that infect millions of people each year, including Zika, dengue, yellow fever, chikungunya, and West Nile. Pathogen transmission occurs during blood feeding. Only the females blood feed as they require a bloodmeal for oogenesis; in the bloodmeal, holo-transferrin and hemoglobin provide the females with a high iron load. We are interested in the effects of the bloodmeal on the expression of iron-associated proteins in oogenesis. Previous data showed that following digestion of a bloodmeal, the ovarian iron concentration doubles by 72 h. We have used shotgun proteomics to identify proteins expressed in Ae. aegypti ovaries at two oogenesis developmental stages following blood feeding, and tandem mass tag-labeling proteomics to quantify proteins expressed at one stage following feeding of a controlled iron diet. Our findings provide the first report of mosquito ovarian protein expression in early and late oogenesis. We identify proteins differentially expressed in the two oogenesis developmental stages. We establish that metal-associated proteins play an important role in Ae. aegypti oogenesis and we identify new candidate proteins that might be involved in mosquito iron metabolism. Finally, this work identified a unique second ferritin light chain subunit, the first reported in any species. The shotgun proteomic data are available via ProteomeXchange with identifier PXD005893, while the tandem mass tag-labeled proteomic data are available with identifier PXD028242.
Affiliation(s)
- Dawn L Geiser
  - Nutritional Sciences, Division of Agriculture, Life and Veterinary Sciences, University of Arizona, Tucson, AZ 85721, USA
- Wenzhou Li
  - Department of Chemistry and Biochemistry, College of Science, University of Arizona, Tucson, AZ 85721, USA
  - Present Address: Amgen Incorporation, One Amgen Center Drive, Thousand Oaks, CA 91320, USA
- Daphne Q-D Pham
  - Department of Biological Sciences, University of Wisconsin-Parkside, Kenosha, WI 53141, USA
- Vicki H Wysocki
  - Department of Chemistry and Biochemistry, College of Science, University of Arizona, Tucson, AZ 85721, USA
  - Present Address: Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
- Joy J Winzerling
  - Nutritional Sciences, Division of Agriculture, Life and Veterinary Sciences, University of Arizona, Tucson, AZ 85721, USA
5
Zhu S, Yang C, Wu W. MSPoisDM: A Novel Peptide Identification Algorithm Optimized for Tandem Mass Spectra. BIO WEB OF CONFERENCES 2022. [DOI: 10.1051/bioconf/20225501003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Tandem mass spectrometry (MS/MS) plays an extremely important role in proteomics research. Modern experiments can generate thousands of spectra, and interpreting LC-MS/MS data remains a challenging problem in tandem mass spectra analysis. Our peptide identification algorithm, MSPoisDM, integrates intensity information derived from target-decoy statistics, a source of information that is often undervalued. Furthermore, to make better use of the intensity information, we propose a novel scoring model based on the Poisson distribution. Compared with the commonly used commercial software Mascot and Sequest at 1% FDR, the results show that MSPoisDM is robust and versatile across datasets obtained from different instruments. We expect that MSPoisDM will be broadly applied in proteomics studies.
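To illustrate what a Poisson-based match score can look like in general, the sketch below scores a peptide-spectrum match by how unlikely its number of matched peaks would be under random matching. This is not the MSPoisDM model: the abstract does not give its formula, and this sketch ignores peak intensities entirely. The function names and the `expected_random_matches` parameter are hypothetical.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Complement of the cumulative probability P(X <= k - 1).
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def poisson_score(matched_peaks: int, expected_random_matches: float) -> float:
    """Convert the chance probability into a -10*log10 score (higher is better).

    expected_random_matches is an assumed input: the mean number of peaks
    a random peptide would be expected to match in this spectrum.
    """
    p = poisson_tail(matched_peaks, expected_random_matches)
    return -10.0 * math.log10(max(p, 1e-300))  # floor avoids log10(0)


if __name__ == "__main__":
    # Toy comparison: 8 matched peaks versus 3, with 2 expected by chance.
    print(poisson_score(8, 2.0), poisson_score(3, 2.0))
```

A score of this kind only ranks candidates within one spectrum; controlling the false discovery rate across spectra still needs a separate step such as target-decoy estimation.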
6
Guan S, Bythell BJ. Size Dependent Fragmentation Chemistry of Short Doubly Protonated Tryptic Peptides. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2021; 32:1020-1032. [PMID: 33779179 DOI: 10.1021/jasms.1c00009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Tandem mass spectrometry of electrospray ionized multiply charged peptide ions is commonly used to identify the sequence of peptide(s) and infer the identity of source protein(s). Doubly protonated peptide ions are consistently the most efficiently sequenced ions following collision-induced dissociation of peptides generated by tryptic digestion. While the broad characteristics of longer (N ≥ 8 residue) doubly protonated peptides have been investigated, there is comparatively little data on shorter systems where charge repulsion should exhibit the greatest influence on the dissociation chemistry. To address this gap and further understand the chemistry underlying collisional-dissociation of doubly charged tryptic peptides, two series of analytes ([GxR+2H]2+ and [AxR+2H]2+, x = 2-5) were investigated experimentally and with theory. We find distinct differences in the preference of bond cleavage sites for these peptides as a function of size and to a lesser extent composition. Density functional calculations at two levels of theory predict that the threshold relative energies required for bond cleavages at the same site for peptides of different size are quite similar (for example, b2-yN-2). In isolation, this finding is inconsistent with experiment. However, the predicted extent of entropy change of these reactions is size dependent. Subsequent RRKM rate constant calculations provide a far clearer picture of the kinetics of the competing bond cleavage reactions enabling rationalization of experimental findings. The M06-2X data were substantially more consistent with experiment than were the B3LYP data.
Affiliation(s)
- Shanshan Guan
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
- Benjamin J Bythell
  - Department of Chemistry and Biochemistry, Ohio University, 307 Chemistry Building, Athens, Ohio 45701, United States
  - Department of Chemistry and Biochemistry, University of Missouri-St. Louis, 1 University Boulevard, St. Louis, Missouri 63121, United States
7
Wilburn DB, Richards AL, Swaney DL, Searle BC. CIDer: A Statistical Framework for Interpreting Differences in CID and HCD Fragmentation. J Proteome Res 2021; 20:1951-1965. [PMID: 33729787 DOI: 10.1021/acs.jproteome.0c00964] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Abstract
Library searching is a powerful technique for detecting peptides using either data independent or data dependent acquisition. While both large-scale spectrum library curators and deep learning prediction approaches have focused on beam-type CID fragmentation (HCD), resonance CID fragmentation remains a popular technique. Here we demonstrate an approach to model the differences between HCD and CID spectra, and present a software tool, CIDer, for converting libraries between the two fragmentation methods. We demonstrate that just using a combination of simple linear models and basic principles of peptide fragmentation, we can explain up to 43% of the variation between ions fragmented by HCD and CID across an array of collision energy settings. We further show that in some circumstances, searching converted CID libraries can detect more peptides than searching existing CID libraries or libraries of machine learning predictions from FASTA databases. These results suggest that leveraging information in existing libraries by converting between HCD and CID libraries may be an effective interim solution while large-scale CID libraries are being developed.
Affiliation(s)
- Damien B Wilburn
  - Institute for Systems Biology, Seattle, Washington 98109, United States
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Alicia L Richards
  - Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California 94158, United States
  - J. David Gladstone Institutes, San Francisco, California 94158, United States
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
- Danielle L Swaney
  - Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California 94158, United States
  - J. David Gladstone Institutes, San Francisco, California 94158, United States
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
- Brian C Searle
  - Institute for Systems Biology, Seattle, Washington 98109, United States
8
FPTMS: Frequency-based approach to identify the peptide from the low-energy collision-induced dissociation tandem mass spectra. J Proteomics 2021; 235:104116. [PMID: 33453436 DOI: 10.1016/j.jprot.2021.104116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2020] [Revised: 12/30/2020] [Accepted: 01/05/2021] [Indexed: 11/20/2022]
Abstract
The database search method is a widely accepted method to assign a peptide to the tandem mass spectra. In this study, a new flexible method, FPTMS, is introduced to interpret the tandem mass spectra with the known peptide sequences in a protein database. Here the frequency of occurrence of fragment ion peaks extracted from an extensive spectral library is used to predict the theoretical tandem mass spectra of the peptides. The dot product scoring and windowed-xcorr scoring methods were implemented to score the experimental spectrum against the theoretical peptide spectra. Windowed-xcorr is introduced to tackle the mass errors and the cleavage position of the fragmentation process. The new method with windowed-xcorr shows an improved identification rate compared to the existing search engines Crux-Tide and X!Tandem at 1% False Discovery Rate (FDR) for the dataset considered in this study. SIGNIFICANCE: Identifying or sequencing peptides from tandem mass spectra is an important application in mass spectrometry-based proteomics. Collision-induced dissociation (CID) fragmentation spectra have been widely used to develop peptide identification algorithms based on the database search strategy. CID fragmentation is a complex process that depends on the peptide sequence, charge state, and residue content. Including more features of peptide fragmentation behavior and an adaptable scoring algorithm improves the efficiency of the peptide identification algorithm.
9
Xu R, Sheng J, Bai M, Shu K, Zhu Y, Chang C. A Comprehensive Evaluation of MS/MS Spectrum Prediction Tools for Shotgun Proteomics. Proteomics 2020; 20:e1900345. [DOI: 10.1002/pmic.201900345] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/31/2019] [Revised: 04/29/2020] [Indexed: 01/27/2023]
Affiliation(s)
- Rui Xu
  - State Key Laboratory of Proteomics Beijing Proteome Research Center National Center for Protein Sciences (Beijing) Beijing Institute of Lifeomics Beijing 102206 China
  - Chongqing Key Laboratory on Big Data for Bio Intelligence Chongqing University of Posts and Telecommunications Chongqing 400065 China
- Jie Sheng
  - State Key Laboratory of Proteomics Beijing Proteome Research Center National Center for Protein Sciences (Beijing) Beijing Institute of Lifeomics Beijing 102206 China
  - Chongqing Key Laboratory on Big Data for Bio Intelligence Chongqing University of Posts and Telecommunications Chongqing 400065 China
- Mingze Bai
  - Chongqing Key Laboratory on Big Data for Bio Intelligence Chongqing University of Posts and Telecommunications Chongqing 400065 China
- Kunxian Shu
  - Chongqing Key Laboratory on Big Data for Bio Intelligence Chongqing University of Posts and Telecommunications Chongqing 400065 China
- Yunping Zhu
  - State Key Laboratory of Proteomics Beijing Proteome Research Center National Center for Protein Sciences (Beijing) Beijing Institute of Lifeomics Beijing 102206 China
- Cheng Chang
  - State Key Laboratory of Proteomics Beijing Proteome Research Center National Center for Protein Sciences (Beijing) Beijing Institute of Lifeomics Beijing 102206 China
10
Ramachandran S, Thomas T. A Frequency-Based Approach to Predict the Low-Energy Collision-Induced Dissociation Fragmentation Spectra. ACS OMEGA 2020; 5:12615-12622. [PMID: 32548445 PMCID: PMC7288360 DOI: 10.1021/acsomega.9b03935] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/18/2019] [Accepted: 05/12/2020] [Indexed: 06/11/2023]
Abstract
Peptide identification algorithms rely on the comparison between the experimental tandem mass spectrometry spectrum and the theoretical spectrum to identify a peptide from the tandem mass spectra. Hence, it is important to understand the fragmentation process and predict the tandem mass spectra for high-throughput proteomics research. In this study, a novel method was developed to predict the theoretical ion trap collision-induced dissociation (CID) tandem mass spectra of the singly, doubly, and triply charged tryptic peptides. The fragmentation statistics of the ion trap CID spectra were used to predict the theoretical tandem mass spectra of the peptide sequence. The study estimated the relative cleavage frequency for each pair of adjacent amino acids along the peptide length. The study showed that the cleavage frequency can be directly used to predict the tandem mass spectra. The predicted spectra show a high correlation with the experimental spectra used in this study; 99.73% of the high-quality reference spectra have correlation scores greater than 0.8. The new method predicts the theoretical spectrum and correlates significantly better with the experimental spectrum as compared to the existing spectrum prediction tools OpenMS_Simulator, MS2PIP, and MS2PBPI, where only 80, 85.76, and 85.80% of the spectral count, respectively, has a correlation score greater than 0.8.
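The idea of turning pairwise cleavage frequencies into a predicted fragmentation profile can be sketched as follows. This is an illustration only, not the authors' implementation: the `pair_freq` table, its values, and the `default` fallback are invented for the example, and a real table would be estimated from a spectral library.

```python
def predict_cleavage_profile(peptide, pair_freq, default=0.1):
    """Return one relative intensity per backbone bond, scaled so the
    most favoured bond has intensity 1.0.

    pair_freq maps a two-residue string (the residues flanking a bond)
    to a relative cleavage frequency; missing pairs fall back to
    `default`. Both are hypothetical inputs for this sketch.
    """
    if len(peptide) < 2:
        raise ValueError("peptide needs at least two residues")
    # Look up the frequency for every adjacent residue pair.
    raw = [pair_freq.get(peptide[i:i + 2], default)
           for i in range(len(peptide) - 1)]
    top = max(raw)
    return [value / top for value in raw]


if __name__ == "__main__":
    toy_table = {"DP": 0.9, "LK": 0.3}  # made-up frequencies
    print(predict_cleavage_profile("ADPLK", toy_table))
```

The profile has one entry per bond; mapping each entry onto b and y ion m/z values, and handling charge state, is left out of this sketch.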
11
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. MASS SPECTROMETRY REVIEWS 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
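Sequence database search engines generally compare an experimental spectrum with theoretical fragment masses computed from each candidate peptide. The sketch below computes singly charged b and y ion m/z values for an unmodified peptide from standard monoisotopic residue masses. The function name is hypothetical, and modifications, higher charge states, and neutral losses are deliberately left out.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    Entry j of each list describes cleavage after residue j + 1, so
    b_ions[j] and y_ions[j] are complementary fragments.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal piece
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal piece
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")
    print([round(m, 4) for m in b])
    print([round(m, 4) for m in y])
```

A search engine would then match these values against observed peaks within a mass tolerance and pass the matches to its scoring function.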
Affiliation(s)
- Kenneth Verheggen
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Helge Raeder
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
- Frode S Berven
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Harald Barsnes
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Norway
- Marc Vaudel
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
  - Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
12
Liu K, Li S, Wang L, Ye Y, Tang H. Full-Spectrum Prediction of Peptides Tandem Mass Spectra using Deep Neural Network. Anal Chem 2020; 92:4275-4283. [PMID: 32053352 DOI: 10.1021/acs.analchem.9b04867] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 01/03/2023]
Abstract
The ability to predict tandem mass (MS/MS) spectra from peptide sequences can significantly enhance our understanding of the peptide fragmentation process and could improve peptide identification in proteomics. However, current approaches for predicting high-energy collisional dissociation (HCD) spectra are limited to predicting the intensities of expected ion types, that is, the a/b/c/x/y/z ions and their neutral loss derivatives (referred to as backbone ions). In practice, backbone ions only account for <70% of total ion intensities in HCD spectra, indicating that many intense ions are ignored by current predictors. In this paper, we present a deep learning approach that can predict the complete spectra (both backbone and nonbackbone ions) directly from peptide sequences. We make no assumptions about which kinds of ions to predict but instead predict the intensities for all possible m/z values. Training this model requires neither fragment ion annotations nor any prior knowledge of the fragmentation rules. Our analyses show that the predicted 2+ and 3+ HCD spectra are highly similar to the experimental spectra, with average full-spectrum cosine similarities of 0.820 (±0.088) and 0.786 (±0.085), respectively, very close to the similarities between the experimental replicated spectra. In contrast, the best-performing backbone-only models achieve average similarities below 0.75 and 0.70 for 2+ and 3+ spectra, respectively. Furthermore, we developed a multitask learning (MTL) approach for predicting spectra for which training samples are insufficient, which allows our model to make accurate predictions for electron transfer dissociation (ETD) spectra and HCD spectra of less abundant charges (1+ and 4+).
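The full-spectrum cosine similarity quoted above can be illustrated with a short sketch that bins two peak lists on a common m/z grid and takes the cosine of the angle between the binned intensity vectors. The 1.0 m/z bin width and the function names are assumptions made for this example; the abstract does not state the binning the authors used.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is an iterable of (mz, intensity) pairs; the result maps a
    bin index to the summed intensity in that bin.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two spectra after binning.

    Every bin counts, annotated or not, which is what makes this a
    full-spectrum comparison rather than a backbone-ion-only one.
    """
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum has no defined direction
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    predicted = [(175.1, 0.6), (262.2, 1.0), (375.2, 0.4)]
    observed = [(175.1, 0.5), (262.2, 1.0), (401.3, 0.2)]
    print(round(cosine_similarity(predicted, observed), 3))
```

Narrower bins make the measure stricter about mass accuracy, at the cost of splitting peaks that fall near a bin edge.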
Affiliation(s)
- Kaiyuan Liu
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47405, United States
- Sujun Li
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47405, United States
- Lei Wang
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47405, United States
- Yuzhen Ye
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47405, United States
- Haixu Tang
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana 47405, United States
13
Ma WT, Liu ZY, Chen XZ, Lin ZL, Zheng ZB, Miao WG, Xie SQ. A protein identification algorithm for tandem mass spectrometry by incorporating the abundance of mRNA into a binomial probability scoring model. J Proteomics 2019; 197:53-59. [PMID: 30790687 DOI: 10.1016/j.jprot.2019.02.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/06/2018] [Revised: 02/15/2019] [Accepted: 02/17/2019] [Indexed: 12/17/2022]
Abstract
Peptide-spectrum match (PSM) scoring between experimental and theoretical spectra is a key step in the identification of proteins using mass spectrometry (MS)-based proteomics analyses. Efficient protein identification using MS/MS data remains a challenge. The strategy of using RNA-seq data increases the number of proteins identified by re-constructing the custom search database and integrating mRNA abundance into the false discovery rate of post-PSM. However, this process lacks an algorithm that allows mRNA abundance to be incorporated into the key PSM scoring model. Therefore, we developed a novel PSM scoring model, which incorporates mRNA abundance for improved peptide and protein identification. In the new algorithm, mRNA abundance information was transformed into the prior probability of protein identification and integrated into PSM rescoring using the binomial probability distribution model. Compared with other algorithms using five MS/MS datasets, the results showed that the least improvement ratios of peptide and protein groups were 3.39%-9.79% and 0.48%-8.16% in different datasets (human, rat, zebrafish, yeast, and Arabidopsis thaliana). The new strategy offers an effective solution for MS-based identification of peptides and proteins. SIGNIFICANCE: The new algorithm identifies proteins by quantifying mRNA abundance (FPKM) and incorporating it into a scoring model for peptide-spectrum matches. It is important to improve peptide and protein identification from MS/MS datasets in proteomics research.
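A binomial PSM score with a prior can be sketched as below. The binomial tail gives the probability of matching at least the observed number of fragment peaks by chance; adding the log of the prior is one simple way to fold in outside evidence such as mRNA abundance. That combination rule, the function names, and the `p_random` and `prior` parameters are assumptions of this sketch, since the abstract does not give the paper's formula.

```python
import math


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def psm_score(matched, total, p_random, prior=1.0):
    """Score a peptide-spectrum match; higher is better.

    matched / total: fragment peaks matched out of those predicted.
    p_random: assumed chance that any one fragment matches at random.
    prior: assumed prior probability in (0, 1] that the parent protein
    is present, for example derived from its mRNA abundance.
    """
    if not 0.0 < prior <= 1.0:
        raise ValueError("prior must lie in (0, 1]")
    tail = binomial_tail(matched, total, p_random)
    # A low chance probability raises the score; a low prior lowers it.
    return -math.log10(max(tail, 1e-300)) + math.log10(prior)


if __name__ == "__main__":
    # Same spectral evidence, different priors for the parent protein.
    print(psm_score(8, 10, 0.1, prior=0.9))
    print(psm_score(8, 10, 0.1, prior=0.1))
```

With `prior=1.0` the score reduces to the plain binomial score, so the prior acts purely as a penalty on matches to proteins with little transcript support.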
Collapse
Affiliation(s)
- Wen-Tai Ma
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
| | - Zhao-Yu Liu
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
| | - Xiao-Zhou Chen
- School of Mathematics and Computer science, Yunnan Minzu University, Kunming 650031, China
| | - Zhen-Liang Lin
- Department of General Surgery, The Affiliated Cangnan Hospital of Wenzhou Medical University, Wenzhou 325800, China
| | - Zhong-Bing Zheng
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China
| | - Wei-Guo Miao
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China.
| | - Shang-Qian Xie
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou 570228, China.
|
14
|
Zhou XX, Zeng WF, Chi H, Luo C, Liu C, Zhan J, He SM, Zhang Z. pDeep: Predicting MS/MS Spectra of Peptides with Deep Learning. Anal Chem 2017; 89:12690-12697. [DOI: 10.1021/acs.analchem.7b02566] [Citation(s) in RCA: 128] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Affiliation(s)
- Xie-Xuan Zhou
- State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wen-Feng Zeng
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Intelligent Information Processing of CAS, ICT, Chinese Academy of Sciences, Beijing 100190, China
| | - Hao Chi
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Intelligent Information Processing of CAS, ICT, Chinese Academy of Sciences, Beijing 100190, China
| | - Chunjie Luo
- State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Chao Liu
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Intelligent Information Processing of CAS, ICT, Chinese Academy of Sciences, Beijing 100190, China
| | - Jianfeng Zhan
- State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Si-Min He
- University of Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Intelligent Information Processing of CAS, ICT, Chinese Academy of Sciences, Beijing 100190, China
| | - Zhifei Zhang
- Capital Medical University, Beijing 100069, China
|
15
|
Poutsma JC, Martens J, Oomens J, Maitre P, Steinmetz V, Bernier M, Jia M, Wysocki V. Infrared Multiple-Photon Dissociation Action Spectroscopy of the b2+ Ion from PPG: Evidence of Third Residue Affecting b2+ Fragment Structure. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2017; 28:1482-1488. [PMID: 28374317 PMCID: PMC5484043 DOI: 10.1007/s13361-017-1659-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/01/2017] [Revised: 03/10/2017] [Accepted: 03/11/2017] [Indexed: 06/07/2023]
Abstract
Infrared multiple-photon dissociation (IRMPD) action spectroscopy was performed on the b2+ fragment ion from the protonated PPG tripeptide. Comparison of the experimental infrared spectrum with computed spectra for both oxazolone and diketopiperazine structures indicates that the majority of the fragment ion population has an oxazolone structure with the remainder having a diketopiperazine structure. This result is in contrast with a recent study of the IRMPD action spectrum of the PP b2+ fragment ion from PPP, which was found to be nearly 100% diketopiperazine (Martens et al. Int. J. Mass Spectrom. 2015, 377, 179). The diketopiperazine b2+ ion is thermodynamically more stable than the oxazolone but normally requires a trans/cis peptide bond isomerization in the dissociating peptide. Martens et al. showed through IRMPD action spectroscopy that the PPP precursor ion was in a conformation in which the first peptide bond is already in the cis conformation and thus it was energetically favorable to form the thermodynamically-favored diketopiperazine b2+ ion. In the present case, solution-phase NMR spectroscopy and gas-phase IRMPD action spectroscopy show that the PPG precursor ion has its first amide bond in a trans configuration, suggesting that the third residue is playing an important role in both the structure of the peptide and the associated ring-closure barriers for oxazolone and diketopiperazine formation.
Affiliation(s)
- John C Poutsma
- Department of Chemistry, College of William and Mary, Williamsburg, VA, 23187, USA.
| | - Jonathan Martens
- Radboud University, Institute for Molecules and Materials, FELIX Laboratory, Toernooiveld 7c, NL-6525ED, Nijmegen, The Netherlands
| | - Jos Oomens
- Radboud University, Institute for Molecules and Materials, FELIX Laboratory, Toernooiveld 7c, NL-6525ED, Nijmegen, The Netherlands
- Van't Hoff Institute for Molecular Sciences, University of Amsterdam, Science Park 908, 1098XH, Amsterdam, The Netherlands
| | - Phillipe Maitre
- Laboratoire de Chimie Physique, CNRS UMR 8000, Université Paris Sud, Université Paris Saclay, CNRS, Orsay, France
| | - Vincent Steinmetz
- Laboratoire de Chimie Physique, CNRS UMR 8000, Université Paris Sud, Université Paris Saclay, CNRS, Orsay, France
| | - Matthew Bernier
- Department of Chemistry, Ohio State University, Columbus, OH, 43210, USA
| | - Mengxuan Jia
- Department of Chemistry, Ohio State University, Columbus, OH, 43210, USA
| | - Vicki Wysocki
- Department of Chemistry, Ohio State University, Columbus, OH, 43210, USA.
|
16
|
Morrison LJ, Rosenberg JA, Singleton JP, Brodbelt JS. Statistical Examination of the a and a + 1 Fragment Ions from 193 nm Ultraviolet Photodissociation Reveals Local Hydrogen Bonding Interactions. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2016; 27:1443-53. [PMID: 27206509 PMCID: PMC4974117 DOI: 10.1007/s13361-016-1418-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/24/2016] [Revised: 05/01/2016] [Accepted: 05/06/2016] [Indexed: 05/11/2023]
Abstract
Dissociation of proteins and peptides by 193 nm ultraviolet photodissociation (UVPD) has gained momentum in proteomic studies because of the diversity of backbone fragments that are produced and subsequent unrivaled sequence coverage obtained by the approach. The pathways that form the basis for the production of particular ion types are not completely understood. In this study, a statistical approach is used to probe hydrogen atom elimination from a + 1 radical ions, and different extents of elimination are found to vary as a function of the identity of the C-terminal residue of the a product ions and the presence or absence of hydrogen bonds to the cleaved residue.
Affiliation(s)
| | - Jake A Rosenberg
- Department of Chemistry, University of Texas, Austin, TX, 78712, USA
|
17
|
Computational Methods in Mass Spectrometry-Based Proteomics. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016; 939:63-89. [PMID: 27807744 DOI: 10.1007/978-981-10-1503-8_4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
This chapter introduces computational methods used in mass spectrometry-based proteomics, including those for addressing the critical problems such as peptide identification and protein inference, peptide and protein quantification, characterization of posttranslational modifications (PTMs), and data-independent acquisitions (DIA). The chapter concludes with emerging applications of proteomic techniques, such as metaproteomics, glycoproteomics, and proteogenomics.
|
18
|
Song Y, Laskay ÜA, Vilcins IME, Barbour AG, Wysocki VH. Top-down-assisted bottom-up method for homologous protein sequencing: hemoglobin from 33 bird species. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2015; 26:1875-84. [PMID: 26111519 PMCID: PMC6467653 DOI: 10.1007/s13361-015-1185-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/23/2015] [Revised: 05/08/2015] [Accepted: 05/08/2015] [Indexed: 05/12/2023]
Abstract
Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.
Affiliation(s)
- Yang Song
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, AZ, 85721, USA
| | - Ünige A Laskay
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, AZ, 85721, USA
| | - Inger-Marie E Vilcins
- Emerging and Acute Infectious Diseases Branch, Department of State Health Services, Austin, TX, 78756, USA
| | - Alan G Barbour
- Microbiology and Molecular Genetics, Medicine, and Ecology and Evolutionary Biology, University of California, Irvine, CA, 92687, USA
| | - Vicki H Wysocki
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, OH, 43210, USA.
- Department of Chemistry and Biochemistry, The University of Arizona, Tucson, AZ, 85721, USA.
|
19
|
Levy MJ, Gucinski AC, Boyne MT. Primary Sequence Confirmation of a Protein Therapeutic Using Top Down MS/MS and MS3. Anal Chem 2015; 87:6995-9. [DOI: 10.1021/acs.analchem.5b01113] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Affiliation(s)
- Michaella J. Levy
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Office of Testing and Research, Division of Pharmaceutical Analysis, 645 S. Newstead Ave., St. Louis, Missouri 63110, United States
| | - Ashley C. Gucinski
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Office of Testing and Research, Division of Pharmaceutical Analysis, 645 S. Newstead Ave., St. Louis, Missouri 63110, United States
| | - Michael T. Boyne
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Office of Testing and Research, Division of Pharmaceutical Analysis, 645 S. Newstead Ave., St. Louis, Missouri 63110, United States
|
20
|
Wang H, Wang B, Wei Z, Cao Y, Guan X, Guo X. Characteristic neutral loss of CH3CHO from Thr-containing sodium-associated peptides. JOURNAL OF MASS SPECTROMETRY : JMS 2015; 50:488-494. [PMID: 25800185 DOI: 10.1002/jms.3555] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2014] [Revised: 11/15/2014] [Accepted: 11/23/2014] [Indexed: 06/04/2023]
Abstract
A characteristic neutral loss of 44 Da is observed in the MS/MS spectra of Thr-containing sodiated peptides. A combination of tandem mass spectrometry and quantum chemical calculations at the B3LYP/6-311G(d,p) level of theory is used to elucidate this fragmentation pathway. The high-resolution mass spectrometry data indicate that this neutral loss is acetaldehyde lost from the side chain of Thr rather than CO2. The intensity of this neutral loss is enhanced when the Thr residue is far from the C-terminus and when the C-terminus is esterified. The mechanism of the acetaldehyde loss is proposed to be a McLafferty-type rearrangement, which involves a proton transfer from the hydroxyl of the Thr side chain to its C-terminal neighboring carbonyl oxygen, inducing cleavage of the Cα-Cβ bond. This mechanism is further supported by examining the fragmentation of a [GT(tBu)G + Na]+ peptide derivative and by comparing the product ion spectra of [M + Na - 44]+ of [GTGA + Na]+ with [M + Na]+ of [GGGA + Na]+. A similar neutral loss of HCHO can also be detected in Ser-containing peptides. Our computational results reveal that the most stable [GTG + Na]+ ion is present as a tridentate charge-solvated structure and the dissociation leading to the 44 Da loss is dynamically and energetically favorable.
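The abstract's point that high-resolution data can tell acetaldehyde from CO2 rests on simple mass arithmetic: both losses are nominally 44 Da, but their monoisotopic masses differ by roughly 0.036 Da. The sketch below illustrates that arithmetic only. The function names and the example fragment m/z of 300 are assumptions made for illustration and do not come from the cited paper.

```python
# Monoisotopic masses (Da) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses for an elemental composition such as {"C": 2, "H": 4, "O": 1}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_separation(loss_a, loss_b, fragment_mz):
    """Separation of two candidate neutral losses, expressed in ppm of the fragment m/z."""
    return abs(loss_a - loss_b) / fragment_mz * 1e6


acetaldehyde = monoisotopic_mass({"C": 2, "H": 4, "O": 1})   # CH3CHO, proposed loss from Thr
carbon_dioxide = monoisotopic_mass({"C": 1, "O": 2})          # CO2, the alternative 44 Da loss
formaldehyde = monoisotopic_mass({"C": 1, "H": 2, "O": 1})    # HCHO, analogous loss from Ser

if __name__ == "__main__":
    print(f"CH3CHO: {acetaldehyde:.5f} Da")
    print(f"CO2:    {carbon_dioxide:.5f} Da")
    print(f"HCHO:   {formaldehyde:.5f} Da")
    print(f"CH3CHO - CO2: {acetaldehyde - carbon_dioxide:.5f} Da")
    # Hypothetical singly charged product ion at m/z 300, chosen only as an example.
    print(f"Separation at m/z 300: {ppm_separation(acetaldehyde, carbon_dioxide, 300.0):.0f} ppm")
```

A separation on the order of 100 ppm at this hypothetical m/z is far larger than the few-ppm mass accuracy typical of high-resolution instruments, which is why the two assignments can be distinguished.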
Affiliation(s)
- Huixin Wang
- College of Chemistry, Jilin University, Changchun, 130012, China
|
21
|
Wang H, Wang B, Wei Z, Zhang H, Guo X. Structure and further fragmentation of significant [a3 + Na - H]+ ions from sodium-cationized peptides. JOURNAL OF MASS SPECTROMETRY : JMS 2015; 50:212-219. [PMID: 25601695 DOI: 10.1002/jms.3520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/11/2014] [Revised: 09/21/2014] [Accepted: 10/02/2014] [Indexed: 06/04/2023]
Abstract
A good understanding of gas-phase fragmentation chemistry of peptides is important for accurate protein identification. Additional product ions obtained by sodiated peptides can provide useful sequence information supplementary to protonated peptides and improve protein identification. In this work, we first demonstrate that the sodiated a3 ions are abundant in the tandem mass spectra of sodium-cationized peptides although observations of a3 ions have rarely been reported in protonated peptides. Quantum chemical calculations combined with tandem mass spectrometry are used to investigate this phenomenon by using a model tetrapeptide GGAG. Our results reveal that the most stable [a3 + Na - H]+ ion is present as a bidentate linear structure in which the sodium cation coordinates to the two backbone carbonyl oxygen atoms. Due to structural inflexibility, further fragmentation of the [a3 + Na - H]+ ion needs to overcome several relatively high energetic barriers to form [b2 + Na - H]+ ion with a diketopiperazine structure. As a result, low abundance of [b2 + Na - H]+ ion is detected at relatively high collision energy. In addition, our computational data also indicate that the common oxazolone pathway to generate [b2 + Na - H]+ from the [a3 + Na - H]+ ion is unlikely. The present work provides a mechanistic insight into how a sodium ion affects the fragmentation behaviors of peptides.
Affiliation(s)
- Huixin Wang
- College of Chemistry, Jilin University, Changchun, 130012, China
|
22
|
Raulfs MDM, Breci L, Bernier M, Hamdy OM, Janiga A, Wysocki V, Poutsma JC. Investigations of the mechanism of the "proline effect" in tandem mass spectrometry experiments: the "pipecolic acid effect". JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2014; 25:1705-1715. [PMID: 25078156 DOI: 10.1007/s13361-014-0953-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/08/2013] [Revised: 06/12/2014] [Accepted: 06/14/2014] [Indexed: 06/03/2023]
Abstract
The fragmentation behavior of a set of model peptides containing proline, its four-membered ring analog azetidine-2-carboxylic acid (Aze), its six-membered ring analog pipecolic acid (Pip), an acyclic secondary amine residue N-methyl-alanine (NMeA), and the D stereoisomers of Pro and Pip has been determined using collision-induced dissociation in ESI-tandem mass spectrometers. Experimental results for AAXAA, AVXLG, AAAXA, AGXGA, and AXPAA peptides are presented, where X represents Pro, Aze, Pip, or NMeA. Aze- and Pro-containing peptides fragment according to the well-established "proline effect" through selective cleavage of the amide bond N-terminal to the Aze/Pro residue to give yn+ ions. In contrast, Pip- and NMeA-containing peptides fragment through a different mechanism, the "pipecolic acid effect," selectively at the amide bond C-terminal to the Pip/NMeA residue to give bn+ ions. Calculations of the relative basicities of various sites in model peptide molecules containing Aze, Pro, Pip, or NMeA indicate that whereas the "proline effect" can in part be rationalized by the increased basicity of the prolyl-amide site, the "pipecolic acid effect" cannot be justified through the basicity of the residue. Rather, the increased flexibility of the Pip and NMeA residues allows for conformations of the peptide for which transfer of the mobile proton to the amide site C-terminal to the Pip/NMeA becomes energetically favorable. This argument is supported by the differing results obtained for AAPAA versus AA(D-Pro)AA, a result that can best be explained by steric effects. Fragmentation of pentapeptides containing both Pro and Pip indicates that the "pipecolic acid effect" is stronger than the "proline effect."
Affiliation(s)
- Mary Disa M Raulfs
- Department of Chemistry, The College of William and Mary, Williamsburg, VA, 23187, USA
|
23
|
Toprak UH, Gillet LC, Maiolica A, Navarro P, Leitner A, Aebersold R. Conserved peptide fragmentation as a benchmarking tool for mass spectrometers and a discriminating feature for targeted proteomics. Mol Cell Proteomics 2014; 13:2056-71. [PMID: 24623587 PMCID: PMC4125737 DOI: 10.1074/mcp.o113.036475] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2013] [Revised: 02/26/2014] [Indexed: 12/21/2022] Open
Abstract
Quantifying the similarity of spectra is an important task in various areas of spectroscopy, for example, to identify a compound by comparing sample spectra to those of reference standards. In mass spectrometry based discovery proteomics, spectral comparisons are used to infer the amino acid sequence of peptides. In targeted proteomics by selected reaction monitoring (SRM) or SWATH MS, predetermined sets of fragment ion signals integrated over chromatographic time are used to identify target peptides in complex samples. In both cases, confidence in peptide identification is directly related to the quality of spectral matches. In this study, we used sets of simulated spectra of well-controlled dissimilarity to benchmark different spectral comparison measures and to develop a robust scoring scheme that quantifies the similarity of fragment ion spectra. We applied the normalized spectral contrast angle score to quantify the similarity of spectra to objectively assess fragment ion variability of tandem mass spectrometric datasets, to evaluate portability of peptide fragment ion spectra for targeted mass spectrometry across different types of mass spectrometers and to discriminate target assays from decoys in targeted proteomics. Altogether, this study validates the use of the normalized spectral contrast angle as a sensitive spectral similarity measure for targeted proteomics, and more generally provides a methodology to assess the performance of spectral comparisons and to support the rational selection of the most appropriate similarity measure. The algorithms used in this study are made publicly available as an open source toolset with a graphical user interface.
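As a rough illustration of the similarity measure named in this abstract, the sketch below computes a normalized spectral contrast angle under one commonly used definition: 1 - 2·arccos(cosine similarity)/π, applied to two intensity vectors defined over the same set of fragment ions. The function name, the input layout, and this specific formula are assumptions here. The published toolset may differ in preprocessing and details.

```python
import math


def normalized_spectral_contrast_angle(intensities_a, intensities_b):
    """Return a similarity in [0, 1] for two non-negative intensity vectors.

    Assumes both vectors list intensities for the same fragment ions in the
    same order. 1.0 means identical relative intensities; 0.0 means the
    vectors are orthogonal (no shared signal).
    """
    if len(intensities_a) != len(intensities_b):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(a * b for a, b in zip(intensities_a, intensities_b))
    norm_a = math.sqrt(sum(a * a for a in intensities_a))
    norm_b = math.sqrt(sum(b * b for b in intensities_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum shares nothing with any other spectrum

    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


if __name__ == "__main__":
    library = [100.0, 40.0, 0.0, 15.0]     # hypothetical reference intensities
    observed = [90.0, 45.0, 5.0, 10.0]     # hypothetical measured intensities
    print(normalized_spectral_contrast_angle(library, observed))
```

Because the angle is taken on normalized vectors, the score depends only on relative fragment intensities, not on absolute signal level.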
Affiliation(s)
- Umut H Toprak
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Ludovic C Gillet
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Alessio Maiolica
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Pedro Navarro
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Alexander Leitner
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland; Faculty of Science, University of Zurich, Zurich, 8093 Zurich, Switzerland
|
24
|
Dong NP, Liang YZ, Xu QS, Mok DKW, Yi LZ, Lu HM, He M, Fan W. Prediction of Peptide Fragment Ion Mass Spectra by Data Mining Techniques. Anal Chem 2014; 86:7446-54. [DOI: 10.1021/ac501094m] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Affiliation(s)
| | | | | | - Daniel K. W. Mok
- Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), Shenzhen, 518000, P. R. China
| | - Lun-zhao Yi
- Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming, 650500, P. R. China
| | | | - Min He
- Department of Pharmaceutical Engineering, School of Chemical Engineering, Xiangtan University, Xiangtan, 411105, P. R. China
| | - Wei Fan
- College of Bioscience and Biotechnology, Hunan Agricultural University, Changsha, 410083, P. R. China
|
25
|
Cannon JR, Kluwe C, Ellington A, Brodbelt JS. Characterization of green fluorescent proteins by 193 nm ultraviolet photodissociation mass spectrometry. Proteomics 2014; 14:1165-73. [PMID: 24596159 PMCID: PMC4071602 DOI: 10.1002/pmic.201300364] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2013] [Revised: 12/07/2013] [Accepted: 01/13/2014] [Indexed: 11/05/2022]
Abstract
We investigate the utility of 193 nm ultraviolet photodissociation (UVPD) in comparison to CID, higher energy CID (HCD), and electron transfer dissociation (ETD) for top down fragmentation of highly homologous green fluorescent proteins (GFP) in the gas phase. Several GFP variants were constructed via mutation of surface residues to charged moieties, demonstrating different pIs and presenting a challenge for identification by mass spectrometry. Presented is a comparison of fragmentation techniques utilized for top down characterization of four variants with varying levels of surface charge. UVPD consistently resulted in identification of more fragment ions relative to other MS/MS methods, allowing higher confidence identification. In addition to the high number of fragment ions, the sites of fragmentation were more evenly spread throughout the protein backbone, which proved key for localizing the point mutations.
Affiliation(s)
- Joe R. Cannon
- Department of Chemistry, University of Texas at Austin, Austin, Texas
| | - Christien Kluwe
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas
| | - Andrew Ellington
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas
|
26
|
Risk BA, Edwards NJ, Giddings MC. A peptide-spectrum scoring system based on ion alignment, intensity, and pair probabilities. J Proteome Res 2013; 12:4240-7. [PMID: 23875887 DOI: 10.1021/pr400286p] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Peppy, the proteogenomic/proteomic search software, employs a novel method for assessing the match quality between an MS/MS spectrum and a theorized peptide sequence. The scoring system uses three score factors calculated with binomial probabilities: the probability that a fragment ion will randomly align with a peptide ion, the probability that the aligning ions will be selected from subsets of the most intense peaks, and the probability that the intensities of fragment ions identified as y-ions are greater than those of their counterpart b-ions. The scores produced by the method act as global confidence scores, which facilitate the accurate comparison of results and the estimation of false discovery rates. Peppy has been integrated into the meta-search engine PepArML to produce meaningful comparisons with Mascot, MSGF+, OMSSA, X!Tandem, k-Score and s-Score. For two of the four data sets examined with the PepArML analysis, Peppy exceeded the accuracy performance of the other scoring systems. Peppy is available for download at http://geneffects.com/peppy.
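To make the idea of a binomial score factor concrete, the sketch below computes the chance probability that at least k of n predicted fragment ions align with observed peaks, given a per-ion random-match probability p, and converts it to a -log10 score. This is a generic illustration of binomial scoring, not Peppy's actual implementation. The function names and the example values of n, k, and p are hypothetical.

```python
import math


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more random alignments."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def alignment_score(matched, predicted, p_random):
    """-log10 of the chance probability; larger values mean a less likely random match."""
    tail = binomial_tail(matched, predicted, p_random)
    # Floor the probability so the logarithm stays finite if the tail underflows.
    return -math.log10(max(tail, 1e-300))


if __name__ == "__main__":
    # Hypothetical case: 20 predicted ions, 5% chance that any one aligns at random.
    for matched in (3, 8, 15):
        print(matched, round(alignment_score(matched, 20, 0.05), 2))
```

Scores of this form grow as more ions match than chance would explain, which is what lets them be compared across spectra as a confidence measure.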
Affiliation(s)
- Brian A Risk
- Department of Biochemistry & Biophysics, UNC School of Medicine, Chapel Hill, North Carolina 27599, United States.
|
27
|
Xiao CL, Chen XZ, Du YL, Li ZF, Wei L, Zhang G, He QY. Dispec: a novel peptide scoring algorithm based on peptide matching discriminability. PLoS One 2013; 8:e62724. [PMID: 23675420 PMCID: PMC3652849 DOI: 10.1371/journal.pone.0062724] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2012] [Accepted: 03/25/2013] [Indexed: 11/20/2022] Open
Abstract
Identifying peptides from the fragmentation spectra is a fundamental step in mass spectrometry (MS) data processing. The significance (discriminability) of every peak varies, providing additional information for potentially enhancing the identification sensitivity and the correct match rate. However, this important information was not considered in previous algorithms. Here we present a novel method based on Peptide Matching Discriminability (PMD), in which the PMD information of every peak reflects the discriminability of candidate peptides. In addition, we developed a novel peptide scoring algorithm, Dispec, based on PMD, by taking three aspects of discriminability into consideration: PMD, intensity discriminability and m/z error discriminability. Compared with Mascot and Sequest, Dispec identified remarkably more peptides from three experimental datasets with the same confidence at 1% PSM-level FDR. Dispec is also robust and versatile for various datasets obtained on different instruments. The concept of discriminability enhances peptide identification and thus may contribute substantially to proteome studies. As an open-source program, Dispec is freely available at http://bioinformatics.jnu.edu.cn/software/dispec/.
Affiliation(s)
- Chuan-Le Xiao
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Xiao-Zhou Chen
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming, China
| | - Yang-Li Du
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming, China
| | - Zhe-Fu Li
- Jinan University Network and Educational Technology Center, Guangzhou, China
| | - Li Wei
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming, China
| | - Gong Zhang
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
| | - Qing-Yu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China
|
28
|
Zhang Y, Fonslow BR, Shan B, Baek MC, Yates JR. Protein analysis by shotgun/bottom-up proteomics. Chem Rev 2013; 113:2343-94. [PMID: 23438204 PMCID: PMC3751594 DOI: 10.1021/cr3003533] [Citation(s) in RCA: 970] [Impact Index Per Article: 88.2] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Affiliation(s)
- Yaoyang Zhang
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Bryan R. Fonslow
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Bing Shan
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Moon-Chang Baek
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Department of Molecular Medicine, Cell and Matrix Biology Research Institute, School of Medicine, Kyungpook National University, Daegu 700-422, Republic of Korea
| | - John R. Yates
- Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
|
29
|
Wang D, Dasari S, Chambers MC, Holman JD, Chen K, Liebler DC, Orton DJ, Purvine SO, Monroe ME, Chung CY, Rose KL, Tabb DL. Basophile: accurate fragment charge state prediction improves peptide identification rates. GENOMICS PROTEOMICS & BIOINFORMATICS 2013; 11:86-95. [PMID: 23499924 PMCID: PMC3737598 DOI: 10.1016/j.gpb.2012.11.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/12/2012] [Revised: 11/03/2012] [Accepted: 11/22/2012] [Indexed: 01/14/2023]
Abstract
In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
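The "Naive model" described in this abstract can be sketched in a few lines: cleave every peptide bond and emit b and y fragments at every charge state below the precursor charge. The code below illustrates only that baseline, not Basophile's ordinal regression. The function name is hypothetical, the residue masses are standard monoisotopic values, and treating a singly charged precursor as yielding 1+ fragments is an assumption.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def naive_fragments(peptide, precursor_charge):
    """List (ion label, charge, m/z) for b and y ions at every bond and every charge
    below the precursor charge, with no attempt to predict charge retention."""
    max_charge = max(1, precursor_charge - 1)  # assumption: 1+ precursors give 1+ fragments
    fragments = []
    for i in range(1, len(peptide)):
        b_mass = sum(RESIDUE_MASS[aa] for aa in peptide[:i])
        y_mass = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER
        for z in range(1, max_charge + 1):
            fragments.append((f"b{i}", z, (b_mass + z * PROTON) / z))
            fragments.append((f"y{len(peptide) - i}", z, (y_mass + z * PROTON) / z))
    return fragments


if __name__ == "__main__":
    for label, charge, mz in naive_fragments("PEPTIDEK", 3):
        print(f"{label}({charge}+): {mz:.4f}")
```

The fragment count grows linearly with precursor charge, which shows why highly charged precursors produce many predicted ions that never appear in the spectrum, the problem the abstract says charge-state prediction reduces.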
Affiliation(s)
- Dong Wang
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
|
30
|
Gucinski AC, Chamot-Rooke J, Steinmetz V, Somogyi Á, Wysocki VH. Influence of N-terminal residue composition on the structure of proline-containing b2+ ions. J Phys Chem A 2013; 117:1291-8. [PMID: 23312013 PMCID: PMC3641857 DOI: 10.1021/jp306759f] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
To probe the structural implications of the proline residue on its characteristic peptide fragmentation patterns, in particular its unusual cleavage at its C-terminus in formation of a b(2) ion in XxxProZzz sequences, the structures of a series of proline-containing b(2)(+) ions were studied by using action infrared multiphoton dissociation (IRMPD) spectroscopy and fragment ion hydrogen-deuterium exchange (HDX). Five different Xxx-Pro b(2)(+) ions were studied, with glycine, alanine, isoleucine, valine, or histidine in the N-terminal position. The residues selected feature different sizes, chain lengths, and gas phase basicities to explore whether the structure of the N-terminal residue influences the Xxx-Pro b(2)(+) ion structure. In proteins, the proline side chain-to-backbone attachment causes its peptide bonds to be in the cis conformation more than any other amino acid, although trans is still favored over cis. However, HP is the only b(2)(+) ion studied here that forms the diketopiperazine exclusively. The GP, AP, IP, and VP b(2)(+) ions formed from protonated tripeptide precursors predominantly featured oxazolone structures with small diketopiperazine contributions. In contrast to the b(2)(+) ions generated from tripeptides, synthetic cyclic dipeptides VP and HP were confirmed to have exclusive diketopiperazine structures.
Affiliation(s)
- Ashley C Gucinski
- Department of Chemistry and Biochemistry, The University of Arizona, 1306 East University Boulevard, P.O. Box 210041, Tucson, Arizona 85721-0041, USA.
|
31
|
Ji C, Arnold RJ, Sokoloski KJ, Hardy RW, Tang H, Radivojac P. Extending the coverage of spectral libraries: a neighbor-based approach to predicting intensities of peptide fragmentation spectra. Proteomics 2013; 13:756-65. [PMID: 23303707 DOI: 10.1002/pmic.201100670] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/27/2011] [Revised: 10/19/2012] [Accepted: 11/11/2012] [Indexed: 01/10/2023]
Abstract
Searching spectral libraries in MS/MS is an important new approach to improving the quality of peptide and protein identification. The idea relies on the observation that ion intensities in an MS/MS spectrum of a given peptide are generally reproducible across experiments, and thus, matching between spectra from an experiment and the spectra of previously identified peptides stored in a spectral library can lead to better peptide identification compared to the traditional database search. However, the use of libraries is greatly limited by their coverage of peptide sequences: even for well-studied organisms a large fraction of peptides have not been previously identified. To address this issue, we propose to expand spectral libraries by predicting the MS/MS spectra of peptides based on the spectra of peptides with similar sequences. We first demonstrate that the intensity patterns of dominant fragment ions between similar peptides tend to be similar. In accordance with this observation, we develop a neighbor-based approach that first selects peptides that are likely to have spectra similar to the target peptide and then combines their spectra using a weighted K-nearest neighbor method to accurately predict fragment ion intensities corresponding to the target peptide. This approach has the potential to predict spectra for every peptide in the proteome. When rigorous quality criteria are applied, we estimate that the method increases the coverage of spectral libraries available from the National Institute of Standards and Technology by 20-60%, although the values vary with peptide length and charge state. We find that the overall best search performance is achieved when spectral libraries are supplemented by the high quality predicted spectra.
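The weighted K-nearest neighbor step mentioned in this abstract can be sketched as a similarity-weighted average of neighbor spectra. This is only a minimal sketch: the function name `predict_intensities`, the use of precomputed similarity scores as weights, and the assumption that all spectra are already aligned to the same fragment-ion positions are illustrative choices, not details confirmed by the abstract.

```python
def predict_intensities(neighbors, k=3):
    """Combine neighbor spectra into one predicted intensity vector.

    neighbors: list of (similarity, intensities) pairs, where every
    intensities list is aligned to the same fragment-ion positions.
    """
    # Keep the k most similar library peptides.
    top = sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total <= 0:
        raise ValueError("similarities must sum to a positive value")
    length = len(top[0][1])
    # Weighted mean of each fragment-ion intensity across the neighbors.
    return [sum(sim * vec[j] for sim, vec in top) / total for j in range(length)]
```

With `k=2`, a neighbor of similarity 0.75 contributes three times the weight of a neighbor of similarity 0.25, and any less similar peptides are ignored.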
Affiliation(s)
- Chao Ji
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
|
32
|
Grover H, Wallstrom G, Wu CC, Gopalakrishnan V. Context-sensitive Markov models for peptide scoring and identification from tandem mass spectrometry. OMICS: A Journal of Integrative Biology 2013; 17:94-105. [PMID: 23289783 PMCID: PMC3567622 DOI: 10.1089/omi.2012.0073] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
Peptide and protein identification via tandem mass spectrometry (MS/MS) lies at the heart of proteomic characterization of biological samples. Several algorithms are able to search, score, and assign peptides to large MS/MS datasets. Most popular methods, however, underutilize the intensity information available in the tandem mass spectrum due to the complex nature of the peptide fragmentation process, thus contributing to loss of potential identifications. We present a novel probabilistic scoring algorithm called Context-Sensitive Peptide Identification (CSPI) based on highly flexible Input-Output Hidden Markov Models (IO-HMM) that capture the influence of peptide physicochemical properties on their observed MS/MS spectra. We use several local and global properties of peptides and their fragment ions from the literature. Comparison with two popular algorithms, Crux (re-implementation of SEQUEST) and X!Tandem, on multiple datasets of varying complexity, shows that peptide identification scores from our models are able to achieve greater discrimination between true and false peptides, identifying up to ∼25% more peptides at a False Discovery Rate (FDR) of 1%. We evaluated two alternative normalization schemes for fragment ion intensities, a global rank-based scheme and a local window-based scheme. Our results indicate the importance of appropriate normalization methods for learning superior models. Further, combining our scores with Crux using a state-of-the-art procedure, Percolator, we demonstrate the utility of using scoring features from intensity-based models, yielding ∼4-8% additional identifications over Percolator at 1% FDR. IO-HMMs offer a scalable and flexible framework with several modeling choices to learn complex patterns embedded in MS/MS data.
Affiliation(s)
- Himanshu Grover
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
- Garrick Wallstrom
- Department of Biomedical Informatics, Arizona State University, Scottsdale, Arizona
- Christine C. Wu
- Department of Cell Biology and Physiology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Vanathi Gopalakrishnan
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
|
33
|
Abstract
Peptides and proteins are routinely identified from peptide fragmentation spectra acquired in a mass spectrometer, analyzed by database search engines. The types of fragments that can be formed are known, and it is also well appreciated that certain fragment types are more common or more informative than others. However, most search engines do not use detailed knowledge of peptide fragmentation, but rather consider a limited range of fragments, giving each an equivalent weighting in their scoring system that decides which results are likely to be correct. This chapter discusses efforts to make use of information about the frequency of observation of different fragment ion types in order to produce more sophisticated and sensitive scoring systems and demonstrates how these new scoring systems are particularly powerful for analysis of electron capture or electron transfer dissociation data.
|
34
|
Xiao CL, Chen XZ, Du YL, Sun X, Zhang G, He QY. Binomial Probability Distribution Model-Based Protein Identification Algorithm for Tandem Mass Spectrometry Utilizing Peak Intensity Information. J Proteome Res 2012; 12:328-35. [DOI: 10.1021/pr300781t] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Affiliation(s)
- Chuan-Le Xiao
- Institute of Life and Health Engineering, Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Jinan University, Guangzhou 510632, China
- Xiao-Zhou Chen
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming 650031, China
- Yang-Li Du
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming 650031, China
- Xuesong Sun
- Institute of Life and Health Engineering, Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Jinan University, Guangzhou 510632, China
- Gong Zhang
- Institute of Life and Health Engineering, Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Jinan University, Guangzhou 510632, China
- Qing-Yu He
- Institute of Life and Health Engineering, Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Jinan University, Guangzhou 510632, China
|
35
|
Abstract
Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high throughput nature, shotgun proteomics faces challenges with respect to the analysis and interpretation of experimental data. Among such challenges, the identification of proteins present in a sample has been recognized as an important computational task. This task generally consists of (1) assigning experimental tandem mass spectra to peptides derived from a protein database, and (2) mapping assigned peptides to proteins and quantifying the confidence of identified proteins. Protein identification is fundamentally a statistical inference problem with a number of methods proposed to address its challenges. In this review we categorize current approaches into rule-based, combinatorial optimization and probabilistic inference techniques, and present them using integer programming and Bayesian inference frameworks. We also discuss the main challenges of protein identification and propose potential solutions with the goal of spurring innovative research in this area.
Affiliation(s)
- Yong Fuga Li
- School of Informatics and Computing, Indiana University, 150 S. Woodlawn Avenue, Bloomington, Indiana 47405, USA
|
36
|
Li W, Wysocki VH. ETD fragmentation features improve algorithm. Expert Rev Proteomics 2012; 9:241-3. [PMID: 22809203 DOI: 10.1586/epr.12.23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Electron transfer dissociation (ETD) is an alternative technique used in mass spectrometry-based proteomics experiments. Because it is newer, most of the protein identification algorithms for ETD are still a simple derivation of well-established collision-activated dissociation algorithms without the consideration of many unique ETD spectral features. Sridhara and coworkers recently reported removing the charge-reduced precursors and corresponding neutral loss peaks to improve ETD peptide identification with the Open Mass Spectrometry Search Algorithm (OMSSA). These peaks were also used to deduce the charge of the precursors for low resolution data. The scheme is a concrete example of implementing known ETD fragmentation features to improve a protein identification algorithm.
Affiliation(s)
- Wenzhou Li
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, USA
|
37
|
Li W, O'Neill HA, Wysocki VH. SQID-XLink: implementation of an intensity-incorporated algorithm for cross-linked peptide identification. Bioinformatics 2012; 28:2548-50. [PMID: 22796956 DOI: 10.1093/bioinformatics/bts442] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
SUMMARY: The peptide identification algorithm is a major bottleneck for mass spectrometry-based chemical cross-linking experiments. Our lab recently developed an intensity-incorporated peptide identification algorithm, and here we implement this scheme for cross-linked peptide discovery. Our program, SQID-XLink, searches all regular, dead-end, intra- and inter-cross-linked peptides simultaneously, and its effectiveness is validated by testing on a published dataset. This new algorithm provides an alternative approach for high-confidence cross-linking identification. AVAILABILITY: The SQID-XLink program is freely available for download from http://quiz2.chem.arizona.edu/wysocki/bioinformatics.htm SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: vwysocki@email.arizona.edu.
Affiliation(s)
- Wenzhou Li
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721, USA
|
38
|
Morrison L, Somogyi Á, Wysocki VH. The influence glutamic acid in protonated b3 → b2 formation from VGEIG and related analogs. International Journal of Mass Spectrometry 2012; 325-327:139-149. [PMID: 23667319 PMCID: PMC3647700 DOI: 10.1016/j.ijms.2012.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
A direct pathway for the fragmentation of peptide b3 fragment ions to b2 ions has, until now, not been identified. Experimental evidence for the formation of a b3 anhydride structure and isomerization to an extended macrocycle is demonstrated here by comparison of the completely different fragmentation patterns of the b3 ions generated from protonated VGEIG and its methyl ester. In particular, the absence of a b2 ion in the fragmentation spectrum of the methyl ester b3 indicates that facile fragmentation of an anhydride-type b3 is responsible for virtually all b2 ions formed. The stability of this b3 structure and the ease with which it fragments to the b2 may be responsible for the relatively high abundance of the b3 and b2 ions. IRMPD action spectroscopy measurements indicate the presence of a ring protonated oxazolone in the b2 population. VGEIG and three related analogs, VALEIG, VADEIG, and V(Aib)EIG were studied by QCID-HDX-SORI experiments in an FT-ICR instrument, and provide significant evidence for extensive alpha proton scrambling in an ion-molecule complex formed between the b2 and neutral loss fragment following formation of the b2. MS3 and HDX of VG(2,2-d2)EIG indicate that the scrambled b2 ions have the same structure as the unscrambled b2. Based on these data and with the support of molecular modeling, we propose a new mechanism for this scrambling, in which the alpha protons are transferred in a multistep pathway during an ion-molecule complex formed between the b2 and amino-terminated anhydride ring neutral loss component.
Affiliation(s)
- Lindsay Morrison
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States
- Árpád Somogyi
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States
- Vicki H. Wysocki
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States
|
39
|
Gucinski AC, Chamot-Rooke J, Nicol E, Somogyi Á, Wysocki VH. Structural influences on preferential oxazolone versus diketopiperazine b(2+) ion formation for histidine analogue-containing peptides. J Phys Chem A 2012; 116:4296-304. [PMID: 22448972 PMCID: PMC3523341 DOI: 10.1021/jp300262d] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Studies of peptide fragment ion structures are important to aid in the accurate kinetic modeling and prediction of peptide fragmentation pathways for a given sequence. Peptide b(2)(+) ion structures have been of recent interest. While previously studied b(2)(+) ions that contain only aliphatic or simple aromatic residues are oxazolone structures, the HA b(2)(+) ion consists of both oxazolone and diketopiperazine structures. The structures of a series of histidine-analogue-containing Xxx-Ala b(2)(+) ions were studied by using action infrared multiphoton dissociation (IRMPD) spectroscopy, fragment ion hydrogen-deuterium exchange (HDX), and density functional theory (DFT) calculations to systematically probe the influence of different side chain structural elements on the resulting b(2)(+) ion structures formed. The b(2)(+) ions studied include His-Ala (HA), methylated histidine analogues, including π-methyl-HA and τ-methyl-HA, pyridylalanine (pa) analogues, including 2-(pa)A, 3-(pa)A, and 4-(pa)A, and linear analogues, including diaminobutanoic acid-Ala (DabA) and Lys-Ala (KA). The location and accessibility of the histidine π-nitrogen, or an amino nitrogen on an aliphatic side chain, were seen to be essential for diketopiperazine formation in addition to the more typical oxazolone structure formation, while blocking or removal of the τ-nitrogen did not change the b(2)(+) ion structures formed. Linear histidine analogues, DabA and KA, formed only diketopiperazine structures, suggesting that a steric interaction in the HisAla case may interfere with the complete trans-cis isomerization of the first amide bond that is necessary for diketopiperazine formation.
Affiliation(s)
- Ashley C Gucinski
- Department of Chemistry and Biochemistry, The University of Arizona, 1306 East University Boulevard, Tucson, Arizona 85721, USA.
|
40
|
Obolensky OI, Wu WW, Shen RF, Yu YK. Using dissociation energies to predict observability of b- and y-peaks in mass spectra of short peptides. Rapid Communications in Mass Spectrometry: RCM 2012; 26:915-20. [PMID: 22396027 PMCID: PMC3468955 DOI: 10.1002/rcm.6180] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
RATIONALE: Peptide identification reliability can be improved by excluding from analysis those m/z peaks of candidate peptides which cannot be observed in practice due to various physical, chemical or thermodynamic considerations. We propose using dissociation energies (as opposed to proton affinities) as a predictor of observability of different m/z peaks in spectra of short peptides. METHODS: Mass spectra of the tetrapeptides AAAA, AAFA, AAVA, AFAA, AVAA, AFFA, and AVVA were measured in the collision-induced dissociation (CID) activation mode on a grid of activation times from 0.05 to 100 ms and normalized collision energies from 10 to 35%. The lowest energy geometries and vibrational spectra were calculated for the precursor ions and their charged and neutral fragments using density functional theory (DFT) at the TPSS/6-31G(d,p) level. Dissociation energies were calculated for all fragmentation channels leading to b- or y-fragments. RESULTS: It is demonstrated that m/z peaks observed in the mass spectra correspond to the fragmentation channels with the lowest dissociation energies. Using 50 kcal/mol as the cut-off value of dissociation energy, it was predicted that 28 out of 42 possible peaks in the b- and y-series of the seven tetrapeptides can be observed in mass spectra. In the experiments, 26 b- or y-peaks were observed, all of which are among the 28 predicted ones. CONCLUSIONS: The use of dissociation energies generalizes the use of proton affinities for semi-quantitative predictions of relative intensities of different m/z peaks of short peptides. Further advances in this direction will pave the way for reliable quantitative predictions and, hence, for a significant improvement in robustness and accuracy of peptide and protein identification tools.
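The cut-off rule in this abstract amounts to a simple threshold filter over fragmentation channels, sketched below. The energies in `example_energies` are hypothetical placeholders, not values from the paper, and treating the 50 kcal/mol cut-off as inclusive is also an assumption. Each tetrapeptide has three backbone bonds and therefore six b/y channels, which is consistent with the 42 possible peaks quoted for seven peptides.

```python
CUTOFF_KCAL_PER_MOL = 50.0  # cut-off value quoted in the abstract


def predicted_observable(channel_energies, cutoff=CUTOFF_KCAL_PER_MOL):
    """Return the fragment labels whose dissociation energy is within the cut-off."""
    return sorted(ion for ion, energy in channel_energies.items() if energy <= cutoff)


# Hypothetical dissociation energies (kcal/mol) for one tetrapeptide's
# six b/y channels; illustrative numbers only.
example_energies = {"b1": 72.3, "b2": 38.0, "b3": 44.5, "y1": 61.2, "y2": 47.9, "y3": 55.0}

observable = predicted_observable(example_energies)

# Figures reported in the abstract: 28 peaks predicted, 26 observed,
# and every observed peak was among the predicted ones.
fraction_of_predicted_observed = 26 / 28
```

On the reported figures, about 93% of the predicted peaks were seen experimentally and no observed peak fell outside the predicted set.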
Affiliation(s)
- O I Obolensky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
41
|
Bernier MC, Paizs B, Wysocki VH. Influence of a Gamma Amino Acid on the Structures and Reactivity of Peptide a(3) Ions. International Journal of Mass Spectrometry 2012; 316-318:259-267. [PMID: 23258959 PMCID: PMC3523335 DOI: 10.1016/j.ijms.2012.02.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Collision-induced dissociation of protonated AGabaAIG (where Gaba is gamma-amino butyric acid, NH(2)-(CH(2))(3)-COOH) leads to an unusually stable a(3) ion. Tandem mass spectrometry and theory are used here to probe the enhanced stability of this fragment, whose counterpart is not usually observed in CID of protonated peptides containing only alpha amino acids. Experiments are carried out on the unlabeled and (15)N-Ala labeled AGabaAIG (labeled separately at residue one or three) probing the b(3), a(3), a(3)-NH(3) (a(3)*), and b(2) fragments while theory is used to characterize the most stable b(3), a(3), and b(2) structures and the formation and dissociation of the a(3) ion. Our results indicate the AGabaA oxazolone b(3) isomer undergoes head-to-tail macrocyclization and subsequent ring opening to form the GabaAA sequence isomer while this chemistry is energetically disfavored for the AAA sequence. The AGabaA a(3) fragment also undergoes macrocyclization and rearrangement to form the rearranged imine-amide isomer while this reaction is energetically disfavored for the AAA sequence. The barriers to dissociation of the AGabaA a(3) ion via the a(3)→b(2) and a(3)→a(3)* channels are higher than the literature values reported for the AAA sequence. These two effects provide a clear explanation for the enhanced stability of the AGabaA a(3) ion.
Affiliation(s)
- Matthew C. Bernier
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721
- Bela Paizs
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721
- Computational Proteomics Group, German Cancer Research Center, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
- Vicki H. Wysocki
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721
|
42
|
Kilpatrick LE, Neta P, Yang X, Simón-Manso Y, Liang Y, Stein SE. Formation of y + 10 and y + 11 ions in the collision-induced dissociation of peptide ions. Journal of the American Society for Mass Spectrometry 2012; 23:655-663. [PMID: 22161574 DOI: 10.1007/s13361-011-0277-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/24/2011] [Revised: 10/07/2011] [Accepted: 10/12/2011] [Indexed: 05/31/2023]
Abstract
Tandem mass spectra of peptide ions, acquired in shotgun proteomic studies of selected proteins, tissues, and organisms, commonly include prominent peaks that cannot be assigned to the known fragmentation product ions (y, b, a, neutral losses). In many cases these persist even when creating consensus spectra for inclusion in spectral libraries, where it is important to determine whether these peaks represent new fragmentation paths or arise from impurities. Using spectra from libraries and synthesized peptides, we investigate a class of fragment ions corresponding to y(n-1) + 10 and y(n-1) + 11, where n is the number of amino acid residues in the peptide. These 10 and 11 Da differences in mass of the y ion were ascribed before to the masses of [+ CO - H(2)O] and [+ CO - NH(3)], respectively. The mechanism is suggested to involve dissociation of the N-terminal residue at the CH-CO bond following loss of H(2)O or NH(3). MS(3) spectra of these ions show that the location of the additional 10 or 11 Da is at the N-terminal residue. The y(n-1) + 10 ion is most often found in peptides with N-terminal proline, asparagine, and histidine, and also with serine and threonine in the adjacent position. The y(n-1) + 11 ion is observed predominantly with histidine and asparagine at the N-terminus, but also occurs with asparagine in positions two through four. The intensities of the y(n-1) + 10 ions decrease with increasing peptide length. These data for y(n-1) + 10 and y(n-1) + 11 ion formation may be used to improve peptide identification from tandem mass spectra.
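The nominal 10 and 11 Da shifts quoted in this abstract follow from the stated compositions [+ CO - H(2)O] and [+ CO - NH(3)]. The short calculation below works them out from standard monoisotopic atomic masses; the helper name `mass` and the constant names are illustrative only.

```python
# Standard monoisotopic atomic masses (Da).
MONO = {"H": 1.007825, "C": 12.000000, "N": 14.003074, "O": 15.994915}


def mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


CO = mass({"C": 1, "O": 1})    # 27.994915
H2O = mass({"H": 2, "O": 1})   # 18.010565
NH3 = mass({"N": 1, "H": 3})   # 17.026549

SHIFT_PLUS_10 = CO - H2O       # about 9.9844 Da, nominally +10
SHIFT_PLUS_11 = CO - NH3       # about 10.9684 Da, nominally +11
```

The exact shifts differ from the nominal integers by roughly 0.016 and 0.032 Da respectively, which matters when matching these ions in high-resolution spectra.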
Affiliation(s)
- Lisa E Kilpatrick
- Chemical and Biochemical Reference Data Division, National Institute of Standards and Technology, 100 Bureau Drive, MS8320, Gaithersburg, MD 20899, USA.
|
43
|
Bauer C, Kleinjung F, Rutishauser D, Panse C, Chadt A, Dreja T, Al-Hasani H, Reinert K, Schlapbach R, Schuchhardt J. PPINGUIN: Peptide Profiling Guided Identification of Proteins improves quantitation of iTRAQ ratios. BMC Bioinformatics 2012; 13:34. [PMID: 22340093 PMCID: PMC3368728 DOI: 10.1186/1471-2105-13-34] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/02/2011] [Accepted: 02/16/2012] [Indexed: 01/07/2023] [Open access]
Abstract
Background: Recent development of novel technologies paved the way for quantitative proteomics. One of the most important among them is iTRAQ, employing isobaric tags for relative or absolute quantitation. Despite large progress in technology development, many challenges remain for derivation and interpretation of quantitative results. One of these challenges is the consistent assignment of peptides to proteins. Results: We have developed Peptide Profiling Guided Identification of Proteins (PPINGUIN), a statistical analysis workflow for iTRAQ data addressing the problem of ambiguous peptide quantitations. Motivated by the assumption that peptides uniquely derived from the same protein are correlated, our method employs clustering as a very early step in data processing prior to protein inference. Our method increases experimental reproducibility and decreases variability of quantitations of peptides assigned to the same protein. Giving further support to our method, application to a type 2 diabetes dataset identifies a list of protein candidates that is in very good agreement with a previously performed transcriptomics meta-analysis. Making use of quantitative properties of the signal patterns identified, PPINGUIN can reveal new isoform candidates. Conclusions: Given the increasing importance of quantitative proteomics, we think that this method will be useful in practical applications like model fitting or functional enrichment analysis. We recommend using this method if quantitation is a major objective of research.
Affiliation(s)
- Chris Bauer
- MicroDiscovery GmbH, Marienburger Str. 1, 10405 Berlin, Germany.
|
44
|
Li W, Song C, Bailey DJ, Tseng GC, Coon JJ, Wysocki VH. Statistical analysis of electron transfer dissociation pairwise fragmentation patterns. Anal Chem 2011; 83:9540-5. [PMID: 22022956 DOI: 10.1021/ac202327r] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
Electron transfer dissociation (ETD) is an alternative peptide dissociation method developed in recent years. Compared with the traditional collision induced dissociation (CID) b and y ion formation, ETD generates c and z ions and the backbone cleavage is believed to be less selective. We have reported previously the application of a statistical data mining strategy, K-means clustering, to discover fragmentation patterns for CID, and here we report application of this approach to ETD spectra. We use ETD data sets from digestions with three different proteases. Data analysis shows that selective cleavages do exist for ETD, with the fragmentation patterns affected by protease, charge states, and amino acid residue compositions. It is also noticed that the c(n-1) ion, corresponding to loss of the C-terminal amino acid residue, is statistically strong regardless of the residue at the C-terminus of the peptide, which suggests that the peptide gas phase conformation plays an important role in the dissociation pathways. These patterns provide a basis for mechanism elucidation, spectral prediction, and improvement of ETD peptide identification algorithms.
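The K-means clustering strategy named in this abstract can be sketched in a few lines of plain Python. This is a generic Lloyd-style K-means, not the authors' pipeline: representing each spectrum as a vector of normalized cleavage intensities, seeding the centroids with the first `k` points, and running a fixed number of iterations are all assumptions made to keep the example small and deterministic.

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric vectors; returns (assignment, centroids)."""
    # Deterministic initialization from the first k points (an assumption;
    # real analyses typically use random restarts or k-means++ seeding).
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignment[idx] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical two-feature patterns, e.g. relative intensity at two cleavage sites.
patterns = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
labels, centers = kmeans(patterns, k=2)
```

Each resulting centroid can be read as a representative fragmentation pattern, and cluster membership can then be compared against protease, charge state, or residue composition.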
Affiliation(s)
- Wenzhou Li
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721, United States
|