1
Cautereels J, Van Hee N, Chatterjee S, Van Alsenoy C, Lemière F, Blockhuys F. QCMS2 as a new method for providing insight into peptide fragmentation: The influence of the side-chain and inter-side-chain interactions. J Mass Spectrom 2020; 55:e4446. [PMID: 31652378 DOI: 10.1002/jms.4446] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/27/2019] [Revised: 09/12/2019] [Accepted: 09/21/2019] [Indexed: 06/10/2023]
Abstract
The identification of peptides and proteins from tandem mass spectra is a difficult task and multiple tools have been developed to aid this identification. We present a new method called quantum chemical mass spectrometry for materials science (QCMS2), which is based on quantum chemical calculations of bond orders, reaction, and transition-state energies at the DFT/B3LYP/6-311+G* level of theory. The method was used to describe the fragmentation pathways of five X-His-Ser tripeptides with X = Asn, Asp, Glu, Ser, and Trp, thereby focusing on the influence of the side chain and inter-side-chain interactions on the fragmentation. The main features in the mass spectra of the five tripeptides were correctly reproduced, and a number of fragments were assigned to fragmentations involving the side chain and the influence of inter-side-chain interactions. Product ion spectra were recorded to evaluate the capabilities and limitations of QCMS2 and a number of conventional tools.
Affiliation(s)
- Julie Cautereels
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Nils Van Hee
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Sneha Chatterjee
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Filip Lemière
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Frank Blockhuys
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
2
Cautereels J, Giribaldi J, Enjalbal C, Blockhuys F. Quantum chemical mass spectrometry: Ab initio study of b2-ion formation mechanisms for the singly protonated Gln-His-Ser tripeptide. Rapid Commun Mass Spectrom 2020; 34:e8778. [PMID: 32144813 DOI: 10.1002/rcm.8778] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/08/2019] [Revised: 02/28/2020] [Accepted: 03/05/2020] [Indexed: 06/10/2023]
Abstract
RATIONALE: Both amide bond protonation triggering peptide fragmentations and the controversial b2-ion structures have been subjects of intense research. The involvement of histidine (H), with its imidazole side chain that induces specific dissociation patterns involving inter-side-chain (ISC) interactions, in b2-ion formation was investigated, focusing on the QHS model tripeptide.
METHODS: To identify the effect of histidine on fragmentations issued from ISC interactions, QHS was selected for a comprehensive analysis of the pathways leading to the three possible b2-ion structures, using quantum chemical calculations performed at the DFT/B3LYP/6-311+G* level of theory. Electrospray ionization ion trap mass spectrometry allowed the recording of MS2 and MS3 tandem mass spectra, whereas the Quantum Chemical Mass Spectrometry for Materials Science (QCMS2) method was used to predict fragmentation patterns.
RESULTS: Whereas it is very difficult to differentiate among protonated oxazolone, diketopiperazine, or lactam b2-ions using MS2 and MS3 mass spectra, the calculations indicated that the QH b2-ion (detected at m/z 266) is probably a mixture of the lactam and oxazolone structures formed after amide nitrogen protonation, making the formation of diketopiperazine less likely as it requires an additional step for its formation.
CONCLUSIONS: In contrast to glycine-histidine-containing b2-ions, known to be issued from the backbone-imidazole cyclization, we found that interactions between the side chains were not obvious to perceive, neither from a thermodynamics nor from a fragmentation perspective, emphasizing the importance of the whole sequence on the dissociation behavior usually demonstrated from simple glycine-containing tripeptides.
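The m/z 266 quoted for the QH b2-ion can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the QCMS2 method: it assumes standard monoisotopic residue masses (values not given in the abstract) and the usual convention that a singly charged b-ion has the mass of its residues plus one proton. The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard reference values, assumed here;
# they are not stated in the abstract).
RESIDUE_MASS = {
    "Q": 128.05858,  # glutamine
    "H": 137.05891,  # histidine
    "S": 87.03203,   # serine
}
PROTON = 1.00728   # mass of a proton in Da
WATER = 18.01056   # mass of H2O in Da


def b_ion_mz(prefix: str) -> float:
    """m/z of a singly charged b-ion: sum of residue masses plus one proton."""
    return sum(RESIDUE_MASS[aa] for aa in prefix) + PROTON


def precursor_mz(peptide: str) -> float:
    """m/z of the singly protonated intact peptide [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


print(f"b2 (QH):    {b_ion_mz('QH'):.4f}")      # nominal m/z 266
print(f"[M+H]+ QHS: {precursor_mz('QHS'):.4f}")
```

Because the calculation depends only on the residues present, it gives a single value for the b2-ion whatever its structure, which is consistent with the abstract's remark that the candidate structures are hard to tell apart from MS2 and MS3 spectra.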
Affiliation(s)
- Julie Cautereels
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Frank Blockhuys
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
3
A classification of liquid chromatography mass spectrometry techniques for evaluation of chemical composition and quality control of traditional medicines. J Chromatogr A 2019; 1609:460501. [PMID: 31515074 DOI: 10.1016/j.chroma.2019.460501] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 05/25/2019] [Revised: 08/06/2019] [Accepted: 08/29/2019] [Indexed: 12/25/2022]
Abstract
Natural products (NPs) and traditional medicines (TMs) are used for treatment of various diseases and also to develop new drugs. However, identification of drug leads within the immense biodiversity of living organisms is a challenging task that requires considerable time, labor, and computational resources as well as the application of modern analytical instruments. LC-MS platforms are widely used for both drug discovery and quality control of TMs and food supplements. Moreover, a large dataset generated during LC-MS analysis contains valuable information that could be extracted and handled by means of various data mining and statistical tools. Novel sophisticated LC-MS based approaches are being introduced every year. Therefore, this review is prepared for the scientists specialized in pharmacognosy and analytical chemistry of NPs as well as working in related areas, in order to navigate them in the world of diverse LC-MS based techniques and strategies currently employed for NP discovery and dereplication, quality control, pattern recognition and sample comparison, and also in targeted and untargeted metabolomic studies. The suggested classification system includes the following LC-MS based procedures: elemental composition determination, isotopic fine structure analysis, mass defect filtering, de novo identification, clustering of the compounds in Molecular Networking (MN), diagnostic fragment ion (or neutral loss) filtering, manual dereplication using MS/MS data, database-assisted peak annotation, annotation of spectral trees, MS fingerprinting, feature extraction, bucketing of LC-MS data, peak profiling, predicted metabolite screening, targeted quantification of biomarkers, quantitative analysis of multi-component system, construction of chemical fingerprints, multi-targeted and untargeted metabolite profiling.
4
Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. 
Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.
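The accuracy levels mentioned in the abstract (full peptide versus single residue) can be illustrated with a small scoring routine. This is a simplified sketch, not the benchmark's actual procedure: it assumes predictions and reference sequences are paired and of equal length, compares residues position by position, and treats isoleucine and leucine as identical because they have the same mass. All names and example sequences are hypothetical.

```python
def normalize(seq: str) -> str:
    """Map I to L, since the two residues are isobaric."""
    return seq.replace("I", "L")


def peptide_accuracy(predicted: list, truth: list) -> float:
    """Fraction of spectra whose full sequence was predicted correctly."""
    hits = sum(normalize(p) == normalize(t) for p, t in zip(predicted, truth))
    return hits / len(truth)


def residue_accuracy(predicted: list, truth: list) -> float:
    """Fraction of residues predicted correctly, compared position by position."""
    correct = 0
    total = 0
    for p, t in zip(predicted, truth):
        p, t = normalize(p), normalize(t)
        total += len(t)
        correct += sum(a == b for a, b in zip(p, t))
    return correct / total


# Hypothetical reference sequences and de novo predictions.
truth = ["PEPTIDE", "LAGER", "SAMPLE"]
predicted = ["PEPTLDE", "LGAER", "SAMPLR"]

print(f"full-peptide accuracy: {peptide_accuracy(predicted, truth):.3f}")
print(f"single-residue accuracy: {residue_accuracy(predicted, truth):.3f}")
```

In this toy data only one of three peptides is fully correct while most individual residues are right, which mirrors the gap the abstract describes between complete-sequence and residue-level performance.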
Affiliation(s)
- Thilo Muth
  - Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
- Bernhard Y Renard
  - Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
5
Bertile F, Fouillen L, Wasselin T, Maes P, Le Maho Y, Van Dorsselaer A, Raclot T. The Safety Limits Of An Extended Fast: Lessons from a Non-Model Organism. Sci Rep 2016; 6:39008. [PMID: 27991520 PMCID: PMC5171797 DOI: 10.1038/srep39008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/19/2016] [Accepted: 11/16/2016] [Indexed: 02/03/2023] Open
Abstract
While safety of fasting therapy is debated in humans, extended fasting occurs routinely and safely in wild animals. To do so, food deprived animals like breeding penguins anticipate the critical limit of fasting by resuming feeding. To date, however, no molecular indices of the physiological state that links spontaneous refeeding behaviour with fasting limits had been identified. Blood proteomics and physiological data reveal here that fasting-induced body protein depletion is not unsafe “per se”. Indeed, incubating penguins only abandon their chick/egg to refeed when this state is associated with metabolic defects in glucose homeostasis/fatty acid utilization, insulin production and action, and possible renal dysfunctions. Our data illustrate how the field investigation of “exotic” models can be a unique source of information, with possible biomedical interest.
Affiliation(s)
- Fabrice Bertile
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
- Laetitia Fouillen
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
- Thierry Wasselin
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
- Pauline Maes
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
- Yvon Le Maho
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Département Ecologie, Physiologie et Ethologie, 23 rue Becquerel, 67087 Strasbourg, France
- Alain Van Dorsselaer
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
- Thierry Raclot
  - CNRS, UMR7178, 67037 Strasbourg, France
  - Université de Strasbourg, IPHC, Département Ecologie, Physiologie et Ethologie, 23 rue Becquerel, 67087 Strasbourg, France
6
Yılmaz Ş, Victor B, Hulstaert N, Vandermarliere E, Barsnes H, Degroeve S, Gupta S, Sticker A, Gabriël S, Dorny P, Palmblad M, Martens L. A Pipeline for Differential Proteomics in Unsequenced Species. J Proteome Res 2016; 15:1963-70. [DOI: 10.1021/acs.jproteome.6b00140] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/20/2022]
Affiliation(s)
- Şule Yılmaz
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Bjorn Victor
  - Veterinary Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
- Niels Hulstaert
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Elien Vandermarliere
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Harald Barsnes
  - Proteomics Unit (PROBE), Department of Biomedicine, University of Bergen, Jonas Liesvei 91, N-5009 Bergen, Norway
- Sven Degroeve
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Surya Gupta
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Adriaan Sticker
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
  - Department of Applied Mathematics, Computer Science, and Statistics, Ghent University, B-9000 Ghent, Belgium
- Sarah Gabriël
  - Veterinary Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
- Pierre Dorny
  - Veterinary Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
- Lennart Martens
  - Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
7
Butt AQ, McArdle A, Gibson DS, FitzGerald O, Pennington SR. Psoriatic Arthritis Under a Proteomic Spotlight: Application of Novel Technologies to Advance Diagnosis and Management. Curr Rheumatol Rep 2015; 17:35. [DOI: 10.1007/s11926-015-0509-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/20/2022]
8
Carapito C, Burel A, Guterl P, Walter A, Varrier F, Bertile F, Van Dorsselaer A. MSDA, a proteomics software suite for in-depth Mass Spectrometry Data Analysis using grid computing. Proteomics 2014; 14:1014-9. [DOI: 10.1002/pmic.201300415] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 09/22/2013] [Revised: 01/15/2014] [Accepted: 01/15/2014] [Indexed: 12/20/2022]
Affiliation(s)
- Christine Carapito
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Alexandre Burel
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Patrick Guterl
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Alexandre Walter
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Fabrice Varrier
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Fabrice Bertile
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
- Alain Van Dorsselaer
  - Laboratoire de Spectrométrie de Masse BioOrganique; IPHC; Université de Strasbourg; CNRS; UMR7178 Strasbourg France
9
Buts K, Michielssens S, Hertog MLATM, Hayakawa E, Cordewener J, America AHP, Nicolai BM, Carpentier SC. Improving the identification rate of data independent label-free quantitative proteomics experiments on non-model crops: a case study on apple fruit. J Proteomics 2014; 105:31-45. [PMID: 24565695 DOI: 10.1016/j.jprot.2014.02.015] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 12/05/2013] [Revised: 01/23/2014] [Accepted: 02/14/2014] [Indexed: 11/28/2022]
Abstract
Complex peptide extracts from non-model crops are troublesome for proper identification and quantification. To increase the identification rate of label-free DIA experiments of Braeburn apple, a new workflow was developed where a DDA database was constructed and linked to the DIA data. At a first level, parent masses found in DIA were searched in the DDA database based on their mass-to-charge ratio and retention time; at a second level, masses of fragmentation ions were compared for each linked spectrum. Following this workflow, a tenfold increase of peptides was identified from a single DIA run. As proof of principle, the designed workflow was applied to determine the changes during a storage experiment, achieving a two-fold identification increase in the number of significant peptides. The corresponding protein families were divided into nine clusters, representing different time profiles of changes in abundances during storage. Up-regulated protein families already show a glimpse of important pathways affecting aging during long-term storage, such as ethylene synthesis, and responses to abiotic stresses and their influence on the central metabolism.
BIOLOGICAL SIGNIFICANCE: Proteomics research on non-model crops causes additional difficulties in identifying the peptides present in, often complex, samples. This work proposes a new workflow to retrieve more identifications from a set of quantitative data, based on linking DIA and DDA data at two consecutive levels. As proof of principle, a storage experiment on Braeburn apple resulted in twice as many identified storage-related peptides. Important proteins involved in central metabolism and stress are significantly up-regulated after long-term storage. This article is part of a Special Issue entitled: Proteomics of non-model organisms.
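The two-level linking of DIA features to a DDA database described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names, the ppm and retention-time tolerances, the minimum number of shared fragments, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LibraryEntry:
    """One peptide identified in the DDA runs (hypothetical structure)."""
    peptide: str
    precursor_mz: float
    rt_min: float
    fragment_mz: List[float]


@dataclass
class DiaFeature:
    """One precursor observed in a DIA run, with its fragment masses."""
    precursor_mz: float
    rt_min: float
    fragment_mz: List[float]


def ppm_error(observed: float, reference: float) -> float:
    return abs(observed - reference) / reference * 1e6


def count_shared_fragments(observed, reference, tol_ppm):
    """Number of reference fragments that have an observed peak within tolerance."""
    return sum(
        any(ppm_error(o, r) <= tol_ppm for o in observed) for r in reference
    )


def match_feature(feature, library, mz_tol_ppm=10.0, rt_tol_min=0.5,
                  frag_tol_ppm=20.0, min_shared=3) -> Optional[str]:
    # Level 1: candidates with matching precursor m/z and retention time.
    candidates = [
        e for e in library
        if ppm_error(feature.precursor_mz, e.precursor_mz) <= mz_tol_ppm
        and abs(feature.rt_min - e.rt_min) <= rt_tol_min
    ]
    # Level 2: keep the candidate sharing the most fragment masses.
    best_peptide, best_shared = None, 0
    for entry in candidates:
        shared = count_shared_fragments(
            feature.fragment_mz, entry.fragment_mz, frag_tol_ppm)
        if shared >= min_shared and shared > best_shared:
            best_peptide, best_shared = entry.peptide, shared
    return best_peptide


library = [
    LibraryEntry("PEPTIDEK", 500.000, 20.0, [100.0, 200.0, 300.0, 400.0]),
    LibraryEntry("SAMPLER", 500.002, 20.2, [150.0, 250.0, 350.0]),
]
feature = DiaFeature(500.001, 20.1, [100.001, 200.001, 300.002, 999.0])
print(match_feature(feature, library))
```

Both library entries pass the first level here, so the fragment comparison at the second level is what separates them; a feature with no candidate at level 1 returns `None`.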
Affiliation(s)
- Kim Buts
  - BIOSYST-MeBioS, KU Leuven, Belgium
- Servaas Michielssens
  - Quantum Chemistry and Physical Chemistry Section, KU Leuven, Belgium; Computational Biomolecular Dynamics Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Eisuke Hayakawa
  - Research Group of Functional Genomics and Proteomics, KU Leuven, Belgium
- Bart M Nicolai
  - BIOSYST-MeBioS, KU Leuven, Belgium; Flanders Centre of Postharvest Technology, Leuven, Belgium
10
Perez-Riverol Y, Wang R, Hermjakob H, Müller M, Vesada V, Vizcaíno JA. Open source libraries and frameworks for mass spectrometry based proteomics: a developer's perspective. Biochim Biophys Acta 2014; 1844:63-76. [PMID: 23467006 PMCID: PMC3898926 DOI: 10.1016/j.bbapap.2013.02.032] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 10/01/2012] [Revised: 02/05/2013] [Accepted: 02/22/2013] [Indexed: 12/23/2022]
Abstract
Data processing, management and visualization are central and critical components of a state of the art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from various proteomics search engines output files, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries, open-source frameworks and also, we give information on some of the freely available applications which make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Affiliation(s)
- Yasset Perez-Riverol
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Rui Wang
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Henning Hermjakob
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Markus Müller
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU - 1, rue Michel Servet CH-1211 Geneva, Switzerland
- Vladimir Vesada
  - Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Juan Antonio Vizcaíno
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
11
Moradian A, Kalli A, Sweredoski MJ, Hess S. The top-down, middle-down, and bottom-up mass spectrometry approaches for characterization of histone variants and their post-translational modifications. Proteomics 2013; 14:489-97. [DOI: 10.1002/pmic.201300256] [Citation(s) in RCA: 110] [Impact Index Per Article: 10.0] [Received: 06/28/2013] [Revised: 07/30/2013] [Accepted: 08/15/2013] [Indexed: 12/22/2022]
Affiliation(s)
- Annie Moradian
  - Proteome Exploration Laboratory; Beckman Institute; California Institute of Technology; Pasadena CA USA
- Anastasia Kalli
  - Department of Pathology and Laboratory Medicine; Children's Hospital Los Angeles; Los Angeles CA USA
- Michael J. Sweredoski
  - Proteome Exploration Laboratory; Beckman Institute; California Institute of Technology; Pasadena CA USA
- Sonja Hess
  - Proteome Exploration Laboratory; Beckman Institute; California Institute of Technology; Pasadena CA USA
12
Richards AL, Vincent CE, Guthals A, Rose CM, Westphall MS, Bandeira N, Coon JJ. Neutron-encoded signatures enable product ion annotation from tandem mass spectra. Mol Cell Proteomics 2013; 12:3812-23. [PMID: 24043425 DOI: 10.1074/mcp.m113.028951] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/20/2023] Open
Abstract
We report the use of neutron-encoded (NeuCode) stable isotope labeling of amino acids in cell culture for the purpose of C-terminal product ion annotation. Two NeuCode labeling isotopologues of lysine, ¹³C₆¹⁵N₂ and ²H₈, which differ by 36 mDa, were metabolically embedded in a sample proteome, and the resultant labeled proteins were combined, digested, and analyzed via liquid chromatography and mass spectrometry. With MS/MS scan resolving powers of ~50,000 or higher, product ions containing the C terminus (i.e. lysine) appear as a doublet spaced by exactly 36 mDa, whereas N-terminal fragments exist as a single m/z peak. Through theory and experiment, we demonstrate that over 90% of all y-type product ions have detectable doublets. We report on an algorithm that can extract these neutron signatures with high sensitivity and specificity. In other words, of 15,503 y-type product ion peaks, the y-type ion identification algorithm correctly identified 14,552 (93.2%) based on detection of the NeuCode doublet; 6.8% were misclassified (i.e. other ion types that were assigned as y-type products). Searching NeuCode labeled yeast with PepNovo+ resulted in a 34% increase in correct de novo identifications relative to searching through MS/MS only. We use this tool to simplify spectra prior to database searching, to sort unmatched tandem mass spectra for spectral richness, for correlation of co-fragmented ions to their parent precursor, and for de novo sequence identification.
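The idea of flagging y-type ions by their 36 mDa doublet can be sketched as a simple peak-pairing pass. This is an illustrative sketch only, not the published algorithm: the function name, the 2 mDa tolerance, and the example peak list are assumptions, and it presumes one labeled lysine per fragment. For a fragment of charge z the spacing on the m/z axis shrinks to 0.036/z.

```python
def find_doublets(peaks, spacing=0.036, tol=0.002, charge=1):
    """Return (low, high) m/z pairs separated by spacing/charge within tol.

    peaks   : iterable of m/z values from a high-resolution MS/MS scan
    spacing : mass difference between the two lysine isotopologues (Da)
    tol     : allowed deviation from the expected spacing (assumed value)
    """
    mzs = sorted(peaks)
    target = spacing / charge
    pairs = []
    for i, low in enumerate(mzs):
        for high in mzs[i + 1:]:
            gap = high - low
            if gap > target + tol:
                break  # peaks are sorted, so later ones are even farther away
            if abs(gap - target) <= tol:
                pairs.append((low, high))
    return pairs


# Hypothetical peak list: two doublets (candidate y-type ions) and two singlets.
peaks = [175.119, 304.162, 304.198, 433.204, 433.240, 500.300]
doublets = find_doublets(peaks)
singlets = [p for p in peaks if not any(p in pair for pair in doublets)]

print("candidate y-type doublets:", doublets)
print("unpaired peaks:", singlets)
```

Peaks left unpaired would, under the scheme in the abstract, be treated as N-terminal or other fragment types.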
Affiliation(s)
- Alicia L Richards
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
13
Hoopmann MR, Moritz RL. Current algorithmic solutions for peptide-based proteomics data generation and identification. Curr Opin Biotechnol 2012; 24:31-8. [PMID: 23142544 DOI: 10.1016/j.copbio.2012.10.013] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 07/19/2012] [Revised: 10/08/2012] [Accepted: 10/18/2012] [Indexed: 12/28/2022]
Abstract
Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics.
14
Allmer J. Algorithms for the de novo sequencing of peptides from tandem mass spectra. Expert Rev Proteomics 2012; 8:645-57. [PMID: 21999834 DOI: 10.1586/epr.11.54] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 01/27/2023]
Abstract
Proteomics is the study of proteins, their time- and location-dependent expression profiles, as well as their modifications and interactions. Mass spectrometry is useful to investigate many of the questions asked in proteomics. Database search methods are typically employed to identify proteins from complex mixtures. However, databases are not often available or, despite their availability, some sequences are not readily found therein. To overcome this problem, de novo sequencing can be used to directly assign a peptide sequence to a tandem mass spectrometry spectrum. Many algorithms have been proposed for de novo sequencing and a selection of them are detailed in this article. Although a standard accuracy measure has not been agreed upon in the field, relative algorithm performance is discussed. The current state of the de novo sequencing is assessed thereafter and, finally, examples are used to construct possible future perspectives of the field.
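The core idea behind assigning a sequence directly to a spectrum is that the mass gap between neighbouring fragment ions of the same series corresponds to one amino acid residue. The toy sketch below reads an idealized, noise-free b-ion ladder; it is not any of the algorithms reviewed in the article, which score many competing paths through noisy spectra. The residue table is a small subset of standard monoisotopic masses, and the function name and tolerance are assumptions.

```python
# Subset of standard monoisotopic residue masses (Da). "L" stands for both
# leucine and isoleucine, which cannot be separated by mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "D": 115.02694,
    "E": 129.04259, "H": 137.05891, "F": 147.06841, "R": 156.10111,
}
PROTON = 1.00728


def read_b_ladder(b_ions, tol=0.01):
    """Translate consecutive b-ion m/z gaps into residues; '?' marks no match."""
    ladder = [PROTON] + sorted(b_ions)  # a b-ion is its residues plus a proton
    sequence = []
    for low, high in zip(ladder, ladder[1:]):
        gap = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) <= tol]
        sequence.append(hits[0] if hits else "?")
    return "".join(sequence)


# Singly charged b1..b4 ions computed for the hypothetical peptide prefix SAVE.
complete = [88.03931, 159.07642, 258.14483, 387.18742]
print(read_b_ladder(complete))

# Dropping the b2 peak leaves a gap that no single residue explains.
missing_b2 = [88.03931, 258.14483, 387.18742]
print(read_b_ladder(missing_b2))
```

The second call shows why missing fragment ions are a major source of de novo errors: the gap spans two residues, so the sequence can only be partially read.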
Affiliation(s)
- Jens Allmer
  - Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir 35430, Turkey
15
Carrasco MA, Buechler SA, Arnold RJ, Sformo T, Barnes BM, Duman JG. Investigating the deep supercooling ability of an Alaskan beetle, Cucujus clavipes puniceus, via high throughput proteomics. J Proteomics 2011; 75:1220-34. [PMID: 22094879 DOI: 10.1016/j.jprot.2011.10.034] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 08/02/2011] [Revised: 10/26/2011] [Accepted: 10/31/2011] [Indexed: 12/27/2022]
Abstract
Cucujus clavipes puniceus is a freeze avoiding beetle capable of surviving the long, extremely cold winters of the Interior of Alaska. Previous studies showed that some individuals typically supercool to mean values of approximately -40 °C, with some individuals supercooling to as low as -58 °C, but these non-deep supercooling (NDSC) individuals eventually freeze if temperatures drop below this. However, other larvae, especially if exposed to very cold temperatures, supercool even further. These deep supercooling (DSC) individuals do not freeze even if cooled to -100 °C. In addition, the body water of the DSC larvae vitrifies (turns to a glass) at glass transition temperatures of -58 to -70 °C. This study examines the proteomes of DSC and NDSC larvae to assess proteins that may contribute to or inhibit the DSC trait. Using high throughput proteomics, we identified 138 proteins and 513 Gene Ontology categories in the DSC group and 104 proteins and 573 GO categories in the NDSC group. GO categories enriched in DSC include alcohol metabolic process, cellular component morphogenesis, monosaccharide metabolic process, regulation of biological quality, extracellular region, structural molecule activity, and antioxidant activity. Proteins unique to DSC include alpha casein precursor, alpha-actinin, vimentin, tropomyosin, beta-lactoglobulin, immunoglobulins, tubulin, cuticle proteins and endothelins.
16
Carrasco MA, Buechler SA, Arnold RJ, Sformo T, Barnes BM, Duman JG. Elucidating the Biochemical Overwintering Adaptations of Larval Cucujus clavipes puniceus, a Nonmodel Organism, via High Throughput Proteomics. J Proteome Res 2011; 10:4634-46. [DOI: 10.1021/pr200518y] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/26/2022]
Affiliation(s)
- Martin A. Carrasco
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Steven A. Buechler
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Randy J. Arnold
  - Proteomics Facility, Indiana University, Indianapolis, Indiana, United States
- Todd Sformo
  - University of Alaska, Fairbanks, Alaska, United States
- Brian M. Barnes
  - Institute of Arctic Biology, University of Alaska, Fairbanks, Alaska, United States
- John G. Duman
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, United States
17
|
Lam H. Building and searching tandem mass spectral libraries for peptide identification. Mol Cell Proteomics 2011; 10:R111.008565. [PMID: 21900153 DOI: 10.1074/mcp.r111.008565] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 11/06/2022] Open
Abstract
Spectral library searching is an emerging approach in peptide identifications from tandem mass spectra, a critical step in proteomic data analysis. Conceptually, the premise of this approach is that the tandem MS fragmentation pattern of a peptide under some fixed conditions is a reproducible fingerprint of that peptide, such that unknown spectra acquired under the same conditions can be identified by spectral matching. In actual practice, a spectral library is first meticulously compiled from a large collection of previously observed and identified tandem MS spectra, usually obtained from shotgun proteomics experiments of complex mixtures. A query spectrum is then identified by spectral matching using recently developed spectral search engines. This review discusses the basic principles of the two pillars of this approach: spectral library construction and spectral library searching. An overview of the software tools available for these two tasks, as well as a high-level description of the underlying algorithms, will be given. Finally, several new methods that utilize spectral libraries for peptide identification in ways other than straightforward spectral matching will also be described.
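The spectral-matching idea summarized in this abstract can be illustrated with a minimal sketch. It is not taken from the review and does not reproduce any particular search engine: it assumes spectra are given as lists of (m/z, intensity) pairs, bins them at a fixed width, and scores a query against each library entry by cosine similarity. The function names, the 1.0 m/z bin width, and the toy library are all hypothetical choices made for illustration.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(query, reference, bin_width=1.0):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    a = bin_spectrum(query, bin_width)
    b = bin_spectrum(reference, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search_library(query, library, bin_width=1.0):
    """Return (peptide, score) for the best-scoring library entry."""
    scored = ((peptide, cosine_similarity(query, spectrum, bin_width))
              for peptide, spectrum in library.items())
    return max(scored, key=lambda item: item[1])

# Hypothetical toy library: peptide label -> list of (m/z, intensity) peaks.
toy_library = {
    "PEPTIDEA": [(100.1, 10.0), (200.2, 50.0), (300.3, 20.0)],
    "PEPTIDEB": [(150.1, 30.0), (250.2, 40.0)],
}
toy_query = [(100.2, 9.0), (200.3, 55.0), (300.1, 18.0)]
best_peptide, best_score = search_library(toy_query, toy_library)
```

In this toy case the query shares all three bins with the first entry and none with the second, so the first entry is returned. Real search engines add precursor-mass filtering, intensity scaling, and statistical validation, none of which is attempted here.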
Affiliation(s)
- Henry Lam
- Department of Chemical and Biomolecular Engineering, the Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
|
18
|
SUN HC, ZHANG JY, LIU H, ZHANG W, XU CM, MA HB, ZHU YP, XIE HW. Algorithm Development of de novo Peptide Sequencing Via Tandem Mass Spectrometry. PROG BIOCHEM BIOPHYS 2011. [DOI: 10.3724/sp.j.1206.2010.00226] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
|
19
|
Abstract
Mass spectrometry-based proteomics has become an essential part of the analytical toolbox of the life sciences. With the ability to identify and quantify hundreds to thousands of proteins in high throughput, the field has contributed its fair share to the data avalanche coming from the so-called omics fields. As a result, the challenges involved in processing and managing this flood of data have grown as well. This chapter will point out and discuss these challenges, starting from the processing of raw mass spectrometry data into peaks, over the identification of peptides and proteins, to the quantification of the identified molecules. Finally, the informatics aspects of the nascent field of targeted proteomics are outlined as well.
Affiliation(s)
- Lennart Martens
- Department of Medical Protein Research, VIB, Ghent University, B-9000, Ghent, Belgium.
|
20
|
Tessier D, Yclon P, Jacquemin I, Larré C, Rogniaux H. OVNIp: an open source application facilitating the interpretation, the validation and the edition of proteomics data generated by MS analyses and de novo sequencing. Proteomics 2010; 10:1794-801. [PMID: 20198638 DOI: 10.1002/pmic.200800783] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Several academic software packages are available to help with the validation and reporting of proteomics data generated by MS analyses. However, to our knowledge, none of them has been conceived to meet the particular needs generated by the study of organisms whose genomes are not sequenced. In that context, we have developed OVNIp, an open-source application which facilitates the whole process of proteomics results interpretation. One of its unique attributes is its capacity to compile multiple results (from several search engines and/or several databank searches) with a resolution of conflicting interpretations. Moreover, OVNIp enables automated exploitation of de novo sequences generated from unassigned MS/MS spectra, leading to higher sequence coverage and enhancing confidence in the identified proteins. The exploitation of these additional spectra might also identify novel proteins through an MS-BLAST search, which can easily be run from the OVNIp interface. Beyond this primary scope, OVNIp can also benefit users who are looking for a simple standalone application to both visualize and confirm MS/MS result interpretations through a simple graphical interface and generate reports according to user-defined forms which may integrate the prerequisites for publication. Sources, documentation and a stable release for Windows are available at http://wwwappli.nantes.inra.fr:8180/OVNIp.
Affiliation(s)
- Dominique Tessier
- INRA, UR 1268 Biopolymères, Interactions, Assemblages, Nantes, France.
|
21
|
Abstract
The current status of de novo sequencing of peptides by MS/MS is reviewed with focus on collision cell MS/MS spectra. The relation between peptide structure and observed fragment ion series is discussed and the exhaustive extraction of sequence information from CID spectra of protonated peptide ions is described. The partial redundancy of the extracted sequence information and a high mass accuracy are recognized as key parameters for dependable de novo sequencing by MS. In addition, the benefits of special techniques enhancing the generation of long uninterrupted fragment ion series for de novo peptide sequencing are highlighted. Among these are terminal ¹⁸O labeling, MSⁿ of sodiated peptide ions, N-terminal derivatization, the use of special proteases, and time-delayed fragmentation. The emerging electron transfer dissociation technique and the recent progress of MALDI techniques for intact protein sequencing are covered. Finally, the integration of bioinformatic tools into peptide de novo sequencing is demonstrated.
Affiliation(s)
- Joerg Seidler
- Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
|
22
|
Jung S, Fladerer C, Braendle F, Madlung J, Spring O, Nordheim A. Identification of a novel Plasmopara halstedii elicitor protein combining de novo peptide sequencing algorithms and RACE-PCR. Proteome Sci 2010; 8:24. [PMID: 20459704 PMCID: PMC2881003 DOI: 10.1186/1477-5956-8-24] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/06/2009] [Accepted: 05/10/2010] [Indexed: 12/04/2022] Open
Abstract
Background: Often, high-quality MS/MS spectra of tryptic peptides do not match any database entry because genomes are only partially sequenced; protein identification therefore requires de novo peptide sequencing. To achieve protein identification of the economically important but still unsequenced plant pathogenic oomycete Plasmopara halstedii, we first evaluated the performance of three different de novo peptide sequencing algorithms applied to protein digests of standard proteins using a quadrupole TOF (QStar Pulsar i). Results: The performance order of the algorithms was PEAKS online > PepNovo > CompNovo. In summary, PEAKS online correctly predicted 45% of measured peptides for a protein test data set. All three de novo peptide sequencing algorithms were used to identify MS/MS spectra of tryptic peptides of an unknown 57 kDa protein of P. halstedii. We found ten de novo sequenced peptides that showed homology to a protein of Phytophthora infestans, an organism closely related to P. halstedii. Employing a second complementary approach, verification of peptide prediction and protein identification was performed by creation of degenerate primers for RACE-PCR and led to an ORF of 1,589 bp for a hypothetical phosphoenolpyruvate carboxykinase. Conclusions: Our study demonstrated that identification of proteins within minute amounts of sample material improved significantly by combining sensitive LC-MS methods with different de novo peptide sequencing algorithms. In addition, this is the first study that verified protein prediction from MS data by also employing a second complementary approach, in which RACE-PCR led to identification of a novel elicitor protein in P. halstedii.
Affiliation(s)
- Stephan Jung
- Proteome Center Tuebingen, Interfakultaeres Institut fuer Zellbiologie, Universitaet Tuebingen, Tuebingen, Germany.
|
23
|
Chi H, Sun RX, Yang B, Song CQ, Wang LH, Liu C, Fu Y, Yuan ZF, Wang HP, He SM, Dong MQ. pNovo: de novo peptide sequencing and identification using HCD spectra. J Proteome Res 2010; 9:2713-24. [PMID: 20329752 DOI: 10.1021/pr100182k] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.6] [Indexed: 11/28/2022]
Abstract
De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only approximately 30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), which is much less than database search. This is mainly due to incomplete fragmentation and overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) is of great help to de novo sequencing because it produces high mass accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. Besides, abundant internal and immonium ions in the HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo gave correct identifications to 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently without extra cost in computation. In summary, implementation of the HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.
Affiliation(s)
- Hao Chi
- Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
|
24
|
Abstract
The review describes methods of de novo sequencing of peptides by mass spectrometry. De novo methods utilize computational approaches to deduce the sequence or partial sequence of peptides directly from the experimental MS/MS spectra. The concepts behind a number of de novo sequencing methods are discussed. The other approach to identify peptides by tandem mass spectrometry is to match the fragment ions with virtual peptide ions generated from a genomic or protein database. De novo methods are essential to identify proteins when the genomes are not known, but they are also extremely useful even when the genomes are known since they are not affected by errors in a search database. Another advantage of de novo methods is that the partial sequence can be used to search for post-translational modifications or for the identification of mutations by homology-based software.
Affiliation(s)
- Christopher Hughes
- Department of Biochemistry, University of Western Ontario, London, ON, Canada
|
25
|
Abstract
Proteomics has advanced in leaps and bounds over the past couple of decades. However, the continuing dependency of mass spectrometry-based protein identification on the searching of spectra against protein sequence databases limits many proteomics experiments. If there is no sequenced genome for a given species, then cross-species proteomics is required, attempting to identify proteins across the species boundary, typically using the sequenced genome of a closely related species. Unlike sequence searching for homologues, the proteomics equivalent is confounded by small differences in amino acid sequences, leading to large differences in peptide masses; this renders mass matching of peptides and their product ions difficult. Therefore, the phylogenetic distance between the two species and the attendant level of conservation between the homologous proteins play a huge part in determining the extent of protein identification that is possible across the species boundary. In this chapter, we review the cross-species challenge itself, as well as various approaches taken to deal with it and the success met with in past studies. This is followed by recommendations of best practice and suggestions to researchers facing this challenge, as well as a final section predicting developments that may help improve cross-species proteomics in the future.
Affiliation(s)
- J C Wright
- Department Veterinary Preclinical Sciences, University of Liverpool, Crown Street, Liverpool, UK
|
26
|
Shevchenko A, Valcu CM, Junqueira M. Tools for exploring the proteomosphere. J Proteomics 2009; 72:137-44. [PMID: 19167528 DOI: 10.1016/j.jprot.2009.01.012] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 01/12/2009] [Accepted: 01/13/2009] [Indexed: 11/29/2022]
Abstract
Homology-driven proteomics aims at exploring the proteomes of organisms with unsequenced genomes that, despite rapid genomic sequencing progress, still represent the overwhelming majority of species in the biosphere. Methodologies have been developed to enable automated LC-MS/MS identifications of unknown proteins, which rely on the sequence similarity between the fragmented peptides and reference database sequences from phylogenetically related species. However, because full sequences of matched proteins are not available and matching specificity is reduced, estimating protein abundances should become the obligatory element of homology-driven proteomics pipelines to circumvent the interpretation bias towards proteins from evolutionarily conserved families.
Affiliation(s)
- Andrej Shevchenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany.
|
27
|
Bringans S, Kendrick TS, Lui J, Lipscombe R. A comparative study of the accuracy of several de novo sequencing software packages for datasets derived by matrix-assisted laser desorption/ionisation and electrospray. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2008; 22:3450-4. [PMID: 18837480 DOI: 10.1002/rcm.3752] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 05/24/2023]
|
28
|
Bocker S, Rasche F. Towards de novo identification of metabolites by analyzing tandem mass spectra. Bioinformatics 2008; 24:i49-i55. [DOI: 10.1093/bioinformatics/btn270] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Indexed: 11/14/2022] Open
|