1
Gabriel W, The M, Zolg DP, Bayer FP, Shouman O, Lautenbacher L, Schnatbaum K, Zerweck J, Knaute T, Delanghe B, Huhmer A, Wenschuh H, Reimer U, Médard G, Kuster B, Wilhelm M. Prosit-TMT: Deep Learning Boosts Identification of TMT-Labeled Peptides. Anal Chem 2022; 94:7181-7190. [PMID: 35549156 DOI: 10.1021/acs.analchem.1c05435] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
The prediction of fragment ion intensities and retention time of peptides has gained significant attention over the past few years. However, the progress shown in the accurate prediction of such properties focused primarily on unlabeled peptides. Tandem mass tags (TMT) are chemical peptide labels that are coupled to free amine groups, usually after protein digestion, to enable the multiplexed analysis of multiple samples in bottom-up mass spectrometry. It is a standard workflow in proteomics ranging from single-cell to high-throughput proteomics. Particularly for TMT, increasing the number of confidently identified spectra is highly desirable as it provides identification and quantification information with every spectrum. Here, we report on the generation of an extensive resource of synthetic TMT-labeled peptides as part of the ProteomeTools project and present the extension of the deep learning model Prosit to predict the retention time and fragment ion intensities of TMT-labeled peptides with high accuracy. Prosit-TMT supports CID and HCD fragmentation and ion trap and Orbitrap mass analyzers in a single model. Reanalysis of published TMT data sets shows that this single model extracts substantial additional information. Applying Prosit-TMT, we discovered that the expression of many proteins in human breast milk follows a distinct daily cycle which may prime the newborn for nutritional or environmental cues.
Affiliation(s)
- Wassim Gabriel: Computational Mass Spectrometry, Technical University of Munich, 85354 Freising, Germany
- Matthew The: Chair of Proteomics and Bioanalytics, Technical University of Munich, 85354 Freising, Germany
- Daniel P Zolg: Chair of Proteomics and Bioanalytics, Technical University of Munich, 85354 Freising, Germany
- Florian P Bayer: Chair of Proteomics and Bioanalytics, Technical University of Munich, 85354 Freising, Germany
- Omar Shouman: Computational Mass Spectrometry, Technical University of Munich, 85354 Freising, Germany
- Ludwig Lautenbacher: Computational Mass Spectrometry, Technical University of Munich, 85354 Freising, Germany
- Tobias Knaute: JPT Peptide Technologies GmbH, 12489 Berlin, Germany
- Andreas Huhmer: Thermo Fisher Scientific, San Jose, California 95134, United States
- Ulf Reimer: JPT Peptide Technologies GmbH, 12489 Berlin, Germany
- Guillaume Médard: Chair of Proteomics and Bioanalytics, Technical University of Munich, 85354 Freising, Germany
- Bernhard Kuster: Chair of Proteomics and Bioanalytics, Technical University of Munich, 85354 Freising, Germany; Bavarian Center for Biomolecular Mass Spectrometry, 85354 Freising, Germany
- Mathias Wilhelm: Computational Mass Spectrometry, Technical University of Munich, 85354 Freising, Germany
2
Phipps WS, Smith KD, Yang HY, Henderson CM, Pflaum H, Lerch ML, Fondrie WE, Emrick MA, Wu CC, MacCoss MJ, Noble WS, Hoofnagle AN. Tandem Mass Spectrometry-Based Amyloid Typing Using Manual Microdissection and Open-Source Data Processing. Am J Clin Pathol 2022; 157:748-757. [PMID: 35512256 PMCID: PMC9071319 DOI: 10.1093/ajcp/aqab185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2021] [Accepted: 09/20/2021] [Indexed: 11/14/2022]
Abstract
OBJECTIVES Standard implementations of amyloid typing by liquid chromatography-tandem mass spectrometry use capabilities unavailable to most clinical laboratories. To improve accessibility of this testing, we explored easier approaches to tissue sampling and data processing. METHODS We validated a typing method using manual sampling in place of laser microdissection, pairing the technique with a semiquantitative measure of sampling adequacy. In addition, we created an open-source data processing workflow (Crux Pipeline) for clinical users. RESULTS Cases of amyloidosis spanning the major types were distinguishable with 100% specificity using measurements of individual amyloidogenic proteins or in combination with the ratio of λ and κ constant regions. Crux Pipeline allowed for rapid, batched data processing, integrating the steps of peptide identification, statistical confidence estimation, and label-free protein quantification. CONCLUSIONS Accurate mass spectrometry-based amyloid typing is possible without laser microdissection. To facilitate entry into solid tissue proteomics, newcomers can leverage manual sampling approaches in combination with Crux Pipeline and related tools.
Affiliation(s)
- William S Phipps: Department of Laboratory Medicine and Pathology, Seattle, WA, USA
- Kelly D Smith: Department of Laboratory Medicine and Pathology, Seattle, WA, USA; Department of Medicine, Seattle, WA, USA
- Han-Yin Yang: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Clark M Henderson: Department of Laboratory Medicine and Pathology, Seattle, WA, USA; Seagen, Bothell, WA, USA
- Hannah Pflaum: Department of Laboratory Medicine and Pathology, Seattle, WA, USA; Seattle Children’s Hospital, Seattle, WA, USA
- Melissa L Lerch: Department of Laboratory Medicine and Pathology, Seattle, WA, USA
- William E Fondrie: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Christine C Wu: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Michael J MacCoss: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- William S Noble: Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Andrew N Hoofnagle: Department of Laboratory Medicine and Pathology, Seattle, WA, USA; Department of Medicine, Seattle, WA, USA
3
Synthetic Peptide-Based Antibody Detection for Diagnosis of Chikungunya Infection with and without Neurological Complications. Methods Mol Biol 2016. [PMID: 27233259 DOI: 10.1007/978-1-4939-3618-2_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Synthetic peptide-based diagnosis of Chikungunya can be an efficient and more accessible approach in immunodiagnostics. Here, we describe the identification of a Chikungunya-specific 40 kD protein for the development of a synthetic peptide-based enzyme-linked immunosorbent assay for the detection of Chikungunya virus-specific antibodies in the patient's sample. The total sodium dodecyl sulfate-polyacrylamide gel electrophoresis protein profile of the patient's sample can be obtained to identify specific protein bands. The identified proteins can be subjected to liquid chromatography-tandem mass spectrometry (LC-MS/MS) for characterization. After characterization, immunogenic peptides can be designed using software and subsequently synthesized chemically. The peptides can be used to develop a more specific, sensitive, and simpler diagnostic assay.
4
Zekavat B, Miladi M, Al-Fdeilat AH, Somogyi A, Solouki T. Evidence for sequence scrambling and divergent H/D exchange reactions of doubly-charged isobaric b-type fragment ions. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2014; 25:226-236. [PMID: 24346960 DOI: 10.1007/s13361-013-0768-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 01/20/2012] [Revised: 10/07/2013] [Accepted: 10/08/2013] [Indexed: 06/03/2023]
Abstract
To date, only a limited number of reports are available on structural variants of multiply-charged b-fragment ions. We report on observed bimodal gas-phase hydrogen/deuterium exchange (HDX) reaction kinetics and patterns for substance P b10(2+) that point to the presence of isomeric structures. We also compare HDX reactions, post-ion mobility/collision-induced dissociation (post-IM/CID), and sustained off-resonance irradiation-collision induced dissociation (SORI-CID) of substance P b10(2+) and a cyclic peptide with an identical amino acid (AA) sequence order to substance P b10. The observed HDX patterns and reaction kinetics and SORI-CID pattern for the doubly charged head-to-tail cyclized peptide were different from either of the presumed isomers of substance P b10(2+), suggesting that b10(2+) may not exist exclusively as a head-to-tail cyclized structure. Ultra-high mass measurement accuracy was used to assign identities of the observed SORI-CID fragment ions of substance P b10(2+); over 30% of the observed SORI-CID fragment ions from substance P b10(2+) had rearranged (scrambled) AA sequences. Moreover, post-IM/CID experiments revealed the presence of two conformer types for substance P b10(2+), whereas only one conformer type was observed for the head-to-tail cyclized peptide. We also show that AA sequence scrambling from CID of doubly-charged b-fragment ions is not unique to substance P b10(2+).
Affiliation(s)
- Behrooz Zekavat: Department of Chemistry and Biochemistry, Baylor University, Waco, TX, 76798, USA
5
Jagusztyn-Krynicka EK, Dadlez M, Grabowska A, Roszczenko P. Proteomic technology in the design of new effective antibacterial vaccines. Expert Rev Proteomics 2014; 6:315-30. [DOI: 10.1586/epr.09.47] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 01/04/2023]
6
Paik YK, Jeong SK, Lee EY, Jeong PY, Shim YH. C. elegans: an invaluable model organism for the proteomics studies of the cholesterol-mediated signaling pathway. Expert Rev Proteomics 2014; 3:439-53. [PMID: 16901202 DOI: 10.1586/14789450.3.4.439] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
Abstract
With the availability of its complete genome sequence and unique biological features relevant to human disease, Caenorhabditis elegans has become an invaluable model organism for the studies of proteomics, leading to the elucidation of nematode gene function. A journey from the genome to proteome of C. elegans may begin with preparation of expressed proteins, which enables a large-scale analysis of all possible proteins expressed under specific physiological conditions. Although various techniques have been used for proteomic analysis of C. elegans, systematic high-throughput analysis is still to come in order to accommodate studies of post-translational modification and quantitative analysis. Given that no integrated C. elegans protein expression database is available, it is about time that a global C. elegans proteome project is launched through which datasets of transcriptomes, protein-protein interaction and functional annotation can be integrated. As an initial target of a pilot project of the C. elegans proteome project, the cholesterol-mediated signaling pathway will be an excellent example since, like in other organisms, it is one of the key controlling pathways in cell growth and development in C. elegans. As this field tends to broaden to functional proteomics, there is a high demand to develop versatile proteome informatics tools that can manage many different data in an integrative manner.
Affiliation(s)
- Young-Ki Paik: Yonsei University, Department of Biochemistry, 134 Shinchon-dong, Sudamoon-Ku, Seoul, 120-749, Korea.
7
Proteomic analysis reveals differentially expressed proteins in macrophages infected with Leishmania amazonensis or Leishmania major. Microbes Infect 2013; 15:579-91. [DOI: 10.1016/j.micinf.2013.04.005] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 12/28/2012] [Revised: 04/09/2013] [Accepted: 04/18/2013] [Indexed: 11/20/2022]
8
Huang T, Gong H, Yang C, He Z. ProteinLasso: A Lasso regression approach to protein inference problem in shotgun proteomics. Comput Biol Chem 2013; 43:46-54. [PMID: 23385215 DOI: 10.1016/j.compbiolchem.2012.12.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 08/26/2012] [Revised: 12/30/2012] [Accepted: 12/30/2012] [Indexed: 11/28/2022]
Abstract
Protein inference is an important issue in proteomics research. Its main objective is to select a proper subset of candidate proteins that best explain the observed peptides. Although many methods have been proposed for solving this problem, several issues such as peptide degeneracy and one-hit wonders remain unsolved. Therefore, the accurate identification of proteins that are truly present in the sample continues to be a challenging task. Based on the concept of peptide detectability, we formulate the protein inference problem as a constrained Lasso regression problem, which can be solved very efficiently through a coordinate descent procedure. The new inference algorithm is named ProteinLasso, and it uses an ensemble learning strategy to address the sparsity parameter selection problem in the Lasso model. We test the performance of ProteinLasso on three datasets. As shown in the experimental results, ProteinLasso outperforms state-of-the-art protein inference algorithms in terms of both identification accuracy and running efficiency. In addition, we show that ProteinLasso is stable under different parameter specifications. The source code of our algorithm is available at: http://sourceforge.net/projects/proteinlasso.
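As a rough illustration of the kind of optimisation this abstract describes, the Python sketch below solves a box-constrained Lasso problem by coordinate descent. It is not the ProteinLasso source code: the objective, the 0 to 1 bounds on the protein weights, the function name `constrained_lasso_cd`, and the toy detectability matrix are all assumptions made for illustration.

```python
def constrained_lasso_cd(X, y, lam, n_iter=100):
    """Minimise 0.5 * ||y - X w||^2 + lam * sum(w) subject to 0 <= w_j <= 1.

    X is a list of rows (peptides) by columns (proteins); y marks observed peptides.
    This bounded formulation is an assumed stand-in, not the published model.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - X w while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # protein j has no peptides, its weight stays 0
            # Correlation of column j with the residual that excludes protein j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n))
            # Closed-form 1-D minimiser, clipped to the [0, 1] box.
            new_w = min(1.0, max(0.0, (rho - lam) / col_sq[j]))
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Toy data (hypothetical): rows are peptides, columns are proteins A and B.
detectability = [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7]]
observed = [1.0, 1.0, 0.0]
weights = constrained_lasso_cd(detectability, observed, lam=0.1)
```

In this toy case protein A explains both observed peptides and receives the maximum weight, while protein B, whose only peptide was not observed, is driven to zero by the L1 penalty.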
Affiliation(s)
- Ting Huang: School of Software, Dalian University of Technology, China
9
10
Conductive carbon tape as a sample platform for microwave-based MALDI MS detection of proteins and phosphoproteins. Anal Bioanal Chem 2011; 401:1219-29. [DOI: 10.1007/s00216-011-5198-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/26/2011] [Revised: 06/18/2011] [Accepted: 06/20/2011] [Indexed: 10/18/2022]
11
Ivanov AS, Zgoda VG, Archakov AI. Technologies of protein interactomics: A review. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2011; 37:8-21. [DOI: 10.1134/s1068162011010092] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]
12
Abstract
The complexity of proteomes makes good experimental design essential for their successful investigation. Here, we describe how proteomics experiments can be modeled and how computer simulations of these models can be used to improve experimental designs.
13
Fritz R, Ruth W, Kragl U. Assessment of acetone as an alternative to acetonitrile in peptide analysis by liquid chromatography/mass spectrometry. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2009; 23:2139-2145. [PMID: 19517463 DOI: 10.1002/rcm.4122] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 05/27/2023]
Abstract
Acetonitrile as a solvent used in liquid chromatography/mass spectrometry (LC/MS) of peptides and proteins is a relatively toxic solvent (LD50 oral; rat; 2,460 mg/kg) compared to alternatives like methanol (LD50 oral; rat; 5,628 mg/kg) and acetone (LD50 oral; rat; 5,800 mg/kg). Strategies to minimize its consumption in LC are either to reduce the inner diameter of the column or to replace acetonitrile with a suitable alternative. Methanol is often recommended to replace acetonitrile in peptide analysis. In this study, however, the main focus lies on another alternative solvent for LC/MS of peptides: acetone. A number of model proteins were tryptically digested and the peptide solutions were analyzed on a linear trap quadrupole (LTQ) mass spectrometer. The performances of acetonitrile, methanol and acetone were compared according to the quality of the chromatograms obtained and identification of the peptides using the BioWorks software developed by Thermo Scientific. In accordance with the eluotropic series, acetone was found to significantly reduce the retention times of peptides separated by C18 column material with regard to acetonitrile, while methanol led to increased retention times. Acetone was the superior solvent to methanol for most of the tested model proteins, reaching similar sequence coverage and numbers of identified peptides as acetonitrile. We therefore propose acetone as an alternative to acetonitrile in LC/MS of peptides.
Affiliation(s)
- Ria Fritz: Institute of Chemistry, University of Rostock, Albert-Einstein-Str. 3A, 18059 Rostock, Germany
14
Abstract
Diagnostic oncoproteomics is the application of proteomic techniques for the diagnosis of malignancies. A new mass spectrometric technology involves surface-enhanced laser desorption ionization combined with time-of-flight mass analysis (SELDI-TOF-MS), using special protein chips. After the description of the relevant principles of the technique, including approaches to proteomic pattern diagnostics, applications are reviewed for the diagnosis of ovarian, breast, prostate, bladder, pancreatic, and head and neck cancers, and also several other malignancies. Finally, problems and prospects of the approach are discussed.
Affiliation(s)
- John Roboz: Division of Hematology-Oncology, Department of Medicine, Mount Sinai School of Medicine, New York, New York, USA
15
Shao C, Sun W, Li F, Yang R, Zhang L, Gao Y. Oscore: a combined score to reduce false negative rates for peptide identification in tandem mass spectrometry analysis. JOURNAL OF MASS SPECTROMETRY : JMS 2009; 44:25-31. [PMID: 18698557 DOI: 10.1002/jms.1466] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/26/2023]
Abstract
Tandem mass spectrometry (MS/MS) has been widely used in proteomics studies. Multiple algorithms have been developed for assessing matches between MS/MS spectra and peptide sequences in databases. However, it is still a challenge to reduce false negative rates without compromising the high confidence of peptide identification. In this study, we developed the score, Oscore, by logistic regression using SEQUEST and AMASS variables to identify fully tryptic peptides. Since these variables showed complicated association with each other, combining them together rather than applying them to a threshold model improved the classification of correct and incorrect peptide identifications. Oscore achieved both a lower false negative rate and a lower false positive rate than PeptideProphet on datasets from 18 known protein mixtures and several proteome-scale samples of different complexity, database size and separation methods. By a three-way comparison among Oscore, PeptideProphet and another logistic regression model which made use of PeptideProphet's variables, the main contributor for the improvement made by Oscore is discussed.
Affiliation(s)
- Chen Shao: Department of Physiology and Pathophysiology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, Beijing, China
16
Evaluation of a standardized method of protein purification and identification after discovery by mass spectrometry. J Proteomics 2008; 71:368-78. [DOI: 10.1016/j.jprot.2008.06.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/02/2008] [Revised: 05/31/2008] [Accepted: 06/05/2008] [Indexed: 11/19/2022]
17
Droit A, Hunter JM, Rouleau M, Ethier C, Picard-Cloutier A, Bourgais D, Poirier GG. PARPs database: a LIMS systems for protein-protein interaction data mining or laboratory information management system. BMC Bioinformatics 2007; 8:483. [PMID: 18093328 PMCID: PMC2266781 DOI: 10.1186/1471-2105-8-483] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 08/14/2007] [Accepted: 12/19/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. DESCRIPTION We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. CONCLUSION Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5.
Affiliation(s)
- Arnaud Droit: Health and Environment Unit, Laval University Medical research Center, CHUQ, Québec, Canada.
18
Waridel P, Frank A, Thomas H, Surendranath V, Sunyaev S, Pevzner P, Shevchenko A. Sequence similarity-driven proteomics in organisms with unknown genomes by LC-MS/MS and automated de novo sequencing. Proteomics 2007; 7:2318-29. [PMID: 17623296 DOI: 10.1002/pmic.200700003] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 11/08/2022]
Abstract
LC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, combined with data processing, stringent, and sequence-similarity database searching tools, was employed in a layered manner to identify proteins in organisms with unsequenced genomes. Highly specific stringent searches (MASCOT) were applied as a first layer screen to identify either known (i.e. present in a database) proteins, or unknown proteins sharing identical peptides with related database sequences. Once the confidently matched spectra were removed, the remainder was filtered against a nonannotated library of background spectra that cleaned up the dataset from spectra of common protein and chemical contaminants. The rectified spectral dataset was further subjected to rapid batch de novo interpretation by PepNovo software, followed by the MS BLAST sequence-similarity search that used multiple redundant and partially accurate candidate peptide sequences. Importantly, a single dataset was acquired at the uncompromised sensitivity with no need of manual selection of MS/MS spectra for subsequent de novo interpretation. This approach enabled a completely automated identification of novel proteins that were, otherwise, missed by conventional database searches.
Affiliation(s)
- Patrice Waridel: Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
19
Eriksson J, Fenyö D. Improving the success rate of proteome analysis by modeling protein-abundance distributions and experimental designs. Nat Biotechnol 2007; 25:651-5. [PMID: 17557102 DOI: 10.1038/nbt1315] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Indexed: 11/08/2022]
Abstract
Truly comprehensive proteome analysis is highly desirable in systems biology and biomarker discovery efforts. But complete proteome characterization has been hindered by the dynamic range and detection sensitivity of experimental designs, which are not adequate to the very wide range of protein abundances. Experimental designs for comprehensive analytical efforts involve separation followed by mass spectrometry-based identification of digested proteins. Because results are generally reported as a collection of identifications with no information on the fraction of the proteome that was missed, they are difficult to evaluate and potentially misleading. Here we address this problem by taking a holistic view of the experimental design and using computer simulations to estimate the success rate for any given experiment. Our approach demonstrates that simple changes in typical experimental designs can enhance the success rate of proteome analysis by five- to tenfold.
Affiliation(s)
- Jan Eriksson: Department of Chemistry, Swedish University of Agricultural Sciences, Box 7015, SE-750 07, Uppsala, Sweden.
20
Zhang X, Wei D, Yap Y, Li L, Guo S, Chen F. Mass spectrometry-based "omics" technologies in cancer diagnostics. MASS SPECTROMETRY REVIEWS 2007; 26:403-31. [PMID: 17405143 DOI: 10.1002/mas.20132] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Indexed: 05/14/2023]
Abstract
Many "omics" techniques have been developed for one goal: biomarker discovery and early diagnosis of human cancers. A comprehensive review of mass spectrometry-based "omics" approaches performed on various biological samples for molecular diagnosis of human cancers is presented in this article. Furthermore, the existing and potential problems/solutions (both de facto experimental and bioinformatic challenges), and future prospects are extensively discussed. Although the use of present omic methods as diagnostic tools is still in its infancy and consequently not ready for immediate clinical use, it can be envisaged that "omics"-based cancer diagnostics will gradually enter the clinic in the next 10 years as an important supplement to current clinical diagnostics.
Affiliation(s)
- Xuewu Zhang: College of Light Industry and Food Sciences, South China University of Technology, Guangzhou, China.
21
Huang Y, Triscari JM, Tseng GC, Pasa-Tolic L, Lipton MS, Smith RD, Wysocki VH. Statistical characterization of the charge state and residue dependence of low-energy CID peptide dissociation patterns. Anal Chem 2007; 77:5800-13. [PMID: 16159109 PMCID: PMC4543285 DOI: 10.1021/ac0480949] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.6] [Indexed: 11/30/2022]
Abstract
Data mining was performed on 28 330 unique peptide tandem mass spectra for which sequences were assigned with high confidence. By dividing the spectra into different sets based on structural features and charge states of the corresponding peptides, chemical interactions involved in promoting specific cleavage patterns in gas-phase peptides were characterized. Pairwise fragmentation maps describing cleavages at all Xxx-Zzz residue combinations for b and y ions reveal that the difference in basicity between Arg and Lys results in different dissociation patterns for singly charged Arg- and Lys-ending tryptic peptides. While one dominant protonation form (proton localized) exists for Arg-ending peptides, a heterogeneous population of different protonated forms or more facile interconversion of protonated forms (proton partially mobile) exists for Lys-ending peptides. Cleavage C-terminal to acidic residues dominates spectra from singly charged peptides that have a localized proton and cleavage N-terminal to Pro dominates those that have a mobile or partially mobile proton. When Pro is absent from peptides that have a mobile or partially mobile proton, cleavage at each peptide bond becomes much more prominent. Whether the above patterns can be found in b ions, y ions, or both depends on the location of the proton holder(s) in multiply protonated peptides. Enhanced cleavages C-terminal to branched aliphatic residues (Ile, Val, Leu) are observed in both b and y ions from peptides that have a mobile proton, as well as in y ions from peptides that have a partially mobile proton; enhanced cleavages N-terminal to these residues are observed in b ions from peptides that have a partially mobile proton. Statistical tools have been designed to visualize the fragmentation maps and measure the similarity between them. The pairwise cleavage patterns observed expand our knowledge of peptide gas-phase fragmentation behaviors and may be useful in algorithm development that employs improved models to predict fragment ion intensities.
Affiliation(s)
- Yingying Huang: Department of Chemistry, University of Arizona, Tucson, AZ 85721
- George C. Tseng: Department of Biostatistics and Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA 99352
- Ljiljana Pasa-Tolic: Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352
- Mary S. Lipton: Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352
- Richard D. Smith: Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352
- Vicki H. Wysocki: Department of Chemistry, University of Arizona, Tucson, AZ 85721; corresponding author, Fax: 520-621-8407
22
Wolski WE, Farrow M, Emde AK, Lehrach H, Lalowski M, Reinert K. Analytical model of peptide mass cluster centres with applications. Proteome Sci 2006; 4:18. [PMID: 16995952 PMCID: PMC1617084 DOI: 10.1186/1477-5956-4-18] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 01/30/2006] [Accepted: 09/23/2006] [Indexed: 11/10/2022]
Abstract
Background The elemental composition of peptides results in formation of distinct, equidistantly spaced clusters across the mass range. The property of peptide mass clustering is used to calibrate peptide mass lists, to identify and remove non-peptide peaks and for data reduction. Results We developed an analytical model of the peptide mass cluster centres. Inputs to the model included the amino acid frequencies in the sequence database, the average length of the proteins in the database, the cleavage specificity of the proteolytic enzyme used and the cleavage probability. We examined the accuracy of our model by comparing it with the model based on an in silico sequence database digest. To identify the crucial parameters we analysed how the cluster centre location depends on the inputs. The distance to the nearest cluster was used to calibrate mass spectrometric peptide peak-lists and to identify non-peptide peaks. Conclusion The model introduced here enables us to predict the location of the peptide mass cluster centres. It explains how the location of the cluster centres depends on the input parameters. Fast and efficient calibration and filtering of non-peptide peaks is achieved by a distance measure suggested by Wool and Smilansky.
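The idea of measuring the distance from a peak to the nearest peptide mass cluster can be sketched in a few lines of Python. This is a simplified illustration, not the analytical model from the paper: it assumes cluster centres sit at integer multiples of a fixed spacing, and the spacing constant, the tolerance, and both function names are hypothetical choices made for the example.

```python
# Assumed average spacing (Da) between neighbouring peptide mass clusters.
# The paper derives centre locations from amino acid frequencies instead.
CLUSTER_SPACING = 1.000495


def distance_to_cluster_centre(mass, spacing=CLUSTER_SPACING):
    """Signed distance (Da) from a mass to the nearest multiple of `spacing`."""
    nearest_centre = round(mass / spacing) * spacing
    return mass - nearest_centre


def filter_peptide_like(masses, tolerance=0.2, spacing=CLUSTER_SPACING):
    """Keep masses lying within `tolerance` Da of an assumed cluster centre."""
    return [m for m in masses
            if abs(distance_to_cluster_centre(m, spacing)) <= tolerance]


# Hypothetical peak list: the first mass sits on a cluster centre,
# the second falls roughly halfway between two centres.
peaks = [1000.495, 1000.0]
kept = filter_peptide_like(peaks)
```

With these assumed values, the peak at 1000.495 Da is retained and the peak at 1000.0 Da is rejected as unlikely to be a peptide. The same signed distances could in principle feed a calibration step, which is not shown here.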
Affiliation(s)
- Witold E Wolski
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
| | - Malcolm Farrow
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
| | - Anne-Katrin Emde
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
| | - Hans Lehrach
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
| | - Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
| | - Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
|
23
|
Wielsch N, Thomas H, Surendranath V, Waridel P, Frank A, Pevzner P, Shevchenko A. Rapid Validation of Protein Identifications with the Borderline Statistical Confidence via De Novo Sequencing and MS BLAST Searches. J Proteome Res 2006; 5:2448-56. [PMID: 16944958 DOI: 10.1021/pr060200v] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Protein identifications with borderline statistical confidence are typically produced by matching a few marginal-quality MS/MS spectra to database peptide sequences and represent a significant bottleneck in the reliable and reproducible characterization of proteomes. Here, we present a method for rapid validation of borderline hits that circumvents the need for often-biased manual inspection of raw MS/MS spectra. The approach takes advantage of the independent interpretation of the corresponding MS/MS spectra by PepNovo de novo sequencing software, followed by mass spectrometry-driven BLAST (MS BLAST) sequence-similarity database searches that utilize all partially inaccurate, degenerate and redundant candidate peptide sequences. In a case study involving the identification of more than 180 Caenorhabditis elegans proteins by nanoLC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, the approach enabled rapid assignment (confirmation or rejection) of more than 70% of Mascot hits of borderline statistical confidence.
Affiliation(s)
- Natalie Wielsch
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany
|
24
|
Fröhlich T, Arnold GJ. Proteome research based on modern liquid chromatography – tandem mass spectrometry: separation, identification and quantification. J Neural Transm (Vienna) 2006; 113:973-94. [PMID: 16835695 DOI: 10.1007/s00702-006-0509-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 09/25/2005] [Accepted: 04/05/2006] [Indexed: 01/31/2023]
Abstract
Recent developments of new generations of mass spectrometers and improvements in the field of chromatography have revolutionized protein analytics. In particular, the combination of liquid chromatography as a separation tool for proteins and peptides with tandem mass spectrometry as an identification tool, referred to as LC-MS/MS, has generated a powerful and broadly used technique in the field of proteomics. The resolution and sensitivity of state-of-the-art LC-MS/MS systems have reached dimensions allowing not only the analysis of individual proteins but also investigations on the level of complete proteomes. However, the enormous complexity and the extreme concentration range of proteins within typical eukaryotic proteomes are still the major challenge of this technique. This review gives an overview of modern LC-MS/MS based proteomics, describing state-of-the-art chromatography and modern mass spectrometry. Strategies to perform quantitative proteomics will be presented and capabilities as well as current limitations of this innovative methodology will be discussed.
Affiliation(s)
- T Fröhlich
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, Ludwig-Maximilians University Munich, Germany
|
25
|
Temporini C, Perani E, Mancini F, Bartolini M, Calleri E, Lubda D, Felix G, Andrisano V, Massolini G. Optimization of a trypsin-bioreactor coupled with high-performance liquid chromatography–electrospray ionization tandem mass spectrometry for quality control of biotechnological drugs. J Chromatogr A 2006; 1120:121-31. [PMID: 16472537 DOI: 10.1016/j.chroma.2006.01.030] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Received: 07/04/2005] [Revised: 11/07/2005] [Accepted: 01/11/2006] [Indexed: 10/25/2022]
Abstract
The optimization of a silica-based trypsin bioreactor and its use in the quality control of biotechnological drugs like peptides and proteins is described. Five bioreactors based on monolithic material have been prepared, with different amounts of bound trypsin. The performances of these bioreactors were compared to the proteolytic activity of a bioreactor based on silica material. The trypsin-based chromatographic columns were coupled on-line with an LC/ESI/MS/MS system for digestion and identification of proteins. First, human serum albumin has been used as test protein to compare the ability of the bioreactors to hydrolyse high-molecular-weight proteins. The best chromatographic material (epoxy monolithic silica) and the optimum amount of enzyme bound (7.13 mg) have been identified to obtain the highest protein recovery and an analytical reproducibility of the whole digestion, separation and identification process. The optimized enzyme-reactor has been used for the on-line digestion of some biotechnological drugs such as somatotropin. Somatotropin for parenteral use has been analyzed, without sample pre-treatment, with both an on-line procedure and the traditional off-line procedure described in the European Pharmacopoeia. It was found that the cleavage efficiency (aminoacidic recovery, %AA) achieved within minutes by the developed protocol is at least comparable to, or even better than, that of the conventional 4 h method.
Affiliation(s)
- C Temporini
- Dipartimento di Chimica Farmaceutica, Università di Pavia, Via Taramelli 12, I-27100 Pavia, Italy
|
26
|
Page MJ, Griffiths TAM, Bleackley MR, MacGillivray RTA. Proteomics: applications relevant to transfusion medicine. Transfus Med Rev 2006; 20:63-74. [PMID: 16373189 DOI: 10.1016/j.tmrv.2005.08.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
With the completion of the human genome sequence, it is now possible to analyze the many individual components that comprise complex biologic systems. Despite this sequence data, understanding the biologic relationships of all proteins of a given cell or biologic sample (the proteome) is still an exceedingly difficult task. However, new technology developments mean that proteomics research can be used to investigate a variety of biologic systems. Already, these studies have given valuable insight for the development of improved diagnostic and therapeutic products. The present review aims to provide a basic understanding of proteomics research by discussing the methods used to study large numbers of proteins and by reviewing the application of proteomics methods to transfusion medicine.
Affiliation(s)
- Michael J Page
- Department of Biochemistry and Molecular Biology, Centre for Blood Research, University of British Columbia, Vancouver, Canada
|
27
|
Fälth M, Sköld K, Norrman M, Svensson M, Fenyö D, Andren PE. SwePep, a database designed for endogenous peptides and mass spectrometry. Mol Cell Proteomics 2006; 5:998-1005. [PMID: 16501280 DOI: 10.1074/mcp.m500401-mcp200] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.0] [Indexed: 11/06/2022] Open
Abstract
A new database, SwePep, specifically designed for endogenous peptides, has been constructed to significantly speed up the identification process from complex tissue samples utilizing mass spectrometry. In the identification process the experimental peptide masses are compared with the peptide masses stored in the database both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data. Successful applications of this methodology are presented. The SwePep database is a relational database developed using MySQL and Java. The database contains 4180 annotated endogenous peptides from different tissues originating from 394 different species as well as 50 novel peptides from brain tissue identified in our laboratory. Information about the peptides, including mass, isoelectric point, sequence, and precursor protein, is also stored in the database. This new approach holds great potential for removing the bottleneck that occurs during the identification process in the field of peptidomics. The SwePep database is available to the public.
Affiliation(s)
- Maria Fälth
- Laboratory for Biological and Medical Mass Spectrometry, Biomedical Centre, Box 583, Uppsala University, SE-75123 Uppsala, Sweden
|
28
|
Hagerty L, Haystead TAJ. Delineating signal transduction pathways in smooth muscle through focused proteomics. Expert Rev Proteomics 2006; 3:75-85. [PMID: 16445352 DOI: 10.1586/14789450.3.1.75] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
This review will outline examples of the authors' focused proteomics approaches to studying signal transduction pathways in smooth muscle. By focusing the use of traditional proteomics techniques with hypothesis-driven selection methods, this approach efficiently addresses the identification of novel elements in a signal transduction pathway of interest. However, focused proteomics serves only as a starting point in the investigation of novel signaling proteins. While focused proteomics studies can suggest the involvement and general biochemical function of a protein in a signaling pathway, these findings must be further investigated and validated. Through the integrated use of focused proteomics with complementary approaches such as genetics, biochemistry and cell physiology, a complete and detailed mechanism of signal transduction can be determined.
Affiliation(s)
- Laura Hagerty
- Department of Pharmacology & Cancer Biology, Duke University, C118 LSRC, Durham, NC 27710, USA
|
29
|
Lee TY, Horng JT, Juan HF, Huang HD, Wu LC, Tsai MF, Huang HC. An agent-based system to discover protein–protein interactions, identify protein complexes and proteins with multiple peptide mass fingerprints. J Comput Chem 2006; 27:1020-32. [PMID: 16639701 DOI: 10.1002/jcc.20417] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
Proteins "work together" by actually binding to form multicomponent complexes that carry out specific functions. Proteomic analyses based on the mass spectrum are now key methods to determine the components in protein complexes. The protein-protein interaction or functional association may be known to exist among the extracted protein spots while analyzing the proteins on the 2D gel. In this study, we develop an agent-based system, AgentMultiProtIdent, which integrates two protein identification tools and a variety of databases storing relations among proteins. It is used to discover protein-protein interactions and protein functional associations, and to identify protein complexes and proteins with multiple peptide mass fingerprints as input. The system takes Multiple Peptide Mass Fingerprints (PMFs) as a whole in the protein complex or protein identification. With the relations among proteins, it may greatly improve the accuracy of identification of protein complexes. Possible relationships among the multiple peptide mass fingerprints, such as ontology relations, can also be discovered by our system, especially in the identification of protein complexes. The agent-based system is now available on the Web at http://dbms104.csie.ncu.edu.tw/~protein/NEW2/.
Affiliation(s)
- Tzong-Yi Lee
- Department of Biological Science and Technology & Institute of Bioinformatics, National Chiao-Tung University, Taiwan, Republic of China
|
30
|
Smith LL, Herrmann KA, Wysocki VH. Investigation of gas phase ion structure for proline-containing b(2) ion. J Am Soc Mass Spectrom 2006; 17:20-8. [PMID: 16338148 DOI: 10.1016/j.jasms.2005.06.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 01/14/2005] [Revised: 06/17/2005] [Accepted: 06/23/2005] [Indexed: 05/05/2023]
Abstract
Unusual fragmentation was observed for doubly charged VPDPR in which cleavage C-terminal to proline and N-terminal to aspartic acid yielded b(2) (+ a(2))/y(3) complementary ions. This unique fragmentation is contradictory to trends previously established by statistical analysis of peptide tandem mass (MS/MS) spectra. Substitution of alanine for aspartic acid (i.e., VPAPR) did not change the fragmentation, indicating the cleavage was not directed by aspartic acid. Fragmentation patterns for VPAPR and V(NmA)APR (NmA = N-methyl alanine) were compared to determine whether conformational constraints from proline's cyclic side-chain contribute to b(2) ion formation. While both peptide sequences fragmented to yield b(2)/y(3) ions, only VPAPR produced a(2) ions, suggesting the VP b(2) ion is structurally different from the V(NmA) b(2) ion. Instead, the V(NmA) b(2) ion was accompanied by an ion corresponding to formal loss of 71. The suspected structural differences were confirmed by isolation and fragmentation of the respective b(2) ions (i.e., MS(3) spectra). Evidence supporting a diketopiperazine structure for the VP b(2) ion is reported. Fragmentation patterns for the VP b(2) ion and a synthetic VP diketopiperazine showed great similarity. N-terminal acetylation of VPAPR prevented the formation of the VP b(2) ion, presumably by blocking nucleophilic attack by the N-terminal amine on the carbonyl oxygen of the protonation site. Acetylation of the N-terminus for V(NmA)APR did not prevent the formation of the V(NmA) b(2) ion, indicating the V(NmA) b(2) ion has a structure, presumably that of an oxazolone, which requires no attack by the N-terminus for formation. Finally, high-resolution, accurate mass measurements determined that the V(NmA) (b(2)-71) ion results from losing a portion of valine from oxazolone V(NmA) b(2) ion, rather than cross-ring cleavage of the alternate diketopiperazine.
Affiliation(s)
- Lori L Smith
- Department of Chemistry, University of Arizona, Tucson, Arizona 85721, USA
|
31
|
Hua L, Low TY, Sze SK. Microwave-assisted specific chemical digestion for rapid protein identification. Proteomics 2006; 6:586-91. [PMID: 16342144 DOI: 10.1002/pmic.200500304] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/11/2022]
Abstract
We have developed a rapid microwave-assisted protein digestion technique based on classic acid hydrolysis reaction with 2% formic acid solution. In this mild chemical environment, proteins are hydrolyzed to peptides, which can be directly analyzed by MALDI-MS or ESI-MS without prior sample purification. Dilute formic acid cleaves proteins specifically at the C-terminal of aspartyl (Asp) residues within 10 min of exposure to microwave irradiation. By adjusting the irradiation time, we found that the extent of protein fragmentation can be controlled, as shown by the single fragmentation of myoglobin at the C-terminal of any of the Asp residues. The efficacy and simplicity of this technique for protein identification are demonstrated by the peptide mass maps of in-gel digested myoglobin and BSA, as well as proteins isolated from Escherichia coli K12 cells.
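The cleavage rule described in this abstract (cuts on the C-terminal side of Asp) lends itself to a short in silico digestion sketch in Python. The function name, the `max_missed` parameter and the example sequence are hypothetical and chosen only for illustration; the sketch does not model irradiation time or cleavage efficiency.

```python
from __future__ import annotations


def cleave_after_asp(sequence: str, max_missed: int = 0) -> list[str]:
    """Split a protein sequence after every Asp (D), allowing missed cleavages.

    Hypothetical helper: it applies only the sequence rule "cut C-terminal
    to D" and ignores any kinetics of the chemical digestion.
    """
    if not sequence:
        return []
    # Cut positions lie directly after each D, except at the very end.
    sites = [i + 1 for i, aa in enumerate(sequence)
             if aa == "D" and i + 1 < len(sequence)]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        last_end = min(start + 1 + max_missed, len(bounds) - 1)
        for end in range(start + 1, last_end + 1):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


if __name__ == "__main__":
    print(cleave_after_asp("MKDAGDLL"))
    print(cleave_after_asp("MKDAGDLL", max_missed=1))
```

Raising `max_missed` mimics incomplete digestion, which loosely corresponds to the controllable extent of fragmentation mentioned in the abstract.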
Affiliation(s)
- Lin Hua
- Genome Institute of Singapore, Singapore
|
32
|
Hufnagel P, Rabus R. Mass Spectrometric Identification of Proteins in Complex Post-Genomic Projects. J Mol Microbiol Biotechnol 2006; 11:53-81. [PMID: 16825790 DOI: 10.1159/000092819] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 10/24/2022] Open
Abstract
The rapidly developing proteomics technologies help to advance the global understanding of physiological and cellular processes. The lifestyle of a study organism determines the type and complexity of a given proteomic project. The complexity of this study is characterized by a broad collection of pathway-specific subproteomes, reflecting the metabolic versatility as well as the regulatory potential of the aromatic-degrading, denitrifying bacterium 'Aromatoleum' sp. strain EbN1. Differences in protein profiles were determined using a gel-based approach. Protein identification was based on a progressive application of MALDI-TOF-MS, MALDI-TOF-MS/MS and LC-ESI-MS/MS. This progression was result-driven and automated by software control. The identification rate was increased by the assembly of a project-specific list of background signals that was used for internal calibration of the MS spectra, and by the combination of two search engines using a dedicated MetaScoring algorithm. In total, intelligent bioinformatics could increase the identification yield from 53 to 70% of the analyzed 5,050 gel spots; a total of 556 different proteins were identified. MS identification was highly reproducible: most proteins were identified more than twice from parallel 2DE gels with an average sequence coverage of >50% and rather restrictive score thresholds (Mascot ≥95, ProFound ≥2.2, MetaScore ≥97). The MS technologies and bioinformatics tools that were implemented and integrated to handle this complex proteomic project are presented. In addition, we describe the basic principles and current developments of the applied technologies and provide an overview of the current state of microbial proteome research.
|
33
|
Wolski WE, Lalowski M, Martus P, Herwig R, Giavalisco P, Gobom J, Sickmann A, Lehrach H, Reinert K. Transformation and other factors of the peptide mass spectrometry pairwise peak-list comparison process. BMC Bioinformatics 2005; 6:285. [PMID: 16318636 PMCID: PMC1343595 DOI: 10.1186/1471-2105-6-285] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 08/08/2005] [Accepted: 11/30/2005] [Indexed: 11/22/2022] Open
Abstract
Background: Biological Mass Spectrometry is used to analyse peptides and proteins. A mass spectrum generates a list of measured mass to charge ratios and intensities of ionised peptides, which is called a peak-list. In order to classify the underlying amino acid sequence, the acquired spectra are usually compared with synthetic ones. Development of suitable methods of direct peak-list comparison may be advantageous for many applications. Results: The pairwise peak-list comparison is a multistage process composed of matching of peaks embedded in two peak-lists, normalisation, scaling of peak intensities and dissimilarity measures. In our analysis, we focused on binary and intensity based measures. We have modified the measures in order to comprise the mass spectrometry specific properties of mass measurement accuracy and non-matching peaks. We compared the labelling of peak-list pairs, obtained using different factors of the pairwise peak-list comparison, as being the same or different to those determined by sequence database searches. In order to elucidate how these factors influence the peak-list comparison we adopted an analysis of variance type method with the partial area under the ROC curve as a dependent variable. Conclusion: The analysis of variance provides insight into the relevance of various factors influencing the outcome of the pairwise peak-list comparison. For large MS/MS and PMF data sets the outcome of ANOVA analysis was consistent, providing a strong indication that the results presented here might be valid for many various types of peptide mass measurements.
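The first stages of a pairwise peak-list comparison (matching peaks within a mass tolerance, then applying a binary measure) can be sketched in Python as follows. This is a generic illustration, not one of the modified measures evaluated in the paper: the greedy two-pointer matching, the Jaccard-style dissimilarity, the function names and the default tolerance are all assumptions made for the example.

```python
from __future__ import annotations


def count_matched_peaks(mz_a: list[float], mz_b: list[float], tol: float = 0.1) -> int:
    """Count peaks that pair up within `tol` (greedy walk over sorted m/z lists)."""
    a, b = sorted(mz_a), sorted(mz_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            matched += 1  # each peak is used in at most one pair
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return matched


def binary_dissimilarity(mz_a: list[float], mz_b: list[float], tol: float = 0.1) -> float:
    """Jaccard-style dissimilarity: 0 means identical peak sets, 1 means no shared peaks."""
    matched = count_matched_peaks(mz_a, mz_b, tol)
    union = len(mz_a) + len(mz_b) - matched
    return 0.0 if union == 0 else 1.0 - matched / union


if __name__ == "__main__":
    list_a = [100.0, 200.0, 300.0]
    list_b = [100.05, 250.0, 300.02]
    print(count_matched_peaks(list_a, list_b), binary_dissimilarity(list_a, list_b))
```

Intensity-based measures would additionally require normalisation and scaling of the matched intensities, which this sketch leaves out.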
Affiliation(s)
- Witold E Wolski
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
| | - Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
| | - Peter Martus
- Institute for Medical Informatics, Biometry and Epidemiology; Charite University Medicine Berlin, Hindenburgdamm 30 (HBD 30), 12200 Berlin
| | - Ralf Herwig
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
| | - Patrick Giavalisco
- Boyce Thompson Institute for Plant Research, Tower Road, Ithaca 14850, NY, USA
| | - Johan Gobom
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
| | - Albert Sickmann
- DFG Research Center for Experimental Biomedicine, University of Würzburg, Versbacherstr. 9, D-97078 Würzburg, Germany
| | - Hans Lehrach
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, D-14195 Berlin, Germany
| | - Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, D-14195 Berlin, Germany
|
34
|
Wolski WE, Lalowski M, Jungblut P, Reinert K. Calibration of mass spectrometric peptide mass fingerprint data without specific external or internal calibrants. BMC Bioinformatics 2005; 6:203. [PMID: 16102175 PMCID: PMC1199585 DOI: 10.1186/1471-2105-6-203] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 04/26/2005] [Accepted: 08/15/2005] [Indexed: 12/24/2022] Open
Abstract
Background: Peptide Mass Fingerprinting (PMF) is a widely used mass spectrometry (MS) method of analysis of proteins and peptides. It relies on the comparison between experimentally determined and theoretical mass spectra. The PMF process requires calibration, usually performed with external or internal calibrants of known molecular masses. Results: We have introduced two novel MS calibration methods. The first method utilises the local similarity of peptide maps generated after separation of complex protein samples by two-dimensional gel electrophoresis. It computes a multiple peak-list alignment of the data set using a modified Minimum Spanning Tree (MST) algorithm. The second method exploits the idea that hundreds of MS samples are measured in parallel on one sample support. It improves the calibration coefficients by applying a two-dimensional Thin Plate Splines (TPS) smoothing algorithm. We studied the novel calibration methods utilising data generated by three different MALDI-TOF-MS instruments. We demonstrate that a PMF data set can be calibrated without resorting to external or relying on widely occurring internal calibrants. The methods developed here were implemented in R and are part of the BioConductor package mscalib available from http://www.bioconductor.org. Conclusion: The MST calibration algorithm is well suited to calibrate MS spectra of protein samples resulting from two-dimensional gel electrophoretic separation. The TPS based calibration algorithm might be used to correct systematic mass measurement errors observed for large MS sample supports. As compared to other methods, our combined MS spectra calibration strategy increases the peptide/protein identification rate by an additional 5-15%.
Affiliation(s)
- Witold E Wolski
- Max Planck Institute for Molecular Genetics, Ihnestraße 63–73, D-14195 Berlin, Germany
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
- School of Mathematics and Statistics, Merz Court, University of Newcastle upon Tyne, NE1 7RU, UK
| | - Maciej Lalowski
- Max Delbrück Center for Molecular Medicine, Robert-Roessle-Str. 10, D-13125 Berlin-Buch, Germany
| | - Peter Jungblut
- Max Planck Institute for Infection Biology, Schumannstr. 21–22, D-10117 Berlin, Germany
| | - Knut Reinert
- Institute for Computer Science, Free University Berlin, Takustr. 9, 14195 Berlin, Germany
|
35
|
Halligan BD, Ruotti V, Twigger SN, Greene AS. DeNovoID: a web-based tool for identifying peptides from sequence and mass tags deduced from de novo peptide sequencing by mass spectroscopy. Nucleic Acids Res 2005; 33:W376-81. [PMID: 15980493 PMCID: PMC1160222 DOI: 10.1093/nar/gki461] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 01/11/2023] Open
Abstract
One of the core activities of high-throughput proteomics is the identification of peptides from mass spectra. Some peptides can be identified using spectral matching programs like Sequest or Mascot, but many spectra do not produce high quality database matches. De novo peptide sequencing is an approach to determine partial peptide sequences for some of the unidentified spectra. A drawback of de novo peptide sequencing is that it produces a series of ordered and disordered sequence tags and mass tags rather than a complete, non-degenerate peptide amino acid sequence. This incomplete data is difficult to use in conventional search programs such as BLAST or FASTA. DeNovoID is a program that has been specifically designed to use degenerate amino acid sequence and mass data derived from MS experiments to search a peptide database. Since the algorithm employed depends on the amino acid composition of the peptide and not its sequence, DeNovoID does not have to consider all possible sequences, but rather a smaller number of compositions consistent with a spectrum. DeNovoID also uses a geometric indexing scheme that reduces the number of calculations required to determine the best peptide match in the database. DeNovoID is available at .
Affiliation(s)
- Brian D Halligan
- Bioinformatics Research Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53213, USA.
|
36
|
Herrmann KA, Wysocki VH, Vorpagel ER. Computational investigation and hydrogen/deuterium exchange of the fixed charge derivative tris(2,4,6-trimethoxyphenyl) phosphonium: implications for the aspartic acid cleavage mechanism. J Am Soc Mass Spectrom 2005; 16:1067-80. [PMID: 15921922 DOI: 10.1016/j.jasms.2005.03.028] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Received: 12/01/2004] [Revised: 03/25/2005] [Accepted: 03/25/2005] [Indexed: 05/02/2023]
Abstract
Aspartic acid (Asp)-containing peptides with the fixed charge derivative tris(2,4,6-trimethoxyphenyl) phosphonium (tTMP-P+) were explored computationally and experimentally by hydrogen/deuterium (H/D) exchange and by fragmentation studies to probe the phenomenon of selective cleavage C-terminal to Asp in the absence of a "mobile" proton. Ab initio modeling of the tTMP-P+ electrostatic potential shows that the positive charge is distributed on the phosphonium group and therefore is not initiating or directing fragmentation as would a "mobile" proton. Geometry optimizations and vibrational analyses of different Asp conformations show that the Asp structure with a hydrogen bond between the side-chain hydroxy and backbone carbonyl lies 2.8 kcal/mol above the lowest energy conformer. In reactions with D2O, the phosphonium-derived doubly charged peptide (H+)P+LDIFSDF rapidly exchanges all 12 of its exchangeable hydrogens for deuterium and also displays a nonexchanging population. With no added proton, P+LDIFSDF exchanges a maximum of 4 of 11 exchangeable hydrogens for deuterium. No exchange is observed when all acidic groups are converted to the corresponding methyl esters. Together, these H/D exchange results indicate that the acidic hydrogens are "mobile locally" because they are able to participate in exchange even in the absence of an added proton. Fragmentation of two distinct (H+)P+LDIFSDF ion populations shows that the nonexchanging population displays selective cleavage, whereas the exchanging population fragments more evenly across the peptide backbone. This result indicates that H/D exchange can sometimes distinguish between and provide a means of separation of different protonation motifs and that these protonation motifs can have an effect on the fragmentation.
Affiliation(s)
- Kristin A Herrmann
- Department of Chemistry, University of Arizona, Tucson, Arizona 85721-0041, USA
|
37
|
Shevchenko A, de Sousa MML, Waridel P, Bittencourt ST, de Sousa MV, Shevchenko A. Sequence Similarity-Based Proteomics in Insects: Characterization of the Larvae Venom of the Brazilian Moth Cerodirphia speciosa. J Proteome Res 2005; 4:862-9. [PMID: 15952733 DOI: 10.1021/pr0500051] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Abstract
Using a combination of tandem mass spectrometric sequencing and sequence similarity searches, we characterized the larvae venom of the moth Cerodirphia speciosa, which belongs to the Saturniidae family of the Lepidoptera order. Despite the paucity of available database sequence resources, the approach enabled us to identify 48 out of 58 attempted spots on its two-dimensional gel electrophoresis map, which represented 37 unique proteins, whereas it was only possible to identify 13 proteins by conventional non-error tolerant database searching methods. The majority of cross-species hits were made to proteins from the phylogenetically related Lepidoptera organism, the silk worm Bombyx mori. The protein composition of the venom suggested that envenoming by C. speciosa toxins might proceed through the contact with its hemolymph, similarly to another toxic Lepidoptera organism, Lonomia obliqua.
Affiliation(s)
- Anna Shevchenko
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany
|
38
|
Sadygov RG, Cociorva D, Yates JR. Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book. Nat Methods 2005; 1:195-202. [PMID: 15789030 DOI: 10.1038/nmeth725] [Citation(s) in RCA: 270] [Impact Index Per Article: 14.2] [Indexed: 12/30/2022]
Abstract
Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability-based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.
Affiliation(s)
- Rovshan G Sadygov
- Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037, USA
|
39
|
Raucci G, Gabrielli M, Novelli S, Picariello G, Collins SH. CHASE, a charge-assisted sequencing algorithm for automated homology-based protein identifications with matrix-assisted laser desorption/ionization time-of-flight post-source decay fragmentation data. J Mass Spectrom 2005; 40:475-488. [PMID: 15712359 DOI: 10.1002/jms.817] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/24/2023]
Abstract
We describe CHASE, a novel algorithm for automated de novo sequencing based on the mass spectrometric (MS) fragmentation analysis of tryptic peptides. This algorithm is used for protein identification from sequence similarity criteria and consists of four steps: (1) derivatization of tryptic peptides at the N-terminus with a negatively charged reagent; (2) post-source decay (PSD) fragmentation analysis of peptides; (3) interpretation of the mass peaks with the CHASE algorithm and reconstruction of the amino acid sequence; (4) transfer of these data to software for protein identifications based on sequence homology (Basic Local Alignment Search Tool, BLAST). This procedure deduced the correct amino acid sequence of tryptic peptide samples and also was able to deduce the correct sequence from difficult mass patterns and identify the amino acid sequence. This allows complete automation of the process starting from MS fragmentation of complex peptide mixtures at low concentration (e.g. from silver-stained gel bands) to identification of the protein. We also show that if PSD data are collected in a single spectrum (instead of the segmented mode offered by conventional matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) instrumentation), the complete workflow from MS-PSD data acquisition to similarity-based identification can be completely automated. This strategy may be applied to proteomic studies for protein identification based on automated de novo sequencing instead of MS or tandem MS patterns. We describe the Charge Assisted Sequencing Engine (CHASE) algorithm, the working protocol, the performance of the algorithm on spectra from MALDI-TOFMS and the data comparison between a TOF and a TOF-TOF instrument.
Affiliation(s)
- Giuseppe Raucci
- Menarini Biotech, Via Tito Speri 10, 00040 Pomezia, RM, Italy.
|
40
|
Abstract
Two-dimensional electrophoresis (2-DE) combined with mass spectrometry has significantly improved the possibilities of large-scale identification of proteins. However, 2-DE is limited by its inability to speed up the in-gel digestion process. We have developed a new approach to speed up the protein identification process utilizing microwave technology. Proteins excised from gels are subjected to in-gel digestion with endoprotease trypsin by microwave irradiation, which rapidly produces peptide fragments. The peptide fragments were further analyzed by matrix-assisted laser desorption/ionization technique for protein identification. The efficacy of this technique for protein mapping was demonstrated by the mass spectral analyses of the peptide fragmentation of several proteins, including lysozyme, albumin, conalbumin, and ribonuclease A. The method reduced the required time for in-gel digestion of proteins from 16 hours to as little as five minutes. This new application of microwave technology to protein identification will be an important advancement in biotechnology and proteome research.
Affiliation(s)
- Hsueh-Fen Juan
- Institute of Molecular and Cellular Biology, National Taiwan University, Taipei.
|
41
|
Graham DRM, Elliott ST, Van Eyk JE. Broad-based proteomic strategies: a practical guide to proteomics and functional screening. J Physiol 2005; 563:1-9. [PMID: 15611015 PMCID: PMC1665568 DOI: 10.1113/jphysiol.2004.080341] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 12/01/2004] [Accepted: 12/15/2004] [Indexed: 11/08/2022] Open
Abstract
Proteomics, the study of the proteome (the collection of all the proteins expressed from the genome in all isoforms, polymorphisms and post-translational modifications), is a rapidly developing field in which there are numerous new and often expensive technologies, making it imperative to use the most appropriate technology for the biological system and hypothesis being addressed. This review provides some guidelines on approaching a broad-based proteomics project, including strategies on refining hypotheses, choosing models and proteomic approaches with an emphasis on aspects of sample complexity (including abundance and protein characteristics), and separation technologies and their respective strengths and weaknesses. Finally, issues related to quantification, mass spectrometry and informatics strategies are discussed. The goal of this review is therefore twofold: the first section provides a brief outline of proteomic technologies, specifically with respect to their applications to broad-based proteomic approaches, and the second part provides more details about the application of these technologies in typical scenarios dealing with physiological and pathological processes. Proteomics at its best is the integration of carefully planned research and complementary techniques with the advantages of powerful discovery technologies that has the potential to make substantial contributions to the understanding of disease and disease processes.
Affiliation(s)
- David R M Graham
- Department of Medicine, Division of Cardiology, Johns Hopkins University, Baltimore, MD 21224, USA
|
42
|
Maguire PB, Moran N, Cagney G, Fitzgerald DJ. Application of proteomics to the study of platelet regulatory mechanisms. Trends Cardiovasc Med 2005; 14:207-20. [PMID: 15451512 DOI: 10.1016/j.tcm.2004.06.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 12/20/2022]
Abstract
Newly developed proteomic technologies now permit the routine identification of hundreds or even thousands of proteins in a single experiment. However, the global study of any proteome has unique challenges that set it apart from comprehensive studies of genes and transcripts. The detection of low-abundance, biologically relevant proteins poses a particular challenge, especially given that the dynamic range of proteins in cells is estimated to be ≥10^6. Nevertheless, the incorporation of proteomics into functional biochemical and biologic investigation has proved to be a powerful tool when applied to platelet biology. This review highlights recent proteomic approaches to the characterization of the proteins released from activated platelets and to the identification of integrin-associated regulators of platelet function. Also described are efforts to link platelet-proteomic and platelet-transcriptional data.
Affiliation(s)
- Patricia B Maguire
- Department of Clinical Pharmacology, Royal College of Surgeons in Ireland, Dublin, Ireland.
|
43
|
Abstract
In mass spectrometry (MS)-based protein studies, peptide fragmentation analysis (i.e., MS/MS experiments such as matrix-assisted laser desorption ionization [MALDI]-post-source decay [PSD] analysis, collision-induced dissociation [CID] of electrospray- and MALDI-generated ions, and electron-capture and electron-transfer dissociation analysis of multiply charged ions) provides sequence information and, thus, can be used for (i) de novo sequencing, (ii) protein identification, and (iii) posttranslational or other covalent modification site assignments. This chapter offers a qualitative overview of which kinds of peptide fragments are formed under different MS/MS conditions. High-quality PSD and CID spectra provide illustrations of de novo sequencing and protein identification. The MS/MS behavior of some common posttranslational modifications such as acetylation, trimethylation, phosphorylation, sulfation, and O-glycosylation is also discussed.
|
44
|
Dayon L, Roussel C, Prudent M, Lion N, Girault HH. On-line counting of cysteine residues in peptides during electrospray ionization by electrogenerated tags and their application to protein identification. Electrophoresis 2005; 26:238-47. [PMID: 15624160 DOI: 10.1002/elps.200406144] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022]
Abstract
The electrochemically induced mass spectrometric tagging of cysteines by substituted hydroquinones was studied for peptides in a classical electrospray solvent (i.e., MeOH/H2O/AcOH 50/49/1). The tagging efficiency was tested with different hydroquinone compounds on an undecapeptide containing one cysteine residue. 2-Carboxymethylhydroquinone was the most reactive probe and proved suitable for cysteine quantification in peptides containing up to three cysteine residues, even in the case of two consecutive cysteines in the sequence. We demonstrate the compatibility of the on-line electrochemical tagging method with the cysteine content analysis of peptides coming from gel-free protein digestion procedures. The identification of bovine serum albumin and human alpha-lactalbumin digest samples in a peptide mapping strategy was greatly improved by the application of the electrotagging technique as post-column treatment. Indeed, the determination of cysteine content in the tryptic peptides provided powerful information to enhance the identification score as well as the discrimination against other protein candidates. The tagging method was then applied to the determination of four proteins in a model mixture.
Affiliation(s)
- Loïc Dayon
- Laboratoire d'Electrochimie Physique et Analytique, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
45
|
Barrier M, Mirkes PE. Proteomics in developmental toxicology. Reprod Toxicol 2005; 19:291-304. [PMID: 15686865 DOI: 10.1016/j.reprotox.2004.09.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 03/31/2004] [Revised: 07/26/2004] [Accepted: 09/03/2004] [Indexed: 10/26/2022]
Abstract
The objective of this presentation is to review the major proteomic technologies available to developmental toxicologists and, when possible, to provide examples of how various proteomic technologies have been used in developmental toxicology or toxicology in general. The field of proteomics is too broad for us to go into great depth about each technology, so we have attempted to provide brief overviews supplemented with many references that cover the subjects in more detail. Proteomics tools produce a global view of complex biological systems by examining complex protein mixtures using large-scale, high-throughput technologies. These technologies speed up the process of protein separation, quantification, and identification. As an important complement to genomics, proteomics allows for the examination of the entire complement of proteins in an organism, tissue, or cell-type. Current proteomics technologies not only identify protein expression, but also post-translational modifications and protein interactions. The field of proteomics is expanding rapidly to provide greater volume and quality of protein information to help understand the multifaceted nature of biological systems.
Affiliation(s)
- Marianne Barrier
- Birth Defects Research Laboratory, Division of Genetics and Developmental Medicine, Department of Pediatrics, University of Washington, Box 356320, 1959 NE Pacific Street, Seattle, WA 98195, USA
|
46
|
Sun W, Li F, Wang J, Zheng D, Gao Y. AMASS: Software for Automatically Validating the Quality of MS/MS Spectrum from SEQUEST Results. Mol Cell Proteomics 2004; 3:1194-9. [PMID: 15489460 DOI: 10.1074/mcp.m400120-mcp200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022] Open
Abstract
Time-consuming and experience-dependent manual validations of tandem mass spectra are usually applied to SEQUEST results. This inefficient method has become a significant bottleneck for MS/MS data processing. Here we introduce a program AMASS (advanced mass spectrum screener), which can filter the tandem mass spectra of SEQUEST results by measuring the match percentage of high-abundant ions and the continuity of matched fragment ions in b, y series. Compared with Xcorr and DeltaCn filter, AMASS can increase the number of positives and reduce the number of negatives in 22 datasets generated from 18 known protein mixtures. It effectively removed most noisy spectra, false interpretations, and about half of poor fragmentation spectra, and AMASS can work synergistically with Rscore filter. We believe the use of AMASS and Rscore can result in a more accurate identification of peptide MS/MS spectra and reduce the time and energy for manual validation.
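The two quantities named in this abstract, the match percentage of high-abundance ions and the continuity of matched fragment ions, can be illustrated with a minimal Python sketch. This is not the AMASS implementation: the function names, the `top_n` cutoff, and the `tol` mass tolerance are hypothetical choices made only for illustration.

```python
def match_fraction_of_top_peaks(peaks, theoretical_mz, top_n=10, tol=0.5):
    """Fraction of the top_n most intense peaks lying within tol of a theoretical m/z.

    peaks: list of (mz, intensity) tuples; top_n and tol are illustrative defaults.
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    if not top:
        return 0.0
    matched = sum(
        1 for mz, _ in top if any(abs(mz - t) <= tol for t in theoretical_mz)
    )
    return matched / len(top)


def longest_consecutive_run(matched_indices):
    """Length of the longest run of consecutive ion indices (b2, b3, b4 gives 3)."""
    idx = set(matched_indices)
    best = 0
    for i in idx:
        if i - 1 not in idx:  # i starts a run
            length = 1
            while i + length in idx:
                length += 1
            best = max(best, length)
    return best


# Toy spectrum and toy theoretical fragment list (made-up values).
peaks = [(100.0, 50), (200.2, 90), (300.0, 10), (400.9, 70)]
theoretical = [200.0, 401.0, 500.0]
fraction = match_fraction_of_top_peaks(peaks, theoretical, top_n=3)
run = longest_consecutive_run([2, 3, 4, 7, 8])
```

A filter in this spirit would reject a spectrum when either value falls below a chosen threshold; the thresholds themselves would have to be tuned on data with known answers.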
Affiliation(s)
- Wei Sun
- Proteomics Research Center, National Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
|
47
|
Yan B, Pan C, Olman VN, Hettich RL, Xu Y. A graph-theoretic approach for the separation of b and y ions in tandem mass spectra. Bioinformatics 2004; 21:563-74. [PMID: 15454408 DOI: 10.1093/bioinformatics/bti044] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Ion-type identification is a fundamental problem in computational proteomics. Methods for accurate identification of ion types provide the basis for many mass spectrometry data interpretation problems, including (a) de novo sequencing, (b) identification of post-translational modifications and mutations and (c) validation of database search results. RESULTS: Here, we present a novel graph-theoretic approach for solving the problem of separating b ions from y ions in a set of tandem mass spectra. We represent each spectral peak as a node and consider two types of edges: type-1 edges connecting two peaks probably of the same ion type and type-2 edges connecting two peaks probably of different ion types. The problem of ion separation is formulated and solved as a graph partition problem, which is to partition the graph into three subgraphs, representing b, y and other ions, respectively, through maximizing the total weight of type-1 edges while minimizing the total weight of type-2 edges within each partitioned subgraph. We have developed a dynamic programming algorithm for rigorously solving this graph partition problem and implemented it as a computer program PRIME (PaRtition of Ion types in tandem Mass spEctra). The tests on a large amount of simulated mass spectra and 19 sets of high-quality experimental Fourier transform ion cyclotron resonance tandem mass spectra indicate that an accuracy level of approximately 90% for the separation of b and y ions was achieved. AVAILABILITY: The executable code of PRIME is available upon request. CONTACT: xyn@bmb.uga.edu.
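One well-known relation that links peaks of different ion types is b/y complementarity: for singly charged fragments, the m/z of b_i plus the m/z of y_(n-i) equals the neutral peptide mass plus two proton masses. The Python sketch below only illustrates that relation on a toy peptide. It is not the PRIME algorithm, and treating complementarity as a "different ion type" edge is an assumption made here, not a detail taken from the abstract.

```python
# Monoisotopic residue masses (Da) for the residues used in the toy peptide.
RESIDUE = {"P": 97.05276, "E": 129.04259, "T": 101.04768, "K": 128.09496}
WATER = 18.010565
PROTON = 1.007276


def b_y_ions(peptide):
    """Return singly charged b and y ion m/z lists (b1..b(n-1), y1..y(n-1))."""
    masses = [RESIDUE[a] for a in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - j:]) + WATER + PROTON for j in range(1, n)]
    return b, y


peptide = "PEPTK"
b_ions, y_ions = b_y_ions(peptide)
neutral_mass = sum(RESIDUE[a] for a in peptide) + WATER

# b_i pairs with y_(n-i); both lists are 0-indexed, so b_i is b_ions[i - 1].
n = len(peptide)
complement_sums = [b_ions[i - 1] + y_ions[n - i - 1] for i in range(1, n)]
```

Every entry of `complement_sums` equals `neutral_mass + 2 * PROTON` up to floating-point error, so two observed peaks whose m/z values sum to that target are candidates for being one b ion and one y ion.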
Affiliation(s)
- Bo Yan
- Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602, USA
|
48
|
Yang X, Dondeti V, Dezube R, Maynard DM, Geer LY, Epstein J, Chen X, Markey SP, Kowalak JA. DBParser: Web-Based Software for Shotgun Proteomic Data Analyses. J Proteome Res 2004; 3:1002-8. [PMID: 15473689 DOI: 10.1021/pr049920x] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Indexed: 11/29/2022]
Abstract
We describe a web-based program called 'DBParser' for rapidly culling, merging, and comparing sequence search engine results from multiple LC-MS/MS peptide analyses. DBParser employs the principle of parsimony to consolidate redundant protein assignments and derive the most concise set of proteins consistent with all of the assigned peptide sequences observed in an experiment or series of experiments. The resulting reports summarize peptide and protein identifications from multidimensional experiments that may contain a single data set or combine data from a group of data sets, all related to a single analytical sample. Additionally, the results of multiple experiments, each of which may contain several data sets, can be compared in reports that identify features that are common or different. DBParser actively links to the primary mass spectral data and to public online databases such as NCBI, GO, and Swiss-Prot in order to structure contextually specific reports for biologists and biochemists.
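The principle of parsimony mentioned in this abstract, finding a small set of proteins that accounts for all assigned peptides, is often approximated with a greedy set-cover heuristic. The Python sketch below shows that heuristic on made-up data. It is an illustration under that assumption, not DBParser's actual procedure, and the protein and peptide names are hypothetical.

```python
def parsimonious_proteins(protein_to_peptides):
    """Greedy set cover: repeatedly pick the protein that explains the most
    still-unexplained peptides until every peptide is explained.

    protein_to_peptides: dict mapping protein ID to a set of peptide sequences.
    Ties are broken by protein ID order so the result is deterministic.
    """
    unexplained = set()
    for peptides in protein_to_peptides.values():
        unexplained |= peptides
    chosen = []
    while unexplained:
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & unexplained),
        )
        chosen.append(best)
        unexplained -= protein_to_peptides[best]
    return chosen


# Hypothetical search results: P2 is fully subsumed by P1.
hits = {
    "P1": {"AAK", "CCR", "DDK"},
    "P2": {"AAK", "CCR"},
    "P3": {"EEK"},
    "P4": {"DDK", "EEK"},
}
minimal_set = parsimonious_proteins(hits)
```

The greedy choice does not always give the smallest possible set, and it does not report which proteins are indistinguishable given the evidence (here P3 and P4 explain the remaining peptide equally well), which a full reporting tool would need to handle.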
Affiliation(s)
- Xiaoyu Yang
- National Institutes of Health, 10 Center Drive, Room 3D42, Bethesda, Maryland 20892-1262, USA
|
49
|
Liska AJ, Shevchenko A, Pick U, Katz A. Enhanced photosynthesis and redox energy production contribute to salinity tolerance in Dunaliella as revealed by homology-based proteomics. Plant Physiol 2004; 136:2806-17. [PMID: 15333751 PMCID: PMC523343 DOI: 10.1104/pp.104.039438] [Citation(s) in RCA: 148] [Impact Index Per Article: 7.4] [Received: 02/04/2004] [Revised: 05/31/2004] [Accepted: 06/02/2004] [Indexed: 05/17/2023]
Abstract
Salinity is a major limiting factor for the proliferation of plants and inhibits central metabolic activities such as photosynthesis. The halotolerant green alga Dunaliella can adapt to hypersaline environments and is considered a model photosynthetic organism for salinity tolerance. To clarify the molecular basis for salinity tolerance, a proteomic approach has been applied for identification of salt-induced proteins in Dunaliella. Seventy-six salt-induced proteins were selected from two-dimensional gel separations of different subcellular fractions and analyzed by mass spectrometry (MS). Application of nanoelectrospray mass spectrometry, combined with sequence-similarity database-searching algorithms, MS BLAST and MultiTag, enabled identification of 80% of the salt-induced proteins. Salinity stress up-regulated key enzymes in the Calvin cycle, starch mobilization, and redox energy production; regulatory factors in protein biosynthesis and degradation; and a homolog of bacterial Na+-redox transporters. The results indicate that Dunaliella responds to high salinity by enhancement of photosynthetic CO2 assimilation and by diversion of carbon and energy resources for synthesis of glycerol, the osmotic element in Dunaliella. The ability of Dunaliella to enhance photosynthetic activity at high salinity is remarkable because, in most plants and cyanobacteria, salt stress inhibits photosynthesis. The results demonstrated the power of MS BLAST searches for the identification of proteins in organisms whose genomes are not known and paved the way for dissecting molecular mechanisms of salinity tolerance in algae and higher plants.
Affiliation(s)
- Adam J Liska
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
|
50
|
Wilke A, Rückert C, Bartels D, Dondrup M, Goesmann A, Hüser AT, Kespohl S, Linke B, Mahne M, McHardy A, Pühler A, Meyer F. Bioinformatics support for high-throughput proteomics. J Biotechnol 2004; 106:147-56. [PMID: 14651857 DOI: 10.1016/j.jbiotec.2003.08.009] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
Abstract
In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteome data. The rapid advancement of this technique in combination with other methods used in proteomics results in an increasing number of high-throughput projects. This leads to an increasing amount of data that needs to be archived and analyzed. To cope with the need for automated data conversion, storage, and analysis in the field of proteomics, the open source system ProDB was developed. The system handles data conversion from different mass spectrometer software, automates data analysis, and allows the annotation of MS spectra (e.g. assigning gene names, storing data on protein modifications). The system is based on an extensible relational database to store the mass spectra together with the experimental setup. It also provides a graphical user interface (GUI) for managing the experimental steps which led to the MS data. Furthermore, it allows the integration of genome and proteome data. Data from an ongoing experiment was used to compare manual and automated analysis. First tests showed that the automation resulted in a significant saving of time. Furthermore, the quality and interpretability of the results were improved in all cases.
Affiliation(s)
- Andreas Wilke
- Center for Genome Research, Bielefeld University, D-33594 Bielefeld, Germany.
|