1
Tabb DL, Jeong K, Druart K, Gant MS, Brown KA, Nicora C, Zhou M, Couvillion S, Nakayasu E, Williams JE, Peterson HK, McGuire MK, McGuire MA, Metz TO, Chamot-Rooke J. Comparing Top-Down Proteoform Identification: Deconvolution, PrSM Overlap, and PTM Detection. J Proteome Res 2023. [PMID: 37235544 DOI: 10.1021/acs.jproteome.2c00673] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 05/28/2023]
Abstract
Generating top-down tandem mass spectra (MS/MS) from complex mixtures of proteoforms benefits from improvements in fractionation, separation, fragmentation, and mass analysis. The algorithms to match MS/MS to sequences have undergone a parallel evolution, with both spectral alignment and match-counting approaches producing high-quality proteoform-spectrum matches (PrSMs). This study assesses state-of-the-art algorithms for top-down identification (ProSight PD, TopPIC, MSPathFinderT, and pTop) in their yield of PrSMs while controlling false discovery rate. We evaluated deconvolution engines (ThermoFisher Xtract, Bruker AutoMSn, Matrix Science Mascot Distiller, TopFD, and FLASHDeconv) in both ThermoFisher Orbitrap-class and Bruker maXis Q-TOF data (PXD033208) to produce consistent precursor charges and mass determinations. Finally, we sought post-translational modifications (PTMs) in proteoforms from bovine milk (PXD031744) and human ovarian tissue. Contemporary identification workflows produce excellent PrSM yields, although approximately half of all identified proteoforms from these four pipelines were specific to only one workflow. Deconvolution algorithms disagree on precursor masses and charges, contributing to identification variability. Detection of PTMs is inconsistent among algorithms. In bovine milk, 18% of PrSMs produced by pTop and TopMG were singly phosphorylated, but this percentage fell to 1% for one algorithm. Applying multiple search engines produces more comprehensive assessments of experiments. Top-down algorithms would benefit from greater interoperability.
Affiliation(s)
- David L Tabb
  - Université Paris Cité, Institut Pasteur, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris 75015, France
- Kyowon Jeong
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen 72076, Germany
- Karen Druart
  - Université Paris Cité, Institut Pasteur, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris 75015, France
- Megan S Gant
  - Université Paris Cité, Institut Pasteur, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris 75015, France
- Kyle A Brown
  - School of Medicine and Public Health, University of Wisconsin, Madison, Wisconsin 53705, United States
- Carrie Nicora
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Mowei Zhou
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Sneha Couvillion
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Ernesto Nakayasu
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Janet E Williams
  - Department of Animal, Veterinary, and Food Sciences, University of Idaho, Moscow, Idaho 83844, United States
- Haley K Peterson
  - Department of Animal, Veterinary, and Food Sciences, University of Idaho, Moscow, Idaho 83844, United States
- Michelle K McGuire
  - Margaret Ritchie School of Family and Consumer Sciences, University of Idaho, Moscow, Idaho 83844, United States
- Mark A McGuire
  - Department of Animal, Veterinary, and Food Sciences, University of Idaho, Moscow, Idaho 83844, United States
- Thomas O Metz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Julia Chamot-Rooke
  - Université Paris Cité, Institut Pasteur, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris 75015, France
2
Qin S, Tian Z. Proteoform Identification and Quantification Using Intact Protein Database Search Engine ProteinGoggle. Methods Mol Biol 2022; 2500:131-144. [PMID: 35657591 DOI: 10.1007/978-1-0716-2325-1_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Proteomics studies the proteome of organisms, especially proteins that are differentially expressed under certain physiological or pathological conditions; qualitative identification of protein sequences and posttranslational modifications (PTMs) and their positions can help us systematically understand the structure and function of proteoforms. With the development and relative popularity of soft ionization technology (such as electrospray ionization) and of high-resolution mass spectrometers with high mass measurement accuracy (such as the Orbitrap), the mass spectrometry (MS) characterization of complete proteins (so-called top-down proteomics) has become possible and has gradually become popular. Corresponding database search engines and protein identification bioinformatics tools have also been greatly developed. This chapter provides a brief overview of the intact protein database search algorithm "isotopic mass-to-charge ratio and envelope fingerprinting" and the search engine ProteinGoggle.
Affiliation(s)
- Suideng Qin
  - School of Chemical Science & Engineering and Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, Shanghai, China
- Zhixin Tian
  - School of Chemical Science & Engineering and Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, Shanghai, China
3
Zhong J, Sun Y, Xie M, Peng W, Zhang C, Wu FX, Wang J. Proteoform characterization based on top-down mass spectrometry. Brief Bioinform 2020; 22:1729-1750. [PMID: 32118252 DOI: 10.1093/bib/bbaa015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/01/2019] [Revised: 01/23/2020] [Indexed: 12/16/2022] Open
Abstract
Proteins are dominant executors of living processes. Compared to genetic variations, changes in the molecular structure and state of a protein (i.e. proteoforms) are more directly related to pathological changes in diseases. Characterizing proteoforms involves identifying and locating primary structure alterations (PSAs) in proteoforms, which is of practical importance for the advancement of the medical profession. With the development of mass spectrometry (MS) technology, the characterization of proteoforms based on top-down MS technology has become possible. This type of method is relatively new and faces many challenges. Since the proteoform identification is the most important process in characterizing proteoforms, we comprehensively review the existing proteoform identification methods in this study. Before identifying proteoforms, the spectra need to be preprocessed, and protein sequence databases can be filtered to speed up the identification. Therefore, we also summarize some popular deconvolution algorithms, various filtering algorithms for improving the proteoform identification performance and various scoring methods for localizing proteoforms. Moreover, commonly used methods were evaluated and compared in this review. We believe our review could help researchers better understand the current state of the development in this field and design new efficient algorithms for the proteoform characterization.
Affiliation(s)
- Jiancheng Zhong
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Yusui Sun
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Minzhu Xie
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Wei Peng
  - Kunming University of Science and Technology, Kunming, Yunnan, China
- Chushu Zhang
  - College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan, China
- Fang-Xiang Wu
  - College of Engineering and the Department of Computer Science at University of Saskatchewan, Saskatoon, Canada
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering at Central South University, Changsha, Hunan, China
4
Ghezellou P, Garikapati V, Kazemi SM, Strupat K, Ghassempour A, Spengler B. A perspective view of top-down proteomics in snake venom research. Rapid Commun Mass Spectrom 2019; 33 Suppl 1:20-27. [PMID: 30076652 DOI: 10.1002/rcm.8255] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 05/04/2018] [Revised: 07/25/2018] [Accepted: 07/29/2018] [Indexed: 06/08/2023]
Abstract
The venom produced by snakes contains complex mixtures of pharmacologically active proteins and peptides which play a crucial role in the pathophysiology of snakebite diseases. The deep understanding of venom proteomes can help to improve the treatment of this "neglected tropical disease" (as expressed by the World Health Organization [WHO]) and to develop new drugs. The most widely used technique for venom analysis is liquid chromatography/tandem mass spectrometry (LC/MS/MS)-based bottom-up (BU) proteomics. Considering the fact that multiple multi-locus gene families encode snake venom proteins, the major challenge for the BU proteomics is the limited sequence coverage and also the "protein inference problem" which result in a loss of information for the identification and characterization of toxin proteoforms (genetic variation, alternative mRNA splicing, single nucleotide polymorphism [SNP] and post-translational modifications [PTMs]). In contrast, intact protein measurements with top-down (TD) MS strategies cover almost complete protein sequences, and prove the ability to identify venom proteoforms and to localize their modifications and sequence variations.
Affiliation(s)
- Parviz Ghezellou
  - Institute of Inorganic and Analytical Chemistry, Justus Liebig University Giessen, Germany
  - Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran, Iran
- Seyed Mahdi Kazemi
  - Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran, Iran
- Alireza Ghassempour
  - Medicinal Plants and Drugs Research Institute, Shahid Beheshti University, Tehran, Iran
- Bernhard Spengler
  - Institute of Inorganic and Analytical Chemistry, Justus Liebig University Giessen, Germany
5
Spodzieja M, Rodziewicz-Motowidło S, Szymanska A. Hyphenated Mass Spectrometry Techniques in the Diagnosis of Amyloidosis. Curr Med Chem 2019; 26:104-120. [DOI: 10.2174/0929867324666171003113019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/04/2016] [Revised: 07/25/2016] [Accepted: 09/01/2016] [Indexed: 12/18/2022]
Abstract
Amyloidoses are a group of diseases caused by the extracellular deposition of proteins forming amyloid fibrils. Amyloidosis is classified according to the main protein or peptide that constitutes the amyloid fibrils. The most effective methods for the diagnosis of amyloidosis are based on mass spectrometry. Mass spectrometry enables confirmation of the identity of the protein precursor of amyloid fibrils in biological samples with very high sensitivity and specificity, which is crucial for proper amyloid typing. Because biological samples are very complex, mass spectrometry is usually coupled with techniques such as liquid chromatography or capillary electrophoresis, which enable the separation of proteins before MS analysis. Therefore, mass spectrometry constitutes an important part of the so-called “hyphenated techniques” combining, preferentially in-line, different analytical methods to provide comprehensive information about the studied problem. Hyphenated methods are very useful in the discovery of biomarkers in different types of amyloidosis. In systemic forms of amyloidosis, the analysis of aggregated proteins is usually performed on tissues obtained during a biopsy of an affected organ or of subcutaneous adipose tissue. In some cases, when a diagnostic biopsy is not possible because amyloid fibrils are formed in organs like the brain (Alzheimer’s disease), biomarkers present in body fluids can be studied instead. Currently, large-scale studies are performed to find and validate more effective biomarkers, which can be used in diagnostic procedures. We present the mass spectrometry-based methods used in the diagnosis of amyloidosis through the analysis of proteins occurring in tissues, blood and cerebrospinal fluid.
Affiliation(s)
- Marta Spodzieja
  - Department of Biomedical Chemistry, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Sylwia Rodziewicz-Motowidło
  - Department of Biomedical Chemistry, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Aneta Szymanska
  - Department of Biomedical Chemistry, Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
6
Kou Q, Wu S, Liu X. Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry. Proteomics 2018; 18. [PMID: 29327814 DOI: 10.1002/pmic.201700306] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/16/2017] [Revised: 11/20/2017] [Indexed: 01/19/2023]
Abstract
Complex proteoforms contain various primary structural alterations resulting from variations in genes, RNA, and proteins. Top-down mass spectrometry is commonly used for analyzing complex proteoforms because it provides whole sequence information of the proteoforms. Proteoform identification by top-down mass spectral database search is a challenging computational problem because the types and/or locations of some alterations in target proteoforms are in general unknown. Although spectral alignment and mass graph alignment algorithms have been proposed for identifying proteoforms with unknown alterations, they are extremely slow to align millions of spectra against tens of thousands of protein sequences in high throughput proteome level analyses. Many software tools in this area combine efficient protein sequence filtering algorithms and spectral alignment algorithms to speed up database search. As a result, the performance of these tools heavily relies on the sensitivity and efficiency of their filtering algorithms. Here, we propose two efficient approximate spectrum-based filtering algorithms for proteoform identification. We evaluated the performances of the proposed algorithms and four existing ones on simulated and real top-down mass spectrometry data sets. Experiments showed that the proposed algorithms outperformed the existing ones for complex proteoform identification. In addition, combining the proposed filtering algorithms and mass graph alignment algorithms identified many proteoforms missed by ProSightPC in proteome-level proteoform analyses.
Affiliation(s)
- Qiang Kou
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
7
Affiliation(s)
- Bifan Chen
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Kyle A. Brown
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ziqing Lin
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
8
Xiao K, Yu F, Tian Z. Top-down protein identification using isotopic envelope fingerprinting. J Proteomics 2016; 152:41-47. [PMID: 27989944 DOI: 10.1016/j.jprot.2016.10.010] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/29/2016] [Revised: 10/11/2016] [Accepted: 10/23/2016] [Indexed: 12/14/2022]
Abstract
For top-down protein database search and identification from tandem mass spectra, our isotopic envelope fingerprinting search algorithm and ProteinGoggle search engine have demonstrated their strength in efficiently resolving heavily overlapping data as well as separating non-ideal data with non-ideal isotopic envelopes from ideal ones with ideal isotopic envelopes. Here we report our updated ProteinGoggle 2.0 for intact protein database search at full capacity. The indispensable updates include users' optional definition of dynamic post-translational modifications and static chemical labeling during database creation, comprehensive dissociation methods and ion series, as well as a Proteoform Score for each proteoform. ProteinGoggle has previously been benchmarked with both collision-based dissociation (CID, HCD) and electron-based dissociation (ETD) data of either intact proteins or intact proteomes. Here we report our further benchmarking of the new version of ProteinGoggle with publicly available photon-based dissociation (UVPD) data (http://hdl.handle.net/2022/17316) of intact E. coli ribosomal proteins. Biological significance: Protein species (aka proteoforms) function at their molecular level, and diverse structures and biological roles of every proteoform come from often co-occurring proteolysis, amino acid variation and post-translational modifications. Complete and high-throughput capture of this combinatorial information of proteoforms has become possible in evolving top-down proteomics; yet, various methods and technologies, especially database search and bioinformatics identification tools, in the top-down pipeline are still in their infancy and demand intensive research and development.
Affiliation(s)
- Kaijie Xiao
  - School of Chemical Science and Engineering, Tongji University, Shanghai, China; Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, Shanghai, China
- Fan Yu
  - School of Chemical Science and Engineering, Tongji University, Shanghai, China; Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, Shanghai, China
- Zhixin Tian
  - School of Chemical Science and Engineering, Tongji University, Shanghai, China; Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, Shanghai, China
9
Kou Q, Zhu B, Wu S, Ansong C, Tolić N, Paša-Tolić L, Liu X. Characterization of Proteoforms with Unknown Post-translational Modifications Using the MIScore. J Proteome Res 2016; 15:2422-32. [PMID: 27291504 DOI: 10.1021/acs.jproteome.5b01098] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/30/2022]
Abstract
Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome. Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization is to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform-spectrum matches still relies mainly on manual annotation. We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.
Affiliation(s)
- Qiang Kou
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States
- Binhai Zhu
  - Department of Computer Science, Montana State University, Bozeman, Montana 59717, United States
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202-5122, United States
10
Sun RX, Luo L, Wu L, Wang RM, Zeng WF, Chi H, Liu C, He SM. pTop 1.0: A High-Accuracy and High-Efficiency Search Engine for Intact Protein Identification. Anal Chem 2016; 88:3082-90. [DOI: 10.1021/acs.analchem.5b03963] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 01/28/2023]
Affiliation(s)
- Rui-Xiang Sun
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Lan Luo
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Long Wu
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Rui-Min Wang
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Wen-Feng Zeng
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Hao Chi
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Chao Liu
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Si-Min He
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
11
Patrie SM. Top-Down Mass Spectrometry: Proteomics to Proteoforms. Adv Exp Med Biol 2016; 919:171-200. [PMID: 27975217 DOI: 10.1007/978-3-319-41448-5_8] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022]
Abstract
This chapter highlights many of the fundamental concepts and technologies in the field of top-down mass spectrometry (TDMS), and provides numerous examples of contributions that TD is making in biology, biophysics, and clinical investigations. TD workflows include variegated steps that may include non-specific or targeted preparative strategies, orthogonal liquid chromatography techniques, analyte ionization, mass analysis, tandem mass spectrometry (MS/MS) and informatics procedures. This diversity of experimental designs has evolved to manage the large dynamic range of protein expression and diverse physiochemical properties of proteins in proteome investigations, tackle proteoform microheterogeneity, as well as determine structure and composition of gas-phase proteins and protein assemblies.
Affiliation(s)
- Steven M Patrie
  - Computational and Systems Biology & Biomedical Engineering Graduate Programs, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX, USA
12
Accurate and Efficient Resolution of Overlapping Isotopic Envelopes in Protein Tandem Mass Spectra. Sci Rep 2015; 5:14755. [PMID: 26439836 PMCID: PMC4593959 DOI: 10.1038/srep14755] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/25/2015] [Accepted: 09/09/2015] [Indexed: 12/03/2022] Open
Abstract
It has long been an analytical challenge to accurately and efficiently resolve extremely dense overlapping isotopic envelopes (OIEs) in protein tandem mass spectra to confidently identify proteins. Here, we report a computationally efficient method, called OIE_CARE, to resolve OIEs by calculating the relative deviation between the ideal and observed experimental abundance. In the OIE_CARE method, the ideal experimental abundance of a particular overlapping isotopic peak (OIP) is first calculated for all the OIEs sharing this OIP. The relative deviation (RD) of the overall observed experimental abundance of this OIP relative to the summed ideal value is then calculated. The final individual abundance of the OIP for each OIE is the individual ideal experimental abundance multiplied by 1 + RD. Initial studies were performed using higher-energy collisional dissociation tandem mass spectra on myoglobin (with direct infusion) and the intact E. coli proteome (with liquid chromatographic separation). Comprehensive data at the protein and proteome levels, high confidence and good reproducibility were achieved. The resolving method reported here can, in principle, be extended to resolve any envelope-type overlapping data for which the corresponding theoretical reference values are available.
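The apportioning rule described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published OIE_CARE implementation: the function name is hypothetical, the relative deviation is assumed to be (observed - summed ideal) / summed ideal, and the ideal abundances are taken as given inputs because the abstract does not say how they are computed.

```python
def resolve_overlapping_peak(observed_abundance, ideal_abundances):
    """Split one overlapping isotopic peak (OIP) among the envelopes sharing it.

    observed_abundance: measured abundance of the shared peak.
    ideal_abundances: ideal experimental abundance of this peak for each
        overlapping isotopic envelope (assumed to be supplied by the caller).
    Returns (rd, resolved), where resolved[i] = ideal_abundances[i] * (1 + rd).
    """
    total_ideal = sum(ideal_abundances)
    if total_ideal <= 0:
        raise ValueError("summed ideal abundance must be positive")
    # Assumed sign convention: positive RD means more signal was observed
    # than the summed ideal value predicts.
    rd = (observed_abundance - total_ideal) / total_ideal
    resolved = [ideal * (1 + rd) for ideal in ideal_abundances]
    return rd, resolved


if __name__ == "__main__":
    # Two envelopes share a peak; the ideal values sum to 100 but 120 was observed.
    rd, parts = resolve_overlapping_peak(120.0, [60.0, 40.0])
    print(rd, parts)
```

Under this convention the resolved abundances always sum to the observed abundance, so each envelope receives a share proportional to its ideal value.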
13
Gregorich ZR, Ge Y. Top-down proteomics in health and disease: challenges and opportunities. Proteomics 2014; 14:1195-210. [PMID: 24723472 DOI: 10.1002/pmic.201300432] [Citation(s) in RCA: 147] [Impact Index Per Article: 14.7] [Received: 09/30/2013] [Revised: 03/10/2014] [Accepted: 03/24/2014] [Indexed: 01/06/2023]
Abstract
Proteomics is essential for deciphering how molecules interact as a system and for understanding the functions of cellular systems in human disease; however, the unique characteristics of the human proteome, which include a high dynamic range of protein expression and extreme complexity due to a plethora of PTMs and sequence variations, make such analyses challenging. An emerging "top-down" MS-based proteomics approach, which provides a "bird's eye" view of all proteoforms, has unique advantages for the assessment of PTMs and sequence variations. Recently, a number of studies have showcased the potential of top-down proteomics for the unraveling of disease mechanisms and discovery of new biomarkers. Nevertheless, the top-down approach still faces significant challenges in terms of protein solubility, separation, and the detection of large intact proteins, as well as underdeveloped data analysis tools. Consequently, new technological developments are urgently needed to advance the field of top-down proteomics. Herein, we intend to provide an overview of the recent applications of top-down proteomics in biomedical research. Moreover, we will outline the challenges and opportunities facing top-down proteomics strategies aimed at understanding and diagnosing human diseases.
Affiliation(s)
- Zachery R Gregorich
  - Molecular and Cellular Pharmacology Training Program, University of Wisconsin-Madison, Madison, WI, USA; Department of Cell and Regenerative Biology, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
14
Kou Q, Wu S, Liu X. A new scoring function for top-down spectral deconvolution. BMC Genomics 2014; 15:1140. [PMID: 25523396 PMCID: PMC4378558 DOI: 10.1186/1471-2164-15-1140] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/16/2014] [Accepted: 12/11/2014] [Indexed: 12/04/2022] Open
Abstract
Background: Top-down mass spectrometry plays an important role in intact protein identification and characterization. Top-down mass spectra are more complex than bottom-up mass spectra because they often contain many isotopomer envelopes from highly charged ions, which may overlap with one another. As a result, spectral deconvolution, which converts a complex top-down mass spectrum into a monoisotopic mass list, is a key step in top-down spectral interpretation. Results: In this paper, we propose a new scoring function, L-score, for evaluating isotopomer envelopes. By combining L-score with MS-Deconv, a new software tool, MS-Deconv+, was developed for top-down spectral deconvolution. Experimental results showed that MS-Deconv+ outperformed existing software tools in top-down spectral deconvolution. Conclusions: L-score shows high discriminative ability in identification of isotopomer envelopes. Using L-score, MS-Deconv+ reports many correct monoisotopic masses missed by other software tools, which are valuable for proteoform identification and characterization.
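As a rough illustration of what converting an isotopomer envelope into a mass entails, the sketch below infers a charge state from the spacing between adjacent isotopic peaks and converts an m/z value to a neutral mass. It is not MS-Deconv, MS-Deconv+, or the L-score. The function names are hypothetical, and the sketch assumes positive-mode protonated ions and a clean, non-overlapping envelope.

```python
PROTON_MASS = 1.007276      # Da, mass of a proton
ISOTOPE_SPACING = 1.003355  # Da, approximate 13C - 12C mass difference


def charge_from_spacing(mz_spacing):
    """Estimate the charge from the m/z gap between adjacent isotopic peaks."""
    if mz_spacing <= 0:
        raise ValueError("peak spacing must be positive")
    # Isotopic peaks of an ion with charge z sit roughly 1.003355 / z apart.
    return max(1, round(ISOTOPE_SPACING / mz_spacing))


def neutral_mass(mz, charge):
    """Convert the m/z of an [M + zH]z+ ion to the neutral mass M."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    z = charge_from_spacing(0.1003)   # peaks about 0.1 m/z apart
    print(z, neutral_mass(1000.0, z))
```

Passing the m/z of the monoisotopic peak yields the monoisotopic mass. Real deconvolution tools must also decide which peaks belong to which envelope, and that is the step a scoring function for candidate envelopes addresses.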
Affiliation(s)
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 535 W, Michigan Street, Indianapolis, IN 46202, USA
15
Han X, Wang Y, Aslanian A, Fonslow B, Graczyk B, Davis TN, Yates JR. In-line separation by capillary electrophoresis prior to analysis by top-down mass spectrometry enables sensitive characterization of protein complexes. J Proteome Res 2014; 13:6078-86. [PMID: 25382489 PMCID: PMC4262260 DOI: 10.1021/pr500971h] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 12/31/2022]
Abstract
Intact protein analysis via top-down mass spectrometry (MS) provides a bird’s eye view over the protein complexes and complex protein mixtures with the unique capability of characterizing protein variants, splice isoforms, and combinatorial post-translational modifications (PTMs). Here we applied capillary electrophoresis (CE) through a sheathless CE–electrospray ionization interface coupled to an LTQ Velos Orbitrap Elite mass spectrometer to analyze the Dam1 complex from Saccharomyces cerevisiae. We achieved a 100-fold increase in sensitivity compared to a reversed-phase liquid chromatography coupled MS analysis of recombinant Dam1 complex with a total loading of 2.5 ng (12 amol). N-terminal processing forms of individual subunits of the Dam1 complex were observed as well as their phosphorylation stoichiometry upon Mps1p kinase treatment.
Affiliation(s)
- Xuemei Han
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
16
Han X, Wang Y, Aslanian A, Bern M, Lavallée-Adam M, Yates JR. Sheathless capillary electrophoresis-tandem mass spectrometry for top-down characterization of Pyrococcus furiosus proteins on a proteome scale. Anal Chem 2014; 86:11006-12. [PMID: 25346219 PMCID: PMC4238646 DOI: 10.1021/ac503439n] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Indexed: 01/17/2023]
Abstract
Intact protein analysis via top-down mass spectrometry (MS) provides the unique capability of fully characterizing protein isoforms and combinatorial post-translational modifications (PTMs) compared to the bottom-up MS approach. Front-end protein separation poses a challenge for analyzing complex mixtures of intact proteins on a proteomic scale. Here we applied capillary electrophoresis (CE) through a sheathless capillary electrophoresis-electrospray ionization (CESI) interface coupled to an Orbitrap Elite mass spectrometer to profile the proteome from Pyrococcus furiosus. CESI-top-down MS analysis of Pyrococcus furiosus cell lysate identified 134 proteins and 291 proteoforms with a total sample consumption of 270 ng in 120 min of total analysis time. Truncations and various PTMs were detected, including acetylation, disulfide bonds, oxidation, glycosylation, and hypusine. This is the largest scale analysis of intact proteins by CE-top-down MS to date.
Affiliation(s)
- Xuemei Han
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
17
Catherman AD, Skinner OS, Kelleher NL. Top Down proteomics: facts and perspectives. Biochem Biophys Res Commun 2014; 445:683-93. [PMID: 24556311 PMCID: PMC4103433 DOI: 10.1016/j.bbrc.2014.02.041] [Citation(s) in RCA: 315] [Impact Index Per Article: 31.5] [Received: 02/04/2014] [Accepted: 02/10/2014] [Indexed: 12/29/2022]
Abstract
The rise of the "Top Down" method in the field of mass spectrometry-based proteomics has ushered in a new age of promise and challenge for the characterization and identification of proteins. Injecting intact proteins into the mass spectrometer allows for better characterization of post-translational modifications and avoids several of the serious "inference" problems associated with peptide-based proteomics. However, successful implementation of a Top Down approach to endogenous or other biologically relevant samples often requires the use of one or more forms of separation prior to mass spectrometric analysis, which have only begun to mature for whole protein MS. Recent advances in instrumentation have been used in conjunction with new ion fragmentation using photons and electrons that allow for better (and often complete) protein characterization on cases simply not tractable even just a few years ago. Finally, the use of native electrospray mass spectrometry has shown great promise for the identification and characterization of whole protein complexes in the 100 kDa to 1 MDa regime, with prospects for complete compositional analysis for endogenous protein assemblies a viable goal over the coming few years.
Affiliation(s)
- Adam D Catherman
- Departments of Chemistry and Molecular Biosciences, The Chemistry of Life Processes Institute, The Proteomics Center of Excellence, The Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Evanston, IL 60208, United States
- Owen S Skinner
- Departments of Chemistry and Molecular Biosciences, The Chemistry of Life Processes Institute, The Proteomics Center of Excellence, The Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Evanston, IL 60208, United States
- Neil L Kelleher
- Departments of Chemistry and Molecular Biosciences, The Chemistry of Life Processes Institute, The Proteomics Center of Excellence, The Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Evanston, IL 60208, United States.
18
Abstract
Background: In mass spectrometry-based proteomics, the statistical significance of a peptide-spectrum or protein-spectrum match is an important indicator of the correctness of the peptide or protein identification. In bottom-up mass spectrometry, probabilistic models, such as the generating function method, have been successfully applied to compute the statistical significance of peptide-spectrum matches for short peptides containing no post-translational modifications. As top-down mass spectrometry, which often identifies intact proteins with post-translational modifications, becomes available in many laboratories, the estimation of statistical significance of top-down protein identification results has come into great demand. Results: In this paper, we study an extended generating function method for accurately computing the statistical significance of protein-spectrum matches with post-translational modifications. Experiments show that the extended generating function method achieves high accuracy in computing spectral probabilities and false discovery rates. Conclusions: The extended generating function method is a non-trivial extension of the generating function method for bottom-up mass spectrometry. It can be used to choose the correct protein-spectrum match from several candidate protein-spectrum matches for a spectrum, as well as separate correct protein-spectrum matches from incorrect ones identified from a large number of tandem mass spectra.
19
Zhang Z, Wu S, Stenoien DL, Paša-Tolić L. High-throughput proteomics. Annu Rev Anal Chem (Palo Alto Calif) 2014; 7:427-454. [PMID: 25014346 DOI: 10.1146/annurev-anchem-071213-020216] [Citation(s) in RCA: 168] [Impact Index Per Article: 16.8] [Indexed: 06/03/2023]
Abstract
Mass spectrometry (MS)-based high-throughput proteomics is the core technique for large-scale protein characterization. Due to the extreme complexity of proteomes, sophisticated separation techniques and advanced MS instrumentation have been developed to extend coverage and enhance dynamic range and sensitivity. In this review, we discuss the separation and prefractionation techniques applied for large-scale analysis in both bottom-up (i.e., peptide-level) and top-down (i.e., protein-level) proteomics. Different approaches for quantifying peptides or intact proteins, including label-free and stable-isotope-labeling strategies, are also discussed. In addition, we present a brief overview of different types of mass analyzers and fragmentation techniques as well as selected emerging techniques.
20
Perez-Riverol Y, Wang R, Hermjakob H, Müller M, Vesada V, Vizcaíno JA. Open source libraries and frameworks for mass spectrometry based proteomics: a developer's perspective. Biochim Biophys Acta 2014; 1844:63-76. [PMID: 23467006 PMCID: PMC3898926 DOI: 10.1016/j.bbapap.2013.02.032] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 10/01/2012] [Revised: 02/05/2013] [Accepted: 02/22/2013] [Indexed: 12/23/2022]
Abstract
Data processing, management and visualization are central and critical components of a state of the art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from various proteomics search engines output files, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries, open-source frameworks and also, we give information on some of the freely available applications which make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Affiliation(s)
- Yasset Perez-Riverol
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Rui Wang
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Henning Hermjakob
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Markus Müller
- Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU - 1, rue Michel Servet CH-1211 Geneva, Switzerland
- Vladimir Vesada
- Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Juan Antonio Vizcaíno
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
21
22
Ahlf DR, Thomas PM, Kelleher NL. Developing top down proteomics to maximize proteome and sequence coverage from cells and tissues. Curr Opin Chem Biol 2013; 17:787-94. [PMID: 23988518 DOI: 10.1016/j.cbpa.2013.07.028] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 02/26/2013] [Revised: 07/01/2013] [Accepted: 07/29/2013] [Indexed: 12/25/2022]
Abstract
Mass spectrometry based proteomics generally seeks to identify and characterize protein molecules with high accuracy and throughput. Recent speed and quality improvements to the independent steps of integrated platforms have removed many limitations to the robust implementation of top down proteomics (TDP) for proteins below 70 kDa. Improved intact protein separations coupled to high-performance instruments have increased the quality and number of protein and proteoform identifications. To date, TDP applications have shown >1000 protein identifications, expanding to an average of ∼3-4 more proteoforms for each protein detected. In the near future, increased fractionation power, new mass spectrometers and improvements in proteoform scoring will combine to accelerate the application and impact of TDP to this century's biomedical problems.
Affiliation(s)
- Dorothy R Ahlf
- Department of Chemistry and Biochemistry and the Harper Cancer Institute, University of Notre Dame, Notre Dame, IN, United States
23
Li L, Tian Z. Interpreting raw biological mass spectra using isotopic mass-to-charge ratio and envelope fingerprinting. Rapid Commun Mass Spectrom 2013; 27:1267-1277. [PMID: 23650040 DOI: 10.1002/rcm.6565] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 12/24/2012] [Revised: 02/19/2013] [Accepted: 03/09/2013] [Indexed: 06/02/2023]
Abstract
RATIONALE: Soft ionization, high-resolution mass spectrometry is widely used to characterize large biological molecules, such as proteins. Deconvolution ('deisotoping') of isotopic envelopes (iEs) in biological mass spectra into monoisotopic or average masses is challenging due to low signals and heavily overlapped iEs, resulting in many wrong interpretations. METHODS: Isotopic envelopes (iEs) are directly used without deisotoping to identify biological molecules. An algorithm, isotopic mass-to-charge ratio (m/z) and envelope fingerprinting (iMEF), was implemented in the ProteinGoggle search engine for top-down intact protein database searching. iMEF combines isotopic m/z fingerprinting (iMF) and isotopic envelope fingerprinting (iEF), where 'isotopic mass-to-charge ratio' means the m/z value of the most abundant isotopic peak within the iE of a precursor or product ion. iMF is used to 'fish' precursor or product ion candidates from the database, which is pre-built and contains all iE information (precursor and product ions) of all proteoforms of the studied system. iEF identifies matching precursor or product ions. A protein is finally identified with a user-specified total number of matching product ions and post-translational modification scores. RESULTS: The working principles of iMEF and ProteinGoggle, and the definition of a set of related parameters and scoring metrics, are illustrated with high-resolution tandem mass spectrometric analysis of a mixture of ubiquitin and the human histone H4 proteoforms. Ubiquitin was confidently identified from its CID, ETD, and HCD spectra with 57, 91, and 66 matching product ions, respectively; 125 proteoforms were confidently found from the H4 dataset. The locations of PTMs in 54 and 6 isoforms were partially and fully identified, respectively. CONCLUSIONS: Database search with iMEF bypasses 'deisotoping', avoiding associated errors, and also provides full quality control of matching precursor and product ions and finally protein IDs. Overlapped iEs of different product ions could also be confidently unwrapped in situ. Improvement and addition of more functionalities and utilities of ProteinGoggle are underway.
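The "isotopic m/z" lookup described in this abstract can be sketched in a few lines. This is only an illustration of the idea, not ProteinGoggle's implementation: the function names, the dictionary-based database, and the 10 ppm default tolerance are all assumptions made for the example.

```python
# Minimal sketch of isotopic m/z fingerprinting (iMF); all names are hypothetical.

def isotopic_mz(envelope):
    """Return the m/z of the most abundant peak in an envelope of (mz, intensity) pairs."""
    if not envelope:
        raise ValueError("empty envelope")
    return max(envelope, key=lambda peak: peak[1])[0]


def fish_candidates(observed_mz, database, ppm=10.0):
    """Return names of database ions whose theoretical isotopic m/z lies within ppm."""
    return sorted(
        name
        for name, theoretical_mz in database.items()
        if abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6 <= ppm
    )


if __name__ == "__main__":
    envelope = [(856.87, 20.0), (856.97, 55.0), (857.07, 100.0), (857.17, 80.0)]
    database = {"ion_a": 857.0705, "ion_b": 857.2, "ion_c": 857.069}
    print(fish_candidates(isotopic_mz(envelope), database))
```

In the workflow the abstract describes, candidates fished this way would then be checked against the full envelope (the iEF step), which this sketch leaves out.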
Affiliation(s)
- Li Li
- State Key Laboratory of Molecular Reaction Dynamics, Dalian National Laboratory for Clean Energy, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Liaoning, China
24
Top-down proteomics reveals a unique protein S-thiolation switch in Salmonella Typhimurium in response to infection-like conditions. Proc Natl Acad Sci U S A 2013; 110:10153-8. [PMID: 23720318 DOI: 10.1073/pnas.1221210110] [Citation(s) in RCA: 128] [Impact Index Per Article: 11.6] [Indexed: 01/24/2023]
Abstract
Characterization of the mature protein complement in cells is crucial for a better understanding of cellular processes on a systems-wide scale. Toward this end, we used single-dimension ultra-high-pressure liquid chromatography mass spectrometry to investigate the comprehensive "intact" proteome of the Gram-negative bacterial pathogen Salmonella Typhimurium. Top-down proteomics analysis revealed 563 unique proteins including 1,665 proteoforms generated by posttranslational modifications (PTMs), representing the largest microbial top-down dataset reported to date. We confirmed many previously recognized aspects of Salmonella biology and bacterial PTMs, and our analysis also revealed several additional biological insights. Of particular interest was differential utilization of the protein S-thiolation forms S-glutathionylation and S-cysteinylation in response to infection-like conditions versus basal conditions. This finding of a S-glutathionylation-to-S-cysteinylation switch in a condition-specific manner was corroborated by bottom-up proteomics data and further by changes in corresponding biosynthetic pathways under infection-like conditions and during actual infection of host cells. This differential utilization highlights underlying metabolic mechanisms that modulate changes in cellular signaling, and represents a report of S-cysteinylation in Gram-negative bacteria. Additionally, the functional relevance of these PTMs was supported by protein structure and gene deletion analyses. The demonstrated utility of our simple proteome-wide intact protein level measurement strategy for gaining biological insight should promote broader adoption and applications of top-down proteomics approaches.
25
Top-Down Characterization of the Post-Translationally Modified Intact Periplasmic Proteome from the Bacterium Novosphingobium aromaticivorans. Int J Proteomics 2013; 2013:279590. [PMID: 23555055 PMCID: PMC3608174 DOI: 10.1155/2013/279590] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 10/07/2012] [Revised: 01/31/2013] [Accepted: 02/04/2013] [Indexed: 11/17/2022]
Abstract
The periplasm of Gram-negative bacteria is a dynamic and physiologically important subcellular compartment where the constant exposure to potential environmental insults amplifies the need for proper protein folding and modifications. Top-down proteomics analysis of the periplasmic fraction at the intact protein level provides unrestricted characterization and annotation of the periplasmic proteome, including the post-translational modifications (PTMs) on these proteins. Here, we used single-dimension ultra-high pressure liquid chromatography coupled with the Fourier transform mass spectrometry (FTMS) to investigate the intact periplasmic proteome of Novosphingobium aromaticivorans. Our top-down analysis provided the confident identification of 55 proteins in the periplasm and characterized their PTMs including signal peptide removal, N-terminal methionine excision, acetylation, glutathionylation, pyroglutamate, and disulfide bond formation. This study provides the first experimental evidence for the expression and periplasmic localization of many hypothetical and uncharacterized proteins and the first unrestrictive, large-scale data on PTMs in the bacterial periplasm.
26
Abstract
Glycosylation is increasingly recognized as a common and biologically significant post-translational modification of proteins. Modern mass spectrometry methods offer the best ways to characterize the glycosylation state of proteins. Both glycobiology and mass spectrometry rely on specialized nomenclature, techniques, and knowledge, which pose a barrier to entry by the nonspecialist. This introductory chapter provides an overview of the fundamentals of glycobiology, mass spectrometry methods, and the intersection of the two fields. Foundational material included in this chapter includes a description of the biological process of glycosylation, an overview of typical glycoproteomics workflows, a description of mass spectrometry ionization methods and instrumentation, and an introduction to bioinformatics resources. In addition to providing an orientation to the contents of the other chapters of this volume, this chapter cites other important works of potential interest to the practitioner. This overview, combined with the state-of-the-art protocols contained within this volume, provides a foundation for both glycobiologists and mass spectrometrists seeking to bridge the two fields.
Affiliation(s)
- Steven M Patrie
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA.
27
Lanucara F, Eyers CE. Top-down mass spectrometry for the analysis of combinatorial post-translational modifications. Mass Spectrom Rev 2013; 32:27-42. [PMID: 22718314 DOI: 10.1002/mas.21348] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 11/22/2011] [Revised: 02/21/2012] [Accepted: 02/21/2012] [Indexed: 06/01/2023]
Abstract
Protein post-translational modifications (PTMs) are critically important in regulating both protein structure and function, often in a rapid and reversible manner. Due to its sensitivity and vast applicability, mass spectrometry (MS) has become the technique of choice for analyzing PTMs. Whilst the "bottom-up" analytical approach, in which proteins are proteolyzed generating peptides for analysis by MS, is routinely applied and offers some advantages in terms of ease of analysis and lower limit of detection, "top-down" MS, describing the analysis of intact proteins, yields unique and highly valuable information on the connectivity and therefore combinatorial effect of multiple PTMs in the same polypeptide chain. In this review, the state of the art in top-down MS will be discussed, covering the main instrumental platforms and ion activation techniques. Moreover, the way that this approach can be used to gain insights on the combinatorial effect of multiple post-translational modifications and how this information can assist in studying physiologically relevant systems at the molecular level will also be addressed.
Affiliation(s)
- Francesco Lanucara
- Michael Barber Centre for Mass Spectrometry, School of Chemistry, University of Manchester, Manchester Interdisciplinary Biocentre, Manchester M1 7DN, UK
28
Zhang H, Ge Y. Comprehensive analysis of protein modifications by top-down mass spectrometry. Circ Cardiovasc Genet 2012; 4:711. [PMID: 22187450 DOI: 10.1161/circgenetics.110.957829] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Indexed: 12/28/2022]
Abstract
Mass spectrometry (MS)-based proteomics is playing an increasingly important role in cardiovascular research. Proteomics includes identification and quantification of proteins and the characterization of protein modifications, such as posttranslational modifications and sequence variants. The conventional bottom-up approach, involving proteolytic digestion of proteins into small peptides before MS analysis, is routinely used for protein identification and quantification with high throughput and automation. Nevertheless, it has limitations in the analysis of protein modifications, mainly because of the partial sequence coverage and loss of connections among modifications on disparate portions of a protein. An alternative approach, top-down MS, has emerged as a powerful tool for the analysis of protein modifications. The top-down approach analyzes whole proteins directly, providing a "bird's-eye" view of all existing modifications. Subsequently, each modified protein form can be isolated and fragmented in the mass spectrometer to locate the modification site. The incorporation of the nonergodic dissociation methods, such as electron-capture dissociation (ECD), greatly enhances the top-down capabilities. ECD is especially useful for mapping labile posttranslational modifications that are well preserved during the ECD fragmentation process. Top-down MS with ECD has been successfully applied to cardiovascular research, with the unique advantages in unraveling the molecular complexity, quantifying modified protein forms, complete mapping of modifications with full-sequence coverage, discovering unexpected modifications, identifying and quantifying positional isomers, and determining the order of multiple modifications. Nevertheless, top-down MS still needs to overcome some technical challenges to realize its full potential. Herein, we reviewed the advantages and challenges of the top-down method, with a focus on its application in cardiovascular research.
Affiliation(s)
- Han Zhang
- Department of Physiology, School of Medicine and Public Health, University of Wisconsin-Madison, USA
29
Angel TE, Aryal UK, Hengel SM, Baker ES, Kelly RT, Robinson EW, Smith RD. Mass spectrometry-based proteomics: existing capabilities and future directions. Chem Soc Rev 2012; 41:3912-28. [PMID: 22498958 DOI: 10.1039/c2cs15331a] [Citation(s) in RCA: 256] [Impact Index Per Article: 21.3] [Indexed: 12/14/2022]
Abstract
Mass spectrometry (MS)-based proteomics is emerging as a broadly effective means for identification, characterization, and quantification of proteins that are integral components of the processes essential for life. Characterization of proteins at the proteome and sub-proteome (e.g., the phosphoproteome, proteoglycome, or degradome/peptidome) levels provides a foundation for understanding fundamental aspects of biology. Emerging technologies such as ion mobility separations coupled with MS and microchip-based-proteome measurements combined with MS instrumentation and chromatographic separation techniques, such as nanoscale reversed phase liquid chromatography and capillary electrophoresis, show great promise for both broad undirected and targeted highly sensitive measurements. MS-based proteomics increasingly contribute to our understanding of the dynamics, interactions, and roles that proteins and peptides play, advancing our understanding of biology on a systems wide level for a wide range of applications including investigations of microbial communities, bioremediation, and human health.
Affiliation(s)
- Thomas E Angel
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
30
Panchaud A, Affolter M, Kussmann M. Mass spectrometry for nutritional peptidomics: How to analyze food bioactives and their health effects. J Proteomics 2011; 75:3546-59. [PMID: 22227401 DOI: 10.1016/j.jprot.2011.12.022] [Citation(s) in RCA: 111] [Impact Index Per Article: 8.5] [Received: 08/05/2011] [Revised: 12/13/2011] [Accepted: 12/14/2011] [Indexed: 01/24/2023]
Abstract
We describe nutritional peptidomics for discovery and validation of bioactive food peptides and their health effects. Understanding the nature and bioactivity of nutritional peptides means comprehending an important level of environmental regulation of the human genome, because diet is the environmental factor with the most profound life-long influence on health. We approach the theme from three angles, namely the analysis, the discovery and the biology perspectives. Food peptides derive from parent food proteins via in vitro hydrolysis (processing) or in vivo digestion by various unspecific and specific proteases, as opposed to the tryptic peptides typically generated in biomarker proteomics. A food bioactive peptide may be rare or unique in terms of sequence and modification, and many food genomes are less well annotated than e.g. the human genome. Bioactive peptides can be discovered either empirically or by prediction: we explain both the classical hydrolysis strategy and the bioinformatics-driven reversed genome engineering. In order to exert bioactivity, food peptides must be either ingested and then reach the intestine in their intact form or be liberated in situ from their parent proteins to act locally, that is in the gut, or even systemically, i.e. through the blood stream. This article is part of a Special Section entitled: Understanding genome regulation and genetic diversity by mass spectrometry.
Affiliation(s)
- Alexandre Panchaud
- Functional Genomics Group, Nestlé Research Centre, Lausanne, Switzerland
31
Mazur MT, Fyhr R. An algorithm for identifying multiply modified endogenous proteins using both full-scan and high-resolution tandem mass spectrometric data. Rapid Commun Mass Spectrom 2011; 25:3617-3626. [PMID: 22095511 DOI: 10.1002/rcm.5257] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/31/2023]
Abstract
Mass spectrometry based proteomic experiments have advanced considerably over the past decade with high-resolution and mass accuracy tandem mass spectrometry (MS/MS) capabilities now allowing routine interrogation of large peptides and proteins. Often a major bottleneck to 'top-down' proteomics, however, is the ability to identify and characterize the complex peptides or proteins based on the acquired high-resolution MS/MS spectra. For biological samples containing proteins with multiple unpredicted processing events, unsupervised identifications can be particularly challenging. Described here is a newly created search algorithm (MAR) designed for the identification of experimentally detected peptides or proteins. This algorithm relies only on a predefined list of 'differential' modifications (e.g. phosphorylation) and a FASTA-formatted protein database, and is not constrained to full-length proteins for identification. The algorithm is further powered by the ability to leverage identified mass differences between chromatographically separated ions within full-scan MS spectra to automatically generate a list of likely 'differential' modifications to be searched. The utility of the algorithm is demonstrated with the identification of 54 unique polypeptides from human apolipoprotein enriched from the high-density lipoprotein particle (HDL), and searching time benchmarks demonstrate scalability (12 high-resolution MS/MS scans searched per minute with modifications considered). This parallelizable algorithm provides an additional solution for converting high-quality MS/MS data of multiply processed proteins into reliable identifications.
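The idea of turning observed mass differences into a list of likely modifications can be illustrated with a small sketch. It is not the MAR algorithm itself: the function name, the four-entry modification table, and the 0.01 Da tolerance are assumptions chosen for the example, and only the monoisotopic mass shifts are standard values.

```python
# Minimal sketch: propose 'differential' modifications from pairwise mass differences.
from itertools import combinations

# Monoisotopic mass shifts in Da for a few common modifications (illustrative subset).
KNOWN_MODS = {
    "phosphorylation": 79.966331,
    "acetylation": 42.010565,
    "oxidation": 15.994915,
    "methylation": 14.015650,
}


def candidate_modifications(neutral_masses, tol_da=0.01):
    """Return modification names matching any pairwise difference of observed masses."""
    found = set()
    for low, high in combinations(sorted(neutral_masses), 2):
        delta = high - low
        for name, shift in KNOWN_MODS.items():
            if abs(delta - shift) <= tol_da:
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical neutral masses from one full-scan MS experiment.
    print(candidate_modifications([10000.0, 10042.0106, 10079.9663]))
```

Restricting the search to modifications suggested by the full-scan data keeps the candidate list short, which is the practical benefit the abstract points to.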
Affiliation(s)
- Matthew T Mazur
- Department of Proteomics, Merck & Co., Inc., 126 E. Lincoln Avenue, P.O. Box 2000, Rahway, NJ 07065, USA
32
Kellie JF, Catherman AD, Durbin KR, Tran JC, Tipton JD, Norris JL, Witkowski CE, Thomas PM, Kelleher NL. Robust analysis of the yeast proteome under 50 kDa by molecular-mass-based fractionation and top-down mass spectrometry. Anal Chem 2011; 84:209-15. [PMID: 22103811 DOI: 10.1021/ac202384v] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Indexed: 11/28/2022]
Abstract
As the process of top-down mass spectrometry continues to mature, we benchmark the next installment of an improving methodology that incorporates a tube-gel electrophoresis (TGE) device to separate intact proteins by molecular mass. Top-down proteomics is accomplished in a robust fashion to yield the identification of hundreds of unique proteins, many of which correspond to multiple protein forms. The TGE platform separates 0-50 kDa proteins extracted from the yeast proteome into 12 fractions prior to automated nanocapillary LC-MS/MS in technical triplicate. The process may be completed in less than 72 h. From this study, 530 unique proteins and 1103 distinct protein species were identified and characterized, thus representing the highest coverage to date of the Saccharomyces cerevisiae proteome using top-down proteomics. The work signifies a significant step in the maturation of proteomics based on direct measurement and fragmentation of intact proteins.
Affiliation(s)
- John F Kellie
- Department of Chemistry, Proteomics Center of Excellence and Chemistry of Life Processes Institute, Northwestern University, 2145 North Sheridan Road, Evanston, Illinois 60208, USA
|
33
|
Zhou H, Ning Z, Starr AE, Abu-Farha M, Figeys D. Advancements in Top-Down Proteomics. Anal Chem 2011; 84:720-34. [DOI: 10.1021/ac202882y] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Affiliation(s)
- Hu Zhou
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, K1H8M5
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China 201203
- Zhibing Ning
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, K1H8M5
- Amanda E. Starr
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, K1H8M5
- Mohamed Abu-Farha
- Biochemistry and Molecular Biology Unit, Dasman Diabetes Institute, Dasman 15462, Kuwait
- Daniel Figeys
- Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, K1H8M5
|
34
|
Liu X, Sirotkin Y, Shen Y, Anderson G, Tsai YS, Ting YS, Goodlett DR, Smith RD, Bafna V, Pevzner PA. Protein identification using top-down spectra. Mol Cell Proteomics 2011; 11:M111.008524. [PMID: 22027200 DOI: 10.1074/mcp.m111.008524] [Citation(s) in RCA: 115] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
Abstract
In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set.
Affiliation(s)
- Xiaowen Liu
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, San Diego, California 92093, USA
|
35
|
Tipton JD, Tran JC, Catherman AD, Ahlf DR, Durbin KR, Kelleher NL. Analysis of intact protein isoforms by mass spectrometry. J Biol Chem 2011; 286:25451-8. [PMID: 21632550 DOI: 10.1074/jbc.r111.239442] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open
Abstract
The diverse proteome of an organism arises from such events as single nucleotide substitutions at the DNA level, different RNA processing, and dynamic enzymatic post-translational modifications. This minireview focuses on the measurement of intact proteins to describe the diversity found in proteomes. The field of biological mass spectrometry has steadily advanced, enabling improvements in the characterization of single proteins to proteins derived from cells or tissues. In this minireview, we discuss the basic technology for "top-down" intact protein analysis. Furthermore, examples of studies involved with the qualitative and quantitative analysis of full-length polypeptides are provided.
Affiliation(s)
- Jeremiah D Tipton
- Department of Chemistry, Northwestern University, Evanston, Illinois 60208, USA
|
36
|
Kollipara S, Agarwal N, Varshney B, Paliwal J. Technological Advancements in Mass Spectrometry and Its Impact on Proteomics. ANAL LETT 2011. [DOI: 10.1080/00032719.2010.520386] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]
|
37
|
Jones AW, Cooper HJ. Dissociation techniques in mass spectrometry-based proteomics. Analyst 2011; 136:3419-29. [DOI: 10.1039/c0an01011a] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
|
38
|
Liu X, Inbar Y, Dorrestein PC, Wynne C, Edwards N, Souda P, Whitelegge JP, Bafna V, Pevzner PA. Deconvolution and database search of complex tandem mass spectra of intact proteins: a combinatorial approach. Mol Cell Proteomics 2010; 9:2772-82. [PMID: 20855543 DOI: 10.1074/mcp.m110.002766] [Citation(s) in RCA: 133] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Top-down proteomics studies intact proteins, enabling new opportunities for analyzing post-translational modifications. Because tandem mass spectra of intact proteins are very complex, spectral deconvolution (grouping peaks into isotopomer envelopes) is a key initial stage for their interpretation. In such spectra, isotopomer envelopes of different protein fragments span overlapping regions on the m/z axis and even share spectral peaks. This raises both pattern recognition and combinatorial challenges for spectral deconvolution. We present MS-Deconv, a combinatorial algorithm for spectral deconvolution. The algorithm first generates a large set of candidate isotopomer envelopes for a spectrum, then represents the spectrum as a graph, and finally selects its highest scoring subset of envelopes as a heaviest path in the graph. In contrast with other approaches, the algorithm scores sets of envelopes rather than individual envelopes. We demonstrate that MS-Deconv improves on Thrash and Xtract in the number of correctly recovered monoisotopic masses and speed. We applied MS-Deconv to a large set of top-down spectra from Yersinia rohdei (with a still unsequenced genome) and further matched them against the protein database of related and sequenced bacterium Yersinia enterocolitica. MS-Deconv is available at http://proteomics.ucsd.edu/Software.html.
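The envelope-selection step described in this abstract (score candidate envelopes, then pick the best-scoring set as a heaviest path in a graph) can be sketched in simplified form. This is a hedged illustration, not MS-Deconv itself: it assumes the selected envelopes must not overlap on the m/z axis, whereas the published algorithm handles overlapping envelopes that share peaks. Under that assumption the heaviest path reduces to a dynamic program over envelopes sorted by their upper m/z bound. The function name `best_envelope_set` and the tuple layout are hypothetical.

```python
from bisect import bisect_right


def best_envelope_set(envelopes):
    """Pick the non-overlapping envelopes with the largest total score.

    envelopes: list of (mz_start, mz_end, score) tuples.
    Returns (total_score, chosen_envelopes).
    """
    env = sorted(envelopes, key=lambda e: e[1])
    ends = [e[1] for e in env]
    # best[i] holds the optimal (score, chosen) using only the first i envelopes.
    best = [(0.0, [])]
    for i, (start, _end, score) in enumerate(env):
        # j = number of earlier envelopes that end at or before this one starts.
        j = bisect_right(ends, start, 0, i)
        take_score = best[j][0] + score
        if take_score > best[i][0]:
            best.append((take_score, best[j][1] + [env[i]]))
        else:
            best.append(best[i])
    return best[-1]


# Four hypothetical candidate envelopes with made-up scores.
candidates = [
    (500.0, 501.0, 3.0),
    (500.5, 502.0, 5.0),
    (501.0, 503.0, 4.0),
    (502.0, 504.0, 1.0),
]
total, chosen = best_envelope_set(candidates)
print(total, chosen)
```

The point of the sketch is the one the abstract makes: the best-scoring single envelope (score 5.0 here) is not necessarily part of the best-scoring set, which is why sets of envelopes are scored rather than individual envelopes.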
Affiliation(s)
- Xiaowen Liu
- Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA
|
39
|
|
40
|
Advances in proteomic prostate cancer biomarker discovery. J Proteomics 2010; 73:1839-50. [PMID: 20398807 DOI: 10.1016/j.jprot.2010.04.002] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2009] [Revised: 01/15/2010] [Accepted: 04/06/2010] [Indexed: 11/21/2022]
Abstract
Prostate cancer is the most common non-cutaneous cancer in men in the United States. For reasons largely unknown, the incidence of prostate cancer has increased in the last two decades, in spite or perhaps because of a concomitant increase in serum prostate-specific antigen (PSA) screening. While PSA is acknowledged not to be an ideal biomarker for prostate cancer detection, it is however widely used by physicians due to lack of an alternative. Thus, the identification of a biomarker(s) that can complement or replace PSA represents a major goal for prostate cancer research. Screening complex biological specimens such as blood, urine, and tissue to identify protein biomarkers has become increasingly popular over the last decade thanks to advances in proteomic discovery methods. The completion of human genome sequence together with new development in mass spectrometry instrumentation and bioinformatics has been a major driving force in biomarker discovery research. Here we review the current state of proteomic applications as applied to various sample sources including blood, urine, tissue, and "secretome" for the purpose of prostate cancer biomarker discovery. Additionally, we review recent developments in validation of putative markers, efforts at systems biology approach, and current challenges of proteomics in biomarker discovery.
|
41
|
Kellie JF, Tran JC, Lee JE, Ahlf DR, Thomas HM, Ntai I, Catherman AD, Durbin KR, Zamdborg L, Vellaichamy A, Thomas PM, Kelleher NL. The emerging process of Top Down mass spectrometry for protein analysis: biomarkers, protein-therapeutics, and achieving high throughput. MOLECULAR BIOSYSTEMS 2010; 6:1532-9. [PMID: 20711533 DOI: 10.1039/c000896f] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Top Down mass spectrometry (MS) has emerged as an alternative to common Bottom Up strategies for protein analysis. In the Top Down approach, intact proteins are fragmented directly in the mass spectrometer to achieve both protein identification and characterization, even capturing information on combinatorial post-translational modifications. Just in the past two years, Top Down MS has seen incremental advances in instrumentation and dedicated software, and has also experienced a major boost from refined separations of whole proteins in complex mixtures that have both high recovery and reproducibility. Combined with steadily advancing commercial MS instrumentation and data processing, a high-throughput workflow covering intact proteins and polypeptides up to 70 kDa is directly visible in the near future.
Affiliation(s)
- John F Kellie
- Technology Development Team, Center for Top Down Proteomics, University of Illinois at Urbana-Champaign, USA
|