1
Carr AV, Bollis NE, Pavek JG, Shortreed MR, Smith LM. Spectral averaging with outlier rejection algorithms to increase identifications in top-down proteomics. Proteomics 2024; 24:e2300234. [PMID: 38487981 PMCID: PMC11216233 DOI: 10.1002/pmic.202300234] [Received: 05/31/2023] [Revised: 02/15/2024] [Accepted: 02/29/2024]
Abstract
The identification of proteoforms by top-down proteomics requires both high-quality fragmentation spectra and the neutral mass of the proteoform from which the fragments derive. Intact proteoform spectra can be highly complex and may include multiple overlapping proteoforms, as well as many isotopic peaks and charge states. The resulting lower signal-to-noise ratios for intact proteins complicate downstream analyses such as deconvolution. Averaging multiple scans is a common way to improve signal-to-noise, but mass spectrometry data contain technique-specific artifacts that can degrade the quality of an averaged spectrum. To overcome these limitations and increase signal-to-noise, we have implemented outlier rejection algorithms that efficiently and robustly remove outlier measurements from a set of MS1 scans prior to averaging. We have implemented averaging with rejection algorithms in the open-source, freely available proteomics search engine MetaMorpheus. Herein, we report the application of the averaging with rejection algorithms to direct injection and online liquid chromatography mass spectrometry data. Averaging with rejection algorithms demonstrated a 45% increase in the number of proteoforms detected in Jurkat T cell lysate. We show that the increase is due to improved spectral quality, particularly in regions surrounding isotopic envelopes.
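The idea of rejecting outliers before averaging can be illustrated with a minimal sketch. The abstract does not say which rejection algorithms were used, so the code below assumes one generic choice, sigma clipping around the median with a spread estimated from the median absolute deviation. It is not the MetaMorpheus implementation. The function names, the `n_sigma` threshold, and the assumption that all scans are already binned onto a shared m/z grid are hypothetical and for illustration only.

```python
from statistics import mean, median


def sigma_clip_average(values, n_sigma=3.0, max_iter=5):
    """Average one m/z bin across scans after iteratively rejecting outliers.

    Illustrative assumption: outliers are values farther than n_sigma robust
    standard deviations from the median of the values still kept.
    """
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 3:
            break  # too few points to judge outliers
        center = median(kept)
        # Median absolute deviation, scaled to approximate a standard deviation
        mad = median([abs(v - center) for v in kept])
        spread = 1.4826 * mad
        if spread == 0:
            # More than half of the values equal the median; use it directly
            return center
        survivors = [v for v in kept if abs(v - center) <= n_sigma * spread]
        if len(survivors) == len(kept):
            break  # nothing rejected in this pass, so the set is stable
        kept = survivors
    return mean(kept)


def average_scans(scans, n_sigma=3.0):
    """Average equal-length intensity lists bin by bin with outlier rejection."""
    return [sigma_clip_average(column, n_sigma) for column in zip(*scans)]
```

For example, if five scans report intensities 5.0, 5.5, 4.5, 5.2, and 400.0 in one bin, the sketch rejects the 400.0 spike and averages the remaining four values, whereas a plain mean would be dominated by the spike.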
Affiliation(s)
- Austin V Carr
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Nicholas E Bollis
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- John G Pavek
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA

2
Pavek JG, Frey BL, Frost DC, Gu TJ, Li L, Smith LM. Cysteine Counting via Isotopic Chemical Labeling for Intact Mass Proteoform Identifications in Tissue. Anal Chem 2023; 95:15245-15253. [PMID: 37791746 PMCID: PMC10637319 DOI: 10.1021/acs.analchem.3c02473]
Abstract
Top-down proteomics, the tandem mass spectrometric analysis of intact proteoforms, is the dominant method for proteoform characterization in complex mixtures. While this strategy produces detailed molecular information, it also requires extensive instrument time per mass spectrum obtained and thus compromises the depth of proteoform coverage that is accessible on liquid chromatography time scales. Such a top-down analysis is necessary for making original proteoform identifications, but once a proteoform has been confidently identified, the extensive characterization it provides may no longer be required for a subsequent identification of the same proteoform. We present a strategy to identify proteoforms in tissue samples on the basis of the combination of an intact mass determination with a measured count of the number of cysteine residues present in each proteoform. We developed and characterized a cysteine tagging chemistry suitable for the efficient and specific labeling of cysteine residues within intact proteoforms and for providing a count of the cysteine amino acids present. On simple protein mixtures, the tagging chemistry yields greater than 98% labeling of all cysteine residues, with a labeling specificity of greater than 95%. Similar results are observed on more complex samples. In a proof-of-principle study, proteoforms present in a human prostate tumor biopsy were characterized. Observed proteoforms, each characterized by an intact mass and a cysteine count, were grouped into proteoform families (groups of proteoforms originating from the same gene). We observed 2190 unique experimental proteoforms, 703 of which were grouped into 275 proteoform families.
Affiliation(s)
- John G. Pavek
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706
- Brian L. Frey
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706
- Dustin C. Frost
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Ave., Madison, WI 53705
- Ting-Jia Gu
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Ave., Madison, WI 53705
- Lingjun Li
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Ave., Madison, WI 53705
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706

3
Nickerson JL, Baghalabadi V, Rajendran SRCK, Jakubec PJ, Said H, McMillen TS, Dang Z, Doucette AA. Recent advances in top-down proteome sample processing ahead of MS analysis. Mass Spectrometry Reviews 2023; 42:457-495. [PMID: 34047392 DOI: 10.1002/mas.21706] [Received: 02/19/2021] [Revised: 04/21/2021] [Accepted: 05/06/2021]
Abstract
Top-down proteomics is emerging as a preferred approach to investigate biological systems, with objectives ranging from the detailed assessment of a single protein therapeutic, to the complete characterization of every possible protein including their modifications, which define the human proteoform. Given the controlling influence of protein modifications on their biological function, understanding how gene products manifest or respond to disease is most precisely achieved by characterization at the intact protein level. Top-down mass spectrometry (MS) analysis of proteins entails unique challenges associated with processing whole proteins while maintaining their integrity throughout the processes of extraction, enrichment, purification, and fractionation. Recent advances in each of these critical front-end preparation processes, including minimalistic workflows, have greatly expanded the capacity of MS for top-down proteome analysis. Acknowledging the many contributions in MS technology and sample processing, the present review aims to highlight the diverse strategies that have forged a pathway for top-down proteomics. We comprehensively discuss the evolution of front-end workflows that today facilitate optimal characterization of proteoform-driven biology, including a brief description of the clinical applications that have motivated these impactful contributions.
Affiliation(s)
- Venus Baghalabadi
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
- Subin R C K Rajendran
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
  - Verschuren Centre for Sustainability in Energy and the Environment, Sydney, Nova Scotia, Canada
- Philip J Jakubec
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
- Hammam Said
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
- Teresa S McMillen
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
- Ziheng Dang
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
- Alan A Doucette
  - Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada

4
Chen D, McCool EN, Yang Z, Shen X, Lubeckyj RA, Xu T, Wang Q, Sun L. Recent advances (2019-2021) of capillary electrophoresis-mass spectrometry for multilevel proteomics. Mass Spectrometry Reviews 2023; 42:617-642. [PMID: 34128246 PMCID: PMC8671558 DOI: 10.1002/mas.21714] [Received: 03/21/2021] [Revised: 05/29/2021] [Accepted: 06/03/2021]
Abstract
Multilevel proteomics aims to delineate proteins at the peptide (bottom-up proteomics), proteoform (top-down proteomics), and protein complex (native proteomics) levels. Capillary electrophoresis-mass spectrometry (CE-MS) can achieve highly efficient separation and highly sensitive detection of complex mixtures of peptides, proteoforms, and even protein complexes because of its substantial technical progress. CE-MS has become a valuable alternative to the routinely used liquid chromatography-mass spectrometry for multilevel proteomics. This review summarizes the most recent (2019-2021) advances of CE-MS for multilevel proteomics regarding technological progress and biological applications. We also provide brief perspectives on CE-MS for multilevel proteomics at the end, highlighting some future directions and potential challenges.
Affiliation(s)
- Xiaojing Shen
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Rachele A. Lubeckyj
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Tian Xu
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Qianjie Wang
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Liangliang Sun
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA

5
Melo RM, de Souza JMF, Williams TCR, Fontes W, de Sousa MV, Ricart CAO, do Vale LHF. Revealing Corynebacterium glutamicum proteoforms through top-down proteomics. Sci Rep 2023; 13:2602. [PMID: 36788287 PMCID: PMC9929327 DOI: 10.1038/s41598-023-29857-6] [Received: 12/07/2022] [Accepted: 02/11/2023]
Abstract
Corynebacterium glutamicum is a bacterium widely employed in the industrial production of amino acids as well as a broad range of other biotechnological products. The present study describes the characterization of C. glutamicum proteoforms and their post-translational modifications (PTMs) by top-down proteomics. Despite previous evidence that PTMs play roles in the regulation of C. glutamicum metabolism, this is the first top-down proteome analysis of this organism. We identified 1125 proteoforms from 273 proteins, with 60% of proteins presenting at least one mass shift, suggesting the presence of PTMs, including several acetylated, oxidized and formylated proteoforms. Furthermore, proteins relevant to amino acid production, protein secretion, and oxidative stress were identified with mass shifts suggesting the presence of uncharacterized PTMs and proteoforms that may affect biotechnologically relevant processes in this industrial workhorse. For instance, the membrane proteins mepB and SecG were identified as a cleaved and a formylated proteoform, respectively. In central metabolism, OdhI was identified as two proteoforms with potential biological relevance: a cleaved proteoform and a proteoform with PTMs corresponding to a 70 Da mass shift.
Affiliation(s)
- Reynaldo Magalhães Melo
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil
- Jaques Miranda Ferreira de Souza
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil
- Wagner Fontes
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil
- Marcelo Valle de Sousa
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil
- Carlos André Ornelas Ricart
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil
- Luis Henrique Ferreira do Vale
  - Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, Institute of Biology, University of Brasilia, Brasilia, Brazil

6
Dai Y, Millikin R, Rolfs Z, Shortreed MR, Smith LM. A Hybrid Spectral Library and Protein Sequence Database Search Strategy for Bottom-Up and Top-Down Proteomic Data Analysis. J Proteome Res 2022; 21:2609-2618. [PMID: 36206157 PMCID: PMC9869658 DOI: 10.1021/acs.jproteome.2c00305]
Abstract
Tandem mass spectrometry (MS/MS) is widely employed for the analysis of complex proteomic samples. While protein sequence database searching and spectral library searching are both well-established peptide identification methods, each has shortcomings. Protein sequence databases lack fragment peak intensity information, which can result in poor discrimination between correct and incorrect spectrum assignments. Spectral libraries usually contain fewer peptides than protein sequence databases, which limits the number of peptides that can be identified. Notably, few post-translationally modified peptides are represented in spectral libraries. This is because few search engines can both identify a broad spectrum of PTMs and create corresponding spectral libraries. Also, programs that generate spectral libraries using deep learning approaches are not yet able to accurately predict spectra for the vast majority of PTMs. Here, we address these limitations through use of a hybrid search strategy that combines protein sequence database and spectral library searches to improve identification success rates and sensitivity. This software uses Global PTM Discovery (G-PTM-D) to produce spectral libraries for a wide variety of different PTMs. These features, along with a new spectrum annotation and visualization tool, have been integrated into the freely available and open-source search engine MetaMorpheus.
Affiliation(s)
- Yuling Dai
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Robert Millikin
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Zach Rolfs
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States

7
Schaffer LV, Shortreed MR, Smith LM. Proteoform Analysis and Construction of Proteoform Families in Proteoform Suite. Methods Mol Biol 2022; 2500:67-81. [PMID: 35657588 PMCID: PMC9694099 DOI: 10.1007/978-1-0716-2325-1_7]
Abstract
Proteoform Suite is an interactive software program for the identification and quantification of intact proteoforms from mass spectrometry data. Proteoform Suite identifies proteoforms observed by intact-mass (MS1) analysis. In intact-mass analysis, unfragmented experimental proteoforms are compared to a database of known proteoform sequences and to one another, searching for mass differences corresponding to well-known post-translational modifications or amino acids. Intact-mass analysis enables proteoforms observed in the MS1 data without MS/MS (MS2) fragmentation to be identified. Proteoform Suite further facilitates the construction and visualization of proteoform families, which are the sets of proteoforms derived from individual genes. Bottom-up peptide identifications and top-down (MS2) proteoform identifications can be integrated into the Proteoform Suite analysis to increase the sensitivity and accuracy of the analysis. Proteoform Suite is open source and freely available at https://github.com/smith-chem-wisc/proteoform-suite.
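The comparison of experimental proteoform masses to one another can be sketched in a few lines. This is only an illustration of the general idea of matching pairwise mass differences to known modification shifts. It is not Proteoform Suite's code, and the function name, the short table of shifts, and the 0.02 Da tolerance are assumptions chosen for the example. The listed values are standard monoisotopic mass shifts.

```python
from itertools import combinations

# Monoisotopic mass shifts (Da) of a few common modifications.
# A real analysis would use a much larger, curated table.
KNOWN_SHIFTS = {
    "acetylation": 42.010565,
    "phosphorylation": 79.966331,
    "methylation": 14.015650,
    "oxidation": 15.994915,
}


def match_mass_differences(masses, tolerance_da=0.02):
    """Return (lighter, heavier, label) for each pair of masses whose
    difference matches a known shift within tolerance_da (hypothetical value)."""
    matches = []
    for lighter, heavier in combinations(sorted(masses), 2):
        delta = heavier - lighter
        for label, shift in KNOWN_SHIFTS.items():
            if abs(delta - shift) <= tolerance_da:
                matches.append((lighter, heavier, label))
    return matches
```

Called on the masses 10000.000, 10042.011, and 10079.966 Da, the sketch links the first mass to the second by an acetylation-sized shift and to the third by a phosphorylation-sized shift, which is the kind of relationship a proteoform family groups together.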
Affiliation(s)
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA

8
Proteoforms and Proteoform Families: Past, Present, and Future. Methods Mol Biol 2022; 2500:1-4. [PMID: 35657582 PMCID: PMC9676067 DOI: 10.1007/978-1-0716-2325-1_1]
Abstract
The Human Proteoform Project is an ambitious international effort to accelerate the development of technologies for proteoform analysis and to establish comprehensive atlases of proteoforms for humans and model organisms. Proteoforms are the ultimate molecular effectors of function in biology and are thus central to understanding that function. Proteoform analysis as it is practiced today is almost exclusively accomplished by mass spectrometry (MS) and is rapidly advancing in its capabilities. This volume presents a beautiful snapshot of emerging technologies at the exciting frontier of MS-based proteoform analysis.
9
Lu L, Scalf M, Shortreed MR, Smith LM. Mesh Fragmentation Improves Dissociation Efficiency in Top-down Proteomics. Journal of the American Society for Mass Spectrometry 2021; 32:1319-1325. [PMID: 33754701 PMCID: PMC8783543 DOI: 10.1021/jasms.0c00462]
Abstract
Top-down proteomics is a key mass spectrometry-based technology for comprehensive analysis of proteoforms. Proteoforms exhibit multiple high charge states and isotopic forms in full MS scans. The dissociation behavior of proteoforms in different charge states and subjected to different collision energies is highly variable. The current widely employed data-dependent acquisition (DDA) method selects a narrow m/z range (corresponding to a single proteoform charge state) for dissociation from the most abundant precursors. We describe here Mesh, a novel dissociation strategy, to dissociate multiple charge states of one proteoform with multiple collision energies. We show that the Mesh strategy has the potential to generate fragment ions with improved sequence coverage and improve identification ratios in top-down proteomic analyses of complex samples. The strategy is implemented within an open-source instrument control software program named MetaDrive to perform real time deconvolution and precursor selection.
Affiliation(s)
- Lei Lu
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, United States
  - Corresponding Author Phone: (608) 263-2594. Fax: (608) 265-6780.

10
Schaffer LV, Anderson LC, Butcher DS, Shortreed MR, Miller RM, Pavelec C, Smith LM. Construction of Human Proteoform Families from 21 Tesla Fourier Transform Ion Cyclotron Resonance Mass Spectrometry Top-Down Proteomic Data. J Proteome Res 2020; 20:317-325. [PMID: 33074679 DOI: 10.1021/acs.jproteome.0c00403]
Abstract
Identification of proteoforms, the different forms of a protein, is important to understand biological processes. A proteoform family is the set of different proteoforms from the same gene. We previously developed the software program Proteoform Suite, which constructs proteoform families and identifies proteoforms by intact-mass analysis. Here, we have applied this approach to top-down proteomic data acquired at the National High Magnetic Field Laboratory 21 tesla Fourier transform ion cyclotron resonance mass spectrometer (data available on the MassIVE platform with identifier MSV000085978). We explored the ability to construct proteoform families and identify proteoforms from the high mass accuracy data that this instrument provides for a complex cell lysate sample from the MCF-7 human breast cancer cell line. There were 2830 observed experimental proteoforms, of which 932 were identified, 44 were ambiguous, and 1854 were unidentified. Of the 932 unique identified proteoforms, 766 were identified by top-down MS2 analysis at 1% false discovery rate (FDR) using TDPortal, and 166 were additional intact-mass identifications (∼4.7% calculated global FDR) made using Proteoform Suite. We recently published a proteoform level schema to represent ambiguity in proteoform identifications. We implemented this proteoform level classification in Proteoform Suite for intact-mass identifications, which enables users to determine the ambiguity levels and sources of ambiguity for each intact-mass proteoform identification.
Affiliation(s)
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lissa C Anderson
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- David S Butcher
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Rachel M Miller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Caitlin Pavelec
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

11
Cesnik AJ, Miller RM, Ibrahim K, Lu L, Millikin RJ, Shortreed MR, Frey BL, Smith LM. Spritz: A Proteogenomic Database Engine. J Proteome Res 2020; 20:1826-1834. [PMID: 32967423 DOI: 10.1021/acs.jproteome.0c00407]
Abstract
Proteoforms are the workhorses of the cell, and subtle differences between their amino acid sequences or post-translational modifications (PTMs) can change their biological function. To most effectively identify and quantify proteoforms in genetically diverse samples by mass spectrometry (MS), it is advantageous to search the MS data against a sample-specific protein database that is tailored to the sample being analyzed, in that it contains the correct amino acid sequences and relevant PTMs for that sample. To this end, we have developed Spritz (https://smith-chem-wisc.github.io/Spritz/), an open-source software tool for generating protein databases annotated with sequence variations and PTMs. We provide a simple graphical user interface for Windows and scripts that can be run on any operating system. Spritz automatically sets up and executes approximately 20 tools, which enable the construction of a proteogenomic database from only raw RNA sequencing data. Sequence variations that are discovered in RNA sequencing data upon comparison to the Ensembl reference genome are annotated on proteins in these databases, and PTM annotations are transferred from UniProt. Modifications can also be discovered and added to the database using bottom-up mass spectrometry data and global PTM discovery in MetaMorpheus. We demonstrate that such sample-specific databases allow the identification of variant peptides, modified variant peptides, and variant proteoforms by searching bottom-up and top-down proteomic data from the Jurkat human T lymphocyte cell line and demonstrate the identification of phosphorylated variant sites with phosphoproteomic data from the U2OS human osteosarcoma cell line.
Affiliation(s)
- Anthony J Cesnik
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH - Royal Institute of Technology, Stockholm 17121, Sweden
  - Department of Genetics, Stanford University, Stanford, California 94305, United States
  - Chan Zuckerberg Biohub, San Francisco, California 94158, United States
- Rachel M Miller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Khairina Ibrahim
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lei Lu
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Robert J Millikin
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Brian L Frey
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

12
Watson ZL, Ward FR, Méheust R, Ad O, Schepartz A, Banfield JF, Cate JH. Structure of the bacterial ribosome at 2 Å resolution. eLife 2020; 9:60482. [PMID: 32924932 DOI: 10.1101/2020.06.26.174334] [Received: 06/27/2020] [Accepted: 09/11/2020]
Abstract
Using cryo-electron microscopy (cryo-EM), we determined the structure of the Escherichia coli 70S ribosome with a global resolution of 2.0 Å. The maps reveal unambiguous positioning of protein and RNA residues, their detailed chemical interactions, and chemical modifications. Notable features include the first examples of isopeptide and thioamide backbone substitutions in ribosomal proteins, the former likely conserved in all domains of life. The maps also reveal extensive solvation of the small (30S) ribosomal subunit, and interactions with A-site and P-site tRNAs, mRNA, and the antibiotic paromomycin. The maps and models of the bacterial ribosome presented here now allow a deeper phylogenetic analysis of ribosomal components including structural conservation to the level of solvation. The high quality of the maps should enable future structural analyses of the chemical basis for translation and aid the development of robust tools for cryo-EM structure modeling and refinement.
Affiliation(s)
- Zoe L Watson
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
- Fred R Ward
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Raphaël Méheust
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, United States
  - Earth and Planetary Science, University of California, Berkeley, Berkeley, United States
- Omer Ad
  - Department of Chemistry, Yale University, New Haven, United States
- Alanna Schepartz
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Jillian F Banfield
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, United States
  - Earth and Planetary Science, University of California, Berkeley, Berkeley, United States
  - Environmental Science, Policy and Management, University of California Berkeley, Berkeley, United States
- Jamie Hd Cate
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, United States

13
Watson ZL, Ward FR, Méheust R, Ad O, Schepartz A, Banfield JF, Cate JHD. Structure of the bacterial ribosome at 2 Å resolution. eLife 2020; 9:e60482. [PMID: 32924932 PMCID: PMC7550191 DOI: 10.7554/elife.60482] [Received: 06/27/2020] [Accepted: 09/11/2020]
Abstract
Using cryo-electron microscopy (cryo-EM), we determined the structure of the Escherichia coli 70S ribosome with a global resolution of 2.0 Å. The maps reveal unambiguous positioning of protein and RNA residues, their detailed chemical interactions, and chemical modifications. Notable features include the first examples of isopeptide and thioamide backbone substitutions in ribosomal proteins, the former likely conserved in all domains of life. The maps also reveal extensive solvation of the small (30S) ribosomal subunit, and interactions with A-site and P-site tRNAs, mRNA, and the antibiotic paromomycin. The maps and models of the bacterial ribosome presented here now allow a deeper phylogenetic analysis of ribosomal components including structural conservation to the level of solvation. The high quality of the maps should enable future structural analyses of the chemical basis for translation and aid the development of robust tools for cryo-EM structure modeling and refinement.
Collapse
Affiliation(s)
- Zoe L Watson
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
- Fred R Ward
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Raphaël Méheust
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, United States
  - Earth and Planetary Science, University of California, Berkeley, Berkeley, United States
- Omer Ad
  - Department of Chemistry, Yale University, New Haven, United States
- Alanna Schepartz
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Jillian F Banfield
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, United States
  - Earth and Planetary Science, University of California, Berkeley, Berkeley, United States
  - Environmental Science, Policy and Management, University of California, Berkeley, Berkeley, United States
- Jamie HD Cate
  - Department of Chemistry, University of California, Berkeley, Berkeley, United States
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, United States

14
Gouveia D, Grenga L, Pible O, Armengaud J. Quick microbial molecular phenotyping by differential shotgun proteomics. Environ Microbiol 2020; 22:2996-3004. [PMID: 32133743 PMCID: PMC7496289 DOI: 10.1111/1462-2920.14975] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/01/2020] [Revised: 02/29/2020] [Accepted: 03/02/2020] [Indexed: 12/12/2022]
Abstract
Differential shotgun proteomics identifies proteins that discriminate between sets of samples based on differences in abundance. This methodology can be easily applied to study (i) specific microorganisms subjected to a variety of growth or stress conditions or (ii) different microorganisms sampled in the same condition. In microbiology, this comparison is particularly successful because differing microorganism phenotypes are explained by clearly altered abundances of key protein players. The extensive description and quantification of proteins from any given microorganism can be routinely obtained for several conditions within a few days by tandem mass spectrometry. Such protein-centred microbial molecular phenotyping is rich in information. However, well-designed experimental strategies, carefully parameterized analytical pipelines, and sound statistical approaches must be applied if the shotgun proteomic data are to be correctly interpreted. This minireview describes these key items for a quick molecular phenotyping based on label-free quantification shotgun proteomics.
Affiliation(s)
- Duarte Gouveia
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207 Bagnols-sur-Cèze, France
- Lucia Grenga
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207 Bagnols-sur-Cèze, France
- Olivier Pible
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207 Bagnols-sur-Cèze, France
- Jean Armengaud
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207 Bagnols-sur-Cèze, France

15
Schaffer LV, Millikin RJ, Shortreed MR, Scalf M, Smith LM. Improving Proteoform Identifications in Complex Systems Through Integration of Bottom-Up and Top-Down Data. J Proteome Res 2020; 19:3510-3517. [PMID: 32584579 DOI: 10.1021/acs.jproteome.0c00332] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/18/2022]
Abstract
Cellular functions are performed by a vast and diverse set of proteoforms. Proteoforms are the specific forms of proteins produced as a result of genetic variations, RNA splicing, and post-translational modifications (PTMs). Top-down mass spectrometric analysis of intact proteins enables proteoform identification, including proteoforms derived from sequence cleavage events or harboring multiple PTMs. In contrast, bottom-up proteomics identifies peptides, which necessitates protein inference and does not yield proteoform identifications. We seek here to exploit the synergies between these two data types to improve the quality and depth of the overall proteomic analysis. To this end, we automated the large-scale integration of results from multiprotease bottom-up and top-down analyses in the software program Proteoform Suite and applied it to the analysis of proteoforms from the human Jurkat T lymphocyte cell line. We implemented the recently developed proteoform-level classification scheme for top-down tandem mass spectrometry (MS/MS) identifications in Proteoform Suite, which enables users to observe the level and type of ambiguity for each proteoform identification, including which of the ambiguous proteoform identifications are supported by bottom-up-level evidence. We used Proteoform Suite to find instances where top-down identifications aid in protein inference from bottom-up analysis and conversely where bottom-up peptide identifications aid in proteoform PTM localization. We also show the use of bottom-up data to infer proteoform candidates potentially present in the sample, allowing confirmation of such proteoform candidates by intact-mass analysis of MS1 spectra. The implementation of these capabilities in the freely available software program Proteoform Suite enables users to integrate large-scale top-down and bottom-up data sets and to utilize the synergies between them to improve and extend the proteomic analysis.
Affiliation(s)
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Robert J Millikin
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

16
Clauwaert J, Menschaert G, Waegeman W. DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns. Nucleic Acids Res 2019; 47:e36. [PMID: 30753697 PMCID: PMC6451124 DOI: 10.1093/nar/gkz061] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 09/26/2018] [Revised: 01/02/2019] [Accepted: 01/30/2019] [Indexed: 12/13/2022] Open
Abstract
Annotation of gene expression in prokaryotes often finds itself corrected due to small variations of the annotated gene regions observed between different (sub)-species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, are not able to map the full complexity of the genomic landscape. We present DeepRibo, a novel neural network utilizing features extracted from ribosome profiling information and binding site sequence patterns that shows to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, adapting the information gained from both the high-throughput ribosome profiling data and ribosome binding translation initiation sequence region into one model. DeepRibo is designed as a single model trained on a variety of ribosome profiling experiments, used for the identification of open reading frames in prokaryotes without a priori knowledge of the translational landscape. Through extensive validation of the model trained on various sets of data, multiple species sequence similarity, mass spectrometry and Edman degradation verified proteins, the effectiveness of DeepRibo is highlighted.
Affiliation(s)
- Jim Clauwaert
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Gerben Menschaert
  - Biobix, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium
- Willem Waegeman
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000 Gent, Belgium

17
Shen X, Yang Z, McCool EN, Lubeckyj RA, Chen D, Sun L. Capillary zone electrophoresis-mass spectrometry for top-down proteomics. Trends Analyt Chem 2019; 120:115644. [PMID: 31537953 PMCID: PMC6752746 DOI: 10.1016/j.trac.2019.115644] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 02/06/2023]
Abstract
Mass spectrometry (MS)-based top-down proteomics characterizes complex proteomes at the intact proteoform level and provides an accurate picture of protein isoforms and protein post-translational modifications in the cell. The progress of top-down proteomics requires novel analytical tools with high peak capacity for proteoform separation and high sensitivity for proteoform detection. The requirements have made capillary zone electrophoresis (CZE)-MS an attractive approach for advancing large-scale top-down proteomics. CZE has achieved a peak capacity of 300 for separation of complex proteoform mixtures. CZE-MS has shown drastically better sensitivity than commonly used reversed-phase liquid chromatography (RPLC)-MS for proteoform detection. The advanced CZE-MS identified 6,000 proteoforms of nearly 1,000 proteoform families from a complex proteome sample, which represents one of the largest top-down proteomic datasets so far. In this review, we focus on the recent progress in CZE-MS-based top-down proteomics and provide our perspectives about its future directions.
Affiliation(s)
- Xiaojing Shen
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Zhichang Yang
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Elijah N. McCool
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Rachele A. Lubeckyj
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Daoyang Chen
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Liangliang Sun
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States

18
Dai Y, Buxton KE, Schaffer LV, Miller RM, Millikin RJ, Scalf M, Frey BL, Shortreed MR, Smith LM. Constructing Human Proteoform Families Using Intact-Mass and Top-Down Proteomics with a Multi-Protease Global Post-Translational Modification Discovery Database. J Proteome Res 2019; 18:3671-3680. [PMID: 31479276 DOI: 10.1021/acs.jproteome.9b00339] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 01/22/2023]
Abstract
Complex human biomolecular processes are made possible by the diversity of human proteoforms. Constructing proteoform families, groups of proteoforms derived from the same gene, is one way to represent this diversity. Comprehensive, high-confidence identification of human proteoforms remains a central challenge in mass spectrometry-based proteomics. We have previously reported a strategy for proteoform identification using intact-mass measurements, and we have since improved that strategy by mass calibration based on search results, the use of a global post-translational modification discovery database, and the integration of top-down proteomics results with intact-mass analysis. In the present study, we combine these strategies for enhanced proteoform identification in total cell lysate from the Jurkat human T lymphocyte cell line. We collected, processed, and integrated three types of proteomics data (NeuCode-labeled intact-mass, label-free top-down, and multi-protease bottom-up) to maximize the number of confident proteoform identifications. The integrated analysis revealed 5950 unique experimentally observed proteoforms, which were assembled into 848 proteoform families. Twenty percent of the observed proteoforms were confidently identified at a 3.9% false discovery rate, representing 1207 unique proteoforms derived from 484 genes.
Affiliation(s)
- Yunxiang Dai
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
  - Biophysics Graduate Program, University of Wisconsin, 413 Bock Laboratories, 1525 Linden Drive, Madison, Wisconsin 53706, United States
- Katherine E Buxton
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Rachel M Miller
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Robert J Millikin
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Brian L Frey
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States

19
Nagarajan A, Zhou M, Nguyen AY, Liberton M, Kedia K, Shi T, Piehowski P, Shukla A, Fillmore TL, Nicora C, Smith RD, Koppenaal DW, Jacobs JM, Pakrasi HB. Proteomic Insights into Phycobilisome Degradation, A Selective and Tightly Controlled Process in The Fast-Growing Cyanobacterium Synechococcus elongatus UTEX 2973. Biomolecules 2019; 9:biom9080374. [PMID: 31426316 PMCID: PMC6722726 DOI: 10.3390/biom9080374] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/24/2019] [Revised: 08/12/2019] [Accepted: 08/13/2019] [Indexed: 11/16/2022] Open
Abstract
Phycobilisomes (PBSs) are large (3-5 megadalton) pigment-protein complexes in cyanobacteria that associate with thylakoid membranes and harvest light primarily for photosystem II. PBSs consist of highly ordered assemblies of pigmented phycobiliproteins (PBPs) and linker proteins that can account for up to half of the soluble protein in cells. Cyanobacteria adjust to changing environmental conditions by modulating PBS size and number. In response to nutrient depletion such as nitrogen (N) deprivation, PBSs are degraded in an extensive, tightly controlled, and reversible process. In Synechococcus elongatus UTEX 2973, a fast-growing cyanobacterium with a doubling time of two hours, the process of PBS degradation is very rapid, with 80% of PBSs per cell degraded in six hours under optimal light and CO2 conditions. Proteomic analysis during PBS degradation and re-synthesis revealed multiple proteoforms of PBPs with partially degraded phycocyanobilin (PCB) pigments. NblA, a small proteolysis adaptor essential for PBS degradation, was characterized and validated with targeted mass spectrometry. NblA levels rose from essentially 0 to 25,000 copies per cell within 30 min of N depletion, and correlated with the rate of decrease in phycocyanin (PC). Implications of this correlation on the overall mechanism of PBS degradation during N deprivation are discussed.
Affiliation(s)
- Aparna Nagarajan
  - Department of Biology, Washington University, St. Louis, MO 63130, USA
- Mowei Zhou
  - Environmental and Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Amelia Y Nguyen
  - Department of Biology, Washington University, St. Louis, MO 63130, USA
- Michelle Liberton
  - Department of Biology, Washington University, St. Louis, MO 63130, USA
- Komal Kedia
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Tujin Shi
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Paul Piehowski
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Anil Shukla
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Thomas L Fillmore
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Carrie Nicora
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Richard D Smith
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- David W Koppenaal
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Jon M Jacobs
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Himadri B Pakrasi
  - Department of Biology, Washington University, St. Louis, MO 63130, USA

20
Schaffer LV, Tucholski T, Shortreed MR, Ge Y, Smith LM. Intact-Mass Analysis Facilitating the Identification of Large Human Heart Proteoforms. Anal Chem 2019; 91:10937-10942. [PMID: 31393705 DOI: 10.1021/acs.analchem.9b02343] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/05/2023]
Abstract
Proteoforms, the primary effectors of biological processes, are the different forms of proteins that arise from molecular processing events such as alternative splicing and post-translational modifications. Heart diseases exhibit changes in proteoform levels, motivating the development of a deeper understanding of the heart proteoform landscape. Our recently developed two-dimensional top-down proteomics platform coupling serial size exclusion chromatography (sSEC) to reversed-phase chromatography (RPC) expanded coverage of the human heart proteome and allowed observation of high-molecular weight proteoforms. However, most of these observed proteoforms were not identified due to the difficulty in obtaining quality tandem mass spectrometry (MS2) fragmentation data for large proteoforms from complex biological mixtures on a chromatographic time scale. Herein, we sought to identify human heart proteoforms in this data set using an enhanced version of Proteoform Suite, which identifies proteoforms by intact mass alone. Specifically, we added a new feature to Proteoform Suite to determine candidate identifications for isotopically unresolved proteoforms larger than 50 kDa, enabling subsequent MS2 identification of important high-molecular weight human heart proteoforms such as lamin A (72 kDa) and trifunctional enzyme subunit α (79 kDa). With this new workflow for large proteoform identification, endogenous human cardiac myosin binding protein C (140 kDa) was identified for the first time. This study demonstrates the integration of our sSEC-RPC-MS proteomics platform with intact-mass analysis through Proteoform Suite to create a catalog of human heart proteoforms and facilitate the identification of large proteoforms in complex systems.
Affiliation(s)
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Trisha Tucholski
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

21
Schaffer LV, Millikin RJ, Miller RM, Anderson LC, Fellers RT, Ge Y, Kelleher NL, LeDuc RD, Liu X, Payne SH, Sun L, Thomas PM, Tucholski T, Wang Z, Wu S, Wu Z, Yu D, Shortreed MR, Smith LM. Identification and Quantification of Proteoforms by Mass Spectrometry. Proteomics 2019; 19:e1800361. [PMID: 31050378 PMCID: PMC6602557 DOI: 10.1002/pmic.201800361] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Received: 01/28/2019] [Revised: 04/07/2019] [Indexed: 12/29/2022]
Abstract
A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, some recent advances are outlined and current challenges and future directions for the field are discussed.
Affiliation(s)
- Leah V. Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Robert J. Millikin
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Rachel M. Miller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lissa C. Anderson
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, Florida 32310, United States
- Ryan T. Fellers
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois 60208, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Cell and Regenerative Biology and Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Neil L. Kelleher
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois 60208, United States
  - Department of Chemistry and Molecular Biosciences and the Division of Hematology-Oncology, Northwestern University, Evanston, Illinois 60208, United States
- Richard D. LeDuc
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois 60208, United States
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Samuel H. Payne
  - Department of Biology, Brigham Young University, Provo, UT 84602
- Liangliang Sun
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Paul M. Thomas
  - Proteomics Center of Excellence, Northwestern University, Evanston, Illinois 60208, United States
- Trisha Tucholski
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Zhe Wang
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Zhijie Wu
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Dahang Yu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

22
Lin Z, Wei L, Cai W, Zhu Y, Tucholski T, Mitchell SD, Guo W, Ford SP, Diffee GM, Ge Y. Simultaneous Quantification of Protein Expression and Modifications by Top-down Targeted Proteomics: A Case of the Sarcomeric Subproteome. Mol Cell Proteomics 2019; 18:594-605. [PMID: 30591534 PMCID: PMC6398208 DOI: 10.1074/mcp.tir118.001086] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/16/2018] [Revised: 12/08/2018] [Indexed: 12/14/2022] Open
Abstract
Determining changes in protein expression and post-translational modifications (PTMs) is crucial for elucidating cellular signal transduction and disease mechanisms. Conventional antibody-based approaches have inherent problems such as the limited availability of high-quality antibodies and batch-to-batch variation. Top-down mass spectrometry (MS)-based proteomics has emerged as the most powerful method for characterization and quantification of protein modifications. Nevertheless, robust methods to simultaneously determine changes in protein expression and PTMs remain lacking. Herein, we have developed a straightforward and robust top-down liquid chromatography (LC)/MS-based targeted proteomics platform for simultaneous quantification of protein expression and PTMs with high throughput and high reproducibility. We employed this method to analyze the sarcomeric subproteome from various muscle types of different species, which successfully revealed skeletal muscle heterogeneity and cardiac developmental changes in sarcomeric protein isoform expression and PTMs. As demonstrated, this targeted top-down proteomics platform offers an excellent 'antibody-independent' alternative for the accurate quantification of sarcomeric protein expression and PTMs concurrently in complex mixtures, which is generally applicable to different species and various tissue types.
Affiliation(s)
- Ziqing Lin
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, WI 53705
- Liming Wei
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, P. R. China
- Wenxuan Cai
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Molecular & Cellular Pharmacology Training Program, University of Wisconsin-Madison, Madison, WI 53705
- Yanlong Zhu
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, WI 53705
- Trisha Tucholski
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706
- Stanford D Mitchell
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Molecular & Cellular Pharmacology Training Program, University of Wisconsin-Madison, Madison, WI 53705
- Wei Guo
  - Department of Animal Science, Fetal Programming Center, University of Wyoming, Laramie, Wyoming 82071
- Stephen P Ford
  - Department of Animal Science, Fetal Programming Center, University of Wyoming, Laramie, Wyoming 82071
- Gary M Diffee
  - Department of Kinesiology, University of Wisconsin-Madison, Madison, WI 53705
- Ying Ge
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53705
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, WI 53705
  - Molecular & Cellular Pharmacology Training Program, University of Wisconsin-Madison, Madison, WI 53705
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706

23
Schaffer LV, Rensvold JW, Shortreed MR, Cesnik AJ, Jochem A, Scalf M, Frey BL, Pagliarini DJ, Smith LM. Identification and Quantification of Murine Mitochondrial Proteoforms Using an Integrated Top-Down and Intact-Mass Strategy. J Proteome Res 2018; 17:3526-3536. [PMID: 30180576 PMCID: PMC6201694 DOI: 10.1021/acs.jproteome.8b00469] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 12/16/2022]
Abstract
The development of effective strategies for the comprehensive identification and quantification of proteoforms in complex systems is a critical challenge in proteomics. Proteoforms, the specific molecular forms in which proteins are present in biological systems, are the key effectors of biological function. Thus, knowledge of proteoform identities and abundances is essential to unraveling the mechanisms that underlie protein function. We recently reported a strategy that integrates conventional top-down mass spectrometry with intact-mass determinations for enhanced proteoform identifications and the elucidation of proteoform families and applied it to the analysis of yeast cell lysate. In the present work, we extend this strategy to enable quantification of proteoforms, and we examine changes in the abundance of murine mitochondrial proteoforms upon differentiation of mouse myoblasts to myotubes. The integrated top-down and intact-mass strategy provided an increase of ∼37% in the number of identified proteoforms compared to top-down alone, which is in agreement with our previous work in yeast; 1779 unique proteoforms were identified using the integrated strategy compared to 1301 using top-down analysis alone. Quantitative comparison of proteoform differences between the myoblast and myotube cell types showed 129 observed proteoforms exhibiting statistically significant abundance changes (fold change >2 and false discovery rate <5%).
Affiliation(s)
- Leah V. Schaffer
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Michael R. Shortreed
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Anthony J. Cesnik
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Adam Jochem
  - Morgridge Institute for Research, Madison, WI 53715, USA
- Mark Scalf
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Brian L. Frey
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- David J. Pagliarini
  - Morgridge Institute for Research, Madison, WI 53715, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Lloyd M. Smith
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI 53706, USA
|
24
|
McCool EN, Lubeckyj RA, Shen X, Chen D, Kou Q, Liu X, Sun L. Deep Top-Down Proteomics Using Capillary Zone Electrophoresis-Tandem Mass Spectrometry: Identification of 5700 Proteoforms from the Escherichia coli Proteome. Anal Chem 2018; 90:5529-5533. [PMID: 29620868 DOI: 10.1021/acs.analchem.8b00693] [Citation(s) in RCA: 83] [Impact Index Per Article: 13.8] [Indexed: 01/06/2023]
Abstract
Capillary zone electrophoresis (CZE)-tandem mass spectrometry (MS/MS) has been recognized as a useful tool for top-down proteomics. However, its performance for deep top-down proteomics is still dramatically lower than that of the widely used reversed-phase liquid chromatography (RPLC)-MS/MS. We present an orthogonal multidimensional separation platform that couples size exclusion chromatography (SEC) and RPLC-based protein prefractionation to CZE-MS/MS for deep top-down proteomics of Escherichia coli. The platform generated high peak capacity (∼4000) for separation of intact proteins, leading to the identification of 5700 proteoforms from the Escherichia coli proteome. The data represent a 10-fold improvement in the number of proteoform identifications compared with previous CZE-MS/MS studies and constitute the largest bacterial top-down proteomics data set reported to date. The performance of the CZE-MS/MS-based platform is comparable to that of state-of-the-art RPLC-MS/MS-based systems in terms of the number of proteoform identifications and the instrument time.
Affiliation(s)
- Elijah N McCool
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Rachele A Lubeckyj
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Xiaojing Shen
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Daoyang Chen
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
- Qiang Kou
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 719 Indiana Avenue, Indianapolis, Indiana 46202, United States
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 719 Indiana Avenue, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 W. 10th Street, Indianapolis, Indiana 46202, United States
- Liangliang Sun
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, Michigan 48824, United States
|
25
|
Affiliation(s)
- Bifan Chen
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Kyle A. Brown
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ziqing Lin
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Ying Ge
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
  - Human Proteomics Program, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
|
26
|
Schaffer LV, Shortreed MR, Cesnik AJ, Frey BL, Solntsev SK, Scalf M, Smith LM. Expanding Proteoform Identifications in Top-Down Proteomic Analyses by Constructing Proteoform Families. Anal Chem 2017; 90:1325-1333. [PMID: 29227670 DOI: 10.1021/acs.analchem.7b04221] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 12/15/2022]
Abstract
In top-down proteomics, intact proteins are analyzed by tandem mass spectrometry and proteoforms, which are defined forms of a protein with specific sequences of amino acids and localized post-translational modifications, are identified using precursor mass and fragmentation data. Many proteoforms that are detected in the precursor scan (MS1) are not selected for fragmentation by the instrument and therefore remain unidentified in typical top-down proteomic workflows. Our laboratory has developed the open source software program Proteoform Suite to analyze MS1-only intact proteoform data. Here, we have adapted it to provide identifications of proteoform masses in precursor MS1 spectra of top-down data, supplementing the top-down identifications obtained using the MS2 fragmentation data. Proteoform Suite performs mass calibration using high-scoring top-down identifications and identifies additional proteoforms using calibrated, accurate intact masses. Proteoform families, the set of proteoforms from a given gene, are constructed and visualized from proteoforms identified by both top-down and intact-mass analyses. Using this strategy, we constructed proteoform families and identified 1861 proteoforms in yeast lysate, yielding an approximately 40% increase over the original 1291 proteoform identifications observed using traditional top-down analysis alone.
Affiliation(s)
- Leah V Schaffer
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Anthony J Cesnik
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Brian L Frey
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Stefan K Solntsev
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin, 1101 University Avenue, Madison, Wisconsin 53706, United States
  - Genome Center of Wisconsin, University of Wisconsin, 425G Henry Mall, Room 3420, Madison, Wisconsin 53706, United States
|
27
|
Cesnik AJ, Shortreed MR, Schaffer LV, Knoener RA, Frey BL, Scalf M, Solntsev SK, Dai Y, Gasch AP, Smith LM. Proteoform Suite: Software for Constructing, Quantifying, and Visualizing Proteoform Families. J Proteome Res 2017; 17:568-578. [PMID: 29195273 DOI: 10.1021/acs.jproteome.7b00685] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/19/2022]
Abstract
We present an open-source, interactive program named Proteoform Suite that uses proteoform mass and intensity measurements from complex biological samples to identify and quantify proteoforms. It constructs families of proteoforms derived from the same gene, assesses proteoform function using gene ontology (GO) analysis, and enables visualization of quantified proteoform families and their changes. It is applied here to reveal systemic proteoform variations in the yeast response to salt stress.
Affiliation(s)
- Anthony J Cesnik
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Michael R Shortreed
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Leah V Schaffer
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Rachel A Knoener
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Brian L Frey
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Mark Scalf
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Stefan K Solntsev
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Yunxiang Dai
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Audrey P Gasch
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Lloyd M Smith
  - Department of Chemistry, Laboratory of Genetics, and Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
|