1
Jiang Y, Rex DA, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Hegeman AD, Mayta M, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics Using Mass Spectrometry. ACS Measurement Science Au 2024; 4:338-417. [PMID: 39193565 PMCID: PMC11348894 DOI: 10.1021/acsmeasuresciau.3c00068] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/15/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 08/29/2024]
Abstract
Proteomics is the large-scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics can be applied to diverse studies, ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods. We cover topics from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this Review will serve as a handbook for researchers who are new to the field of bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Devasahayam Arokia Balaya Rex
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Dina Schuster
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland
  - Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
- Benjamin A. Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, NIST, Charleston, South Carolina 29412, United States
- Germán L. Rosano
  - Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, 2000 Argentina
- Norbert Volkmar
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Amanda Momenzadeh
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Trenton M. Peters-Clarke
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158, United States
- Susan B. Egbert
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, R3T 2N2 Canada
- Simion Kreimer
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Emma H. Doud
  - Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, 46202-3082, United States
- Oliver M. Crook
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster, 3rd Milestone, Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Adrian D. Hegeman
  - Departments of Horticultural Science and Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota 55108, United States
- Martín L. Mayta
  - School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martin 3103, Argentina
  - Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
- Anna G. Duboff
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Nicholas M. Riley
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Jesse G. Meyer
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
2
Schrader M. Origins, Technological Advancement, and Applications of Peptidomics. Methods Mol Biol 2024; 2758:3-47. [PMID: 38549006 DOI: 10.1007/978-1-0716-3646-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2024]
Abstract
Peptidomics is the comprehensive characterization of peptides from biological sources, in contrast to earlier peptide research, which focused on a few individual peptides. Mass spectrometry allows the detection of a multitude of peptides in complex mixtures and thus enables new strategies leading to peptidomics. The term was established in 2001, and the field has since grown to over 3000 publications. Analytical techniques originally developed for fast and comprehensive analysis of peptides in proteomics were specifically adjusted for peptidomics. Although peptidomics is thus closely linked to proteomics, it differs fundamentally from conventional bottom-up proteomics. Since then, fundamental technological advancements in peptidomics have occurred in mass spectrometry and data processing, including quantification, and to a lesser extent in separation technology. Different strategies and diverse sources of peptidomes are illustrated by numerous applications, such as the discovery of neuropeptides and other bioactive peptides, including the use of biochemical assays. Food and plant peptidomics are introduced in a similar way. Applications with a clinical focus are also included, comprising biomarker discovery as well as immunopeptidomics. This overview extensively reviews recent methods, strategies, and applications, including links to all other chapters of this book.
Affiliation(s)
- Michael Schrader
  - Department of Bioengineering Sciences, Weihenstephan-Tr. University of Applied Sciences, Freising, Germany
3
Fan KT, Hsu CW, Chen YR. Mass spectrometry in the discovery of peptides involved in intercellular communication: From targeted to untargeted peptidomics approaches. Mass Spectrometry Reviews 2023; 42:2404-2425. [PMID: 35765846 DOI: 10.1002/mas.21789] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Revised: 03/17/2022] [Accepted: 04/08/2022] [Indexed: 06/15/2023]
Abstract
Endogenous peptide hormones represent an essential class of biomolecules, which regulate cell-cell communications in diverse physiological processes of organisms. Mass spectrometry (MS) has been developed to be a powerful technology for identifying and quantifying peptides in a highly efficient manner. However, it is difficult to directly identify these peptide hormones due to their diverse characteristics, dynamic regulations, low abundance, and existence in a complicated biological matrix. Here, we summarize and discuss the roles of targeted and untargeted MS in discovering peptide hormones using bioassay-guided purification, bioinformatics screening, or the peptidomics-based approach. Although the peptidomics approach is expected to discover novel peptide hormones unbiasedly, only a limited number of successful cases have been reported. The critical challenges and corresponding measures for peptidomics from the steps of sample preparation, peptide extraction, and separation to the MS data acquisition and analysis are also discussed. We also identify emerging technologies and methods that can be integrated into the discovery platform toward the comprehensive study of endogenous peptide hormones.
Affiliation(s)
- Kai-Ting Fan
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
- Chia-Wei Hsu
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
- Yet-Ran Chen
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
4
Lu Y, Ge C, Cai B, Xu Q, Kong R, Chang S. Antibody sequences assembly method based on weighted de Bruijn graph. Mathematical Biosciences and Engineering: MBE 2023; 20:6174-6190. [PMID: 37161102 DOI: 10.3934/mbe.2023266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
With the development of next-generation protein sequencing technologies, sequence assembly algorithms have become a key technology for the de novo sequencing process. Existing methods can address the assembly of an unknown single protein chain. However, for monoclonal antibodies with light and heavy chains, assembly remains an unsolved problem. To address it, we propose a new assembly method, DBAS, which integrates the quality scores and sequence alignment scores of de novo sequenced peptides into a weighted de Bruijn graph to assemble the final protein sequences. The method is used to assemble sequences from two datasets with mixed light and heavy chains from antibodies. The results show that DBAS can assemble long antibody sequences for both mixed light and heavy chains and single chains. In addition, DBAS is able to distinguish the light and heavy chains by using BLAST sequence alignment. The results show that the algorithm performs well in terms of both target sequence coverage and contig assembly accuracy.
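The general idea of a weighted de Bruijn graph can be illustrated with a toy sketch. This is not the DBAS implementation: the peptides, the scores, the choice of k, and the greedy traversal are all assumptions made for illustration. Each peptide contributes its score to every k-mer edge it contains, so edges supported by several high-scoring peptides outweigh edges that come from a single low-scoring (possibly erroneous) read.

```python
from collections import defaultdict


def build_weighted_graph(peptides, k=4):
    """Build a de Bruijn graph whose nodes are (k-1)-mers.

    Each k-mer in a peptide adds that peptide's score to the edge
    prefix -> suffix, so edge weights accumulate across peptides.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sequence, score in peptides:
        for i in range(len(sequence) - k + 1):
            kmer = sequence[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += score
    return graph


def greedy_assemble(graph, start):
    """Extend a contig from `start`, always taking the heaviest unused edge."""
    contig, node, used = start, start, set()
    while True:
        candidates = [
            (weight, nxt)
            for nxt, weight in graph.get(node, {}).items()
            if (node, nxt) not in used
        ]
        if not candidates:
            return contig
        _, best = max(candidates)
        used.add((node, best))
        contig += best[-1]
        node = best


# Hypothetical de novo peptides with illustrative confidence scores.
# The last peptide carries a single-residue error (G -> A) and a low score.
peptides = [
    ("EVQLVESGG", 0.9),
    ("VESGGGLVQ", 0.8),
    ("GGLVQPGGS", 0.7),
    ("VESGAGLVQ", 0.2),
]
graph = build_weighted_graph(peptides, k=4)
contig = greedy_assemble(graph, "EVQ")
print(contig)
```

At the branch after `ESG`, the correct edge is supported by two high-scoring peptides and the erroneous one by a single low-scoring peptide, so the greedy walk follows the correct path. A real assembler would also need to choose start nodes, handle repeats, and separate chains, none of which is attempted here.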
Affiliation(s)
- Yi Lu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Cheng Ge
  - Key Laboratory of Marine Drugs, Chinese Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao 266003, China
- Biao Cai
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Qing Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Ren Kong
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Shan Chang
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
5
Zhang K, Gong X, Wang Q, Tu P, Li J, Song Y. Rapid tryptic peptide mapping of human serum albumin using DI-MS/MS ALL. RSC Adv 2022; 12:9868-9882. [PMID: 35424948 PMCID: PMC8963265 DOI: 10.1039/d1ra08717g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2021] [Accepted: 03/13/2022] [Indexed: 11/27/2022]
Abstract
In recent decades, protein drugs, in particular monoclonal antibodies, have been taking over the leading role of small-molecule drugs, and peptide mapping relying on liquid chromatography-tandem mass spectrometry (LC-MS/MS) is an emerging approach to replace the ligand-binding assay for the quality control of protein drugs. However, such LC-MS/MS approaches suffer from time-intensive measurements, leading to limited throughput. To accelerate measurements, the potential of DI-MS/MSALL for tryptic peptide mapping was evaluated here by comparison with a well-defined LC-MS/MS method, with human serum albumin (HSA) employed as the representative protein to illustrate applicability. Among the 55 tryptic peptides theoretically suggested by Skyline software, 47 were successfully captured by DI-MS/MSALL through acquisition of the desired MS2 spectra, compared with 51 detected by LC-MS/MS. DI-MS/MSALL measurements took only 5 min, dramatically less than the LC-MS/MS assay. Notably, in contrast to the abundant multiply charged MS1 signals observed with LC-MS/MS, most quasi-molecular ions carried lower charge states. DI-MS/MSALL also offered advantages such as lower solvent consumption and simpler instrumentation; however, more sample was consumed. In conclusion, DI-MS/MSALL is suitable as an alternative analytical tool to LC-MS/MS for the peptide mapping of protein drugs, particularly under a heavy measurement workload. DI-MS/MSALL records an MS2 spectrum at each 1 Da mass window through gas-phase ion fractionation.
Affiliation(s)
- Ke Zhang
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
- Xingcheng Gong
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
- Qian Wang
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
- Pengfei Tu
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
- Jun Li
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
- Yuelin Song
  - Modern Research Center for Traditional Chinese Medicine, School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 100029, China
6
Abstract
Accurate full-length sequencing of a purified unknown protein remains challenging because mass spectrometry (MS)-based methods are error-prone. De novo identified peptide sequences often contain errors, undermining the accuracy of assembly. Bias in the detectability of peptides also produces low-coverage regions, resulting in gaps. Although recent advances in multi-enzyme hydrolysis and algorithms have shown complete assembly of full-length protein sequences in a few examples, robustness in practical application still needs to be improved. Here, inspired by genome assembly strategies, we demonstrate a contig-scaffolding strategy to assemble protein sequences with high robustness and accuracy. This strategy integrates multiple unspecific hydrolysis methods to minimize bias in the hydrolysis process. After de novo identification of the peptides, our assembly algorithm, named Multiple Contigs & Scaffolding (MuCS), assembles the peptide sequences in a multistep (contig, then scaffold) manner, with error correction in each step. MS data from different hydrolysis experiments complement each other for robust contig extension and error correction. We demonstrated our strategy on three proteins with three replicates each; all reached 100% coverage (except one with 98.85%) and 98.69-100% accuracy. The strategy can also deal efficiently with a membrane protein, although the transmembrane region was missing due to the limitations of MS. The three replicates reached 88.85-92.57% coverage and 97.57-100% accuracy. In sum, we provide a practical, robust, and accurate solution for full-length protein sequencing. The MuCS software is available at http://chi-biotech.com/mucs/.
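The coverage figures quoted above can be read as the percentage of residues in a known reference sequence that are covered by at least one assembled contig. The sketch below shows one simple way to compute such a value; it is not taken from MuCS, it assumes exact substring matching (real evaluations typically use alignment), and the reference and contigs are made-up toy strings.

```python
def sequence_coverage(reference, contigs):
    """Return the percentage of reference residues covered by any contig.

    Contigs are located by exact substring search; every occurrence counts.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    covered = [False] * len(reference)
    for contig in contigs:
        if not contig:
            continue  # an empty contig covers nothing
        start = reference.find(contig)
        while start != -1:
            for i in range(start, start + len(contig)):
                covered[i] = True
            start = reference.find(contig, start + 1)
    return 100.0 * sum(covered) / len(reference)


# Toy reference (the 20 standard amino acid letters) and hypothetical contigs.
reference = "ACDEFGHIKLMNPQRSTVWY"
contigs = ["ACDEFGH", "FGHIKL", "STVWY"]
coverage = sequence_coverage(reference, contigs)
print(f"{coverage:.2f}%")
```

Overlapping contigs are counted once per residue, so the two contigs sharing `FGH` together cover ten residues rather than thirteen.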
Affiliation(s)
- Zhi-Biao Mai
  - Big Data Decision Institute, Jinan University, Guangzhou 510632, China
  - Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Zhong-Hua Zhou
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Qing-Yu He
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Gong Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
7
de Graaf SC, Hoek M, Tamara S, Heck AJR. A perspective toward mass spectrometry-based de novo sequencing of endogenous antibodies. MAbs 2022; 14:2079449. [PMID: 35699511 PMCID: PMC9225641 DOI: 10.1080/19420862.2022.2079449] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 11/25/2022]
Abstract
A key step in therapeutic and endogenous humoral antibody characterization is identifying the amino acid sequence. So far, this task has been mainly tackled through sequencing of B-cell receptor (BCR) repertoires at the nucleotide level. Mass spectrometry (MS) has emerged as an alternative tool for obtaining sequence information directly at the – most relevant – protein level. Although several MS methods are now well established, analysis of recombinant and endogenous antibodies comes with a specific set of challenges, requiring approaches beyond the conventional proteomics workflows. Here, we review the challenges in MS-based sequencing of both recombinant as well as endogenous humoral antibodies and outline state-of-the-art methods attempting to overcome these obstacles. We highlight recent examples and discuss remaining challenges. We foresee a great future for these approaches making de novo antibody sequencing and discovery by MS-based techniques feasible, even for complex clinical samples from endogenous sources such as serum and other liquid biopsies.
Affiliation(s)
- Sebastiaan C de Graaf
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, Netherlands
  - Netherlands Proteomics Center, Utrecht, Netherlands
- Max Hoek
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, Netherlands
  - Netherlands Proteomics Center, Utrecht, Netherlands
- Sem Tamara
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, Netherlands
  - Netherlands Proteomics Center, Utrecht, Netherlands
- Albert J R Heck
  - Biomolecular Mass Spectrometry and Proteomics, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, Netherlands
  - Netherlands Proteomics Center, Utrecht, Netherlands
8
Progress and challenges in mass spectrometry-based analysis of antibody repertoires. Trends Biotechnol 2021; 40:463-481. [PMID: 34535228 DOI: 10.1016/j.tibtech.2021.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/23/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 12/22/2022]
Abstract
Humoral immunity is divided into the cellular B cell and protein-level antibody responses. High-throughput sequencing has advanced our understanding of both these fundamental aspects of B cell immunology as well as aspects pertaining to vaccine and therapeutics biotechnology. Although the protein-level serum and mucosal antibody repertoire make major contributions to humoral protection, the sequence composition and dynamics of antibody repertoires remain underexplored. This limits insight into important immunological and biotechnological parameters such as the number of antigen-specific antibodies, which are for example, relevant for pathogen neutralization, microbiota regulation, severity of autoimmunity, and therapeutic efficacy. High-resolution mass spectrometry (MS) has allowed initial insights into the antibody repertoire. We outline current challenges in MS-based sequence analysis of antibody repertoires and propose strategies for their resolution.
9
Samodova D, Hosfield CM, Cramer CN, Giuli MV, Cappellini E, Franciosa G, Rosenblatt MM, Kelstrup CD, Olsen JV. ProAlanase is an Effective Alternative to Trypsin for Proteomics Applications and Disulfide Bond Mapping. Mol Cell Proteomics 2020; 19:2139-2157. [PMID: 33020190 PMCID: PMC7710147 DOI: 10.1074/mcp.tir120.002129] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 05/15/2020] [Revised: 09/29/2020] [Indexed: 01/01/2023]
Abstract
Trypsin is the protease of choice in bottom-up proteomics. However, its application can be limited by the amino acid composition of target proteins and the pH of the digestion solution. In this study we characterize ProAlanase, a protease from the fungus Aspergillus niger that cleaves primarily on the C-terminal side of proline and alanine residues. ProAlanase achieves high proteolytic activity and specificity when digestion is carried out at acidic pH (1.5) for relatively short (2 h) time periods. To elucidate the potential of ProAlanase in proteomics applications, we conducted a series of investigations comprising comparative multi-enzymatic profiling of a human cell line proteome, histone PTM analysis, ancient bone protein identification, phosphosite mapping and de novo sequencing of a proline-rich protein, and disulfide bond mapping in a mAb. The results demonstrate that ProAlanase is highly suitable for proteomics analysis of the arginine- and lysine-rich histones, enabling high sequence coverage of multiple histone family members. It also facilitates efficient digestion of bone collagen thanks to cleavage at the C terminus of hydroxyproline, which is highly prevalent in collagen. This allows the identification of complementary proteins in ProAlanase- and trypsin-digested ancient bone samples, as well as increased sequence coverage of noncollagenous proteins. Moreover, digestion with ProAlanase improves protein sequence coverage and phosphosite localization for the proline-rich protein Notch3 intracellular domain (N3ICD). Furthermore, we achieve nearly complete coverage of the N3ICD protein by de novo sequencing using the combination of ProAlanase and tryptic peptides. Finally, we demonstrate that ProAlanase is efficient in disulfide bond mapping, showing high coverage of disulfide-containing regions in a nonreduced mAb.
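The cleavage rule stated above (C-terminal to proline and alanine) can be compared with a trypsin-like rule using a minimal in silico digestion sketch. The function name, the toy sequence, and the simplified trypsin rule (cleave after every K or R, with no proline exception and no missed cleavages) are assumptions for illustration; real digests are less clean than this model.

```python
def digest_c_terminal(sequence, residues):
    """Cleave `sequence` on the C-terminal side of every residue in `residues`."""
    peptides, start = [], 0
    for i, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that has no cleavage site after it.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical toy sequence, not a real protein.
toy_protein = "MKTPLAGSPTRAEK"
proalanase_like = digest_c_terminal(toy_protein, "PA")
trypsin_like = digest_c_terminal(toy_protein, "KR")
print(proalanase_like)
print(trypsin_like)
```

The two rules cut the same sequence at different positions, which is the basis for the complementary sequence coverage the abstract describes when both digests are combined.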
Affiliation(s)
- Diana Samodova
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Maria V Giuli
  - Department of Molecular Medicine, Sapienza University of Rome, Rome, Italy
- Enrico Cappellini
  - Evolutionary Genomics Section, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Giulia Franciosa
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Christian D Kelstrup
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Jesper V Olsen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
10
Walker AA, Robinson SD, Hamilton BF, Undheim EAB, King GF. Deadly Proteomes: A Practical Guide to Proteotranscriptomics of Animal Venoms. Proteomics 2020; 20:e1900324. [PMID: 32820606 DOI: 10.1002/pmic.201900324] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 03/12/2020] [Revised: 08/07/2020] [Indexed: 11/11/2022]
Abstract
Animal venoms are renowned for their toxicity, biochemical complexity, and as a source of compounds with potential applications in medicine, agriculture, and industry. Polypeptides underlie much of the pharmacology of animal venoms, and elucidating these arsenals of polypeptide toxins (known as the venom proteome or venome) is an important step in venom research. Proteomics is used for the identification of venom toxins, determination of their primary structure including post-translational modifications, as well as investigations into the physiology underlying their production and delivery. Advances in proteomics and adjacent technologies have led to a recent upsurge in publications reporting venom proteomes. Improved mass spectrometers, better proteomic workflows, and the integration of next-generation sequencing of venom-gland transcriptomes and venomous animal genomes allow quicker and more accurate profiling of venom proteomes with greatly reduced starting material. Technologies such as imaging mass spectrometry are revealing additional insights into the mechanism, location, and kinetics of venom toxin production. However, these numerous new developments may be overwhelming for researchers designing venom proteome studies. Here, the field of venom proteomics is reviewed and some practical solutions for simplifying mass spectrometry workflows to study animal venoms are offered.
Affiliation(s)
- Andrew A Walker
  - Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- Samuel D Robinson
  - Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- Brett F Hamilton
  - Centre for Microscopy and Microanalysis, The University of Queensland, St. Lucia, Queensland, 4072, Australia
  - Centre for Advanced Imaging, The University of Queensland, St. Lucia, Queensland, 4072, Australia
- Eivind A B Undheim
  - Centre for Advanced Imaging, The University of Queensland, St. Lucia, Queensland, 4072, Australia
  - Department of Biology, Centre for Biodiversity Dynamics, NTNU, Trondheim, 7491, Norway
  - Department of Bioscience, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Blindern, Oslo, 0316, Norway
- Glenn F King
  - Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Queensland, 4072, Australia
11
O'Bryon I, Jenson SC, Merkley ED. Flying blind, or just flying under the radar? The underappreciated power of de novo methods of mass spectrometric peptide identification. Protein Sci 2020; 29:1864-1878. [PMID: 32713088 PMCID: PMC7454419 DOI: 10.1002/pro.3919] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/15/2020] [Revised: 07/21/2020] [Accepted: 07/23/2020] [Indexed: 12/15/2022]
Abstract
Mass spectrometry-based proteomics is a popular and powerful method for precise and highly multiplexed protein identification. The most common method of analyzing untargeted proteomics data is called database searching, where the database is simply a collection of protein sequences from the target organism, derived from genome sequencing. Experimental peptide tandem mass spectra are compared to simplified models of theoretical spectra calculated from the translated genomic sequences. However, in several interesting application areas, such as forensics, archaeology, venomics, and others, a genome sequence may not be available, or the correct genome sequence to use is not known. In these cases, de novo peptide identification can play an important role. De novo methods infer peptide sequence directly from the tandem mass spectrum without reference to a sequence database, usually using graph-based or machine learning algorithms. In this review, we provide a basic overview of de novo peptide identification methods and applications, briefly covering de novo algorithms and tools, and focusing in more depth on recent applications from venomics, metaproteomics, forensics, and characterization of antibody drugs.
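The "simplified models of theoretical spectra" mentioned above can be sketched as a list of singly charged b- and y-ion m/z values computed from a candidate peptide sequence. The function name and example peptide below are illustrative assumptions; the residue masses are standard monoisotopic values, and real search engines also model higher charge states, neutral losses, and modifications.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as singly charged m/z lists.

    b_ions[i] is the ion with i + 1 N-terminal residues;
    y_ions[i] is the ion with i + 1 C-terminal residues.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


# Hypothetical candidate peptide from a sequence database.
b_ions, y_ions = fragment_ions("PEPTIDE")
print([round(mz, 4) for mz in b_ions])
print([round(mz, 4) for mz in y_ions])
```

In database searching, lists like these are generated for each candidate peptide and matched against observed peaks within a mass tolerance; de novo methods instead infer the sequence from mass differences between peaks without such a candidate list.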
Affiliation(s)
- Isabelle O'Bryon
  - Chemical and Biological Signatures, Pacific Northwest National Laboratory, Richland, Washington, USA
- Sarah C. Jenson
  - Chemical and Biological Signatures, Pacific Northwest National Laboratory, Richland, Washington, USA
- Eric D. Merkley
  - Chemical and Biological Signatures, Pacific Northwest National Laboratory, Richland, Washington, USA
12
Mullahoo J, Zhang T, Clauser K, Carr SA, Jaffe JD, Papanastasiou M. Dual protease type XIII/pepsin digestion offers superior resolution and overlap for the analysis of histone tails by HX-MS. Methods 2020; 184:135-140. [PMID: 32004545 DOI: 10.1016/j.ymeth.2020.01.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/01/2019] [Revised: 01/22/2020] [Accepted: 01/26/2020] [Indexed: 01/26/2023]
Abstract
The N-terminal regions of histone proteins (tails) are dynamic elements that protrude from the nucleosome and are involved in many aspects of chromatin organization. Their epigenetic role is well-established, and post-translational modifications (PTMs) present on these regions contribute to transcriptional regulation. While hydrogen/deuterium exchange mass spectrometry (HX-MS) is well-suited for the analysis of dynamic structures, it has seldom been employed to analyze histones due to the poor N-terminal coverage obtained using pepsin. Here, we test the applicability of a dual protease type XIII/pepsin digestion column to this class of proteins. We optimize online digestion conditions using the H4 monomer, and extend the method to the analysis of histones in monomeric states and nucleosome core particles (NCPs). We show that the dual protease column generates many short and overlapping N-terminal peptides. We evaluate our method by performing hydrogen exchange experiments of NCPs for different time points and present full coverage of the tails at excellent resolution. We further employ electron transfer dissociation and showcase an unprecedented degree of overlap across multiple peptides that is several fold higher than previously reported methods. The method we report here may be readily applied to the HX-MS investigation of histone dynamics and to the footprints of histone binding proteins on nucleosomes.
Affiliation(s)
- James Mullahoo
- The Broad Institute of MIT and Harvard, Cambridge, MA, United States
| | - Terry Zhang
- Thermo Scientific, San Jose, CA, United States
| | - Karl Clauser
- The Broad Institute of MIT and Harvard, Cambridge, MA, United States
| | - Steven A Carr
- The Broad Institute of MIT and Harvard, Cambridge, MA, United States
| | - Jacob D Jaffe
- The Broad Institute of MIT and Harvard, Cambridge, MA, United States
|
13
|
Shaw JB, Liu W, Vasil′ev YV, Bracken CC, Malhan N, Guthals A, Beckman JS, Voinov VG. Direct Determination of Antibody Chain Pairing by Top-down and Middle-down Mass Spectrometry Using Electron Capture Dissociation and Ultraviolet Photodissociation. Anal Chem 2020; 92:766-773. [PMID: 31769659 PMCID: PMC7819135 DOI: 10.1021/acs.analchem.9b03129] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/13/2022]
Abstract
One challenge associated with the discovery and development of monoclonal antibody (mAb) therapeutics is the determination of heavy chain and light chain pairing. Advances in MS instrumentation and MS/MS methods have greatly enhanced capabilities for the analysis of large intact proteins yielding much more detailed and accurate proteoform characterization. Consequently, direct interrogation of intact antibodies or F(ab')2 and Fab fragments has the potential to significantly streamline therapeutic mAb discovery processes. Here, we demonstrate for the first time the ability to efficiently cleave disulfide bonds linking heavy and light chains of mAbs using electron capture dissociation (ECD) and 157 nm ultraviolet photodissociation (UVPD). The combination of intact mAb, Fab, or F(ab')2 mass, intact LC and Fd masses, and CDR3 sequence coverage enabled determination of heavy chain and light chain pairing from a single experiment and experimental condition. These results demonstrate the potential of top-down and middle-down proteomics to significantly streamline therapeutic antibody discovery.
Affiliation(s)
- Jared B. Shaw
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
| | - Weijing Liu
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
| | - Yury V. Vasil′ev
- e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
- Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
| | - Carter C. Bracken
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
| | - Neha Malhan
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, 3335 Innovation Boulevard, Richland, Washington 99354, United States
| | - Adrian Guthals
- Mapp Biopharmaceutical Inc., 6160 Lusk Boulevard #105, San Diego, California 92121, United States
| | - Joseph S. Beckman
- e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
- Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
| | - Valery G. Voinov
- e-MSion Inc., 2121 NE Jack London Drive, Corvallis, Oregon 97330, United States
- Linus Pauling Institute and the Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, United States
|
14
|
Fert-Bober J, Murray CI, Parker SJ, Van Eyk JE. Precision Profiling of the Cardiovascular Post-Translationally Modified Proteome: Where There Is a Will, There Is a Way. Circ Res 2019; 122:1221-1237. [PMID: 29700069 DOI: 10.1161/circresaha.118.310966] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 12/17/2022]
Abstract
There is an exponential increase in biological complexity as initial gene transcripts are spliced, translated into amino acid sequence, and post-translationally modified. Each protein can exist as multiple chemical or sequence-specific proteoforms, and each has the potential to be a critical mediator of a physiological or pathophysiological signaling cascade. Here, we provide an overview of how different proteoforms come about in biological systems and how they are most commonly measured using mass spectrometry-based proteomics and bioinformatics. Our goal is to present this information at a level accessible to every scientist interested in mass spectrometry and its application to proteome profiling. We will specifically discuss recent data linking various protein post-translational modifications to cardiovascular disease and conclude with a discussion of the enablement and democratization of proteomics across the cardiovascular and scientific community. The aim is to inform and inspire the readership to explore a larger breadth of proteoforms, particularly post-translational modifications, related to their particular areas of expertise in cardiovascular physiology.
Affiliation(s)
- Justyna Fert-Bober
- From the Advanced Clinical BioSystems Research Institute, Smidt Heart Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, CA
| | - Christopher I Murray
- From the Advanced Clinical BioSystems Research Institute, Smidt Heart Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, CA
| | - Sarah J Parker
- From the Advanced Clinical BioSystems Research Institute, Smidt Heart Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, CA
| | - Jennifer E Van Eyk
- From the Advanced Clinical BioSystems Research Institute, Smidt Heart Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, CA
|
15
|
Morsa D, Baiwir D, La Rocca R, Zimmerman TA, Hanozin E, Grifnée E, Longuespée R, Meuwis MA, Smargiasso N, De Pauw E, Mazzucchelli G. Multi-Enzymatic Limited Digestion: The Next-Generation Sequencing for Proteomics? J Proteome Res 2019; 18:2501-2513. [DOI: 10.1021/acs.jproteome.9b00044] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022]
Affiliation(s)
- Denis Morsa
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
- GIGA Proteomics Facility, University of Liege, Liege 4000, Belgium
| | - Dominique Baiwir
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
- GIGA Proteomics Facility, University of Liege, Liege 4000, Belgium
| | - Raphaël La Rocca
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Tyler A. Zimmerman
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Emeline Hanozin
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Elodie Grifnée
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Rémi Longuespée
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Marie-Alice Meuwis
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
- Department of Hepato-Gastroenterology and Digestive Oncology, CHU, Liege 4000, Belgium
- Laboratory of Translational Gastroenterology, GIGA, Liege 4000, Belgium
| | - Nicolas Smargiasso
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Edwin De Pauw
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
| | - Gabriel Mazzucchelli
- Mass Spectrometry Laboratory, MolSys Research Unit, University of Liege, Liege 4000, Belgium
|
16
|
Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments now offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has rarely been evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation of the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types in collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation modes using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. Finally, we discuss the potential for de novo sequencing to become more widely used in the field.
Affiliation(s)
- Thilo Muth
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
| | - Bernhard Y Renard
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
|
17
|
Yang H, Li YC, Zhao MZ, Wu FL, Wang X, Xiao WD, Wang YH, Zhang JL, Wang FQ, Xu F, Zeng WF, Overall CM, He SM, Chi H, Xu P. Precision De Novo Peptide Sequencing Using Mirror Proteases of Ac-LysargiNase and Trypsin for Large-scale Proteomics. Mol Cell Proteomics 2019; 18:773-785. [PMID: 30622160 PMCID: PMC6442358 DOI: 10.1074/mcp.tir118.000918] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/16/2018] [Revised: 11/20/2018] [Indexed: 11/06/2022] Open
Abstract
De novo peptide sequencing for large-scale proteomics remains challenging because of the lack of full coverage of ion series in tandem mass spectra. We developed a mirror protease of trypsin, acetylated LysargiNase (Ac-LysargiNase), with superior activity and stability. The mirror spectrum pairs derived from the Ac-LysargiNase and trypsin treated samples can generate full b and y ion series, which complement each other and allowed us to develop a novel algorithm, pNovoM, for de novo sequencing. Using pNovoM to sequence peptides of purified proteins, the accuracy of the sequence was close to 100%. More importantly, from a large-scale yeast proteome sample digested with trypsin and Ac-LysargiNase individually, 48% of all tandem mass spectra formed mirror spectrum pairs, 97% of which contained full coverage of ion series, resulting in precision de novo sequencing of full-length peptides by pNovoM. This enabled pNovoM to successfully sequence 21,249 peptides from 3,753 proteins and to interpret 44-152% more spectra than pNovo+ and PEAKS at a 5% FDR at the spectrum level. Moreover, the mirror protease strategy had an obvious advantage in sequencing long peptides. We believe that the combination of mirror protease strategy and pNovoM will be an effective approach for precision de novo sequencing on both single proteins and proteome samples.
Affiliation(s)
- Hao Yang
- From the ‡Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
| | - Yan-Chang Li
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Ming-Zhi Zhao
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Fei-Lin Wu
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Xi Wang
- From the ‡Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
| | - Wei-Di Xiao
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Yi-Hao Wang
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Jun-Ling Zhang
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Fu-Qiang Wang
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Feng Xu
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
| | - Wen-Feng Zeng
- From the ‡Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
| | - Christopher M Overall
- ‖Centre for Blood Research, University of British Columbia, Vancouver, British Columbia, Canada
| | - Si-Min He
- From the ‡Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
| | - Hao Chi
- From the ‡Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
| | - Ping Xu
- §State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China; ¶Key Laboratory of Combinatorial Biosynthesis and Drug Discovery of Ministry of Education Wuhan University, Wuhan University School of Pharmaceutical Sciences, Wuhan 430071, China; College of Life Sciences, Hebei University, Baoding 071002, China
|
18
|
Muth T, Hartkopf F, Vaudel M, Renard BY. A Potential Golden Age to Come-Current Tools, Recent Use Cases, and Future Avenues for De Novo Sequencing in Proteomics. Proteomics 2018; 18:e1700150. [PMID: 29968278 DOI: 10.1002/pmic.201700150] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/23/2018] [Revised: 05/23/2018] [Indexed: 01/15/2023]
Abstract
In shotgun proteomics, peptide and protein identification is most commonly conducted using database search engines, the method of choice when reference protein sequences are available. Despite its widespread use the database-driven approach is limited, mainly because of its static search space. In contrast, de novo sequencing derives peptide sequence information in an unbiased manner, using only the fragment ion information from the tandem mass spectra. In recent years, with the improvements in MS instrumentation, various new methods have been proposed for de novo sequencing. This review article provides an overview of existing de novo sequencing algorithms and software tools ranging from peptide sequencing to sequence-to-protein mapping. Various use cases are described for which de novo sequencing was successfully applied. Finally, limitations of current methods are highlighted and new directions are discussed for a wider acceptance of de novo sequencing in the community.
Affiliation(s)
- Thilo Muth
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
| | - Felix Hartkopf
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
| | - Marc Vaudel
- K.G. Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, 5020, Bergen, Norway; Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, 5020, Bergen, Norway
| | - Bernhard Y Renard
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
|
19
|
Affiliation(s)
- Nicholas M. Riley
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
| | - Joshua J. Coon
- Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Morgridge Institute for Research, Madison, Wisconsin 53715, United States
|
20
|
Abstract
Peptidomics is the comprehensive characterization of peptides from biological sources mainly by HPLC and mass spectrometry. Mass spectrometry allows the detection of a multitude of single peptides in complex mixtures. The term first appeared in full papers in the year 2001, after over 100 years of peptide research with a main focus on one or a few specific peptides. Within the last 15 years, this new field has grown to over 1200 publications. Mass spectrometry techniques, in combination with other analytical methods, were developed for the fast and comprehensive analysis of peptides in proteomics and specifically adjusted to implement peptidomics technologies. Although peptidomics is closely linked to proteomics, there are fundamental differences with conventional bottom-up proteomics. The development of peptidomics is described, including the most important implementations for its technological basis. Different strategies are covered which are applied to several important applications, such as neuropeptidomics and discovery of bioactive peptides or biomarkers. This overview includes links to all other chapters in the book as well as recent developments of separation, mass spectrometric, and data processing technologies. Additionally, some new applications in food and plant peptidomics as well as immunopeptidomics are introduced.
|
21
|
Blank-Landeshammer B, Kollipara L, Biß K, Pfenninger M, Malchow S, Shuvaev K, Zahedi RP, Sickmann A. Combining De Novo Peptide Sequencing Algorithms, A Synergistic Approach to Boost Both Identifications and Confidence in Bottom-up Proteomics. J Proteome Res 2017; 16:3209-3218. [DOI: 10.1021/acs.jproteome.7b00198] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 01/28/2023]
Affiliation(s)
| | - Laxmikanth Kollipara
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
| | - Karsten Biß
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
| | - Markus Pfenninger
- Biodiversity and Climate Research Centre, Senckenberg Gesellschaft für Naturforschung, 60325 Frankfurt am Main, Germany
- Faculty of Biological Science, Institute for Ecology, Evolution and Diversity, Department of Molecular Ecology, Goethe University, Max-von-Laue-Straße 9, 60438 Frankfurt am Main, Germany
| | - Sebastian Malchow
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
| | - Konstantin Shuvaev
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
| | - René P. Zahedi
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
| | - Albert Sickmann
- Leibniz-Institut für Analytische Wissenschaften − ISAS − e.V., 44139 Dortmund, Germany
- Medizinische Fakultät, Medizinische Proteom-Center (MPC), Ruhr-Universität Bochum, 44801 Bochum, Germany
- Department of Chemistry, College of Physical Sciences, University of Aberdeen, Aberdeen AB24 3FX, Scotland, United Kingdom
|
22
|
Abstract
De novo peptide sequencing from tandem MS data is the key technology in proteomics for the characterization of proteins, especially for new sequences, such as mAbs. In this study, we propose a deep neural network model, DeepNovo, for de novo peptide sequencing. DeepNovo architecture combines recent advances in convolutional neural networks and recurrent neural networks to learn features of tandem mass spectra, fragment ions, and sequence patterns of peptides. The networks are further integrated with local dynamic programming to solve the complex optimization task of de novo sequencing. We evaluated the method on a wide variety of species and found that DeepNovo considerably outperformed state of the art methods, achieving 7.7-22.9% higher accuracy at the amino acid level and 38.1-64.0% higher accuracy at the peptide level. We further used DeepNovo to automatically reconstruct the complete sequences of antibody light and heavy chains of mouse, achieving 97.5-100% coverage and 97.2-99.5% accuracy, without assisting databases. Moreover, DeepNovo is retrainable to adapt to any sources of data and provides a complete end-to-end training and prediction solution to the de novo sequencing problem. Not only does our study extend the deep learning revolution to a new field, but it also shows an innovative approach in solving optimization problems by using deep learning and dynamic programming.
|
23
|
Trevisan-Silva D, Bednaski AV, Fischer JSG, Veiga SS, Bandeira N, Guthals A, Marchini FK, Leprevost FV, Barbosa VC, Senff-Ribeiro A, Carvalho PC. A multi-protease, multi-dissociation, bottom-up-to-top-down proteomic view of the Loxosceles intermedia venom. Sci Data 2017; 4:170090. [PMID: 28696408 PMCID: PMC5505115 DOI: 10.1038/sdata.2017.90] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 12/16/2016] [Accepted: 05/12/2017] [Indexed: 12/15/2022] Open
Abstract
Venoms are a rich source for the discovery of molecules with biotechnological applications, but their analysis is challenging even for state-of-the-art proteomics. Here we report on a large-scale proteomic assessment of the venom of Loxosceles intermedia, the so-called brown spider. Venom was extracted from 200 spiders and fractionated into two aliquots relative to a 10 kDa cutoff mass. Each of these was further fractionated and digested with trypsin (4 h), trypsin (18 h), pepsin (18 h), and chymotrypsin (18 h), then analyzed by MudPIT on an LTQ-Orbitrap XL ETD mass spectrometer fragmenting precursors by CID, HCD, and ETD. Aliquots of undigested samples were also analyzed. Our experimental design allowed us to apply spectral networks, thus enabling us to obtain meta-contig assemblies, and consequently de novo sequencing of practically complete proteins, culminating in a deep proteome assessment of the venom. Data are available via ProteomeXchange, with identifier PXD005523.
Affiliation(s)
- Dilza Trevisan-Silva
- Department of Cell Biology, Federal University of Paraná, Curitiba 81531-980, Brazil
| | - Aline V Bednaski
- Department of Cell Biology, Federal University of Paraná, Curitiba 81531-980, Brazil
| | - Juliana S G Fischer
- Computational Mass Spectrometry & Proteomics Group, Carlos Chagas Institute, Fiocruz, Curitiba 81.350-010, Brazil
| | - Silvio S Veiga
- Department of Cell Biology, Federal University of Paraná, Curitiba 81531-980, Brazil
| | - Nuno Bandeira
- Center for Computational Mass Spectrometry, University of California, San Diego 92093-0404, USA
| | - Adrian Guthals
- Center for Computational Mass Spectrometry, University of California, San Diego 92093-0404, USA
| | - Fabricio K Marchini
- Functional Genomics Laboratory, Carlos Chagas Institute, Fiocruz, Curitiba 81.350-010, Brazil; Mass Spectrometry Facility RPT02H, Carlos Chagas Institute, Fiocruz, Curitiba 81.350-010, Brazil
| | - Felipe V Leprevost
- Computational Mass Spectrometry & Proteomics Group, Carlos Chagas Institute, Fiocruz, Curitiba 81.350-010, Brazil
| | - Valmir C Barbosa
- Systems Engineering and Computer Science Program, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro 21941-914, Brazil
| | - Andrea Senff-Ribeiro
- Department of Cell Biology, Federal University of Paraná, Curitiba 81531-980, Brazil
| | - Paulo C Carvalho
- Computational Mass Spectrometry & Proteomics Group, Carlos Chagas Institute, Fiocruz, Curitiba 81.350-010, Brazil; Laboratory of Toxinology, Fiocruz, Rio de Janeiro 21040-900, Brazil
|
24
|
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. MASS SPECTROMETRY REVIEWS 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Kshitij Khatri
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
|
25
|
Ruggles KV, Krug K, Wang X, Clauser KR, Wang J, Payne SH, Fenyö D, Zhang B, Mani DR. Methods, Tools and Current Perspectives in Proteogenomics. Mol Cell Proteomics 2017; 16:959-981. [PMID: 28456751 DOI: 10.1074/mcp.mr117.000024] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Received: 04/13/2017] [Indexed: 12/20/2022] Open
Abstract
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches, and in this article we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications.
Affiliation(s)
- Kelly V Ruggles
- From the ‡Department of Medicine, New York University School of Medicine, New York, New York 10016
| | - Karsten Krug
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Xiaojing Wang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030; ‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Karl R Clauser
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Jing Wang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030; ‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Samuel H Payne
- **Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354
| | - David Fenyö
- ‡‡Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York 10016; .,§§Institute for Systems Genetics, New York University School of Medicine, New York, New York 10016
| | - Bing Zhang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030; .,‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - D R Mani
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142;
| |
Collapse
|
26
|
Riley NM, Westphall MS, Hebert AS, Coon JJ. Implementation of Activated Ion Electron Transfer Dissociation on a Quadrupole-Orbitrap-Linear Ion Trap Hybrid Mass Spectrometer. Anal Chem 2017; 89:6358-6366. [PMID: 28383247 DOI: 10.1021/acs.analchem.7b00213] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Abstract
Using concurrent IR photoactivation during electron transfer dissociation (ETD) reactions, i.e., activated ion ETD (AI-ETD), significantly increases dissociation efficiency resulting in improved overall performance. Here we describe the first implementation of AI-ETD on a quadrupole-Orbitrap-quadrupole linear ion trap (QLT) hybrid MS system (Orbitrap Fusion Lumos) and demonstrate the substantial benefits it offers for peptide characterization. First, we show that AI-ETD can be implemented in a straightforward manner by fastening the laser and guiding optics to the instrument chassis itself, making alignment with the trapping volume of the QLT simple and robust. We then characterize the performance of AI-ETD using standard peptides in addition to complex mixtures of tryptic peptides using LC-MS/MS, showing not only that AI-ETD can nearly double the identifications achieved with ETD alone but also that it outperforms the other available supplemental activation methods (ETcaD and EThcD). Finally, we introduce a new activation scheme called AI-ETD+ that combines AI-ETD in the high-pressure cell of the QLT with a short infrared multiphoton dissociation (IRMPD) activation in the low-pressure cell. This reaction scheme introduces no additional time to the scan duty cycle but generates MS/MS spectra rich in b/y-type and c/z•-type product ions. The extensive generation of fragment ions in AI-ETD+ substantially increases peptide sequence coverage while also improving peptide identifications over all other ETD methods, making it a valuable new tool for hybrid fragmentation approaches.
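To make the b/y-type and c/z•-type product ions mentioned in this abstract concrete, the following is a minimal sketch that computes their theoretical singly charged m/z values for an unmodified peptide. The function name `fragment_ions` and the example peptide are illustrative assumptions, not anything taken from the cited paper; the masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276     # mass of a proton
WATER = 18.010565     # H2O
AMMONIA = 17.026549   # NH3
HYDROGEN = 1.007825   # H atom


def fragment_ions(peptide):
    """Hypothetical helper: singly charged b, c, y and z-dot m/z per cleavage site.

    Site i splits the peptide after residue i, so the N-terminal ions are
    b_i / c_i and the C-terminal ions are y_(n-i) / z-dot_(n-i).
    """
    total = sum(RESIDUE[aa] for aa in peptide)
    prefix = 0.0
    ions = []
    for i in range(1, len(peptide)):
        prefix += RESIDUE[peptide[i - 1]]
        b = prefix + PROTON                    # N-terminal fragment (CID/HCD/IRMPD type)
        y = (total - prefix) + WATER + PROTON  # C-terminal fragment (CID/HCD/IRMPD type)
        ions.append({
            "site": i,
            "b": b,
            "c": b + AMMONIA,                  # ETD-type N-terminal ion
            "y": y,
            "z_dot": y - AMMONIA + HYDROGEN,   # ETD-type radical C-terminal ion
        })
    return ions


for ion in fragment_ions("PEPTIDE"):
    print(ion["site"], round(ion["b"], 4), round(ion["c"], 4),
          round(ion["y"], 4), round(ion["z_dot"], 4))
```

A spectrum that contains both ion families gives two independent mass ladders per cleavage site, which is one way to read the reported gain in sequence coverage.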
Affiliation(s)
- Joshua J Coon
  - Morgridge Institute for Research, Madison, Wisconsin 53715, United States
27
Savidor A, Barzilay R, Elinger D, Yarden Y, Lindzen M, Gabashvili A, Adiv Tal O, Levin Y. Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination. Mol Cell Proteomics 2017; 16:1151-1161. [PMID: 28348172 DOI: 10.1074/mcp.o116.065417] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/10/2016] [Revised: 03/22/2017] [Indexed: 01/16/2023]
Abstract
Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence.
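The two ambiguities excluded from the accuracy figures above are mass coincidences: leucine and isoleucine are isomers, and deamidation converts glutamine into glutamic acid. The sketch below checks both with standard monoisotopic masses and scores a predicted sequence against a reference with and without those ambiguities. The `accuracy` function and its lenient mode, which treats every Q/E mismatch as a possible deamidation, are simplifying assumptions for illustration and not the scoring procedure of the cited paper.

```python
# Standard monoisotopic residue masses (Da).
LEU = 113.08406
ILE = 113.08406
GLN = 128.05858
GLU = 129.04259
DEAMIDATION = 0.984016  # mass shift when an amide NH2 is replaced by OH


def _collapse(seq):
    """Map residues that MS/MS mass ladders cannot tell apart onto one symbol."""
    return seq.replace("I", "L").replace("Q", "E")


def accuracy(predicted, reference, lenient=True):
    """Hypothetical scorer: fraction of aligned positions that agree."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if lenient:
        predicted, reference = _collapse(predicted), _collapse(reference)
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


print(LEU == ILE)                                # isomers, identical mass
print(abs((GLN + DEAMIDATION) - GLU) < 1e-4)     # deamidated Q matches E
print(accuracy("ELVQS", "EIVES", lenient=False)) # strict comparison
print(accuracy("ELVQS", "EIVES", lenient=True))  # ambiguities excluded
```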
Affiliation(s)
- Alon Savidor
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
- Rotem Barzilay
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
- Dalia Elinger
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
- Yosef Yarden
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel 76100
- Moshit Lindzen
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel 76100
- Alexandra Gabashvili
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
- Ophir Adiv Tal
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
- Yishai Levin
  - The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
28
Horton AP, Robotham SA, Cannon JR, Holden DD, Marcotte EM, Brodbelt JS. Comprehensive de Novo Peptide Sequencing from MS/MS Pairs Generated through Complementary Collision Induced Dissociation and 351 nm Ultraviolet Photodissociation. Anal Chem 2017; 89:3747-3753. [PMID: 28234449 PMCID: PMC5480239 DOI: 10.1021/acs.analchem.7b00130] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/01/2023]
Abstract
We describe a strategy for de novo peptide sequencing based on matched pairs of tandem mass spectra (MS/MS) obtained by collision induced dissociation (CID) and 351 nm ultraviolet photodissociation (UVPD). Each precursor ion is isolated twice with the mass spectrometer switching between CID and UVPD activation modes to obtain a complementary MS/MS pair. To interpret these paired spectra, we modified the UVnovo de novo sequencing software to automatically learn from and interpret fragmentation spectra, provided a representative set of training data. This machine learning procedure, using random forests, synthesizes information from one or multiple complementary spectra, such as the CID/UVPD pairs, into peptide fragmentation site predictions. In doing so, the burden of fragmentation model definition shifts from programmer to machine and opens up the model parameter space for inclusion of nonobvious features and interactions. This spectral synthesis also serves to transform distinct types of spectra into a common representation for subsequent activation-independent processing steps. Then, independent from precursor activation constraints, UVnovo's de novo sequencing procedure generates and scores sequence candidates for each precursor. We demonstrate the combined experimental and computational approach for de novo sequencing using whole cell E. coli lysate. In benchmarks on the CID/UVPD data, UVnovo assigned correct full-length sequences to 83% of the spectral pairs of doubly charged ions with high-confidence database identifications. Considering only top-ranked de novo predictions, 70% of the pairs were deciphered correctly. This de novo sequencing performance exceeds that of PEAKS and PepNovo on the CID spectra and that of UVnovo on CID or UVPD spectra alone. As presented here, the methods for paired CID/UVPD spectral acquisition and interpretation constitute a powerful workflow for high-throughput and accurate de novo peptide sequencing.
Affiliation(s)
- Andrew P Horton
  - Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas, Austin, Texas 78712, United States
- Scott A Robotham
  - Department of Chemistry, University of Texas, Austin, Texas 78712, United States
- Joe R Cannon
  - Department of Chemistry, University of Texas, Austin, Texas 78712, United States
- Dustin D Holden
  - Department of Chemistry, University of Texas, Austin, Texas 78712, United States
- Edward M Marcotte
  - Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas, Austin, Texas 78712, United States
- Jennifer S Brodbelt
  - Department of Chemistry, University of Texas, Austin, Texas 78712, United States
29
Vyatkina K. De Novo Sequencing of Top-Down Tandem Mass Spectra: A Next Step towards Retrieving a Complete Protein Sequence. Proteomes 2017; 5:E6. [PMID: 28248257 PMCID: PMC5372227 DOI: 10.3390/proteomes5010006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 11/10/2016] [Revised: 01/30/2017] [Accepted: 02/04/2017] [Indexed: 11/16/2022]
Abstract
De novo sequencing of tandem (MS/MS) mass spectra represents the only way to determine the sequence of proteins from organisms with unknown genomes, or of those not directly inscribed in a genome, such as antibodies or novel splice variants. Top-down mass spectrometry provides new opportunities for analyzing such proteins; however, retrieving a complete protein sequence from top-down MS/MS spectra still remains a distant goal. In this paper, we review the state of the art on this subject, and enhance our previously developed Twister algorithm for de novo sequencing of peptides from top-down MS/MS spectra to derive longer sequence fragments of a target protein.
Affiliation(s)
- Kira Vyatkina
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, 7-9 Universitetskaya nab., St. Petersburg 199034, Russia
  - Department of Mathematical and Information Technologies, Saint Petersburg Academic University, 8/3 Khlopina st., St. Petersburg 194021, Russia
30
Guan X, Brownstein NC, Young NL, Marshall AG. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion. Rapid Commun Mass Spectrom 2017; 31:207-217. [PMID: 27813191 DOI: 10.1002/rcm.7783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/29/2016] [Revised: 10/29/2016] [Accepted: 10/30/2016] [Indexed: 06/06/2023]
Abstract
RATIONALE: Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage. METHODS: Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry. RESULTS: Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented. CONCLUSIONS: Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification. Copyright © 2016 John Wiley & Sons, Ltd.
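The statement that paired Lys-C and Lys-N digests give peptides of identical mass can be illustrated with an in-silico digest. The sketch below assumes complete cleavage with no missed sites (Lys-C cutting after each lysine, Lys-N before it) and uses a made-up sequence; the function names are hypothetical and this is not the authors' software.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def lys_c(seq):
    """Cleave C-terminal to every K (idealised, no missed cleavages)."""
    return re.findall(r"[^K]*K|[^K]+$", seq)


def lys_n(seq):
    """Cleave N-terminal to every K (idealised, no missed cleavages)."""
    return re.findall(r"^[^K]+|K[^K]*", seq)


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER


def transposed_pairs(seq):
    """Pair each Lys-C peptide 'X...K' with a Lys-N peptide 'KX...' if present."""
    n_side = set(lys_n(seq))
    pairs = []
    for pep in lys_c(seq):
        partner = "K" + pep[:-1]   # same residues, lysine moved to the N-terminus
        if pep.endswith("K") and partner in n_side:
            pairs.append((pep, partner))
    return pairs


protein = "AAKGGSKLLTKPP"  # made-up sequence for illustration
for c_pep, n_pep in transposed_pairs(protein):
    print(c_pep, n_pep, round(peptide_mass(c_pep), 4), round(peptide_mass(n_pep), 4))
```

Each pair has the same composition and therefore the same precursor mass, while the lysine sits at opposite termini, so the two fragment ladders differ.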
Affiliation(s)
- Xiaoyan Guan
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
- Naomi C Brownstein
  - Department of Behavioral Sciences and Social Medicine, College of Medicine, Florida State University, 1115 W. Call St., Tallahassee, FL, 32306, USA
  - Department of Statistics, Florida State University, 117 N. Woodward Ave., Tallahassee, FL, 32306, USA
- Nicolas L Young
  - Verna & Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, One Baylor Plaza, MS-125, Houston, TX, 77030-3411, USA
- Alan G Marshall
  - Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
  - Department of Chemistry and Biochemistry, Florida State University, 95 Chieftain Way, Tallahassee, FL, 32303, USA
31
Abstract
Background: De novo peptide sequencing via tandem mass spectrometry (MS/MS) has developed rapidly in recent years. With the use of spectra pairs from the same peptide under different fragmentation modes, the performance of de novo sequencing is greatly improved. Currently, with large numbers of spectra sequenced every day, spectral libraries containing tens of thousands of annotated experimental MS/MS spectra have become available. These libraries provide information on spectral properties and thus have the potential to be used with de novo sequencing to improve its performance. Results: In this study, an improved de novo sequencing method assisted by spectral libraries is proposed. It uses spectral libraries as training datasets and introduces significant scores of the features used in our previous de novo sequencing method for HCD and ETD spectra pairs. Two pairs of HCD and ETD spectral datasets were used to test the performance of the proposed method and our previous method. The results show that the proposed method achieves better sequencing accuracy, with higher-ranked correct sequences and less computational time. Conclusions: This paper proposes an advanced de novo sequencing method for HCD and ETD spectra pairs that uses information from spectral libraries and significantly improves on previous similar methods.
Affiliation(s)
- Yan Yan
  - Department of Computer Science, Faculty of Science, University of Western Ontario, London, Canada
- Kaizhong Zhang
  - Department of Computer Science, Faculty of Science, University of Western Ontario, London, Canada
32
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics. Adv Exp Med Biol 2016. [PMID: 27975219 DOI: 10.1007/978-3-319-41448-5_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]
Abstract
Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.
33
Przybylski C, Benito JM, Bonnet V, Mellet CO, García Fernández JM. Toward a suitable structural analysis of gene delivery carrier based on polycationic carbohydrates by electron transfer dissociation tandem mass spectrometry. Anal Chim Acta 2016; 948:62-72. [PMID: 27871611 DOI: 10.1016/j.aca.2016.11.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 08/21/2016] [Revised: 10/02/2016] [Accepted: 11/04/2016] [Indexed: 01/01/2023]
Abstract
Polycationic carbohydrates represent an attractive class of biomolecules for several applications, particularly as non-viral gene delivery vectors. In this case, establishing structure-biological activity relationships requires sensitive and accurate characterization tools both to control and to achieve fine structural deciphering. Electrospray-tandem mass spectrometry (ESI-MS/MS) appears to be a suitable approach to address these questions. In this study, we investigated the usefulness of electron transfer dissociation (ETD) for obtaining structural data on five polycationic carbohydrates shown to be promising gene delivery agents. Particular attention was paid to the influence of charge state, fluoranthene reaction time, and supplementary activation (SA) on the production of charge-reduced species and on fragmentation yield, which varied from 2 to 62%, and to obtaining the greatest diversity and intensity of fragments for each charge state and target compound. ETD fragmentation appeared to be directed mainly toward the pendant groups rather than the cyclic carbohydrate scaffold, leading to partial sequencing of building blocks when amino groups are close to the carbohydrate core but allowing complete structural deciphering of some of them, such as those including a dithioureidocysteaminyl group, which was not possible with CID alone. These findings can guide the rational choice of suitable analytical conditions according to the nature of the gene delivery molecules exhibiting polycationic features. Moreover, our ETD-MS/MS approach opens the way to fine sequencing and identification of grafted groups carried on various sets of oligo-/polysaccharides in fields such as glycobiology or nanomaterials, even with unknown or questionable extraction, synthesis or modification steps.
Affiliation(s)
- Cédric Przybylski
  - Université d'Evry-Val-d'Essonne, Laboratoire Analyse et Modélisation pour la Biologie et l'Environnement, CNRS UMR 8587, Bâtiment Maupertuis, Bld F. Mitterrand, F-91025 Evry, France
- Juan M Benito
  - Instituto de Investigaciones Químicas (IIQ), CSIC-Universidad de Sevilla, Américo Vespucio 49, Isla de la Cartuja, E-41092 Sevilla, Spain
- Véronique Bonnet
  - Université de Picardie Jules Verne, Laboratoire de Glycochimie, des Antimicrobiens et des Agroressources, CNRS UMR 7378, 80039 Amiens, France
- Carmen Ortiz Mellet
  - Departamento de Química Orgánica, Facultad de Química, Universidad de Sevilla, E-41012 Sevilla, Spain
- José M García Fernández
  - Instituto de Investigaciones Químicas (IIQ), CSIC-Universidad de Sevilla, Américo Vespucio 49, Isla de la Cartuja, E-41092 Sevilla, Spain
34
Guthals A, Gan Y, Murray L, Chen Y, Stinson J, Nakamura G, Lill JR, Sandoval W, Bandeira N. De Novo MS/MS Sequencing of Native Human Antibodies. J Proteome Res 2016; 16:45-54. [PMID: 27779884 DOI: 10.1021/acs.jproteome.6b00608] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 11/28/2022]
Abstract
One direct route for the discovery of therapeutic human monoclonal antibodies (mAbs) involves the isolation of peripheral B cells from survivors/sero-positive individuals after exposure to an infectious reagent or disease etiology, followed by single-cell sequencing or hybridoma generation. Peripheral B cells, however, are not always easy to obtain and represent only a small percentage of the total B-cell population across all bodily tissues. Although it has been demonstrated that tandem mass spectrometry (MS/MS) techniques can interrogate the full polyclonal antibody (pAb) response to an antigen in vivo, all current approaches identify MS/MS spectra against databases derived from genetic sequencing of B cells from the same patient. In this proof-of-concept study, we demonstrate the feasibility of a novel MS/MS antibody discovery approach in which only serum antibodies are required without the need for sequencing of genetic material. Peripheral pAbs from a cytomegalovirus-exposed individual were purified by glycoprotein B antigen affinity and de novo sequenced from MS/MS data. Purely MS-derived mAbs were then manufactured in mammalian cells to validate potency via antigen-binding ELISA. Interestingly, we found that these mAbs accounted for 1 to 2% of total donor IgG but were not detected in parallel sequencing of memory B cells from the same patient.
Affiliation(s)
- Adrian Guthals
  - Mapp Biopharmaceutical, Inc., 6160 Lusk Boulevard #C105, San Diego, California 92121, United States
- Yutian Gan
  - Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
- Laura Murray
  - Department of Protein Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
- Yongmei Chen
  - Department of Antibody Engineering, Genentech, Inc., South San Francisco, California 94080, United States
- Jeremy Stinson
  - Department of Molecular Biology, Genentech, Inc., South San Francisco, California 94080, United States
- Gerald Nakamura
  - Department of Antibody Engineering, Genentech, Inc., South San Francisco, California 94080, United States
- Jennie R Lill
  - Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
- Wendy Sandoval
  - Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
- Nuno Bandeira
  - Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0404, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, Mail Code 0657, La Jolla, California 92093, United States
35
Vyatkina K, Wu S, Dekker LJM, VanDuijn MM, Liu X, Tolić N, Luider TM, Paša-Tolić L, Pevzner PA. Top-down analysis of protein samples by de novo sequencing techniques. Bioinformatics 2016; 32:2753-9. [PMID: 27187201 PMCID: PMC6280873 DOI: 10.1093/bioinformatics/btw307] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 11/09/2015] [Revised: 03/31/2016] [Accepted: 05/09/2016] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Recent technological advances have made high-resolution mass spectrometers affordable to many laboratories, thus boosting rapid development of top-down mass spectrometry and creating a need for efficient methods for analyzing this kind of data. RESULTS: We describe a method for analysis of protein samples from top-down tandem mass spectrometry data, which capitalizes on de novo sequencing of fragments of the proteins present in the sample. Our algorithm takes as input a set of de novo amino acid strings derived from the given mass spectra using the recently proposed Twister approach, and combines them into aggregated strings endowed with offsets. The former typically constitute accurate sequence fragments of sufficiently well-represented proteins from the sample being analyzed, while the latter indicate their location in the protein sequence, and also bear information on post-translational modifications and fragmentation patterns. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://bioinf.spbau.ru/en/twister. CONTACT: vyatkina@spbau.ru or ppevzner@ucsd.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kira Vyatkina
  - Algorithmic Biology Laboratory, Saint Petersburg Academic University, St Petersburg, Russia
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, St Petersburg, Russia
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, USA
- Lennard J M Dekker
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Martijn M VanDuijn
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Xiaowen Liu
  - Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Nikola Tolić
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Theo M Luider
  - Department of Neurology, Erasmus University Medical Center, Rotterdam, The Netherlands
- Ljiljana Paša-Tolić
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Pavel A Pevzner
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, St Petersburg, Russia
  - Department of Computer Science and Engineering, University of California, San Diego, CA, USA
36
Na S, Payne SH, Bandeira N. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Mol Cell Proteomics 2016; 15:3501-3512. [PMID: 27609420 PMCID: PMC5098046 DOI: 10.1074/mcp.o116.060913] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/10/2016] [Indexed: 11/25/2022]
Abstract
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Affiliation(s)
- Seungjin Na
  - Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
  - Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
- Samuel H Payne
  - Pacific Northwest National Laboratory, Richland, Washington 99354
- Nuno Bandeira
  - Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
  - Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, 92093
37
Complete De Novo Assembly of Monoclonal Antibody Sequences. Sci Rep 2016; 6:31730. [PMID: 27562653 PMCID: PMC4999880 DOI: 10.1038/srep31730] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 04/18/2016] [Accepted: 07/20/2016] [Indexed: 11/25/2022]
Abstract
De novo protein sequencing is one of the key problems in mass spectrometry-based proteomics, especially for novel proteins such as monoclonal antibodies for which genome information is often limited or not available. However, due to limitations in peptide fragmentation and coverage, as well as ambiguities in spectra interpretation, complete de novo assembly of unknown protein sequences still remains challenging. To address this problem, we propose an integrated system, ALPS, which for the first time can automatically assemble full-length monoclonal antibody sequences. Our system integrates de novo sequenced peptides, their quality scores and error-correction information from databases into a weighted de Bruijn graph to assemble protein sequences. We evaluated ALPS performance on two antibody data sets, each including a heavy chain and a light chain. The results show that ALPS was able to assemble three complete monoclonal antibody sequences of length 216–441 AA, at 100% coverage, and 96.64–100% accuracy.
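The idea of assembling peptides through a weighted de Bruijn graph can be sketched in a few lines. What follows is a toy illustration only, not the ALPS implementation: it assumes each peptide carries a single score, sums those scores onto k-mer edges, and follows the heaviest unused edge greedily, with no database error correction. The function name, scores and peptide strings are made up.

```python
from collections import defaultdict


def assemble(scored_peptides, k=4):
    """Greedy walk over a weighted de Bruijn graph built from peptide k-mers."""
    edges = defaultdict(dict)  # (k-1)-mer -> {next (k-1)-mer: summed score}
    has_incoming = set()
    for peptide, score in scored_peptides:
        for i in range(len(peptide) - k + 1):
            kmer = peptide[i:i + k]
            u, v = kmer[:-1], kmer[1:]
            edges[u][v] = edges[u].get(v, 0.0) + score
            has_incoming.add(v)

    # Start from a node that nothing points to (sorted for determinism).
    starts = sorted(u for u in edges if u not in has_incoming)
    if not starts:
        raise ValueError("no start node: graph is empty or cyclic")

    node = starts[0]
    contig, used = node, set()
    while True:
        options = [(w, v) for v, w in edges.get(node, {}).items()
                   if (node, v) not in used]
        if not options:
            return contig
        _, nxt = max(options)      # follow the highest-weight unused edge
        used.add((node, nxt))
        contig += nxt[-1]
        node = nxt


peptides = [
    ("EVQLVESG", 1.0),
    ("LVESGGGL", 1.0),
    ("SGGGLVQP", 1.0),
    ("LVESGAGL", 0.2),  # low-scoring read with one wrong residue
]
print(assemble(peptides, k=4))
```

In this toy input the low-scoring peptide opens a side branch at `ESG`, and the edge weights steer the walk along the better-supported path.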
38
Abreu TF, Sumitomo BN, Nishiyama MY, Oliveira UC, Souza GHMF, Kitano ES, Zelanis A, Serrano SMT, Junqueira-de-Azevedo I, Silva PI, Tashima AK. Peptidomics of Acanthoscurria gomesiana spider venom reveals new toxins with potential antimicrobial activity. J Proteomics 2016; 151:232-242. [PMID: 27436114 DOI: 10.1016/j.jprot.2016.07.012] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 02/05/2016] [Revised: 06/22/2016] [Accepted: 07/13/2016] [Indexed: 12/24/2022]
Abstract
Acanthoscurria gomesiana is a Brazilian spider from the Theraphosidae family inhabiting regions of Southeastern Brazil. Potent antimicrobial peptides such as gomesin and acanthoscurrin have been discovered in the spider hemolymph in previous works. Spider venoms are also recognized as sources of biologically active peptides; however, the venom peptidome of A. gomesiana remained unexplored to date. In this work, an MS-based workflow was applied to the investigation of the spider venom peptidome. Data-independent and data-dependent LC-MS/MS acquisitions of intact peptides and of peptides submitted to multiple enzyme digestions, followed by automated chromatographic alignment, de novo analysis, database and homology searches with manual validations showed that the venom is composed of <165 features, with masses ranging from 0.4-15.8 kDa. From digestions, 135 peptides were identified from 17 proteins, including three new mature peptides: U1-TRTX-Agm1a, U1-TRTX-Agm2a and U1-TRTX-Agm3a, containing 3, 4 and 3 disulfide bonds, respectively. The toxin U1-TRTX-Agm1a differed by only one amino acid from U1-TRTX-Ap1a from A. paulensis, and U1-TRTX-Agm2a was derived from the genicutoxin-D1 precursor from A. geniculata. These toxins have potential applications as antimicrobial agents, as the peptide fraction of A. gomesiana showed activity against Escherichia coli, Enterobacter cloacae and Candida albicans strains. MS data are available via ProteomeXchange Consortium with identifier PXD003884.
BIOLOGICAL SIGNIFICANCE: Biological fluids of the Acanthoscurria gomesiana spider are sources of active molecules, as is the case of antimicrobial peptides and acylpolyamines found in the hemolymph. The venom is also a potential source of toxins with pharmacological and biotechnological applications. However, to our knowledge no A. gomesiana venom toxin structure has been determined to date. Using a combination of high resolution mass spectrometry, transcriptomics and bioinformatics, we employed a workflow to fully sequence mature peptides and determine their number of disulfide bonds, and we found new potential antimicrobial peptides. This workflow is suitable for complete peptide toxin sequencing when handling limited amounts of venom and can accelerate the discovery of peptides with potential biotechnological applications.
Affiliation(s)
- Thiago F Abreu
- Departamento de Bioquímica, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, SP, Brazil
- Bianca N Sumitomo
- Departamento de Bioquímica, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, SP, Brazil
- Milton Y Nishiyama
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- Ursula C Oliveira
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- Gustavo H M F Souza
- Mass Spectrometry Applications Research & Development Laboratory, Waters Corporation, São Paulo, SP, Brazil
- Eduardo S Kitano
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- André Zelanis
- Departamento de Ciência e Tecnologia, Universidade Federal de São Paulo, ICT-UNIFESP, São José dos Campos, SP, Brazil
- Solange M T Serrano
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- Inácio Junqueira-de-Azevedo
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- Pedro I Silva
- Laboratório Especial de Toxinologia Aplicada, Center of Toxins, Immune-Response and Cell Signaling, Instituto Butantan, São Paulo, SP, Brazil
- Alexandre K Tashima
- Departamento de Bioquímica, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, SP, Brazil
|
39
|
Valles-Colomer M, Darzi Y, Vieira-Silva S, Falony G, Raes J, Joossens M. Meta-omics in Inflammatory Bowel Disease Research: Applications, Challenges, and Guidelines. J Crohns Colitis 2016; 10:735-46. [PMID: 26802086 DOI: 10.1093/ecco-jcc/jjw024] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/06/2015] [Accepted: 01/15/2016] [Indexed: 12/13/2022]
Abstract
Meta-omics [metagenomics, metatranscriptomics, and metaproteomics] are rapidly expanding our knowledge of the gut microbiota in health and disease. These technologies are increasingly used in inflammatory bowel disease [IBD] research. Yet, meta-omics data analysis, interpretation, and among-study comparison remain challenging. In this review we discuss the role these techniques are playing in IBD research, highlighting their strengths and limitations. We give guidelines on proper sample collection and preparation methods, and on performing the analyses and interpreting the results, reporting available user-friendly tools and pipelines.
Affiliation(s)
- Mireia Valles-Colomer
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium
- Youssef Darzi
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium; Microbiology Unit, Faculty of Sciences and Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Sara Vieira-Silva
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium
- Gwen Falony
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium
- Jeroen Raes
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium
- Marie Joossens
- KU Leuven, Department of Microbiology and Immunology, Rega Institute, Leuven, Belgium; VIB, Center for the Biology of Disease, Leuven, Belgium; Microbiology Unit, Faculty of Sciences and Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
|
40
|
Carregari VC, Dai J, Verano-Braga T, Rocha T, Ponce-Soto LA, Marangoni S, Roepstorff P. Revealing the functional structure of a new PLA2 K49 from Bothriopsis taeniata snake venom employing automatic “de novo” sequencing using CID/HCD/ETD MS/MS analyses. J Proteomics 2016; 131:131-139. [DOI: 10.1016/j.jprot.2015.10.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/09/2015] [Revised: 10/14/2015] [Accepted: 10/15/2015] [Indexed: 11/24/2022]
|
41
|
Vyatkina K, Wu S, Dekker LJM, VanDuijn MM, Liu X, Tolić N, Dvorkin M, Alexandrova S, Luider TM, Paša-Tolić L, Pevzner PA. De Novo Sequencing of Peptides from Top-Down Tandem Mass Spectra. J Proteome Res 2015; 14:4450-62. [DOI: 10.1021/pr501244v] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 11/30/2022]
Affiliation(s)
- Kira Vyatkina
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, Saint Petersburg 194021, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, 7-9 Universitetskaya nab., Saint Petersburg 199034, Russia
- Si Wu
- Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, Oklahoma 73019, United States
- Lennard J. M. Dekker
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Martijn M. VanDuijn
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Xiaowen Liu
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 535 West Michigan Street, IT 475, Indianapolis, Indiana 46202, United States
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, Indiana 46202, United States
- Nikola Tolić
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Mikhail Dvorkin
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, Saint Petersburg 194021, Russia
- Sonya Alexandrova
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, Saint Petersburg 194021, Russia
- Theo M. Luider
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Ljiljana Paša-Tolić
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Pavel A. Pevzner
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, 7-9 Universitetskaya nab., Saint Petersburg 199034, Russia
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
|
42
|
Altmeyer MO, Manz A, Neužil P. Microfluidic Superheating for Peptide Sequence Elucidation. Anal Chem 2015; 87:5997-6003. [PMID: 26035024 DOI: 10.1021/acs.analchem.5b00189] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
Abstract
Herein, we introduce microfluidic superheating as a new method for peptide fragmentation prior to mass spectrometric analysis. The superheating conditions were found to be stable up to 240 °C for more than 30 min without elevated pressure or boiling of the aqueous sample. As proof of principle, we exposed the peptides ACTH1-10 and OVA257-264 to various superheating conditions, causing different degrees of decomposition. Optimized superheating conditions resulted in the entire peptide ladder sequence of the y-ions, allowing the amino acid sequence to be deduced from a single-stage mass spectrum. Thus, obtaining information in the same quality as from tandem mass spectrometry can be achieved by a single superheating step.
Affiliation(s)
- Matthias O Altmeyer
- KIST Europe, Microfluidics, 66123 Saarbrücken, Germany; Twente University, MESA+, Institute for Nanotechnology, 7500 AE Enschede, Netherlands
- Andreas Manz
- KIST Europe, Microfluidics, 66123 Saarbrücken, Germany
- Pavel Neužil
- KIST Europe, Microfluidics, 66123 Saarbrücken, Germany; Central European Institute of Technology, Brno University of Technology, CZ-616 00 Brno, Czech Republic
|
43
|
Guthals A, Boucher C, Bandeira N. The generating function approach for Peptide identification in spectral networks. J Comput Biol 2015; 22:353-66. [PMID: 25423621 PMCID: PMC4425220 DOI: 10.1089/cmb.2014.0165] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
Abstract
Tandem mass spectrometry (MS/MS) has become the method of choice for protein identification and has launched a quest for the identification of every translated protein and peptide. However, computational developments have lagged behind the pace of modern data acquisition protocols and have become a major bottleneck in proteomics analysis of complex samples. As it stands today, attempts to identify MS/MS spectra against large databases (e.g., the human microbiome or 6-frame translation of the human genome) face a search space that is 10-100 times larger than the human proteome, where it becomes increasingly challenging to distinguish true from false peptide matches. As a result, the sensitivity of current state-of-the-art database search methods drops by nearly 38%, to such low identification rates that almost 90% of all MS/MS spectra are left unidentified. We address this problem by extending the generating function approach to rigorously compute the joint spectral probability of multiple spectra being matched to peptides with overlapping sequences, thus enabling the confident assignment of higher significance to overlapping peptide-spectrum matches (PSMs). We find that these joint spectral probabilities can be several orders of magnitude more significant than individual PSMs, even in the ideal case when perfect separation between signal and noise peaks could be achieved per individual MS/MS spectrum. After benchmarking this approach on a typical lysate MS/MS dataset, we show that the proposed intersecting spectral probabilities for spectra from overlapping peptides improve peptide identification by 30-62%.
Affiliation(s)
- Adrian Guthals
- Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
- Christina Boucher
- Department of Computer Science, Colorado State University, Fort Collins, Colorado
- Nuno Bandeira
- Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California–San Diego, La Jolla, California
|
44
|
Guimarães LC, de Oliveira CFR, Marangoni S, de Oliveira DGL, Macedo MLR. Purification and characterization of a Kunitz inhibitor from Poincianella pyramidalis with insecticide activity against the Mediterranean flour moth. Pesticide Biochemistry and Physiology 2015; 118:1-9. [PMID: 25752423 DOI: 10.1016/j.pestbp.2014.12.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 07/19/2014] [Revised: 12/02/2014] [Accepted: 12/03/2014] [Indexed: 05/13/2023]
Abstract
This paper describes the characterization of a trypsin inhibitor from Poincianella pyramidalis seeds (PpyTI). Partial sequencing of PpyTI revealed homology to Kunitz inhibitors, placing it as a member of Family I03 in the MEROPS database. PpyTI has a single polypeptide chain of 19,042 Da and is stable at high temperatures (up to 70 °C) and over a wide range of pH. In vitro assays showed that disulfide bridges play an important role in stabilizing the reactive site of PpyTI, a characteristic shared among several Kunitz inhibitors. Bioassays carried out with the Mediterranean flour moth (Anagasta kuehniella) revealed a significant decrease in both larval weight and survival of PpyTI-fed larvae, as well as an extended larval stage. Through biochemical analysis, we demonstrated that the insecticidal effects of PpyTI were triggered by impairment of the digestive process, through the inhibition of trypsin and chymotrypsin activities, the major digestive enzymes in this species. The insecticidal effects and biochemical characterization of PpyTI encourage further studies using this inhibitor for insect pest control.
Affiliation(s)
- Lays Cordeiro Guimarães
- Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas, Campinas, SP 13083-970, Brazil; Department of Food Technology and Public Health, Center for Biological and Health Sciences, University of Mato Grosso do Sul, Campo Grande, MS 79070-900, Brazil
- Caio Fernando Ramalho de Oliveira
- Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas, Campinas, SP 13083-970, Brazil; Department of Food Technology and Public Health, Center for Biological and Health Sciences, University of Mato Grosso do Sul, Campo Grande, MS 79070-900, Brazil
- Sergio Marangoni
- Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas, Campinas, SP 13083-970, Brazil
- Daniella Gorete Lourenço de Oliveira
- Department of Food Technology and Public Health, Center for Biological and Health Sciences, University of Mato Grosso do Sul, Campo Grande, MS 79070-900, Brazil
- Maria Lígia Rodrigues Macedo
- Department of Food Technology and Public Health, Center for Biological and Health Sciences, University of Mato Grosso do Sul, Campo Grande, MS 79070-900, Brazil
|
45
|
Liu X, Dekker LJM, Wu S, Vanduijn MM, Luider TM, Tolić N, Kou Q, Dvorkin M, Alexandrova S, Vyatkina K, Paša-Tolić L, Pevzner PA. De Novo Protein Sequencing by Combining Top-Down and Bottom-Up Tandem Mass Spectra. J Proteome Res 2014; 13:3241-8. [DOI: 10.1021/pr401300m] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 01/05/2023]
Affiliation(s)
- Xiaowen Liu
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 535 West Michigan Street, IT 475, Indianapolis, Indiana 46202, United States
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, Indiana 46202, United States
- Lennard J. M. Dekker
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Si Wu
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Martijn M. Vanduijn
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Theo M. Luider
- Department of Neurology, Erasmus University Medical Center, Postbus 2040, 3000 CA Rotterdam, The Netherlands
- Nikola Tolić
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Qiang Kou
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, 535 West Michigan Street, IT 475, Indianapolis, Indiana 46202, United States
- Mikhail Dvorkin
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, St. Petersburg 194021, Russia
- Sonya Alexandrova
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, St. Petersburg 194021, Russia
- Kira Vyatkina
- Algorithmic Biology Laboratory, Saint Petersburg Academic University, 8/3 Khlopina Str, St. Petersburg 194021, Russia
- Ljiljana Paša-Tolić
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California, 9500 Gilman Drive, San Diego, California 92093, United States
|
46
|
Meyer JG, Kim S, Maltby DA, Ghassemian M, Bandeira N, Komives EA. Expanding proteome coverage with orthogonal-specificity α-lytic proteases. Mol Cell Proteomics 2014; 13:823-35. [PMID: 24425750 DOI: 10.1074/mcp.m113.034710] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 12/19/2022]
Abstract
Bottom-up proteomics studies traditionally involve proteome digestion with a single protease, trypsin. However, trypsin alone does not generate peptides that encompass the entire proteome. Alternative proteases have been explored, but most have specificity for charged amino acid side chains. Therefore, additional proteases that improve proteome coverage through cleavage at sequences complementary to trypsin's may increase proteome coverage. We demonstrate the novel application of two proteases for bottom-up proteomics: wild type α-lytic protease (WaLP) and an active site mutant of WaLP, M190A α-lytic protease (MaLP). We assess several relevant factors, including MS/MS fragmentation, peptide length, peptide yield, and protease specificity. When data from separate digestions with trypsin, LysC, WaLP, and MaLP were combined, proteome coverage was increased by 101% relative to that achieved with trypsin digestion alone. To demonstrate how the gained sequence coverage can yield additional post-translational modification information, we show the identification of a number of novel phosphorylation sites in the Schizosaccharomyces pombe proteome and include an illustrative example from the protein MPD2 wherein two novel sites are identified, one in a tryptic peptide too short to identify and the other in a sequence devoid of tryptic sites. The specificity of WaLP and MaLP for aliphatic amino acid side chains was particularly valuable for coverage of membrane protein sequences, which increased 350% when the data from trypsin, LysC, WaLP, and MaLP were combined.
Affiliation(s)
- Jesse G Meyer
- Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Dr., La Jolla, California 92093-0378
|
47
|
Richards AL, Vincent CE, Guthals A, Rose CM, Westphall MS, Bandeira N, Coon JJ. Neutron-encoded signatures enable product ion annotation from tandem mass spectra. Mol Cell Proteomics 2013; 12:3812-23. [PMID: 24043425 DOI: 10.1074/mcp.m113.028951] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/20/2023]
Abstract
We report the use of neutron-encoded (NeuCode) stable isotope labeling of amino acids in cell culture for the purpose of C-terminal product ion annotation. Two NeuCode labeling isotopologues of lysine, ¹³C₆¹⁵N₂ and ²H₈, which differ by 36 mDa, were metabolically embedded in a sample proteome, and the resultant labeled proteins were combined, digested, and analyzed via liquid chromatography and mass spectrometry. With MS/MS scan resolving powers of ~50,000 or higher, product ions containing the C terminus (i.e. lysine) appear as a doublet spaced by exactly 36 mDa, whereas N-terminal fragments exist as a single m/z peak. Through theory and experiment, we demonstrate that over 90% of all y-type product ions have detectable doublets. We report on an algorithm that can extract these neutron signatures with high sensitivity and specificity. In other words, of 15,503 y-type product ion peaks, the y-type ion identification algorithm correctly identified 14,552 (93.2%) based on detection of the NeuCode doublet; 6.8% were misclassified (i.e. other ion types that were assigned as y-type products). Searching NeuCode labeled yeast with PepNovo+ resulted in a 34% increase in correct de novo identifications relative to searching through MS/MS only. We use this tool to simplify spectra prior to database searching, to sort unmatched tandem mass spectra for spectral richness, for correlation of co-fragmented ions to their parent precursor, and for de novo sequence identification.
Affiliation(s)
- Alicia L Richards
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706
|