1
Hjernø K, Højrup P. Interpretation of Tandem Mass Spectrometry (MS-MS) Spectra for Peptide Analysis. Methods Mol Biol 2024; 2821:91-110. [PMID: 38997483 DOI: 10.1007/978-1-0716-3914-6_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/14/2024]
Abstract
The aim of this chapter is to give a short introduction to peptide analysis by mass spectrometry (MS) and interpretation of fragment mass spectra. Through examples and guidelines, we will demonstrate how to understand and validate search results and how to perform de novo sequencing based on the often very complex fragmentation pattern obtained by tandem mass spectrometry (also referred to as MSMS). The focus will be on simple rules for interpretation of MSMS spectra of tryptic as well as non-tryptic peptides.
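As an illustration of the kind of rule the chapter refers to, the sketch below computes the singly charged b- and y-ion ladders for an unmodified peptide. It is a minimal example and not taken from the chapter: the function name `fragment_ladder` is hypothetical, and the monoisotopic residue masses are standard reference values rather than figures from the source.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
# These are standard reference values, not taken from the cited chapter.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # H+


def fragment_ladder(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] is b(i+1): the first i+1 residues plus a proton.
    y_ions[i] is y(i+1): the last i+1 residues plus water plus a proton.
    """
    masses = [MONO[aa] for aa in peptide]

    b_ions, running = [], 0.0
    for m in masses[:-1]:              # b1 .. b(n-1)
        running += m
        b_ions.append(running + PROTON)

    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):     # y1 .. y(n-1)
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ladder("QHS")
    print("b ions:", [round(v, 4) for v in b])
    print("y ions:", [round(v, 4) for v in y])
```

A useful sanity check when validating an assignment is complementarity: for a singly charged pair, b(i) + y(n-i) equals the neutral peptide mass plus two protons.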
Affiliation(s)
- Karin Hjernø
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Peter Højrup
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
2
Liu K, Ye Y, Li S, Tang H. Accurate de novo peptide sequencing using fully convolutional neural networks. Nat Commun 2023; 14:7974. [PMID: 38042873 PMCID: PMC10693636 DOI: 10.1038/s41467-023-43010-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides us with a way to identify novel peptides from tandem mass spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network for high accuracy de novo peptide sequencing. PepNet takes an MS/MS spectrum (represented as a high-dimensional vector) as input, and outputs the optimal peptide sequence along with its confidence score. The PepNet model is trained using a total of 3 million high-energy collisional dissociation MS/MS spectra from multiple human peptide spectral libraries. Evaluation results show that PepNet significantly outperforms current best-performing de novo sequencing algorithms (e.g. PointNovo and DeepNovo) in both peptide-level accuracy and positional-level accuracy. PepNet can sequence a large fraction of spectra that were not identified by database search engines, and thus could be used as a complementary tool to database search engines for peptide identification in proteomics. In addition, PepNet runs around 3x and 7x faster than PointNovo and DeepNovo on GPUs, respectively, thus being more suitable for the analysis of large-scale proteomics data.
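To make "an MS/MS spectrum represented as a high-dimensional vector" concrete, the sketch below bins a peak list into a fixed-length intensity vector. This is only one plausible encoding under stated assumptions: the abstract does not specify PepNet's actual input representation, and the function name `bin_spectrum`, the 0.1 m/z bin width, and the 2000 m/z upper limit are all illustrative choices.

```python
def bin_spectrum(peaks, max_mz=2000.0, bin_width=0.1):
    """Convert (m/z, intensity) pairs into a fixed-length vector.

    Assumed encoding (not PepNet's documented one): each bin holds the
    most intense peak falling inside it, and the vector is scaled so
    the base peak equals 1.0. Peaks at or above max_mz are dropped.
    """
    n_bins = int(round(max_mz / bin_width))
    vector = [0.0] * n_bins

    for mz, intensity in peaks:
        index = int(mz / bin_width)
        if 0 <= index < n_bins:
            vector[index] = max(vector[index], intensity)

    base_peak = max(vector)
    if base_peak > 0:
        vector = [v / base_peak for v in vector]
    return vector


if __name__ == "__main__":
    example_peaks = [(100.05, 50.0), (100.07, 200.0), (500.33, 100.0), (2500.0, 10.0)]
    vec = bin_spectrum(example_peaks)
    print(len(vec), sum(1 for v in vec if v > 0))
```

A fixed-length vector of this kind is what lets a convolutional network consume spectra with different numbers of peaks; narrower bins preserve more mass accuracy at the cost of a longer, sparser input.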
Affiliation(s)
- Kaiyuan Liu
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
- Yuzhen Ye
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
- Sujun Li
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
  - Dengding BioAI Co., Ltd., Bloomington, USA
- Haixu Tang
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
3
Fan KT, Hsu CW, Chen YR. Mass spectrometry in the discovery of peptides involved in intercellular communication: From targeted to untargeted peptidomics approaches. Mass Spectrometry Reviews 2023; 42:2404-2425. [PMID: 35765846 DOI: 10.1002/mas.21789] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Revised: 03/17/2022] [Accepted: 04/08/2022] [Indexed: 06/15/2023]
Abstract
Endogenous peptide hormones represent an essential class of biomolecules, which regulate cell-cell communications in diverse physiological processes of organisms. Mass spectrometry (MS) has been developed to be a powerful technology for identifying and quantifying peptides in a highly efficient manner. However, it is difficult to directly identify these peptide hormones due to their diverse characteristics, dynamic regulations, low abundance, and existence in a complicated biological matrix. Here, we summarize and discuss the roles of targeted and untargeted MS in discovering peptide hormones using bioassay-guided purification, bioinformatics screening, or the peptidomics-based approach. Although the peptidomics approach is expected to discover novel peptide hormones unbiasedly, only a limited number of successful cases have been reported. The critical challenges and corresponding measures for peptidomics from the steps of sample preparation, peptide extraction, and separation to the MS data acquisition and analysis are also discussed. We also identify emerging technologies and methods that can be integrated into the discovery platform toward the comprehensive study of endogenous peptide hormones.
Affiliation(s)
- Kai-Ting Fan
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
- Chia-Wei Hsu
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
- Yet-Ran Chen
  - Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
4
Phetsanthad A, Vu NQ, Yu Q, Buchberger AR, Chen Z, Keller C, Li L. Recent advances in mass spectrometry analysis of neuropeptides. Mass Spectrometry Reviews 2023; 42:706-750. [PMID: 34558119 PMCID: PMC9067165 DOI: 10.1002/mas.21734] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/09/2021] [Revised: 08/22/2021] [Accepted: 08/28/2021] [Indexed: 05/08/2023]
Abstract
Due to their involvement in numerous biochemical pathways, neuropeptides have been the focus of many recent research studies. Unfortunately, classic analytical methods, such as western blots and enzyme-linked immunosorbent assays, are extremely limited in terms of global investigations, leading researchers to search for more advanced techniques capable of probing the entire neuropeptidome of an organism. With recent technological advances, mass spectrometry (MS) has provided methodology to gain global knowledge of a neuropeptidome on a spatial, temporal, and quantitative level. This review will cover key considerations for the analysis of neuropeptides by MS, including sample preparation strategies, instrumental advances for identification, structural characterization, and imaging; insightful functional studies; and newly developed absolute and relative quantitation strategies. While many discoveries have been made with MS, the methodology is still in its infancy. Many of the current challenges and areas that need development will also be highlighted in this review.
Affiliation(s)
- Ashley Phetsanthad
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
- Nhu Q. Vu
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
- Qing Yu
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, WI 53705, USA
- Amanda R. Buchberger
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
- Zhengwei Chen
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
- Caitlin Keller
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
- Lingjun Li
  - Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, WI 53706, USA
  - School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, WI 53705, USA
5
Functional Peptides from One-bead One-compound High-throughput Screening Technique. Chem Res Chin Univ 2023. [DOI: 10.1007/s40242-023-2356-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
6
Svetličić E, Dončević L, Ozdanovac L, Janeš A, Tustonić T, Štajduhar A, Brkić AL, Čeprnja M, Cindrić M. Direct Identification of Urinary Tract Pathogens by MALDI-TOF/TOF Analysis and De Novo Peptide Sequencing. Molecules 2022; 27:molecules27175461. [PMID: 36080229 PMCID: PMC9457756 DOI: 10.3390/molecules27175461] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/31/2022] [Revised: 08/19/2022] [Accepted: 08/19/2022] [Indexed: 11/16/2022]
Abstract
For mass spectrometry-based diagnostics of microorganisms, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used to identify urinary tract pathogens. However, it requires a lengthy culture step for accurate pathogen identification, and is limited by a relatively small number of available species in peptide spectral libraries (≤3329). Here, we propose a method for pathogen identification that overcomes the above limitations, and utilizes the MALDI-TOF/TOF MS instrument. Tandem mass spectra of the analyzed peptides were obtained by chemically activated fragmentation, which allowed mass spectrometry analysis in negative and positive ion modes. Peptide sequences were elucidated de novo, and aligned with the non-redundant National Center for Biotechnology Information Reference Sequence Database (NCBInr). For data analysis, we developed a custom program package that predicted peptide sequences from the negative and positive MS/MS spectra. The main advantage of this method over a conventional MALDI-TOF MS peptide analysis is identification in less than 24 h without a cultivation step. Compared to the limited identification with peptide spectra libraries, the NCBI database derived from genome sequencing currently contains 20,917 bacterial species, and is constantly expanding. This paper presents an accurate method that is used to identify pathogens grown on agar plates, and those isolated directly from urine samples, with high accuracy.
Affiliation(s)
- Ema Svetličić
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
- Lucija Dončević
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
- Luka Ozdanovac
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
- Andrea Janeš
  - Clinical Department of Laboratory Diagnostics, University Hospital Dubrava, Avenija Gojka Šuška 6, 10000 Zagreb, Croatia
- Andrija Štajduhar
  - Division for Medical Statistics, Andrija Štampar Teaching Institute of Public Health, Mirogojska cesta 16, 10000 Zagreb, Croatia
- Marina Čeprnja
  - Special Hospital Agram, Agram EEIG, Trnjanska cesta 108, 10000 Zagreb, Croatia
- Mario Cindrić
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
  - Correspondence: Tel.: +385-16384422
7
Abstract
Bioactive peptides with high potency against numerous human disorders have been regarded as a promising therapy in disease control. These peptides can be released from various dietary protein sources through hydrolysis using physical conditions, chemical agents, microbial fermentation, or enzymatic digestion. Given the diversity of the original proteins and the structural complexity of the peptides present in the hydrolysis mixture, screening for bioactive peptides is a challenging task. Well-organized and well-designed methods are required to study potential peptides efficiently. This article therefore provides an overview of bioactive peptides, with an emphasis on current screening and characterization strategies. Moreover, the biological activities of peptides, their inhibition mechanisms, and peptide–enzyme complex interactions are commonly evaluated using specific in vitro assays and molecular docking analysis.
8
Fernandez B, Armengaud J, Subra G, Enjalbal C. MALDI‐MS/MS of N‐Terminal TMPP‐Acyl Peptides: A Worthwhile Tool to Decipher Protein N‐Termini. European J Org Chem 2022. [DOI: 10.1002/ejoc.202101549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Bernard Fernandez
  - IBMM Université de Montpellier, CNRS, ENSCM 34293 Montpellier France
  - Université Paris-Saclay, CEA, INRAE Département Médicaments et Technologies pour la Santé (DMTS) SPI 30200 Bagnols-sur-Cèze France
  - Present address: CIRAD, UMR ASTRE 34398 Montpellier France
- Jean Armengaud
  - Université Paris-Saclay, CEA, INRAE Département Médicaments et Technologies pour la Santé (DMTS) SPI 30200 Bagnols-sur-Cèze France
- Gilles Subra
  - IBMM Université de Montpellier, CNRS, ENSCM 34293 Montpellier France
9
Abstract
Accurate full-length sequencing of a purified unknown protein remains challenging because mass spectrometry (MS)-based methods are error-prone. De novo identified peptide sequences often contain errors, which undermines the accuracy of assembly. Bias in peptide detectability also creates low-coverage regions, resulting in gaps. Although recent advances in multi-enzyme hydrolysis and algorithms have achieved complete assembly of full-length protein sequences in a few examples, robustness in practical applications still needs to improve. Here, inspired by genome assembly strategies, we demonstrate a contig-scaffolding strategy to assemble protein sequences with high robustness and accuracy. This strategy integrates multiple unspecific hydrolysis methods to minimize bias in the hydrolysis process. After de novo identification of the peptides, our assembly algorithm, named Multiple Contigs & Scaffolding (MuCS), assembles the peptide sequences in a multistep (contig, then scaffold) manner, with error correction in each step. MS data from different hydrolysis experiments complement each other for robust contig extension and error correction. Applied to three proteins with three replicates, our strategy reached 100% coverage in all cases (except one with 98.85%) and 98.69-100% accuracy. It can also efficiently handle a membrane protein, although the transmembrane region was missing due to the limitations of MS; the three replicates reached 88.85-92.57% coverage and 97.57-100% accuracy. In sum, we provide a practical, robust, and accurate solution for full-length protein sequencing. The MuCS software is available at http://chi-biotech.com/mucs/.
Affiliation(s)
- Zhi-Biao Mai
  - Big Data Decision Institute, Jinan University, Guangzhou 510632, China
  - Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Zhong-Hua Zhou
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Qing-Yu He
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Gong Zhang
  - Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
10
Characteristics of Food Protein-Derived Antidiabetic Bioactive Peptides: A Literature Update. Int J Mol Sci 2021; 22:ijms22179508. [PMID: 34502417 PMCID: PMC8431147 DOI: 10.3390/ijms22179508] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/06/2021] [Revised: 08/29/2021] [Accepted: 08/30/2021] [Indexed: 12/25/2022]
Abstract
Diabetes, a disorder of glucose metabolism, is considered one of the biggest health challenges of the modern lifestyle and is associated with complex complications. Inhibiting or reducing the activity or expression of dipeptidyl peptidase IV (DPP-IV), alpha-glucosidase, and protein-tyrosine phosphatase 1B (PTP-1B) is considered a promising therapeutic strategy for the management of type 2 diabetes (T2D). Various food protein-derived antidiabetic bioactive peptides have been isolated and verified. This review provides an overview of DPP-IV, PTP-1B, and α-glucosidase inhibitors, and updates on the methods for discovering DPP-IV inhibitory peptides released from food-protein hydrolysates. Finding novel bioactive peptides involves separation and fractionation strategies, the identification of peptide sequences, and the evaluation of peptide characteristics in vitro, in silico, in situ, and in vivo. The potential of bioactive peptides suggests useful applications in the prevention and management of diabetes. Furthermore, evidence from clinical studies is necessary to validate the efficacy of these peptides before commercial application.
11
Sengupta A, Naresh G, Mishra A, Parashar D, Narad P. Proteome analysis using machine learning approaches and its applications to diseases. Advances in Protein Chemistry and Structural Biology 2021; 127:161-216. [PMID: 34340767 DOI: 10.1016/bs.apcsb.2021.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
With the tremendous developments in the fields of biological and medical technologies, huge amounts of data are generated in the form of genomic data, images in medical databases, data on protein sequences, and so on. Analyzing these data with different tools sheds light on the particulars of disease and our body's reactions to it, thus aiding our understanding of human health. Among the most useful of these tools are artificial intelligence and deep learning (DL). The artificial neural networks in DL algorithms help extract viable data from datasets and recognize patterns in these complex datasets. Therefore, as a part of machine learning, DL helps address the various challenges that arise during protein prediction, protein identification, and protein quantification. Proteomics is the study of such proteins, their structures, features, properties, and so on. As a form of data science, proteomics has helped advance the field of genomics technologies. One of the major techniques used in proteomics studies is mass spectrometry (MS). However, MS can handle large datasets efficiently only with the help of informatics approaches for data analysis and interpretation; these mainly include machine learning and deep learning algorithms. In this chapter, we discuss in detail the applications of deep learning and various machine learning algorithms in proteomics.
Affiliation(s)
- Abhishek Sengupta
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- G Naresh
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Astha Mishra
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Diksha Parashar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
- Priyanka Narad
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India
12
Moyer TB, Parsley NC, Sadecki PW, Schug WJ, Hicks LM. Leveraging orthogonal mass spectrometry based strategies for comprehensive sequencing and characterization of ribosomal antimicrobial peptide natural products. Nat Prod Rep 2021; 38:489-509. [PMID: 32929442 PMCID: PMC7956910 DOI: 10.1039/d0np00046a] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Abstract
Covering: up to July 2020. Ribosomal antimicrobial peptide (AMP) natural products, also known as ribosomally synthesized and post-translationally modified peptides (RiPPs) or host defense peptides, demonstrate potent bioactivities and impressive complexity that complicate molecular and biological characterization. Tandem mass spectrometry (MS) has rapidly accelerated bioactive peptide sequencing efforts, yet standard workflows insufficiently address intrinsic AMP diversity. Herein, orthogonal approaches to accelerate comprehensive and accurate molecular characterization without the need for prior isolation are reviewed. Chemical derivatization, proteolysis (enzymatic and chemical cleavage), multistage MS fragmentation, and separation (liquid chromatography and ion mobility) strategies can provide complementary amino acid composition and post-translational modification data to constrain sequence solutions. Examination of two complex case studies, gomesin and styelin D, highlights the practical implementation of the proposed approaches. Finally, we emphasize the importance of heterogeneous AMP peptidoforms that confer varying biological function, an area that warrants significant further development.
Affiliation(s)
- Tessa B Moyer
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
13
Fabre B, Combier JP, Plaza S. Recent advances in mass spectrometry-based peptidomics workflows to identify short-open-reading-frame-encoded peptides and explore their functions. Curr Opin Chem Biol 2021; 60:122-130. [PMID: 33401134 DOI: 10.1016/j.cbpa.2020.12.002] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 10/16/2020] [Revised: 11/26/2020] [Accepted: 12/03/2020] [Indexed: 12/12/2022]
Abstract
Short open reading frame (sORF)-encoded polypeptides (SEPs) have recently emerged as key regulators of major cellular processes. Computational methods for the annotation of sORFs, combined with transcriptomics and ribosome profiling approaches, have predicted the existence of tens of thousands of SEPs across the kingdoms of life. However, we still lack unambiguous evidence for most of them. The method of choice to validate the expression of SEPs is mass spectrometry (MS)-based peptidomics. Peptides are less abundant than proteins, which tends to hinder their detection. Therefore, optimization and enrichment methods are necessary to validate the existence of SEPs. In this article, we discuss the challenges for the detection of SEPs by MS and recent developments of biochemical approaches applied to the study of these peptides. We detail the advances made in the different key steps of a typical peptidomics workflow and highlight possible alternatives that have not been explored yet.
Affiliation(s)
- Bertrand Fabre
  - Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, CNRS, 31320, Auzeville-Tolosane, France
- Jean-Philippe Combier
  - Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, CNRS, 31320, Auzeville-Tolosane, France
- Serge Plaza
  - Laboratoire de Recherche en Sciences Végétales, UMR5546, Université de Toulouse, UPS, CNRS, 31320, Auzeville-Tolosane, France
14
Yang C, Shan YC, Zhang WJ, Dai ZP, Zhang LH, Zhang YK. Full-length Protein Sequencing Based on Continuous Digestion Using Non-specific Proteases. Acta Chimica Sinica 2021. [DOI: 10.6023/a21010025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/25/2022]
15
Abd El-Aziz TM, Soares AG, Stockand JD. Advances in venomics: Modern separation techniques and mass spectrometry. J Chromatogr B Analyt Technol Biomed Life Sci 2020; 1160:122352. [PMID: 32971366 PMCID: PMC8174749 DOI: 10.1016/j.jchromb.2020.122352] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/15/2020] [Revised: 08/25/2020] [Accepted: 08/26/2020] [Indexed: 12/31/2022]
Abstract
Snake venoms are complex chemical mixtures of biologically active proteins and non-protein components. Toxins have a wide range of targets and effects, including ion channels and membrane receptors, and platelet aggregation and platelet plug formation. Toxins act on these targets with high affinity and selectivity. From a pharmacological perspective, snake venom compounds are a valuable resource for drug discovery and development. However, a major challenge to drug discovery using snake venoms is isolating and analyzing the bioactive proteins and peptides in these complex mixtures. Getting molecular information from complex mixtures such as snake venoms requires proteomic analyses, generally combined with transcriptomic analyses of venom glands. The present review summarizes current knowledge and highlights important recent advances in venomics, with special emphasis on contemporary separation techniques and bioinformatics that have begun to elaborate the complexity of snake venoms. Several analytical techniques, such as two-dimensional gel electrophoresis, RP-HPLC, size exclusion chromatography, ion exchange chromatography, MALDI-TOF-MS, and LC-ESI-QTOF-MS, have been employed in this regard. The improvement of separation approaches, such as multidimensional HPLC and 2D electrophoresis coupled to soft-ionization (MALDI and ESI) mass spectrometry, has been critical to obtaining an accurate picture of the startling complexity of venoms. In the case of bioinformatics, a variety of software tools such as PEAKS have also been used successfully. Information gleaned from venomics is important for both predicting and resolving the biological activity of the active components of venoms, which in turn is key for the development of new drugs based on these venom components.
Affiliation(s)
- Tarek Mohamed Abd El-Aziz
  - Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229-3900, USA; Zoology Department, Faculty of Science, Minia University, El-Minia 61519, Egypt
- Antonio G Soares
  - Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229-3900, USA
- James D Stockand
  - Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229-3900, USA
16
Takan S, Allmer J. DNMSO; an ontology for representing de novo sequencing results from Tandem-MS data. PeerJ 2020; 8:e10216. [PMID: 33150092 PMCID: PMC7585381 DOI: 10.7717/peerj.10216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2020] [Accepted: 09/28/2020] [Indexed: 11/20/2022]
Abstract
For the identification and sequencing of proteins, mass spectrometry (MS) has become the tool of choice and, as such, drives proteomics. MS/MS spectra need to be assigned a peptide sequence, for which two strategies exist: either database search or de novo sequencing can be employed to establish peptide spectrum matches. For database search, mzIdentML is the current community standard for data representation. There is no community standard for representing de novo sequencing results, but we previously proposed the de novo markup language (DNML). At the moment, each de novo sequencing solution uses a different data representation, complicating downstream data integration, which is crucial since ensemble predictions may be more useful than predictions of a single tool. We here propose the de novo MS Ontology (DNMSO), which can, for example, provide many-to-many mappings between spectra and peptide predictions. Additionally, we have implemented an application programming interface (API) that supports any file operation necessary for de novo sequencing, from spectra input to reading, writing, and creating the DNMSO format, as well as conversion from many other file formats. This API removes all overhead from the production of de novo sequencing tools and allows developers to concentrate completely on algorithm development. We make the API and formal descriptions of the format freely available at https://github.com/savastakan/dnmso.
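The many-to-many relationship mentioned above (one spectrum may receive several candidate peptides, and one peptide may be predicted from several spectra) can be pictured with a small in-memory index. This is only an illustrative sketch: the class `PredictionIndex` and its methods are hypothetical and do not reflect the DNMSO schema or its API.

```python
from collections import defaultdict


class PredictionIndex:
    """Hypothetical two-way index between spectrum IDs and peptide predictions."""

    def __init__(self):
        self._peptides_by_spectrum = defaultdict(set)
        self._spectra_by_peptide = defaultdict(set)

    def link(self, spectrum_id, peptide):
        """Record that `peptide` was predicted for `spectrum_id`."""
        self._peptides_by_spectrum[spectrum_id].add(peptide)
        self._spectra_by_peptide[peptide].add(spectrum_id)

    def peptides_for(self, spectrum_id):
        """All peptide predictions attached to one spectrum, sorted."""
        return sorted(self._peptides_by_spectrum.get(spectrum_id, ()))

    def spectra_for(self, peptide):
        """All spectra for which one peptide was predicted, sorted."""
        return sorted(self._spectra_by_peptide.get(peptide, ()))


if __name__ == "__main__":
    index = PredictionIndex()
    index.link("scan_101", "PEPTIDEK")   # e.g. from one de novo tool
    index.link("scan_101", "PEPTLDEK")   # e.g. from a second tool
    index.link("scan_205", "PEPTIDEK")   # same peptide, different spectrum
    print(index.peptides_for("scan_101"))
    print(index.spectra_for("PEPTIDEK"))
```

Keeping both directions of the mapping is what makes ensemble comparisons straightforward: predictions from several tools can be attached to the same spectrum and then compared or merged.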
Affiliation(s)
- Savaş Takan
  - Department of Computer Engineering, Faculty of Engineering, Izmir Institute of Technology, Izmir, Turkey
- Jens Allmer
  - Hochschule Ruhr West, University of Applied Sciences, Medical Informatics and Bioinformatics, Institute for Measurement Engineering and Sensor Technology, Mülheim an der Ruhr, Germany
17
Cautereels J, Van Hee N, Chatterjee S, Van Alsenoy C, Lemière F, Blockhuys F. QCMS2 as a new method for providing insight into peptide fragmentation: The influence of the side-chain and inter-side-chain interactions. Journal of Mass Spectrometry 2020; 55:e4446. [PMID: 31652378 DOI: 10.1002/jms.4446] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/27/2019] [Revised: 09/12/2019] [Accepted: 09/21/2019] [Indexed: 06/10/2023]
Abstract
The identification of peptides and proteins from tandem mass spectra is a difficult task and multiple tools have been developed to aid this identification. We present a new method called quantum chemical mass spectrometry for materials science (QCMS2), which is based on quantum chemical calculations of bond orders, reaction, and transition-state energies at the DFT/B3LYP/6-311+G* level of theory. The method was used to describe the fragmentation pathways of five X-His-Ser tripeptides with X = Asn, Asp, Glu, Ser, and Trp, thereby focusing on the influence of the side chain and inter-side-chain interactions on the fragmentation. The main features in the mass spectra of the five tripeptides were correctly reproduced, and a number of fragments were assigned to fragmentations involving the side chain and the influence of inter-side-chain interactions. Product ion spectra were recorded to evaluate the capabilities and limitations of QCMS2 and a number of conventional tools.
Affiliation(s)
- Julie Cautereels
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Nils Van Hee
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Sneha Chatterjee
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Filip Lemière
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Frank Blockhuys
  - Department of Chemistry, University of Antwerp, Antwerp, Belgium
18
|
Cautereels J, Giribaldi J, Enjalbal C, Blockhuys F. Quantum chemical mass spectrometry: Ab initio study of b2-ion formation mechanisms for the singly protonated Gln-His-Ser tripeptide. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2020; 34:e8778. [PMID: 32144813 DOI: 10.1002/rcm.8778] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/08/2019] [Revised: 02/28/2020] [Accepted: 03/05/2020] [Indexed: 06/10/2023]
Abstract
RATIONALE Both amide bond protonation triggering peptide fragmentations and the controversial b2-ion structures have been subjects of intense research. The involvement of histidine (H), with its imidazole side chain that induces specific dissociation patterns involving inter-side-chain (ISC) interactions, in b2-ion formation was investigated, focusing on the QHS model tripeptide. METHODS To identify the effect of histidine on fragmentations issued from ISC interactions, QHS was selected for a comprehensive analysis of the pathways leading to the three possible b2-ion structures, using quantum chemical calculations performed at the DFT/B3LYP/6-311+G* level of theory. Electrospray ionization ion trap mass spectrometry allowed the recording of MS2 and MS3 tandem mass spectra, whereas the Quantum Chemical Mass Spectrometry for Materials Science (QCMS2) method was used to predict fragmentation patterns. RESULTS Whereas it is very difficult to differentiate among protonated oxazolone, diketopiperazine, or lactam b2-ions using MS2 and MS3 mass spectra, the calculations indicated that the QH b2-ion (detected at m/z 266) is probably a mixture of the lactam and oxazolone structures formed after amide nitrogen protonation, making the formation of diketopiperazine less likely as it requires an additional step for its formation. CONCLUSIONS In contrast to glycine-histidine-containing b2-ions, known to be issued from the backbone-imidazole cyclization, we found that interactions between the side chains were not obvious to perceive, neither from a thermodynamics nor from a fragmentation perspective, emphasizing the importance of the whole sequence on the dissociation behavior usually demonstrated from simple glycine-containing tripeptides.
Affiliation(s)
- Julie Cautereels
- Department of Chemistry, University of Antwerp, Antwerp, Belgium
- Frank Blockhuys
- Department of Chemistry, University of Antwerp, Antwerp, Belgium
|
19
|
Ambrosino L, Colantuono C, Diretto G, Fiore A, Chiusano ML. Bioinformatics Resources for Plant Abiotic Stress Responses: State of the Art and Opportunities in the Fast Evolving -Omics Era. PLANTS 2020; 9:plants9050591. [PMID: 32384671 PMCID: PMC7285221 DOI: 10.3390/plants9050591] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/27/2020] [Revised: 04/24/2020] [Accepted: 04/29/2020] [Indexed: 12/13/2022]
Abstract
Abiotic stresses are among the principal limiting factors for productivity in agriculture. In the current era of continuous climate changes, the understanding of the molecular aspects involved in abiotic stress response in plants is a priority. The rise of -omics approaches provides key strategies to promote effective research in the field, facilitating the investigations from reference models to an increasing number of species, tolerant and sensitive genotypes. Integrated multilevel approaches, based on molecular investigations at genomics, transcriptomics, proteomics and metabolomics levels, are now feasible, expanding the opportunities to clarify key molecular aspects involved in responses to abiotic stresses. To this aim, bioinformatics has become fundamental for data production, mining and integration, and necessary for extracting valuable information and for comparative efforts, paving the way to the modeling of the involved processes. We provide here an overview of bioinformatics resources for research on plant abiotic stresses, describing collections from -omics efforts in the field, ranging from raw data to complete databases or platforms, highlighting opportunities and still open challenges in abiotic stress research based on -omics technologies.
Affiliation(s)
- Luca Ambrosino
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy; (L.A.); (C.C.)
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Chiara Colantuono
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy; (L.A.); (C.C.)
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Gianfranco Diretto
- Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), 00123 Rome, Italy; (G.D.); (A.F.)
- Alessia Fiore
- Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), 00123 Rome, Italy; (G.D.); (A.F.)
- Maria Luisa Chiusano
- Department of Agricultural Sciences, University of Naples Federico II, 80055 Portici (Na), Italy; (L.A.); (C.C.)
- Department of Research Infrastructures for Marine Biological Resources (RIMAR), 80121 Naples, Italy
- Correspondence: Tel.: +39-081-253-9492
|
20
|
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. MASS SPECTROMETRY REVIEWS 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
Affiliation(s)
- Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Helge Raeder
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
- Frode S Berven
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Harald Barsnes
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
- Marc Vaudel
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
|
21
|
Mao Y, Daly TJ, Li N. Lys-Sequencer: An algorithm for de novo sequencing of peptides by paired single residue transposed Lys-C and Lys-N digestion coupled with high-resolution mass spectrometry. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2020; 34:e8574. [PMID: 31499586 DOI: 10.1002/rcm.8574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2019] [Revised: 08/27/2019] [Accepted: 09/02/2019] [Indexed: 06/10/2023]
Abstract
RATIONALE Database-dependent identification of proteins by mass spectrometry is well established, but has limitations when there are novel proteins, mutations, splice variants, and post-translational modifications (PTMs) not available in the established reference database. De novo sequencing as a database-independent approach could address these limitations by deducing peptide sequences directly from experimental tandem mass spectrometry spectra, while concomitantly yielding residue-by-residue confidence metrics. METHODS Equal amounts of bovine serum albumin (BSA) sample aliquots were digested separately with Lys-C and Lys-N complementary peptidases, separated by reversed-phase ultra-high-performance liquid chromatography (UPLC), and analyzed by collision-induced dissociation (CID)-based mass spectrometry on an Orbitrap mass spectrometer. In the Lys-Sequencer algorithm, matched tandem mass spectra with equal precursor ion mass from complementary digestions were paired, and fragment ion types were identified based on the unique mass relationship between fragment ions extracted from a spectrum pair followed by de novo sequencing of peptides with identification confidence assigned at the residue level. RESULTS In all the matched spectrum pairs, 34 top-ranked BSA peptides were identified, from which 391 amino acid residues were identified correctly, covering ~67% of the full sequence of BSA (583 residues) with only ~6% (35 residues) exhibiting ambiguity in the sequence order (although amino acid compositions were still correctly assigned). Of note, this approach identified peptide sequences up to 17 amino acids in length without ambiguity, with the exception of the N-terminal or C-terminal peptides containing lysine (18-mer). CONCLUSIONS The algorithm ("Lys-Sequencer") developed in this work achieves high precision for de novo sequencing of peptides. 
This method facilitates the identification of point mutations and new PTMs in protein characterization, and the discovery of new peptides and proteins, with varying levels of confidence.
Affiliation(s)
- Yuan Mao
- Department of Analytical Chemistry, Regeneron Pharmaceuticals, Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Thomas J Daly
- Department of Analytical Chemistry, Regeneron Pharmaceuticals, Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
- Ning Li
- Department of Analytical Chemistry, Regeneron Pharmaceuticals, Inc., 777 Old Saw Mill River Road, Tarrytown, NY, 10591, USA
|
23
|
Tilocca B, Costanzo N, Morittu VM, Spina AA, Soggiu A, Britti D, Roncada P, Piras C. Milk microbiota: Characterization methods and role in cheese production. J Proteomics 2019; 210:103534. [PMID: 31629058 DOI: 10.1016/j.jprot.2019.103534] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Received: 07/17/2019] [Revised: 09/04/2019] [Accepted: 09/26/2019] [Indexed: 02/07/2023]
Abstract
Milk is a complex body fluid aimed at addressing the nutritional and defensive needs of mammalian newborns. The harbored microbiota plays a pivotal role throughout the cheesemaking process and contributes to the development of the flavor and texture typical of different types of cheese. Understanding the dairy microbiota dynamics is of paramount importance for controlling the qualitative, sensorial and biosafety features of dairy products. Although many studies have investigated the contribution of single or few microorganisms, information about microbial communities is still lacking. The widespread availability of omics platforms and bioinformatic tools enables the investigation of the cheese-associated microbial community in both phylogenetic and functional terms, highlighting the effects of the diverse cheesemaking variables. In this review, the most relevant literature is reviewed to provide an introduction to the milk- and cheese-associated microbiota, along with their structural and functional dynamics in relation to the diverse cheesemaking technologies and influencing variables. Also, we focus our attention on the latest omics technologies adopted in dairy microbiota investigation. A discussion of the key steps and major drawbacks of each omics discipline is provided, along with a collection of results from the latest research studies performed to unravel the fascinating world of the dairy-associated microbiota. SIGNIFICANCE: Understanding the milk- and cheese-associated microbial community is nowadays considered a key factor in the dairy industry, since it provides comprehensive knowledge of how all phases of the cheesemaking process affect the harbored microflora and thus helps predict the consequences for the finished products in terms of texture, organoleptic characteristics, palatability and biosafety.
This review collects the pioneering and milestone works performed so far in the field of dairy microbiota and provides basic guidance to those approaching cheese microbiota investigation by means of the latest omics technologies. Also, the review emphasizes the benefits and drawbacks of the omics disciplines and underlines how the integration of diverse omics sciences enhances a comprehensive depiction of the cheese microbiota. In turn, a better understanding of the dairy microbiota might result in the application of improved starter cultures, cheesemaking practices and technologies, supporting a bio-safe and standardized production of cheese, with a strong economic benefit for both large-scale industries and local traditional dairy farms.
Affiliation(s)
- Bruno Tilocca
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
- Nicola Costanzo
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
- Valeria Maria Morittu
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
- Anna Antonella Spina
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
- Alessio Soggiu
- Department of Veterinary Sciences, University of Milano, Milano, Italy
- Domenico Britti
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy
- Paola Roncada
- Department of Health Sciences, University Magna Græcia of Catanzaro, Catanzaro, Italy.
- Cristian Piras
- Department of Chemistry, University of Reading, Reading, United Kingdom
|
24
|
Li C, Li K, Li K, Xie X, Lin F. SWPepNovo: An Efficient De Novo Peptide Sequencing Tool for Large-scale MS/MS Spectra Analysis. Int J Biol Sci 2019; 15:1787-1801. [PMID: 31523183 PMCID: PMC6743289 DOI: 10.7150/ijbs.32142] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/10/2018] [Accepted: 04/09/2019] [Indexed: 12/17/2022] Open
Abstract
Tandem mass spectrometry (MS/MS)-based de novo peptide sequencing is a powerful method for high-throughput protein analysis. However, the explosively increasing size of MS/MS spectra dataset inevitably and exponentially raises the computational demand of existing de novo peptide sequencing methods, which is an issue urgently to be solved in computational biology. This paper introduces an efficient tool based on SW26010 many-core processor, namely SWPepNovo, to process the large-scale peptide MS/MS spectra using a parallel peptide spectrum matches (PSMs) algorithm. Our design employs a two-level parallelization mechanism: (1) the task-level parallelism between MPEs using MPI based on a data transformation method and a dynamic feedback task scheduling algorithm, (2) the thread-level parallelism across CPEs using asynchronous task transfer and multithreading. Moreover, three optimization strategies, including vectorization, double buffering and memory access optimizations, have been employed to overcome both the compute-bound and the memory-bound bottlenecks in the parallel PSMs algorithm. The results of experiments conducted on multiple spectra datasets demonstrate the performance of SWPepNovo against three state-of-the-art tools for peptide sequencing, including PepNovo+, PEAKS and DeepNovo-DIA. The SWPepNovo also shows high scalability in experiments on extremely large datasets sized up to 11.22 GB. The software and the parameter settings are available at https://github.com/ChuangLi99/SWPepNovo.
Affiliation(s)
- Chuang Li
- College of Information Science and Engineering, Hunan University, Changsha, China
- Kenli Li
- College of Information Science and Engineering, Hunan University, National Supercomputing Center in Changsha, Changsha, China
- Keqin Li
- College of Information Science and Engineering, Hunan University, Department of Computer Science, State University of New York, NY, USA
- Xianghui Xie
- State Key Laboratory of Mathematic Engineering and Advance Computing, Wuxi Jiangnan Institute of Computing Technology, Jiangsu, China
- Feng Lin
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
|
25
|
The Radical-Scavenging Activity of a Purified and Sequenced Peptide from Lactic Acid Fermentation of Thunnus albacares By-Products. Appl Biochem Biotechnol 2019; 189:1084-1095. [PMID: 31161384 DOI: 10.1007/s12010-019-03045-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/07/2019] [Accepted: 05/10/2019] [Indexed: 01/11/2023]
Abstract
Yellowfin tuna by-products (Thunnus albacares) were processed to produce radical-scavenging peptides from hydrolysis by lactic acid fermentation (LAF) with Lactobacillus plantarum, papaya fruit (Carica papaya), and molasses as a carbon source for 72 h. A 15-kDa peptide was purified; after de novo sequencing, it was determined that fragments are rich in hydrophobic and neutral amino acids. The results suggest this effect is mainly due to the hydrophobicity of the amino acids in their sequence. Further work is in progress to assess the ability of peptides to provide stability in lipids or in other types of samples sensitive to the action of free radicals.
|
26
|
Allmer J. Towards an Internet of Science. J Integr Bioinform 2019; 16:/j/jib.ahead-of-print/jib-2019-0024/jib-2019-0024.xml. [PMID: 31145694 PMCID: PMC6798852 DOI: 10.1515/jib-2019-0024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/03/2019] [Accepted: 04/25/2019] [Indexed: 11/15/2022] Open
Abstract
Big data and complex analysis workflows (pipelines) are common issues in data driven science such as bioinformatics. Large amounts of computational tools are available for data analysis. Additionally, many workflow management systems to piece together such tools into data analysis pipelines have been developed. For example, more than 50 computational tools for read mapping are available representing a large amount of duplicated effort. Furthermore, it is unclear whether these tools are correct and only a few have a user base large enough to have encountered and reported most of the potential problems. Bringing together many largely untested tools in a computational pipeline must lead to unpredictable results. Yet, this is the current state. While presently data analysis is performed on personal computers/workstations/clusters, the future will see development and analysis shift to the cloud. None of the workflow management systems is ready for this transition. This presents the opportunity to build a new system, which will overcome current duplications of effort, introduce proper testing, allow for development and analysis in public and private clouds, and include reporting features leading to interactive documents.
Affiliation(s)
- Jens Allmer
- Hochschule Ruhr West, University of Applied Sciences, Medical Informatics and Bioinformatics, 45407 Mülheim an der Ruhr, Germany
|
27
|
Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019; 19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. 
Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.
Affiliation(s)
- Thilo Muth
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
- Bernhard Y Renard
- Research Group Bioinformatics, Robert Koch Institute, Berlin, Germany
|
28
|
Kunath BJ, Minniti G, Skaugen M, Hagen LH, Vaaje-Kolstad G, Eijsink VGH, Pope PB, Arntzen MØ. Metaproteomics: Sample Preparation and Methodological Considerations. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019; 1073:187-215. [DOI: 10.1007/978-3-030-12298-0_8] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
|
29
|
Kwon OK, Jeon JM, Sung E, Na AY, Kim SJ, Lee S. Comparative Secretome Profiling and Mutant Protein Identification in Metastatic Prostate Cancer Cells by Quantitative Mass Spectrometry-based Proteomics. Cancer Genomics Proteomics 2018; 15:279-290. [PMID: 29976633 DOI: 10.21873/cgp.20086] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/04/2018] [Revised: 06/04/2018] [Accepted: 06/06/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Secreted proteins play an important role in promoting prostate cancer (PCa) cell migration and invasion. Proteogenomics helps elucidate the mechanism of diseases, discover therapeutic targets, and generate biomarkers for diagnosis through protein variations. MATERIALS AND METHODS We carried out a mass spectrometry-based proteomic analysis of the conditioned media (CM) from two human prostate cancer cell lines, belonging to different metastatic sites, to identify potential metastatic and/or aggressive factors. RESULTS We identified a total of 598 proteins, among which 561 were quantified based on proteomic analysis. Among the quantified proteins, 128 were up-regulated and 83 were down-regulated in DU145/PC3 cells. Six mutant peptides were identified in the CM of prostate cancer cell lines using a proteogenomics approach. CONCLUSION This is the first proteogenomics study in PCa aiming at exploring a new type of metastatic factor, namely mutant peptides, and at predicting a novel biomarker of metastatic PCa for diagnosis, prognosis and drug targeting.
Affiliation(s)
- Oh Kwang Kwon
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
- Ju Mi Jeon
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
- Eunji Sung
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
- Ann-Yea Na
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
- Sun Joo Kim
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
- Sangkyu Lee
- College of Pharmacy, Research Institute of Pharmaceutical Sciences, BK21 Plus KNU Multi-Omics-based Creative Drug Research Team, Kyungpook National University, Daegu, Republic of Korea
|
30
|
Muth T, Hartkopf F, Vaudel M, Renard BY. A Potential Golden Age to Come-Current Tools, Recent Use Cases, and Future Avenues for De Novo Sequencing in Proteomics. Proteomics 2018; 18:e1700150. [PMID: 29968278 DOI: 10.1002/pmic.201700150] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/23/2018] [Revised: 05/23/2018] [Indexed: 01/15/2023]
Abstract
In shotgun proteomics, peptide and protein identification is most commonly conducted using database search engines, the method of choice when reference protein sequences are available. Despite its widespread use the database-driven approach is limited, mainly because of its static search space. In contrast, de novo sequencing derives peptide sequence information in an unbiased manner, using only the fragment ion information from the tandem mass spectra. In recent years, with the improvements in MS instrumentation, various new methods have been proposed for de novo sequencing. This review article provides an overview of existing de novo sequencing algorithms and software tools ranging from peptide sequencing to sequence-to-protein mapping. Various use cases are described for which de novo sequencing was successfully applied. Finally, limitations of current methods are highlighted and new directions are discussed for a wider acceptance of de novo sequencing in the community.
Affiliation(s)
- Thilo Muth
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
- Felix Hartkopf
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
- Marc Vaudel
- K.G. Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, 5020, Bergen, Norway; Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, 5020, Bergen, Norway
- Bernhard Y Renard
- Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353, Berlin, Germany
|
31
|
Cho JY, Kim JK. Isolation and identification of a novel algicidal peptide from mackerel muscle hydrolysate. J Chromatogr B Analyt Technol Biomed Life Sci 2018; 1093-1094:39-46. [PMID: 29990711 DOI: 10.1016/j.jchromb.2018.06.056] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/06/2018] [Revised: 06/16/2018] [Accepted: 06/28/2018] [Indexed: 10/28/2022]
Abstract
To help remedy damage from harmful algal blooms, an attempt was made to isolate an algicidal substance previously observed to be present in mackerel muscle hydrolysate. Crude extract was obtained by cold acetone precipitation, and it dissolved best in water. Through molecular weight cut-off determination and tricine-SDS PAGE, the algicidal substance was determined to be a peptide of <1 kDa. Based on this result, purification was first performed using size exclusion chromatography and preparative reverse phase high-performance liquid chromatography. Then, the active algicidal fraction was applied to an ultra-performance liquid chromatography-electrospray ionization-mass spectrometry system, followed by MS/MS analysis. The algicidal peptide had linear structure consisting of amino acids with sequence NH-KMNF-COOH. Its calculated properties were: molecular weight 538.66 g/mol; isoelectric point 9.91; net charge +1 at pH 7.0; and 50% hydrophobicity. Algicidal ability of the identified peptide was confirmed using synthesized peptide. The LC50 values toward four harmful algal blooming species were 0.69, 0.83, 0.85 and 1.24 mg/ml for Alexandrium fundyense, A. catenella, Heterocapsa triquetra, and Prorocentrum minimum, respectively. There was no coincidence in the sequence of the identified peptide with those of known metabolites in the APD, Norine, CAMP, UniProt and METLIN databases. Consequently, this algicidal substance originating from mackerel protein was deduced to be a novel peptide that can usefully be applied to relieve harmful algal blooms.
Affiliation(s)
- Ja Young Cho
- Department of Biotechnology, Pukyong National University, Busan, 48513, South Korea
- Joong Kyun Kim
- Department of Biotechnology, Pukyong National University, Busan, 48513, South Korea.
|
32
|
Dimitrakopoulos L, Prassas I, Diamandis EP, Charames GS. Onco-proteogenomics: Multi-omics level data integration for accurate phenotype prediction. Crit Rev Clin Lab Sci 2017; 54:414-432. [DOI: 10.1080/10408363.2017.1384446] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Affiliation(s)
- Lampros Dimitrakopoulos
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
| | - Ioannis Prassas
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
| | - Eleftherios P. Diamandis
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Department of Clinical Biochemistry, University Health Network, Toronto, ON, Canada
| | - George S. Charames
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Department of Pathology and Laboratory Medicine, Mount Sinai Hospital, Joseph and Wolf Lebovic Health Complex, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
| |
|
33
|
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. Mass Spectrom Rev 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Kshitij Khatri
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| |
|
34
|
|
35
|
Savidor A, Barzilay R, Elinger D, Yarden Y, Lindzen M, Gabashvili A, Adiv Tal O, Levin Y. Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination. Mol Cell Proteomics 2017; 16:1151-1161. [PMID: 28348172 DOI: 10.1074/mcp.o116.065417] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/10/2016] [Revised: 03/22/2017] [Indexed: 01/16/2023] Open
Abstract
Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence.
Affiliation(s)
- Alon Savidor
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| | - Rotem Barzilay
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| | - Dalia Elinger
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| | - Yosef Yarden
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel 76100
| | - Moshit Lindzen
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot, Israel 76100
| | - Alexandra Gabashvili
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| | - Ophir Adiv Tal
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| | - Yishai Levin
- The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot
| |
|
36
|
Rieder V, Blank-Landeshammer B, Stuhr M, Schell T, Biß K, Kollipara L, Meyer A, Pfenninger M, Westphal H, Sickmann A, Rahnenführer J. DISMS2: A flexible algorithm for direct proteome-wide distance calculation of LC-MS/MS runs. BMC Bioinformatics 2017; 18:148. [PMID: 28253837 PMCID: PMC5335755 DOI: 10.1186/s12859-017-1514-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/08/2016] [Accepted: 01/31/2017] [Indexed: 01/15/2023] Open
Abstract
BACKGROUND The classification of samples on a molecular level has manifold applications, from patient classification regarding cancer treatment to phylogenetics for identifying evolutionary relationships between species. Modern methods employ the alignment of DNA or amino acid sequences, mostly not genome-wide but only on selected parts of the genome. Recently proteomics-based approaches have become popular. An established method for the identification of peptides and proteins is liquid chromatography-tandem mass spectrometry (LC-MS/MS). First, protein sequences from MS/MS spectra are identified by means of database searches, given samples with known genome-wide sequence information, then sequence based methods are applied. Alternatively, de novo peptide sequencing algorithms annotate MS/MS spectra and deduce peptide/protein information without a database. A newer approach independent of additional information is to directly compare unidentified tandem mass spectra. The challenge then is to compute the distance between pairwise MS/MS runs consisting of thousands of spectra. METHODS We present DISMS2, a new algorithm to calculate proteome-wide distances directly from MS/MS data, extending the algorithm compareMS2, an approach that also uses a spectral comparison pipeline. RESULTS Our new more flexible algorithm, DISMS2, allows for the choice of the spectrum distance measure and includes different spectra preprocessing and filtering steps that can be tailored to specific situations by parameter optimization. CONCLUSIONS DISMS2 performs well for samples from species with and without database annotation and thus has clear advantages over methods that are purely based on database search.
Affiliation(s)
- Vera Rieder
- Department of Statistics, TU Dortmund University, Dortmund, Germany
| | | | - Marleen Stuhr
- Leibniz Center for Tropical Marine Ecology (ZMT), Bremen, Germany
| | - Tilman Schell
- Biodiversity and Climate Research Centre, Senckenberg Gesellschaft für Naturforschung, Frankfurt, Germany
| | - Karsten Biß
- Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
| | - Laxmikanth Kollipara
- Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
| | - Achim Meyer
- Leibniz Center for Tropical Marine Ecology (ZMT), Bremen, Germany
| | - Markus Pfenninger
- Biodiversity and Climate Research Centre, Senckenberg Gesellschaft für Naturforschung, Frankfurt, Germany
- Faculty of Biological Science, Institute for Ecology, Evolution and Diversity, Department of Molecular Ecology, Goethe University, Max-von-Laue-Straße 9, Frankfurt am Main, 60438 Germany
| | | | - Albert Sickmann
- Leibniz-Institut für Analytische Wissenschaften - ISAS - e.V., Dortmund, Germany
- Department of Chemistry, College of Physical Sciences, University of Aberdeen, Aberdeen, Scotland, United Kingdom
- Medizinische Fakultät, Medizinisches Proteom-Center (MPC), Ruhr-Universität Bochum, Universitätsstraße 150, Bochum, 44801 Germany
| | | |
|
37
|
Zhang S, Shan Y, Zhang S, Sui Z, Zhang L, Liang Z, Zhang Y. NIPTL-Novo: Non-isobaric peptide termini labeling assisted peptide de novo sequencing. J Proteomics 2017; 154:40-48. [DOI: 10.1016/j.jprot.2016.12.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/08/2016] [Revised: 12/07/2016] [Accepted: 12/08/2016] [Indexed: 12/28/2022]
|
38
|
Guan X, Brownstein NC, Young NL, Marshall AG. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion. Rapid Commun Mass Spectrom 2017; 31:207-217. [PMID: 27813191 DOI: 10.1002/rcm.7783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/29/2016] [Revised: 10/29/2016] [Accepted: 10/30/2016] [Indexed: 06/06/2023]
Abstract
RATIONALE Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage. METHODS Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry. RESULTS Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented. CONCLUSIONS Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Xiaoyan Guan
- Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
| | - Naomi C Brownstein
- Department of Behavioral Sciences and Social Medicine, College of Medicine, Florida State University, 1115 W. Call St., Tallahassee, FL, 32306, USA
- Department of Statistics, Florida State University, 117 N. Woodward Ave., Tallahassee, FL, 32306, USA
| | - Nicolas L Young
- Verna & Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, One Baylor Plaza, MS-125, Houston, TX, 77030-3411, USA
| | - Alan G Marshall
- Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Florida State University, 1800 East Paul Dirac Drive, Tallahassee, FL, 32310, USA
- Department of Chemistry and Biochemistry, Florida State University, 95 Chieftain Way, Tallahassee, FL, 32303, USA
| |
|
39
|
Yang H, Chi H, Zhou WJ, Zeng WF, He K, Liu C, Sun RX, He SM. Open-pNovo: De Novo Peptide Sequencing with Thousands of Protein Modifications. J Proteome Res 2017; 16:645-654. [PMID: 28019094 DOI: 10.1021/acs.jproteome.6b00716] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 01/18/2023]
Abstract
De novo peptide sequencing has improved remarkably, but sequencing full-length peptides with unexpected modifications is still a challenging problem. Here we present an open de novo sequencing tool, Open-pNovo, for de novo sequencing of peptides with arbitrary types of modifications. Although the search space increases by ∼300 times, Open-pNovo is close to or even ∼10-times faster than the other three proposed algorithms. Furthermore, considering top-1 candidates on three MS/MS data sets, Open-pNovo can recall over 90% of the results obtained by any one traditional algorithm and report 5-87% more peptides, including 14-250% more modified peptides. On a high-quality simulated data set, ∼85% peptides with arbitrary modifications can be recalled by Open-pNovo, while hardly any results can be recalled by others. In summary, Open-pNovo is an excellent tool for open de novo sequencing and has great potential for discovering unexpected modifications in the real biological applications.
Affiliation(s)
- Hao Yang
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hao Chi
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China
| | - Wen-Jing Zhou
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Wen-Feng Zeng
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Kun He
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| | - Chao Liu
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China
| | - Rui-Xiang Sun
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China
| | - Si-Min He
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, Chinese Academy of Sciences , Beijing 100190, China.,University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
40
|
Islam MT, Mohamedali A, Fernandes CS, Baker MS, Ranganathan S. De Novo Peptide Sequencing: Deep Mining of High-Resolution Mass Spectrometry Data. Methods Mol Biol 2017; 1549:119-134. [PMID: 27975288 DOI: 10.1007/978-1-4939-6740-7_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]
Abstract
High resolution mass spectrometry has revolutionized proteomics over the past decade, resulting in tremendous amounts of data in the form of mass spectra, being generated in a relatively short span of time. The mining of this spectral data for analysis and interpretation though has lagged behind such that potentially valuable data is being overlooked because it does not fit into the mold of traditional database searching methodologies. Although the analysis of spectra by de novo sequences removes such biases and has been available for a long period of time, its uptake has been slow or almost nonexistent within the scientific community. In this chapter, we propose a methodology to integrate de novo peptide sequencing using three commonly available software solutions in tandem, complemented by homology searching, and manual validation of spectra. This simplified method would allow greater use of de novo sequencing approaches and potentially greatly increase proteome coverage leading to the unearthing of valuable insights into protein biology, especially of organisms whose genomes have been recently sequenced or are poorly annotated.
Affiliation(s)
- Mohammad Tawhidul Islam
- Department of Chemistry and Biomolecular Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, 2109, Australia
| | - Abidali Mohamedali
- Department of Chemistry and Biomolecular Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, 2109, Australia
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW, 2109, Australia
| | - Criselda Santan Fernandes
- Department of Chemistry and Biomolecular Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, 2109, Australia
| | - Mark S Baker
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW, 2109, Australia
| | - Shoba Ranganathan
- Department of Chemistry and Biomolecular Sciences, Faculty of Science and Engineering, Macquarie University, Sydney, NSW, 2109, Australia.
| |
|
41
|
Bertile F, Fouillen L, Wasselin T, Maes P, Le Maho Y, Van Dorsselaer A, Raclot T. The Safety Limits Of An Extended Fast: Lessons from a Non-Model Organism. Sci Rep 2016; 6:39008. [PMID: 27991520 PMCID: PMC5171797 DOI: 10.1038/srep39008] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/19/2016] [Accepted: 11/16/2016] [Indexed: 02/03/2023] Open
Abstract
While safety of fasting therapy is debated in humans, extended fasting occurs routinely and safely in wild animals. To do so, food deprived animals like breeding penguins anticipate the critical limit of fasting by resuming feeding. To date, however, no molecular indices of the physiological state that links spontaneous refeeding behaviour with fasting limits had been identified. Blood proteomics and physiological data reveal here that fasting-induced body protein depletion is not unsafe “per se”. Indeed, incubating penguins only abandon their chick/egg to refeed when this state is associated with metabolic defects in glucose homeostasis/fatty acid utilization, insulin production and action, and possible renal dysfunctions. Our data illustrate how the field investigation of “exotic” models can be a unique source of information, with possible biomedical interest.
Affiliation(s)
- Fabrice Bertile
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
| | - Laetitia Fouillen
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
| | - Thierry Wasselin
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
| | - Pauline Maes
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
| | - Yvon Le Maho
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Département Ecologie, Physiologie et Ethologie, 23 rue Becquerel, 67087 Strasbourg, France
| | - Alain Van Dorsselaer
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Laboratoire de Spectrométrie de Masse Bio-Organique, 25 rue Becquerel, 67087 Strasbourg, France
| | - Thierry Raclot
- CNRS, UMR7178, 67037 Strasbourg, France.,Université de Strasbourg, IPHC, Département Ecologie, Physiologie et Ethologie, 23 rue Becquerel, 67087 Strasbourg, France
| |
|
42
|
Fomin E. A Simple Approach to the Reconstruction of a Set of Points from the Multiset of n² Pairwise Distances in n² Steps for the Sequencing Problem: II. Algorithm. J Comput Biol 2016; 23:934-942. [DOI: 10.1089/cmb.2016.0046] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Affiliation(s)
- Eduard Fomin
- Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia
| |
|
43
|
Fomin E. A Simple Approach to the Reconstruction of a Set of Points from the Multiset of n² Pairwise Distances in n² Steps for the Sequencing Problem: I. Theory. J Comput Biol 2016; 23:769-75. [DOI: 10.1089/cmb.2016.0044] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Affiliation(s)
- Eduard Fomin
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| |
|
44
|
Gorshkov V, Hotta SYK, Verano-Braga T, Kjeldsen F. Peptide de novo sequencing of mixture tandem mass spectra. Proteomics 2016; 16:2470-9. [PMID: 27329701 PMCID: PMC5297990 DOI: 10.1002/pmic.201500549] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 12/26/2015] [Revised: 04/27/2016] [Accepted: 06/17/2016] [Indexed: 02/02/2023]
Abstract
The impact of mixture spectra deconvolution on the performance of four popular de novo sequencing programs was tested using artificially constructed mixture spectra as well as experimental proteomics data. Mixture fragmentation spectra are recognized as a limitation in proteomics because they decrease the identification performance using database search engines. De novo sequencing approaches are expected to be even more sensitive to the reduction in mass spectrum quality resulting from peptide precursor co‐isolation and thus prone to false identifications. The deconvolution approach matched complementary b‐, y‐ions to each precursor peptide mass, which allowed the creation of virtual spectra containing sequence specific fragment ions of each co‐isolated peptide. Deconvolution processing resulted in equally efficient identification rates but increased the absolute number of correctly sequenced peptides. The improvement was in the range of 20–35% additional peptide identifications for a HeLa lysate sample. Some correct sequences were identified only using unprocessed spectra; however, the number of these was lower than those where improvement was obtained by mass spectral deconvolution. Tight candidate peptide score distribution and high sensitivity to small changes in the mass spectrum introduced by the employed deconvolution method could explain some of the missing peptide identifications.
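The complementary-ion idea mentioned in this abstract rests on a simple mass relation: for singly charged fragments of one peptide, the m/z of a b-ion and its complementary y-ion add up to the neutral precursor mass plus two proton masses. The following is a minimal sketch of that pairing step only, not the authors' deconvolution method. The function name `complementary_pairs`, the tolerance, and the example peak list (theoretical singly charged fragments of the tetrapeptide KMNF plus one unrelated peak) are all illustrative assumptions.

```python
PROTON = 1.007276  # mass of a proton in Da


def complementary_pairs(peaks, precursor_mass, tol=0.02):
    """Return (low, high) m/z pairs whose sum matches M + 2 protons.

    peaks: m/z values of singly charged fragment ions.
    precursor_mass: neutral monoisotopic mass M of the candidate precursor.
    tol: absolute tolerance in Da (an illustrative default).
    """
    target = precursor_mass + 2 * PROTON
    ordered = sorted(peaks)
    pairs = []
    lo, hi = 0, len(ordered) - 1
    # Two-pointer scan over the sorted peak list.
    while lo < hi:
        total = ordered[lo] + ordered[hi]
        if abs(total - target) <= tol:
            pairs.append((ordered[lo], ordered[hi]))
            lo += 1
            hi -= 1
        elif total < target:
            lo += 1
        else:
            hi -= 1
    return pairs


# Hypothetical spectrum: b1/y3 and b2/y2 of KMNF plus one unrelated peak.
example_peaks = [129.1022, 260.1427, 280.1292, 300.0, 411.1697]
print(complementary_pairs(example_peaks, precursor_mass=538.2574))
```

In a co-isolated spectrum, running this once per candidate precursor mass would assign each matched pair to the precursor it is consistent with; peaks that pair with neither precursor stay unassigned.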
Affiliation(s)
- Vladimir Gorshkov
- Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark.
| | | | - Thiago Verano-Braga
- Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark.,Department of Physiology and Biophysics, Federal University of Minas Gerais Belo Horizonte - MG, Belo Horizonte, Brazil
| | - Frank Kjeldsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark Odense M, Odense, Denmark
| |
|
45
|
Muth T, Renard BY, Martens L. Metaproteomic data analysis at a glance: advances in computational microbial community proteomics. Expert Rev Proteomics 2016; 13:757-69. [DOI: 10.1080/14789450.2016.1209418] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 02/07/2023]
|
46
|
Gillet LC, Leitner A, Aebersold R. Mass Spectrometry Applied to Bottom-Up Proteomics: Entering the High-Throughput Era for Hypothesis Testing. Annu Rev Anal Chem (Palo Alto Calif) 2016; 9:449-72. [PMID: 27049628 DOI: 10.1146/annurev-anchem-071015-041535] [Citation(s) in RCA: 218] [Impact Index Per Article: 27.3] [Indexed: 05/03/2023]
Abstract
Proteins constitute a key class of molecular components that perform essential biochemical reactions in living cells. Whether the aim is to extensively characterize a given protein or to perform high-throughput qualitative and quantitative analysis of the proteome content of a sample, liquid chromatography coupled to tandem mass spectrometry has become the technology of choice. In this review, we summarize the current state of mass spectrometry applied to bottom-up proteomics, the approach that focuses on analyzing peptides obtained from proteolytic digestion of proteins. With the recent advances in instrumentation and methodology, we show that the field is moving away from providing qualitative identification of long lists of proteins to delivering highly consistent and accurate quantification values for large numbers of proteins across large numbers of samples. We believe that this shift will have a profound impact for the field of proteomics and life science research in general.
Affiliation(s)
- Ludovic C Gillet
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Alexander Leitner
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland
- Faculty of Science, University of Zürich, 8057 Zürich, Switzerland
| |
|
47
|
Yılmaz Ş, Victor B, Hulstaert N, Vandermarliere E, Barsnes H, Degroeve S, Gupta S, Sticker A, Gabriël S, Dorny P, Palmblad M, Martens L. A Pipeline for Differential Proteomics in Unsequenced Species. J Proteome Res 2016; 15:1963-70. [DOI: 10.1021/acs.jproteome.6b00140] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/20/2022]
Affiliation(s)
- Şule Yılmaz
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| | - Bjorn Victor
- Veterinary
Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
| | - Niels Hulstaert
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| | - Elien Vandermarliere
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| | - Harald Barsnes
- Proteomics
Unit (PROBE), Department of Biomedicine, University of Bergen, Jonas Liesvei 91, N-5009 Bergen, Norway
| | - Sven Degroeve
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| | - Surya Gupta
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| | - Adriaan Sticker
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department
of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics
Institute Ghent, Ghent University, B-9052 Ghent, Belgium
- Department
of Applied Mathematics, Computer Science, and Statistics, Ghent University, B-9000 Ghent, Belgium
| | - Sarah Gabriël
- Veterinary
Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
| | - Pierre Dorny
- Veterinary Helminthology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, 2000 Antwerp, Belgium
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Lennart Martens
- Medical Biotechnology Center, VIB, Albert Baertsoenkaai 3, Ghent B-9000, Belgium
- Department of Biochemistry, Ghent University, Albert Baertsoenkaai 3, B-9000 Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, B-9052 Ghent, Belgium
| |
|
48
|
Wessels HJCT, de Almeida NM, Kartal B, Keltjens JT. Bacterial Electron Transfer Chains Primed by Proteomics. Adv Microb Physiol 2016; 68:219-352. [PMID: 27134025 DOI: 10.1016/bs.ampbs.2016.02.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Electron transport phosphorylation is the central mechanism for most prokaryotic species to harvest energy released in the respiration of their substrates as ATP. Microorganisms have evolved incredible variations on this principle, most of these we perhaps do not know, considering that only a fraction of the microbial richness is known. Besides these variations, microbial species may show substantial versatility in using respiratory systems. In connection herewith, regulatory mechanisms control the expression of these respiratory enzyme systems and their assembly at the translational and posttranslational levels, to optimally accommodate changes in the supply of their energy substrates. Here, we present an overview of methods and techniques from the field of proteomics to explore bacterial electron transfer chains and their regulation at levels ranging from the whole organism down to the Ångstrom scales of protein structures. From the survey of the literature on this subject, it is concluded that proteomics, indeed, has substantially contributed to our comprehending of bacterial respiratory mechanisms, often in elegant combinations with genetic and biochemical approaches. However, we also note that advanced proteomics offers a wealth of opportunities, which have not been exploited at all, or at best underexploited in hypothesis-driving and hypothesis-driven research on bacterial bioenergetics. Examples obtained from the related area of mitochondrial oxidative phosphorylation research, where the application of advanced proteomics is more common, may illustrate these opportunities.
Affiliation(s)
- H J C T Wessels
- Nijmegen Center for Mitochondrial Disorders, Radboud Proteomics Centre, Translational Metabolic Laboratory, Radboud University Medical Center, Nijmegen, The Netherlands
| | - N M de Almeida
- Institute of Water and Wetland Research, Radboud University Nijmegen, Nijmegen, The Netherlands
| | - B Kartal
- Institute of Water and Wetland Research, Radboud University Nijmegen, Nijmegen, The Netherlands; Laboratory of Microbiology, Ghent University, Ghent, Belgium
| | - J T Keltjens
- Institute of Water and Wetland Research, Radboud University Nijmegen, Nijmegen, The Netherlands.
| |
|
49
|
Statistical prediction of protein structural, localization and functional properties by the analysis of its fragment mass distributions after proteolytic cleavage. Sci Rep 2016; 6:22286. [PMID: 26924271 PMCID: PMC4770285 DOI: 10.1038/srep22286] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/16/2015] [Accepted: 02/11/2016] [Indexed: 12/03/2022] Open
Abstract
Structural, localization and functional properties of unknown proteins are often being predicted from their primary polypeptide chains using sequence alignment with already characterized proteins and consequent molecular modeling. Here we suggest an approach to predict various structural and structure-associated properties of proteins directly from the mass distributions of their proteolytic cleavage fragments. For amino-acid-specific cleavages, the distributions of fragment masses are determined by the distributions of inter-amino-acid intervals in the protein, that in turn apparently reflect its structural and structure-related features. Large-scale computer simulations revealed that for transmembrane proteins, either α-helical or β-barrel secondary structure could be predicted with about 90% accuracy after thermolysin cleavage. Moreover, 3/4 intrinsically disordered proteins could be correctly distinguished from proteins with fixed three-dimensional structure belonging to all four SCOP structural classes by combining 3–4 different cleavages. Additionally, in some cases the protein cellular localization (cytosolic or membrane-associated) and its host organism (Firmicute or Proteobacteria) could be predicted with around 80% accuracy. In contrast to cytosolic proteins, for membrane-associated proteins exhibiting specific structural conformations, their monotopic or transmembrane localization and functional group (ATP-binding, transporters, sensors and so on) could be also predicted with high accuracy and particular robustness against missing cleavages.
|
50
|
Devabhaktuni A, Elias JE. Application of de Novo Sequencing to Large-Scale Complex Proteomics Data Sets. J Proteome Res 2016; 15:732-42. [PMID: 26743026 DOI: 10.1021/acs.jproteome.5b00861] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/28/2022]
Abstract
Dependent on concise, predefined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics data sets and present a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to that of other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.
Affiliation(s)
- Arun Devabhaktuni
- Department of Chemical & Systems Biology, Stanford University, Stanford, California 94035, United States
| | - Joshua E Elias
- Department of Chemical & Systems Biology, Stanford University, Stanford, California 94035, United States
| |
|