1
Ng CCA, Zhou Y, Yao ZP. Algorithms for de-novo sequencing of peptides by tandem mass spectrometry: A review. Anal Chim Acta 2023; 1268:341330. [PMID: 37268337 DOI: 10.1016/j.aca.2023.341330] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/22/2022] [Revised: 05/04/2023] [Accepted: 05/06/2023] [Indexed: 06/04/2023]
Abstract
Peptide sequencing is of great significance to fundamental and applied research in fields such as the chemical, biological, medicinal and pharmaceutical sciences. With the rapid development of mass spectrometry and sequencing algorithms, de-novo peptide sequencing using tandem mass spectrometry (MS/MS) has become the main method for determining amino acid sequences of novel and unknown peptides. Advanced algorithms allow amino acid sequence information to be obtained accurately from MS/MS spectra in a short time. In this review, algorithms ranging from exhaustive search to state-of-the-art machine learning and neural network methods for high-throughput and automated de-novo sequencing are introduced and compared. The impacts of datasets on algorithm performance are highlighted. The current limitations and promising directions of de-novo peptide sequencing are also discussed.
Affiliation(s)
- Cheuk Chi A Ng, Yin Zhou, Zhong-Ping Yao
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
2
Beslic D, Tscheuschner G, Renard BY, Weller MG, Muth T. Comprehensive evaluation of peptide de novo sequencing tools for monoclonal antibody assembly. Brief Bioinform 2022; 24:6955273. [PMID: 36545804 PMCID: PMC9851299 DOI: 10.1093/bib/bbac542] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/09/2022] [Revised: 10/25/2022] [Accepted: 11/10/2022] [Indexed: 12/24/2022] Open
Abstract
Monoclonal antibodies are biotechnologically produced proteins with various applications in research, therapeutics and diagnostics. Their ability to recognize and bind to specific molecule structures makes them essential research tools and therapeutic agents. Sequence information of antibodies is helpful for understanding antibody-antigen interactions and ensuring their affinity and specificity. De novo protein sequencing based on mass spectrometry is a valuable method to obtain the amino acid sequence of peptides and proteins without a priori knowledge. In this study, we evaluated six recently developed de novo peptide sequencing algorithms (Novor, pNovo 3, DeepNovo, SMSNet, PointNovo and Casanovo), which were not specifically designed for antibody data. We validated their ability to identify and assemble antibody sequences on three multi-enzymatic data sets. The deep learning-based tools Casanovo and PointNovo showed an increased peptide recall across different enzymes and data sets compared with spectrum-graph-based approaches. We evaluated different error types of de novo peptide sequencing tools and their performance for different numbers of missing cleavage sites, noisy spectra and peptides of various lengths. We achieved a sequence coverage of 97.69-99.53% on the light chains of three different antibody data sets using the de Bruijn assembler ALPS and the predictions from Casanovo. However, low sequence coverage and accuracy on the heavy chains demonstrate that complete de novo protein sequencing remains a challenging issue in proteomics that requires improved de novo error correction, alternative digestion strategies and hybrid approaches such as homology search to achieve high accuracy on long protein sequences.
Affiliation(s)
- Denis Beslic, Georg Tscheuschner, Bernhard Y Renard, Michael G Weller, Thilo Muth
- Corresponding authors: D. Beslic, Robert Koch Institute, ZKI-PH 3, Nordufer 20, 13353 Berlin, Germany; G. Tscheuschner, Federal Institute for Materials Research and Testing (BAM), Richard-Willstätter-Straße 11, 12489 Berlin, Germany; B.Y. Renard, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Prof.-Dr.-Helmert-Straße 2-3, 14482 Potsdam, Germany; M.G. Weller, Federal Institute for Materials Research and Testing (BAM), Richard-Willstätter-Straße 11, 12489 Berlin, Germany; T. Muth, Federal Institute for Materials Research and Testing (BAM), Unter den Eichen 87, 12205 Berlin, Germany
3
Affinity Selection from Synthetic Peptide Libraries Enabled by De Novo MS/MS Sequencing. Int J Pept Res Ther 2022. [DOI: 10.1007/s10989-022-10370-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Recently, de novo MS/MS peptide sequencing has enabled the application of affinity selections to synthetic peptide mixtures that approach the diversity of phage libraries (> 10^8 random peptides). In conjunction with ‘split-mix’ solid phase synthesis to access equimolar peptide mixtures, this approach provides a straightforward means to examine synthetic peptide libraries of considerably higher diversity than has been feasible historically. Here, we offer a critical perspective on this work, report emerging data, and highlight opportunities for further methods refinement. With continued development, ‘affinity selection–mass spectrometry’ may become a complementary approach to phage display, in vitro selection, and DNA-encoded libraries for the discovery of synthetic ligands that modulate protein function.
4
O'Bryon I, Jenson SC, Merkley ED. Flying blind, or just flying under the radar? The underappreciated power of de novo methods of mass spectrometric peptide identification. Protein Sci 2020; 29:1864-1878. [PMID: 32713088 PMCID: PMC7454419 DOI: 10.1002/pro.3919] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/15/2020] [Revised: 07/21/2020] [Accepted: 07/23/2020] [Indexed: 12/15/2022]
Abstract
Mass spectrometry-based proteomics is a popular and powerful method for precise and highly multiplexed protein identification. The most common method of analyzing untargeted proteomics data is called database searching, where the database is simply a collection of protein sequences from the target organism, derived from genome sequencing. Experimental peptide tandem mass spectra are compared to simplified models of theoretical spectra calculated from the translated genomic sequences. However, in several interesting application areas, such as forensics, archaeology, venomics, and others, a genome sequence may not be available, or the correct genome sequence to use is not known. In these cases, de novo peptide identification can play an important role. De novo methods infer peptide sequence directly from the tandem mass spectrum without reference to a sequence database, usually using graph-based or machine learning algorithms. In this review, we provide a basic overview of de novo peptide identification methods and applications, briefly covering de novo algorithms and tools, and focusing in more depth on recent applications from venomics, metaproteomics, forensics, and characterization of antibody drugs.
Affiliation(s)
- Isabelle O'Bryon, Sarah C. Jenson, Eric D. Merkley
- Chemical and Biological Signatures, Pacific Northwest National Laboratory, Richland, Washington, USA
5
Hedl TJ, San Gil R, Cheng F, Rayner SL, Davidson JM, De Luca A, Villalva MD, Ecroyd H, Walker AK, Lee A. Proteomics Approaches for Biomarker and Drug Target Discovery in ALS and FTD. Front Neurosci 2019; 13:548. [PMID: 31244593 PMCID: PMC6579929 DOI: 10.3389/fnins.2019.00548] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 03/14/2019] [Accepted: 05/13/2019] [Indexed: 12/11/2022] Open
Abstract
Neurodegenerative disorders such as amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are increasing in prevalence but lack targeted therapeutics. Although the pathological mechanisms behind these diseases remain unclear, both ALS and FTD are characterized pathologically by aberrant protein aggregation and inclusion formation within neurons, which correlates with neurodegeneration. Notably, aggregation of several key proteins, including TAR DNA binding protein of 43 kDa (TDP-43), superoxide dismutase 1 (SOD1), and tau, has been implicated in these diseases. Proteomics methods are being increasingly applied to better understand disease-related mechanisms and to identify biomarkers of disease, using model systems as well as human samples. Proteomics-based approaches offer unbiased, high-throughput, and quantitative results with numerous applications for investigating proteins of interest. Here, we review recent advances in the understanding of ALS and FTD pathophysiology obtained using proteomics approaches, and we assess technical and experimental limitations. We compare findings from various mass spectrometry (MS) approaches including quantitative proteomics methods such as stable isotope labeling by amino acids in cell culture (SILAC) and tandem mass tagging (TMT) to approaches such as label-free quantitation (LFQ) and sequential windowed acquisition of all theoretical fragment ion mass spectra (SWATH-MS) in studies of ALS and FTD. Similarly, we describe disease-related protein-protein interaction (PPI) studies using approaches including immunoprecipitation mass spectrometry (IP-MS) and proximity-dependent biotin identification (BioID) and discuss future application of new techniques including proximity-dependent ascorbic acid peroxidase labeling (APEX), and biotinylation by antibody recognition (BAR).
Furthermore, we explore the use of MS to detect post-translational modifications (PTMs), such as ubiquitination and phosphorylation, of disease-relevant proteins in ALS and FTD. We also discuss upstream technologies that enable enrichment of proteins of interest, highlighting the contributions of new techniques to isolate disease-relevant protein inclusions including flow cytometric analysis of inclusions and trafficking (FloIT). These recently developed approaches, as well as related advances yet to be applied to studies of these neurodegenerative diseases, offer numerous opportunities for discovery of potential therapeutic targets and biomarkers for ALS and FTD.
Affiliation(s)
- Thomas J Hedl
- Neurodegeneration Pathobiology Laboratory, Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia
- Rebecca San Gil
- Neurodegeneration Pathobiology Laboratory, Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia
- Flora Cheng
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Stephanie L Rayner
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Jennilee M Davidson
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Alana De Luca
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Maria D Villalva
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Heath Ecroyd
- School of Chemistry and Molecular Bioscience, University of Wollongong, Wollongong, NSW, Australia; Illawarra Health and Medical Research Institute, Wollongong, NSW, Australia
- Adam K Walker
- Neurodegeneration Pathobiology Laboratory, Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia; Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia
- Albert Lee
- Centre for Motor Neuron Disease Research, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, North Ryde, NSW, Australia