Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Fischer B, Roth V, Roos F, Grossmann J, Baginsky S, Widmayer P, Gruissem W, Buhmann JM. NovoHMM: a hidden Markov model for de novo peptide sequencing. Anal Chem 2005;77:7265-73. [PMID: 16285674 DOI: 10.1021/ac0508853] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.4] [Indexed: 11/29/2022]



Cited by Other Article(s)

Klaproth-Andrade D, Hingerl J, Bruns Y, Smith NH, Träuble J, Wilhelm M, Gagneur J. Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing. Nat Commun 2024;15:151. [PMID: 38167372 PMCID: PMC10762064 DOI: 10.1038/s41467-023-44323-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 12/08/2023] [Indexed: 01/05/2024] Open

Tariq MU, Ebert S, Saeed F. Making MS Omics Data ML-Ready: SpeCollate Protocols. Methods Mol Biol 2024;2836:135-155. [PMID: 38995540 DOI: 10.1007/978-1-0716-4007-4_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]

Ng CCA, Zhou Y, Yao ZP. Algorithms for de-novo sequencing of peptides by tandem mass spectrometry: A review. Anal Chim Acta 2023;1268:341330. [PMID: 37268337 DOI: 10.1016/j.aca.2023.341330] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/22/2022] [Revised: 05/04/2023] [Accepted: 05/06/2023] [Indexed: 06/04/2023]

Affiliation(s)

Cheuk Chi A Ng State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
Yin Zhou State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
Zhong-Ping Yao State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China.


Li L, Wu J, Lyon CJ, Jiang L, Hu TY. Clinical Peptidomics: Advances in Instrumentation, Analyses, and Applications. BME FRONTIERS 2023;4:0019. [PMID: 37849662 PMCID: PMC10521655 DOI: 10.34133/bmef.0019] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/11/2023] [Accepted: 04/19/2023] [Indexed: 10/19/2023] Open

Multienzyme deep learning models improve peptide de novo sequencing by mass spectrometry proteomics. PLoS Comput Biol 2023;19:e1010457. [PMID: 36668672 PMCID: PMC9891523 DOI: 10.1371/journal.pcbi.1010457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Revised: 02/01/2023] [Accepted: 01/04/2023] [Indexed: 01/21/2023] Open

Zhang Z, Li Y, Yuan W, Wang Z, Wan C. Proteomic-driven identification of short open reading frame-encoded peptides. Proteomics 2022;22:e2100312. [PMID: 35384297 DOI: 10.1002/pmic.202100312] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/25/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 11/10/2022]

Simopoulos CMA, Figeys D, Lavallée-Adam M. Novel Bioinformatics Strategies Driving Dynamic Metaproteomic Studies. Methods Mol Biol 2022;2456:319-338. [PMID: 35612752 DOI: 10.1007/978-1-0716-2124-0_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

Tariq MU, Saeed F. SpeCollate: Deep cross-modal similarity network for mass spectrometry data based peptide deductions. PLoS One 2021;16:e0259349. [PMID: 34714871 PMCID: PMC8555789 DOI: 10.1371/journal.pone.0259349] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/10/2021] [Accepted: 10/18/2021] [Indexed: 11/19/2022] Open

Abstract

Historically, the database search algorithms have been the de facto standard for inferring peptides from mass spectrometry (MS) data. Database search algorithms deduce peptides by transforming theoretical peptides into theoretical spectra and matching them to the experimental spectra. Heuristic similarity-scoring functions are used to match an experimental spectrum to a theoretical spectrum. However, the heuristic nature of the scoring functions and the simple transformation of the peptides into theoretical spectra, along with noisy mass spectra for the less abundant peptides, can introduce a cascade of inaccuracies. In this paper, we design and implement a Deep Cross-Modal Similarity Network called SpeCollate, which overcomes these inaccuracies by learning the similarity function between experimental spectra and peptides directly from the labeled MS data. SpeCollate transforms spectra and peptides into a shared Euclidean subspace by learning fixed size embeddings for both. Our proposed deep-learning network trains on sextuplets of positive and negative examples coupled with our custom-designed SNAP-loss function. Online hardest negative mining is used to select the appropriate negative examples for optimal training performance. We use 4.8 million sextuplets obtained from the NIST and MassIVE peptide libraries to train the network and demonstrate that for closed search, SpeCollate is able to perform better than Crux and MSFragger in terms of the number of peptide-spectrum matches (PSMs) and unique peptides identified under 1% FDR for real-world data. SpeCollate also identifies a large number of peptides not reported by either Crux or MSFragger. To the best of our knowledge, our proposed SpeCollate is the first deep-learning network that can determine the cross-modal similarity between peptides and mass-spectra for MS-based proteomics. We believe SpeCollate is significant progress towards developing machine-learning solutions for MS-based omics data analysis. 
SpeCollate is available at https://deepspecs.github.io/.
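
The database-search step this abstract describes, turning a candidate peptide into a theoretical spectrum and scoring it against an experimental one, can be sketched as follows. This is a minimal illustration only, not SpeCollate, Crux, or MSFragger: the function names, the shared-peak-count score, and the 0.02 Da tolerance are assumptions made for the example, and only singly charged b and y ions are generated.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def theoretical_spectrum(peptide):
    """Sorted m/z values of singly charged b and y ions (hypothetical helper)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    ions = []
    prefix = 0.0
    for m in masses[:-1]:
        prefix += m
        ions.append(prefix + PROTON)                   # b ion (N-terminal fragment)
        ions.append(total - prefix + WATER + PROTON)   # complementary y ion
    return sorted(ions)


def shared_peak_count(experimental, theoretical, tol=0.02):
    """Number of theoretical peaks with an experimental peak within tol Da."""
    return sum(any(abs(e - t) <= tol for e in experimental) for t in theoretical)


def best_match(experimental, candidates, tol=0.02):
    """Candidate peptide whose theoretical spectrum shares the most peaks."""
    return max(
        candidates,
        key=lambda p: shared_peak_count(experimental, theoretical_spectrum(p), tol),
    )
```

A heuristic score of this kind is what the abstract contrasts with a learned similarity function; the sketch makes no attempt to model peak intensities, charge states, or noise.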


Progress and challenges in mass spectrometry-based analysis of antibody repertoires. Trends Biotechnol 2021;40:463-481. [PMID: 34535228 DOI: 10.1016/j.tibtech.2021.08.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/23/2021] [Revised: 08/16/2021] [Accepted: 08/17/2021] [Indexed: 12/22/2022]

Zhang W, Liang Z, Chen X, Xin L, Shan B, Luo Z, Li M. ChimST: An Efficient Spectral Library Search Tool for Peptide Identification from Chimeric Spectra in Data-Dependent Acquisition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:1416-1425. [PMID: 31603795 DOI: 10.1109/tcbb.2019.2945954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Proteomics in Forensic Analysis: Applications for Human Samples. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11083393] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/12/2022]

Tariq MU, Haseeb M, Aledhari M, Razzak R, Parizi RM, Saeed F. Methods for Proteogenomics Data Analysis, Challenges, and Scalability Bottlenecks: A Survey. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020;9:5497-5516. [PMID: 33537181 PMCID: PMC7853650 DOI: 10.1109/access.2020.3047588] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 05/17/2023]

Vitorino R, Guedes S, Trindade F, Correia I, Moura G, Carvalho P, Santos MAS, Amado F. De novo sequencing of proteins by mass spectrometry. Expert Rev Proteomics 2020;17:595-607. [PMID: 33016158 DOI: 10.1080/14789450.2020.1831387] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/24/2022]

DeLaney K, Cao W, Ma Y, Ma M, Zhang Y, Li L. PRESnovo: Prescreening Prior to de novo Sequencing to Improve Accuracy and Sensitivity of Neuropeptide Identification. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2020;31:1358-1371. [PMID: 32266812 PMCID: PMC7332408 DOI: 10.1021/jasms.0c00013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/15/2023]

Abstract

Identification of peptides in species lacking fully sequenced genomes is challenging due to the lack of prior knowledge. De novo sequencing is the method of choice, but its performance is less than satisfactory due to algorithmic bias and interference in complex MS/MS spectra. The task becomes even more challenging for endogenous peptides that do not involve an enzymatic digestion step, such as neuropeptides. However, many neuropeptides possess common sequence motifs that are conserved across members of the same family. Taking advantage of this feature to improve de novo sequencing of neuropeptides, we have developed a method named PRESnovo (prescreening precursors prior to de novo sequencing) to predict the motif from a MS/MS spectrum. A neuropeptide sequence is broken into a motif with conserved amino acid residues and the remaining partial sequence. By searching against a predefined motif database constructed from known homologous sequences, PRESnovo assigns the most probable motif to each precursor via a sophisticated scoring function. Performance analysis was conducted with 15 neuropeptide standards, and 11 neuropeptides were correctly identified with PRESnovo compared to 1 identification by PEAKS only. We applied PRESnovo to assign motifs to peptide sequences in conjunction with PEAKS for assigning the rest of the peptide sequence in order to discover neuropeptides in tissue samples of green crab, C. maenas, and Jonah crab, C. borealis. Collectively, a large number of neuropeptides were identified, including 13 putative neuropeptides identified in green crab brain, 77 in Jonah crab brain, and 47 in Jonah crab sinus glands for the first time. This PRESnovo strategy greatly simplifies de novo sequencing and enhances the accuracy and sensitivity of neuropeptide identification when common motifs are present.


Bioinformatics Methods for Mass Spectrometry-Based Proteomics Data Analysis. Int J Mol Sci 2020;21:ijms21082873. [PMID: 32326049 PMCID: PMC7216093 DOI: 10.3390/ijms21082873] [Citation(s) in RCA: 119] [Impact Index Per Article: 29.8] [Received: 03/18/2020] [Revised: 04/16/2020] [Accepted: 04/18/2020] [Indexed: 01/15/2023] Open

Mao Y, Daly TJ, Li N. Lys-Sequencer: An algorithm for de novo sequencing of peptides by paired single residue transposed Lys-C and Lys-N digestion coupled with high-resolution mass spectrometry. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2020;34:e8574. [PMID: 31499586 DOI: 10.1002/rcm.8574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2019] [Revised: 08/27/2019] [Accepted: 09/02/2019] [Indexed: 06/10/2023]

Abstract

RATIONALE

Database-dependent identification of proteins by mass spectrometry is well established, but has limitations when there are novel proteins, mutations, splice variants, and post-translational modifications (PTMs) not available in the established reference database. De novo sequencing as a database-independent approach could address these limitations by deducing peptide sequences directly from experimental tandem mass spectrometry spectra, while concomitantly yielding residue-by-residue confidence metrics.

METHODS

Equal amounts of bovine serum albumin (BSA) sample aliquots were digested separately with Lys-C and Lys-N complementary peptidases, separated by reversed-phase ultra-high-performance liquid chromatography (UPLC), and analyzed by collision-induced dissociation (CID)-based mass spectrometry on an Orbitrap mass spectrometer. In the Lys-Sequencer algorithm, matched tandem mass spectra with equal precursor ion mass from complementary digestions were paired, and fragment ion types were identified based on the unique mass relationship between fragment ions extracted from a spectrum pair followed by de novo sequencing of peptides with identification confidence assigned at the residue level.

RESULTS

In all the matched spectrum pairs, 34 top-ranked BSA peptides were identified, from which 391 amino acid residues were identified correctly, covering ~67% of the full sequence of BSA (583 residues) with only ~6% (35 residues) exhibiting ambiguity in the sequence order (although amino acid compositions were still correctly assigned). Of note, this approach identified peptide sequences up to 17 amino acids in length without ambiguity, with the exception of the N-terminal or C-terminal peptides containing lysine (18-mer).

CONCLUSIONS

The algorithm ("Lys-Sequencer") developed in this work achieves high precision for de novo sequencing of peptides. This method facilitates the identification of point mutation and new PTMs in the protein characterization and discovery of new peptides and proteins with varying levels of confidence.
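
The mass relationship that the METHODS paragraph relies on can be illustrated with a small sketch. It assumes an internal protein segment whose Lys-C peptide ends in lysine and whose Lys-N peptide starts with lysine, so that each Lys-N b ion sits one lysine residue mass above a Lys-C b ion and each Lys-C y ion sits one lysine residue mass above a Lys-N y ion. The helper names, the reduced residue table, and the 0.01 Da tolerance are hypothetical; this is not the published Lys-Sequencer implementation.

```python
# Monoisotopic residue masses (Da); a reduced table that covers the example only.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
PROTON = 1.007276
WATER = 18.010565
LYS = RESIDUE_MASS["K"]


def fragment_mz(peptide):
    """Singly charged b and y ion m/z values of a peptide (hypothetical helper)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    peaks, prefix = [], 0.0
    for m in masses[:-1]:
        prefix += m
        peaks.append(prefix + PROTON)                   # b ion
        peaks.append(total - prefix + WATER + PROTON)   # y ion
    return peaks


def classify_fragments(lysc_peaks, lysn_peaks, tol=0.01):
    """Label Lys-C peaks 'b' or 'y' from the position of their Lys-N partner."""
    def has_peak(peaks, target):
        return any(abs(p - target) <= tol for p in peaks)

    labels = {}
    for mz in lysc_peaks:
        partner_above = has_peak(lysn_peaks, mz + LYS)  # b ions gain a lysine in Lys-N
        partner_below = has_peak(lysn_peaks, mz - LYS)  # y ions lose a lysine in Lys-N
        if partner_above and not partner_below:
            labels[mz] = "b"
        elif partner_below and not partner_above:
            labels[mz] = "y"
    return labels


# Example pair: the same segment digested with Lys-C ("SAVLEK") and Lys-N ("KSAVLE").
labels = classify_fragments(fragment_mz("SAVLEK"), fragment_mz("KSAVLE"))
```

Peaks without an unambiguous partner are left unlabeled, which is one simple way to reflect the residue-level confidence idea mentioned in the abstract.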


Valli M, Russo HM, Pilon AC, Pinto MEF, Dias NB, Freire RT, Castro-Gamboa I, Bolzani VDS. Computational methods for NMR and MS for structure elucidation I: software for basic NMR. PHYSICAL SCIENCES REVIEWS 2019. [DOI: 10.1515/psr-2018-0108] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/30/2023]

Issa Isaac N, Philippe D, Nicholas A, Raoult D, Eric C. Metaproteomics of the human gut microbiota: Challenges and contributions to other OMICS. CLINICAL MASS SPECTROMETRY 2019;14 Pt A:18-30. [DOI: 10.1016/j.clinms.2019.06.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/05/2019] [Revised: 06/02/2019] [Accepted: 06/03/2019] [Indexed: 12/22/2022]

Schweiger R, Erlich Y, Carmi S. FactorialHMM: fast and exact inference in factorial hidden Markov models. Bioinformatics 2019;35:2162-2164. [PMID: 30445428 DOI: 10.1093/bioinformatics/bty944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2018] [Revised: 11/07/2018] [Accepted: 11/13/2018] [Indexed: 11/13/2022] Open

Muth T, Renard BY. Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification? Brief Bioinform 2019;19:954-970. [PMID: 28369237 DOI: 10.1093/bib/bbx033] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 12/08/2016] [Indexed: 01/24/2023] Open

Abstract

While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments nowadays offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database to deduce full-length or partial tag-based peptide sequences directly from experimental tandem mass spectrometry spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of proposed solutions has been rarely evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation on the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired from different instrument types from collision-induced dissociation and higher energy collisional dissociation (HCD) fragmentation mode using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated from peak intensity prediction software. We found that Novor shows the overall best performance compared with PEAKS and PepNovo with respect to the accuracy of correct full peptide, tag-based and single-residue predictions. In addition, the same tool outpaced the commercial competitor PEAKS in terms of running time speedup by factors of around 12-17. Despite around 35% prediction accuracy for complete peptide sequences on HCD data sets, taken as a whole, the evaluated algorithms perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on the accuracy. 
Finally, we discuss the potential of de novo sequencing for now becoming more widely used in the field.


Yang H, Li YC, Zhao MZ, Wu FL, Wang X, Xiao WD, Wang YH, Zhang JL, Wang FQ, Xu F, Zeng WF, Overall CM, He SM, Chi H, Xu P. Precision De Novo Peptide Sequencing Using Mirror Proteases of Ac-LysargiNase and Trypsin for Large-scale Proteomics. Mol Cell Proteomics 2019;18:773-785. [PMID: 30622160 PMCID: PMC6442358 DOI: 10.1074/mcp.tir118.000918] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/16/2018] [Revised: 11/20/2018] [Indexed: 11/06/2022] Open

Affiliation(s)

Hao Yang Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
Yan-Chang Li State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Ming-Zhi Zhao State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Fei-Lin Wu State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Xi Wang Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
Wei-Di Xiao State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Yi-Hao Wang State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Jun-Ling Zhang State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Fu-Qiang Wang State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Feng Xu State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China
Wen-Feng Zeng Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
Christopher M Overall Centre for Blood Research, University of British Columbia, Vancouver, British Columbia, Canada
Si-Min He Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
Hao Chi Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS; University of Chinese Academy of Sciences; Institute of Computing Technology, CAS, Beijing 100190, China
Ping Xu State Key Laboratory of Proteomics; Beijing Proteome Research Center; National Center for Protein Sciences Beijing; Beijing Institute of Lifeomics, Beijing 102206, China; Key Laboratory of Combinatorial Biosynthesis and Drug Discovery of Ministry of Education Wuhan University, Wuhan University School of Pharmaceutical Sciences, Wuhan 430071, China; College of Life Sciences, Hebei University, Baoding 071002, China


Fomin E. A Simple Approach to the Reconstruction of a Set of Points from the Multiset of Pairwise Distances in n² Steps for the Sequencing Problem: III. Noise Inputs for the Beltway Case. J Comput Biol 2019;26:68-75. [DOI: 10.1089/cmb.2018.0078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/18/2023] Open

Muth T, Hartkopf F, Vaudel M, Renard BY. A Potential Golden Age to Come-Current Tools, Recent Use Cases, and Future Avenues for De Novo Sequencing in Proteomics. Proteomics 2018;18:e1700150. [PMID: 29968278 DOI: 10.1002/pmic.201700150] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/23/2018] [Revised: 05/23/2018] [Indexed: 01/15/2023]

Tran NH, Zhang X, Li M. Deep Omics. Proteomics 2017;18. [DOI: 10.1002/pmic.201700319] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/05/2017] [Revised: 11/21/2017] [Indexed: 01/03/2023]

Vinogradov AA, Gates ZP, Zhang C, Quartararo AJ, Halloran KH, Pentelute BL. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries. ACS COMBINATORIAL SCIENCE 2017;19:694-701. [PMID: 28892357 PMCID: PMC5818986 DOI: 10.1021/acscombsci.7b00109] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/29/2022]

Zhang Z, Boonen K, Li M, Geuten K. mRNA Interactome Capture from Plant Protoplasts. J Vis Exp 2017. [PMID: 28784956 DOI: 10.3791/56011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/09/2023] Open

De novo peptide sequencing by deep learning. Proc Natl Acad Sci U S A 2017;114:8247-8252. [PMID: 28720701 DOI: 10.1073/pnas.1705691114] [Citation(s) in RCA: 202] [Impact Index Per Article: 28.9] [Indexed: 12/22/2022] Open

Burke MC, Mirokhin YA, Tchekhovskoi DV, Markey SP, Heidbrink Thompson J, Larkin C, Stein SE. The Hybrid Search: A Mass Spectral Library Search Method for Discovery of Modifications in Proteomics. J Proteome Res 2017;16:1924-1935. [DOI: 10.1021/acs.jproteome.6b00988] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 12/31/2022]

Guan X, Brownstein NC, Young NL, Marshall AG. Ultrahigh-resolution Fourier transform ion cyclotron resonance mass spectrometry and tandem mass spectrometry for peptide de novo amino acid sequencing for a seven-protein mixture by paired single-residue transposed Lys-N and Lys-C digestion. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2017;31:207-217. [PMID: 27813191 DOI: 10.1002/rcm.7783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/29/2016] [Revised: 10/29/2016] [Accepted: 10/30/2016] [Indexed: 06/06/2023]

Abstract

RATIONALE

Bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics to identify proteins from a sequence database. De novo sequencing is also available for sequencing peptides with relatively short sequence lengths. We recently showed that paired Lys-C and Lys-N proteases produce peptides of identical mass and similar retention time, but different tandem mass spectra. Such parallel experiments provide complementary information, and allow for up to 100% MS/MS sequence coverage.

METHODS

Here, we report digestion by paired Lys-C and Lys-N proteases of a seven-protein mixture: human hemoglobin alpha, bovine carbonic anhydrase 2, horse skeletal muscle myoglobin, hen egg white lysozyme, bovine pancreatic ribonuclease, bovine rhodanese, and bovine serum albumin, followed by reversed-phase nanoflow liquid chromatography, collision-induced dissociation, and 14.5 T Fourier transform ion cyclotron resonance mass spectrometry.

RESULTS

Matched pairs of product peptide ions of equal precursor mass and similar retention times from each digestion are compared, leveraging single-residue transposed information with independent interferences to confidently identify fragment ion types, residues, and peptides. Selected pairs of product ion mass spectra for de novo sequenced protein segments from each member of the mixture are presented.

CONCLUSIONS

Pairs of the transposed product ions as well as complementary information from the parallel experiments allow for both high MS/MS coverage for long peptide sequences and high confidence in the amino acid identification. Moreover, the parallel experiments in the de novo sequencing reduce false-positive matches of product ions from the single-residue transposed peptides from the same segment, and thereby further improve the confidence in protein identification.


Yang H, Chi H, Zhou WJ, Zeng WF, He K, Liu C, Sun RX, He SM. Open-pNovo: De Novo Peptide Sequencing with Thousands of Protein Modifications. J Proteome Res 2017;16:645-654. [PMID: 28019094 DOI: 10.1021/acs.jproteome.6b00716] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 01/18/2023]

Islam MT, Mohamedali A, Fernandes CS, Baker MS, Ranganathan S. De Novo Peptide Sequencing: Deep Mining of High-Resolution Mass Spectrometry Data. Methods Mol Biol 2017;1549:119-134. [PMID: 27975288 DOI: 10.1007/978-1-4939-6740-7_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]

Pathan M, Samuel M, Keerthikumar S, Mathivanan S. Unassigned MS/MS Spectra: Who Am I? Methods Mol Biol 2017;1549:67-74. [PMID: 27975284 DOI: 10.1007/978-1-4939-6740-7_6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]

Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016. [PMID: 27975219 DOI: 10.1007/978-3-319-41448-5_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

Fomin E. A Simple Approach to the Reconstruction of a Set of Points from the Multiset of n² Pairwise Distances in n² Steps for the Sequencing Problem: II. Algorithm. J Comput Biol 2016;23:934-942. [DOI: 10.1089/cmb.2016.0046] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open

Ma B. De novo Peptide Sequencing. PROTEOME INFORMATICS 2016:15-38. [DOI: 10.1039/9781782626732-00015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]

Gorshkov V, Hotta SYK, Verano-Braga T, Kjeldsen F. Peptide de novo sequencing of mixture tandem mass spectra. Proteomics 2016;16:2470-9. [PMID: 27329701 PMCID: PMC5297990 DOI: 10.1002/pmic.201500549] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2015] [Revised: 04/27/2016] [Accepted: 06/17/2016] [Indexed: 02/02/2023]

Engler MS, Scheubert K, Schubert US, Böcker S. New Statistical Models for Copolymerization. Polymers (Basel) 2016;8:E240. [PMID: 30979335 PMCID: PMC6432000 DOI: 10.3390/polym8060240] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2016] [Revised: 06/07/2016] [Accepted: 06/15/2016] [Indexed: 11/16/2022] Open

Vinciguerra R, De Chiaro A, Pucci P, Marino G, Birolo L. Proteomic strategies for cultural heritage: From bones to paintings. Microchem J 2016. [DOI: 10.1016/j.microc.2015.12.024] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]

Robotham SA, Horton AP, Cannon JR, Cotham VC, Marcotte EM, Brodbelt JS. UVnovo: A de Novo Sequencing Algorithm Using Single Series of Fragment Ions via Chromophore Tagging and 351 nm Ultraviolet Photodissociation Mass Spectrometry. Anal Chem 2016;88:3990-7. [PMID: 26938041 PMCID: PMC4850734 DOI: 10.1021/acs.analchem.6b00261] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]

Liu Y, Sun W, John J, Lajoie G, Ma B, Zhang K. De Novo Sequencing Assisted Approach for Characterizing Mixture MS/MS Spectra. IEEE Trans Nanobioscience 2016;15:166-76. [PMID: 26800542 DOI: 10.1109/tnb.2016.2519841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Brodbelt JS. Ion Activation Methods for Peptides and Proteins. Anal Chem 2016;88:30-51. [PMID: 26630359 PMCID: PMC5553572 DOI: 10.1021/acs.analchem.5b04563] [Citation(s) in RCA: 135] [Impact Index Per Article: 16.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Ma B. Peptide De Novo Sequencing with MS/MS. Encyclopedia of Algorithms 2016:1545-1547. [DOI: 10.1007/978-1-4939-2864-4_286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]

Ma B. Novor: real-time peptide de novo sequencing software. J Am Soc Mass Spectrom 2015;26:1885-94. [PMID: 26122521 PMCID: PMC4604512 DOI: 10.1007/s13361-015-1204-0] [Citation(s) in RCA: 123] [Impact Index Per Article: 13.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/12/2015] [Revised: 05/12/2015] [Accepted: 05/17/2015] [Indexed: 05/09/2023]

Song Y, Chi AY. Peptide sequencing via graph path decomposition. Inf Sci (N Y) 2015. [DOI: 10.1016/j.ins.2015.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]

Ma B. Peptide De Novo Sequencing with MS/MS. Encyclopedia of Algorithms 2015:1-4. [DOI: 10.1007/978-3-642-27848-8_286-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/14/2015] [Accepted: 01/14/2015] [Indexed: 09/01/2023]

Wang X, Li Y, Wu Z, Wang H, Tan H, Peng J. JUMP: a tag-based database search tool for peptide identification with high sensitivity and accuracy. Mol Cell Proteomics 2014;13:3663-73. [PMID: 25202125 DOI: 10.1074/mcp.o114.039586] [Citation(s) in RCA: 102] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Leprevost FV, Valente RH, Lima DB, Perales J, Melani R, Yates JR, Barbosa VC, Junqueira M, Carvalho PC. PepExplorer: a similarity-driven tool for analyzing de novo sequencing results. Mol Cell Proteomics 2014;13:2480-9. [PMID: 24878498 DOI: 10.1074/mcp.m113.037002] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open

Abstract

Peptide spectrum matching is the current gold standard for protein identification via mass-spectrometry-based proteomics. Peptide spectrum matching compares experimental mass spectra against theoretical spectra generated from a protein sequence database to perform identification, but protein sequences not present in a database cannot be identified unless their sequences are in part conserved. The alternative approach, de novo sequencing, can make it possible to infer a peptide sequence directly from a mass spectrum, but interpreting long lists of peptide sequences resulting from large-scale experiments is not trivial. With this as motivation, PepExplorer was developed to use rigorous pattern recognition to assemble a list of homologue proteins using de novo sequencing data coupled to sequence alignment to allow biological interpretation of the data. PepExplorer can read the output of various widely adopted de novo sequencing tools and converge to a list of proteins with a global false-discovery rate. To this end, it employs a radial basis function neural network that considers precursor charge states, de novo sequencing scores, peptide lengths, and alignment scores to select similar protein candidates, from a target-decoy database, usually obtained from phylogenetically related species. Alignments are performed using a modified Smith-Waterman algorithm tailored for the task at hand. We verified the effectiveness of our approach using a reference set of identifications generated by ProLuCID when searching for Pyrococcus furiosus mass spectra on the corresponding NCBI RefSeq database. We then modified the sequence database by swapping amino acids until ProLuCID was no longer capable of identifying any proteins. By searching the mass spectra using PepExplorer on the modified database, we were able to recover most of the identifications at a 1% false-discovery rate. Finally, we employed PepExplorer to disclose a comprehensive proteomic assessment of the Bothrops jararaca plasma, a known biological source of natural inhibitors of snake toxins. PepExplorer is integrated into the PatternLab for Proteomics environment, which makes available various tools for downstream data analysis, including resources for quantitative and differential proteomics.

Song Y. A new parameterized algorithm for rapid peptide sequencing. PLoS One 2014;9:e87476. [PMID: 24551059 PMCID: PMC3925086 DOI: 10.1371/journal.pone.0087476] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2013] [Accepted: 12/22/2013] [Indexed: 11/19/2022] Open

Romero-Rodríguez MC, Pascual J, Valledor L, Jorrín-Novo J. Improving the quality of protein identification in non-model species. Characterization of Quercus ilex seed and Pinus radiata needle proteomes by using SEQUEST and custom databases. J Proteomics 2014;105:85-91. [PMID: 24508333 DOI: 10.1016/j.jprot.2014.01.027] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2013] [Accepted: 01/27/2014] [Indexed: 01/10/2023]

Abstract

Currently, the most widely used pipeline for protein identification compares MS/MS spectra against reference databases. Search algorithms compare the acquired spectra to an in silico digestion of a sequence database to find exact matches. In this context, the database is of paramount importance and largely determines the number and quality of identifications, which is especially relevant for non-model plant species. Using a single Viridiplantae database (NCBI, UniProt) and TAIR is not the best choice for non-model species, since they are underrepresented in databases, resulting in poor identification rates. We demonstrate how it is possible to improve the rate and quality of identifications in two orphan species, Quercus ilex and Pinus radiata, by using SEQUEST and a combination of public databases (Viridiplantae NCBI, UniProt) and a custom-built specific database containing 593,294 and 455,096 peptide sequences (Quercus and Pinus, respectively). These databases were built by gathering and processing (trimming, contig assembly, 6-frame translation) publicly available RNA sequences, mostly ESTs and NGS reads. A total of 149 and 1533 proteins were identified from Quercus seeds and Pinus needles, representing a 3.1- or 1.5-fold increase in the number of protein identifications and scores compared to the use of a single database. Since this approach greatly improves the identification rate and is not significantly more complicated or time consuming than other approaches, we recommend its routine use when working with non-model species.

BIOLOGICAL SIGNIFICANCE

In this work we demonstrate how the construction of a custom database (DB) gathering all available RNA sequences, and its use in combination with Viridiplantae public DBs (NCBI, UniProt), significantly improve protein identification when working with non-model species. Protein identification rate and quality are higher than those obtained in routine procedures based on using only one database (commonly Viridiplantae from NCBI), as we demonstrated by analyzing Quercus seeds and Pinus needles. The proposed approach, based on building a custom database, is not difficult or time consuming, so we recommend its routine use when working with non-model species. This article is part of a Special Issue entitled: Proteomics of non-model organisms.

Terterov I, Vyatkina K, Kononikhin AS, Boitsov V, Vyazmin S, Popov IA, Nikolaev EN, Pevzner P, Dubina M. Application of de novo sequencing tools to study abiogenic peptide formations by tandem mass spectrometry. The case of homo-peptides from glutamic acid complicated by substitutions of hydrogen by sodium or potassium atoms. Rapid Commun Mass Spectrom 2014;28:33-41. [PMID: 24285388 DOI: 10.1002/rcm.6757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/06/2013] [Revised: 09/24/2013] [Accepted: 10/04/2013] [Indexed: 06/02/2023]