1. Wang X, Strobel M, Aron AT, Phelan VV, Acharya DD, Brown CJ, Clevenger K, Hu J, Kretsch A, Mahood EH, Menegatti C, Xiong Q, Wang M. Network Topology Evaluation and Transitive Alignments for Molecular Networking. Journal of the American Society for Mass Spectrometry 2024; 35:2165-2175. [PMID: 39133821 DOI: 10.1021/jasms.4c00208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Untargeted tandem mass spectrometry (MS/MS) is an essential technique in modern analytical chemistry, providing a comprehensive snapshot of chemical entities in complex samples and identifying unknowns through their fragmentation patterns. This high-throughput approach generates large data sets that can be challenging to interpret. Molecular Networks (MNs) have been developed as a computational tool to aid in the organization and visualization of complex chemical space in untargeted mass spectrometry data, thereby supporting comprehensive data analysis and interpretation. MNs group related compounds with potentially similar structures from MS/MS data by calculating all pairwise MS/MS similarities and filtering these connections to produce a MN. Such networks are instrumental in metabolomics for identifying novel metabolites, elucidating metabolic pathways, and even discovering biomarkers for disease. While MS/MS similarity metrics have been explored in the literature, the influence of network topology approaches on MN construction remains unexplored. This manuscript introduces metrics for evaluating MN construction, benchmarks state-of-the-art approaches, and proposes the Transitive Alignments approach to improve MN construction. The Transitive Alignment technique leverages the MN topology to realign MS/MS spectra of related compounds that differ by multiple structural modifications. Combining this Transitive Alignments approach with pseudoclique finding, a method for identifying highly connected groups of nodes in a network, resulted in more complete and higher-quality molecular families. Finally, we also introduce a targeted network construction technique called induced transitive alignments where we demonstrate effectiveness on a real world natural product discovery application. We release this transitive alignment technique as a high-throughput workflow that can be used by the wider research community.
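The construction step described in the abstract above (score all spectrum pairs, then filter the connections) can be sketched as follows. This is a minimal illustration and not the authors' workflow: the function name `build_network`, the 0.7 score threshold, and the mutual top-k rule are assumptions chosen for the example, and the similarity scores are made-up inputs.

```python
def build_network(similarity, n_nodes, min_score=0.7, top_k=2):
    """Filter pairwise similarity scores into a list of network edges.

    similarity: dict mapping (i, j) with i < j to a score in [0, 1].
    min_score and top_k are illustrative assumptions, not values from the paper.
    """
    # Step 1: keep only candidate edges at or above the score threshold.
    candidates = {pair: s for pair, s in similarity.items() if s >= min_score}

    # Step 2: for every node, rank its candidate neighbours and keep the best top_k.
    top = {}
    for n in range(n_nodes):
        neighbours = sorted(
            ((s, j if i == n else i) for (i, j), s in candidates.items() if n in (i, j)),
            reverse=True,
        )
        top[n] = {m for _, m in neighbours[:top_k]}

    # Step 3: keep an edge only if each endpoint ranks the other in its top_k.
    return sorted((i, j) for (i, j) in candidates if j in top[i] and i in top[j])


# Hypothetical scores for four spectra.
scores = {(0, 1): 0.9, (0, 2): 0.8, (0, 3): 0.75, (1, 2): 0.6, (2, 3): 0.95}
print(build_network(scores, n_nodes=4))  # [(0, 1), (0, 2), (2, 3)]
```

In this toy input, the pair (1, 2) falls below the threshold, and (0, 3) is dropped because node 0 has two better-scoring neighbours. Filtering choices of this kind are what the abstract refers to as network topology approaches.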
Affiliation(s)
- Xianghu Wang
  - Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
- Michael Strobel
  - Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
- Allegra T Aron
  - Department of Chemistry and Biochemistry, University of Denver, 2101 East Wesley Ave, Denver, Colorado 80210, United States
- Vanessa V Phelan
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Anschutz Medical Campus, 12850 E Montview Blvd, Aurora, Colorado 80045, United States
- Deepa D Acharya
  - Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Christopher J Brown
  - Regulatory Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Ken Clevenger
  - Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Jie Hu
  - Data Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Ashley Kretsch
  - Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Elizabeth H Mahood
  - Data Science, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Carla Menegatti
  - Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Quanbo Xiong
  - Biologicals Research and Development, Corteva Agriscience, 9330 Zionsville Rd, Indianapolis, Indiana 46268, United States
- Mingxun Wang
  - Department of Computer Science and Engineering, University of California Riverside, 900 University Ave., Riverside, California 92521, United States
2. Mongia M, Yasaka TM, Liu Y, Guler M, Lu L, Bhagwat A, Behsaz B, Wang M, Dorrestein PC, Mohimani H. Fast mass spectrometry search and clustering of untargeted metabolomics data. Nat Biotechnol 2024:10.1038/s41587-023-01985-4. [PMID: 38168990 DOI: 10.1038/s41587-023-01985-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 09/12/2023] [Indexed: 01/05/2024]
Abstract
The throughput of mass spectrometers and the amount of publicly available metabolomics data are growing rapidly, but analysis tools such as molecular networking and Mass Spectrometry Search Tool do not scale to searching and clustering billions of mass spectral data in metabolomics repositories. To address this limitation, we designed MASST+ and Networking+, which can process datasets that are up to three orders of magnitude larger than those processed by state-of-the-art tools.
Affiliation(s)
- Mihir Mongia
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Tyler M Yasaka
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Yudong Liu
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Mustafa Guler
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Liang Lu
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Aditya Bhagwat
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Bahar Behsaz
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Chemia Biosciences Inc., Pittsburgh, PA, USA
- Mingxun Wang
  - Computer Science and Engineering, University of California Riverside, Riverside, CA, USA
- Pieter C Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
  - Department of Pharmacology and Pediatrics, University of California San Diego, San Diego, CA, USA
- Hosein Mohimani
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
3. Na S, Paek E. Demystifying PTM Identification Using MODplus: Best Practices and Pitfalls. Methods Mol Biol 2024; 2836:37-55. [PMID: 38995534 DOI: 10.1007/978-1-0716-4007-4_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Tandem mass spectrometry (MS/MS) facilitates the rapid identification of posttranslational modifications (PTMs), which play a pivotal role in regulating numerous biological processes. This chapter explores recent advancements that expand the types of detectable PTMs and enhance the speed of the PTM searches. We also delve into computational challenges associated with searching for a multitude of PTMs simultaneously. The latter section introduces an automated procedure to identify an extensive range of PTMs using MODplus, a free PTM analysis software tool. We guide the reader through the preparation of the modification search, the determination of optional search parameters, the execution of the search, and the analysis of results, exemplified by a case study using specific MS/MS dataset.
Affiliation(s)
- Seungjin Na
  - Digital Omics Research Center, Korea Basic Science Institute, Cheongju, South Korea
- Eunok Paek
  - Department of Computer Science, Hanyang University, Seoul, South Korea
  - Department of Artificial Intelligence, Hanyang University, Seoul, South Korea
  - Institute for Artificial Intelligence Research, Hanyang University, Seoul, South Korea
4. Prunier G, Cherkaoui M, Lysiak A, Langella O, Blein-Nicolas M, Lollier V, Benoist E, Jean G, Fertin G, Rogniaux H, Tessier D. Fast alignment of mass spectra in large proteomics datasets, capturing dissimilarities arising from multiple complex modifications of peptides. BMC Bioinformatics 2023; 24:421. [PMID: 37940845 PMCID: PMC10631047 DOI: 10.1186/s12859-023-05555-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Accepted: 10/30/2023] [Indexed: 11/10/2023] Open
Abstract
BACKGROUND: In proteomics, the interpretation of mass spectra representing peptides carrying multiple complex modifications remains challenging, as it is difficult to strike a balance between reasonable execution time, a limited number of false positives, and a huge search space allowing any number of modifications without a priori. The scientific community needs new developments in this area to aid in the discovery of novel post-translational modifications that may play important roles in disease.
RESULTS: To make progress on this issue, we implemented SpecGlobX (SpecGlob eXTended to eXperimental spectra), a standalone Java application that quickly determines the best spectral alignments of a (possibly very large) list of Peptide-to-Spectrum Matches (PSMs) provided by any open modification search method, or generated by the user. As input, SpecGlobX reads a file containing spectra in MGF or mzML format and a semicolon-delimited spreadsheet describing the PSMs. SpecGlobX returns the best alignment for each PSM as output, splitting the mass difference between the spectrum and the peptide into one or more shifts while considering the possibility of non-aligned masses (a phenomenon resulting from many situations including neutral losses). SpecGlobX is fast, able to align one million PSMs in about 1.5 min on a standard desktop. Firstly, we remind the foundations of the algorithm and detail how we adapted SpecGlob (the method we previously developed following the same aim, but limited to the interpretation of perfect simulated spectra) to the interpretation of imperfect experimental spectra. Then, we highlight the interest of SpecGlobX as a complementary tool downstream to three open modification search methods on a large simulated spectra dataset. Finally, we ran SpecGlobX on a proteome-wide dataset downloaded from PRIDE to demonstrate that SpecGlobX functions just as well on simulated and experimental spectra. We then carefully analyzed a limited set of interpretations.
CONCLUSIONS: SpecGlobX is helpful as a decision support tool, providing keys to interpret peptides carrying complex modifications still poorly considered by current open modification search software. Better alignment of PSMs enhances confidence in the identification of spectra provided by open modification search methods and should improve the interpretation rate of spectra.
Affiliation(s)
- Grégoire Prunier
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - INRAE, UR1268 Biopolymères Interactions Assemblages, 44316, Nantes, France
- Mehdi Cherkaoui
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - INRAE, UR1268 Biopolymères Interactions Assemblages, 44316, Nantes, France
- Albane Lysiak
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - Nantes Université, CNRS, LS2N, UMR 6004, 44000, Nantes, France
- Olivier Langella
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, PAPPSO, 91190, Gif-Sur-Yvette, France
- Mélisande Blein-Nicolas
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, PAPPSO, 91190, Gif-Sur-Yvette, France
- Virginie Lollier
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - INRAE, UR1268 Biopolymères Interactions Assemblages, 44316, Nantes, France
- Emile Benoist
  - Nantes Université, CNRS, LS2N, UMR 6004, 44000, Nantes, France
- Géraldine Jean
  - Nantes Université, CNRS, LS2N, UMR 6004, 44000, Nantes, France
- Hélène Rogniaux
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - INRAE, UR1268 Biopolymères Interactions Assemblages, 44316, Nantes, France
- Dominique Tessier
  - INRAE, PROBE Research Infrastructure, BIBS Facility, 44300, Nantes, France
  - INRAE, UR1268 Biopolymères Interactions Assemblages, 44316, Nantes, France
5. Gopalakrishnan Meena M, Lane MJ, Tannous J, Carrell AA, Abraham PE, Giannone RJ, Ané JM, Keller NP, Labbé JL, Geiger AG, Kainer D, Jacobson DA, Rush TA. A glimpse into the fungal metabolomic abyss: Novel network analysis reveals relationships between exogenous compounds and their outputs. PNAS Nexus 2023; 2:pgad322. [PMID: 37854706 PMCID: PMC10581544 DOI: 10.1093/pnasnexus/pgad322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 09/20/2023] [Indexed: 10/20/2023]
Abstract
Fungal specialized metabolites are a major source of beneficial compounds that are routinely isolated, characterized, and manufactured as pharmaceuticals, agrochemical agents, and industrial chemicals. The production of these metabolites is encoded by biosynthetic gene clusters that are often silent under standard growth conditions. There are limited resources for characterizing the direct link between abiotic stimuli and metabolite production. Herein, we introduce a network analysis-based, data-driven algorithm comprising two routes to characterize the production of specialized fungal metabolites triggered by different exogenous compounds: the direct route and the auxiliary route. Both routes elucidate the influence of treatments on the production of specialized metabolites from experimental data. The direct route determines known and putative metabolites induced by treatments and provides additional insight over traditional comparison methods. The auxiliary route is specific for discovering unknown analytes, and further identification can be curated through online bioinformatic resources. We validated our algorithm by applying chitooligosaccharides and lipids at two different temperatures to the fungal pathogen Aspergillus fumigatus. After liquid chromatography-mass spectrometry quantification of significantly produced analytes, we used network centrality measures to rank the treatments' ability to elucidate these analytes and confirmed their identity through fragmentation patterns or in silico spiking with commercially available standards. Later, we examined the transcriptional regulation of these metabolites through real-time quantitative polymerase chain reaction. Our data-driven techniques can complement existing metabolomic network analysis by providing an approach to track the influence of any exogenous stimuli on metabolite production. Our experimental-based algorithm can overcome the bottlenecks in elucidating novel fungal compounds used in drug discovery.
Affiliation(s)
- Matthew J Lane
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, TN 37916, USA
- Joanna Tannous
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Alyssa A Carrell
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Paul E Abraham
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Richard J Giannone
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Jean-Michel Ané
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706, USA
- Nancy P Keller
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Department of Medical Microbiology and Immunology, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jesse L Labbé
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Now at Tekholding, Salt Lake City, UT 84119, USA
- Armin G Geiger
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, TN 37916, USA
- David Kainer
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
  - Now at ARC Centre of Excellence for Plant Success in Nature and Agriculture, University of Queensland, Brisbane, QLD 4072, Australia
- Daniel A Jacobson
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Tomás A Rush
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
6. Bittremieux W, Schmid R, Huber F, van der Hooft JJJ, Wang M, Dorrestein PC. Comparison of Cosine, Modified Cosine, and Neutral Loss Based Spectrum Alignment For Discovery of Structurally Related Molecules. Journal of the American Society for Mass Spectrometry 2022; 33:1733-1744. [PMID: 35960544 DOI: 10.1021/jasms.2c00153] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 06/15/2023]
Abstract
Spectrum alignment of tandem mass spectrometry (MS/MS) data using the modified cosine similarity and subsequent visualization as molecular networks have been demonstrated to be a useful strategy to discover analogs of molecules from untargeted MS/MS-based metabolomics experiments. Recently, a neutral loss matching approach has been introduced as an alternative to MS/MS-based molecular networking with an implied performance advantage in finding analogs that cannot be discovered using existing MS/MS spectrum alignment strategies. To comprehensively evaluate the scoring properties of neutral loss matching, the cosine similarity, and the modified cosine similarity, similarity measures of 955 228 peptide MS/MS spectrum pairs and 10 million small molecule MS/MS spectrum pairs were compared. This comparative analysis revealed that the modified cosine similarity outperformed neutral loss matching and the cosine similarity in all cases. The data further indicated that the performance of MS/MS spectrum alignment depends on the location and type of the modification, as well as the chemical compound class of fragmented molecules.
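The difference between the cosine and modified cosine similarities compared in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the scoring code used in the study: peaks are paired greedily by intensity product rather than by an optimal assignment, the 0.02 m/z tolerance is arbitrary, and the function names and the two toy spectra are hypothetical.

```python
import math


def _matched_score(spec_a, spec_b, shift, tol):
    """Greedy normalized dot product over peaks that match directly or via `shift`.

    spec_a, spec_b: lists of (mz, intensity) tuples.
    """
    pairs = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            delta = mz_a - mz_b
            # A pair qualifies if the peaks coincide, or differ by the allowed shift.
            if abs(delta) <= tol or abs(delta - shift) <= tol:
                pairs.append((int_a * int_b, i, j))

    # Greedy assignment: take the largest products first, using each peak once.
    pairs.sort(reverse=True)
    used_a, used_b, total = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            total += product

    norm_a = math.sqrt(sum(x * x for _, x in spec_a))
    norm_b = math.sqrt(sum(x * x for _, x in spec_b))
    return total / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine(spec_a, spec_b, tol=0.02):
    """Plain cosine: only peaks at the same m/z (within tol) may match."""
    return _matched_score(spec_a, spec_b, 0.0, tol)


def modified_cosine(spec_a, precursor_a, spec_b, precursor_b, tol=0.02):
    """Modified cosine: peaks may also match when offset by the precursor m/z difference."""
    return _matched_score(spec_a, spec_b, precursor_a - precursor_b, tol)


# Toy spectra: b carries a hypothetical +14 modification on one fragment.
a = [(100.0, 1.0), (150.0, 1.0)]
b = [(100.0, 1.0), (164.0, 1.0)]
print(cosine(a, b))
print(modified_cosine(a, 200.0, b, 214.0))
```

In the toy case only one of two peaks matches under the plain cosine, while the modified cosine also pairs the fragment that carries the mass shift, which is why it scores the modified analog higher.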
Affiliation(s)
- Wout Bittremieux
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Robin Schmid
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Florian Huber
  - Centre for Digitalization and Digitality, University of Applied Sciences, 40476 Düsseldorf, Germany
- Justin J J van der Hooft
  - Bioinformatics Group, Wageningen University, 6708PB Wageningen, The Netherlands
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
- Mingxun Wang
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Pieter C Dorrestein
  - Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
7. Tay AP, Hamey JJ, Martyn GE, Wilson LOW, Wilkins MR. Identification of Protein Isoforms Using Reference Databases Built from Long and Short Read RNA-Sequencing. J Proteome Res 2022; 21:1628-1639. [PMID: 35612954 DOI: 10.1021/acs.jproteome.1c00968] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Alternative splicing can lead to distinct protein isoforms. These can have different functions in specific cells and tissues or in different developmental stages. In this study, we explored whether transcripts assembled from long read, nanopore-based, direct RNA-sequencing (RNA-seq) could improve the identification of protein isoforms in human K562 cells. By comparing with Illumina-based short read RNA-seq, we showed that a large proportion of Ensembl transcripts (5949/14,326) and genes expressing alternatively spliced transcripts (486/2981) identified with long direct reads were missed by short paired-end reads. By co-analyzing proteomic and transcriptomic data, we also showed that some peptides (826/35,976), proteins (262/3215), and protein isoforms arising from distinct transcript variants (574/1212) identified with isoform-specific peptides via custom long-read-based databases were missed in Illumina-derived databases. Finally, we generated unequivocal peptide evidence for a set of protein isoforms and showed that long read, direct RNA-seq allows the discovery of novel protein isoforms not already in reference databases or custom databases built from short read RNA-seq data. Our analysis highlights the benefits of long read RNA-seq data in the generation of reference databases to increase tandem mass spectrometry (MS/MS) identification of protein isoforms.
Affiliation(s)
- Aidan P Tay
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia
  - Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Joshua J Hamey
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Gabriella E Martyn
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Laurence O W Wilson
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia
  - Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Marc R Wilkins
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
8. Combination of GC-MS Molecular Networking and Larvicidal Effect against Aedes aegypti for the Discovery of Bioactive Substances in Commercial Essential Oils. Molecules 2022; 27:1588. [PMID: 35268689 PMCID: PMC8912102 DOI: 10.3390/molecules27051588] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/25/2022] [Revised: 02/20/2022] [Accepted: 02/24/2022] [Indexed: 01/11/2023] Open
Abstract
Dengue is a neglected disease, present mainly in tropical countries, with more than 5.2 million cases reported in 2019. Vector control remains the most effective protective measure against dengue and other arboviruses. Synthetic insecticides based on organophosphates, pyrethroids, carbamates, neonicotinoids and oxadiazines are unattractive due to their high degree of toxicity to humans, animals and the environment. Conversely, natural-product-based larvicides/insecticides, such as essential oils, present high efficiency, low environmental toxicity and can be easily scaled up for industrial processes. However, essential oils are highly complex and require modern analytical and computational approaches to streamline the identification of bioactive substances. This study combined the GC-MS spectral similarity network approach with larvicidal assays as a new strategy for the discovery of potential bioactive substances in complex biological samples, enabling the systematic and simultaneous annotation of substances in 20 essential oils through LC50 larvicidal assays. This strategy allowed rapid intuitive discovery of distribution patterns between families and metabolic classes in clusters, and the prediction of larvicidal properties of acyclic monoterpene derivatives, including citral, neral, citronellal and citronellol, and their acetate forms (LC50 < 50 µg/mL).
9. Perpetuo L, Klein J, Ferreira R, Guedes S, Amado F, Leite-Moreira A, Silva AMS, Thongboonkerd V, Vitorino R. How can artificial intelligence be used for peptidomics? Expert Rev Proteomics 2021; 18:527-556. [PMID: 34343059 DOI: 10.1080/14789450.2021.1962303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/01/2023]
Abstract
INTRODUCTION: Peptidomics is an emerging field of omics sciences using advanced isolation, analysis, and computational techniques that enable qualitative and quantitative analyses of various peptides in biological samples. Peptides can act as useful biomarkers and as therapeutic molecules for diseases.
AREAS COVERED: The use of therapeutic peptides can be predicted quickly and efficiently using data-driven computational methods, particularly artificial intelligence (AI) approach. Various AI approaches are useful for peptide-based drug discovery, such as support vector machine, random forest, extremely randomized trees, and other more recently developed deep learning methods. AI methods are relatively new to the development of peptide-based therapies, but these techniques already become essential tools in protein science by dissecting novel therapeutic peptides and their functions (Figure 1).
EXPERT OPINION: Researchers have shown that AI models can facilitate the development of peptidomics and selective peptide therapies in the field of peptide science. Biopeptide prediction is important for the discovery and development of successful peptide-based drugs. Due to their ability to predict therapeutic roles based on sequence details, many AI-dependent prediction tools have been developed (Figure 1).
Affiliation(s)
- Luís Perpetuo
  - iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro
- Julie Klein
  - Institut National de la Santé et de la Recherche Médicale (INSERM), U1297, Institute of Cardiovascular and Metabolic Disease, Université Toulouse III, Toulouse, France
- Rita Ferreira
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
- Sofia Guedes
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
- Francisco Amado
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
- Adelino Leite-Moreira
  - UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
- Artur M S Silva
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
- Visith Thongboonkerd
  - Medical Proteomics Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Rui Vitorino
  - iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro
  - UnIC, Departamento de Cirurgia e Fisiologia, Faculdade de Medicina da Universidade do Porto, Porto
10. da Silva MACN, Costa JH, Pacheco-Fill T, Ruiz ALTG, Vidal FCB, Borges KRA, Guimarães SJA, de Azevedo-Santos APS, Buglio KE, Foglio MA, Barbosa MDCL, Nascimento MDDSB, de Carvalho JE. Açai (Euterpe oleracea Mart.) Seed Extract Induces ROS Production and Cell Death in MCF-7 Breast Cancer Cell Line. Molecules 2021; 26:3546. [PMID: 34200718 PMCID: PMC8230419 DOI: 10.3390/molecules26123546] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/11/2021] [Revised: 06/06/2021] [Accepted: 06/07/2021] [Indexed: 01/11/2023] Open
Abstract
Euterpe oleracea Mart. (açai) is a native palm from the Amazon region. There are various chemical constituents of açai with bioactive properties. This study aimed to evaluate the chemical composition and cytotoxic effects of açai seed extract on breast cancer cell line (MCF-7). Global Natural Products Social Molecular Networking (GNPS) was applied to identify chemical compounds present in açai seed extract. LC-MS/MS and molecular networking were employed to detect the phenolic compounds of açai. The antioxidant activity of açai seed extract was measured by DPPH assay. MCF-7 breast cancer cell line viability was evaluated by MTT assay. Cell death was evaluated by flow cytometry and time-lapse microscopy. Autophagy was evaluated by orange acridin immunofluorescence assay. Reactive oxygen species (ROS) production was evaluated by DAF assay. From the molecular networking, fifteen compounds were identified, mainly phenolic compounds. The açai seed extract showed cytotoxic effects against MCF-7, induced morphologic changes in the cell line by autophagy and increased the ROS production pathway. The present study suggests that açai seed extract has a high cytotoxic capacity and may induce autophagy by increasing ROS production in breast cancer. Apart from its antioxidant activity, flavonoids with high radical scavenging activity present in açai also generated NO (nitric oxide), contributing to its cytotoxic effect and autophagy induction.
Collapse
Affiliation(s)
- Marcos Antonio Custódio Neto da Silva
- Post-Graduate Program in Internal Medicine, Faculty of Medical Science, Universidade Estadual de Campinas, Rua Tessália Vieira de Camargo, 126, Cidade Universitária Zeferino Vaz. CEP, Campinas 13083-887, SP, Brazil;
| | - Jonas Henrique Costa
- Institute of Chemistry, Universidade Estadual de Campinas, CP 6154, Campinas 13083-970, SP, Brazil; (J.H.C.); (T.P.-F.)
| | - Taícia Pacheco-Fill
- Institute of Chemistry, Universidade Estadual de Campinas, CP 6154, Campinas 13083-970, SP, Brazil; (J.H.C.); (T.P.-F.)
| | - Ana Lúcia Tasca Gois Ruiz
- Faculty of Pharmaceutical Sciences, Universidade Estadual de Campinas, Campinas 13083-859, SP, Brazil; (A.L.T.G.R.); (K.E.B.); (M.A.F.)
| | - Flávia Castello Branco Vidal
- Post-Graduate Program in Adult Health, Department of Pathology, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil; (F.C.B.V.); (K.R.A.B.)
| | - Kátia Regina Assunção Borges
- Post-Graduate Program in Adult Health, Department of Pathology, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil; (F.C.B.V.); (K.R.A.B.)
| | - Sulayne Janaina Araújo Guimarães
- Post-Graduate Program in Health Sciences, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil; (S.J.A.G.); (A.P.S.d.A.-S.)
| | - Ana Paula Silva de Azevedo-Santos
- Post-Graduate Program in Health Sciences, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil; (S.J.A.G.); (A.P.S.d.A.-S.)
| | - Kaio Eduardo Buglio
- Faculty of Pharmaceutical Sciences, Universidade Estadual de Campinas, Campinas 13083-859, SP, Brazil; (A.L.T.G.R.); (K.E.B.); (M.A.F.)
| | - Mary Ann Foglio
- Faculty of Pharmaceutical Sciences, Universidade Estadual de Campinas, Campinas 13083-859, SP, Brazil; (A.L.T.G.R.); (K.E.B.); (M.A.F.)
| | - Maria do Carmo Lacerda Barbosa
- Post-Graduate Program in Family Health, Department of Medicine I, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil;
| | - Maria do Desterro Soares Brandão Nascimento
- Post-Graduate Program in Adult Health, Department of Pathology, Federal University of Maranhão (UFMA), São Luís 65080-805, MA, Brazil; (F.C.B.V.); (K.R.A.B.)
- Correspondence: (M.d.D.S.B.N.); (J.E.d.C.)
| | - João Ernesto de Carvalho
- Faculty of Pharmaceutical Sciences, Universidade Estadual de Campinas, Campinas 13083-859, SP, Brazil; (A.L.T.G.R.); (K.E.B.); (M.A.F.)
- Correspondence: (M.d.D.S.B.N.); (J.E.d.C.)
|
11
|
Behsaz B, Bode E, Gurevich A, Shi YN, Grundmann F, Acharya D, Caraballo-Rodríguez AM, Bouslimani A, Panitchpakdi M, Linck A, Guan C, Oh J, Dorrestein PC, Bode HB, Pevzner PA, Mohimani H. Integrating genomics and metabolomics for scalable non-ribosomal peptide discovery. Nat Commun 2021; 12:3225. [PMID: 34050176 PMCID: PMC8163882 DOI: 10.1038/s41467-021-23502-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 09/23/2020] [Accepted: 05/04/2021] [Indexed: 02/07/2023] Open
Abstract
Non-Ribosomal Peptides (NRPs) represent a biomedically important class of natural products that include a multitude of antibiotics and other clinically used drugs. NRPs are not directly encoded in the genome but are instead produced by metabolic pathways encoded by biosynthetic gene clusters (BGCs). Since the existing genome mining tools predict many putative NRPs synthesized by a given BGC, it remains unclear which of these putative NRPs are correct and how to identify post-assembly modifications of amino acids in these NRPs in a blind mode, without knowing which modifications exist in the sample. To address this challenge, here we report NRPminer, a modification-tolerant tool for NRP discovery from large (meta)genomic and mass spectrometry datasets. We show that NRPminer is able to identify many NRPs from different environments, including four previously unreported NRP families from soil-associated microbes and NRPs from human microbiota. Furthermore, in this work we demonstrate the anti-parasitic activities and the structure of two of these NRP families using direct bioactivity screening and nuclear magnetic resonance spectrometry, illustrating the power of NRPminer for discovering bioactive NRPs.
Affiliation(s)
- Bahar Behsaz
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Center for Microbiome Innovation, University of California at San Diego, La Jolla, CA, USA
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Edna Bode
- Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Alexey Gurevich
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St Petersburg, Russia
| | - Yan-Ni Shi
- Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Florian Grundmann
- Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Deepa Acharya
- Tiny Earth Chemistry Hub, University of Wisconsin-Madison, Madison, WI, USA
| | - Andrés Mauricio Caraballo-Rodríguez
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Amina Bouslimani
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Morgan Panitchpakdi
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Annabell Linck
- Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Changhui Guan
- The Jackson Laboratory of Medical Genomics, Farmington, CT, USA
| | - Julia Oh
- The Jackson Laboratory of Medical Genomics, Farmington, CT, USA
| | - Pieter C Dorrestein
- Center for Microbiome Innovation, University of California at San Diego, La Jolla, CA, USA
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
| | - Helge B Bode
- Molecular Biotechnology, Department of Biosciences, Goethe University Frankfurt, Frankfurt am Main, Germany.
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt & Senckenberg Research Institute, Frankfurt am Main, Germany.
- Max-Planck-Institute for Terrestrial Microbiology, Department for Natural Products in Organismic Interactions, Marburg, Germany.
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
| | - Hosein Mohimani
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
|
12
|
Cifani P, Li Z, Luo D, Grivainis M, Intlekofer AM, Fenyö D, Kentsis A. Discovery of Protein Modifications Using Differential Tandem Mass Spectrometry Proteomics. J Proteome Res 2021; 20:1835-1848. [PMID: 33749263 PMCID: PMC8341206 DOI: 10.1021/acs.jproteome.0c00638] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/26/2023]
Abstract
Recent studies have revealed diverse amino acid, post-translational, and noncanonical modifications of proteins in diverse organisms and tissues. However, their unbiased detection and analysis remain hindered by technical limitations. Here, we present a spectral alignment method for the identification of protein modifications using high-resolution mass spectrometry proteomics. Termed SAMPEI for spectral alignment-based modified peptide identification, this open-source algorithm is designed for the discovery of functional protein and peptide signaling modifications, without prior knowledge of their identities. Using synthetic standards and controlled chemical labeling experiments, we demonstrate its high specificity and sensitivity for the discovery of substoichiometric protein modifications in complex cellular extracts. SAMPEI mapping of mouse macrophage differentiation revealed diverse post-translational protein modifications, including distinct forms of cysteine itaconatylation. SAMPEI's robust parametrization and versatility are expected to facilitate the discovery of biological modifications of diverse macromolecules. SAMPEI is implemented as a Python package and is available open-source from BioConda and GitHub (https://github.com/FenyoLab/SAMPEI).
Affiliation(s)
- Paolo Cifani
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - Zhi Li
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
- Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Danmeng Luo
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - Mark Grivainis
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Andrew M Intlekofer
- Human Oncology & Pathogenesis Program and Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - David Fenyö
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
- Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Alex Kentsis
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
- Tow Center for Developmental Oncology, Department of Pediatrics, Memorial Sloan Kettering Cancer Center, and Departments of Pediatrics, Pharmacology, and Physiology & Biophysics, Weill Medical College of Cornell University, New York, New York 10021, United States
|
13
|
Yang G, Yuan Y, Yuan H, Wang J, Yun H, Geng Y, Zhao M, Li L, Weng Y, Liu Z, Feng J, Bu Y, Liu L, Wang B, Zhang X. Histone acetyltransferase 1 is a succinyltransferase for histones and non-histones and promotes tumorigenesis. EMBO Rep 2021; 22:e50967. [PMID: 33372411 PMCID: PMC7857430 DOI: 10.15252/embr.202050967] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 05/26/2020] [Revised: 11/13/2020] [Accepted: 12/01/2020] [Indexed: 12/13/2022] Open
Abstract
Lysine succinylation (Ksucc) is an evolutionarily conserved and widespread post-translational modification. Histone acetyltransferase 1 (HAT1) is a type B histone acetyltransferase, regulating the acetylation of both histone and non-histone proteins. However, the role of HAT1 in succinylation modulation remains unclear. Here, we employ a quantitative proteomics approach to study succinylation in HepG2 cancer cells and find that HAT1 modulates lysine succinylation on various proteins including histones and non-histones. HAT1 succinylates histone H3 on K122, contributing to epigenetic regulation and gene expression in cancer cells. Moreover, HAT1 catalyzes the succinylation of PGAM1 on K99, resulting in its increased enzymatic activity and the stimulation of glycolytic flux in cancer cells. Clinically, HAT1 is significantly elevated in liver cancer, pancreatic cancer, and cholangiocarcinoma tissues. Functionally, HAT1 succinyltransferase activity and the succinylation of PGAM1 by HAT1 play critical roles in promoting tumor progression in vitro and in vivo. Thus, we conclude that HAT1 is a succinyltransferase for histones and non-histones in tumorigenesis.
Affiliation(s)
- Guang Yang
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Ying Yuan
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Hongfeng Yuan
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Jiapei Wang
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Haolin Yun
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Yu Geng
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Man Zhao
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Linhan Li
- Jingjie PTM BioLab Co. Ltd., Hangzhou Economic and Technological Development Area, Hangzhou, China
| | - Yejing Weng
- Jingjie PTM BioLab Co. Ltd., Hangzhou Economic and Technological Development Area, Hangzhou, China
| | - Zixian Liu
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Jinyan Feng
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Yanan Bu
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Lei Liu
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
| | - Bingnan Wang
- Jingjie PTM BioLab Co. Ltd., Hangzhou Economic and Technological Development Area, Hangzhou, China
| | - Xiaodong Zhang
- Department of Cancer Research, Institute of Molecular Biology, College of Life Sciences, Nankai University, Tianjin, China
|
14
|
Na S, Paek E. Computational methods in mass spectrometry-based structural proteomics for studying protein structure, dynamics, and interactions. Comput Struct Biotechnol J 2020; 18:1391-1402. [PMID: 32637038 PMCID: PMC7322682 DOI: 10.1016/j.csbj.2020.06.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/04/2020] [Revised: 06/01/2020] [Accepted: 06/01/2020] [Indexed: 12/28/2022] Open
Abstract
Mass spectrometry (MS) has made enormous contributions to comprehensive protein identification and quantification in proteomics. MS is also gaining momentum for structural biology in a variety of ways, complementing conventional structural biology techniques. Here, we will review how MS-based techniques, such as hydrogen/deuterium exchange, covalent labeling, and chemical cross-linking, enable the characterization of protein structure, dynamics, and interactions, especially from a perspective of their data analyses. Structural information encoded by chemical probes in intact proteins is decoded by interpreting MS data at a peptide level, i.e., revealing conformational and dynamic changes in local regions of proteins. The structural MS data are not amenable to data analyses in traditional proteomics workflow, requiring dedicated software for each type of data. We first provide basic principles of data interpretation, including isotopic distribution and peptide sequencing. We then focus particularly on computational methods for structural MS data analyses and discuss outstanding challenges in a proteome-wide large scale analysis.
Affiliation(s)
- Seungjin Na
- Dept. of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
| | - Eunok Paek
- Dept. of Computer Science, Hanyang University, Seoul 04763, Republic of Korea
|
15
|
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. Mass Spectrometry Reviews 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
Affiliation(s)
- Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Helge Raeder
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
| | - Frode S Berven
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Harald Barsnes
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
| | - Marc Vaudel
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
|
16
|
Karan D, Dubey S, Pirisi L, Nagel A, Pina I, Choo YM, Hamann MT. The Marine Natural Product Manzamine A Inhibits Cervical Cancer by Targeting the SIX1 Protein. Journal of Natural Products 2020; 83:286-295. [PMID: 32022559 PMCID: PMC7161578 DOI: 10.1021/acs.jnatprod.9b00577] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/23/2019] [Indexed: 06/10/2023]
Abstract
Natural products remain an important source of drug leads covering unique chemical space and providing significant therapeutic value for the control of cancer and infectious diseases resistant to current drugs. Here, we determined the antiproliferative activity of a natural product manzamine A (1) from an Indo-Pacific sponge following various in vitro cellular assays targeting cervical cancer (C33A, HeLa, SiHa, and CaSki). Our data demonstrated the antiproliferative effects of 1 at relatively low and non-cytotoxic concentrations (up to 4 μM). Mechanistic investigations confirmed that 1 blocked cell cycle progression in SiHa and CaSki cells at G1/S phase and regulated cell cycle-related genes, including restoration of p21 and p53 expression. In apoptotic assays, HeLa cells showed the highest sensitivity to 1 as compared to other cell types (C33A, SiHa, and CaSki). Interestingly, 1 decreased the levels of the oncoprotein SIX1, which is associated with oncogenesis in cervical cancer. To further investigate the structure-activity relationship among manzamine A (1) class with potential antiproliferative activity, molecular networking facilitated the efficient identification, dereplication, and assignment of structures from the manzamine class and revealed the significant potential in the design of optimized molecules for the treatment of cervical cancer. These data suggest that this sponge-derived natural product class warrants further attention regarding the design and development of novel manzamine analogues, which may be efficacious for preventive and therapeutic treatment of cancer. Additionally, this study reveals the significance of protecting fragile marine ecosystems from climate change-induced loss of species diversity.
Affiliation(s)
- Dev Karan
- Department of Pathology, MCW Cancer Center and Prostate Cancer Center of Excellence, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, United States
| | - Seema Dubey
- Department of Pathology, MCW Cancer Center and Prostate Cancer Center of Excellence, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, United States
| | - Lucia Pirisi
- Department of Pathology, Microbiology and Immunology, University of South Carolina School of Medicine, Columbia, South Carolina, United States
| | - Alexis Nagel
- Department of Drug Discovery and Biomedical Sciences, Medical University of South Carolina, Charleston, South Carolina, United States
| | - Ivett Pina
- Department of Drug Discovery and Biomedical Sciences, Medical University of South Carolina, Charleston, South Carolina, United States
| | - Yeun-Mun Choo
- Department of Chemistry, University of Malaya, Kuala Lumpur, Malaysia
| | - Mark T Hamann
- Department of Drug Discovery and Biomedical Sciences, Medical University of South Carolina, Charleston, South Carolina, United States
|
17
|
De Novo Peptide Sequencing Reveals Many Cyclopeptides in the Human Gut and Other Environments. Cell Syst 2019; 10:99-108.e5. [PMID: 31864964 DOI: 10.1016/j.cels.2019.11.007] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 06/10/2019] [Revised: 09/18/2019] [Accepted: 11/18/2019] [Indexed: 12/20/2022]
Abstract
Cyclic and branch cyclic peptides (cyclopeptides) represent a class of bioactive natural products that include many antibiotics and anti-tumor compounds. Despite the recent advances in metabolomics analysis, still little is known about the cyclopeptides in the human gut and their possible interactions due to a lack of computational analysis pipelines that are applicable to such compounds. Here, we introduce CycloNovo, an algorithm for automated de novo cyclopeptide analysis and sequencing that employs de Bruijn graphs, the workhorse of DNA sequencing algorithms, to identify cyclopeptides in spectral datasets. CycloNovo reconstructed 32 previously unreported cyclopeptides (to the best of our knowledge) in the human gut and reported over a hundred cyclopeptides in other environments represented by various spectra on Global Natural Products Social Molecular Network (GNPS). https://github.com/bbehsaz/cyclonovo.
|
18
|
Pino L, Lin A, Bittremieux W. 2018 YPIC Challenge: A Case Study in Characterizing an Unknown Protein Sample. J Proteome Res 2019; 18:3936-3943. [PMID: 31556620 PMCID: PMC6824964 DOI: 10.1021/acs.jproteome.9b00384] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
For the 2018 YPIC Challenge, contestants were invited to try to decipher two unknown English questions encoded by a synthetic protein expressed in Escherichia coli. In addition to deciphering the sentence, contestants were asked to determine the three-dimensional structure and detect any post-translational modifications left by the host organism. We present our experimental and computational strategy to characterize this sample by identifying the unknown protein sequence and detecting the presence of post-translational modifications. The sample was acquired with dynamic exclusion disabled to increase the signal-to-noise ratio of the measured molecules, after which spectral clustering was used to generate high-quality consensus spectra. De novo spectrum identification was used to determine the synthetic protein sequence, and any post-translational modifications introduced by E. coli on the synthetic protein were analyzed via spectral networking. This workflow resulted in a de novo sequence coverage of 70%, on par with sequence database searching performance. Additionally, the spectral networking analysis indicated that no systematic modifications were introduced on the synthetic protein by E. coli. The strategy presented here can be directly used to analyze samples for which no protein sequence information is available or when the identity of the sample is unknown. All software and code to perform the bioinformatics analysis are available as open source, and self-contained Jupyter notebooks are provided to fully recreate the analysis.
Affiliation(s)
- Lindsay Pino
- Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
| | - Andy Lin
- Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
| | - Wout Bittremieux
- Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
- Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
|
19
|
Cao L, Gurevich A, Alexander KL, Naman CB, Leão T, Glukhov E, Luzzatto-Knaan T, Vargas F, Quinn R, Bouslimani A, Nothias LF, Singh NK, Sanders JG, Benitez RAS, Thompson LR, Hamid MN, Morton JT, Mikheenko A, Shlemov A, Korobeynikov A, Friedberg I, Knight R, Venkateswaran K, Gerwick WH, Gerwick L, Dorrestein PC, Pevzner PA, Mohimani H. MetaMiner: A Scalable Peptidogenomics Approach for Discovery of Ribosomal Peptide Natural Products with Blind Modifications from Microbial Communities. Cell Syst 2019; 9:600-608.e4. [PMID: 31629686 DOI: 10.1016/j.cels.2019.09.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 01/11/2019] [Revised: 04/23/2019] [Accepted: 09/12/2019] [Indexed: 12/22/2022]
Abstract
Ribosomally synthesized and post-translationally modified peptides (RiPPs) are an important class of natural products that contain antibiotics and a variety of other bioactive compounds. The existing methods for discovery of RiPPs by combining genome mining and computational mass spectrometry are limited to discovering specific classes of RiPPs from small datasets, and these methods fail to handle unknown post-translational modifications. Here, we present MetaMiner, a software tool for addressing these challenges that is compatible with large-scale screening platforms for natural product discovery. After searching millions of spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure against just eight genomic and metagenomic datasets, MetaMiner discovered 31 known and seven unknown RiPPs from diverse microbial communities, including human microbiome and lichen microbiome, and microorganisms isolated from the International Space Station.
Affiliation(s)
- Liu Cao
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Alexey Gurevich
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Kelsey L Alexander
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA; Department of Chemistry and Biochemistry, University of California, San Diego, San Diego, CA, USA
| | - C Benjamin Naman
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA; Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Department of Marine Pharmacy, College of Food and Pharmaceutical Sciences, Ningbo University, Ningbo, Zhejiang, China
| | - Tiago Leão
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Evgenia Glukhov
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Tal Luzzatto-Knaan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Fernando Vargas
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Robby Quinn
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Amina Bouslimani
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Louis Felix Nothias
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Nitin K Singh
- Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
| | - Jon G Sanders
- Department of Pediatrics, University of California, San Diego School of Medicine, San Diego, CA, USA
| | - Rodolfo A S Benitez
- Department of Pediatrics, University of California, San Diego School of Medicine, San Diego, CA, USA
| | - Luke R Thompson
- Department of Biological Sciences and Northern Gulf Institute, University of Southern Mississippi, Hattiesburg, MS, USA; Ocean Chemistry and Ecosystems Division, Atlantic Oceanographic and Meteorological Laboratory, National Oceanic and Atmospheric Administration, stationed at Southwest Fisheries Science Center, La Jolla, CA, USA
| | - Md-Nafiz Hamid
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA; Interdepartmental program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, USA
| | - James T Morton
- Department of Pediatrics, University of California, San Diego School of Medicine, San Diego, CA, USA; Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Alexander Shlemov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia; Department of Mathematics and Mechanics, St. Petersburg State University, St. Petersburg, Russia
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA; Interdepartmental program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, USA
| | - Rob Knight
- Department of Pediatrics, University of California, San Diego School of Medicine, San Diego, CA, USA; Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA; Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, San Diego, CA, USA; Department of Bioengineering, University of California, San Diego, San Diego, CA, USA
| | - William H Gerwick
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Lena Gerwick
- Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, CA, USA; Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, San Diego, CA, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA; Center for Microbiome Innovation, Jacobs School of Engineering, University of California San Diego, San Diego, CA, USA
| | - Hosein Mohimani
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA; Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA, USA.
20
Wang C, Zou P, Yang C, Liu L, Cheng L, He X, Zhang L, Zhang Y, Jiang H, Chen PR. Dynamic modifications of biomacromolecules: mechanism and chemical interventions. SCIENCE CHINA-LIFE SCIENCES 2019; 62:1459-1471. [PMID: 31555961 DOI: 10.1007/s11427-019-9823-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/28/2019] [Accepted: 08/20/2019] [Indexed: 01/24/2023]
Abstract
Biological macromolecules (proteins, nucleic acids, polysaccharides, etc.) are the building blocks of life and constantly undergo chemical modifications that are often reversible and spatiotemporally regulated. These dynamic properties of chemical modifications play fundamental roles in physiological processes as well as pathological changes of living systems. The Major Research Project (MRP) funded by the National Natural Science Foundation of China (NSFC), "Dynamic modifications of biomacromolecules: mechanism and chemical interventions", aims to integrate cross-disciplinary approaches at the interface of chemistry, life sciences, medicine, mathematics, material science and information science with the following goals: (i) developing specific labeling techniques and detection methods for dynamic chemical modifications of biomacromolecules, (ii) analyzing the molecular mechanisms and functional relationships of dynamic chemical modifications of biomacromolecules, and (iii) exploring biomacromolecules and small molecule probes as potential drug targets and lead compounds.
Affiliation(s)
- Chu Wang
- College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
| | - Peng Zou
- College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
| | - Caiguang Yang
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
| | - Lei Liu
- Department of Chemistry, Tsinghua University, Beijing, 100084, China
| | - Liang Cheng
- Institute of Chemistry, Chinese Academy of Sciences, Beijing, 100190, China
| | - Xiaopeng He
- School of Chemistry and Molecular Engineering, East China University of Science and Technology, Shanghai, 200237, China
| | - Liang Zhang
- Department of Pharmacology and Chemical Biology, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Yan Zhang
- National Natural Science Foundation of China, Beijing, 100085, China
| | - Hualiang Jiang
- Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China.
| | - Peng R Chen
- College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China.
21
Na S, Kim J, Paek E. MODplus: Robust and Unrestrictive Identification of Post-Translational Modifications Using Mass Spectrometry. Anal Chem 2019; 91:11324-11333. [PMID: 31365238 DOI: 10.1021/acs.analchem.9b02445] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/02/2023]
Abstract
Post-translational modifications regulate various cellular processes and are of great biological interest. Unrestrictive searches of mass spectrometry data enable the detection of any type of modification. Here we propose MODplus, which makes practical unrestrictive searches possible by allowing (1) hundreds of modifications, (2) multiple modifications per peptide, (3) the whole proteome database, and (4) any tolerance values in search parameters. The utility of MODplus was demonstrated in large human data sets of HEK293 cells and TMT-labeled phosphorylation enrichment. Notably, MODplus supports identifying different modification types at multiple sites and reports real chemical and biological modifications, which matters because linking unrestrictive search results to real modifications has been very labor intensive. We also confirmed the presence of Missing Precursor (MP) spectra that were not identifiable using targeted precursor masses. The MP spectra mostly resulted in identifications of wrong modifications and negatively affected the overall performance, often by as much as 10%. MODplus can rapidly recognize MP spectra and correct their identifications, resulting in an increased identification rate of up to 70% in the HEK293 data set as well as improved reliability.
Affiliation(s)
- Seungjin Na
- Department of Computer Science, Hanyang University, Seoul 04763, South Korea
| | - Jihyung Kim
- Department of Computer Science, Hanyang University, Seoul 04763, South Korea
| | - Eunok Paek
- Department of Computer Science, Hanyang University, Seoul 04763, South Korea
22
An Z, Zhai L, Ying W, Qian X, Gong F, Tan M, Fu Y. PTMiner: Localization and Quality Control of Protein Modifications Detected in an Open Search and Its Application to Comprehensive Post-translational Modification Characterization in Human Proteome. Mol Cell Proteomics 2019; 18:391-405. [PMID: 30420486 PMCID: PMC6356076 DOI: 10.1074/mcp.ra118.000812] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 04/21/2018] [Revised: 11/02/2018] [Indexed: 12/27/2022] Open
Abstract
The open (mass tolerant) search of tandem mass spectra of peptides shows great potential in the comprehensive detection of post-translational modifications (PTMs) in shotgun proteomics. However, this search strategy has not been widely used by the community, and one bottleneck is the lack of appropriate algorithms for automated and reliable post-processing of the coarse and error-prone search results. Here we present PTMiner, a software tool for confident filtering and localization of modifications (mass shifts) detected in an open search. After mass-shift-grouped false discovery rate (FDR) control of peptide-spectrum matches (PSMs), PTMiner uses an empirical Bayesian method to localize modifications through iterative learning of the prior probabilities of each type of modification occurring on different amino acids. The performance of PTMiner was evaluated on three data sets, including simulated data, chemically synthesized peptide library data and modified-peptide spiked-in proteome data. The results showed that PTMiner can effectively control the PSM FDR and accurately localize the modification sites. At 1% real false localization rate (FLR), PTMiner localized 93%, 84%, and 83% of the modification sites in the three data sets, respectively, far higher than two open search engines we used and an extended version of the Ascore localization algorithm. We then used PTMiner to analyze a draft map of the human proteome containing 25 million spectra from 30 tissues, and confidently identified over 1.7 million modified PSMs at 1% FDR and 1% FLR, which provided a system-wide view of both known and unknown PTMs in the human proteome.
Affiliation(s)
- Zhiwu An
- National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Linhui Zhai
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
| | - Wantao Ying
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, Beijing 102206, China; Beijing Institute of Lifeomics, Beijing 100850, China
| | - Xiaohong Qian
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, National Engineering Research Center for Protein Drugs, Beijing 102206, China; Beijing Institute of Lifeomics, Beijing 100850, China
| | - Fuzhou Gong
- National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
| | - Minjia Tan
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.
| | - Yan Fu
- National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
23
Mohimani H, Gurevich A, Shlemov A, Mikheenko A, Korobeynikov A, Cao L, Shcherbin E, Nothias LF, Dorrestein PC, Pevzner PA. Dereplication of microbial metabolites through database search of mass spectra. Nat Commun 2018; 9:4035. [PMID: 30279420 PMCID: PMC6168521 DOI: 10.1038/s41467-018-06082-8] [Citation(s) in RCA: 166] [Impact Index Per Article: 27.7] [Received: 02/05/2018] [Accepted: 08/14/2018] [Indexed: 12/24/2022] Open
Abstract
Natural products have traditionally been rich sources for drug discovery. In order to clear the road toward the discovery of unknown natural products, biologists need dereplication strategies that identify known ones. Here we report DEREPLICATOR+, an algorithm that improves on the previous approaches for identifying peptidic natural products, and extends them for identification of polyketides, terpenes, benzenoids, alkaloids, flavonoids, and other classes of natural products. We show that DEREPLICATOR+ can search all spectra in the recently launched Global Natural Products Social molecular network and identify an order of magnitude more natural products than previous dereplication efforts. We further demonstrate that DEREPLICATOR+ enables cross-validation of genome-mining and peptidogenomics/glycogenomics results.
Affiliation(s)
- Hosein Mohimani
- Computational Biology Department, School of Computer Sciences, Carnegie Mellon University, Pittsburgh, PA, USA.
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA.
| | - Alexey Gurevich
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Alexander Shlemov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
- Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia
| | - Liu Cao
- Computational Biology Department, School of Computer Sciences, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Egor Shcherbin
- National Research University Higher School of Economics, St. Petersburg, Russia
| | - Louis-Felix Nothias
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Department of Pharmacology and Pediatrics, University of California, San Diego, La Jolla, CA, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia
24
Kou Q, Wu S, Tolic N, Paša-Tolic L, Liu Y, Liu X. A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra. Bioinformatics 2018; 33:1309-1316. [PMID: 28453668 DOI: 10.1093/bioinformatics/btw806] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 09/13/2016] [Accepted: 12/15/2016] [Indexed: 11/14/2022] Open
Abstract
Motivation: Although proteomics has rapidly developed in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a 'bird's eye view' of intact proteoforms. The combinatorial explosion of various alterations on a protein may result in billions of possible proteoforms, making proteoform identification a challenging computational problem. Results: We propose a new data structure, called the mass graph, for efficient representation of proteoforms and design mass graph alignment algorithms. We developed TopMG, a mass graph-based software tool for proteoform identification by top-down mass spectrometry. Experiments on top-down mass spectrometry datasets showed that TopMG outperformed existing methods in identifying complex proteoforms. Availability and implementation: http://proteomics.informatics.iupui.edu/software/topmg/. Contact: xwliu@iupui.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qiang Kou
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
| | - Si Wu
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
| | - Nikola Tolic
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Ljiljana Paša-Tolic
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99354, USA
| | - Yunlong Liu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Xiaowen Liu
- Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA; Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
25
Dorl S, Winkler S, Mechtler K, Dorfer V. PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search. J Proteome Res 2017; 17:290-295. [PMID: 29057658 DOI: 10.1021/acs.jproteome.7b00563] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/17/2023]
Abstract
Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyze complex biological samples. The identification of proteins carrying post-translational modifications, for example, phosphorylation, is typically addressed by allowing variable modifications in the searched sequences. Accounting for these variations exponentially increases the combinatorial space in the database, which leads to increased processing times and more false positive identifications. The tool presented here, PhoStar, identifies spectra that originate from phosphorylated peptides before database search using a supervised machine learning approach. The model for the prediction of phosphorylation was trained and validated with an accuracy of 97.6% on a large set of high-confidence spectra collected from publicly available experimental data. Its power was further validated by predicting phosphorylation in the complete NIST human and mouse high collision-dissociation spectral libraries, achieving an accuracy of 98.2 and 97.9%, respectively. We demonstrate the application of PhoStar by using it for spectra filtering before database search. In database search of HeLa samples, the peptide search space was reduced by 27-66% while finding at least 97% of total peptide identifications (at 1% FDR) compared with a standard workflow.
Affiliation(s)
- Sebastian Dorl
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
| | - Stephan Winkler
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
| | - Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria; Institute of Molecular Biotechnology (IMBA), Protein Chemistry, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
| | - Viktoria Dorfer
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
26
Shao W, Lam H. Tandem mass spectral libraries of peptides and their roles in proteomics research. MASS SPECTROMETRY REVIEWS 2017; 36:634-648. [PMID: 27403644 DOI: 10.1002/mas.21512] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 11/30/2015] [Accepted: 05/21/2016] [Indexed: 05/15/2023]
Abstract
Proteomics is a rapidly maturing field aimed at the high-throughput identification and quantification of all proteins in a biological system. The cornerstone of proteomic technology is tandem mass spectrometry of peptides resulting from the digestion of protein mixtures. The fragmentation pattern of each peptide ion is captured in its tandem mass spectrum, which enables its identification and acts as a fingerprint for the peptide. Spectral libraries are simply searchable collections of these fingerprints, which have taken on an increasingly prominent role in proteomic data analysis. This review describes the historical development of spectral libraries in proteomics, details the computational procedures behind library building and searching, surveys the current applications of spectral libraries, and discusses the outstanding challenges. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:634-648, 2017.
Affiliation(s)
- Wenguang Shao
- Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule (ETH) Zurich, Zurich, Switzerland
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
| | - Henry Lam
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
27
Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat Methods 2017; 14:513-520. [PMID: 28394336 PMCID: PMC5409104 DOI: 10.1038/nmeth.4256] [Citation(s) in RCA: 976] [Impact Index Per Article: 139.4] [Received: 09/30/2016] [Accepted: 03/06/2017] [Indexed: 12/22/2022]
Abstract
There is a need to better understand and handle the 'dark matter' of proteomics: the vast diversity of post-translational and chemical modifications that are unaccounted for in a typical mass spectrometry-based analysis and thus remain unidentified. We present a fragment-ion indexing method, and its implementation in the peptide identification tool MSFragger, that enables a more than 100-fold improvement in speed over most existing proteome database search tools. Using several large proteomic data sets, we demonstrate how MSFragger empowers the open database search concept for comprehensive identification of peptides and all their modified forms, uncovering dramatic differences in modification rates across experimental samples and conditions. We further illustrate its utility using protein-RNA cross-linked peptide data and using affinity purification experiments where we observe, on average, a 300% increase in the number of identified spectra for enriched proteins. We also discuss the benefits of open searching for improved false discovery rate estimation in proteomics.
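The idea of indexing fragment ions, rather than scanning every candidate peptide per spectrum, can be illustrated with a toy inverted index. This is a minimal sketch and not MSFragger's implementation: the function names, the fixed bin width, and the neighbor-bin lookup are all assumptions made for illustration, and the mass-shift handling that an open search needs is left out.

```python
from collections import defaultdict

def build_fragment_index(peptide_fragments, bin_width=0.02):
    """Map each m/z bin to the set of peptide ids with a theoretical fragment in it.

    peptide_fragments: dict of peptide id -> list of theoretical fragment m/z values.
    bin_width is a hypothetical tolerance chosen for this sketch.
    """
    index = defaultdict(set)
    for peptide_id, fragment_mzs in peptide_fragments.items():
        for mz in fragment_mzs:
            index[int(mz / bin_width)].add(peptide_id)
    return index

def count_shared_fragments(index, spectrum_mzs, bin_width=0.02):
    """Count, per peptide, how many observed peaks hit one of its indexed fragments."""
    counts = defaultdict(int)
    for mz in spectrum_mzs:
        center = int(mz / bin_width)
        matched = set()
        # Also look at the two neighboring bins so peaks near a bin edge still match.
        for b in (center - 1, center, center + 1):
            matched |= index.get(b, set())
        for peptide_id in matched:
            counts[peptide_id] += 1
    return dict(counts)

peptides = {"PEPA": [100.0, 200.0, 300.0], "PEPB": [100.0, 250.0]}
index = build_fragment_index(peptides)
print(count_shared_fragments(index, [100.0, 200.0, 300.0, 400.0]))
```

The index is built once over the whole database, so each spectrum only touches the bins its peaks fall into; the cost per spectrum then depends on the number of peaks rather than on the number of candidate peptides.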
Affiliation(s)
- Andy T. Kong
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
| | - Alexey I. Nesvizhskii
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
28
Abstract
In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allow a small number of PTM types in database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has a close sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. 
These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
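The growth in search space described in this abstract can be made concrete with a small counting sketch. It is not part of PIPI; the function names are hypothetical, and it assumes each residue is either unmodified or carries exactly one of its allowed PTM types.

```python
from math import comb, prod

def count_all_patterns(ptm_types_per_residue):
    """Number of PTM patterns when every residue may be modified.

    Each residue contributes a factor (1 + m): unmodified, or one of m PTM types.
    The product grows exponentially with the number of modifiable residues.
    """
    return prod(1 + m for m in ptm_types_per_residue)

def count_patterns_up_to(n_residues, ptm_types, max_modified):
    """Number of patterns with at most max_modified modified residues,
    assuming the same number of PTM types is allowed at every residue."""
    return sum(comb(n_residues, j) * ptm_types ** j
               for j in range(max_modified + 1))

# A 10-residue peptide with 3 candidate PTM types per residue.
print(count_all_patterns([3] * 10))    # 4**10 = 1048576 patterns
print(count_patterns_up_to(10, 3, 1))  # 1 + 10*3 = 31 patterns
```

The second count shows why limiting a search to at most one modified amino acid per peptide, as the abstract says of two of the compared tools, shrinks the enumeration so sharply.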
Affiliation(s)
- Fengchao Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Ning Li
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Weichuan Yu
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China; Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
29
Guthals A, Gan Y, Murray L, Chen Y, Stinson J, Nakamura G, Lill JR, Sandoval W, Bandeira N. De Novo MS/MS Sequencing of Native Human Antibodies. J Proteome Res 2016; 16:45-54. [PMID: 27779884 DOI: 10.1021/acs.jproteome.6b00608] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 11/28/2022]
Abstract
One direct route for the discovery of therapeutic human monoclonal antibodies (mAbs) involves the isolation of peripheral B cells from survivors/sero-positive individuals after exposure to an infectious reagent or disease etiology, followed by single-cell sequencing or hybridoma generation. Peripheral B cells, however, are not always easy to obtain and represent only a small percentage of the total B-cell population across all bodily tissues. Although it has been demonstrated that tandem mass spectrometry (MS/MS) techniques can interrogate the full polyclonal antibody (pAb) response to an antigen in vivo, all current approaches identify MS/MS spectra against databases derived from genetic sequencing of B cells from the same patient. In this proof-of-concept study, we demonstrate the feasibility of a novel MS/MS antibody discovery approach in which only serum antibodies are required without the need for sequencing of genetic material. Peripheral pAbs from a cytomegalovirus-exposed individual were purified by glycoprotein B antigen affinity and de novo sequenced from MS/MS data. Purely MS-derived mAbs were then manufactured in mammalian cells to validate potency via antigen-binding ELISA. Interestingly, we found that these mAbs accounted for 1 to 2% of total donor IgG but were not detected in parallel sequencing of memory B cells from the same patient.
Affiliation(s)
- Adrian Guthals
- Mapp Biopharmaceutical, Inc., 6160 Lusk Boulevard #C105, San Diego, California 92121, United States
| | - Yutian Gan
- Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
| | - Laura Murray
- Department of Protein Chemistry, Genentech, Inc., South San Francisco, California 94080, United States
| | - Yongmei Chen
- Department of Antibody Engineering, Genentech, Inc., South San Francisco, California 94080, United States
| | - Jeremy Stinson
- Department of Molecular Biology, Genentech, Inc., South San Francisco, California 94080, United States
| | - Gerald Nakamura
- Department of Antibody Engineering, Genentech, Inc., South San Francisco, California 94080, United States
| | - Jennie R Lill
- Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
| | - Wendy Sandoval
- Department of Proteomics & Biological Resources, Genentech, Inc., South San Francisco, California 94080, United States
| | - Nuno Bandeira
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0404, La Jolla, California 92093, United States; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, Mail Code 0657, La Jolla, California 92093, United States
30
Dereplication of peptidic natural products through database search of mass spectra. Nat Chem Biol 2016; 13:30-37. [PMID: 27820803 PMCID: PMC5409158 DOI: 10.1038/nchembio.2219] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9] [Received: 12/03/2015] [Accepted: 08/17/2016] [Indexed: 11/08/2022]
Abstract
Peptidic Natural Products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. While recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers utilize dereplication strategies that identify known PNPs and lead to the discovery of new ones even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enabled high-throughput PNP identification and that is compatible with large-scale mass spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.
31
Pap A, Medzihradszky KF, Darula Z. Using "spectral families" to assess the reproducibility of glycopeptide enrichment: human serum O-glycosylation revisited. Anal Bioanal Chem 2016; 409:539-550. [PMID: 27766363 DOI: 10.1007/s00216-016-9960-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/20/2016] [Revised: 09/02/2016] [Accepted: 09/19/2016] [Indexed: 11/30/2022]
Abstract
Growing evidence on the diverse biological roles of extracellular glycosylation as well as the need for quality control of protein pharmaceuticals make glycopeptide analysis both exciting and important again after a long hiatus. High-throughput O-glycosylation studies have to tackle the complexity of glycosylation as well as technical difficulties and, up to now, have yielded only limited results mostly from single enrichment experiments. In this study, we address the technical reproducibility of the characterization of the most prevalent O-glycosylation (mucin-type core 1 structures) in human serum, using a two-step lectin affinity-based workflow. Our results are based on automated glycopeptide identifications from higher-energy C-trap dissociation and electron transfer dissociation MS/MS data. Assignments meeting strict acceptance criteria served as the foundation for generating "spectral families" incorporating low-scoring MS/MS identifications, supported by accurate mass measurements and expected chromatographic retention times. We show that this approach helped to evaluate the reproducibility of the glycopeptide enrichment more reliably and also contributed to the expansion of the glycoform repertoire of already identified glycosylated sequences. The roadblocks hindering more in-depth investigations and quantitative analyses will also be discussed.
Affiliation(s)
- Adam Pap
- Laboratory of Proteomics Research, Biological Research Centre, Hungarian Academy of Sciences, Temesvari krt 62, 6726, Szeged, Hungary
| | - Katalin F Medzihradszky
- Laboratory of Proteomics Research, Biological Research Centre, Hungarian Academy of Sciences, Temesvari krt 62, 6726, Szeged, Hungary
- Department of Pharmaceutical Chemistry, School of Pharmacy, University of California San Francisco, 600 16th Street, Genentech Hall N474A, San Francisco, CA, 94158-2517, USA
| | - Zsuzsanna Darula
- Laboratory of Proteomics Research, Biological Research Centre, Hungarian Academy of Sciences, Temesvari krt 62, 6726, Szeged, Hungary
| |
|
32
|
Na S, Payne SH, Bandeira N. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Mol Cell Proteomics 2016; 15:3501-3512. [PMID: 27609420 PMCID: PMC5098046 DOI: 10.1074/mcp.o116.060913] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/10/2016] [Indexed: 11/25/2022]
Abstract
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Affiliation(s)
- Seungjin Na
- Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
- Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
| | - Samuel H Payne
- Pacific Northwest National Laboratory, Richland, Washington 99354
| | - Nuno Bandeira
- Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093
- Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, 92093
| |
|
33
|
Complete De Novo Assembly of Monoclonal Antibody Sequences. Sci Rep 2016; 6:31730. [PMID: 27562653 PMCID: PMC4999880 DOI: 10.1038/srep31730] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 04/18/2016] [Accepted: 07/20/2016] [Indexed: 11/25/2022]
Abstract
De novo protein sequencing is one of the key problems in mass spectrometry-based proteomics, especially for novel proteins such as monoclonal antibodies for which genome information is often limited or not available. However, due to limitations in peptide fragmentation and coverage, as well as ambiguities in spectra interpretation, complete de novo assembly of unknown protein sequences remains challenging. To address this problem, we propose an integrated system, ALPS, which for the first time can automatically assemble full-length monoclonal antibody sequences. Our system integrates de novo sequenced peptides, their quality scores, and error-correction information from databases into a weighted de Bruijn graph to assemble protein sequences. We evaluated ALPS performance on two antibody data sets, each including a heavy chain and a light chain. The results show that ALPS was able to assemble three complete monoclonal antibody sequences of length 216–441 AA, at 100% coverage, and 96.64–100% accuracy.
|
35
|
Wang M, Carver JJ, Phelan VV, Sanchez LM, Garg N, Peng Y, Nguyen DD, Watrous J, Kapono CA, Luzzatto-Knaan T, Porto C, Bouslimani A, Melnik AV, Meehan MJ, Liu WT, Crüsemann M, Boudreau PD, Esquenazi E, Sandoval-Calderón M, Kersten RD, Pace LA, Quinn RA, Duncan KR, Hsu CC, Floros DJ, Gavilan RG, Kleigrewe K, Northen T, Dutton RJ, Parrot D, Carlson EE, Aigle B, Michelsen CF, Jelsbak L, Sohlenkamp C, Pevzner P, Edlund A, McLean J, Piel J, Murphy BT, Gerwick L, Liaw CC, Yang YL, Humpf HU, Maansson M, Keyzers RA, Sims AC, Johnson AR, Sidebottom AM, Sedio BE, Klitgaard A, Larson CB, P. CAB, Torres-Mendoza D, Gonzalez DJ, Silva DB, Marques LM, Demarque DP, Pociute E, O'Neill EC, Briand E, Helfrich EJN, Granatosky EA, Glukhov E, Ryffel F, Houson H, Mohimani H, Kharbush JJ, Zeng Y, Vorholt JA, Kurita KL, Charusanti P, McPhail KL, Nielsen KF, Vuong L, Elfeki M, Traxler MF, Engene N, Koyama N, Vining OB, Baric R, Silva RR, Mascuch SJ, Tomasi S, Jenkins S, Macherla V, Hoffman T, Agarwal V, Williams PG, Dai J, Neupane R, Gurr J, Rodríguez AMC, Lamsa A, Zhang C, Dorrestein K, Duggan BM, Almaliti J, Allard PM, Phapale P, Nothias LF, Alexandrov T, Litaudon M, Wolfender JL, Kyle JE, Metz TO, Peryea T, Nguyen DT, VanLeer D, Shinn P, Jadhav A, Müller R, Waters KM, Shi W, Liu X, Zhang L, Knight R, Jensen PR, Palsson BO, Pogliano K, Linington RG, Gutiérrez M, Lopes NP, Gerwick WH, Moore BS, Dorrestein PC, Bandeira N. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat Biotechnol 2016; 34:828-837. [PMID: 27504778 PMCID: PMC5321674 DOI: 10.1038/nbt.3597] [Citation(s) in RCA: 2420] [Impact Index Per Article: 302.5] [Received: 08/11/2015] [Accepted: 05/10/2016] [Indexed: 12/14/2022]
Abstract
The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.
Affiliation(s)
- Mingxun Wang
- Computer Science and Engineering, UC San Diego, La Jolla, United States
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, United States
| | - Jeremy J Carver
- Computer Science and Engineering, UC San Diego, La Jolla, United States
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, United States
| | - Vanessa V Phelan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Laura M Sanchez
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Neha Garg
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Yao Peng
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Don Duy Nguyen
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Jeramie Watrous
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Clifford A Kapono
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Tal Luzzatto-Knaan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Carla Porto
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Amina Bouslimani
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Alexey V Melnik
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Michael J Meehan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Wei-Ting Liu
- Department of Microbiology and Immunology, Stanford University, Palo Alto, United States
| | - Max Crüsemann
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Paul D Boudreau
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | | | | | | | - Laura A Pace
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Robert A Quinn
- Biology Department, San Diego State University, San Diego, United States
| | - Katherine R Duncan
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
- Scottish Association for Marine Science, Scottish Marine Institute, Oban, United Kingdom
| | - Cheng-Chih Hsu
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Dimitrios J Floros
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Ronnie G Gavilan
- Center for Drug Discovery and Biodiversity, INDICASAT, City of Knowledge, Panama
| | - Karin Kleigrewe
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Trent Northen
- Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, United States
| | - Rachel J Dutton
- FAS Center for Systems Biology, Harvard, Cambridge, United States
| | - Delphine Parrot
- Produits naturels – Synthèses – Chimie Médicinale, University of Rennes 1, Rennes Cedex, France
| | - Erin E Carlson
- Chemistry, University of Minnesota, Minneapolis, United States
| | - Bertrand Aigle
- Dynamique des Génomes et Adaptation Microbienne, University of Lorraine, Vandœuvre-lès-Nancy, France
| | | | - Lars Jelsbak
- Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
| | - Christian Sohlenkamp
- Centro de Ciencias Genómicas, Universidad Nacional Autonoma de Mexico, Cuernavaca, Mexico
| | - Pavel Pevzner
- Computer Science and Engineering, UC San Diego, La Jolla, United States
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, United States
| | - Anna Edlund
- Microbial and Environmental Genomics, J. Craig Venter Institute, La Jolla, United States
- School of Dentistry, UC Los Angeles, Los Angeles, United States
| | - Jeffrey McLean
- School of Dentistry, UC Los Angeles, Los Angeles, United States
- Department of Periodontics, University of Washington, Seattle, United States
| | - Jörn Piel
- Institute of Microbiology, ETH Zurich, Zurich, Switzerland
| | - Brian T Murphy
- Department of Medicinal Chemistry and Pharmacognosy, University of Illinois Chicago, Chicago, United States
| | - Lena Gerwick
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Chih-Chuang Liaw
- Department of Marine Biotechnology and Resources, National Sun Yat-sen University, Kaohsiung, Taiwan
| | - Yu-Liang Yang
- Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
| | - Hans-Ulrich Humpf
- Institute of Food Chemistry, University of Münster, Münster, Germany
| | - Maria Maansson
- Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
| | - Robert A Keyzers
- School of Chemical & Physical Sciences, and Centre for Biodiscovery, Victoria University of Wellington, Wellington, New Zealand
| | - Amy C Sims
- Gillings School of Global Public Health, Department of Epidemiology, UNC Chapel Hill, Chapel Hill, United States
| | - Andrew R. Johnson
- Department of Chemistry, Indiana University, Bloomington, United States
| | - Brian E Sedio
- Center for Drug Discovery and Biodiversity, INDICASAT, City of Knowledge, Panama
- Smithsonian Tropical Research Institute, Ancón, Panama
| | - Andreas Klitgaard
- Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
| | - Charles B Larson
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Cristopher A Boya P.
- Center for Drug Discovery and Biodiversity, INDICASAT, City of Knowledge, Panama
| | - David J Gonzalez
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Denise B Silva
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
- Centro de Ciencias Biologicas e da Saude, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
| | - Lucas M Marques
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
| | - Daniel P Demarque
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
| | - Egle Pociute
- Sirenas Marine Discovery, San Diego, United States
| | - Ellis C O'Neill
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Enora Briand
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
- UMR CNRS 6553 ECOBIO, University of Rennes 1, Rennes Cedex, France
| | - Eve A Granatosky
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, United States
| | - Evgenia Glukhov
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Florian Ryffel
- Institute of Microbiology, ETH Zurich, Zurich, Switzerland
| | - Hosein Mohimani
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, United States
| | - Jenan J Kharbush
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Yi Zeng
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, United States
| | - Kenji L Kurita
- PBSci-Chemistry & Biochemistry Department, UC Santa Cruz, Santa Cruz, United States
| | - Pep Charusanti
- Department of Bioengineering, UC San Diego, La Jolla, United States
| | - Kerry L McPhail
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, United States
| | - Lisa Vuong
- Sirenas Marine Discovery, San Diego, United States
| | - Maryam Elfeki
- Department of Medicinal Chemistry and Pharmacognosy, University of Illinois Chicago, Chicago, United States
| | - Matthew F Traxler
- Department of Plant and Microbial Biology, UC Berkeley, Berkeley, United States
| | - Niclas Engene
- Department of Biological Sciences, Florida International University, Miami, United States
| | - Nobuhiro Koyama
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Oliver B Vining
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, United States
| | - Ralph Baric
- Gillings School of Global Public Health, Department of Epidemiology, UNC Chapel Hill, Chapel Hill, United States
| | - Ricardo R Silva
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
| | - Samantha J Mascuch
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Sophie Tomasi
- Produits naturels – Synthèses – Chimie Médicinale, University of Rennes 1, Rennes Cedex, France
| | - Stefan Jenkins
- Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, United States
| | - Thomas Hoffman
- Department of Pharmaceutical Biotechnology, Helmholtz Institute for Pharmaceutical Research Saarland, Saarbrücken, Germany
| | - Vinayak Agarwal
- Center for Oceans and Human Health, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Philip G Williams
- Department of Chemistry, University of Hawaii at Manoa, Honolulu, United States
| | - Jingqui Dai
- Department of Chemistry, University of Hawaii at Manoa, Honolulu, United States
| | - Ram Neupane
- Department of Chemistry, University of Hawaii at Manoa, Honolulu, United States
| | - Joshua Gurr
- Department of Chemistry, University of Hawaii at Manoa, Honolulu, United States
| | - Andrés M. C. Rodríguez
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
| | - Anne Lamsa
- Division of Biological Sciences, UC San Diego, La Jolla, United States
| | - Chen Zhang
- Department of Nanoengineering, UC San Diego, La Jolla, United States
| | - Kathleen Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Brendan M Duggan
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Jehad Almaliti
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Pierre-Marie Allard
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
| | - Prasad Phapale
- Structural and Computational Biology, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Louis-Felix Nothias
- Institut de Chimie des Substances Naturelles, CNRS-ICSN, UPR 2301, Labex CEBA, University of Paris-Saclay, Gif-sur-Yvette, France
| | - Theodore Alexandrov
- Structural and Computational Biology, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Marc Litaudon
- Institut de Chimie des Substances Naturelles, CNRS-ICSN, UPR 2301, Labex CEBA, University of Paris-Saclay, Gif-sur-Yvette, France
| | - Jean-Luc Wolfender
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
| | - Jennifer E Kyle
- Biological Sciences, Pacific Northwest National Laboratory, Richland, United States
| | - Thomas O Metz
- Biological Sciences, Pacific Northwest National Laboratory, Richland, United States
| | - Tyler Peryea
- National Center for Advancing Translational Sciences, National Institute of Health, Rockville, United States
| | - Dac-Trung Nguyen
- National Center for Advancing Translational Sciences, National Institute of Health, Rockville, United States
| | - Danielle VanLeer
- National Center for Advancing Translational Sciences, National Institute of Health, Rockville, United States
| | - Paul Shinn
- National Center for Advancing Translational Sciences, National Institute of Health, Rockville, United States
| | - Ajit Jadhav
- National Center for Advancing Translational Sciences, National Institute of Health, Rockville, United States
| | - Rolf Müller
- Department of Pharmaceutical Biotechnology, Helmholtz Institute for Pharmaceutical Research Saarland, Saarbrücken, Germany
| | - Katrina M Waters
- Biological Sciences, Pacific Northwest National Laboratory, Richland, United States
| | - Wenyuan Shi
- School of Dentistry, UC Los Angeles, Los Angeles, United States
| | - Xueting Liu
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
| | - Lixin Zhang
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
| | - Rob Knight
- Department of Pediatrics, UC San Diego, La Jolla, United States
| | - Paul R Jensen
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Kit Pogliano
- Division of Biological Sciences, UC San Diego, La Jolla, United States
| | - Roger G Linington
- PBSci-Chemistry & Biochemistry Department, UC Santa Cruz, Santa Cruz, United States
| | - Marcelino Gutiérrez
- Center for Drug Discovery and Biodiversity, INDICASAT, City of Knowledge, Panama
| | - Norberto P Lopes
- School of Pharmaceutical Sciences of Ribeirao Preto, University of São Paulo, São Paulo, Brazil
| | - William H Gerwick
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Bradley S Moore
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
- Center for Oceans and Human Health, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
- Center for Marine Biotechnology and Biomedicine, Scripps Institute of Oceanography, UC San Diego, La Jolla, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
| | - Nuno Bandeira
- Center for Computational Mass Spectrometry, UC San Diego, La Jolla, United States
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
- Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, United States
|
36
|
Nasir W, Toledo AG, Noborn F, Nilsson J, Wang M, Bandeira N, Larson G. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis. J Proteome Res 2016; 15:2826-40. [PMID: 27399812 DOI: 10.1021/acs.jproteome.6b00417] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 02/06/2023]
Abstract
Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans.
Affiliation(s)
- Waqas Nasir
- Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, SE 413 45 Gothenburg, Sweden
| | - Alejandro Gomez Toledo
- Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, SE 413 45 Gothenburg, Sweden
| | - Fredrik Noborn
- Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, SE 413 45 Gothenburg, Sweden
| | - Jonas Nilsson
- Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, SE 413 45 Gothenburg, Sweden
| | - Mingxun Wang
- Department of Computer Science and Engineering, Center for Computational Mass Spectrometry, CSE, and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
| | - Nuno Bandeira
- Department of Computer Science and Engineering, Center for Computational Mass Spectrometry, CSE, and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
| | - Göran Larson
- Department of Clinical Chemistry and Transfusion Medicine, Institute of Biomedicine, Sahlgrenska Academy at the University of Gothenburg, SE 413 45 Gothenburg, Sweden
|
37
|
Quinn RA, Phelan VV, Whiteson KL, Garg N, Bailey BA, Lim YW, Conrad DJ, Dorrestein PC, Rohwer FL. Microbial, host and xenobiotic diversity in the cystic fibrosis sputum metabolome. THE ISME JOURNAL 2016; 10:1483-98. [PMID: 26623545 PMCID: PMC5029181 DOI: 10.1038/ismej.2015.207] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Received: 05/19/2015] [Revised: 10/19/2015] [Accepted: 10/12/2015] [Indexed: 12/21/2022]
Abstract
Cystic fibrosis (CF) lungs are filled with thick mucus that obstructs airways and facilitates chronic infections. Pseudomonas aeruginosa is a significant pathogen of this disease that produces a variety of toxic small molecules. We used molecular networking-based metabolomics to investigate the chemistry of CF sputa and assess how the microbial molecules detected reflect the microbiome and clinical culture history of the patients. Metabolites detected included xenobiotics, P. aeruginosa specialized metabolites and host sphingolipids. The clinical culture and microbiome profiles did not correspond to the detection of P. aeruginosa metabolites in the same samples. The P. aeruginosa molecules that were detected in sputum did not match those from laboratory cultures. The pseudomonas quinolone signal (PQS) was readily detectable from cultured strains, but absent from sputum, even when its precursor molecules were present. The lack of PQS production in vivo is potentially due to the chemical nature of the CF lung environment, indicating that culture-based studies of this pathogen may not explain its behavior in the lung. The most differentially abundant molecules between CF and non-CF sputum were sphingolipids, including sphingomyelins, ceramides and lactosylceramide. As these highly abundant molecules contain the inflammatory mediator ceramide, they may have a significant role in CF hyperinflammation. This study demonstrates that the chemical makeup of CF sputum is a complex milieu of microbial, host and xenobiotic molecules. Detection of a bacterium by clinical culturing and 16S rRNA gene profiling does not necessarily reflect the active production of metabolites from that bacterium in a sputum sample.
Affiliation(s)
- Robert A Quinn
- Department of Biology, San Diego State University, San Diego, CA, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, USA
| | - Vanessa V Phelan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, USA
| | - Katrine L Whiteson
- Department of Molecular Biology and Biochemistry, University of California Irvine, Irvine, CA, USA
| | - Neha Garg
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, USA
| | - Barbara A Bailey
- Department of Mathematics and Statistics, San Diego State University, San Diego, CA, USA
| | - Yan Wei Lim
- Department of Biology, San Diego State University, San Diego, CA, USA
| | - Douglas J Conrad
- Department of Medicine, University of California at San Diego, La Jolla, CA, USA
| | - Pieter C Dorrestein
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California at San Diego, La Jolla, CA, USA
| | - Forest L Rohwer
- Department of Biology, San Diego State University, San Diego, CA, USA
|
38
|
Griss J. Spectral library searching in proteomics. Proteomics 2016; 16:729-40. [PMID: 26616598 DOI: 10.1002/pmic.201500296] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 07/16/2015] [Revised: 10/15/2015] [Accepted: 10/29/2015] [Indexed: 12/12/2022]
Abstract
Spectral library searching has become a mature method to identify tandem mass spectra in proteomics data analysis. This review provides a comprehensive overview of available spectral library search engines and highlights their distinct features. Additionally, resources providing spectral libraries are summarized and tools presented that extend experimental spectral libraries by simulating spectra. Finally, spectrum clustering algorithms are discussed that utilize the same spectrum-to-spectrum matching algorithms as spectral library search engines and allow novel methods to analyse proteomics data.
Affiliation(s)
- Johannes Griss
- Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Austria
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|
39
|
Mohimani H, Pevzner PA. Dereplication, sequencing and identification of peptidic natural products: from genome mining to peptidogenomics to spectral networks. Nat Prod Rep 2016; 33:73-86. [PMID: 26497201 PMCID: PMC5590107 DOI: 10.1039/c5np00050e] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Indexed: 01/23/2023]
Abstract
Covering: 2000 to 2015. While recent breakthroughs in the discovery of peptide antibiotics and other Peptidic Natural Products (PNPs) raise a challenge for developing new algorithms for their analyses, the computational technologies for high-throughput PNP discovery are still lacking. We discuss the computational bottlenecks in analyzing PNPs and review recent advances in genome mining, peptidogenomics, and spectral networks that are now enabling the discovery of new PNPs via mass spectrometry. We further describe the connections between these advances and the new generation of software tools for PNP dereplication, de novo sequencing, and identification.
Affiliation(s)
- Hosein Mohimani
- Department of Computer Science and Engineering, University of California, San Diego, USA.
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, USA.
|
40
|
Abstract
Mass spectrometry-based proteomics provides a powerful tool for large-scale analysis of protein modifications. Statistical and computational analysis of mass spectrometry data is a key step in protein modification identification. This chapter presents common and advanced data analysis strategies for modification identification, including variable modification search, unrestrictive approaches for modification discovery, false discovery rate estimation and control methods, and tools for modification site localization.
Affiliation(s)
- Yan Fu
- National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Zhongguancun East Road 55, Beijing, 100190, China.
|
41
|
Chi H, He K, Yang B, Chen Z, Sun RX, Fan SB, Zhang K, Liu C, Yuan ZF, Wang QH, Liu SQ, Dong MQ, He SM. Reprint of "pFind-Alioth: A novel unrestricted database search algorithm to improve the interpretation of high-resolution MS/MS data". J Proteomics 2015; 129:33-41. [PMID: 26232248 DOI: 10.1016/j.jprot.2015.07.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/02/2015] [Revised: 05/04/2015] [Accepted: 05/10/2015] [Indexed: 01/23/2023]
Abstract
Database search is the dominant approach in high-throughput proteomic analysis. However, the interpretation rate of MS/MS spectra is very low in such a restricted mode, which is mainly due to unexpected modifications and irregular digestion types. In this study, we developed a new algorithm called Alioth, to be integrated into the search engine of pFind, for fast and accurate unrestricted database search on high-resolution MS/MS data. An ion index is constructed for both peptide precursors and fragment ions, by which arbitrary digestions and a single site of any modifications and mutations can be searched efficiently. A new re-ranking algorithm is used to distinguish the correct peptide-spectrum matches from random ones. The algorithm is tested on several HCD datasets and the interpretation rate of MS/MS spectra using Alioth is as high as 60%-80%. Peptides from semi- and non-specific digestions, as well as those with unexpected modifications or mutations, can be effectively identified using Alioth and confidently validated using other search engines. The average processing speed of Alioth is 5-10 times faster than some other unrestricted search engines and is comparable to or even faster than the restricted search algorithms tested. This article is part of a Special Issue entitled: Computational Proteomics.
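The fragment ion index mentioned in this abstract can be pictured as an inverted index from binned fragment m/z values to candidate peptides. The sketch below is only an illustration of that general idea, not Alioth's actual data structure: the function names, the 0.02 Da bin width, the restriction to singly charged b ions, and the small residue-mass table are all assumptions made for this example.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for a few amino acids; illustrative subset only.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "V": 99.06841, "L": 113.08406, "K": 128.09496,
}
PROTON = 1.007276   # mass of a proton (Da)
BIN_WIDTH = 0.02    # hypothetical bin width in Da


def b_ions(peptide):
    """Return singly charged b-ion m/z values for a peptide string."""
    total = PROTON
    ions = []
    for residue in peptide[:-1]:  # the full-length b ion is omitted
        total += RESIDUE_MASS[residue]
        ions.append(total)
    return ions


def build_index(peptides):
    """Map each m/z bin to the set of peptide ids with a b ion in that bin."""
    index = defaultdict(set)
    for pid, peptide in enumerate(peptides):
        for mz in b_ions(peptide):
            index[round(mz / BIN_WIDTH)].add(pid)
    return index


def count_matches(index, peaks):
    """Count, per peptide id, how many query peaks hit an indexed fragment."""
    hits = defaultdict(int)
    for mz in peaks:
        center = round(mz / BIN_WIDTH)
        matched = set()
        for bin_id in (center - 1, center, center + 1):  # tolerate bin edges
            matched |= index.get(bin_id, set())
        for pid in matched:
            hits[pid] += 1
    return dict(hits)
```

With such an index, each observed peak is looked up directly instead of being compared against every candidate peptide, which is the kind of shortcut that makes an unrestricted search tractable.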
Affiliation(s)
- Hao Chi
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Kun He
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Bing Yang
- National Institute of Biological Sciences, Beijing, Beijing 102206, China
| | - Zhen Chen
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
| | - Rui-Xiang Sun
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Sheng-Bo Fan
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Kun Zhang
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Chao Liu
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Zuo-Fei Yuan
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
| | - Quan-Hui Wang
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
| | - Si-Qi Liu
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
| | - Meng-Qiu Dong
- National Institute of Biological Sciences, Beijing, Beijing 102206, China
| | - Si-Min He
- Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China.
|
42
|
Computational and statistical methods for high-throughput analysis of post-translational modifications of proteins. J Proteomics 2015. [PMID: 26216596 DOI: 10.1016/j.jprot.2015.07.016] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022]
Abstract
The investigation of post-translational modifications (PTMs) represents one of the main research focuses for the study of protein function and cell signaling. Mass spectrometry instrumentation with increasing sensitivity, improved protocols for PTM enrichment, and recently established pipelines for high-throughput experiments allow large-scale identification and quantification of several PTM types. This review addresses the concurrently emerging challenges for the computational analysis of the resulting data and presents PTM-centered approaches for spectra identification, statistical analysis, multivariate analysis and data interpretation. We furthermore discuss the potential of future developments that will help to gain deep insight into the PTM-ome and its biological role in cells. This article is part of a Special Issue entitled: Computational Proteomics.
|
43
|
Xu T, Park SK, Venable JD, Wohlschlegel JA, Diedrich JK, Cociorva D, Lu B, Liao L, Hewel J, Han X, Wong CCL, Fonslow B, Delahunty C, Gao Y, Shah H, Yates JR. ProLuCID: An improved SEQUEST-like algorithm with enhanced sensitivity and specificity. J Proteomics 2015; 129:16-24. [PMID: 26171723 DOI: 10.1016/j.jprot.2015.07.001] [Citation(s) in RCA: 353] [Impact Index Per Article: 39.2] [Received: 04/09/2015] [Revised: 06/08/2015] [Accepted: 07/04/2015] [Indexed: 12/25/2022]
Abstract
ProLuCID, a new algorithm for peptide identification using tandem mass spectrometry and protein sequence databases, has been developed. This algorithm uses a three-tier scoring scheme. First, a binomial probability is used as a preliminary scoring scheme to select candidate peptides. The binomial probability scores generated by ProLuCID minimize molecular weight bias and are independent of database size. A modified cross-correlation score is calculated for each candidate peptide identified by the binomial probability. This cross-correlation scoring function models the isotopic distributions of fragment ions of candidate peptides, which ultimately results in higher sensitivity and specificity than that obtained with the SEQUEST XCorr. Finally, ProLuCID uses the distribution of XCorr values for all of the selected candidate peptides to compute a Z score for the peptide hit with the highest XCorr. The ProLuCID Z score combines the discriminative power of XCorr and DeltaCN, the standard parameters for assessing the quality of the peptide identification using SEQUEST, and displays significant improvement in specificity over ProLuCID XCorr alone. ProLuCID is also able to take advantage of high-resolution MS/MS spectra, leading to further improvements in specificity when compared to low-resolution tandem MS data. A comparison of filtered data searched with SEQUEST and ProLuCID, using the same false discovery rate as estimated by a target-decoy database strategy, shows that ProLuCID was able to identify as many as 25% more proteins than SEQUEST. ProLuCID is implemented in Java and can be easily installed on a single computer or a computer cluster. This article is part of a Special Issue entitled: Computational Proteomics.
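Two of the scoring ideas named in this abstract, a binomial probability for preliminary candidate selection and a Z score for the top-ranked XCorr, can be illustrated with a short sketch. This is not ProLuCID's implementation (which the abstract says is written in Java): the function names are hypothetical, the per-peak match probability is treated as a given input, and computing the Z score against the remaining candidates (excluding the top hit) is an assumption made here for illustration.

```python
import math
from statistics import mean, pstdev


def binomial_tail(n_peaks, n_matched, p_match):
    """P(X >= n_matched) for X ~ Binomial(n_peaks, p_match).

    A small value means that many matched peaks are unlikely to arise by chance.
    """
    return sum(
        math.comb(n_peaks, k) * p_match**k * (1 - p_match) ** (n_peaks - k)
        for k in range(n_matched, n_peaks + 1)
    )


def top_hit_z_score(xcorrs):
    """Z score of the highest XCorr relative to the other candidates' XCorrs."""
    if len(xcorrs) < 3:
        raise ValueError("need at least three candidate scores")
    ranked = sorted(xcorrs, reverse=True)
    best, rest = ranked[0], ranked[1:]
    spread = pstdev(rest)  # population standard deviation of the other scores
    if spread == 0:
        raise ValueError("remaining scores have zero spread")
    return (best - mean(rest)) / spread
```

A top hit that stands far above the bulk of candidate scores receives a large Z score, which captures in one number the separation that XCorr and DeltaCN describe together.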
Affiliation(s)
- T Xu
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA; Dow AgroSciences LLC, Indianapolis, IN 46268, USA
| | - S K Park
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - J D Venable
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - J A Wohlschlegel
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - J K Diedrich
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - D Cociorva
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - B Lu
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - L Liao
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - J Hewel
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - X Han
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - C C L Wong
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - B Fonslow
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - C Delahunty
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - Y Gao
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - H Shah
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA
| | - J R Yates
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, SR11, La Jolla, CA 92037, USA.
|
44
|
Chi H, He K, Yang B, Chen Z, Sun RX, Fan SB, Zhang K, Liu C, Yuan ZF, Wang QH, Liu SQ, Dong MQ, He SM. pFind-Alioth: A novel unrestricted database search algorithm to improve the interpretation of high-resolution MS/MS data. J Proteomics 2015; 125:89-97. [PMID: 25979774 DOI: 10.1016/j.jprot.2015.05.009] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 03/02/2015] [Revised: 05/04/2015] [Accepted: 05/10/2015] [Indexed: 10/23/2022]
Abstract
Database search is the dominant approach in high-throughput proteomic analysis. However, the interpretation rate of MS/MS spectra is very low in such a restricted mode, which is mainly due to unexpected modifications and irregular digestion types. In this study, we developed a new algorithm called Alioth, to be integrated into the search engine of pFind, for fast and accurate unrestricted database search on high-resolution MS/MS data. An ion index is constructed for both peptide precursors and fragment ions, by which arbitrary digestions and a single site of any modifications and mutations can be searched efficiently. A new re-ranking algorithm is used to distinguish the correct peptide-spectrum matches from random ones. The algorithm is tested on several HCD datasets and the interpretation rate of MS/MS spectra using Alioth is as high as 60%-80%. Peptides from semi- and non-specific digestions, as well as those with unexpected modifications or mutations, can be effectively identified using Alioth and confidently validated using other search engines. The average processing speed of Alioth is 5-10 times faster than some other unrestricted search engines and is comparable to or even faster than the restricted search algorithms tested.
Affiliation(s)
- Hao Chi
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Kun He
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Bing Yang
  - National Institute of Biological Sciences, Beijing, Beijing 102206, China
- Zhen Chen
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
- Rui-Xiang Sun
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Sheng-Bo Fan
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Kun Zhang
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Chao Liu
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Zuo-Fei Yuan
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Quan-Hui Wang
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
- Si-Qi Liu
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
- Meng-Qiu Dong
  - National Institute of Biological Sciences, Beijing, Beijing 102206, China
- Si-Min He
  - Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
|
45
|
Guthals A, Boucher C, Bandeira N. The generating function approach for Peptide identification in spectral networks. J Comput Biol 2015; 22:353-66. [PMID: 25423621 PMCID: PMC4425220 DOI: 10.1089/cmb.2014.0165] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022] Open
Abstract
Tandem mass (MS/MS) spectrometry has become the method of choice for protein identification and has launched a quest for the identification of every translated protein and peptide. However, computational developments have lagged behind the pace of modern data acquisition protocols and have become a major bottleneck in proteomics analysis of complex samples. As it stands today, attempts to identify MS/MS spectra against large databases (e.g., the human microbiome or 6-frame translation of the human genome) face a search space that is 10-100 times larger than the human proteome, where it becomes increasingly challenging to separate between true and false peptide matches. As a result, the sensitivity of current state-of-the-art database search methods drops by nearly 38% to such low identification rates that almost 90% of all MS/MS spectra are left as unidentified. We address this problem by extending the generating function approach to rigorously compute the joint spectral probability of multiple spectra being matched to peptides with overlapping sequences, thus enabling the confident assignment of higher significance to overlapping peptide-spectrum matches (PSMs). We find that these joint spectral probabilities can be several orders of magnitude more significant than individual PSMs, even in the ideal case when perfect separation between signal and noise peaks could be achieved per individual MS/MS spectrum. After benchmarking this approach on a typical lysate MS/MS dataset, we show that the proposed intersecting spectral probabilities for spectra from overlapping peptides improve peptide identification by 30-62%.
Affiliation(s)
- Adrian Guthals
  - Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
- Christina Boucher
  - Department of Computer Science, Colorado State University, Fort Collins, Colorado
- Nuno Bandeira
  - Department of Computer Science and Engineering, University of California–San Diego, La Jolla, California
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California–San Diego, La Jolla, California
|
46
|
Leon DR, Ytterberg AJ, Boontheung P, Kim U, Loo JA, Gunsalus RP, Ogorzalek Loo RR. Mining proteomic data to expose protein modifications in Methanosarcina mazei strain Gö1. Front Microbiol 2015; 6:149. [PMID: 25798134 PMCID: PMC4350412 DOI: 10.3389/fmicb.2015.00149] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/01/2014] [Accepted: 02/09/2015] [Indexed: 12/11/2022] Open
Abstract
Proteomic tools identify constituents of complex mixtures, often delivering long lists of identified proteins. The high-throughput methods excel at matching tandem mass spectrometry data to spectra predicted from sequence databases. Unassigned mass spectra are ignored, but could, in principle, provide valuable information on unanticipated modifications and improve protein annotations while consuming limited quantities of material. Strategies to "mine" information from these discards are presented, along with discussion of features that, when present, provide strong support for modifications. In this study we mined LC-MS/MS datasets of proteolytically-digested concanavalin A pull down fractions from Methanosarcina mazei Gö1 cell lysates. Analyses identified 154 proteins. Many of the observed proteins displayed post-translationally modified forms, including O-formylated and methyl-esterified segments that appear biologically relevant (i.e., not artifacts of sample handling). Interesting cleavages and modifications (e.g., S-cyanylation and trimethylation) were observed near catalytic sites of methanogenesis enzymes. Of 31 Methanosarcina protein N-termini recovered by concanavalin A binding or from a previous study, only M. mazei S-layer protein MM1976 and its M. acetivorans C2A orthologue, MA0829, underwent signal peptide excision. Experimental results contrast with predictions from algorithms SignalP 3.0 and Exprot, which were found to over-predict the presence of signal peptides. Proteins MM0002, MM0716, MM1364, and MM1976 were found to be glycosylated, and employing chromatography tailored specifically for glycopeptides will likely reveal more. This study supplements limited, existing experimental datasets of mature archaeal N-termini, including presence or absence of signal peptides, translation initiation sites, and other processing. Methanosarcina surface and membrane proteins are richly modified.
Affiliation(s)
- Deborah R Leon
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- A Jimmy Ytterberg
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- Pinmanee Boontheung
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
- Unmi Kim
  - Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, CA, USA
- Joseph A Loo
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, USA
- Robert P Gunsalus
  - Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, CA, USA
  - UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, USA
- Rachel R Ogorzalek Loo
  - Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
  - UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, USA
|
47
|
Dallas DC, Guerrero A, Parker EA, Robinson RC, Gan J, German JB, Barile D, Lebrilla CB. Current peptidomics: applications, purification, identification, quantification, and functional analysis. Proteomics 2015; 15:1026-38. [PMID: 25429922 PMCID: PMC4371869 DOI: 10.1002/pmic.201400310] [Citation(s) in RCA: 160] [Impact Index Per Article: 17.8] [Received: 07/06/2014] [Revised: 10/08/2014] [Accepted: 11/24/2014] [Indexed: 12/28/2022]
Abstract
Peptidomics is an emerging field branching from proteomics that targets endogenously produced protein fragments. Endogenous peptides are often functional within the body and can be both beneficial and detrimental. This review covers the use of peptidomics in understanding digestion, and identifying functional peptides and biomarkers. Various techniques for peptide and glycopeptide extraction, both at analytical and preparative scales, and available options for peptide detection with MS are discussed. Current algorithms for peptide sequence determination, and both analytical and computational techniques for quantification are compared. Techniques for statistical analysis, sequence mapping, enzyme prediction, and peptide function and structure prediction are explored.
Affiliation(s)
- David C. Dallas
  - Department of Food Science and Technology, University of California, Davis, One Shields Avenue, Davis, CA, USA
  - Foods for Health Institute, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Andres Guerrero
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Evan A. Parker
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Randall C. Robinson
  - Department of Food Science and Technology, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Junai Gan
  - Department of Food Science and Technology, University of California, Davis, One Shields Avenue, Davis, CA, USA
- J. Bruce German
  - Department of Food Science and Technology, University of California, Davis, One Shields Avenue, Davis, CA, USA
  - Foods for Health Institute, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Daniela Barile
  - Department of Food Science and Technology, University of California, Davis, One Shields Avenue, Davis, CA, USA
  - Foods for Health Institute, University of California, Davis, One Shields Avenue, Davis, CA, USA
- Carlito B. Lebrilla
  - Foods for Health Institute, University of California, Davis, One Shields Avenue, Davis, CA, USA
  - Department of Chemistry, University of California, Davis, One Shields Avenue, Davis, CA, USA
|
48
|
Na S, Paek E. Software eyes for protein post-translational modifications. MASS SPECTROMETRY REVIEWS 2015; 34:133-147. [PMID: 24889695 DOI: 10.1002/mas.21425] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 06/26/2012] [Revised: 07/18/2013] [Accepted: 11/20/2013] [Indexed: 06/03/2023]
Abstract
Post-translational modifications (PTMs) are critical to almost all aspects of complex processes of the cell. Identification of PTMs is one of the biggest challenges for proteomics, and there have been many computational studies for the analysis of PTMs from tandem mass spectrometry (MS/MS). Most early PTM identification studies have been performed by matching MS/MS data to protein databases, using database search tools, but they are prohibitively slow when a large number of PTMs is given as a search parameter. In this article, we present recent developments to search for more types of PTMs and to speed up the search, and discuss many computational issues and solutions in terms of identifying multiply modified peptides or searching for all possible modifications at once in unrestrictive mode. Apart from the most common type of PTMs involving covalent addition of functional groups to proteins, PTMs such as disulfide linkage require dedicated software for the analysis because they may involve cross-linking between two different parts of proteins. Finally, methods for identification of protein disulfide bonds are presented.
Affiliation(s)
- Seungjin Na
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, 92093
  - Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, CA, 92093
|
49
|
Medzihradszky KF, Chalkley RJ. Lessons in de novo peptide sequencing by tandem mass spectrometry. MASS SPECTROMETRY REVIEWS 2015; 34:43-63. [PMID: 25667941 PMCID: PMC4367481 DOI: 10.1002/mas.21406] [Citation(s) in RCA: 137] [Impact Index Per Article: 15.2] [Indexed: 05/03/2023]
Abstract
Mass spectrometry has become the method of choice for the qualitative and quantitative characterization of protein mixtures isolated from all kinds of living organisms. The raw data in these studies are MS/MS spectra, usually of peptides produced by proteolytic digestion of a protein. These spectra are "translated" into peptide sequences, normally with the help of various search engines. Data acquisition and interpretation have both been automated, and most researchers look only at the summary of the identifications without ever viewing the underlying raw data used for assignments. Automated analysis of data is essential due to the volume produced. However, being familiar with the finer intricacies of peptide fragmentation processes, and experiencing the difficulties of manual data interpretation allow a researcher to be able to more critically evaluate key results, particularly because there are many known rules of peptide fragmentation that are not incorporated into search engine scoring. Since the most commonly used MS/MS activation method is collision-induced dissociation (CID), in this article we present a brief review of the history of peptide CID analysis. Next, we provide a detailed tutorial on how to determine peptide sequences from CID data. Although the focus of the tutorial is de novo sequencing, the lessons learned and resources supplied are useful for data interpretation in general.
|
50
|
Mohimani H, Liu WT, Kersten R, Moore BS, Dorrestein PC, Pevzner PA. NRPquest: Coupling Mass Spectrometry and Genome Mining for Nonribosomal Peptide Discovery. JOURNAL OF NATURAL PRODUCTS 2014; 77:1902-9. [PMID: 25116163 PMCID: PMC4143176 DOI: 10.1021/np500370c] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Received: 04/29/2014] [Indexed: 05/31/2023]
Abstract
Nonribosomal peptides (NRPs) such as vancomycin and daptomycin are among the most effective antibiotics. While NRPs are biomedically important, the computational techniques for sequencing these peptides are still in their infancy. The recent emergence of mass spectrometry techniques for NRP analysis (capable of sequencing an NRP from small amounts of nonpurified material) revealed an enormous diversity of NRPs. However, as many NRPs have nonlinear structure (e.g., cyclic or branched-cyclic peptides), the standard de novo sequencing tools (developed for linear peptides) are not applicable to NRP analysis. Here, we introduce the first NRP identification algorithm, NRPquest, that performs mutation-tolerant and modification-tolerant searches of spectral data sets against a database of putative NRPs. In contrast to previous studies aimed at NRP discovery (that usually report very few NRPs), NRPquest revealed nearly a hundred NRPs (including unknown variants of previously known peptides) in a single study. This result indicates that NRPquest can potentially make MS-based NRP identification as robust as the identification of linear peptides in traditional proteomics.
Affiliation(s)
- Hosein Mohimani
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- Wei-Ting Liu
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- Roland D. Kersten
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- Bradley S. Moore
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- Pieter C. Dorrestein
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
- Pavel A. Pevzner
  - Department of Electrical and Computer Engineering, Department of Chemistry and Biochemistry, Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, United States
|