1
McDonnell KJ. Operationalizing Team Science at the Academic Cancer Center Network to Unveil the Structure and Function of the Gut Microbiome. J Clin Med 2025; 14:2040. [PMID: 40142848 PMCID: PMC11943358 DOI: 10.3390/jcm14062040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 02/28/2025] [Accepted: 03/05/2025] [Indexed: 03/28/2025] Open access
Abstract
Oncologists increasingly recognize the microbiome as an important facilitator of health as well as a contributor to disease, including, specifically, cancer. Our knowledge of the etiologies, mechanisms, and modulation of microbiome states that ameliorate or promote cancer continues to evolve. The progressive refinement and adoption of "omic" technologies (genomics, transcriptomics, proteomics, and metabolomics) and utilization of advanced computational methods accelerate this evolution. The academic cancer center network, with its immediate access to extensive, multidisciplinary expertise and scientific resources, has the potential to catalyze microbiome research. Here, we review our current understanding of the role of the gut microbiome in cancer prevention, predisposition, and response to therapy. We underscore the promise of operationalizing the academic cancer center network to uncover the structure and function of the gut microbiome; we highlight the unique microbiome-related expert resources available at the City of Hope Comprehensive Cancer Center as an example of the potential of team science to achieve novel scientific and clinical discovery.
Affiliation(s)
- Kevin J McDonnell
  - Center for Precision Medicine, Department of Medical Oncology & Therapeutics Research, City of Hope Comprehensive Cancer Center, Duarte, CA 91010, USA
2
Poudel S, Yuan ZF, Fu Y, Wu L, Shrestha H, High AA, Peng J, Wang X. JUMPlib: Integrative Search Tool Combining Fragment Ion Indexing with Comprehensive TMT Spectral Libraries. J Proteome Res 2025; 24:410-418. [PMID: 39715016 DOI: 10.1021/acs.jproteome.4c00410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2024]
Abstract
The identification of peptides is a cornerstone of mass spectrometry-based proteomics. Spectral library-based algorithms are well-established methods to enhance the identification efficiency of peptides during database searches in proteomics. However, these algorithms are not specifically tailored for tandem mass tag (TMT)-based proteomics due to the lack of high-quality TMT spectral libraries. Here, we introduce JUMPlib, a computational tool for a TMT-based spectral library search. JUMPlib comprises components for generating spectral libraries, conducting library searches, filtering peptide identifications, and quantifying peptides and proteins. Fragment ion indexing in the libraries increases the search speed and utilizing the experimental retention time of precursor ions improves peptide identification. We found that methionine oxidation is a major factor contributing to large shifts in peptide retention time. To test the JUMPlib program, we curated two comprehensive human libraries for the labeling of TMT6/10/11 and TMT16/18 reagents, with ∼286,000 precursor ions and ∼304,000 precursor ions, respectively. Our analyses demonstrate that JUMPlib, employing the fragment ion index strategy, enhances search speed and exhibits high sensitivity and specificity, achieving approximately a 25% increase in peptide-spectrum matches compared to other search tools. Overall, JUMPlib serves as a streamlined computational platform designed to enhance peptide identification in TMT-based proteomics. Both the JUMPlib source code and libraries are publicly available.
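The fragment ion indexing idea mentioned in this abstract can be illustrated with a minimal sketch. This is not JUMPlib's implementation: the function names, the 0.02 m/z bin width, and the toy library entries are all assumptions made for illustration. The sketch builds an inverted index from binned fragment m/z values to library entries, so a query spectrum only touches entries that share at least one fragment bin instead of being compared against the whole library.

```python
import math
from collections import defaultdict


def build_fragment_index(library, bin_width=0.02):
    """Map each fragment m/z bin to the set of library entry ids that contain it.

    `library` is a hypothetical dict: entry id -> list of fragment m/z values.
    """
    index = defaultdict(set)
    for entry_id, fragment_mzs in library.items():
        for mz in fragment_mzs:
            index[math.floor(mz / bin_width)].add(entry_id)
    return index


def count_shared_fragments(index, query_mzs, bin_width=0.02):
    """Count, per library entry, how many query peaks hit one of its fragment bins."""
    counts = defaultdict(int)
    for mz in query_mzs:
        center = math.floor(mz / bin_width)
        # Look at the neighbouring bins too, so peaks near a bin edge are not missed.
        candidates = set()
        for b in (center - 1, center, center + 1):
            candidates |= index.get(b, set())
        for entry_id in candidates:
            counts[entry_id] += 1
    return dict(counts)


# Toy library with made-up fragment m/z values.
toy_library = {
    "PEPTIDE_A": [175.119, 304.161, 417.245],
    "PEPTIDE_B": [147.113, 304.161, 520.300],
}
toy_index = build_fragment_index(toy_library)
shared = count_shared_fragments(toy_index, [175.120, 304.160, 999.0])
```

Entries with the highest shared-fragment counts would then be passed on to a full similarity score; how candidates are ranked and scored in practice is outside this sketch.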
Affiliation(s)
- Suresh Poudel
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Zuo-Fei Yuan
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Yingxue Fu
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Long Wu
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Him Shrestha
  - Department of Structural Biology, and Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Anthony A High
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Junmin Peng
  - Department of Structural Biology, and Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, United States
- Xusheng Wang
  - Department of Neurology, University of Tennessee Health Science Center, Memphis, Tennessee 38103, United States
  - Department of Genetics, Genomics & Informatics, University of Tennessee Health Science Center, Memphis, Tennessee 38103, United States
3
Sőregi P, Zwillinger M, Vágó L, Csékei M, Kotschy A. High density information storage through isotope ratio encoding. Chem Sci 2024:d4sc03519d. [PMID: 39246345 PMCID: PMC11376023 DOI: 10.1039/d4sc03519d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 08/19/2024] [Indexed: 09/10/2024] Open access
Abstract
The need for reliable information storage is on a steep rise. Sequence-defined polymers, particularly oligonucleotides, are already in use in several areas, while compound mixtures also offer a simple way for storing information. We investigated the use of a set of isotopologues in information storage by mixing, where the information is stored in the form of a mass spectrometric (MS) fingerprint of the mixture. A small molecule with 24 non-labile and replaceable hydrogen atoms was selected as a model, and a set of components covering the D0-D24 deuteration range were synthesized. Theoretical analysis predicted that by mixing up to 10 out of the prepared components, one can encode over 130 million different combinations and distinguish their MS fingerprints. As a proof of principle, several mixtures predicted to have similar fingerprints were prepared and their MS fingerprints were recorded. From each measured MS fingerprint, we were able to unambiguously identify the actual composition of the mixture. It was also demonstrated that one can make the MS fingerprints of a given mixture unique, thereby making counterfeiting of the stored information very difficult. Finally, the utility of isotope ratio encoding in covalent tagging was also demonstrated.
Affiliation(s)
- Petra Sőregi
  - Servier Research Institute of Medicinal Chemistry, Záhony utca 7, 1031 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Márton Zwillinger
  - Servier Research Institute of Medicinal Chemistry, Záhony utca 7, 1031 Budapest, Hungary
- Lajos Vágó
  - Kastély u. 49/A, 2045 Törökbálint, Hungary
- Márton Csékei
  - Servier Research Institute of Medicinal Chemistry, Záhony utca 7, 1031 Budapest, Hungary
- Andras Kotschy
  - Servier Research Institute of Medicinal Chemistry, Záhony utca 7, 1031 Budapest, Hungary
4
Ji C, Miao J, Zhao N, Dai Y, Yang J, Qu J, Zhu J, Zhao M. N-nitrosamines induced gender-dimorphic effects on infant rats at environmental levels. Sci Total Environ 2024; 912:169196. [PMID: 38097075 DOI: 10.1016/j.scitotenv.2023.169196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 11/22/2023] [Accepted: 12/06/2023] [Indexed: 12/21/2023]
Abstract
The safety of drinking water has always been a concern for people all over the world. N-nitrosamines (NAs), a class of nitrogenous disinfection by-products (N-DBPs), are generally detected as a mixture in drinking water at home and abroad. Studies have shown that individual NAs are strongly carcinogenic at high concentrations. However, the health risks of NAs at environmental levels (concentrations in drinking water) are still unclear. Therefore, the potential health risks of environmentally relevant NAs exposure in drinking water need to be assessed. In this study, blood biochemical analysis and metabolomics based on nuclear magnetic resonance (NMR) were performed to comprehensively investigate NAs-induced metabolic disturbance in infant rats at environmental levels. Results of blood biochemical indices analysis indicated that AST in the serum of male rats in the NAs-treated group exhibited a significant gender-specific difference. Multivariate statistics showed that two and eight significantly disturbed metabolic pathways were identified in the serum samples of NAs-treated male and female rats, respectively. In the urine samples of NAs-treated female rats, the glycine, serine, and threonine metabolism pathway was significantly disturbed, while three significantly disturbed metabolic pathways were found in the urine of NAs-treated male rats. Finally, Spearman correlation coefficients suggested that the disturbances of the metabolic profile in serum and urine were correlated with changes in the gut microbiota (data derived from our published paper). The data presented here aim to generate new health risk data on NAs mixture exposure at environmental levels and provide theoretical support for drinking water safety management. ENVIRONMENTAL IMPLICATION: N-nitrosamines (NAs) are a class of nitrogenous disinfection by-products (N-DBPs) generated during drinking water disinfection processes. Herein, the health risks of NAs at environmental levels (concentrations in drinking water) are investigated using blood biochemical analysis and nuclear magnetic resonance (NMR)-based metabolomics. Results confirmed that NAs induced gender-specific effects on metabolism in rats and that the disturbances of the metabolic profile in serum and urine were correlated with changes in the gut microbiota. The data presented here aim to generate new health risk data on NAs mixture exposure at environmental levels and provide theoretical support for drinking water safety management.
Affiliation(s)
- Chenyang Ji
  - Key Laboratory of Pollution Exposure and Health Intervention of Zhejiang Province, Interdisciplinary Research Academy, Zhejiang Shuren University, Hangzhou 310015, China
- Jiahui Miao
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
- Nan Zhao
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
- Yaoyao Dai
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
- Jiawen Yang
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
- Jianli Qu
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
- Jianqiang Zhu
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
  - College of Life Science, Taizhou University, Taizhou 318000, PR China
- Meirong Zhao
  - Key Laboratory of Microbial Technology for Industrial Pollution Control of Zhejiang Province, College of Environment, Zhejiang University of Technology, Hangzhou 310014, China
5
Liu K, Ye Y, Li S, Tang H. Accurate de novo peptide sequencing using fully convolutional neural networks. Nat Commun 2023; 14:7974. [PMID: 38042873 PMCID: PMC10693636 DOI: 10.1038/s41467-023-43010-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 10/29/2023] [Indexed: 12/04/2023] Open access
Abstract
De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides us with a way to identify novel peptides from tandem mass spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network for high accuracy de novo peptide sequencing. PepNet takes an MS/MS spectrum (represented as a high-dimensional vector) as input, and outputs the optimal peptide sequence along with its confidence score. The PepNet model is trained using a total of 3 million high-energy collisional dissociation MS/MS spectra from multiple human peptide spectral libraries. Evaluation results show that PepNet significantly outperforms current best-performing de novo sequencing algorithms (e.g. PointNovo and DeepNovo) in both peptide-level accuracy and positional-level accuracy. PepNet can sequence a large fraction of spectra that were not identified by database search engines, and thus could be used as a complementary tool to database search engines for peptide identification in proteomics. In addition, PepNet runs around 3x and 7x faster than PointNovo and DeepNovo on GPUs, respectively, thus being more suitable for the analysis of large-scale proteomics data.
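The abstract notes that the network takes an MS/MS spectrum represented as a high-dimensional vector. One common way to obtain such a vector is to bin the m/z axis. The sketch below assumes a 2000 m/z upper limit, a 0.1 m/z bin width, and base-peak normalization; these values and the function name are illustrative assumptions, not the encoding PepNet itself uses.

```python
def vectorize_spectrum(mzs, intensities, max_mz=2000.0, bin_width=0.1):
    """Turn a peak list into a fixed-length intensity vector by m/z binning.

    Peaks at or above `max_mz` are dropped; when several peaks fall into the
    same bin, the most intense one is kept. The vector is scaled so that the
    base peak equals 1.0.
    """
    n_bins = int(round(max_mz / bin_width))
    vector = [0.0] * n_bins
    for mz, intensity in zip(mzs, intensities):
        if 0.0 <= mz < max_mz:
            # Clamp to guard against floating-point rounding at the upper edge.
            idx = min(int(mz / bin_width), n_bins - 1)
            vector[idx] = max(vector[idx], float(intensity))
    base_peak = max(vector)
    if base_peak > 0.0:
        vector = [value / base_peak for value in vector]
    return vector


# Toy peak list: two peaks share a bin, one peak lies outside the m/z range.
toy_vector = vectorize_spectrum(
    mzs=[100.05, 100.07, 500.03, 2500.0],
    intensities=[10.0, 20.0, 40.0, 99.0],
)
```

A fixed-length vector like this can be fed to a convolutional model regardless of how many peaks the original spectrum contained; the trade-off is that the bin width caps the mass resolution the model can see.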
Affiliation(s)
- Kaiyuan Liu
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
- Yuzhen Ye
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
- Sujun Li
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
  - Dengding BioAI Co., Ltd., Bloomington, USA
- Haixu Tang
  - Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, 47408, IN, USA
6
Kuo TY, Wang JH, Huang YW, Sung TY, Chen CT. Improving quantitation accuracy in isobaric-labeling mass spectrometry experiments with spectral library searching and feature-based peptide-spectrum match filter. Sci Rep 2023; 13:14119. [PMID: 37644119 PMCID: PMC10465558 DOI: 10.1038/s41598-023-41124-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 08/22/2023] [Indexed: 08/31/2023] Open access
Abstract
Isobaric labeling relative quantitation is one of the dominant proteomic quantitation technologies. Traditional quantitation pipelines for isobaric-labeled mass spectrometry data are based on sequence database searching. In this study, we present a novel quantitation pipeline that integrates sequence database searching, spectral library searching, and a feature-based peptide-spectrum-match (PSM) filter that uses various spectral features for filtering. The combined database and spectral library searching results in larger quantitation coverage, and the filter removes PSMs with larger quantitation errors, retaining those with higher quantitation accuracy. Quantitation results show that the proposed pipeline can improve the overall quantitation accuracy at the PSM and protein levels. To our knowledge, this is the first study that utilizes spectral library searching to improve isobaric labeling-based quantitation. So that users can conveniently run the proposed pipeline, we have implemented the feature-based filter as an executable for both Windows and Linux platforms; its executable files, user manual, and sample data sets are freely available at https://ms.iis.sinica.edu.tw/comics/Software_FPF.html. Furthermore, with the developed filter, the proposed pipeline is fully compatible with the Trans-Proteomic Pipeline.
Affiliation(s)
- Tzu-Yun Kuo
  - Department of Biochemical Science and Technology, National Taiwan University, Taipei, 10617, Taiwan
- Jen-Hung Wang
  - Bioinformatics Program, Taiwan International Graduate Program, Institute of Statistical Science, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, 11221, Taiwan
- Yung-Wen Huang
  - Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 10617, Taiwan
- Ting-Yi Sung
  - Institute of Information Science, Academia Sinica, Taipei, 11529, Taiwan
- Ching-Tai Chen
  - Department of Bioinformatics and Biomedical Engineering, Asia University, Taichung, 41354, Taiwan
  - Center for Precision Health Research, Asia University, Taichung, 41354, Taiwan
7
Yang KL, Yu F, Teo GC, Li K, Demichev V, Ralser M, Nesvizhskii AI. MSBooster: improving peptide identification rates using deep learning-based features. Nat Commun 2023; 14:4539. [PMID: 37500632 PMCID: PMC10374903 DOI: 10.1038/s41467-023-40129-9] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Received: 11/07/2022] [Accepted: 07/06/2023] [Indexed: 07/29/2023] Open access
Abstract
Peptide identification in liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments relies on computational algorithms for matching acquired MS/MS spectra against sequences of candidate peptides using database search tools, such as MSFragger. Here, we present a new tool, MSBooster, for rescoring peptide-to-spectrum matches using additional features incorporating deep learning-based predictions of peptide properties, such as LC retention time, ion mobility, and MS/MS spectra. We demonstrate the utility of MSBooster, in tandem with MSFragger and Percolator, in several different workflows, including nonspecific searches (immunopeptidomics), direct identification of peptides from data independent acquisition data, single-cell proteomics, and data generated on an ion mobility separation-enabled timsTOF MS platform. MSBooster is fast, robust, and fully integrated into the widely used FragPipe computational platform.
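To make the rescoring idea concrete, the sketch below derives two features of the kind the abstract describes: the similarity between predicted and observed fragment intensities, and the error between predicted and observed retention time. The dictionary keys, the function names, and the choice of cosine similarity are assumptions for illustration; the abstract does not specify which feature definitions MSBooster uses.

```python
import math


def cosine_similarity(predicted, observed):
    """Cosine similarity between two equal-length intensity lists."""
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm = math.sqrt(sum(p * p for p in predicted)) * math.sqrt(sum(o * o for o in observed))
    return dot / norm if norm > 0.0 else 0.0


def rescoring_features(psm):
    """Derive extra features for one peptide-to-spectrum match.

    `psm` is a hypothetical dict holding predicted and observed fragment
    intensities (aligned by fragment ion) and retention times in minutes.
    """
    return {
        "spectral_similarity": cosine_similarity(
            psm["predicted_intensities"], psm["observed_intensities"]
        ),
        "rt_abs_error": abs(psm["predicted_rt"] - psm["observed_rt"]),
    }


# Toy match whose observed intensities are a scaled copy of the prediction.
toy_psm = {
    "predicted_intensities": [1.0, 0.0, 2.0],
    "observed_intensities": [2.0, 0.0, 4.0],
    "predicted_rt": 30.5,
    "observed_rt": 31.0,
}
toy_features = rescoring_features(toy_psm)
```

Features like these would be appended to the search engine's own scores and handed to a semi-supervised rescoring tool such as Percolator, which learns how to weight them.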
Affiliation(s)
- Kevin L Yang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Fengchao Yu
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Guo Ci Teo
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
- Kai Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Vadim Demichev
  - Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Markus Ralser
  - Department of Biochemistry, Charité Universitätsmedizin, Berlin, Germany
  - Nuffield Department of Medicine, The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
- Alexey I Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Pathology, University of Michigan, Ann Arbor, MI, USA
8
Abstract
Spectrum library searching is a powerful alternative to database searching for data dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.
Affiliation(s)
- Brian C Searle
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
  - Proteome Software Inc., Portland, Oregon 97219, United States
- Ariana E Shannon
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
- Damien Beau Wilburn
  - Department of Biomedical Informatics, The Ohio State University Medical Center, Columbus, Ohio 43210, United States
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, United States
9
Dorl S, Winkler S, Mechtler K, Dorfer V. MS Ana: Improving Sensitivity in Peptide Identification with Spectral Library Search. J Proteome Res 2023; 22:462-470. [PMID: 36688604 PMCID: PMC9903325 DOI: 10.1021/acs.jproteome.2c00658] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023]
Abstract
Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivity peptide identification using either curated or predicted spectral libraries as well as robust false discovery control through its own decoy library generation algorithm. MS Ana identifies on average 36% more spectrum matches and 4% more proteins than database search in a benchmark test on single-shot human cell-line data. Further, we demonstrate the quality of the result validation with tests on synthetic peptide pools and show the importance of library selection through a comparison of library search performance with different configurations of publicly available human spectral libraries.
Affiliation(s)
- Sebastian Dorl
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
  - Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria
  - Phone: +43 (0) 50804 27145
- Stephan Winkler
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
  - Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria
- Karl Mechtler
  - Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
  - Institute of Molecular Biotechnology (IMBA), Protein Chemistry, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
  - Gregor Mendel Institute of Molecular Plant Biology of the Austrian Academy of Sciences (GMI), Dr. Bohr Gasse 3, 1030 Vienna, Austria
- Viktoria Dorfer
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
  - Phone: +43 (0) 50804 22740
10
Lee S, Park H, Kim H. False discovery rate estimation using candidate peptides for each spectrum. BMC Bioinformatics 2022; 23:454. [PMID: 36319948 PMCID: PMC9623924 DOI: 10.1186/s12859-022-05002-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 10/25/2022] [Indexed: 11/06/2022] Open access
Abstract
BACKGROUND: False discovery rate (FDR) estimation is very important in proteomics. The target-decoy strategy (TDS), which is often used for FDR estimation, estimates the FDR under the assumption that when spectra are identified incorrectly, the probabilities of the spectra matching the target or decoy peptides are identical. However, for most spectra these two probabilities are not identical. We propose cTDS (target-decoy strategy with candidate peptides) for accurate estimation of the FDR using the probability that the spectrum is identified incorrectly as a target or decoy peptide. RESULTS: For most spectra, the probability of being identified incorrectly as a target or decoy peptide is close to 0.5, but only about 1.14-4.85% of the total spectra have a probability of exactly 0.5. We used an entrapment sequence method to demonstrate the accuracy of cTDS. For fixed FDR thresholds (1-10%), the false match rate (FMR) in cTDS is closer to the threshold than the FMR in TDS. We compared the number of peptide-spectrum matches (PSMs) obtained with TDS and cTDS at a 1% FDR threshold with the HEK293 dataset. In the first and third replications, the number of PSMs obtained with cTDS for the reverse, pseudo-reverse, shuffle, and de Bruijn databases exceeded those obtained with TDS (by about 0.001-0.132%), while the pseudo-shuffle database yielded fewer PSMs than TDS (by about 0.05-0.126%). In the second replication, the number of PSMs obtained with cTDS for all databases exceeded that obtained with TDS (by about 0.013-0.274%). CONCLUSIONS: When spectra are actually identified incorrectly, most probabilities of the spectra matching a target or decoy peptide are not identical. Therefore, we propose cTDS, which estimates the FDR more accurately using the probability of the spectrum being identified incorrectly as a target or decoy peptide.
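For reference, the conventional target-decoy estimate that this abstract takes as its baseline can be sketched in a few lines. The function name and the toy scores below are assumptions for illustration, and the sketch covers only standard TDS, not the proposed cTDS. It encodes the equal-probability assumption directly: each decoy hit above the score threshold is taken to stand for one incorrect target hit.

```python
def target_decoy_fdr(psms, score_threshold):
    """Estimate the FDR among target PSMs scoring at or above a threshold.

    `psms` is a list of (score, is_decoy) pairs. Under the standard
    target-decoy assumption, an incorrect match is equally likely to hit a
    target or a decoy, so the decoy count estimates the false target count.
    """
    targets = sum(1 for score, is_decoy in psms if score >= score_threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= score_threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


# Toy PSM list with made-up scores: (score, is_decoy).
toy_psms = [
    (9.1, False), (8.7, False), (8.2, True), (7.9, False),
    (7.5, False), (6.0, True), (5.5, False),
]
fdr_at_7 = target_decoy_fdr(toy_psms, 7.0)  # 1 decoy / 4 targets = 0.25
```

If incorrect matches were not equally likely to land on targets and decoys, the decoy count would no longer be an unbiased stand-in for false targets, which is the gap the abstract's per-spectrum probabilities are meant to address.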
Affiliation(s)
- Sangjeong Lee
  - Department of Computer Science, Hanyang University, Seoul, 06978 Republic of Korea
- Heejin Park
  - Department of Computer Science, Hanyang University, Seoul, 06978 Republic of Korea
- Hyunwoo Kim
  - Biomedical Informatics Team, Korea Institute of Science and Technology Information, Daejeon, 34141 Republic of Korea
11
Zwillinger M, Fischer L, Sályi G, Szabó S, Csékei M, Huc I, Kotschy A. Isotope Ratio Encoding of Sequence-Defined Oligomers. J Am Chem Soc 2022; 144:19078-19088. [PMID: 36206533 DOI: 10.1021/jacs.2c08135] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Information storage at the molecular level commonly entails encoding in the form of ordered sequences of different monomers and subsequent fragmentation and tandem mass spectrometry analysis to read this information. Recent approaches also include the use of mixtures of distinct molecules noncovalently bonded to one another. Here, we present an alternate isotope ratio encoding approach utilizing deuterium-labeled monomers to produce hundreds of oligomers endowed with unique isotope distribution patterns. Mass spectrometric recognition of these patterns then allowed us to directly read out encoded information with high fidelity. Specifically, we show that all 256 tetramers composed of four different monomers of identical constitution can be distinguished by their mass fingerprint using mono-, di-, tri-, and tetradeuterated building blocks. The method is robust to experimental errors and does not require the most sophisticated mass spectrometry instrumentation. Such isotope ratio-encoded oligomers may serve as tags that carry information, but the method mainly opens up the capability to write information, for example, about molecular identity, directly into a pure compound via its isotopologue distribution, obviating the need for additional tagging and avoiding the use of mixtures of different molecules.
Affiliation(s)
- Márton Zwillinger
  - Servier Research Institute of Medicinal Chemistry, H-1031 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, H-1053 Budapest, Hungary
- Lucile Fischer
  - CBMN UMR5248, University of Bordeaux-CNRS-IPB, F-33600 Pessac, France
- Gergő Sályi
  - Servier Research Institute of Medicinal Chemistry, H-1031 Budapest, Hungary
- Soma Szabó
  - Servier Research Institute of Medicinal Chemistry, H-1031 Budapest, Hungary
- Márton Csékei
  - Servier Research Institute of Medicinal Chemistry, H-1031 Budapest, Hungary
- Ivan Huc
  - Department of Pharmacy and Center for Integrated Protein Science, Ludwig-Maximilians-University, D-81377 Munich, Germany
- András Kotschy
  - Servier Research Institute of Medicinal Chemistry, H-1031 Budapest, Hungary
12
Shiferaw GA, Gabriels R, Bouwmeester R, Van Den Bossche T, Vandermarliere E, Martens L, Volders PJ. Sensitive and Specific Spectral Library Searching with CompOmics Spectral Library Searching Tool and Percolator. J Proteome Res 2022; 21:1365-1370. [PMID: 35446579 DOI: 10.1021/acs.jproteome.2c00075] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
Maintaining high sensitivity while limiting false positives is a key challenge in peptide identification from mass spectrometry data. Here, we investigate the effects of integrating the machine learning-based postprocessor Percolator into our spectral library searching tool COSS (CompOmics Spectral library Searching tool). To evaluate the effects of this postprocessing, we have used 40 data sets from 2 different projects and have searched these against the NIST and MassIVE spectral libraries. The searching is carried out using 2 spectral library search tools, COSS and MSPepSearch with and without Percolator postprocessing, and using sequence database search engine MS-GF+ as a baseline comparator. The addition of the Percolator rescoring step to COSS is effective and results in a substantial improvement in sensitivity and specificity of the identifications. COSS is freely available as open source under the permissive Apache2 license, and binaries and source code are found at https://github.com/compomics/COSS.
Collapse
Affiliation(s)
- Genet Abay Shiferaw
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Robbin Bouwmeester
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Elien Vandermarliere
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Pieter-Jan Volders
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium.,Cancer Research Institute Ghent, Ghent University, 9000 Ghent, Belgium
| |
|
13
|
Na S, Choi H, Paek E. Deephos: Predicted spectral database search for TMT-labeled phosphopeptides and its false discovery rate estimation. Bioinformatics 2022; 38:2980-2987. [PMID: 35441674 DOI: 10.1093/bioinformatics/btac280] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2021] [Revised: 03/26/2022] [Accepted: 04/14/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Tandem mass tag (TMT)-based tandem mass spectrometry (MS/MS) has become the method of choice for the quantification of post-translational modifications in complex mixtures. Many cancer proteogenomic studies have highlighted the importance of large-scale phosphopeptide quantification coupled with TMT labeling. Herein, we propose a predicted Spectral DataBase (pSDB) search strategy called Deephos that can improve both sensitivity and specificity in identifying MS/MS spectra of TMT-labeled phosphopeptides.
RESULTS: With deep learning-based fragment ion prediction, we compiled a pSDB of TMT-labeled phosphopeptides generated from ∼8,000 human phosphoproteins annotated in UniProt. Deep learning could successfully recognize the fragmentation patterns altered by both TMT labeling and phosphorylation. In addition, we discuss the decoy spectra for false discovery rate (FDR) estimation in the pSDB search. We show that FDR could be inaccurately estimated by the existing decoy spectra generation methods and propose an innovative method to generate decoy spectra for more accurate FDR estimation. The utilities of Deephos were demonstrated in multi-stage analyses (coupled with database searches) of glioblastoma, acute myeloid leukemia, and breast cancer phosphoproteomes.
AVAILABILITY: Deephos pSDB and the search software are available at https://github.com/seungjinna/deephos.
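For readers unfamiliar with decoy-based FDR estimation, the following is a minimal sketch of the generic target-decoy estimate (decoy matches divided by target matches above a score threshold). It is not the decoy generation method proposed for Deephos, which the abstract does not specify; the function names and the `(score, is_decoy)` input layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Match = Tuple[float, bool]  # (score, is_decoy); hypothetical layout


def target_decoy_fdr(matches: List[Match], threshold: float) -> float:
    """Estimate FDR among target matches scoring at or above threshold."""
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    # Each accepted decoy is taken to stand for one false target match.
    return min(1.0, decoys / targets)


def score_cutoff_at_fdr(matches: List[Match], max_fdr: float) -> Optional[float]:
    """Lowest observed score whose estimated FDR is within max_fdr."""
    best = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if target_decoy_fdr(matches, threshold) <= max_fdr:
            best = threshold
    return best
```

The estimate is only as good as the decoys: if decoy spectra are systematically easier or harder to match than false targets, the ratio is biased, which is the concern the abstract raises about existing decoy generation methods.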
Affiliation(s)
- Seungjin Na
- Institute for Artificial Intelligence Research, Hanyang University, Seoul, 04763, Republic of Korea
| | - Hyunjin Choi
- Department of Automotive Engineering, Hanyang University, Seoul, 04763, Republic of Korea
| | - Eunok Paek
- Institute for Artificial Intelligence Research, Hanyang University, Seoul, 04763, Republic of Korea.,Department of Computer Science, Hanyang University, Seoul, 04763, Republic of Korea
| |
|
14
|
Garcia-Arcos I, Park SS, Mai M, Alvarez-Buve R, Chow L, Cai H, Baumlin-Schmid N, Agudelo CW, Martinez J, Kim MD, Dabo AJ, Salathe M, Goldberg IJ, Foronjy RF. LRP1 loss in airway epithelium exacerbates smoke-induced oxidative damage and airway remodeling. J Lipid Res 2022; 63:100185. [PMID: 35202607 PMCID: PMC8953659 DOI: 10.1016/j.jlr.2022.100185] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2022] [Accepted: 02/07/2022] [Indexed: 02/04/2023] Open
Abstract
The LDL receptor-related protein 1 (LRP1) partakes in metabolic and signaling events regulated in a tissue-specific manner. The function of LRP1 in airways has not been studied. We aimed to study the function of LRP1 in smoke-induced disease. We found that bronchial epithelium of patients with chronic obstructive pulmonary disease and airway epithelium of mice exposed to smoke had increased LRP1 expression. We then knocked out LRP1 in human bronchial epithelial cells in vitro and in airway epithelial club cells in mice. In vitro, LRP1 knockdown decreased cell migration and increased transforming growth factor β activation. Tamoxifen-inducible airway-specific LRP1 knockout mice (club Lrp1-/-) induced after complete lung development had increased inflammation in the bronchoalveolar space and lung parenchyma at baseline. After 6 months of smoke exposure, club Lrp1-/- mice showed a combined restrictive and obstructive phenotype, with lower compliance, inspiratory capacity, and forced expiratory volume in 0.05 s (FEV0.05)/forced vital capacity than WT smoke-exposed mice. This was associated with increased values of the Ashcroft fibrotic index. Proteomic analysis of room air-exposed club Lrp1-/- mice showed significantly decreased levels of proteins involved in cytoskeleton signaling and xenobiotic detoxification as well as decreased levels of glutathione. The proteome fingerprint created by smoke eclipsed many of the original differences, but club Lrp1-/- mice continued to have decreased lung glutathione levels and increased protein oxidative damage and airway cell proliferation. Therefore, LRP1 deficiency leads to greater lung inflammation and damage and exacerbates smoke-induced lung disease.
Affiliation(s)
- Itsaso Garcia-Arcos
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA.
| | - Sangmi S Park
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Michelle Mai
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Roger Alvarez-Buve
- Respiratory Department, Hospital University Arnau de Vilanova and Santa Maria, IRB Lleida, University of Lleida, Lleida, Catalonia, Spain
| | - Lillian Chow
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Huchong Cai
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Christina W Agudelo
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Jennifer Martinez
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Michael D Kim
- Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
| | - Abdoulaye J Dabo
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| | - Matthias Salathe
- Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
| | - Ira J Goldberg
- Department of Medicine, NYU Langone School of Medicine, New York, NY, USA
| | - Robert F Foronjy
- Departments of Medicine and Cell Biology, SUNY Downstate Medical Center, New York, NY, USA
| |
|
15
|
Wang JH, Choong WK, Chen CT, Sung TY. Calibr improves spectral library search for spectrum-centric analysis of data independent acquisition proteomics. Sci Rep 2022; 12:2045. [PMID: 35132134 PMCID: PMC8821666 DOI: 10.1038/s41598-022-06026-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2021] [Accepted: 01/21/2022] [Indexed: 12/20/2022] Open
Abstract
For identifying peptides and proteins from mass spectrometry (MS) data, spectral library searching has emerged as a complementary approach to conventional database searching. However, for the spectrum-centric analysis of data-independent acquisition (DIA) data, spectral library searching has not been widely exploited because existing spectral library search tools are mainly designed and optimized for the analysis of data-dependent acquisition (DDA) data. We present Calibr, a spectral library search tool for spectrum-centric DIA data analysis. Calibr optimizes spectrum preprocessing for pseudo MS2 spectra, generating an 8.11% increase in spectrum–spectrum match (SSM) number and a 7.49% increase in peptide number over the traditional preprocessing approach. When searching against the DDA-based spectral library, Calibr improves SSM number by 17.6–26.65% and peptide number by 18.45–37.31% over two state-of-the-art tools on three different data sets. Searching against the public spectral library from MassIVE, Calibr improves on state-of-the-art tools in SSM and peptide numbers by more than 31.49% and 25.24%, respectively, for two data sets. Our analyses indicate that the higher sensitivity of Calibr results from the use of various spectral similarity measures and statistical scores, coupled with machine learning-based statistical validation for FDR control. Calibr executable files, including a graphical user-interface application, are available at https://ms.iis.sinica.edu.tw/COmics/Software_CalibrWizard.html and https://sourceforge.net/projects/comics-calibr.
|
16
|
Simopoulos CMA, Figeys D, Lavallée-Adam M. Novel Bioinformatics Strategies Driving Dynamic Metaproteomic Studies. Methods Mol Biol 2022; 2456:319-338. [PMID: 35612752 DOI: 10.1007/978-1-0716-2124-0_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Constant improvements in mass spectrometry technologies and laboratory workflows have enabled the proteomics investigation of biological samples of growing complexity. Microbiomes represent such complex samples for which metaproteomics analyses are becoming increasingly popular. Metaproteomics experimental procedures create large amounts of data from which biologically relevant signal must be efficiently extracted to draw meaningful conclusions. Such data processing requires appropriate bioinformatics tools specifically developed for, or capable of handling, metaproteomics data. In this chapter, we outline current and novel tools that can perform the most commonly used steps in the analysis of cutting-edge metaproteomics data, such as peptide and protein identification and quantification, as well as data normalization, imputation, mining, and visualization. We also provide details about the experimental setups in which these tools should be used.
Affiliation(s)
- Caitlin M A Simopoulos
- Department of Biochemistry, Microbiology and Immunology and Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON, Canada
| | - Daniel Figeys
- Department of Biochemistry, Microbiology and Immunology and Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON, Canada
- School of Pharmaceutical Sciences, University of Ottawa, Ottawa, ON, Canada
| | - Mathieu Lavallée-Adam
- Department of Biochemistry, Microbiology and Immunology and Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON, Canada.
| |
|
17
|
Zhang W, Liang Z, Chen X, Xin L, Shan B, Luo Z, Li M. ChimST: An Efficient Spectral Library Search Tool for Peptide Identification from Chimeric Spectra in Data-Dependent Acquisition. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1416-1425. [PMID: 31603795 DOI: 10.1109/tcbb.2019.2945954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Accurate and sensitive identification of peptides from MS/MS spectra is a very challenging problem in computational shotgun proteomics. To tackle this problem, spectral library search has been one of the competitive solutions. However, most existing library search tools were developed on the basis of one peptide per spectrum, which prevents them from working properly on chimeric spectra where two or more peptides are co-fragmented. In this work, we present a new library search tool called ChimST, which is particularly capable of reliably identifying multiple peptides from a chimeric spectrum. It starts by associating each query MS/MS spectrum with MS precursor features. For each precursor feature, there is a list of peptide candidates extracted from an input spectral library. Then, it takes one peptide candidate from each associated feature and scores how well they could collectively interpret the query spectrum. The highest-scoring set of peptide candidates is finally reported as the identification of the query spectrum. Our experimental tests show that ChimST could significantly outperform the three state-of-the-art library search tools, SpectraST, reSpect, and MSPLIT, in terms of the numbers of both peptide-spectrum matches and unique peptides, especially when the acquisition isolation window is broad.
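The candidate-set selection described in this abstract can be sketched as an exhaustive search over one candidate per precursor feature. This is an illustrative sketch only: the scoring function (fraction of query intensity explained by the combined library peaks), the m/z tolerance, and the dictionary layout are assumptions, not ChimST's actual scoring scheme.

```python
from itertools import product


def explained_intensity(query, candidates, tol=0.02):
    """Fraction of query intensity matched by any peak of any candidate.

    query: list of (mz, intensity); candidates: dicts with a "peaks" list of m/z.
    Hypothetical score, used here only to rank candidate sets.
    """
    library_mz = [mz for cand in candidates for mz in cand["peaks"]]
    total = sum(intensity for _, intensity in query)
    if total == 0:
        return 0.0
    matched = sum(intensity for mz, intensity in query
                  if any(abs(mz - lib) <= tol for lib in library_mz))
    return matched / total


def best_candidate_set(query, features):
    """Pick one candidate per precursor feature so the set scores highest.

    features: list of candidate lists, one list per associated precursor feature.
    """
    best, best_score = None, -1.0
    for combo in product(*features):  # one candidate from each feature
        score = explained_intensity(query, combo)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score
```

Exhaustive enumeration grows multiplicatively with the number of features and candidates, so a practical tool would need to prune the candidate lists; the sketch ignores that concern.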
|
18
|
Abstract
Proteomics, the large-scale study of all proteins of an organism or system, is a powerful tool for studying biological systems. It can provide a holistic view of the physiological and biochemical states of given samples through identification and quantification of large numbers of peptides and proteins. In forensic science, proteomics can be used as a confirmatory and orthogonal technique for well-built genomic analyses. Proteomics is highly valuable in cases where nucleic acids are absent or degraded, such as hair and bone samples. It can be used to identify body fluids, ethnic group, gender, and individuals, and to estimate post-mortem interval using bone, muscle, and decomposition fluid samples. Compared to genomic analysis, proteomics can provide a better global picture of a sample. It has been used in forensic science for a wide range of sample types and applications. In this review, we briefly introduce proteomic methods, including sample preparation techniques, data acquisition using liquid chromatography-tandem mass spectrometry, and data analysis using database search, spectral library search, and de novo sequencing. We also summarize applications of proteomics in forensic science over the past decade, with a special focus on human samples, including hair, bone, body fluids, fingernail, muscle, brain, and fingermark, and address the challenges, considerations, and future developments of forensic proteomics.
|
19
|
Cifani P, Li Z, Luo D, Grivainis M, Intlekofer AM, Fenyö D, Kentsis A. Discovery of Protein Modifications Using Differential Tandem Mass Spectrometry Proteomics. J Proteome Res 2021; 20:1835-1848. [PMID: 33749263 PMCID: PMC8341206 DOI: 10.1021/acs.jproteome.0c00638] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
Recent studies have revealed diverse amino acid, post-translational, and noncanonical modifications of proteins in diverse organisms and tissues. However, their unbiased detection and analysis remain hindered by technical limitations. Here, we present a spectral alignment method for the identification of protein modifications using high-resolution mass spectrometry proteomics. Termed SAMPEI for spectral alignment-based modified peptide identification, this open-source algorithm is designed for the discovery of functional protein and peptide signaling modifications, without prior knowledge of their identities. Using synthetic standards and controlled chemical labeling experiments, we demonstrate its high specificity and sensitivity for the discovery of substoichiometric protein modifications in complex cellular extracts. SAMPEI mapping of mouse macrophage differentiation revealed diverse post-translational protein modifications, including distinct forms of cysteine itaconatylation. SAMPEI's robust parametrization and versatility are expected to facilitate the discovery of biological modifications of diverse macromolecules. SAMPEI is implemented as a Python package and is available open-source from BioConda and GitHub (https://github.com/FenyoLab/SAMPEI).
Affiliation(s)
- Paolo Cifani
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - Zhi Li
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
- Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Danmeng Luo
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - Mark Grivainis
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Andrew M Intlekofer
- Human Oncology & Pathogenesis Program and Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
| | - David Fenyö
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, New York 10016, United States
- Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, New York 10016, United States
| | - Alex Kentsis
- Molecular Pharmacology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10021, United States
- Tow Center for Developmental Oncology, Department of Pediatrics, Memorial Sloan Kettering Cancer Center, and Departments of Pediatrics, Pharmacology, and Physiology & Biophysics, Weill Medical College of Cornell University, New York, New York 10021, United States
| |
|
20
|
Wilburn DB, Richards AL, Swaney DL, Searle BC. CIDer: A Statistical Framework for Interpreting Differences in CID and HCD Fragmentation. J Proteome Res 2021; 20:1951-1965. [PMID: 33729787 DOI: 10.1021/acs.jproteome.0c00964] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Library searching is a powerful technique for detecting peptides using either data independent or data dependent acquisition. While both large-scale spectrum library curators and deep learning prediction approaches have focused on beam-type CID fragmentation (HCD), resonance CID fragmentation remains a popular technique. Here we demonstrate an approach to model the differences between HCD and CID spectra, and present a software tool, CIDer, for converting libraries between the two fragmentation methods. We demonstrate that just using a combination of simple linear models and basic principles of peptide fragmentation, we can explain up to 43% of the variation between ions fragmented by HCD and CID across an array of collision energy settings. We further show that in some circumstances, searching converted CID libraries can detect more peptides than searching existing CID libraries or libraries of machine learning predictions from FASTA databases. These results suggest that leveraging information in existing libraries by converting between HCD and CID libraries may be an effective interim solution while large-scale CID libraries are being developed.
Affiliation(s)
- Damien B Wilburn
- Institute for Systems Biology, Seattle, Washington 98109, United States.,Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - Alicia L Richards
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California 94158, United States.,J. David Gladstone Institutes, San Francisco, California 94158, United States.,Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
| | - Danielle L Swaney
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, California 94158, United States.,J. David Gladstone Institutes, San Francisco, California 94158, United States.,Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158, United States
| | - Brian C Searle
- Institute for Systems Biology, Seattle, Washington 98109, United States
| |
|
21
|
St-Germain JR, Astori A, Raught B. A SARS-CoV-2 Peptide Spectral Library Enables Rapid, Sensitive Identification of Virus Peptides in Complex Biological Samples. J Proteome Res 2021; 20:2187-2194. [PMID: 33683136 DOI: 10.1021/acs.jproteome.1c00048] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
On the basis of an analysis of (i) SARS-CoV-2 virions, (ii) SARS-CoV-2-infected VeroE6 cell lysates, and (iii) recombinant SARS-CoV-2 proteins expressed in HEK 293 cells, here we present a comprehensive SARS-CoV-2 peptide spectrum compendium, comprising 1682 high confidence peptide consensus spectra derived from 1170 peptides (of various charge states) spanning 23 virus proteins. This high quality reference set can be used, e.g., for the selection of commonly observed virus peptides for use in targeted proteomics or data-independent acquisition (DIA) approaches. Using this rich resource, we also demonstrate that a spectral matching search approach yields improved performance over the use of standard database search engines alone for the identification of virus peptides in complex biological samples.
Affiliation(s)
- Jonathan R St-Germain
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2C4, Canada
| | - Audrey Astori
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2C4, Canada
| | - Brian Raught
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2C4, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 1L7, Canada
| |
|
22
|
Taechawattananant P, Yoshii K, Ishihama Y. Peak Identification and Quantification by Proteomic Mass Spectrogram Decomposition. J Proteome Res 2021; 20:2291-2298. [PMID: 33661642 DOI: 10.1021/acs.jproteome.0c00819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Recent advances in the liquid chromatography/mass spectrometry (LC/MS) technology have improved the sensitivity, resolution, and speed of proteome analysis, resulting in increasing demand for more sophisticated algorithms to interpret complex mass spectrograms. Here, we propose a novel statistical method, proteomic mass spectrogram decomposition (ProtMSD), for joint identification and quantification of peptides and proteins. Given the proteomic mass spectrogram and the reference mass spectra of all possible peptide ions associated with proteins as a dictionary, ProtMSD estimates the chromatograms of those peptide ions under a group sparsity constraint without using the conventional careful preprocessing (e.g., thresholding and peak picking). We show that the method was significantly improved using protein-peptide hierarchical relationships, isotopic distribution profiles, reference retention times of peptide ions, and prelearned mass spectra of noise. We examined the concept of database search, library search, and match-between-runs. Our ProtMSD showed excellent agreements of 3277 peptide ions (94.79%) and 493 proteins (98.21%) with Mascot/Skyline for an Escherichia coli proteome sample and of 4460 peptide ions (103%) and 588 proteins (101%) with match-between-runs by MaxQuant for a yeast proteome sample. This is the first attempt to use a matrix decomposition technique as a tool for LC/MS-based proteome identification and quantification.
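The idea of estimating chromatograms from a fixed dictionary of reference spectra can be sketched as non-negative matrix factorization with the dictionary held constant. The sketch below assumes a squared-error objective with the standard multiplicative update and an optional plain L1 penalty `lam`; the group sparsity constraint, retention time priors, and noise spectra mentioned in the abstract are not modeled, and all names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def estimate_chromatograms(X, W, n_iter=200, lam=0.0, eps=1e-12):
    """Estimate non-negative H such that X is approximately W times H.

    X: spectrogram, m/z bins x time points.
    W: fixed dictionary, m/z bins x peptide ions (reference spectra as columns).
    Returns H: peptide ions x time points (one chromatogram per ion).
    """
    n_ions, n_times = len(W[0]), len(X[0])
    H = [[1.0] * n_times for _ in range(n_ions)]
    Wt = transpose(W)
    WtX = matmul(Wt, X)
    WtW = matmul(Wt, W)
    for _ in range(n_iter):
        WtWH = matmul(WtW, H)
        # Multiplicative update keeps every entry of H non-negative.
        H = [[H[k][t] * WtX[k][t] / (WtWH[k][t] + lam + eps)
              for t in range(n_times)]
             for k in range(n_ions)]
    return H
```

Because only `H` is updated, each row of the result can be read directly as the elution profile of the corresponding dictionary entry, which is what allows identification and quantification to be handled jointly.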
Affiliation(s)
| | - Kazuyoshi Yoshii
- Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.,RIKEN Center for Advanced Intelligence Project (AIP), Tokyo 103-0027, Japan
| | - Yasushi Ishihama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan.,Laboratory of Clinical and Analytical Chemistry, National Institute of Biomedical Innovation, Health and Nutrition, Ibaraki, Osaka 567-0085, Japan
| |
|
23
|
Manda SS, Noor Z, Hains PG, Zhong Q. PIONEER: Pipeline for Generating High-Quality Spectral Libraries for DIA-MS Data. Curr Protoc 2021; 1:e69. [PMID: 33656278 DOI: 10.1002/cpz1.69] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
Abstract
Data-independent-acquisition mass spectrometry (DIA-MS) is a state-of-the-art proteomic technique for high-throughput identification and quantification of peptides and proteins. Interpretation of DIA-MS data relies on the use of a spectral library, which is optimally created from data acquired from the same samples in data-dependent acquisition (DDA) mode. As DIA-MS quantification relies on the spectral libraries, having a high-quality, non-redundant, and comprehensive spectral library is essential. This article describes the major steps for creating a high-quality spectral library using a combination of multiple complementary search engines. We discuss appropriate strategies to control the false discovery rate for the final spectral library as a result of merging multiple searches. © 2021 The Authors. Current Protocols © 2021 Wiley Periodicals LLC.
- Basic Protocol 1: Searching DDA-MS files with multiple search engines
- Basic Protocol 2: Merging results from multiple search engines
- Basic Protocol 3: Creating spectral libraries from merged results
- Alternate Protocol: Using CLI for automating tasks
- Support Protocol: Creating concatenated FASTA files
Affiliation(s)
- Srikanth S Manda
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Zainab Noor
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Peter G Hains
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| | - Qing Zhong
- ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
| |
|
24
|
Barnes S. Overview of Experimental Methods and Study Design in Metabolomics, and Statistical and Pathway Considerations. Methods Mol Biol 2021; 2104:1-10. [PMID: 31953809 DOI: 10.1007/978-1-0716-0239-3_1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Metabolomics has become a powerful tool in biological and clinical investigations. This chapter reviews the technological basis of metabolomics and the considerations in answering biomedical questions. The workflow of metabolomics is explained in the sequence of data processing, quality control, metabolite annotation, statistical analysis, pathway analysis, and multi-omics integration. Reproducibility in both sample analysis and data analysis is key to scientific progress, and recommendations are made on reporting standards in publications. This chapter explains the technical aspects of metabolomics in the context of systems biology and applications to human health.
Affiliation(s)
- Stephen Barnes
- Department of Pharmacology & Toxicology and Targeted Metabolomics and Proteomics Laboratory, University of Alabama at Birmingham, Birmingham, AL, USA.
| |
|
25
|
Modernizing the Toolkit for Arthropod Bloodmeal Identification. Insects 2021; 12:37. [PMID: 33418885 PMCID: PMC7825046 DOI: 10.3390/insects12010037] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/08/2020] [Revised: 12/30/2020] [Accepted: 12/31/2020] [Indexed: 11/24/2022]
Abstract
Simple Summary: The ability to identify the source of vertebrate blood in mosquitoes, ticks, and other blood-feeding arthropod vectors greatly enhances our knowledge of how vector-borne pathogens are spread. The source of the bloodmeal is identified by analyzing the remnants of blood remaining in the arthropod at the time of capture, though this is often fraught with challenges. This review provides a roadmap and guide for those considering modern techniques for arthropod bloodmeal identification with a focus on progress made in the field over the past decade. We highlight genome regions that can be used to identify the vertebrate source of arthropod bloodmeals as well as technological advances made in other fields that have introduced innovative new ways to identify vertebrate meal source based on unique properties of the DNA sequence, protein signatures, or residual molecules present in the blood. Additionally, engineering progress in miniaturization has led to a number of field-deployable technologies that bring the laboratory directly to the arthropods at the site of collection. Although many of these advancements have helped to address the technical challenges of the past, the challenge of successfully analyzing degraded DNA in bloodmeals remains to be solved.
Abstract: Understanding vertebrate–vector interactions is vitally important for understanding the transmission dynamics of arthropod-vectored pathogens and depends on the ability to accurately identify the vertebrate source of blood-engorged arthropods in field collections using molecular methods. A decade ago, molecular techniques being applied to arthropod bloodmeal identification were thoroughly reviewed, but there have been significant advancements in the techniques and technologies available since that time. This review highlights the available diagnostic markers in mitochondrial and nuclear DNA and discusses their benefits and shortcomings for use in molecular identification assays. Advances in real-time PCR, high resolution melting analysis, digital PCR, next generation sequencing, microsphere assays, mass spectrometry, and stable isotope analysis each offer novel approaches and advantages to bloodmeal analysis that have gained traction in the field. New, field-forward technologies and platforms have also come into use that offer promising solutions for point-of-care and remote field deployment for rapid bloodmeal source identification. Some of the lessons learned over the last decade, particularly in the fields of DNA barcoding and sequence analysis, are discussed. Though many advancements have been made, technical challenges remain concerning the prevention of sample degradation both by the arthropod before the sample has been obtained and during storage. This review provides a roadmap and guide for those considering modern techniques for arthropod bloodmeal identification and reviews how advances in molecular technology over the past decade have been applied in this unique biomedical context.
|
26
|
Abstract
Mass spectrometry (MS)-based proteomics is currently the most successful approach to measure and compare peptides and proteins in a large variety of biological samples. Modern mass spectrometers, equipped with high-resolution analyzers, produce large amounts of data. This is particularly true of shotgun (bottom-up) proteomics, in which proteins are enzymatically digested into peptides that are then measured by MS instruments in data-dependent acquisition (DDA) mode. Dedicated bioinformatic tools and platforms have been developed to cope with the increasing size and complexity of the raw MS data that must be processed and interpreted for large-scale protein identification and quantification. This chapter illustrates the most popular bioinformatics solutions for the analysis of shotgun MS-proteomics data. A general description is provided of the data preprocessing options and the different search engines available, including practical suggestions, based on hands-on experience, on how to optimize the parameters for peptide search.
Affiliation(s)
- Avinash Yadav
- Department of Experimental Oncology, European Institute of Oncology (IEO), IRCCS, Milan, Italy
| | - Federica Marini
- Department of Experimental Oncology, European Institute of Oncology (IEO), IRCCS, Milan, Italy
| | - Alessandro Cuomo
- Department of Experimental Oncology, European Institute of Oncology (IEO), IRCCS, Milan, Italy
| | - Tiziana Bonaldi
- Department of Experimental Oncology, European Institute of Oncology (IEO), IRCCS, Milan, Italy.
| |
|
27
|
Comprehensive Two-Dimensional Gas Chromatography Mass Spectrometry-Based Metabolomics. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2021; 1280:57-67. [PMID: 33791974 DOI: 10.1007/978-3-030-51652-9_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
Compared to one-dimensional gas chromatography with mass spectrometry (GC-MS), GC × GC-MS provides significantly increased peak capacity, resolution, and sensitivity for analysis of complex biological samples. In the last decade, GC × GC-MS has been increasingly applied to the discovery of metabolite biomarkers and elucidation of metabolic mechanisms in human diseases. The recent development of coupling GC × GC with a high-resolution mass spectrometer further accelerates these metabolomic applications. In this chapter, we will briefly review the instrumentation, sample preparation, data analysis, and applications of GC × GC-MS-based metabolomic analysis.
|
28
|
Qin C, Luo X, Deng C, Shu K, Zhu W, Griss J, Hermjakob H, Bai M, Perez-Riverol Y. Deep learning embedder method and tool for mass spectra similarity search. J Proteomics 2020; 232:104070. [PMID: 33307250 DOI: 10.1016/j.jprot.2020.104070] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/29/2020] [Revised: 11/25/2020] [Accepted: 12/01/2020] [Indexed: 12/31/2022]
Abstract
Spectral similarity calculation is widely used in protein identification tools and mass spectra clustering algorithms when comparing theoretical or experimental spectra. The performance of the spectral similarity calculation plays an important role in these tools and algorithms, especially in the analysis of large-scale datasets. Recently, deep learning methods have been proposed to improve the performance of clustering algorithms and protein identification by training the algorithms with existing data and the use of multiple spectra and identified peptide features. While the efficiency of these algorithms is still under study in comparison with traditional approaches, their application in proteomics data analysis is becoming more common. Here, we propose the use of deep learning to improve spectral similarity comparison. We assessed the performance of deep learning for spectral similarity with GLEAMS and a newly trained embedder model (DLEAMSE), which uses high-quality spectra from PRIDE Cluster. We also developed a new bioinformatics tool (mslookup - https://github.com/bigbio/DLEAMSE/) that allows users to quickly search for spectra among previously identified mass spectra published in public repositories and spectral libraries. Finally, we released a human database to enable bioinformaticians and biologists to search for identified spectra on their own machines. SIGNIFICANCE STATEMENT: Spectral similarity calculation plays an important role in proteomics data analysis. Because deep learning can learn implicit and effective features from large-scale training datasets, deep learning-based MS/MS spectra embedding models have emerged as a solution to improve mass spectral clustering and similarity calculation algorithms. We compare multiple similarity scoring and deep learning methods in terms of accuracy (computing the similarity for a pair of mass spectra) and computing-time performance.
The benchmark results showed no major differences in accuracy between DLEAMSE and the normalized dot product (NDP) for spectrum similarity calculations. The DLEAMSE GPU implementation is faster than NDP in preprocessing on the GPU server, and the similarity calculation of DLEAMSE (Euclidean distance on 32-D vectors) takes about 1/3 of the time of dot product calculations. The deep learning model (DLEAMSE) encoding and embedding steps need to run only once for each spectrum, and the embedded 32-D points can be persisted in the repository, which speeds up future comparisons on large-scale data. Based on these results, we propose a new tool, mslookup, that enables the researcher to find spectra previously identified in public data. The tool can also be used to generate in-house databases of previously identified spectra to share with other laboratories and consortia.
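To make the two similarity measures named in this abstract concrete, the following minimal Python sketch computes a normalized dot product on binned spectra and a Euclidean distance between fixed-length vectors. The binning parameters (`bin_width`, `max_mz`) and function names are illustrative assumptions; the trained DLEAMSE embedder itself is a neural network and is not reproduced here, so the Euclidean distance is simply shown on arbitrary vectors.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """Sum peak intensities into fixed-width m/z bins (assumed binning scheme)."""
    n_bins = int(max_mz / bin_width)
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    return vec


def normalized_dot_product(a, b):
    """Dot product divided by the product of vector norms (0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def euclidean_distance(u, v):
    """Distance between two embedded points, e.g. 32-D embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

The practical difference is dimensionality: the dot product runs over every m/z bin, whereas a distance between short embedding vectors touches only a few dozen numbers per comparison once the embedding has been computed and stored.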
Affiliation(s)
- Chunyuan Qin
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and telecommunications, Chongqing, China
| | - Xiyang Luo
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and telecommunications, Chongqing, China
| | - Chuan Deng
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and telecommunications, Chongqing, China
| | - Kunxian Shu
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and telecommunications, Chongqing, China
| | - Weimin Zhu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
| | - Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Department of Dermatology, Medical University of Vienna, 1090 Vienna, Austria
| | - Henning Hermjakob
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mingze Bai
- Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and telecommunications, Chongqing, China; State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China.
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
| |
|
29
|
Wang L, Liu K, Li S, Tang H. A Fast and Memory-Efficient Spectral Library Search Algorithm Using Locality-Sensitive Hashing. Proteomics 2020; 20:e2000002. [PMID: 32415809 PMCID: PMC7669687 DOI: 10.1002/pmic.202000002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/01/2020] [Revised: 04/17/2020] [Indexed: 01/07/2023]
Abstract
With the accumulation of MS/MS spectra collected in spectral libraries, spectral library searching has emerged as an important approach for peptide identification in proteomics, complementary to the commonly used protein database searching approach, in particular for the proteomic analyses of well-studied model organisms, such as human. Existing spectral library searching algorithms compare a query MS/MS spectrum with each spectrum in the library with matched precursor mass and charge state, which may become computationally intensive with the rapidly growing library size. Here, the software msSLASH, which implements a fast spectral library searching algorithm based on the Locality-Sensitive Hashing (LSH) technique, is presented. The algorithm first converts the library and query spectra into bit-strings using LSH functions, and then computes the similarity only between spectra with highly similar bit-strings. Using the spectral library searching of large real-world MS/MS spectra datasets, it is demonstrated that the algorithm significantly reduces the number of spectral comparisons and, as a result, achieves a 2-9X speedup in comparison with the existing spectral library searching algorithm SpectraST. The spectral searching algorithm is implemented in C/C++ and is ready to be used in proteomic data analyses.
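The bit-string idea in this abstract can be sketched with random-hyperplane LSH, one common LSH family for cosine-like similarity. This is an illustrative assumption: the abstract does not say which hash functions msSLASH uses, and all function names below are hypothetical. Each spectrum vector gets one bit per hyperplane, and only library spectra that share the query's bit-string are passed on to the full similarity calculation.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on the non-negative side."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(library, hyperplanes):
    """Group library spectrum names by their bit-string signature."""
    buckets = defaultdict(list)
    for name, vec in library.items():
        buckets[signature(vec, hyperplanes)].append(name)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return only the library spectra whose signature equals the query's."""
    return buckets.get(signature(query, hyperplanes), [])
```

A production search would typically use several hash tables or tolerate a small Hamming distance between signatures to limit missed matches; this sketch uses exact bucket equality for brevity.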
Affiliation(s)
- Lei Wang
- School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA
| | - Kaiyuan Liu
- School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA
| | - Sujun Li
- School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA
| | - Haixu Tang
- School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA
| |
|
30
|
DeLaney K, Cao W, Ma Y, Ma M, Zhang Y, Li L. PRESnovo: Prescreening Prior to de novo Sequencing to Improve Accuracy and Sensitivity of Neuropeptide Identification. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2020; 31:1358-1371. [PMID: 32266812 PMCID: PMC7332408 DOI: 10.1021/jasms.0c00013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 05/15/2023]
Abstract
Identification of peptides in species lacking fully sequenced genomes is challenging due to the lack of prior knowledge. De novo sequencing is the method of choice, but its performance is less than satisfactory due to algorithmic bias and interference in complex MS/MS spectra. The task becomes even more challenging for endogenous peptides that do not involve an enzymatic digestion step, such as neuropeptides. However, many neuropeptides possess common sequence motifs that are conserved across members of the same family. Taking advantage of this feature to improve de novo sequencing of neuropeptides, we have developed a method named PRESnovo (prescreening precursors prior to de novo sequencing) to predict the motif from a MS/MS spectrum. A neuropeptide sequence is broken into a motif with conserved amino acid residues and the remaining partial sequence. By searching against a predefined motif database constructed from known homologous sequences, PRESnovo assigns the most probable motif to each precursor via a sophisticated scoring function. Performance analysis was conducted with 15 neuropeptide standards, and 11 neuropeptides were correctly identified with PRESnovo compared to 1 identification by PEAKS only. We applied PRESnovo to assign motifs to peptide sequences in conjunction with PEAKS for assigning the rest of the peptide sequence in order to discover neuropeptides in tissue samples of green crab, C. maenas, and Jonah crab, C. borealis. Collectively, a large number of neuropeptides were identified, including 13 putative neuropeptides identified in green crab brain, 77 in Jonah crab brain, and 47 in Jonah crab sinus glands for the first time. This PRESnovo strategy greatly simplifies de novo sequencing and enhances the accuracy and sensitivity of neuropeptide identification when common motifs are present.
|
31
|
Shiferaw GA, Vandermarliere E, Hulstaert N, Gabriels R, Martens L, Volders PJ. COSS: A Fast and User-Friendly Tool for Spectral Library Searching. J Proteome Res 2020; 19:2786-2793. [PMID: 32384242 DOI: 10.1021/acs.jproteome.9b00743] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/10/2023]
Abstract
Spectral similarity searching to identify peptide-derived MS/MS spectra is a promising technique, and different spectrum similarity search tools have therefore been developed. Each of these tools, however, comes with some limitations, mainly because of low processing speed and issues with handling large databases. Furthermore, the number of spectral data formats supported is typically limited, which also creates a barrier to adoption. We have therefore developed COSS (CompOmics Spectral Searching), a new and user-friendly spectral library search tool supporting two scoring functions. COSS also includes decoy spectra generation for result validation. We have benchmarked COSS on three different spectral libraries and compared the results with established spectral searching tools and a sequence database search tool. Our comparison showed that COSS more reliably identifies spectra, is capable of handling large data sets and libraries, and is an easy-to-use tool that can run on low computer specifications. COSS binaries and source code can be freely downloaded from https://github.com/compomics/COSS.
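Decoy spectra are generally used to estimate a false discovery rate (FDR) for a ranked list of matches. The sketch below shows the generic target-decoy calculation under the simple estimate FDR = decoys / targets. It is an assumption-laden illustration, not COSS's documented validation procedure, and the function name is hypothetical.

```python
def fdr_threshold(matches, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    matches: list of (score, is_decoy) tuples, one per best spectrum match.
    FDR at a cutoff is estimated as (#decoys) / (#targets) above that cutoff.
    Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff
```

Matches scoring at or above the returned cutoff would then be reported as identifications at the chosen FDR.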
Affiliation(s)
- Genet Abay Shiferaw
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Elien Vandermarliere
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Niels Hulstaert
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium
| | - Pieter-Jan Volders
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, 9000 Ghent, Belgium.,Cancer Research Institute Ghent, Ghent University, 9000 Ghent, Belgium
| |
|
32
|
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. MASS SPECTROMETRY REVIEWS 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
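The abstract does not spell out the common paradigm it refers to. As a rough illustration, the candidate-generation step typical of sequence database searching can be sketched as: digest each protein in silico, compute peptide masses, and keep peptides whose mass matches the measured precursor within a tolerance. The function names, the trypsin rule (cleave after K or R unless followed by P), and the tolerance are illustrative assumptions; real engines additionally handle missed cleavages, modifications, and fragment-ion scoring.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def candidate_peptides(protein, precursor_mass, tolerance=0.01):
    """Peptides from the digest whose mass is within tolerance (Da) of the precursor."""
    return [
        p for p in tryptic_digest(protein)
        if abs(peptide_mass(p) - precursor_mass) <= tolerance
    ]
```

Each surviving candidate would then be scored against the observed fragment spectrum, which is the part where the individual search engines differ most.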
Affiliation(s)
- Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Helge Raeder
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
| | - Frode S Berven
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Harald Barsnes
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
| | - Marc Vaudel
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
| |
|
33
|
Hentschker C, Maaß S, Junker S, Hecker M, Hammerschmidt S, Otto A, Becher D. Comprehensive Spectral Library from the Pathogenic Bacterium Streptococcus pneumoniae with Focus on Phosphoproteins. J Proteome Res 2020; 19:1435-1446. [DOI: 10.1021/acs.jproteome.9b00615] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Affiliation(s)
- Christian Hentschker
- Department of Microbial Proteomics, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Sandra Maaß
- Department of Microbial Proteomics, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Sabryna Junker
- Department of Microbial Proteomics, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Michael Hecker
- Department of Microbial Physiology and Molecular Biology, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Sven Hammerschmidt
- Department of Molecular Genetics and Infection Biology, Interfaculty Institute for Genetics and Functional Genomics, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Andreas Otto
- Department of Microbial Proteomics, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| | - Dörte Becher
- Department of Microbial Proteomics, Institute of Microbiology; University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
| |
|
34
|
Fernández-Costa C, Martínez-Bartolomé S, McClatchy D, Yates JR. Improving Proteomics Data Reproducibility with a Dual-Search Strategy. Anal Chem 2020; 92:1697-1701. [PMID: 31880919 DOI: 10.1021/acs.analchem.9b04955] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Abstract
Mass spectrometry-based proteomics is an invaluable tool for addressing important biological questions. Data-dependent acquisition methods sample complex mixtures stochastically, which results in missing identifications across replicates. We developed a search approach that improves the reproducibility of data acquired from any mass spectrometer. In our approach, a spectral library is built from the identification results of a database search, and the library is then used to re-search the same data files to obtain the final result. We showed that higher identification and quantification reproducibility is achieved with the dual-search approach than with a typical database search. Four datasets of different complexity were compared: (1) data from a cell lysate study performed in our lab, (2) data from an interactome study performed in our lab, (3) a publicly available extracellular vesicles dataset, and (4) a publicly available phosphoproteomics dataset. Our results show that the dual-search approach can be widely and easily used to improve data quality in proteomics data.
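The two passes described in this abstract can be sketched as follows. This is a toy illustration under stated assumptions: spectra are dictionaries mapping an m/z bin to an intensity, the library keeps the best-scoring spectrum per peptide, the second pass uses cosine similarity with a hypothetical `min_similarity` cutoff, and all function names are invented. The authors' actual tools and scoring are not reproduced.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse spectra ({mz_bin: intensity})."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_library(first_pass):
    """Pass 1 output -> library: keep the best-scoring spectrum per peptide.

    first_pass: list of (spectrum, peptide, score) from a database search.
    """
    best = {}
    for spectrum, peptide, score in first_pass:
        if peptide not in best or score > best[peptide][1]:
            best[peptide] = (spectrum, score)
    return {peptide: spectrum for peptide, (spectrum, _) in best.items()}


def library_search(spectra, library, min_similarity=0.7):
    """Pass 2: assign each spectrum its most similar library peptide, or None."""
    results = []
    for spectrum in spectra:
        peptide, similarity = max(
            ((p, cosine(spectrum, s)) for p, s in library.items()),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        results.append(peptide if similarity >= min_similarity else None)
    return results
```

The point of the second pass is that a peptide identified confidently in one replicate can be recovered in another replicate where the database search alone missed it.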
Affiliation(s)
- Carolina Fernández-Costa
- Department of Molecular Medicine , The Scripps Research Institute , La Jolla , California 92037 , United States
| | - Salvador Martínez-Bartolomé
- Department of Molecular Medicine , The Scripps Research Institute , La Jolla , California 92037 , United States
| | - Daniel McClatchy
- Department of Molecular Medicine , The Scripps Research Institute , La Jolla , California 92037 , United States
| | - John R Yates
- Department of Molecular Medicine , The Scripps Research Institute , La Jolla , California 92037 , United States
| |
|
35
|
Optimization of TripleTOF spectral simulation and library searching for confident localization of phosphorylation sites. PLoS One 2019; 14:e0225885. [PMID: 31790495 PMCID: PMC6886777 DOI: 10.1371/journal.pone.0225885] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/21/2019] [Accepted: 11/14/2019] [Indexed: 12/31/2022] Open
Abstract
Tandem mass spectrometry (MS/MS) has been used in the analysis of proteins and their post-translational modifications. A recently developed data analysis method, which simulates MS/MS spectra of phosphopeptides and performs spectral library searching using SpectraST, facilitates confident localization of phosphorylation sites. However, its performance has so far been evaluated only on MS/MS spectra acquired using Orbitrap HCD mass spectrometers. In this study, we investigated whether this approach is applicable to another type of mass spectrometer, and optimized the simulation and search conditions to achieve sensitive and confident site localization. Synthetic phosphopeptides and enriched K562 cell phosphopeptides were analyzed using a TripleTOF 6600 mass spectrometer before and after enzymatic dephosphorylation. Dephosphorylated peptides identified by X!Tandem database searching were subjected to spectral simulation of all possible single phosphorylations using SimPhospho software. Phosphopeptides were identified and localized by SpectraST searching against a library of the simulated spectra. Although no synthetic phosphopeptide was localized at a 1% false localization rate under the previous conditions, optimization of the spectral simulation and search conditions for the TripleTOF datasets achieved localization and improved the sensitivity. Furthermore, the optimized conditions enabled sensitive localization of K562 phosphopeptides at 1% false discovery and localization rates. These results suggest that accurate phosphopeptide simulation of TripleTOF MS/MS spectra is possible and that the simulated spectral libraries can be used in SpectraST searching for confident localization of phosphorylation sites.
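The step "all possible single phosphorylations" amounts to enumerating one candidate isoform per serine, threonine, or tyrosine in the dephosphorylated sequence. The sketch below shows only that enumeration, assuming S/T/Y as the candidate residues. The bracket notation and function name are hypothetical, and the spectral simulation itself (what SimPhospho does) is not reproduced.

```python
PHOSPHO_MASS_SHIFT = 79.96633  # monoisotopic mass added by phosphorylation (HPO3), Da


def single_phospho_isoforms(peptide):
    """List (1-based site, annotated sequence) for every S, T, or Y in the peptide."""
    return [
        (i + 1, peptide[:i + 1] + "[+80]" + peptide[i + 1:])
        for i, aa in enumerate(peptide)
        if aa in "STY"
    ]
```

For example, `single_phospho_isoforms("ASTK")` yields two isoforms, one phosphorylated at position 2 and one at position 3; a simulated spectrum for each would differ in the fragment ions that contain the modified residue, each shifted by `PHOSPHO_MASS_SHIFT`.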
|
36
|
Ammar C, Berchtold E, Csaba G, Schmidt A, Imhof A, Zimmer R. Multi-Reference Spectral Library Yields Almost Complete Coverage of Heterogeneous LC-MS/MS Data Sets. J Proteome Res 2019; 18:1553-1566. [DOI: 10.1021/acs.jproteome.8b00819] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Affiliation(s)
- Constantin Ammar
- Institute of Bioinformatics, Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81337 München, Germany
| | - Evi Berchtold
- Institute of Bioinformatics, Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
| | - Gergely Csaba
- Institute of Bioinformatics, Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
| | - Andreas Schmidt
- Zentrallabor für Proteinanalytik (Protein Analysis Unit), Ludwig-Maximilians-Universität München, Grosshaderner Strasse 9, 82152 Planegg-Martinsried, Germany
| | - Axel Imhof
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81337 München, Germany
- Zentrallabor für Proteinanalytik (Protein Analysis Unit), Ludwig-Maximilians-Universität München, Grosshaderner Strasse 9, 82152 Planegg-Martinsried, Germany
| | - Ralf Zimmer
- Institute of Bioinformatics, Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
- Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81337 München, Germany
| |
|
37
|
Application of immobilized ATP to the study of NLRP inflammasomes. Arch Biochem Biophys 2019; 670:104-115. [PMID: 30641048 DOI: 10.1016/j.abb.2018.12.031] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/27/2018] [Revised: 12/01/2018] [Accepted: 12/17/2018] [Indexed: 01/15/2023]
Abstract
The NLRP proteins are a subfamily of the NOD-like receptor (NLR) innate immune sensors that possess an ATP-binding NACHT domain. As the best-studied member, NLRP3 can initiate the assembly of a multiprotein complex, termed the inflammasome, upon detection of a wide range of microbial products and endogenous danger signals, resulting in the activation of pro-caspase-1, a cysteine protease that regulates multiple host defense pathways including cytokine maturation. Dysregulated NLRP3 activation contributes to inflammation and the pathogenesis of several chronic diseases, and the ATP-binding properties of NLRPs are thought to be critical for inflammasome activation. In light of this, we examined the utility of immobilized ATP matrices in the study of NLRP inflammasomes. Using NLRP3 as the prototypical member of the family, P-linked ATP Sepharose was determined to be a highly effective capture agent. In subsequent examinations, P-linked ATP Sepharose was used as an enrichment tool to enable the effective profiling of NLRP3-biomarker signatures with selected reaction monitoring-mass spectrometry (SRM-MS). Finally, ATP Sepharose was used in combination with a fluorescence-linked enzyme chemoproteomic strategy (FLECS) screen to identify potential competitive inhibitors of NLRP3. The identification of a novel benzo[d]imidazol-2-one inhibitor that specifically targets the ATP-binding and hydrolysis properties of the NLRP3 protein implies that ATP Sepharose and FLECS could be applied to other NLRPs as well.
|
38
|
Kunath BJ, Minniti G, Skaugen M, Hagen LH, Vaaje-Kolstad G, Eijsink VGH, Pope PB, Arntzen MØ. Metaproteomics: Sample Preparation and Methodological Considerations. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019; 1073:187-215. [DOI: 10.1007/978-3-030-12298-0_8] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 12/12/2022]
|
39
|
Pandeswari PB, Sabareesh V. Middle-down approach: a choice to sequence and characterize proteins/proteomes by mass spectrometry. RSC Adv 2018; 9:313-344. [PMID: 35521579 PMCID: PMC9059502 DOI: 10.1039/c8ra07200k] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 08/29/2018] [Accepted: 12/11/2018] [Indexed: 12/27/2022] Open
Abstract
Owing to rapid growth in the elucidation of genome sequences of various organisms, deducing proteome sequences has become imperative, in order to have an improved understanding of biological processes. Since the traditional Edman method was unsuitable for high-throughput sequencing and also for N-terminus modified proteins, mass spectrometry (MS) based methods, mainly based on the soft ionization modes electrospray ionization and matrix-assisted laser desorption/ionization, began to gain significance. MS based methods were adaptable for high-throughput studies and applicable for sequencing N-terminus blocked proteins/peptides too. Consequently, over the last decade a new discipline called 'proteomics' has emerged, which encompasses the attributes necessary for high-throughput identification of proteins. 'Proteomics' may also be regarded as an offshoot of the classic field, 'biochemistry'. Many protein sequencing and proteomic investigations were successfully accomplished through MS dependent sequence elucidation of 'short' proteolytic peptides (typically 7-20 amino acid residues), which is called the 'shotgun' or 'bottom-up (BU)' approach. While the BU approach continues as a workhorse for proteomics/protein sequencing, attempts to sequence intact proteins without proteolysis, called the 'top-down (TD)' approach, started due to ambiguities in the BU approach, e.g., the protein inference problem, identification of proteoforms and the discovery of posttranslational modifications (PTMs). The high-throughput TD approach (TD proteomics) is still in its infancy. Nevertheless, TD characterization of purified intact proteins has been useful for detecting PTMs. With the hope of overcoming the pitfalls of BU and TD strategies, another concept called the 'middle-down (MD)' approach was put forward.
Similar to BU, the MD approach also involves proteolysis, but in a restricted manner, to produce 'longer' proteolytic peptides than the ones usually obtained in BU studies, thereby providing better sequence coverage. In this regard, special proteases (OmpT, Sap9, IdeS) have been used, which can cleave proteins to produce longer proteolytic peptides. By reviewing the ample evidence in the literature, which predominantly concerns PTM characterization of histones and antibodies, we highlight herein salient features of the MD approach. Consequently, we are inclined to claim that the MD concept might have widespread applications in the future in various research areas, such as clinical research and biopharmaceuticals (including PTM analysis), and even for general or routine characterization of proteins, including therapeutic proteins, not just for the analysis of histones or antibodies.
Affiliation(s)
- P Boomathi Pandeswari
- Advanced Centre for Bio Separation Technology (CBST), Vellore Institute of Technology (VIT) Vellore Tamil Nadu 632014 India
| | - Varatharajan Sabareesh
- Advanced Centre for Bio Separation Technology (CBST), Vellore Institute of Technology (VIT) Vellore Tamil Nadu 632014 India
| |
|
40
|
Deutsch EW, Perez-Riverol Y, Chalkley RJ, Wilhelm M, Tate S, Sachsenberg T, Walzer M, Käll L, Delanghe B, Böcker S, Schymanski EL, Wilmes P, Dorfer V, Kuster B, Volders PJ, Jehmlich N, Vissers JP, Wolan DW, Wang AY, Mendoza L, Shofstahl J, Dowsey AW, Griss J, Salek RM, Neumann S, Binz PA, Lam H, Vizcaíno JA, Bandeira N, Röst H. Expanding the Use of Spectral Libraries in Proteomics. J Proteome Res 2018; 17:4051-4060. [PMID: 30270626 PMCID: PMC6443480 DOI: 10.1021/acs.jproteome.8b00485] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 12/15/2022]
Abstract
The 2017 Dagstuhl Seminar on Computational Proteomics provided an opportunity for a broad discussion on the current state and future directions of the generation and use of peptide tandem mass spectrometry spectral libraries. Their use in proteomics is growing slowly, but there are multiple challenges in the field that must be addressed to further increase the adoption of spectral libraries and related techniques. The primary bottlenecks are the paucity of high quality and comprehensive libraries and the general difficulty of adopting spectral library searching into existing workflows. There are several existing spectral library formats, but none captures a satisfactory level of metadata; therefore, a logical next improvement is to design a more advanced, Proteomics Standards Initiative-approved spectral library format that can encode all of the desired metadata. The group discussed a series of metadata requirements organized into three designations of completeness or quality, tentatively dubbed bronze, silver, and gold. The metadata can be organized at four different levels of granularity: at the collection (library) level, at the individual entry (peptide ion) level, at the peak (fragment ion) level, and at the peak annotation level. Strategies for encoding mass modifications in a consistent manner and the requirement for encoding high-quality and commonly seen but as-yet-unidentified spectra were discussed. The group also discussed related topics, including strategies for comparing two spectra, techniques for generating representative spectra for a library, approaches for selection of optimal signature ions for targeted workflows, and issues surrounding the merging of two or more libraries into one. We present here a review of this field and the challenges that the community must address in order to accelerate the adoption of spectral libraries in routine analysis of proteomics datasets.
Affiliation(s)
- Eric W. Deutsch
  - Institute for Systems Biology, Seattle, Washington, 98109, United States
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Robert J. Chalkley
  - University of California San Francisco, San Francisco, 94158, California, United States
- Mathias Wilhelm
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, 85354, Germany
- Timo Sachsenberg
  - Department of Computer Science, Center for Bioinformatics, University of Tübingen, Sand 14, Tübingen, 72076, Germany
- Mathias Walzer
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Lukas Käll
  - Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH − Royal Institute of Technology, Stockholm 114 28, Sweden
- Bernard Delanghe
  - Thermo Fisher Scientific Bremen, Hanna-Kunath Str. 11, 28199 Bremen, Germany
- Sebastian Böcker
  - Chair for Bioinformatics, Friedrich-Schiller-University Jena, 07743 Jena, Germany
- Emma L. Schymanski
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Paul Wilmes
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
- Viktoria Dorfer
  - University of Applied Sciences Upper Austria, Bioinformatics Research Group, Hagenberg, 4232, Austria
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, 85354, Germany
  - Bavarian Biomolecular Mass Spectrometry Center (BayBioMS), Technical University of Munich, Freising, 85354, Germany
- Nico Jehmlich
  - Helmholtz-Centre for Environmental Research - UFZ, Leipzig, Germany
- Dennis W. Wolan
  - Department of Molecular Medicine, The Scripps Research Institute, 92037, La Jolla, California, United States
- Ana Y. Wang
  - Department of Molecular Medicine, The Scripps Research Institute, 92037, La Jolla, California, United States
- Luis Mendoza
  - Institute for Systems Biology, Seattle, Washington, 98109, United States
- Jim Shofstahl
  - Thermo Fisher Scientific, 355 River Oaks Parkway San Jose, CA 95134
- Andrew W. Dowsey
  - Department of Population Health Sciences and Bristol Veterinary School, Faculty of Health Sciences, University of Bristol, Bristol BS9 1BN, UK
- Johannes Griss
  - Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Währinger Gürtel 18-20, Vienna 1090, Austria
- Reza M. Salek
  - The International Agency for Research on Cancer (IARC), 150 Cours Albert Thomas, 69372 Lyon CEDEX 08, France
- Steffen Neumann
  - Leibniz Institute of Plant Biochemistry, Department of Stress and Developmental Biology, 06120 Halle, Germany
  - German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, 04103 Leipzig, Germany
- Pierre-Alain Binz
  - Clinical Chemistry Service, Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
- Henry Lam
  - Department of Chemical and Biological Engineering, the Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Nuno Bandeira
  - Center for Computational Mass Spectrometry, Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 92093-0404, USA
- Hannes Röst
  - The Donnelly Centre, University of Toronto, 160 College St., Toronto, ON, M5S 3E1, Canada
|
41
|
Cheema AK, Byrum SD, Sharma NK, Altadill T, Kumar VP, Biswas S, Balgley BM, Hauer-Jensen M, Tackett AJ, Ghosh SP. Proteomic Changes in Mouse Spleen after Radiation-Induced Injury and its Modulation by Gamma-Tocotrienol. Radiat Res 2018; 190:449-463. [PMID: 30070965 PMCID: PMC6297072 DOI: 10.1667/rr15008.1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
Gamma-tocotrienol (GT3), a naturally occurring vitamin E isomer and promising radioprotector, has been shown to protect mice against radiation-induced hematopoietic and gastrointestinal injuries. We analyzed changes in protein expression profiles of spleen tissue after GT3 treatment in mice exposed to gamma radiation to gain insights into the molecular mechanism of radioprotective efficacy. Male CD2F1 mice, 12-to-14 weeks old, were treated with either vehicle or GT3 at 24 h prior to 7 Gy total-body irradiation. Nonirradiated vehicle, nonirradiated GT3 and age-matched naïve animals were used as controls. Blood and tissues were harvested on days 0, 1, 2, 4, 7, 10 and 14 postirradiation. High-resolution mass-spectrometry-based radioproteomics was used to identify differentially expressed proteins in spleen tissue with or without drug treatment. Subsequent bioinformatic analyses helped delineate molecular markers of biological pathways and networks regulating the cellular radiation responses in spleen. Our results show a robust alteration in spleen proteomic profiles, including upregulation of the Wnt signaling pathway and actin-cytoskeleton linked proteins in mediating the radiation injury response in spleen. Furthermore, we show that 24 h pretreatment with GT3 attenuates radiation-induced hematopoietic injury in the spleen by modulating various cell signaling proteins. Taken together, our results show that the radioprotective effects of GT3 are mediated via alleviation of radiation-induced alterations in biochemical pathways, with wide implications for overall hematopoietic injury.
Affiliation(s)
- Amrita K. Cheema
  - Departments of Oncology, Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC
- Stephanie D. Byrum
  - Division of Radiation Health, College of Pharmacy, University of Arkansas for Medical Sciences and Central Arkansas Veterans Healthcare System, Little Rock, Arkansas
- Neel Kamal Sharma
  - Armed Forces Radiobiology Research Institute, Uniformed Services University of the Health Sciences (USUHS), Bethesda, Maryland
- Tatiana Altadill
  - Departments of Oncology, Biochemistry, Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC
  - Institut d’Investigacio Biomedica de Bellvitge (IDIBELL), Gynecological Department, Vall Hebron University Hospital, Universitat Autonoma de Barcelona, Barcelona, Spain
- Vidya P. Kumar
  - Armed Forces Radiobiology Research Institute, Uniformed Services University of the Health Sciences (USUHS), Bethesda, Maryland
- Shukla Biswas
  - Armed Forces Radiobiology Research Institute, Uniformed Services University of the Health Sciences (USUHS), Bethesda, Maryland
- Martin Hauer-Jensen
  - Division of Radiation Health, College of Pharmacy, University of Arkansas for Medical Sciences and Central Arkansas Veterans Healthcare System, Little Rock, Arkansas
- Alan J. Tackett
  - Division of Radiation Health, College of Pharmacy, University of Arkansas for Medical Sciences and Central Arkansas Veterans Healthcare System, Little Rock, Arkansas
- Sanchita P. Ghosh
  - Armed Forces Radiobiology Research Institute, Uniformed Services University of the Health Sciences (USUHS), Bethesda, Maryland
|
42
|
Perez JJ, Chen CY. Implementation of normalized retention time (iRT) for bottom-up proteomic analysis of the aminoglycoside phosphotransferase enzyme facilitating method distribution. Anal Bioanal Chem 2018; 411:4701-4708. [DOI: 10.1007/s00216-018-1377-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/06/2018] [Revised: 08/15/2018] [Accepted: 09/13/2018] [Indexed: 01/05/2023]
|
43
|
Bittremieux W, Tabb DL, Impens F, Staes A, Timmerman E, Martens L, Laukens K. Quality control in mass spectrometry-based proteomics. Mass Spectrom Rev 2018; 37:697-711. [PMID: 28802010 DOI: 10.1002/mas.21544] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Received: 03/04/2017] [Revised: 07/24/2017] [Accepted: 07/24/2017] [Indexed: 05/21/2023]
Abstract
Mass spectrometry is a highly complex analytical technique and mass spectrometry-based proteomics experiments can be subject to a large variability, which forms an obstacle to obtaining accurate and reproducible results. Therefore, a comprehensive and systematic approach to quality control is an essential requirement to inspire confidence in the generated results. A typical mass spectrometry experiment consists of multiple different phases including the sample preparation, liquid chromatography, mass spectrometry, and bioinformatics stages. We review potential sources of variability that can impact the results of a mass spectrometry experiment occurring in all of these steps, and we discuss how to monitor and remedy the negative influences on the experimental results. Furthermore, we describe how specialized quality control samples of varying sample complexity can be incorporated into the experimental workflow and how they can be used to rigorously assess detailed aspects of the instrument performance.
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
- David L Tabb
  - Division of Molecular Biology and Human Genetics, Stellenbosch University Faculty of Medicine and Health Sciences, Tygerberg Hospital, Cape Town, South Africa
- Francis Impens
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- An Staes
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- Evy Timmerman
  - VIB Proteomics Core, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
  - Faculty of Medicine and Health Sciences, Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Zwijnaarde, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (Biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
|
44
|
Assembling the Community-Scale Discoverable Human Proteome. Cell Syst 2018; 7:412-421.e5. [PMID: 30172843 PMCID: PMC6279426 DOI: 10.1016/j.cels.2018.08.004] [Citation(s) in RCA: 108] [Impact Index Per Article: 15.4] [Received: 09/14/2017] [Revised: 12/22/2017] [Accepted: 08/03/2018] [Indexed: 01/15/2023]
Abstract
The increasing throughput and sharing of proteomics mass spectrometry data have now yielded over one-third of a million public mass spectrometry runs. However, these discoveries are not continuously aggregated in an open and error-controlled manner, which limits their utility. To facilitate the reusability of these data, we built the MassIVE Knowledge Base (MassIVE-KB), a community-wide, continuously updating knowledge base that aggregates proteomics mass spectrometry discoveries into an open reusable format with full provenance information for community scrutiny. Reusing >31 TB of public human data stored in a mass spectrometry interactive virtual environment (MassIVE), the MassIVE-KB contains >2.1 million precursors from 19,610 proteins (48% larger than before; 97% of the total) and doubles proteome coverage to 6 million amino acids (54% of the proteome) with strict library-scale false discovery controls, thereby providing evidence for 430 proteins for which sufficient protein-level evidence was previously missing. Furthermore, MassIVE-KB can inform experimental design, helps identify and quantify new data, and provides tools for community construction of specialized spectral libraries. Wang et al. introduce MassIVE-KB, a program designed to distill the entire community’s mass spectrometry data into reusable spectral library resources. As a result, the statistically-significant discovery of a peptide or protein in a single researcher’s data will thus be made available to the whole community to support its identification (in shotgun experiments) or quantitative detection (in targeted experiments) in all future analyses.
|
45
|
Shen J, Pagala VR, Breuer AM, Peng J, Bin Ma, Wang X. Spectral Library Search Improves Assignment of TMT Labeled MS/MS Spectra. J Proteome Res 2018; 17:3325-3331. [PMID: 30096983 DOI: 10.1021/acs.jproteome.8b00594] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/30/2022]
Abstract
Tandem mass tag (TMT)-based liquid chromatography-tandem mass spectrometry (LC-MS/MS) is a proven approach for large-scale multiplexed protein quantification. However, the identification of TMT-labeled peptides is compromised by the labeling during traditional sequence database searches. In this study, we aim to use a spectral library search to increase the sensitivity and specificity of peptide identification for TMT-based MS data. Compared to MS/MS spectra of unlabeled peptides, the spectra of TMT-labeled counterparts usually display intensified b ions, suggesting that TMT labeling can alter product ion patterns during MS/MS fragmentation. We compiled a human TMT spectral library of 401,168 unique peptides of high quality from millions of peptide-spectrum matches in tens of profiling projects, matching to 14,048 nonredundant proteins (13,953 genes). A mouse TMT spectral library of similar size was also constructed. The libraries were subsequently appended with decoy spectra to evaluate the false discovery rate, which was validated by a simulated null TMT data set. The performance of the library search was further optimized by removing TMT reporter ions and selecting an appropriate library construction method. Finally, we searched a human TMT data set against the spectral library to demonstrate that the spectral library outperformed the sequence database. Both human and mouse TMT libraries were made publicly available to the research community.
Affiliation(s)
- Jianqiao Shen
  - Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Bin Ma
  - Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
|
46
|
Ulke-Lemée A, Lau A, Nelson MC, James MT, Muruve DA, MacDonald JA. Quantification of Inflammasome Adaptor Protein ASC in Biological Samples by Multiple-Reaction Monitoring Mass Spectrometry. Inflammation 2018; 41:1396-1408. [DOI: 10.1007/s10753-018-0787-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/15/2022]
|
47
|
Mohammed Y, Palmblad M. Visualizing and comparing results of different peptide identification methods. Brief Bioinform 2018; 19:210-218. [PMID: 28011752 DOI: 10.1093/bib/bbw115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2016] [Indexed: 11/14/2022]
Abstract
In mass spectrometry-based proteomics, peptides are typically identified from tandem mass spectra using spectrum comparison. A sequence search engine compares experimentally obtained spectra with those predicted from protein sequences, applying enzyme cleavage and fragmentation rules. There are two main alternatives to this approach: spectral libraries and de novo sequencing. The former compares measured spectra with a collection of previously acquired and identified spectra in a library. The latter attempts to sequence peptides from the tandem mass spectra alone. Here we present a theoretical framework and a data processing workflow for visualizing and comparing the results of these different types of algorithms. The method considers the three search strategies as different dimensions, identifies distinct agreement classes and visualizes the complementarity of the search strategies. We have included X! Tandem, SpectraST and PepNovo, as they are in common use and representative of algorithms of each type. Our method allows advanced investigation of how the three search methods perform relative to each other and shows the impact of the currently used decoy sequences for evaluating the false discovery rates.
Affiliation(s)
- Yassene Mohammed
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands
  - University of Victoria, University of Victoria - Genome British Columbia Proteomics Centre, Canada
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands
|
48
|
Joe MK, Lieberman RL, Nakaya N, Tomarev SI. Myocilin Regulates Metalloprotease 2 Activity Through Interaction With TIMP3. Invest Ophthalmol Vis Sci 2017; 58:5308-5318. [PMID: 29049729 PMCID: PMC5644706 DOI: 10.1167/iovs.16-20336] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 01/15/2023]
Abstract
Purpose: To elucidate functions of wild-type myocilin, a secreted glycoprotein associated with glaucoma. Methods: Lysates of mouse eyes were used for immunoprecipitation with affinity-purified antibodies against mouse myocilin. Shotgun proteomic analysis was used for the identification of proteins interacting with myocilin. Colocalization of myocilin and tissue inhibitor of metalloproteinases 3 (TIMP3) in different eye structures was investigated by a multiplex fluorescent in situ hybridization and immunofluorescent labeling with subsequent confocal microscopy. Matrix metalloproteinase 2 (MMP2) activity assay was used to test effects of myocilin on TIMP3 inhibitory action. Results: TIMP3 was identified by a shotgun proteomic analysis as a protein that was coimmunoprecipitated with myocilin from eye lysates of wild-type and transgenic mice expressing elevated levels of mouse myocilin but not from lysates of transgenic mice expressing mutated mouse myocilin. Interaction of myocilin and TIMP3 was confirmed by coimmunoprecipitation of myocilin and TIMP3 from HEK293 cells transiently transfected with cDNAs encoding these proteins. The olfactomedin domain of myocilin is essential for interaction with TIMP3. In the eye, the main sites of myocilin and TIMP3 colocalization are the trabecular meshwork, sclera, and choroid. Using purified proteins, it has been shown that myocilin markedly enhanced the inhibitory activity of TIMP3 toward MMP2. Conclusions: Myocilin may serve as a modulator of TIMP3 activity via interactions with the myocilin olfactomedin domain. Our data imply that in the case of MYOCILIN null or some glaucoma-causing mutations, inhibitory activity of TIMP3 toward MMP2 might be reduced, mimicking deleterious mutations in the TIMP3 gene.
Affiliation(s)
- Myung Kuk Joe
  - Section of Retinal Ganglion Cell Biology, Laboratory of Retinal Cell and Molecular Biology, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States
- Raquel L Lieberman
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, United States
- Naoki Nakaya
  - Section of Retinal Ganglion Cell Biology, Laboratory of Retinal Cell and Molecular Biology, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States
- Stanislav I Tomarev
  - Section of Retinal Ganglion Cell Biology, Laboratory of Retinal Cell and Molecular Biology, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States
|
49
|
Williams BJ, Ciavarini SJ, Devlin C, Cohn SM, Xie R, Vissers JPC, Martin LB, Caswell A, Langridge JI, Geromanos SJ. Multi-mode acquisition (MMA): An MS/MS acquisition strategy for maximizing selectivity, specificity and sensitivity of DIA product ion spectra. Proteomics 2017; 16:2284-301. [PMID: 27296928 DOI: 10.1002/pmic.201500492] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/23/2015] [Revised: 05/16/2016] [Accepted: 06/10/2016] [Indexed: 01/08/2023]
Abstract
In proteomics studies, it is generally accepted that depth of coverage and dynamic range is limited in data-directed acquisitions. The serial nature of the method limits both sensitivity and the number of precursor ions that can be sampled. To that end, a number of data-independent acquisition (DIA) strategies have been introduced with these methods, for the most part, immune to the sampling issue; nevertheless, some do have other limitations with respect to sensitivity. The major limitation with DIA approaches is interference, i.e., MS/MS spectra are highly chimeric and often incapable of being identified using conventional database search engines. Utilizing each available dimension of separation prior to ion detection, we present a new multi-mode acquisition (MMA) strategy multiplexing both narrowband and wideband DIA acquisitions in a single analytical workflow. The iterative nature of the MMA workflow limits the adverse effects of interference with minimal loss in sensitivity. Qualitative identification can be performed by selected ion chromatograms or conventional database search strategies.
Affiliation(s)
- Rong Xie
  - Waters Corporation, Milford, MA, USA
|
50
|
Shao W, Lam H. Tandem mass spectral libraries of peptides and their roles in proteomics research. Mass Spectrom Rev 2017; 36:634-648. [PMID: 27403644 DOI: 10.1002/mas.21512] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 11/30/2015] [Accepted: 05/21/2016] [Indexed: 05/15/2023]
Abstract
Proteomics is a rapidly maturing field aimed at the high-throughput identification and quantification of all proteins in a biological system. The cornerstone of proteomic technology is tandem mass spectrometry of peptides resulting from the digestion of protein mixtures. The fragmentation pattern of each peptide ion is captured in its tandem mass spectrum, which enables its identification and acts as a fingerprint for the peptide. Spectral libraries are simply searchable collections of these fingerprints, which have taken on an increasingly prominent role in proteomic data analysis. This review describes the historical development of spectral libraries in proteomics, details the computational procedures behind library building and searching, surveys the current applications of spectral libraries, and discusses the outstanding challenges. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:634-648, 2017.
Affiliation(s)
- Wenguang Shao
  - Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule (ETH) Zurich, Zurich, Switzerland
  - Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Henry Lam
  - Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
  - Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
|