1
Correia I, Oliveira C, Reis A, Guimarães AR, Aveiro S, Domingues P, Bezerra AR, Vitorino R, Moura G, Santos MAS. A proteogenomic pipeline for the analysis of protein biosynthesis errors in the human pathogen Candida albicans. Mol Cell Proteomics 2024:100818. [PMID: 39047911 DOI: 10.1016/j.mcpro.2024.100818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 03/20/2024] [Accepted: 07/19/2024] [Indexed: 07/27/2024] Open
Abstract
Candida albicans is a diploid pathogen known for its ability to live as a commensal fungus in healthy individuals while causing both superficial infections and disseminated candidiasis in immunocompromised patients, where it is associated with high morbidity and mortality. Its success in colonizing the human host is attributed to a wide range of virulence traits that modulate interactions between the host and the pathogen, such as an optimal growth rate at 37 °C, the ability to switch between yeast and hyphal forms, and a remarkable genomic and phenotypic plasticity. A fascinating aspect of its biology is a prominently heterogeneous proteome that arises from frequent genomic rearrangements, high allelic variation, and high levels of amino acid misincorporation in proteins. This leads to increased morphological and physiological phenotypic diversity of high adaptive potential, but the scope of such protein mistranslation is poorly understood due to technical difficulties in detecting and quantifying amino acid misincorporation events in complex protein samples. We have developed and optimized mass spectrometry and bioinformatics pipelines capable of identifying rare amino acid misincorporation events at the proteome level. We have also analysed the proteomic profile of an engineered C. albicans strain that exhibits a high level of leucine misincorporation at protein CUG sites and employed an in vivo quantitative gain-of-function fluorescence reporter system to validate our LC-MS/MS data. C. albicans misincorporates amino acids above the background level at protein sites of diverse codons, particularly at CUG, confirming our previous data on the quantification of leucine incorporation at single CUG sites of recombinant reporter proteins, but increasing misincorporation of leucine at these sites does not alter the translational fidelity of the other codons. These findings indicate that the C. albicans statistical proteome exceeds prior estimates, suggesting that its highly plastic phenome may also be modulated by environmental factors due to translational ambiguity.
Affiliation(s)
- Inês Correia
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Carla Oliveira
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Andreia Reis
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Ana Rita Guimarães
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Susana Aveiro
- Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Pedro Domingues
- Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Ana Rita Bezerra
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Rui Vitorino
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Gabriela Moura
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal
- Manuel A S Santos
- Institute of Biomedicine (iBiMED) and Department of Medical Sciences (DCM), University of Aveiro, Aveiro, Portugal; Multidisciplinary Institute of Ageing (MIA-Portugal), University of Coimbra, Coimbra, Portugal
2
Lapcik P, Synkova K, Janacova L, Bouchalova P, Potesil D, Nenutil R, Bouchal P. A hybrid DDA/DIA-PASEF based assay library for a deep proteotyping of triple-negative breast cancer. Sci Data 2024; 11:794. [PMID: 39025866 PMCID: PMC11258311 DOI: 10.1038/s41597-024-03632-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 07/10/2024] [Indexed: 07/20/2024] Open
Abstract
Triple-negative breast cancer (TNBC) is the most aggressive subtype of breast cancer, and deeper proteome coverage is needed for its molecular characterization. We present a comprehensive library of targeted mass spectrometry assays specific for TNBC and demonstrate its applicability. Proteins were extracted from 105 TNBC tissues and digested. Aliquots were pooled, fractionated using hydrophilic chromatography and analyzed by LC-MS/MS in data-dependent acquisition (DDA) parallel accumulation-serial fragmentation (PASEF) mode on a timsTOF Pro LC-MS system. Sixteen individual lysates were analyzed in data-independent acquisition (DIA)-PASEF mode. A hybrid library was generated in Spectronaut software and covers 244,464 precursors, 168,006 peptides and 11,564 protein groups (FDR = 1%). Applying our library to a pilot quantitative analysis of 16 tissues increased identification numbers in Spectronaut 18.5 and DIA-NN 1.8.1 compared with the library-free setting, with Spectronaut achieving the best results: 190,310 precursors, 140,566 peptides, and 10,463 protein groups. In conclusion, we introduce an assay library that offers the deepest coverage of the TNBC proteome to date. The TNBC library is available via the PRIDE repository (PXD047793).
Grants
- NU22-08-00230 Ministerstvo Zdravotnictví České Republiky (Ministry of Health of the Czech Republic)
- LX22NPO5102 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
- CZ.02.1.01/0.0/0.0/18_046/0015974 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
- LM2023033 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports)
Affiliation(s)
- Petr Lapcik
- Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Klara Synkova
- Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Lucia Janacova
- Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Pavla Bouchalova
- Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- David Potesil
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Rudolf Nenutil
- Department of Oncological Pathology, Masaryk Memorial Cancer Institute, Brno, Czech Republic
- Pavel Bouchal
- Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
3
Zakopcanik M, Kavan D, Kukacka Z, Novak P, Loginov DS. Data-Independent Acquisition Represents a Promising Alternative for Fast Photochemical Oxidation of Proteins (FPOP) Samples Analysis. Anal Chem 2024; 96:11273-11279. [PMID: 38967040 PMCID: PMC11256011 DOI: 10.1021/acs.analchem.4c01084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 06/27/2024] [Accepted: 06/28/2024] [Indexed: 07/06/2024]
Abstract
Fast Photochemical Oxidation of Proteins (FPOP) is a protein footprinting method that uses hydroxyl radicals to provide valuable information on the solvent-accessible surface area. The extensive number of oxidative modifications created by FPOP is both advantageous, leading to great spatial resolution, and challenging, increasing the complexity of data processing. Precise localization of the modification, together with appropriate reproducibility, is crucial to obtain relevant structural information. In this paper, we propose a novel approach that combines validated spectral libraries with DIA data. First, DDA data searched with FragPipe are validated using Skyline software to form a spectral library. This library is then matched against the DIA data to filter out nonrepresentative IDs. In comparison with FPOP data processing using only a search engine followed by generally applied filtration steps, the manually validated spectral library offers higher confidence in identifications and increased spatial resolution. Furthermore, the reproducibility of quantification was compared for DIA, DDA, and MS-only acquisition modes on a timsTOF SCP. Comparison of coefficients of variation (CV) showed that the DIA and MS-only acquisition modes exhibit significantly better reproducibility in quantification (CV medians 0.1233 and 0.1494, respectively) compared to the DDA mode (CV median 0.2104).
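The reproducibility figures quoted in this abstract are median coefficients of variation (CV = standard deviation / mean) across replicate quantifications. A minimal Python sketch of that calculation follows. The peptide names and intensities are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean of replicate measurements."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def median_cv(replicates_by_peptide):
    """Median CV across peptides; each entry needs at least two replicates."""
    return median(
        coefficient_of_variation(v) for v in replicates_by_peptide.values()
    )


# Hypothetical replicate intensities (illustrative only, not data from the paper).
example = {
    "PEPTIDE_A": [100.0, 110.0, 90.0],
    "PEPTIDE_B": [50.0, 50.0, 50.0],
    "PEPTIDE_C": [200.0, 260.0, 140.0],
}

if __name__ == "__main__":
    print(f"median CV = {median_cv(example):.4f}")
```

A lower median CV for one acquisition mode than another, computed over the same peptides, is what the abstract reports as better reproducibility.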
Affiliation(s)
- Marek Zakopcanik
- Institute of Microbiology, The Czech Academy of Sciences, 14220 Prague, Czech Republic
- Department of Biochemistry, Faculty of Science, Charles University, 12820 Prague, Czech Republic
- Daniel Kavan
- Institute of Microbiology, The Czech Academy of Sciences, 14220 Prague, Czech Republic
- Zdenek Kukacka
- Institute of Microbiology, The Czech Academy of Sciences, 14220 Prague, Czech Republic
- Petr Novak
- Institute of Microbiology, The Czech Academy of Sciences, 14220 Prague, Czech Republic
- Dmitry S. Loginov
- Institute of Microbiology, The Czech Academy of Sciences, 14220 Prague, Czech Republic
4
Reis A, Augusti R, Eberlin MN. A general, most basic rule for ion dissociation: Protonated molecules. J Mass Spectrom 2024; 59:e5003. [PMID: 38445745 DOI: 10.1002/jms.5003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Revised: 01/03/2024] [Accepted: 01/11/2024] [Indexed: 03/07/2024]
Abstract
Contrary to the common but potentially misleading belief that when a protonated molecule is excited, it is its most stable protomer that will mandatorily dissociate, we demonstrate herein that, when rationalizing or predicting the chemistry of such ions, we should always search for the most labile protomer. This "most labile protomer" rule, based on the mobile proton model, states therefore that when a protonated molecule is heated, during ionization or by collisions for instance, the loosely bonded proton (H+) can acquire enough energy to detach itself from the most basic site of the molecule and then freely "walk through" the molecular framework to eventually find, if available, another protonation site, forming other less stable but more labile protomers, that is, protomers that may display lower dissociation thresholds. To demonstrate the validity of the "most labile protomer" rule as well as the misleading nature of the "most stable protomer" rule, we have selected several illustrative molecules and have collected their ESI(+)-MS/MS spectra. To compare energies of precursors and products, we have also performed PM7 calculations and elaborated potential energy surface diagrams for their possible protomers and dissociation thresholds. We have also applied the "most labile protomer" rule to reinterpret, exclusively via classical charge-induced dissociation cleavages, several dissociation processes proposed for protonated molecules. In an accompanying letter, we have also applied a similar "most labile electromer" rule to ionized molecules.
Affiliation(s)
- Adriano Reis
- School of Engineering, Mackenzie Presbyterian University, São Paulo, SP, Brazil
- Mackenzie Institute for Research in Graphene and Nanotechnologies (MackGraphe), São Paulo, SP, Brazil
- Rodinei Augusti
- Department of Chemistry, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
- Marcos N Eberlin
- School of Engineering, Mackenzie Presbyterian University, São Paulo, SP, Brazil
- Mackenzie Institute for Research in Graphene and Nanotechnologies (MackGraphe), São Paulo, SP, Brazil
5
Walmsley SJ, Guo J, Tarifa A, DeCaprio AP, Cooke MS, Turesky RJ, Villalta PW. Mass Spectral Library for DNA Adductomics. Chem Res Toxicol 2024; 37:302-310. [PMID: 38231175 PMCID: PMC10939812 DOI: 10.1021/acs.chemrestox.3c00302] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/18/2024]
Abstract
Endogenous electrophiles, ionizing and non-ionizing radiation, and hazardous chemicals present in the environment and diet can damage DNA by forming covalent adducts. DNA adducts can form in critical cancer driver genes and, if not repaired, may induce mutations during cell division, potentially leading to the onset of cancer. The detection and quantification of specific DNA adducts are some of the first steps in studying their role in carcinogenesis, the physiological conditions that lead to their production, and the risk assessment of exposure to specific genotoxic chemicals. Hundreds of different DNA adducts have been reported in the literature, and there is a critical need to establish a DNA adduct mass spectral database to facilitate the detection of previously observed DNA adducts and characterize newly discovered DNA adducts. We have collected synthetic DNA adduct standards from the research community, acquired MSn (n = 2, 3) fragmentation spectra using Orbitrap and Quadrupole-Time-of-Flight (Q-TOF) MS instrumentation, processed the spectral data and incorporated it into the MassBank of North America (MoNA) database, and created a DNA adduct portal Web site (https://sites.google.com/umn.edu/dnaadductportal) to serve as a central location for the DNA adduct mass spectra and metadata, including the spectral database downloadable in different formats. This spectral library should prove to be a valuable resource for the DNA adductomics community, accelerating research and improving our understanding of the role of DNA adducts in disease.
Affiliation(s)
- Scott J Walmsley
- Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Jingshu Guo
- Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Department of Medicinal Chemistry, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Thermo Fisher Scientific, San Jose, California 95134, United States
- Anamary Tarifa
- Forensic & Analytical Toxicology Facility, Department of Chemistry and Biochemistry, Florida International University, Miami, Florida 33199, United States
- Anthony P DeCaprio
- Forensic & Analytical Toxicology Facility, Department of Chemistry and Biochemistry, Florida International University, Miami, Florida 33199, United States
- Marcus S Cooke
- Oxidative Stress Group, Department of Molecular Biosciences, University of South Florida, Tampa, Florida 33620, United States
- Robert J Turesky
- Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Department of Medicinal Chemistry, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Peter W Villalta
- Masonic Cancer Center, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Department of Medicinal Chemistry, College of Pharmacy, University of Minnesota, Minneapolis, Minnesota 55455, United States
6
Ng CCA, Zhou Y, Yao ZP. Algorithms for de-novo sequencing of peptides by tandem mass spectrometry: A review. Anal Chim Acta 2023; 1268:341330. [PMID: 37268337 DOI: 10.1016/j.aca.2023.341330] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/22/2022] [Revised: 05/04/2023] [Accepted: 05/06/2023] [Indexed: 06/04/2023]
Abstract
Peptide sequencing is of great significance to fundamental and applied research in fields such as chemical, biological, medicinal and pharmaceutical sciences. With the rapid development of mass spectrometry and sequencing algorithms, de-novo peptide sequencing using tandem mass spectrometry (MS/MS) has become the main method for determining amino acid sequences of novel and unknown peptides. Advanced algorithms allow amino acid sequence information to be accurately obtained from MS/MS spectra in a short time. In this review, algorithms ranging from exhaustive search to state-of-the-art machine learning and neural networks for high-throughput and automated de-novo sequencing are introduced and compared. The impact of datasets on algorithm performance is highlighted. The current limitations and promising directions of de-novo peptide sequencing are also discussed.
Affiliation(s)
- Cheuk Chi A Ng
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
- Yin Zhou
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China
- Zhong-Ping Yao
- State Key Laboratory of Chemical Biology and Drug Discovery, and Department of Applied Biology and Chemical Technology, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; Research Institute for Future Food, and Research Center for Chinese Medicine Innovation, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong Special Administrative Region of China; State Key Laboratory of Chinese Medicine and Molecular Pharmacology (Incubation), and Shenzhen Key Laboratory of Food Biological Safety Control, The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, 518057, China.
7
Shi S, Wen L, Hu S, Chen L, Qiao J, Hong H. Rapid Screening of Methamphetamine in Hair by Ambient Ionization Mass Spectrometry (AIMS). Anal Lett 2023. [DOI: 10.1080/00032719.2023.2180016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Affiliation(s)
- Shengyang Shi
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- Luhong Wen
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- China Innovation Instrument Company, Ningbo, China
- Hua Yue Enterprise Holdings, Guangzhou, China
- Shundi Hu
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- China Innovation Instrument Company, Ningbo, China
- La Chen
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- China Innovation Instrument Company, Ningbo, China
- Juanjuan Qiao
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- Huanhuan Hong
- Research Institute of Advanced Technologies, Ningbo University, Ningbo, China
- China Innovation Instrument Company, Ningbo, China
8
Arab I, Fondrie WE, Laukens K, Bittremieux W. Semisupervised Machine Learning for Sensitive Open Modification Spectral Library Searching. J Proteome Res 2023; 22:585-593. [PMID: 36688569 DOI: 10.1021/acs.jproteome.2c00616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
A key analysis task in mass spectrometry proteomics is matching the acquired tandem mass spectra to their originating peptides by sequence database searching or spectral library searching. Machine learning is an increasingly popular postprocessing approach to maximize the number of confident spectrum identifications that can be obtained at a given false discovery rate threshold. Here, we have integrated semisupervised machine learning in the ANN-SoLo tool, an efficient spectral library search engine that is optimized for open modification searching to identify peptides with any type of post-translational modification. We show that machine learning rescoring boosts the number of spectra that can be identified for both standard searching and open searching, and we provide insights into relevant spectrum characteristics harnessed by the machine learning model. The semisupervised machine learning functionality has now been fully integrated into ANN-SoLo, which is available as open source under the permissive Apache 2.0 license on GitHub at https://github.com/bittremieux/ANN-SoLo.
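The "false discovery rate threshold" mentioned in this abstract is commonly estimated with a target-decoy strategy. The sketch below shows that generic estimate in Python: it ranks peptide-spectrum matches (PSMs) by score, estimates FDR as decoys / targets at each rank, and converts it to q-values. It is an illustration under stated assumptions, not ANN-SoLo's implementation; the function names, the score convention (higher is better), and the example PSMs are all hypothetical.

```python
def q_values(psms):
    """Assign target-decoy q-values.

    psms: iterable of (score, is_decoy) pairs; a higher score is a better match.
    Returns a list of (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate at each rank: decoys seen / targets seen.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    qs = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


def accepted_targets(psms, threshold=0.01):
    """Scores of target PSMs whose q-value is at or below the threshold."""
    return [s for s, is_decoy, q in q_values(psms) if not is_decoy and q <= threshold]


# Hypothetical PSMs: (score, is_decoy).
example_psms = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, False)]
```

Rescoring with a machine learning model, as the abstract describes, changes the scores that go into such a ranking; a better separation of targets from decoys lets more target PSMs pass the same threshold.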
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium; Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
9
Dorl S, Winkler S, Mechtler K, Dorfer V. MS Ana: Improving Sensitivity in Peptide Identification with Spectral Library Search. J Proteome Res 2023; 22:462-470. [PMID: 36688604 PMCID: PMC9903325 DOI: 10.1021/acs.jproteome.2c00658] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/24/2023]
Abstract
Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivity peptide identification using either curated or predicted spectral libraries as well as robust false discovery control through its own decoy library generation algorithm. MS Ana identifies on average 36% more spectrum matches and 4% more proteins than database search in a benchmark test on single-shot human cell-line data. Further, we demonstrate the quality of the result validation with tests on synthetic peptide pools and show the importance of library selection through a comparison of library search performance with different configurations of publicly available human spectral libraries.
Affiliation(s)
- Sebastian Dorl
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
- Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria
- Phone: +43 (0) 50804 27145
- Stephan Winkler
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
- Department of Computer Science, Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Austria
- Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Protein Chemistry, Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Institute of Molecular Biotechnology (IMBA), Protein Chemistry, Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Gregor Mendel Institute of Molecular Plant Biology of the Austrian Academy of Sciences (GMI), Dr. Bohr Gasse 3, 1030 Vienna, Austria
- Viktoria Dorfer
- University of Applied Sciences Upper Austria, Bioinformatics Research Group, Softwarepark 11, 4232 Hagenberg, Austria
- Phone: +43 (0) 50804 22740
10
Proteomic overview of hepatocellular carcinoma cell lines and generation of the spectral library. Sci Data 2022; 9:732. [PMID: 36446815 PMCID: PMC9708666 DOI: 10.1038/s41597-022-01845-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 11/14/2022] [Indexed: 12/02/2022] Open
Abstract
Cell lines are extensively used tools; therefore, a comprehensive proteomic overview of hepatocellular carcinoma (HCC) cell lines and an extensive spectral library for data-independent acquisition (DIA) quantification are necessary. Here, we present the proteome of nine commonly used HCC cell lines covering 9,208 protein groups, and the HCC spectral library containing 253,921 precursors, 168,811 peptides and 10,098 protein groups. The proteomic overview reveals the heterogeneity between different cell lines, and their similarity to tumour tissues in proliferation and metastasis characteristics and in drug-target expression. Generating the HCC spectral library consumed 108 hours of runtime for data-dependent acquisition (DDA) of 48 runs, 24 hours of runtime for database searching with MaxQuant version 2.0.3.0, and 1 hour of runtime for processing with Spectronaut™ version 15.2. The HCC spectral library supports quantification of 7,637 protein groups in triplicate 2-hour DIA analyses of HepG2 and the discovery of biological alterations. This study provides valuable resources for HCC cell lines and efficient DIA quantification on the LC-Orbitrap platform, further helping to explore molecular mechanisms and candidate therapeutic targets.
11
Fierro-Monti I, Wright JC, Choudhary JS, Vizcaíno JA. Identifying individuals using proteomics: are we there yet? Front Mol Biosci 2022; 9:1062031. [PMID: 36523653 PMCID: PMC9744771 DOI: 10.3389/fmolb.2022.1062031] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/05/2022] [Accepted: 11/16/2022] [Indexed: 08/31/2023] Open
Abstract
Multi-omics approaches including proteomics analyses are becoming an integral component of precision medicine. As clinical proteomics studies gain momentum and their sensitivity increases, research on identifying individuals based on their proteomics data is here examined for risks and ethics-related issues. A great deal of work has already been done on this topic for DNA/RNA sequencing data, but it has yet to be widely studied in other omics fields. The current state-of-the-art for the identification of individuals based solely on proteomics data is explained. Protein sequence variation analysis approaches are covered in more detail, including the available analysis workflows and their limitations. We also outline some previous forensic and omics proteomics studies that are relevant for the identification of individuals. Following that, we discuss the risks of patient reidentification using other proteomics data types such as protein expression abundance and post-translational modification (PTM) profiles. In light of the potential identification of individuals through proteomics data, possible legal and ethical implications are becoming increasingly important in the field.
Affiliation(s)
- Ivo Fierro-Monti
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
12
Bittremieux W, Wang M, Dorrestein PC. The critical role that spectral libraries play in capturing the metabolomics community knowledge. Metabolomics 2022; 18:94. [PMID: 36409434 DOI: 10.1007/s11306-022-01947-y] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/01/2022] [Accepted: 10/19/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Spectral library searching is currently the most common approach for compound annotation in untargeted metabolomics. Spectral libraries applicable to liquid chromatography mass spectrometry have grown in size over the past decade to include hundreds of thousands to millions of mass spectra and tens of thousands of compounds, forming an essential knowledge base for the interpretation of metabolomics experiments. AIM OF REVIEW: We describe existing spectral library resources, highlight different strategies for compiling spectral libraries, and discuss quality considerations that should be taken into account when interpreting spectral library searching results. Finally, we describe how spectral libraries are empowering the next generation of machine learning tools in computational metabolomics, and discuss several opportunities for using increasingly accessible large spectral libraries. KEY SCIENTIFIC CONCEPTS OF REVIEW: This review focuses on the current state of spectral libraries for untargeted LC-MS/MS based metabolomics. We show how the number of entries in publicly accessible spectral libraries has increased more than 60-fold in the past eight years to aid molecular interpretation and we discuss how the role of spectral libraries in untargeted metabolomics will evolve in the near future.
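Spectral library searching, as described in this abstract, scores a query spectrum against library spectra with a similarity measure. A minimal Python sketch of one widely used measure, the cosine similarity of binned peak intensities, follows. The bin width, function names, and peak lists are assumptions chosen for illustration; real tools typically add precursor-mass filtering, intensity scaling, and peak-matching tolerances that are omitted here.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.05):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(query, reference, bin_width=0.05):
    """Cosine of the angle between two binned spectra (0 = no overlap, 1 = same shape)."""
    a = bin_spectrum(query, bin_width)
    b = bin_spectrum(reference, bin_width)
    dot = sum(intensity * b.get(key, 0.0) for key, intensity in a.items())
    norm_a = math.sqrt(sum(i * i for i in a.values()))
    norm_b = math.sqrt(sum(i * i for i in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library, bin_width=0.05):
    """Return (name, score) of the library entry most similar to the query."""
    return max(
        ((name, cosine_similarity(query, peaks, bin_width)) for name, peaks in library.items()),
        key=lambda item: item[1],
    )


# Hypothetical spectra as lists of (m/z, intensity) pairs.
query_spectrum = [(100.02, 10.0), (200.01, 5.0)]
toy_library = {
    "compound_X": [(100.02, 20.0), (200.01, 10.0)],  # same pattern, doubled intensity
    "compound_Y": [(150.00, 1.0)],                   # no shared peaks
}
```

Because the cosine is scale-invariant, a library spectrum with the same peak pattern at a different absolute intensity still scores close to 1, which is one reason the measure is popular for library matching.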
Affiliation(s)
- Wout Bittremieux
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Mingxun Wang
- Department of Computer Science, University of California Riverside, Riverside, CA, 92507, USA
- Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
13
|
Cao Z, Li G. MStoCIRC: A powerful tool for downstream analysis of MS/MS data to predict translatable circRNAs. Front Mol Biosci 2022; 9:791797. [PMID: 36072432 PMCID: PMC9441560 DOI: 10.3389/fmolb.2022.791797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2021] [Accepted: 07/18/2022] [Indexed: 11/13/2022] Open
Abstract
CircRNAs are formed by a non-canonical splicing method and appear circular in nature. CircRNAs are widely distributed in organisms and show time- and tissue-specific expression. CircRNAs have attracted increasing interest from scientists because of their non-negligible effects on the growth and development of organisms. The translation capability of circRNAs is a novel and valuable direction in the functional research of circRNAs. To explore the translation potential of circRNAs, some progress has been made in both experimental identification and computational prediction. For computational prediction, both CircCode and CircPro are ribosome profiling-based software applications for predicting translatable circRNAs, and the online databases riboCIRC and TransCirc analyze as many pieces of evidence as possible and list the predicted translatable circRNAs of high confidence. Simultaneously, mass spectrometry in proteomics is often recognized as an efficient method to support the identification of protein and peptide sequences from diverse complex templates. However, few applications fully utilize mass spectrometry to predict translatable circRNAs. Therefore, this research aims to build up a scientific analysis pipeline with two salient features: 1) it starts with the data analysis of raw tandem mass spectrometry data; and 2) it also incorporates other translation evidence such as IRES. The pipeline has been packaged into an analysis tool called mass spectrometry to translatable circRNAs (MStoCIRC). MStoCIRC is mainly implemented in Python 3 and can be downloaded from GitHub (https://github.com/QUMU00/mstocirc-master). The tool contains a main program and several small, independent function modules, making it multifunctional. MStoCIRC can process data efficiently and has obtained hundreds of translatable circRNAs in humans and Arabidopsis thaliana.
14
Bacala R, Hatcher DW, Perreault H, Fu BX. Challenges and opportunities for proteomics and the improvement of bread wheat quality. J Plant Physiol 2022; 275:153743. [PMID: 35749977 DOI: 10.1016/j.jplph.2022.153743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/13/2022] [Accepted: 05/30/2022] [Indexed: 06/15/2023]
Abstract
Wheat remains a critical global food source, pressured by climate change and the need to maximize yield, improve processing and nutritional quality and ensure safety. An enormous amount of research has been conducted to understand gluten protein composition and structure in relation to end-use quality, yet progress has become stagnant. This is mainly due to the need and inability to biochemically characterize the intact functional glutenin polymer in order to correlate to quality, necessitating reduction to monomeric subunits and a loss of contextual information. While some individual gluten proteins might have a positive or negative influence on gluten quality, it is the sum total of these proteins, their relative and absolute expression, their sub-cellular trafficking, the amount and size of glutenin polymers, and ratios between gluten protein classes that define viscoelasticity of gluten. The sub-cellular trafficking of gluten proteins during seed maturation is still not completely clear and there is evidence of dual pathways and therefore different destinations for proteins, either constitutively or temporally. The trafficking of proteins is also unclear in endosperm cells as they undergo programmed cell death; Golgi disappear around 12 DPA but protein filling continues at least to 25 DPA. Modulation of the timing of cellular events will invariably affect protein deposition and therefore gluten strength and function. Existing and emerging proteomics technologies such as proteoform profiling and top-down proteomics offer new tools to study gluten protein composition as a whole system and identify compositional patterns that can modify gluten structure with improved functionality.
Affiliation(s)
- Ray Bacala
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
  - University of Manitoba, Department of Chemistry, 144 Dysart Road, Winnipeg, Manitoba, R3T 2N2, Canada
- Dave W Hatcher
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
- Héléne Perreault
  - University of Manitoba, Department of Chemistry, 144 Dysart Road, Winnipeg, Manitoba, R3T 2N2, Canada
- Bin Xiao Fu
  - Canadian Grain Commission, Grain Research Laboratory, 1404-303 Main Street, Winnipeg, Manitoba, R3C 3G8, Canada
  - Department of Food and Human Nutritional Sciences, 209 - 35 Chancellor's Circle, University of Manitoba, Winnipeg, Manitoba, R3T 2N2, Canada
15
Tay AP, Hamey JJ, Martyn GE, Wilson LOW, Wilkins MR. Identification of Protein Isoforms Using Reference Databases Built from Long and Short Read RNA-Sequencing. J Proteome Res 2022; 21:1628-1639. [PMID: 35612954 DOI: 10.1021/acs.jproteome.1c00968] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Alternative splicing can lead to distinct protein isoforms. These can have different functions in specific cells and tissues or in different developmental stages. In this study, we explored whether transcripts assembled from long read, nanopore-based, direct RNA-sequencing (RNA-seq) could improve the identification of protein isoforms in human K562 cells. By comparing with Illumina-based short read RNA-seq, we showed that a large proportion of Ensembl transcripts (5949/14,326) and genes expressing alternatively spliced transcripts (486/2981) identified with long direct reads were missed by short paired-end reads. By co-analyzing proteomic and transcriptomic data, we also showed that some peptides (826/35,976), proteins (262/3215), and protein isoforms arising from distinct transcript variants (574/1212) identified with isoform-specific peptides via custom long-read-based databases were missed in Illumina-derived databases. Finally, we generated unequivocal peptide evidence for a set of protein isoforms and showed that long read, direct RNA-seq allows the discovery of novel protein isoforms not already in reference databases or custom databases built from short read RNA-seq data. Our analysis highlights the benefits of long read RNA-seq data in the generation of reference databases to increase tandem mass spectrometry (MS/MS) identification of protein isoforms.
Affiliation(s)
- Aidan P Tay
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia
  - Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Joshua J Hamey
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Gabriella E Martyn
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
- Laurence O W Wilson
  - Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia
  - Applied Biosciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Marc R Wilkins
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, New South Wales 2052, Australia
16
Yan T, Palmer AB, Geiszler DJ, Polasky DA, Boatner LM, Burton NR, Armenta E, Nesvizhskii AI, Backus KM. Enhancing Cysteine Chemoproteomic Coverage through Systematic Assessment of Click Chemistry Product Fragmentation. Anal Chem 2022; 94:3800-3810. [PMID: 35195394 DOI: 10.1021/acs.analchem.1c04402] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Abstract
Mass spectrometry-based chemoproteomics has enabled functional analysis and small molecule screening at thousands of cysteine residues in parallel. Widely adopted chemoproteomic sample preparation workflows rely on the use of pan cysteine-reactive probes such as iodoacetamide alkyne combined with biotinylation via copper-catalyzed azide-alkyne cycloaddition (CuAAC) or "click chemistry" for cysteine capture. Despite considerable advances in both sample preparation and analytical platforms, current techniques only sample a small fraction of all cysteines encoded in the human proteome. Extending the recently introduced labile mode of the MSFragger search engine, here we report an in-depth analysis of cysteine biotinylation via click chemistry (CBCC) reagent gas-phase fragmentation during MS/MS analysis. We find that CBCC conjugates produce both known and novel diagnostic fragments and peptide remainder ions. Among these species, we identified a candidate signature ion for CBCC peptides, the cyclic oxonium-biotin fragment ion that is generated upon fragmentation of the N(triazole)-C(alkyl) bond. Guided by our empirical comparison of fragmentation patterns of six CBCC reagent combinations, we achieved enhanced coverage of cysteine-labeled peptides. Implementation of labile searches afforded unique PSMs and provides a roadmap for the utility of such searches in enhancing chemoproteomic peptide coverage.
Affiliation(s)
- Tianyang Yan
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
- Andrew B Palmer
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
- Daniel J Geiszler
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Daniel A Polasky
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Lisa M Boatner
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
- Nikolas R Burton
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
- Ernest Armenta
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
- Alexey I Nesvizhskii
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Pathology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Keriann M Backus
  - Biological Chemistry Department, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, United States
  - Department of Chemistry and Biochemistry, UCLA, Los Angeles, California 90095, United States
17
Lapcik P, Janacova L, Bouchalova P, Potesil D, Podhorec J, Hora M, Poprach A, Fiala O, Bouchal P. A large-scale assay library for targeted protein quantification in renal cell carcinoma tissues. Proteomics 2021; 22:e2100228. [PMID: 34902229 DOI: 10.1002/pmic.202100228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2021] [Revised: 12/07/2021] [Accepted: 12/07/2021] [Indexed: 11/08/2022]
Abstract
Renal cell carcinoma (RCC) represents 2.2% of all cancer incidences; however, prognostic or predictive RCC biomarkers at protein level are largely missing. To support proteomics research of localized and metastatic RCC, we introduce a new library of targeted mass spectrometry assays for accurate protein quantification in malignant and normal kidney tissue. Aliquots of 86 initially localized RCC, 75 metastatic RCC and 17 adjacent non-cancerous fresh frozen tissue lysates were trypsin digested, pooled, and fractionated using hydrophilic chromatography. The fractions were analyzed using LC-MS/MS on QExactive HF-X mass spectrometer in data-dependent acquisition (DDA) mode. A resulting spectral library contains 77,817 peptides representing 7960 protein groups (FDR = 1%). Further, we confirm applicability of this library on four RCC datasets measured in data-independent acquisition (DIA) mode, demonstrating a specific quantification of a substantially increased part of RCC proteome, depending on LC-MS/MS instrumentation. Impact of sample specificity of the library on the results of targeted DIA data extraction was demonstrated by parallel analyses of two datasets by two pan human libraries. The new RCC specific library has potential to contribute to better understanding the RCC development at molecular level, leading to new diagnostic and therapeutic targets.
Affiliation(s)
- Petr Lapcik
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Lucia Janacova
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- Pavla Bouchalova
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
- David Potesil
  - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Jan Podhorec
  - Department of Comprehensive Cancer Care, Masaryk Memorial Cancer Institute, Brno, Czech Republic
  - Department of Comprehensive Cancer Care, Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Milan Hora
  - Department of Urology, Faculty of Medicine and University Hospital in Pilsen, Charles University, Plzen, Czech Republic
- Alexandr Poprach
  - Department of Comprehensive Cancer Care, Masaryk Memorial Cancer Institute, Brno, Czech Republic
  - Department of Comprehensive Cancer Care, Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Ondrej Fiala
  - Department of Oncology and Radiotherapy, Faculty of Medicine and University Hospital in Pilsen, Charles University, Plzen, Czech Republic
  - Laboratory of Cancer Treatment and Tissue Regeneration, Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Plzen, Czech Republic
- Pavel Bouchal
  - Department of Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
18
Gong Y, Qin S, Dai L, Tian Z. The glycosylation in SARS-CoV-2 and its receptor ACE2. Signal Transduct Target Ther 2021; 6:396. [PMID: 34782609 PMCID: PMC8591162 DOI: 10.1038/s41392-021-00809-8] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 08/08/2021] [Revised: 10/10/2021] [Accepted: 10/24/2021] [Indexed: 02/05/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19), a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has infected more than 235 million individuals and led to more than 4.8 million deaths worldwide as of October 5, 2021. Cryo-electron microscopy and topology show that the SARS-CoV-2 genome encodes many highly glycosylated proteins, such as spike (S), envelope (E), membrane (M), and ORF3a proteins, which are responsible for host recognition, penetration, binding, recycling and pathogenesis. Here we reviewed the detection, substrates, and biological functions of glycosylation in SARS-CoV-2 proteins as well as in the human receptor ACE2, and also summarized the approved and in-development SARS-CoV-2 therapeutics associated with glycosylation. This review may not only broaden the understanding of viral glycobiology, but also provide key clues for the development of new preventive and therapeutic methodologies against SARS-CoV-2 and its variants.
Affiliation(s)
- Yanqiu Gong
  - National Clinical Research Center for Geriatrics and Department of General Practice, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center of Biotherapy, 610041, Chengdu, China
- Suideng Qin
  - School of Chemical Science & Engineering, Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, 200092, Shanghai, China
- Lunzhi Dai
  - National Clinical Research Center for Geriatrics and Department of General Practice, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, and Collaborative Innovation Center of Biotherapy, 610041, Chengdu, China
- Zhixin Tian
  - School of Chemical Science & Engineering, Shanghai Key Laboratory of Chemical Assessment and Sustainability, Tongji University, 200092, Shanghai, China
19
Teh R, Azimi A, Ali M, Mann G, Fernández-Peñas P. Specialised skin cancer spectral library for use in data-independent mass spectrometry. Proteomics 2021; 21:e2100128. [PMID: 34374218 DOI: 10.1002/pmic.202100128] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/16/2021] [Revised: 07/12/2021] [Accepted: 07/26/2021] [Indexed: 11/08/2022]
Affiliation(s)
- Rachel Teh
  - Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, 2145, Australia
  - Centre for Cancer Research, Westmead Institute for Medical Research, Westmead, New South Wales, 2145, Australia
- Ali Azimi
  - Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, 2145, Australia
  - Centre for Cancer Research, Westmead Institute for Medical Research, Westmead, New South Wales, 2145, Australia
- Marina Ali
  - Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, 2145, Australia
- Graham Mann
  - Centre for Cancer Research, Westmead Institute for Medical Research, Westmead, New South Wales, 2145, Australia
  - The John Curtin School of Medical Research, College of Health and Medicine, Australian National University, Canberra, ACT, 2601, Australia
- Pablo Fernández-Peñas
  - Westmead Clinical School, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, 2145, Australia
  - Centre for Cancer Research, Westmead Institute for Medical Research, Westmead, New South Wales, 2145, Australia
20
Cassidy L, Kaulich PT, Maaß S, Bartel J, Becher D, Tholey A. Bottom-up and top-down proteomic approaches for the identification, characterization, and quantification of the low molecular weight proteome with focus on short open reading frame-encoded peptides. Proteomics 2021; 21:e2100008. [PMID: 34145981 DOI: 10.1002/pmic.202100008] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 05/03/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 01/14/2023]
Abstract
The recent discovery of alternative open reading frames creates a need for suitable analytical approaches to verify their translation and to characterize the corresponding gene products at the molecular level. As the analysis of small proteins within a background proteome by means of classical bottom-up proteomics is challenging, method development for the analysis of small open reading frame encoded peptides (SEPs) has become a focal point for research. Here, we highlight bottom-up and top-down proteomics approaches established for the analysis of SEPs in both pro- and eukaryotes. Major steps of analysis, including sample preparation and (small) proteome isolation, separation and mass spectrometry, data interpretation and quality control, quantification, the analysis of post-translational modifications, and exploration of functional aspects of the SEPs by means of proteomics technologies are described. These methods do not exclusively cover the analytics of SEPs but simultaneously include the low molecular weight proteome, and moreover, can also be used for the proteome-wide analysis of proteolytic processing events.
Affiliation(s)
- Liam Cassidy
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Philipp T Kaulich
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Sandra Maaß
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Jürgen Bartel
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Dörte Becher
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Andreas Tholey
  - Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
21
Manda SS, Noor Z, Hains PG, Zhong Q. PIONEER: Pipeline for Generating High-Quality Spectral Libraries for DIA-MS Data. Curr Protoc 2021; 1:e69. [PMID: 33656278 DOI: 10.1002/cpz1.69] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023]
Abstract
Data-independent-acquisition mass spectrometry (DIA-MS) is a state-of-the-art proteomic technique for high-throughput identification and quantification of peptides and proteins. Interpretation of DIA-MS data relies on the use of a spectral library, which is optimally created from data acquired from the same samples in data-dependent acquisition (DDA) mode. As DIA-MS quantification relies on the spectral libraries, having a high-quality, non-redundant, and comprehensive spectral library is essential. This article describes the major steps for creating a high-quality spectral library using a combination of multiple complementary search engines. We discuss appropriate strategies to control the false discovery rate for the final spectral library as a result of merging multiple searches. © 2021 The Authors. Current Protocols © 2021 Wiley Periodicals LLC.
- Basic Protocol 1: Searching DDA-MS files with multiple search engines
- Basic Protocol 2: Merging results from multiple search engines
- Basic Protocol 3: Creating spectral libraries from merged results
- Alternate Protocol: Using CLI for automating tasks
- Support Protocol: Creating concatenated FASTA files
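The abstract above mentions controlling the false discovery rate but does not say how it is estimated. As a generic illustration only, and not the PIONEER procedure, the following minimal sketch applies the common target-decoy approach to a list of peptide-spectrum matches. The function name `score_cutoff_at_fdr` and the `(score, is_decoy)` tuple layout are hypothetical choices made for this example.

```python
def score_cutoff_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    psms is an iterable of (score, is_decoy) pairs (a hypothetical layout);
    higher scores are assumed to be better. FDR is estimated as
    decoys / targets among all matches at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted set while the estimate is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

When results from several search engines are merged, the accepted set grows and so can the number of false matches, which is one reason an estimate like this would typically be recomputed on the merged list rather than taken from each engine separately.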
Affiliation(s)
- Srikanth S Manda
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Zainab Noor
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Peter G Hains
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Qing Zhong
  - ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
22
Rozanova S, Barkovits K, Nikolov M, Schmidt C, Urlaub H, Marcus K. Quantitative Mass Spectrometry-Based Proteomics: An Overview. Methods Mol Biol 2021; 2228:85-116. [PMID: 33950486 DOI: 10.1007/978-1-0716-1024-4_8] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Indexed: 12/12/2022]
Abstract
In recent decades, mass spectrometry has moved more than ever before into the front line of protein-centered research. After being established at the qualitative level, the more challenging question of quantification of proteins and peptides using mass spectrometry has become a focus for further development. In this chapter, we discuss and review current strategies and problems of the methods for the quantitative analysis of peptides, proteins, and finally proteomes by mass spectrometry. The common themes, the differences, and the potential pitfalls of the main approaches are presented in order to provide a survey of the emerging field of quantitative, mass spectrometry-based proteomics.
Affiliation(s)
- Svitlana Rozanova
  - Medizinisches Proteom-Center, Medical Faculty, Ruhr-University Bochum, Bochum, Germany
  - Medical Proteome Analysis, Center for protein diagnostics (PRODI), Ruhr-University Bochum, Bochum, Germany
- Katalin Barkovits
  - Medizinisches Proteom-Center, Medical Faculty, Ruhr-University Bochum, Bochum, Germany
  - Medical Proteome Analysis, Center for protein diagnostics (PRODI), Ruhr-University Bochum, Bochum, Germany
- Miroslav Nikolov
  - Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Goettingen, Germany
- Carla Schmidt
  - Interdisciplinary Research Center HALOmem, Charles Tanford Protein Center, Institute for Biochemistry and Biotechnology, Martin Luther University Halle-Wittenberg, Halle, Germany
- Henning Urlaub
  - Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Goettingen, Germany
  - Bioanalytics Group, Institute of Clinical Chemistry, University Medical Center Goettingen, Goettingen, Germany
  - Hematology/Oncology, Department of Medicine II, Johann Wolfgang Goethe University, Frankfurt, Germany
- Katrin Marcus
  - Medizinisches Proteom-Center, Medical Faculty, Ruhr-University Bochum, Bochum, Germany
  - Medical Proteome Analysis, Center for protein diagnostics (PRODI), Ruhr-University Bochum, Bochum, Germany
23
Qin C, Luo X, Deng C, Shu K, Zhu W, Griss J, Hermjakob H, Bai M, Perez-Riverol Y. Deep learning embedder method and tool for mass spectra similarity search. J Proteomics 2020; 232:104070. [PMID: 33307250 DOI: 10.1016/j.jprot.2020.104070] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/29/2020] [Revised: 11/25/2020] [Accepted: 12/01/2020] [Indexed: 12/31/2022]
Abstract
Spectral similarity calculation is widely used in protein identification tools and mass spectra clustering algorithms when comparing theoretical or experimental spectra. The performance of the spectral similarity calculation plays an important role in these tools and algorithms, especially in the analysis of large-scale datasets. Recently, deep learning methods have been proposed to improve the performance of clustering algorithms and protein identification by training the algorithms with existing data and the use of multiple spectra and identified peptide features. While the efficiency of these algorithms is still under study in comparison with traditional approaches, their application in proteomics data analysis is becoming more common. Here, we propose the use of deep learning to improve spectral similarity comparison. We assessed the performance of deep learning for spectral similarity, with GLEAMS and a newly trained embedder model (DLEAMSE), which uses high-quality spectra from PRIDE Cluster. Also, we developed a new bioinformatics tool (mslookup - https://github.com/bigbio/DLEAMSE/) that allows users to quickly search for spectra in previously identified mass spectra published in public repositories and spectral libraries. Finally, we released a human database to enable bioinformaticians and biologists to search for identified spectra on their own machines.
SIGNIFICANCE STATEMENT: Spectral similarity calculation plays an important role in proteomics data analysis. With deep learning's ability to learn the implicit and effective features from large-scale training datasets, deep learning-based MS/MS spectra embedding models have emerged as a solution to improve mass spectral clustering similarity calculation algorithms. We compare multiple similarity scoring and deep learning methods in terms of accuracy (computing the similarity for a pair of mass spectra) and computing-time performance. The benchmark results showed no major differences in accuracy between DLEAMSE and normalized dot product (NDP) for spectrum similarity calculations. The DLEAMSE GPU implementation is faster than NDP in preprocessing on the GPU server, and the similarity calculation of DLEAMSE (Euclidean distance on 32-D vectors) takes about one third of the time of the dot product calculation. The deep learning model (DLEAMSE) encoding and embedding steps need to run only once for each spectrum, and the embedded 32-D points can be persisted in the repository, which speeds up later comparisons and large-scale analyses. Based on these results, we propose a new tool, mslookup, that enables the researcher to find spectra previously identified in public data. The tool can also be used to generate in-house databases of previously identified spectra to share with other laboratories and consortia.
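The two similarity measures named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DLEAMSE or mslookup code: the helper names, the 1.0 m/z bin width and the 2000 m/z upper limit are all hypothetical choices, and the embedding step that would produce the 32-dimensional vectors is not shown.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (mz, intensity) pairs; bin_width and max_mz are
    illustrative defaults, not values taken from the paper.
    """
    n_bins = int(max_mz / bin_width) + 1
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if 0.0 <= mz <= max_mz:
            vector[int(mz / bin_width)] += intensity
    return vector


def normalized_dot_product(a, b):
    """Cosine-style similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(u, v):
    """Distance between two embedding vectors of equal length (e.g. 32-D)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

The sketch makes the cost difference plausible: the normalized dot product runs over every bin of the binned spectrum (about 2000 here), whereas the Euclidean distance runs over a short fixed-length embedding that can be computed once per spectrum and stored.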
Affiliation(s)
- Chunyuan Qin
  - Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Xiyang Luo
  - Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Chuan Deng
  - Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Kunxian Shu
  - Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
- Weimin Zhu
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
- Johannes Griss
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Department of Dermatology, Medical University of Vienna, 1090 Vienna, Austria
- Henning Hermjakob
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mingze Bai
  - Chongqing Key Laboratory on Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing 102206, China
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
24
Weerakoon H, Potriquet J, Shah AK, Reed S, Jayakody B, Kapil C, Midha MK, Moritz RL, Lepletier A, Mulvenna J, Miles JJ, Hill MM. A primary human T-cell spectral library to facilitate large scale quantitative T-cell proteomics. Sci Data 2020; 7:412. [PMID: 33230158 PMCID: PMC7683684 DOI: 10.1038/s41597-020-00744-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/11/2020] [Accepted: 10/15/2020] [Indexed: 12/23/2022] Open
Abstract
Data independent analysis (DIA) exemplified by sequential window acquisition of all theoretical mass spectra (SWATH-MS) provides robust quantitative proteomics data, but the lack of a public primary human T-cell spectral library is a current resource gap. Here, we report the generation of a high-quality spectral library containing data for 4,833 distinct proteins from human T-cells across genetically unrelated donors, covering ~24% proteins of the UniProt/SwissProt reviewed human proteome. SWATH-MS analysis of 18 primary T-cell samples using the new human T-cell spectral library reliably identified and quantified 2,850 proteins at 1% false discovery rate (FDR). In comparison, the larger Pan-human spectral library identified and quantified 2,794 T-cell proteins in the same dataset. As the libraries identified an overlapping set of proteins, combining the two libraries resulted in quantification of 4,078 human T-cell proteins. Collectively, this large data archive will be a useful public resource for human T-cell proteomic studies. The human T-cell library is available at SWATHAtlas and the data are available via ProteomeXchange (PXD019446 and PXD019542) and PeptideAtlas (PASS01587).
Affiliation(s)
- Harshi Weerakoon
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
  - School of Biomedical Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia
  - Faculty of Medicine and Allied Sciences, Rajarata University of Sri Lanka, Saliyapura, 50000, Sri Lanka
- Jeremy Potriquet
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
  - SCIEX Australia Pty Ltd, Mt Waverley, VIC, 3149, Australia
- Alok K Shah
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
  - CSL Limited, 45 Poplar Rd, Parkville, VIC, 3052, Australia
- Sarah Reed
  - UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane, QLD, 4006, Australia
- Buddhika Jayakody
  - UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane, QLD, 4006, Australia
- Charu Kapil
  - Institute for Systems Biology, Seattle, WA, 98109, USA
- Mukul K Midha
  - Institute for Systems Biology, Seattle, WA, 98109, USA
- Ailin Lepletier
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
  - Institute for Glycomics, Griffith University, Gold Coast, QLD, 4222, Australia
- Jason Mulvenna
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
- John J Miles
  - Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, QLD, 4878, Australia
  - Centre for Molecular Therapeutics, James Cook University, Cairns, QLD, 4878, Australia
  - Centre for Tropical Bioinformatics and Molecular Biology, James Cook University, Cairns, QLD, 4878, Australia
- Michelle M Hill
  - QIMR Berghofer Medical Research Institute, Herston, Brisbane, QLD, 4006, Australia
  - UQ Centre for Clinical Research, Faculty of Medicine, The University of Queensland, Brisbane, QLD, 4006, Australia
25
Liu R, Wei P, Keller C, Orefice NS, Shi Y, Li Z, Huang J, Cui Y, Frost DC, Han S, Cross TWL, Rey FE, Li L. Integrated Label-Free and 10-Plex DiLeu Isobaric Tag Quantitative Methods for Profiling Changes in the Mouse Hypothalamic Neuropeptidome and Proteome: Assessment of the Impact of the Gut Microbiome. Anal Chem 2020; 92:14021-14030. [PMID: 32926775 DOI: 10.1021/acs.analchem.0c02939] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/13/2023]
Abstract
Gut microbiota can regulate host physiological and pathological status through gut-brain communications or pathways. However, the impact of the gut microbiome on neuropeptides and proteins involved in regulating brain functions and behaviors is still not clearly understood. To address the problem, integrated label-free and 10-plex DiLeu isobaric tag-based quantitative methods were implemented to compare the profiles of neuropeptides and proteins in the hypothalamus of germ-free (GF) vs conventionally raised (ConvR) mice. A total of 2943 endogenous peptides from 63 neuropeptide precursors and 3971 proteins in the mouse hypothalamus were identified. Among the 368 significantly changed peptides (fold changes over 1.5 and a p-value of <0.05), 73.6% showed higher levels in GF mice than in ConvR mice, and 26.4% had higher levels in ConvR mice than in GF mice. These peptides were mainly from secretogranin-2, phosphatidylethanolamine-binding protein-1, ProSAAS, and proenkephalin-A. A quantitative proteomic analysis employing DiLeu isobaric tags revealed that 282 proteins were significantly up- or down-regulated (fold changes over 1.2 and a p-value of <0.05) among the 3277 quantified proteins. These neuropeptides and proteins were mainly involved in regulating behaviors, transmitter release, signaling pathways, and synapses. Interestingly, pathways including long-term potentiation, long-term depression, and circadian entrainment were involved. In the present study, a combined label-free and 10-plex DiLeu-based quantitative method enabled a comprehensive profiling of gut microbiome-induced dynamic changes of neuropeptides and proteins in the hypothalamus, suggesting that the gut microbiome might mediate a range of behavioral changes, brain development, and learning and memory through these neuropeptides and proteins.
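The significance criteria quoted in this abstract (a fold change over 1.5 for peptides, or 1.2 for proteins, together with a p-value below 0.05) can be sketched as a simple filter. This is only an illustration: the record layout, the function name, and the choice to treat a fold change as significant in either direction are assumptions, and how the p-values were computed is not stated in the abstract.

```python
def significant_changes(records, min_fold=1.5, max_p=0.05):
    """Select analytes that pass a two-sided fold-change cutoff and a p-value cutoff.

    records: iterable of (name, mean_gf, mean_convr, p_value) tuples (hypothetical layout).
    """
    hits = []
    for name, mean_gf, mean_convr, p_value in records:
        if mean_gf <= 0 or mean_convr <= 0:
            continue  # ratio is undefined; such records are skipped in this sketch
        ratio = mean_gf / mean_convr
        changed = ratio > min_fold or ratio < 1.0 / min_fold
        if changed and p_value < max_p:
            direction = "higher in GF" if ratio > 1 else "higher in ConvR"
            hits.append((name, round(ratio, 3), direction))
    return hits


# Hypothetical intensities, not data from the study.
example = [
    ("peptide_a", 300.0, 100.0, 0.01),   # 3-fold higher in GF, passes both cutoffs
    ("peptide_b", 100.0, 250.0, 0.03),   # 2.5-fold higher in ConvR, passes both cutoffs
    ("peptide_c", 120.0, 100.0, 0.001),  # fold change too small
    ("peptide_d", 400.0, 100.0, 0.20),   # p-value too high
]
selected = significant_changes(example)
```

Calling the same function with `min_fold=1.2` would correspond to the protein-level threshold mentioned in the abstract.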
Affiliation(s)
- Rui Liu
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, P. R. China
  - Jiangsu Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, and National and Local Collaborative Engineering Center of Chinese Medicinal Resources Industrialization and Formulae Innovative Medicine, Nanjing 210023, P. R. China
  - Jiangsu Key Laboratory of Research and Development in Marine Bio-resource Pharmaceutics, Nanjing University of Chinese Medicine, Nanjing 210023, P. R. China
- Pingli Wei
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Caitlin Keller
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Nicola Salvatore Orefice
  - Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Yatao Shi
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Zihui Li
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Junfeng Huang
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Yusi Cui
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Dustin C Frost
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Shuying Han
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - School of Pharmacy, Nanjing University of Chinese Medicine, Nanjing 210023, P. R. China
  - Jiangsu Collaborative Innovation Center of Chinese Medicinal Resources Industrialization, and National and Local Collaborative Engineering Center of Chinese Medicinal Resources Industrialization and Formulae Innovative Medicine, Nanjing 210023, P. R. China
- Tzu-Wen L Cross
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - Cardiovascular Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Federico E Rey
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
- Lingjun Li
  - School of Pharmacy, University of Wisconsin-Madison, Madison, Wisconsin 53705, United States
  - Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
26
Hanaichi S, Fujihara A. Identification and quantification of leucine and isoleucine residues in peptides using photoexcited tryptophan. Amino Acids 2020; 52:1107-1113. [PMID: 32710184 DOI: 10.1007/s00726-020-02875-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/17/2020] [Accepted: 07/19/2020] [Indexed: 10/23/2022]
27
Ramachandran S, Thomas T. A Frequency-Based Approach to Predict the Low-Energy Collision-Induced Dissociation Fragmentation Spectra. ACS Omega 2020; 5:12615-12622. [PMID: 32548445 PMCID: PMC7288360 DOI: 10.1021/acsomega.9b03935] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/18/2019] [Accepted: 05/12/2020] [Indexed: 06/11/2023]
Abstract
Peptide identification algorithms rely on the comparison between the experimental tandem mass spectrometry spectrum and the theoretical spectrum to identify a peptide from the tandem mass spectra. Hence, it is important to understand the fragmentation process and predict the tandem mass spectra for high-throughput proteomics research. In this study, a novel method was developed to predict the theoretical ion trap collision-induced dissociation (CID) tandem mass spectra of singly, doubly, and triply charged tryptic peptides. The fragmentation statistics of the ion trap CID spectra were used to predict the theoretical tandem mass spectra of the peptide sequence. The study estimated the relative cleavage frequency for each pair of adjacent amino acids along the peptide length. The study showed that the cleavage frequency can be directly used to predict the tandem mass spectra. The predicted spectra show a high correlation with the experimental spectra used in this study; 99.73% of the high-quality reference spectra have correlation scores greater than 0.8. The spectra predicted by the new method correlate significantly better with the experimental spectra than those of the existing spectrum prediction tools OpenMS_Simulator, MS2PIP, and MS2PBPI, for which only 80, 85.76, and 85.80% of the spectral count, respectively, have a correlation score greater than 0.8.
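The idea of estimating a relative cleavage frequency for each pair of adjacent amino acids can be sketched as a counting exercise. The sketch below is not the authors' method: the input format, the function names, and the use of a raw observed/seen ratio as the bond score are all assumptions, and the step that converts bond scores into fragment ion intensities is omitted.

```python
from collections import Counter


def pair_cleavage_frequencies(observations):
    """Estimate how often the bond between each adjacent residue pair is cleaved.

    observations: iterable of (peptide, cleaved_bonds), where cleaved_bonds is a
    set of 0-based bond indices and bond i lies between residues i and i + 1.
    """
    seen = Counter()
    cleaved = Counter()
    for peptide, bonds in observations:
        for i in range(len(peptide) - 1):
            pair = peptide[i:i + 2]
            seen[pair] += 1
            if i in bonds:
                cleaved[pair] += 1
    return {pair: cleaved[pair] / seen[pair] for pair in seen}


def predict_bond_scores(peptide, freqs, default=0.0):
    """Look up the learned frequency for every bond of a new peptide."""
    return [freqs.get(peptide[i:i + 2], default) for i in range(len(peptide) - 1)]


# Hypothetical training observations, not data from the study.
training = [("PEPK", {0, 2}), ("PEK", {0})]
freqs = pair_cleavage_frequencies(training)
scores = predict_bond_scores("PEK", freqs)
```

In this toy data the P-E bond is cleaved in both training peptides, so it receives the highest score for the new peptide.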
28
Hentschker C, Maaß S, Junker S, Hecker M, Hammerschmidt S, Otto A, Becher D. Comprehensive Spectral Library from the Pathogenic Bacterium Streptococcus pneumoniae with Focus on Phosphoproteins. J Proteome Res 2020; 19:1435-1446. [DOI: 10.1021/acs.jproteome.9b00615] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Affiliation(s)
- Christian Hentschker
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Sandra Maaß
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Sabryna Junker
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Michael Hecker
  - Department of Microbial Physiology and Molecular Biology, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Sven Hammerschmidt
  - Department of Molecular Genetics and Infection Biology, Interfaculty Institute for Genetics and Functional Genomics, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Andreas Otto
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
- Dörte Becher
  - Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Felix-Hausdorff-Str. 8, 17489 Greifswald, Germany
29
Barkovits K, Pacharra S, Pfeiffer K, Steinbach S, Eisenacher M, Marcus K, Uszkoreit J. Reproducibility, Specificity and Accuracy of Relative Quantification Using Spectral Library-based Data-independent Acquisition. Mol Cell Proteomics 2020; 19:181-197. [PMID: 31699904 PMCID: PMC6944235 DOI: 10.1074/mcp.ra119.001714] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 08/09/2019] [Revised: 10/17/2019] [Indexed: 12/14/2022] Open
Abstract
Currently, data-dependent acquisition (DDA) is the method of choice for mass spectrometry-based proteomics discovery experiments, but data-independent acquisition (DIA) is steadily becoming more important. One of the most important requirements for performing a DIA analysis is the availability of suitable spectral libraries for peptide identification and quantification. Several studies have evaluated spectral library performance for protein identification in DIA measurements, but so far only a few experiments have estimated the effect of these libraries on the quantitative level. In this work we created a gold standard spike-in sample set with known contents and ratios of proteins in a complex protein matrix that allowed a detailed comparison of DIA quantification data obtained with different spectral library approaches. We used in-house generated sample-specific spectral libraries created using varying sample preparation approaches and repeated DDA measurement. In addition, two different search engines were tested for protein identification from DDA data and subsequent library generation. In total, eight different spectral libraries were generated, and the quantification results were compared with a library-free method, as well as a default DDA analysis. We inspected not only the number of identifications at the peptide and protein level in the spectral libraries and the corresponding DIA analysis results, but also the number of expected and identified differentially abundant protein groups and their ratios. We found that while libraries of prefractionated samples were generally larger, there was no significant increase in DIA identifications compared with repetitive non-fractionated measurements. Furthermore, we show that the accuracy of the quantification is strongly dependent on the applied spectral library and on whether the quantification is based on the peptide or protein level. Overall, the reproducibility and accuracy of DIA quantification is superior to DDA in all applied approaches. Data have been deposited to the ProteomeXchange repository with identifiers PXD012986, PXD012987, PXD012988 and PXD014956.
Affiliation(s)
- Katalin Barkovits
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Sandra Pacharra
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Kathy Pfeiffer
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Simone Steinbach
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Martin Eisenacher
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Katrin Marcus
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
- Julian Uszkoreit
  - Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany
30
Shao W, Caron E, Pedrioli P, Aebersold R. The SysteMHC Atlas: a Computational Pipeline, a Website, and a Data Repository for Immunopeptidomic Analyses. Methods Mol Biol 2020; 2120:173-181. [PMID: 32124319 DOI: 10.1007/978-1-0716-0327-7_12] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Mass spectrometry has emerged as the method of choice for the exploration of the immunopeptidome. Insights from the immunopeptidome promise novel cancer therapeutic approaches and a better understanding of the basic mechanisms of our immune system. To meet the computational demands from the steady gain in popularity and reach of mass spectrometry-based immunopeptidomics analysis, we created the SysteMHC Atlas project, a first-of-its-kind computational pipeline and resource repository dedicated to standardizing data analysis and public dissemination of immunopeptidomic datasets.
Affiliation(s)
- Wenguang Shao
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Etienne Caron
  - CHU Sainte-Justine Research Center, Montreal, QC, Canada
  - Department of Pathology and Cellular Biology, Faculty of Medicine, Université de Montréal, Montreal, QC, Canada
- Patrick Pedrioli
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
  - Faculty of Science, University of Zurich, Zurich, Switzerland
31
Pino L, Lin A, Bittremieux W. 2018 YPIC Challenge: A Case Study in Characterizing an Unknown Protein Sample. J Proteome Res 2019; 18:3936-3943. [PMID: 31556620 PMCID: PMC6824964 DOI: 10.1021/acs.jproteome.9b00384] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
For the 2018 YPIC Challenge, contestants were invited to try to decipher two unknown English questions encoded by a synthetic protein expressed in Escherichia coli. In addition to deciphering the sentence, contestants were asked to determine the three-dimensional structure and detect any post-translational modifications left by the host organism. We present our experimental and computational strategy to characterize this sample by identifying the unknown protein sequence and detecting the presence of post-translational modifications. The sample was acquired with dynamic exclusion disabled to increase the signal-to-noise ratio of the measured molecules, after which spectral clustering was used to generate high-quality consensus spectra. De novo spectrum identification was used to determine the synthetic protein sequence, and any post-translational modifications introduced by E. coli on the synthetic protein were analyzed via spectral networking. This workflow resulted in a de novo sequence coverage of 70%, on par with sequence database searching performance. Additionally, the spectral networking analysis indicated that no systematic modifications were introduced on the synthetic protein by E. coli. The strategy presented here can be directly used to analyze samples for which no protein sequence information is available or when the identity of the sample is unknown. All software and code to perform the bioinformatics analysis are available as open source, and self-contained Jupyter notebooks are provided to fully recreate the analysis.
Affiliation(s)
- Lindsay Pino
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
- Andy Lin
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
- Wout Bittremieux
  - Department of Genome Sciences, University of Washington, Seattle WA 98195, USA
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
32
Paulines MJ, Wetzel C, Limbach PA. Using spectral matching to interpret LC-MS/MS data during RNA modification mapping. Journal of Mass Spectrometry: JMS 2019; 54:906-914. [PMID: 31663233 DOI: 10.1002/jms.4456] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/17/2019] [Revised: 09/24/2019] [Accepted: 10/07/2019] [Indexed: 05/09/2023]
Abstract
While a number of approaches have been developed to analyze liquid chromatography tandem mass spectrometry (LC-MS/MS) data obtained from modified oligonucleotides, the majority of these methods require analyzing every MS/MS spectrum de novo to sequence the oligonucleotide and place the modification. Spectral matching is an alternative approach for analyzing MS/MS data that is based on creating a library of annotated MS/MS spectra against which individual MS/MS data can be searched. Here, we have adapted the existing NIST spectral matching software to enable its use in the interpretation of MS/MS data obtained from modified oligonucleotides. In particular, we demonstrate the utility of this approach to identify specific post-transcriptionally modified nucleosides in particular transfer RNAs (tRNAs) obtained through a conventional RNA modification mapping experimental protocol. Spectral matching was found to be an efficient approach for screening for known modified tRNAs by using the experimental data as the library and previously annotated RNase T1 digestion products of tRNAs as the reference spectra. The utility of spectral matching for rapid analysis of multiple LC-MS/MS analyses was demonstrated by screening mutant strains of Streptococcus mutans to identify the enzyme(s) responsible for synthesizing the tRNA position 37 modification threonylcarbamoyladenosine (t6A).
Affiliation(s)
- Mellie June Paulines
  - Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, PO Box 210172, Cincinnati, Ohio, 45221-0172, USA
- Collin Wetzel
  - Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, PO Box 210172, Cincinnati, Ohio, 45221-0172, USA
  - Department of Cancer Biology, University of Cincinnati, PO Box 2100521, Cincinnati, Ohio, 45221-0521, USA
- Patrick A Limbach
  - Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati, PO Box 210172, Cincinnati, Ohio, 45221-0172, USA
33
O’Bryon I, Tucker AE, Kaiser BLD, Wahl KL, Merkley ED. Constructing a Tandem Mass Spectral Library for Forensic Ricin Identification. J Proteome Res 2019; 18:3926-3935. [DOI: 10.1021/acs.jproteome.9b00377] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Affiliation(s)
- Isabelle O’Bryon
  - Chemical and Biological Signature Sciences Group, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Abigail E. Tucker
  - Chemical and Biological Signature Sciences Group, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Brooke L. D. Kaiser
  - Chemical and Biological Signature Sciences Group, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Karen L. Wahl
  - Chemical and Biological Signature Sciences Group, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Eric D. Merkley
  - Chemical and Biological Signature Sciences Group, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
34
Buchowiecka AK. Modified cysteine S-phosphopeptide standards for mass spectrometry-based proteomics. Amino Acids 2019; 51:1365-1375. [DOI: 10.1007/s00726-019-02773-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/03/2018] [Accepted: 08/18/2019] [Indexed: 02/06/2023]
35
Bittremieux W, Meysman P, Noble WS, Laukens K. Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing. J Proteome Res 2018; 17:3463-3474. [PMID: 30184435 PMCID: PMC6173621 DOI: 10.1021/acs.jproteome.8b00359] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/21/2022]
Abstract
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. We present the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate, as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared with SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo .
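The general idea behind a shifted dot product can be sketched as follows: a query peak may match a library peak either directly or after accounting for the precursor mass difference that a modification introduces. This is a simplified illustration and not the ANN-SoLo implementation; the function name, the greedy peak assignment, the fixed fragment tolerance, and the assumption of singly charged fragments are all choices made here for brevity.

```python
import math


def shifted_dot_product(query, library, mass_shift, tol=0.02):
    """Score two peak lists, allowing peaks to match directly or offset by mass_shift.

    query, library: lists of (mz, intensity) tuples.
    mass_shift: query precursor mass minus library precursor mass (singly charged
    fragments are assumed in this sketch).
    """
    def unit_norm(peaks):
        norm = math.sqrt(sum(i * i for _, i in peaks))
        return [(mz, i / norm) for mz, i in peaks] if norm > 0 else []

    q, lib = unit_norm(query), unit_norm(library)
    candidates = []
    for qi, (q_mz, q_int) in enumerate(q):
        for li, (l_mz, l_int) in enumerate(lib):
            delta = q_mz - l_mz
            if abs(delta) <= tol or abs(delta - mass_shift) <= tol:
                candidates.append((q_int * l_int, qi, li))
    # Greedy assignment: take the strongest pairs first, using each peak once.
    candidates.sort(reverse=True)
    used_q, used_l, score = set(), set(), 0.0
    for product, qi, li in candidates:
        if qi not in used_q and li not in used_l:
            score += product
            used_q.add(qi)
            used_l.add(li)
    return score


# Hypothetical spectra: the second query peak is offset by a +16 Da modification.
library_peaks = [(100.0, 1.0), (200.0, 1.0)]
query_peaks = [(100.0, 1.0), (216.0, 1.0)]
open_score = shifted_dot_product(query_peaks, library_peaks, mass_shift=16.0)
closed_score = shifted_dot_product(query_peaks, library_peaks, mass_shift=0.0)
```

With the shift taken into account both peaks contribute to the score, whereas a standard dot product (a shift of zero) credits only the unshifted peak.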
Affiliation(s)
- Wout Bittremieux
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pieter Meysman
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
- Kris Laukens
  - Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
  - Biomedical Informatics Network Antwerpen (biomina), 2020 Antwerp, Belgium
36
De Vijlder T, Valkenborg D, Lemière F, Romijn EP, Laukens K, Cuyckens F. A tutorial in small molecule identification via electrospray ionization-mass spectrometry: The practical art of structural elucidation. Mass Spectrometry Reviews 2018; 37:607-629. [PMID: 29120505 PMCID: PMC6099382 DOI: 10.1002/mas.21551] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 10/03/2017] [Accepted: 10/03/2017] [Indexed: 05/10/2023]
Abstract
The identification of unknown molecules has been one of the cornerstone applications of mass spectrometry for decades. This tutorial reviews the basics of the interpretation of electrospray ionization-based MS and MS/MS spectra in order to identify small-molecule analytes (typically below 2000 Da). Most of what is discussed in this tutorial also applies to other atmospheric pressure ionization methods like atmospheric pressure chemical/photoionization. We focus primarily on the fundamental steps of MS-based structural elucidation of individual unknown compounds (eg, contaminants in chemical production, pharmacological alteration of drugs), rather than describing strategies for large-scale identification in complex samples. We critically discuss topics like the detection of protonated and deprotonated ions ([M + H]+ and [M - H]-) as well as other adduct ions and the determination of the molecular formula, and we provide some basic rules on the interpretation of product ion spectra. This tutorial also discusses strategies to obtain useful orthogonal information (UV/Vis, H/D exchange, chemical derivatization, etc) and offers an overview of the different informatics tools and approaches that can be used for structural elucidation of small molecules. It is primarily intended for beginning mass spectrometrists and for researchers from other mass spectrometry sub-disciplines who want to get acquainted with structural elucidation or are interested in some practical tips and tricks.
Affiliation(s)
- Thomas De Vijlder
  - Pharmaceutical Development & Manufacturing Sciences (PDMS), Janssen Research & Development, Beerse, Belgium
- Dirk Valkenborg
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium
  - Center for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
  - Flemish Institute for Technological Research (VITO), Mol, Belgium
- Filip Lemière
  - Center for Proteomics (CFP), University of Antwerp, Antwerp, Belgium
  - Department of Chemistry, Biomolecular and Analytical Mass Spectrometry, University of Antwerp, Antwerp, Belgium
- Edwin P. Romijn
  - Pharmaceutical Development & Manufacturing Sciences (PDMS), Janssen Research & Development, Beerse, Belgium
- Kris Laukens
  - Department of Mathematics and Computer Science, Advanced Database Research and Modelling (ADReM), University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Network Antwerp (Biomina), University of Antwerp, Antwerp, Belgium
- Filip Cuyckens
  - Pharmacokinetics, Dynamics & Metabolism, Janssen Research & Development, Beerse, Belgium
37
Assembling the Community-Scale Discoverable Human Proteome. Cell Syst 2018; 7:412-421.e5. [PMID: 30172843 PMCID: PMC6279426 DOI: 10.1016/j.cels.2018.08.004] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Received: 09/14/2017] [Revised: 12/22/2017] [Accepted: 08/03/2018] [Indexed: 01/15/2023]
Abstract
The increasing throughput and sharing of proteomics mass spectrometry data have now yielded over one-third of a million public mass spectrometry runs. However, these discoveries are not continuously aggregated in an open and error-controlled manner, which limits their utility. To facilitate the reusability of these data, we built the MassIVE Knowledge Base (MassIVE-KB), a community-wide, continuously updating knowledge base that aggregates proteomics mass spectrometry discoveries into an open reusable format with full provenance information for community scrutiny. Reusing >31 TB of public human data stored in a mass spectrometry interactive virtual environment (MassIVE), the MassIVE-KB contains >2.1 million precursors from 19,610 proteins (48% larger than before; 97% of the total) and doubles proteome coverage to 6 million amino acids (54% of the proteome) with strict library-scale false discovery controls, thereby providing evidence for 430 proteins for which sufficient protein-level evidence was previously missing. Furthermore, MassIVE-KB can inform experimental design, help identify and quantify new data, and provide tools for community construction of specialized spectral libraries. Wang et al. introduce MassIVE-KB, a program designed to distill the entire community’s mass spectrometry data into reusable spectral library resources. As a result, the statistically significant discovery of a peptide or protein in a single researcher’s data is made available to the whole community to support its identification (in shotgun experiments) or quantitative detection (in targeted experiments) in all future analyses.
|
38
|
Oki N, Fujihara A. Molecular recognition and quantitative analysis of leucine and isoleucine using photodissociation of cold gas-phase noncovalent complexes. J Mass Spectrom 2018; 53:595-597. [PMID: 29722139 DOI: 10.1002/jms.4196] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/14/2018] [Revised: 03/24/2018] [Accepted: 04/21/2018] [Indexed: 06/08/2023]
Affiliation(s)
- Narumi Oki
- Department of Chemistry, Osaka Prefecture University, Osaka, 599-8531, Japan
| | - Akimasa Fujihara
- Department of Chemistry, Osaka Prefecture University, Osaka, 599-8531, Japan
| |
|
39
|
Kind T, Tsugawa H, Cajka T, Ma Y, Lai Z, Mehta SS, Wohlgemuth G, Barupal DK, Showalter MR, Arita M, Fiehn O. Identification of small molecules using accurate mass MS/MS search. Mass Spectrom Rev 2018; 37:513-532. [PMID: 28436590 PMCID: PMC8106966 DOI: 10.1002/mas.21535] [Citation(s) in RCA: 249] [Impact Index Per Article: 41.5] [Received: 10/27/2016] [Revised: 03/17/2017] [Accepted: 03/18/2017] [Indexed: 05/03/2023]
Abstract
Tandem mass spectral library search (MS/MS) is the fastest way to correctly annotate MS/MS spectra from screening small molecules in fields such as environmental analysis, drug screening, lipid analysis, and metabolomics. The confidence in MS/MS-based annotation of chemical structures is impacted by instrumental settings and requirements, data acquisition modes including data-dependent and data-independent methods, library scoring algorithms, as well as post-curation steps. We critically discuss parameters that influence search results, such as mass accuracy, precursor ion isolation width, intensity thresholds, centroiding algorithms, and acquisition speed. A range of publicly and commercially available MS/MS databases such as NIST, MassBank, MoNA, LipidBlast, Wiley MSforID, and METLIN are surveyed. In addition, software tools including NIST MS Search, MS-DIAL, Mass Frontier, SmileMS, Mass++, and XCMS2 to perform fast MS/MS search are discussed. MS/MS scoring algorithms and challenges during compound annotation are reviewed. Advanced methods such as the in silico generation of tandem mass spectra using quantum chemistry and machine learning methods are covered. Community efforts for curation and sharing of tandem mass spectra that will allow for faster distribution of scientific discoveries are discussed.
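To make the idea of an MS/MS library score concrete, the following is a generic cosine-similarity sketch between a query spectrum and a library spectrum. It is not the scoring used by NIST MS Search, MS-DIAL, or any other tool named in the abstract; the function name `cosine_score`, the greedy peak pairing, and the default tolerance of 0.01 m/z are assumptions chosen for illustration.

```python
import math

# Minimal sketch (illustrative only): cosine similarity between two
# centroided spectra given as lists of (mz, intensity) tuples.


def cosine_score(query, library, tol=0.01):
    """Pair each query peak with the closest unused library peak within
    `tol` m/z units, then return the cosine of the paired intensities."""
    used = set()   # indices of library peaks already paired
    dot = 0.0
    for mz_q, int_q in query:
        best = None
        for j, (mz_l, _) in enumerate(library):
            if j in used or abs(mz_q - mz_l) > tol:
                continue
            if best is None or abs(mz_q - mz_l) < abs(mz_q - library[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_q * library[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_l = math.sqrt(sum(i * i for _, i in library))
    if norm_q == 0 or norm_l == 0:
        return 0.0   # an empty or all-zero spectrum cannot match
    return dot / (norm_q * norm_l)
```

A score near 1 indicates closely matching fragment patterns and a score of 0 indicates no shared peaks within the tolerance. Real search tools typically add intensity weighting, precursor filtering, and more careful peak assignment on top of this.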
Affiliation(s)
- Tobias Kind
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Tomas Cajka
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Yan Ma
- National Institute of Biological Sciences, Beijing, People’s Republic of China
| | - Zijuan Lai
- Genome Center, Metabolomics, UC Davis, Davis, California
| | - Masanori Arita
- RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
| | - Oliver Fiehn
- Genome Center, Metabolomics, UC Davis, Davis, California
- Faculty of Sciences, Department of Biochemistry, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
40
|
Barsnes H, Vaudel M. SearchGUI: A Highly Adaptable Common Interface for Proteomics Search and de Novo Engines. J Proteome Res 2018; 17:2552-2555. [PMID: 29774740 DOI: 10.1021/acs.jproteome.8b00175] [Citation(s) in RCA: 117] [Impact Index Per Article: 19.5] [Indexed: 12/24/2022]
Abstract
Mass-spectrometry-based proteomics has become the standard approach for identifying and quantifying proteins. A vital step consists of analyzing experimentally generated mass spectra to identify the underlying peptide sequences for later mapping to the originating proteins. We here present the latest developments in SearchGUI, a common open-source interface for the most frequently used freely available proteomics search and de novo engines that has evolved into a central component in numerous bioinformatics workflows.
Affiliation(s)
| | - Marc Vaudel
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, 5021 Bergen, Norway
| |
|
41
|
Zhang Z, Burke M, Mirokhin YA, Tchekhovskoi DV, Markey SP, Yu W, Chaerkady R, Hess S, Stein SE. Reverse and Random Decoy Methods for False Discovery Rate Estimation in High Mass Accuracy Peptide Spectral Library Searches. J Proteome Res 2018; 17:846-857. [DOI: 10.1021/acs.jproteome.7b00614] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 01/01/2023]
Affiliation(s)
- Zheng Zhang
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Meghan Burke
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Yuri A. Mirokhin
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Dmitrii V. Tchekhovskoi
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Sanford P. Markey
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| | - Wen Yu
- Research Bioinformatics, MedImmune LLC, One MedImmune Way, Gaithersburg, Maryland 20878, United States
| | - Raghothama Chaerkady
- Antibody Discovery and Protein Engineering, Protein Sciences, MedImmune LLC, One MedImmune Way, Gaithersburg, Maryland 20878, United States
| | - Sonja Hess
- Antibody Discovery and Protein Engineering, Protein Sciences, MedImmune LLC, One MedImmune Way, Gaithersburg, Maryland 20878, United States
| | - Stephen E. Stein
- Mass Spectrometry Data Center, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, United States
| |
|