1
Alves G, Ogurtsov AY, Porterfield H, Maity T, Jenkins LM, Sacks DB, Yu YK. Multiplexing the Identification of Microorganisms via Tandem Mass Tag Labeling Augmented by Interference Removal through a Novel Modification of the Expectation Maximization Algorithm. J Am Soc Mass Spectrom 2024; 35:1138-1155. [PMID: 38740383 PMCID: PMC11157548 DOI: 10.1021/jasms.3c00445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 04/12/2024] [Accepted: 04/17/2024] [Indexed: 05/16/2024]
Abstract
Having fast, accurate, and broad spectrum methods for the identification of microorganisms is of paramount importance to public health, research, and safety. Bottom-up mass spectrometer-based proteomics has emerged as an effective tool for the accurate identification of microorganisms from microbial isolates. However, one major hurdle that limits the deployment of this tool for routine clinical diagnosis, and other areas of research such as culturomics, is the instrument time required for the mass spectrometer to analyze a single sample, which can take ∼1 h per sample, when using mass spectrometers that are presently used in most institutes. To address this issue, in this study, we employed, for the first time, tandem mass tags (TMTs) in multiplex identifications of microorganisms from multiple TMT-labeled samples in one MS/MS experiment. A difficulty encountered when using TMT labeling is the presence of interference in the measured intensities of TMT reporter ions. To correct for interference, we employed in the proposed method a modified version of the expectation maximization (EM) algorithm that redistributes the signal from ion interference back to the correct TMT-labeled samples. We have evaluated the sensitivity and specificity of the proposed method using 94 MS/MS experiments (covering a broad range of protein concentration ratios across TMT-labeled channels and experimental parameters), containing a total of 1931 true positive TMT-labeled channels and 317 true negative TMT-labeled channels. The results of the evaluation show that the proposed method has an identification sensitivity of 93-97% and a specificity of 100% at the species level. 
Furthermore, as a proof of concept, using an in-house-generated data set composed of some of the most common urinary tract pathogens, we demonstrated that by using the proposed method the mass spectrometer time required per sample, using a 1 h LC-MS/MS run, can be reduced to 10 and 6 min when samples are labeled with TMT-6 and TMT-10, respectively. The proposed method can also be used along with Orbitrap mass spectrometers that have faster MS/MS acquisition rates, like the recently released Orbitrap Astral mass spectrometer, to further reduce the mass spectrometer time required per sample.
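The per-sample figures quoted above follow from dividing one LC-MS/MS run among the multiplexed channels. A minimal sketch of that arithmetic, assuming each TMT channel carries exactly one sample and ignoring labeling and sample-preparation time (the function name is illustrative and does not come from the paper):

```python
def minutes_per_sample(run_minutes: float, channels: int) -> float:
    """Instrument time attributed to each sample when `channels` samples share one run."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    return run_minutes / channels


# One 60 min LC-MS/MS run shared by TMT-6 or TMT-10 labeled samples.
tmt6_minutes = minutes_per_sample(60, 6)    # 10.0
tmt10_minutes = minutes_per_sample(60, 10)  # 6.0
```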
Affiliation(s)
- Gelio Alves
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Aleksey Y. Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Harry Porterfield
  - Department of Laboratory Medicine, Clinical Center, National Institutes of Health, Bethesda, Maryland 20892, United States
- Tapan Maity
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Lisa M. Jenkins
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- David B. Sacks
  - Department of Laboratory Medicine, Clinical Center, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
2
Valletta M, Campolattano N, De Chiara I, Marasco R, Singh VP, Muscariello L, Pedone PV, Chambery A, Russo R. A robust nanoLC high-resolution mass spectrometry methodology for the comprehensive profiling of lactic acid bacteria in milk kefir. Food Res Int 2023; 173:113298. [PMID: 37803610 DOI: 10.1016/j.foodres.2023.113298] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/05/2023] [Revised: 07/14/2023] [Accepted: 07/20/2023] [Indexed: 10/08/2023]
Abstract
Consumer attention to functional foods containing probiotics is growing because of their positive effects on human health. Kefir is a fermented milk beverage produced by bacteria and yeasts. Given the emerging kefir market, there is an increasing demand for new methodologies to certify product claims such as colony-forming units/g and bacterial taxa. MALDI-TOF MS proved to be useful for the detection/identification of bacteria in clinical diagnostics and agri-food applications. Recently, LC-MS/MS approaches have also been applied to the identification of proteins and proteotypic peptides of lactic acid bacteria in fermented food matrices. Here, we developed an innovative nanoLC-ESI-MS/MS-based methodology for profiling lactic acid bacteria in commercial and artisanal milk kefir products as well as in kefir grains at the genus, species and subspecies level. The proposed workflow enables the authentication of kefir label claims declaring the presence of probiotic starters. An overview of the composition of lactic acid bacteria was also obtained for unlabelled kefir highlighting, for the first time, the great potential of LC-MS/MS as a sensitive tool to assess the authenticity of fermented foods.
Affiliation(s)
- Mariangela Valletta
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Nicoletta Campolattano
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Ida De Chiara
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Rosangela Marasco
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Vikram Pratap Singh
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Lidia Muscariello
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Paolo Vincenzo Pedone
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Angela Chambery
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
- Rosita Russo
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy
3
Svetličić E, Dončević L, Ozdanovac L, Janeš A, Tustonić T, Štajduhar A, Brkić AL, Čeprnja M, Cindrić M. Direct Identification of Urinary Tract Pathogens by MALDI-TOF/TOF Analysis and De Novo Peptide Sequencing. Molecules 2022; 27:5461. [PMID: 36080229 PMCID: PMC9457756 DOI: 10.3390/molecules27175461] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/31/2022] [Revised: 08/19/2022] [Accepted: 08/19/2022] [Indexed: 11/16/2022]
Abstract
For mass spectrometry-based diagnostics of microorganisms, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used to identify urinary tract pathogens. However, it requires a lengthy culture step for accurate pathogen identification, and is limited by a relatively small number of available species in peptide spectral libraries (≤3329). Here, we propose a method for pathogen identification that overcomes the above limitations, and utilizes the MALDI-TOF/TOF MS instrument. Tandem mass spectra of the analyzed peptides were obtained by chemically activated fragmentation, which allowed mass spectrometry analysis in negative and positive ion modes. Peptide sequences were elucidated de novo, and aligned with the non-redundant National Center for Biotechnology Information Reference Sequence Database (NCBInr). For data analysis, we developed a custom program package that predicted peptide sequences from the negative and positive MS/MS spectra. The main advantage of this method over a conventional MALDI-TOF MS peptide analysis is identification in less than 24 h without a cultivation step. Compared to the limited identification with peptide spectra libraries, the NCBI database derived from genome sequencing currently contains 20,917 bacterial species, and is constantly expanding. This paper presents an accurate method that is used to identify pathogens grown on agar plates, and those isolated directly from urine samples, with high accuracy.
Affiliation(s)
- Ema Svetličić
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Lyngby, Denmark
- Lucija Dončević
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
- Luka Ozdanovac
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
- Andrea Janeš
  - Clinical Department of Laboratory Diagnostics, University Hospital Dubrava, Avenija Gojka Šuška 6, 10000 Zagreb, Croatia
- Andrija Štajduhar
  - Division for Medical Statistics, Andrija Štampar Teaching Institute of Public Health, Mirogojska cesta 16, 10000 Zagreb, Croatia
- Marina Čeprnja
  - Special Hospital Agram, Agram EEIG, Trnjanska cesta 108, 10000 Zagreb, Croatia
- Mario Cindrić
  - Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia
  - Correspondence: Tel.: +385-16384422
4
Alves G, Ogurtsov A, Karlsson R, Jaén-Luchoro D, Piñeiro-Iglesias B, Salvà-Serra F, Andersson B, Moore ERB, Yu YK. Identification of Antibiotic Resistance Proteins via MiCId's Augmented Workflow. A Mass Spectrometry-Based Proteomics Approach. J Am Soc Mass Spectrom 2022; 33:917-931. [PMID: 35500907 PMCID: PMC9164240 DOI: 10.1021/jasms.1c00347] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/17/2021] [Revised: 02/17/2022] [Accepted: 02/18/2022] [Indexed: 06/01/2023]
Abstract
Fast and accurate identifications of pathogenic bacteria along with their associated antibiotic resistance proteins are of paramount importance for patient treatments and public health. To meet this goal from the mass spectrometry aspect, we have augmented the previously published Microorganism Classification and Identification (MiCId) workflow for this capability. To evaluate the performance of this augmented workflow, we have used MS/MS datafiles from samples of 10 antibiotic resistance bacterial strains belonging to three different species: Escherichia coli, Klebsiella pneumoniae, and Pseudomonas aeruginosa. The evaluation shows that MiCId's workflow has a sensitivity value around 85% (with a lower bound at about 72%) and a precision greater than 95% in identifying antibiotic resistance proteins. In addition to having high sensitivity and precision, MiCId's workflow is fast and portable, making it a valuable tool for rapid identifications of bacteria as well as detection of their antibiotic resistance proteins. It performs microorganismal identifications, protein identifications, sample biomass estimates, and antibiotic resistance protein identifications in 6-17 min per MS/MS sample using computing resources that are available in most desktop and laptop computers. We have also demonstrated other use of MiCId's workflow. Using MS/MS data sets from samples of two bacterial clonal isolates, one being antibiotic-sensitive while the other being multidrug-resistant, we applied MiCId's workflow to investigate possible mechanisms of antibiotic resistance in these pathogenic bacteria; the results showed that MiCId's conclusions agree with the published study. The new version of MiCId (v.07.01.2021) is freely available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Affiliation(s)
- Gelio Alves
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Aleksey Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Roger Karlsson
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, 40530 Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, 40234 Gothenburg, Sweden
  - Center for Antibiotic Resistance Research (CARe), University of Gothenburg, 40016 Gothenburg, Sweden
  - Nanoxis Consulting AB, 40234 Gothenburg, Sweden
- Daniel Jaén-Luchoro
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, 40530 Gothenburg, Sweden
  - Center for Antibiotic Resistance Research (CARe), University of Gothenburg, 40016 Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, 40234 Gothenburg, Sweden
- Beatriz Piñeiro-Iglesias
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, 40234 Gothenburg, Sweden
  - Center for Antibiotic Resistance Research (CARe), University of Gothenburg, 40016 Gothenburg, Sweden
- Francisco Salvà-Serra
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, 40530 Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, 40234 Gothenburg, Sweden
  - Center for Antibiotic Resistance Research (CARe), University of Gothenburg, 40016 Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, 40234 Gothenburg, Sweden
  - Microbiology, Department of Biology, University of the Balearic Islands, 07122 Palma de Mallorca, Spain
- Björn Andersson
  - Bioinformatics Core Facility at Sahlgrenska Academy, University of Gothenburg, Box 413, 40530 Gothenburg, Sweden
- Edward R. B. Moore
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, 40530 Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, 40234 Gothenburg, Sweden
  - Center for Antibiotic Resistance Research (CARe), University of Gothenburg, 40016 Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, 40234 Gothenburg, Sweden
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
5
Kondori N, Kurtovic A, Piñeiro-Iglesias B, Salvà-Serra F, Jaén-Luchoro D, Andersson B, Alves G, Ogurtsov A, Thorsell A, Fuchs J, Tunovic T, Kamenska N, Karlsson A, Yu YK, Moore ERB, Karlsson R. Mass Spectrometry Proteotyping-Based Detection and Identification of Staphylococcus aureus, Escherichia coli, and Candida albicans in Blood. Front Cell Infect Microbiol 2021; 11:634215. [PMID: 34381737 PMCID: PMC8350517 DOI: 10.3389/fcimb.2021.634215] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/27/2020] [Accepted: 07/09/2021] [Indexed: 12/12/2022]
Abstract
Bloodstream infections (BSIs), the presence of microorganisms in blood, are potentially serious conditions that can quickly develop into sepsis and life-threatening situations. When assessing proper treatment, rapid diagnosis is the key; besides clinical judgement performed by attending physicians, supporting microbiological tests typically are performed, often requiring microbial isolation and culturing steps, which increases the time required for confirming positive cases of BSI. The additional waiting time forces physicians to prescribe broad-spectrum antibiotics and empirically based treatments, before determining the precise cause of the disease. Thus, alternative and more rapid cultivation-independent methods are needed to improve clinical diagnostics, supporting prompt and accurate treatment and reducing the development of antibiotic resistance. In this study, a culture-independent workflow for pathogen detection and identification in blood samples was developed, using peptide biomarkers and applying bottom-up proteomics analyses, i.e., so-called "proteotyping". To demonstrate the feasibility of detection of blood infectious pathogens, using proteotyping, Escherichia coli and Staphylococcus aureus were included in the study, as the most prominent bacterial causes of bacteremia and sepsis, as well as Candida albicans, one of the most prominent causes of fungemia. Model systems including spiked negative blood samples, as well as positive blood cultures, without further culturing steps, were investigated. Furthermore, an experiment designed to determine the incubation time needed for correct identification of the infectious pathogens in blood cultures was performed. The results for the spiked negative blood samples showed that proteotyping was 100- to 1,000-fold more sensitive, in comparison with the MALDI-TOF MS-based approach. Furthermore, in the analyses of ten positive blood cultures each of E. coli and S. aureus, both the MALDI-TOF MS-based and proteotyping approaches were successful in the identification of E. coli, although only proteotyping could identify S. aureus correctly in all samples. Compared with the MALDI-TOF MS-based approaches, shotgun proteotyping demonstrated higher sensitivity and accuracy, and required significantly shorter incubation time before detection and identification of the correct pathogen could be accomplished.
Affiliation(s)
- Nahid Kondori
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Amra Kurtovic
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Francisco Salvà-Serra
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, Gothenburg, Sweden
  - Microbiology, Department of Biology, University of the Balearic Islands, Palma de Mallorca, Spain
- Daniel Jaén-Luchoro
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, Gothenburg, Sweden
- Björn Andersson
  - Bioinformatics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Gelio Alves
  - National Center for Biotechnology Information (NCBI), Bethesda, MD, United States
- Aleksey Ogurtsov
  - National Center for Biotechnology Information (NCBI), Bethesda, MD, United States
- Annika Thorsell
  - Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Johannes Fuchs
  - Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Timur Tunovic
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Nina Kamenska
  - Norra-Älvsborgs-Länssjukhus (NÄL), Trollhättan, Sweden
- Yi-Kuo Yu
  - National Center for Biotechnology Information (NCBI), Bethesda, MD, United States
- Edward R. B. Moore
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
  - Culture Collection University of Gothenburg (CCUG), Sahlgrenska Academy of the University of Gothenburg, Gothenburg, Sweden
- Roger Karlsson
  - Department of Infectious Diseases, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Clinical Microbiology, Sahlgrenska University Hospital, Gothenburg, Sweden
  - Nanoxis Consulting AB, Gothenburg, Sweden
6
Kuhring M, Doellinger J, Nitsche A, Muth T, Renard BY. TaxIt: An Iterative Computational Pipeline for Untargeted Strain-Level Identification Using MS/MS Spectra from Pathogenic Single-Organism Samples. J Proteome Res 2020; 19:2501-2510. [PMID: 32362126 DOI: 10.1021/acs.jproteome.9b00714] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
Untargeted accurate strain-level classification of a priori unidentified organisms using tandem mass spectrometry is a challenging task. Reference databases often lack taxonomic depth, limiting peptide assignments to the species level. However, the extension with detailed strain information increases runtime and decreases statistical power. In addition, larger databases contain a higher number of similar proteomes. We present TaxIt, an iterative workflow to address the increasing search space required for MS/MS-based strain-level classification of samples with unknown taxonomic origin. TaxIt first applies reference sequence data for initial identification of species candidates, followed by automated acquisition of relevant strain sequences for low level classification. Furthermore, proteome similarities resulting in ambiguous taxonomic assignments are addressed with an abundance weighting strategy to increase the confidence in candidate taxa. For benchmarking the performance of our method, we apply our iterative workflow on several samples of bacterial and viral origin. In comparison to noniterative approaches using unique peptides or advanced abundance correction, TaxIt identifies microbial strains correctly in all examples presented (with one tie), thereby demonstrating the potential for untargeted and deeper taxonomic classification. TaxIt makes extensive use of public, unrestricted, and continuously growing sequence resources such as the NCBI databases and is available under open-source BSD license at https://gitlab.com/rki_bioinformatics/TaxIt.
Affiliation(s)
- Mathias Kuhring
  - Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
  - Core Unit Bioinformatics, Berlin Institute of Health (BIH), 10178 Berlin, Germany
  - Berlin Institute of Health Metabolomics Platform, Berlin Institute of Health (BIH), 10178 Berlin, Germany
  - Max Delbrück Center (MDC) for Molecular Medicine, 13125 Berlin, Germany
- Joerg Doellinger
  - Centre for Biological Threats and Special Pathogens, Proteomics and Spectroscopy (ZBS 6), Robert Koch Institute, 13353 Berlin, Germany
  - Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), Robert Koch Institute, 13353 Berlin, Germany
- Andreas Nitsche
  - Centre for Biological Threats and Special Pathogens, Highly Pathogenic Viruses (ZBS 1), Robert Koch Institute, 13353 Berlin, Germany
- Thilo Muth
  - Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
  - eScience Division (S.3), Federal Institute for Materials Research and Testing, 12489 Berlin, Germany
- Bernhard Y Renard
  - Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, 14482 Potsdam, Germany
7
Alves G, Yu YK. Robust Accurate Identification and Biomass Estimates of Microorganisms via Tandem Mass Spectrometry. J Am Soc Mass Spectrom 2020; 31:85-102. [PMID: 32881514 PMCID: PMC10501333 DOI: 10.1021/jasms.9b00035] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 06/11/2023]
Abstract
Rapid and accurate identification of microorganisms and estimation of their biomasses are of extreme importance to public health. Mass spectrometry has become an important technique for these purposes. Previously we published a workflow named Microorganism Classification and Identification (MiCId v.12.26.2017) that was shown to perform no worse than other workflows. This manuscript presents MiCId v.12.13.2018 that, in comparison with the earlier version v.12.26.2017, allows for biomass estimates, provides more accurate microorganism identifications (better controls the number of false positives), and is robust against database size increase. This significant advance is made possible by several new ingredients introduced: first, we apply a modified expectation-maximization method to compute for each taxon considered a prior probability, which can be used for biomass estimate; second, we introduce a new concept called ownership, through which the participation ratio is computed and use it as the number of taxa to be kept within a cluster of closely related taxa; third, based on confidently identified peptides, we calculate for each taxon its degree of independence from the rest of taxa considered to determine whether or not to split this taxon off the cluster. Using 270 data files, each containing a large number of MS/MS spectra, we show that, in comparison with v.12.26.2017, version v.12.13.2018 yields superior retrieval results. We also show that MiCId v.12.13.2018 can estimate species biomass reasonably well. The new MiCId v.12.13.2018, designed to run in Linux environment, is freely available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
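The abstract's first ingredient, a per-taxon prior probability obtained by expectation-maximization, can be illustrated with the textbook EM update for mixture proportions. This is only a sketch of the standard algorithm, not MiCId's modified version, whose details the abstract does not give. The input layout (`likelihoods[i][k]` as a score for peptide `i` under taxon `k`), the function name, and the toy data are all assumptions made for illustration.

```python
def em_taxon_priors(likelihoods, max_iter=500, tol=1e-10):
    """Textbook EM for mixture proportions (not MiCId's modified variant).

    likelihoods[i][k] is an assumed non-negative score P(peptide i | taxon k).
    Returns one prior probability per taxon.
    """
    if any(sum(row) <= 0 for row in likelihoods):
        raise ValueError("every peptide needs a positive likelihood for some taxon")
    n_taxa = len(likelihoods[0])
    priors = [1.0 / n_taxa] * n_taxa
    for _ in range(max_iter):
        # E-step: share each peptide among taxa in proportion to prior * likelihood.
        totals = [0.0] * n_taxa
        for row in likelihoods:
            weighted = [p * l for p, l in zip(priors, row)]
            norm = sum(weighted)
            for k, w in enumerate(weighted):
                totals[k] += w / norm
        # M-step: the new prior is the average responsibility per taxon.
        updated = [t / len(likelihoods) for t in totals]
        converged = max(abs(a - b) for a, b in zip(updated, priors)) < tol
        priors = updated
        if converged:
            break
    return priors


# Hypothetical toy input: three peptides unique to taxon 0, one shared with taxon 1.
toy_likelihoods = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
toy_priors = em_taxon_priors(toy_likelihoods)
```

In this toy case taxon 1 has no peptide of its own, so the shared peptide is progressively reassigned to taxon 0 and the prior for taxon 1 shrinks toward zero.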
Affiliation(s)
- Gelio Alves
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
8
Chen SH, Parker CH, Croley TR, McFarland MA. Identification of Salmonella Taxon-Specific Peptide Markers to the Serovar Level by Mass Spectrometry. Anal Chem 2019; 91:4388-4395. [PMID: 30860807 DOI: 10.1021/acs.analchem.8b04843] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
We present an LC-MS/MS pipeline to identify taxon-specific tryptic peptide markers for the identification of Salmonella at the genus, species, subspecies, and serovar levels of specificity. Salmonella enterica subsp. enterica serovars Typhimurium and its four closest relatives, Saintpaul, Heidelberg, Paratyphi B, and Muenchen, were evaluated. A decision-tree approach was used to identify peptides common to the five Salmonella proteomes for evaluation as genus-, species-, and subspecies-specific markers. Peptides identified for two or fewer Salmonella strains were evaluated as potential serovar markers. Currently, there are approximately 140 000 assembled bacterial genomes publicly available, more than 8500 of which are for Salmonella. Consequently, the specificity of each candidate peptide marker was confirmed across all publicly available protein sequences in the NCBI nonredundant (nr) database. The performance of a subset of candidate taxon-specific peptide markers was evaluated in a targeted mass-spectrometry method. The presented workflow offers a marked improvement in specificity over existing MALDI-TOF-based bacterial identification platforms for the identification of closely related Salmonella serovars.
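The first split of the decision tree described above (peptides seen in all five proteomes versus peptides seen in two or fewer strains) can be sketched as a simple count over strains. This is a minimal illustration rather than the authors' pipeline: the function name is hypothetical, the peptide strings are made-up placeholders, and the later specificity check against the NCBI nr database is not modeled.

```python
from collections import Counter


def partition_candidates(peptides_by_strain, serovar_max=2):
    """Split observed peptides into shared and serovar-level candidates.

    peptides_by_strain maps a strain name to the set of peptides identified in it.
    """
    counts = Counter(p for peps in peptides_by_strain.values() for p in set(peps))
    n_strains = len(peptides_by_strain)
    # Seen in every strain: candidates for genus/species/subspecies markers.
    shared = sorted(p for p, c in counts.items() if c == n_strains)
    # Seen in at most `serovar_max` strains: candidates for serovar markers.
    serovar = sorted(p for p, c in counts.items() if c <= serovar_max)
    return shared, serovar


# Placeholder peptides; only the serovar names come from the abstract.
observed = {
    "Typhimurium": {"AAK", "CCK", "DDR"},
    "Saintpaul": {"AAK", "CCK"},
    "Heidelberg": {"AAK", "EER"},
    "Paratyphi B": {"AAK"},
    "Muenchen": {"AAK", "EER", "FFK"},
}
shared_candidates, serovar_candidates = partition_candidates(observed)
```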
Affiliation(s)
- Shu-Hua Chen
  - U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, Maryland 20740, United States
- Christine H Parker
  - U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, Maryland 20740, United States
- Timothy R Croley
  - U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, Maryland 20740, United States
- Melinda A McFarland
  - U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, College Park, Maryland 20740, United States
9
Alves G, Wang G, Ogurtsov AY, Drake SK, Gucek M, Sacks DB, Yu YK. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry. J Am Soc Mass Spectrom 2018; 29:1721-1737. [PMID: 29873019 PMCID: PMC6061032 DOI: 10.1007/s13361-018-1986-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 11/13/2017] [Revised: 03/30/2018] [Accepted: 04/25/2018] [Indexed: 05/30/2023]
Abstract
Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Affiliation(s)
- Gelio Alves
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Guanghui Wang
  - Proteomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Aleksey Y Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Steven K Drake
  - Critical Care Medicine Department, Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA
- Marjan Gucek
  - Proteomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- David B Sacks
  - Department of Laboratory Medicine, Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
|
10
|
Blanco-Míguez A, Fdez-Riverola F, Lourenço A, Sánchez B. P4P: a peptidome-based strain-level genome comparison web tool. Nucleic Acids Res 2017; 45:W265-W269. [PMID: 28482090 PMCID: PMC5570244 DOI: 10.1093/nar/gkx389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2017] [Accepted: 05/05/2017] [Indexed: 12/02/2022] Open
Abstract
Peptidome similarity analysis enables researchers to gain insights into differential peptide profiles, providing a robust tool to discriminate strain-specific peptides, true intra-species differences among biological replicates or even microorganism-phenotype variations. However, no in silico peptide fingerprinting software existed to facilitate such phylogeny inference. Hence, we developed the Peptidomes for Phylogenies (P4P) web tool, which enables the survey of similarities between microbial proteomes and simplifies the process of obtaining new biological insights into their phylogeny. P4P can be used to analyze different peptide datasets, e.g. bacteria, viruses, eukaryotic species or even metaproteomes. It can also work with whole proteome datasets and experimental mass-to-charge lists originating from mass spectrometers. The ultimate aim is to generate a valid and manageable list of peptides that have phylogenetic signal and are potentially sample-specific. Sample-to-sample comparison is based on a consensus peak set matrix, which can be further submitted to phylogenetic analysis. P4P holds great potential for improving phylogenetic analyses in challenging taxonomic groups, biomarker identification or epidemiologic studies. Notably, P4P can be of interest for applications handling large proteomic datasets, which it is able to reduce to small matrices while maintaining high phylogenetic resolution. The web server is available at http://sing-group.org/p4p.
Affiliation(s)
- Aitor Blanco-Míguez
  - ESEI-Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas S/N 32004, Ourense, Spain
  - CINBIO-Centro de Investigaciones Biomédicas, University of Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain
  - CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
- Florentino Fdez-Riverola
  - ESEI-Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas S/N 32004, Ourense, Spain
  - CINBIO-Centro de Investigaciones Biomédicas, University of Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain
- Anália Lourenço
  - ESEI-Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas S/N 32004, Ourense, Spain
  - CINBIO-Centro de Investigaciones Biomédicas, University of Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain
  - Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias (IPLA), Consejo Superior de Investigaciones Científicas (CSIC), Paseo Río Linares S/N 33300, Villaviciosa, Asturias, Spain
- Borja Sánchez
  - CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
|
11
|
Blanco-Míguez A, Meier-Kolthoff JP, Gutiérrez-Jácome A, Göker M, Fdez-Riverola F, Sánchez B, Lourenço A. Improving Phylogeny Reconstruction at the Strain Level Using Peptidome Datasets. PLoS Comput Biol 2016; 12:e1005271. [PMID: 28033346 PMCID: PMC5198984 DOI: 10.1371/journal.pcbi.1005271] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/29/2016] [Accepted: 11/28/2016] [Indexed: 11/18/2022] Open
Abstract
Typical bacterial strain differentiation methods are often challenged by high genetic similarity between strains. To address this problem, we introduce a novel in silico peptide fingerprinting method based on conventional wet-lab protocols that enables the identification of potential strain-specific peptides. These can be further investigated using in vitro approaches, laying a foundation for the development of biomarker detection and application-specific methods. This novel method aims at reducing large amounts of comparative peptide data to binary matrices while maintaining a high phylogenetic resolution. The underlying case study concerns the Bacillus cereus group, namely the differentiation of Bacillus thuringiensis, Bacillus anthracis and Bacillus cereus strains. Results show that trees based on cytoplasmic and extracellular peptidomes are only marginally in conflict with those based on whole proteomes, as inferred by the established Genome-BLAST Distance Phylogeny (GBDP) method. Hence, these results indicate that the two approaches can most likely be used complementarily even in other organismal groups. The obtained results confirm previous reports about the misclassification of many strains within the B. cereus group. Moreover, our method was able to separate the B. anthracis strains with high resolution, similarly to the GBDP results as benchmarked via Bayesian inference and both Maximum Likelihood and Maximum Parsimony. In addition to the presented phylogenomic applications, whole-peptide fingerprinting might also become a valuable complementary technique to digital DNA-DNA hybridization, notably for bacterial classification at the species and subspecies level in the future.
Affiliation(s)
- Aitor Blanco-Míguez
  - ESEI–Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense, Spain
  - Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias (IPLA), Consejo Superior de Investigaciones Científicas (CSIC), Villaviciosa, Asturias, Spain
- Jan P. Meier-Kolthoff
  - Leibniz Institute DSMZ–German Collection of Microorganisms and Cell Cultures GmbH, Inhoffenstraße 7B, Braunschweig, Germany
- Alberto Gutiérrez-Jácome
  - ESEI–Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense, Spain
- Markus Göker
  - Leibniz Institute DSMZ–German Collection of Microorganisms and Cell Cultures GmbH, Inhoffenstraße 7B, Braunschweig, Germany
- Florentino Fdez-Riverola
  - ESEI–Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense, Spain
- Borja Sánchez
  - Department of Microbiology and Biochemistry of Dairy Products, Instituto de Productos Lácteos de Asturias (IPLA), Consejo Superior de Investigaciones Científicas (CSIC), Villaviciosa, Asturias, Spain
- Anália Lourenço
  - ESEI–Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense, Spain
  - CEB–Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga, Portugal
|