1. Uvarova YE, Demenkov PS, Kuzmicheva IN, Venzel AS, Mischenko EL, Ivanisenko TV, Efimov VM, Bannikova SV, Vasilieva AR, Ivanisenko VA, Peltek SE. Accurate noise-robust classification of Bacillus species from MALDI-TOF MS spectra using a denoising autoencoder. J Integr Bioinform 2023; 20:jib-2023-0017. [PMID: 37978847 PMCID: PMC10757077 DOI: 10.1515/jib-2023-0017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 07/10/2023] [Indexed: 11/19/2023] Open
Abstract
Bacillus strains are ubiquitous in the environment and are widely used in the microbiological industry as valuable enzyme sources, as well as in agriculture to stimulate plant growth. The Bacillus genus comprises several closely related groups of species, and their rapid classification remains challenging with existing methods. Techniques based on MALDI-TOF MS data analysis hold significant promise for fast and precise microbial strain classification at both the genus and species levels. In previous work, we proposed a geometric approach to Bacillus strain classification based on mass spectra analysis via the centroid method (CM). One limitation of such methods is the noise in MS spectra. In this study, we used a denoising autoencoder (DAE) to improve bacteria classification accuracy under noisy MS spectra conditions. The DAE converts noisy MS spectra into latent variables representing molecular patterns in the original MS data, and the Random Forest (RF) method classifies bacterial strains from these latent variables. Comparison of DAE-RF with the CM method on artificially noisy test samples showed that DAE-RF offers higher noise robustness. Hence, the DAE-RF method could be used for noise-robust, fast, and accurate classification of Bacillus species from MALDI-TOF MS data.
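The evaluation idea in this abstract, classifying artificially noised test spectra, can be illustrated with a minimal sketch. It uses a generic nearest-centroid classifier as a stand-in for the centroid method; the abstract does not specify the authors' CM, the DAE architecture, or the noise model, so the Gaussian noise, the function names, and the toy binned spectra below are all assumptions made for illustration only.

```python
import random
from math import dist

def centroids(spectra, labels):
    """Mean intensity vector per class (one centroid per label)."""
    sums, counts = {}, {}
    for x, y in zip(spectra, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(cents, x):
    """Assign x to the class whose centroid is nearest (Euclidean distance)."""
    return min(cents, key=lambda y: dist(cents[y], x))

def add_noise(x, sigma, rng):
    """Additive Gaussian noise (an assumed noise model), clipped at zero
    because spectral intensities are non-negative."""
    return [max(0.0, v + rng.gauss(0.0, sigma)) for v in x]

def accuracy(cents, spectra, labels, sigma, rng):
    """Fraction of spectra classified correctly after noise of scale sigma."""
    hits = sum(
        predict(cents, add_noise(x, sigma, rng)) == y
        for x, y in zip(spectra, labels)
    )
    return hits / len(labels)

# Toy binned spectra (hypothetical values, not data from the paper).
train_x = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 0.2, 1.0], [0.1, 0.1, 0.9]]
train_y = ["A", "A", "B", "B"]
cents = centroids(train_x, train_y)

rng = random.Random(0)
for sigma in (0.0, 0.2, 1.0):
    print(sigma, accuracy(cents, train_x, train_y, sigma, rng))
```

Sweeping `sigma` in this way gives a noise-robustness curve for one classifier; comparing two methods, as the abstract describes, would mean running the same sweep for each on identical noised samples.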
Affiliation(s)
- Yulia E. Uvarova
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Pavel S. Demenkov
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Kurchatov Center for Genome Research, Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Novosibirsk State University, 630090 Novosibirsk, Russia
- Artur S. Venzel
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Novosibirsk State University, 630090 Novosibirsk, Russia
- Elena L. Mischenko
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Timofey V. Ivanisenko
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Kurchatov Center for Genome Research, Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Vadim M. Efimov
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Svetlana V. Bannikova
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Asya R. Vasilieva
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Vladimir A. Ivanisenko
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Kurchatov Center for Genome Research, Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Novosibirsk State University, 630090 Novosibirsk, Russia
- Sergey E. Peltek
  - Federal Research Center Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
  - Kurchatov Center for Genome Research, Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
2. Recent Studies on Advance Spectroscopic Techniques for the Identification of Microorganisms: A Review. Arab J Chem 2022. [DOI: 10.1016/j.arabjc.2022.104521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/16/2022] Open
3. Serum amino acids quantification by plasmonic colloidosome-coupled MALDI-TOF MS for triple-negative breast cancer diagnosis. Mater Today Bio 2022; 17:100486. [DOI: 10.1016/j.mtbio.2022.100486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 10/29/2022] [Accepted: 11/01/2022] [Indexed: 11/08/2022]
4. Lazari LC, Zerbinati RM, Rosa-Fernandes L, Santiago VF, Rosa KF, Angeli CB, Schwab G, Palmieri M, Sarmento DJS, Marinho CRF, Almeida JD, To K, Giannecchini S, Wrenger C, Sabino EC, Martinho H, Lindoso JAL, Durigon EL, Braz-Silva PH, Palmisano G. MALDI-TOF mass spectrometry of saliva samples as a prognostic tool for COVID-19. J Oral Microbiol 2022; 14:2043651. [PMID: 35251522 PMCID: PMC8890567 DOI: 10.1080/20002297.2022.2043651] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/02/2022] Open
Affiliation(s)
- Lucas C. Lazari
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Rodrigo M. Zerbinati
  - Laboratory of Virology (LIM-52-HC-FMUSP), Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo, Brazil
- Livia Rosa-Fernandes
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
  - Laboratory of Experimental Immunoparasitology, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Veronica Feijoli Santiago
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Klaise F. Rosa
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Claudia B. Angeli
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Gabriela Schwab
  - Laboratory of Virology (LIM-52-HC-FMUSP), Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo, Brazil
- Michelle Palmieri
  - Department of Stomatology, School of Dentistry, University of São Paulo, São Paulo, Brazil
- Dmitry J. S. Sarmento
  - Department of Stomatology, School of Dentistry, University of São Paulo, São Paulo, Brazil
- Claudio R. F. Marinho
  - Laboratory of Experimental Immunoparasitology, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Janete Dias Almeida
  - Department of Biosciences and Oral Diagnosis, Institute of Science and Technology, São Paulo State University, São José dos Campos, Brazil
- Kelvin To
  - State Key Laboratory for Emerging Infectious Diseases, Department of Microbiology, Carol Yu Centre for Infection, Li KaShing Faculty of Medicine of the University of Hong Kong, Hong Kong, Special Administrative Region, China
- Simone Giannecchini
  - Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
- Carsten Wrenger
  - Unit for Drug Discovery, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
- Ester C. Sabino
  - Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo, Brazil
- Herculano Martinho
  - Centro de Ciencias Naturais e Humanas, Universidade Federal do ABC, Santo André, Brazil
- José A. L. Lindoso
  - Institute of Infectious Diseases Emílio Ribas, São Paulo, Brazil
  - Laboratory of Protozoology (LIM-49-HC-FMUSP), Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo, Brazil
  - Department of Infectious Diseases, School of Medicine, University of São Paulo, São Paulo, Brazil
- Edison L. Durigon
  - Laboratory of Clinical and Molecular Virology, Department of Microbiology, ICB, University of São Paulo, São Paulo, Brazil
- Paulo H. Braz-Silva
  - Laboratory of Virology (LIM-52-HC-FMUSP), Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo, Brazil
  - Department of Stomatology, School of Dentistry, University of São Paulo, São Paulo, Brazil
- Giuseppe Palmisano
  - GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil
5. Hua D, Desaire H. Improved Discrimination of Disease States Using Proteomics Data with the Updated Aristotle Classifier. J Proteome Res 2021; 20:2823-2829. [PMID: 33909976 DOI: 10.1021/acs.jproteome.1c00066] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
Mass spectrometry data sets from omics studies are an optimal information source for discriminating patients with disease and identifying biomarkers. Thousands of proteins or endogenous metabolites can be queried in each analysis, spanning several orders of magnitude in abundance. Machine learning tools that effectively leverage these data to accurately identify disease states are in high demand. While mass spectrometry data sets are rich with potentially useful information, using the data effectively can be challenging because of missing entries in the data sets and because the number of samples is typically much smaller than the number of features, two challenges that make machine learning difficult. To address this problem, we have modified a new supervised classification tool, the Aristotle Classifier, so that omics data sets can be better leveraged for identifying disease states. The optimized classifier, AC.2021, is benchmarked on multiple data sets against its predecessor and two leading supervised classification tools, Support Vector Machine (SVM) and XGBoost. The new classifier, AC.2021, outperformed existing tools on multiple tests using proteomics data. The underlying code for the classifier, provided herein, would be useful for researchers who desire improved classification accuracy when using their omics data sets to identify disease states.
Affiliation(s)
- David Hua
  - Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States
- Heather Desaire
  - Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States
6. He Q, Sun C, Liu J, Pan Y. MALDI-MSI analysis of cancer drugs: Significance, advances, and applications. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116183] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
7. Desaire H, Patabandige MW, Hua D. The local-balanced model for improved machine learning outcomes on mass spectrometry data sets and other instrumental data. Anal Bioanal Chem 2021; 413:1583-1593. [PMID: 33580828 PMCID: PMC8516084 DOI: 10.1007/s00216-020-03117-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/12/2020] [Revised: 11/17/2020] [Accepted: 12/08/2020] [Indexed: 11/25/2022]
Abstract
One unifying challenge when classifying biological samples with mass spectrometry data is overcoming the obstacle of sample-to-sample variability so that differences between groups, such as between a healthy set and a disease set, can be identified. Similarly, when the same sample is re-analyzed under identical conditions, instrument signals can fluctuate by more than 10%. This signal inconsistency imposes difficulties in identifying subtle differences across a set of samples, and it weakens the mass spectrometrist’s ability to effectively leverage data in domains as diverse as proteomics, metabolomics, glycomics, and imaging. We selected challenging data sets in the fields of glycomics, mass spectrometry imaging, and bacterial typing to study the problem of within-group signal variability and adapted a 30-year-old statistical approach to address the problem. The solution, the “local-balanced model,” relies on using balanced subsets of training data to classify test samples. This analysis strategy was assessed on ESI-MS data of IgG-based glycopeptides, MALDI-MS imaging data of endogenous lipids, and MALDI-MS data of bacterial proteins. Two preliminary examples on non-mass spectrometry data sets are also included to show the potential generality of the method outside the field of MS analysis. We demonstrate that this approach is superior to simple normalization methods, generalizable to multiple mass spectrometry domains, and potentially appropriate in fields as diverse as physics and satellite imaging. In some cases, improvements in classification can be dramatic, with accuracy rising from 60% with normalization alone to over 90% with the additional development described herein.
Affiliation(s)
- Heather Desaire
  - Department of Chemistry, University of Kansas, Lawrence, KS, 66045, USA
- David Hua
  - Department of Chemistry, University of Kansas, Lawrence, KS, 66045, USA
8. Identification and dereplication of endophytic Colletotrichum strains by MALDI TOF mass spectrometry and molecular networking. Sci Rep 2020; 10:19788. [PMID: 33188275 PMCID: PMC7666161 DOI: 10.1038/s41598-020-74852-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/28/2020] [Accepted: 09/29/2020] [Indexed: 01/09/2023] Open
Abstract
The chemical diversity of biologically active fungal strains from 42 Colletotrichum, isolated from leaves of the tropical palm species Astrocaryum sciophilum collected in pristine forests of French Guiana, was investigated. The collection was first classified based on protein fingerprints acquired by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) correlated with cytotoxicity. Liquid chromatography coupled to high-resolution tandem mass spectrometry (LC-HRMS/MS) data from ethyl acetate extracts were acquired and processed to generate a massive molecular network (MN) using the MetGem software. From five Colletotrichum strains producing cytotoxic specialized metabolites, we predicted the occurrence of peptide and cytochalasin analogues in four of them by MN, including similar ion clusters in the MN algorithm provided by the MetGem software. Chemoinformatics predictions were fully confirmed after isolation of three pentacyclopeptides (cyclo(Phe-Leu-Leu-Leu-Val), cyclo(Phe-Leu-Leu-Leu-Leu) and cyclo(Phe-Leu-Leu-Leu-Ile)) and two cytochalasins (cytochalasin C and cytochalasin D) exhibiting cytotoxicity at micromolar concentrations. Finally, the chemical study of the last active cytotoxic strain, BSNB-0583, led to the isolation of four colletamides bearing an identical decadienamide chain.
9. De Bruyne S, Speeckaert MM, Van Biesen W, Delanghe JR. Recent evolutions of machine learning applications in clinical laboratory medicine. Crit Rev Clin Lab Sci 2020; 58:131-152. [PMID: 33045173 DOI: 10.1080/10408363.2020.1828811] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
Abstract
Machine learning (ML) is gaining increased interest in clinical laboratory medicine, mainly triggered by the decreased cost of generating and storing data using laboratory automation and computational power, and the widespread accessibility of open source tools. Nevertheless, only a handful of ML-based products are currently commercially available for routine clinical laboratory practice. In this review, we start with an introduction to ML by providing an overview of the ML landscape, its general workflow, and the most commonly used algorithms for clinical laboratory applications. Furthermore, we aim to illustrate recent evolutions (2018 to mid-2020) of the techniques used in the clinical laboratory setting and discuss the associated challenges and opportunities. In the field of clinical chemistry, the reviewed applications of ML algorithms include quality review of lab results, automated urine sediment analysis, disease or outcome prediction from routine laboratory parameters, and interpretation of complex biochemical data. In the hematology subdiscipline, we discuss the concepts of automated blood film reporting and malaria diagnosis. Finally, we cover a broad range of clinical microbiology applications, such as the reduction of diagnostic workload by laboratory automation, the detection and identification of clinically relevant microorganisms, and the detection of antimicrobial resistance.
Affiliation(s)
- Sander De Bruyne
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Wim Van Biesen
  - Department of Nephrology, Ghent University Hospital, Ghent, Belgium
- Joris R Delanghe
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
10. Weis C, Jutzeler C, Borgwardt K. Machine learning for microbial identification and antimicrobial susceptibility testing on MALDI-TOF mass spectra: a systematic review. Clin Microbiol Infect 2020; 26:1310-1317. [DOI: 10.1016/j.cmi.2020.03.014] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/03/2019] [Revised: 03/05/2020] [Accepted: 03/13/2020] [Indexed: 01/12/2023]
11. Hua D, Liu X, Go EP, Wang Y, Hummon AB, Desaire H. How to Apply Supervised Machine Learning Tools to MS Imaging Files: Case Study with Cancer Spheroids Undergoing Treatment with the Monoclonal Antibody Cetuximab. J Am Soc Mass Spectrom 2020; 31:1350-1357. [PMID: 32469221 PMCID: PMC7685566 DOI: 10.1021/jasms.0c00010] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 05/09/2023]
Abstract
As the field of mass spectrometry imaging continues to grow, so too do its needs for optimal methods of data analysis. One general need in image analysis is the ability to classify the underlying regions within an image, as healthy or diseased, for example. Classification, as a general problem, is often best accomplished by supervised machine learning strategies; unfortunately, conducting supervised machine learning on MS imaging files is not typically done by mass spectrometrists because a high degree of specialized knowledge is needed. To address this problem, we developed a fully open-source approach that facilitates supervised machine learning on MS imaging files, and we demonstrated its implementation on sets of cancer spheroids that either had or had not undergone chemotherapy treatment. These supervised machine learning studies demonstrated that metabolic changes induced by the monoclonal antibody, Cetuximab, are detectable but modest at 24 h, and by 72 h, the drug induces a larger and more diverse metabolic response.
Affiliation(s)
- David Hua
  - Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States
- Xin Liu
  - Department of Chemistry and Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States
- Eden P. Go
  - Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States
- Yijia Wang
  - Department of Chemistry and Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States
- Amanda B. Hummon
  - Department of Chemistry and Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, United States
- Heather Desaire
  - Department of Chemistry, University of Kansas, Lawrence, Kansas 66045, United States
12. Time to Positivity as a Prognostic Tool in the Performance of Short-Term Subculture for MALDI-TOF MS-Based Identification of Microorganisms from Positive Blood Cultures in Pediatric Patients. Curr Microbiol 2020; 77:953-958. [DOI: 10.1007/s00284-020-01900-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2019] [Accepted: 01/21/2020] [Indexed: 10/25/2022]