Liu CC, Hsiao WWL. Large-scale comparative genomics to refine the organization of the global Salmonella enterica population structure. Microb Genom 2022; 8:mgen000906. [PMID: 36748524 PMCID: PMC9837569 DOI: 10.1099/mgen.0.000906]
[Indexed: 12/12/2022]
Abstract
The White-Kauffmann-Le Minor (WKL) scheme is the most widely used Salmonella typing scheme for reporting the disease prevalence of the enteric pathogen. With the advent of whole-genome sequencing (WGS), in silico methods have increasingly replaced traditional serotyping due to reproducibility, speed and coverage. However, despite the integration of genomic-based typing by in silico serotyping tools such as SISTR, in silico serotyping remains ambiguous and insufficiently informative in certain contexts. Specifically, in silico serotyping does not attempt to resolve polyphyly. Furthermore, in spite of the widespread acknowledgement of polyphyly from genomic studies, the prevalence of polyphyletic serovars is not well characterized. Here, we applied a genomics approach to acquire the necessary resolution to classify genetically discordant serovars and propose an alternative typing scheme that consistently reflects natural Salmonella populations. By accessing the unprecedented volume of bacterial genomic data publicly available in the GenomeTrakr and PubMLST databases (>180 000 genomes representing 723 serovars), we characterized the global Salmonella population structure and systematically identified putative non-monophyletic serovars. The proportion of putative non-monophyletic serovars was estimated to be higher than previously reported, reinforcing the inability of antigenic determinants to depict the complexity of Salmonella evolutionary history. We explored the extent of genetic diversity masked by serotyping labels and found significant intra-serovar molecular differences across many clinically important serovars. To avoid false discovery due to incorrect in silico serotyping calls, we cross-referenced reported serovar labels and concluded that the error rate of in silico serotyping is low. The combined application of clustering statistics and genome-wide association methods demonstrated effective characterization of stable bacterial populations and explained functional differences.
The collective methods adopted in our study have practical value in establishing genomic-based typing nomenclatures for an entire microbial species or closely related subpopulations. Ultimately, we foresee an improved typing scheme as a hybrid that integrates both genomic and antigenic information, such that the resolution from WGS is leveraged to improve the precision of subpopulation classification while preserving the common names defined by the WKL scheme.