1
Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832 PMCID: PMC11152613 DOI: 10.1016/j.csbj.2024.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024] Open
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
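As a rough illustration of the k-mer and absent-sequence ideas summarized in this abstract, the sketch below counts overlapping k-mers in a sequence and lists the k-mers of a fixed length that never occur. The function names and the toy sequence are hypothetical and not taken from the review; real nullomer searches work on whole genomes and typically look for the shortest absent sequences, which this simplification does not attempt.

```python
from collections import Counter
from itertools import product


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in the sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def absent_kmers(sequence: str, k: int, alphabet: str = "ACGT") -> list[str]:
    """Return all k-mers over the alphabet that never occur in the sequence."""
    observed = count_kmers(sequence, k)
    candidates = ("".join(letters) for letters in product(alphabet, repeat=k))
    return [kmer for kmer in candidates if kmer not in observed]


# Hypothetical toy sequence, used only to exercise the two functions.
toy_sequence = "ACGTACGTGA"
kmer_counts = count_kmers(toy_sequence, 2)
missing_2mers = absent_kmers(toy_sequence, 2)
```

A hash-based counter like this is fine for short sequences; for genome-scale data, dedicated k-mer counters are the usual choice because the number of possible k-mers grows as 4^k.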
Affiliation(s)
- Camille Moeckel
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
2
Mayeur H, Leyhr J, Mulley J, Leurs N, Michel L, Sharma K, Lagadec R, Aury JM, Osborne OG, Mulhair P, Poulain J, Mangenot S, Mead D, Smith M, Corton C, Oliver K, Skelton J, Betteridge E, Dolucan J, Dudchenko O, Omer AD, Weisz D, Aiden EL, McCarthy S, Sims Y, Torrance J, Tracey A, Howe K, Baril T, Hayward A, Martinand-Mari C, Sanchez S, Haitina T, Martin K, Korsching SI, Mazan S, Debiais-Thibaud M. The sensory shark: high-quality morphological, genomic and transcriptomic data for the small-spotted catshark Scyliorhinus canicula reveal the molecular bases of sensory organ evolution in jawed vertebrates. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.23.595469. [PMID: 39005470 PMCID: PMC11244906 DOI: 10.1101/2024.05.23.595469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Cartilaginous fishes (chimaeras and elasmobranchs: sharks, skates and rays) hold a key phylogenetic position to explore the origin and diversifications of jawed vertebrates. Here, we report and integrate reference genomic, transcriptomic and morphological data in the small-spotted catshark Scyliorhinus canicula to shed light on the evolution of sensory organs. We first characterise general aspects of the catshark genome, confirming the high conservation of genome organisation across cartilaginous fishes, and investigate population genomic signatures. Taking advantage of a dense sampling of transcriptomic data, we also identify gene signatures for all major organs, including chondrichthyan specializations, and evaluate expression diversifications between paralogs within major gene families involved in sensory functions. Finally, we combine these data with 3D synchrotron imaging and in situ gene expression analyses to explore chondrichthyan-specific traits and more general evolutionary trends of sensory systems. This approach brings to light, among other findings, novel markers of the ampullae of Lorenzini electro-sensory cells, a duplication hotspot for crystallin genes conserved in jawed vertebrates, and a new metazoan clade of the Transient-receptor potential (TRP) family. These resources and results, obtained in an experimentally tractable chondrichthyan model, open new avenues to integrate multiomics analyses for the study of elasmobranchs and jawed vertebrates.
3
Song X, Geng Y, Xu C, Li J, Guo Y, Shi Y, Ma Q, Li Q, Zhang M. The complete mitochondrial genomes of five critical phytopathogenic Bipolaris species: features, evolution, and phylogeny. IMA Fungus 2024; 15:15. [PMID: 38863028 PMCID: PMC11167856 DOI: 10.1186/s43008-024-00149-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 05/28/2024] [Indexed: 06/13/2024] Open
Abstract
In the present study, three mitogenomes from the Bipolaris genus (Bipolaris maydis, B. zeicola, and B. oryzae) were assembled and compared with the two other reported Bipolaris mitogenomes (B. oryzae and B. sorokiniana). The five mitogenomes were all circular DNA molecules, with lengths ranging from 106,403 bp to 135,790 bp. The mitogenomes of the five Bipolaris species mainly comprised the same set of 13 core protein-coding genes (PCGs), two rRNAs, and a certain number of tRNAs and unidentified open reading frames (ORFs). The PCG length, AT skew and GC skew showed large variability among the 13 PCGs in the five mitogenomes. Across the 13 core PCGs tested, nad6 had the least genetic distance among the 16 Pleosporales species we investigated, indicating that this gene was highly conserved. In addition, the Ka/Ks values for all 12 core PCGs (excluding rps3) were < 1, suggesting that these genes were subject to purifying selection. Comparative mitogenomic analyses indicate that introns were the main factor contributing to the size variation of Bipolaris mitogenomes. The introns of the cox1 gene experienced frequent gain/loss events in Pleosporales species. The gene arrangement and collinearity in the mitogenomes of the five Bipolaris species were highly conserved within the genus. Phylogenetic analysis based on combined mitochondrial gene datasets showed that the five Bipolaris species formed well-supported topologies. This study is the first report on the mitogenomes of B. maydis and B. zeicola, as well as the first comparison of mitogenomes among Bipolaris species. The findings of this study will further advance investigations into the population genetics, evolution, and genomics of Bipolaris species.
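The AT skew and GC skew mentioned in this abstract are commonly computed as (A − T)/(A + T) and (G − C)/(G + C) over the bases of one strand. The following is a minimal sketch under that usual definition; the abstract does not state which formula or tool the authors used, and the helper names and toy sequence are hypothetical.

```python
def base_skew(sequence: str, first: str, second: str) -> float:
    """Return (first - second) / (first + second) from base counts.

    Returns 0.0 when neither base occurs, to avoid dividing by zero.
    """
    seq = sequence.upper()
    n_first = seq.count(first)
    n_second = seq.count(second)
    if n_first + n_second == 0:
        return 0.0
    return (n_first - n_second) / (n_first + n_second)


def at_skew(sequence: str) -> float:
    """AT skew: positive when A outnumbers T on the given strand."""
    return base_skew(sequence, "A", "T")


def gc_skew(sequence: str) -> float:
    """GC skew: positive when G outnumbers C on the given strand."""
    return base_skew(sequence, "G", "C")


# Hypothetical toy fragment: A=4, T=4, G=4, C=1.
toy_gene = "ATGAAATTTGGGC"
toy_at = at_skew(toy_gene)
toy_gc = gc_skew(toy_gene)
```

Both values lie between -1 and 1, which is what makes them comparable across genes of different lengths.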
Affiliation(s)
- Xinzheng Song
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Yuehua Geng
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Chao Xu
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Jiaxin Li
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Yashuang Guo
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Yan Shi
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
- Qingzhou Ma
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China.
- Qiang Li
  - School of Food and Biological Engineering, Chengdu University, Chengdu, Sichuan, China.
- Meng Zhang
  - Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China.
4
Zhang ZY, Xia HX, Yuan MJ, Gao F, Bao WH, Jin L, Li M, Li Y. Multi-omics analyses provide insights into the evolutionary history and the synthesis of medicinal components of the Chinese wingnut. PLANT DIVERSITY 2024; 46:309-320. [PMID: 38798724 PMCID: PMC11119516 DOI: 10.1016/j.pld.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 03/22/2024] [Accepted: 03/31/2024] [Indexed: 05/29/2024]
Abstract
Chinese wingnut (Pterocarya stenoptera) is a medicinally and economically important tree species within the family Juglandaceae. However, the lack of a high-quality reference genome has hindered in-depth research on it. In this study, we successfully assembled its chromosome-level genome and performed multi-omics analyses to address its evolutionary history and synthesis of medicinal components. A thorough examination of genomes has uncovered a significant expansion in the Lateral Organ Boundaries Domain gene family among the winged group in Juglandaceae. This notable increase may be attributed to their frequent exposure to flood-prone environments. After further differentiation between Chinese wingnut and Cyclocarya paliurus, significant positive selection occurred on the genes of NADH dehydrogenase related to mitochondrial aerobic respiration in Chinese wingnut, enhancing its ability to cope with waterlogging stress. Comparative genomic analysis revealed that Chinese wingnut evolved more unique genes related to arginine synthesis, potentially endowing it with a higher capacity to purify nutrient-rich water bodies. Expansion of terpene synthase families enables the production of increased quantities of terpenoid volatiles, potentially serving as an evolved defense mechanism against herbivorous insects. Through combined transcriptomic and metabolomic analysis, we identified the candidate genes involved in the synthesis of terpenoid volatiles. Our study offers essential genetic resources for Chinese wingnut, unveiling its evolutionary history and identifying key genes linked to the production of terpenoid volatiles.
Affiliation(s)
- Zi-Yan Zhang
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- He-Xiao Xia
  - College of Landscape and Art, Henan Agricultural University, Zhengzhou 450002, China
- Meng-Jie Yuan
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- Feng Gao
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- Wen-Hua Bao
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- Lan Jin
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- Min Li
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
- Yong Li
  - College of Life Science and Technology, Inner Mongolia Normal University, Hohhot 010020, China
  - Key Laboratory of Biodiversity Conservation and Sustainable Utilization in Mongolian Plateau for College and University of Inner Mongolia Autonomous Region, Hohhot 010022, China
  - State Key Laboratory of Tree Genetics and Breeding, Chinese Academy of Forestry, Beijing 100091, China
5
Sami A, El-Metwally S, Rashad MZ. MAC-ErrorReads: machine learning-assisted classifier for filtering erroneous NGS reads. BMC Bioinformatics 2024; 25:61. [PMID: 38321434 PMCID: PMC10848413 DOI: 10.1186/s12859-024-05681-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 01/29/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND The rapid advancement of next-generation sequencing (NGS) machines in terms of speed and affordability has led to the generation of a massive amount of biological data at the expense of data quality as errors become more prevalent. This introduces the need for different approaches to detect and filter errors, and moves data quality assurance from the hardware space to the software preprocessing stages. RESULTS We introduce MAC-ErrorReads, a novel Machine learning-Assisted Classifier designed for filtering Erroneous NGS Reads. MAC-ErrorReads transforms the filtering of erroneous NGS reads into a robust binary classification task, employing five supervised machine learning algorithms. These models are trained on features extracted through the computation of Term Frequency-Inverse Document Frequency (TF-IDF) values from various datasets such as E. coli, GAGE S. aureus, H. Chr14, Arabidopsis thaliana Chr1 and Metriaclima zebra. Notably, Naive Bayes demonstrated robust performance across various datasets, displaying high accuracy, precision, recall, F1-score, MCC, and ROC values. The MAC-ErrorReads NB model accurately classified S. aureus reads, surpassing most error correction tools with a 38.69% alignment rate. For H. Chr14, tools like Lighter, Karect, CARE, Pollux, and MAC-ErrorReads showed rates above 99%. BFC and RECKONER exceeded 98%, while Fiona had 95.78%. For the Arabidopsis thaliana Chr1, Pollux, Karect, RECKONER, and MAC-ErrorReads demonstrated good alignment rates of 92.62%, 91.80%, 91.78%, and 90.87%, respectively. For the Metriaclima zebra, Pollux achieved a high alignment rate of 91.23%, despite having the lowest number of mapped reads. MAC-ErrorReads, Karect, and RECKONER demonstrated good alignment rates of 83.76%, 83.71%, and 83.67%, respectively, while also producing reasonable numbers of mapped reads to the reference genome.
CONCLUSIONS This study demonstrates that machine learning approaches for filtering NGS reads effectively identify and retain the most accurate reads, significantly enhancing assembly quality and genomic coverage. The integration of genomics and artificial intelligence through machine learning algorithms holds promise for enhancing NGS data quality, advancing downstream data analysis accuracy, and opening new opportunities in genetics, genomics, and personalized medicine research.
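To make the TF-IDF feature idea in this abstract concrete, here is a small standard-library sketch that treats each read as a document and its overlapping k-mers as terms. That tokenization is an assumption made for illustration; the abstract does not spell out how reads are tokenized, and the function names, the plain `log(N / df)` weighting, and the toy reads are hypothetical rather than the MAC-ErrorReads implementation.

```python
import math
from collections import Counter


def kmer_tokens(read: str, k: int) -> list[str]:
    """Split a read into its overlapping k-mers (assumed tokenization)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def tfidf_vectors(reads: list[str], k: int) -> list[dict[str, float]]:
    """Compute one sparse TF-IDF vector per read.

    tf  = count of the k-mer in the read / number of k-mers in the read
    idf = log(number of reads / number of reads containing the k-mer)
    """
    term_counts = [Counter(kmer_tokens(read, k)) for read in reads]
    n_reads = len(term_counts)
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())  # each read counts once per k-mer
    vectors = []
    for counts in term_counts:
        total = sum(counts.values())
        vectors.append({
            kmer: (count / total) * math.log(n_reads / doc_freq[kmer])
            for kmer, count in counts.items()
        })
    return vectors


# Hypothetical toy reads: the third differs from the others by one base.
toy_reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
toy_vectors = tfidf_vectors(toy_reads, 3)
```

With this weighting, k-mers shared by every read score zero, while k-mers unique to one read (here the ones created by the substituted base) receive the largest weights, which is the property a downstream classifier could exploit. Other TF-IDF variants add smoothing terms and would give different numbers.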
Affiliation(s)
- Amira Sami
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
- Sara El-Metwally
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt.
  - Biomedical Informatics Department, Faculty of Computer Science and Engineering, New Mansoura University, Gamasa, 35712, Egypt.
- M Z Rashad
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, P.O. Box: 35516, Mansoura, Egypt
6
Długosz M, Deorowicz S. Illumina reads correction: evaluation and improvements. Sci Rep 2024; 14:2232. [PMID: 38278837 PMCID: PMC11222498 DOI: 10.1038/s41598-024-52386-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 01/18/2024] [Indexed: 01/28/2024] Open
Abstract
The paper focuses on the correction of Illumina WGS sequencing reads. We provide an extensive evaluation of the existing correctors. To this end, we measure the impact of correction on variant calling (VC) as well as on de novo assembly. The results show that, in selected cases, read correction improves the quality of VC results. We also examine the algorithms' behaviour when processing Illumina NovaSeq reads, whose quality characteristics differ from those of older sequencers. We show that most of the algorithms are ready to cope with such reads. Finally, we introduce a new version of RECKONER, our read corrector, which we optimized and equipped with a new correction strategy. RECKONER now corrects high-coverage human reads in less than 2.5 h, copes with two types of read errors (indels and substitutions), and uses a new correction verification technique based on two lengths of oligomers.
Affiliation(s)
- Maciej Długosz
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, 44-100, Gliwice, Poland
- Sebastian Deorowicz
  - Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, 44-100, Gliwice, Poland.
7
Jalalizadeh F, Njamkepo E, Weill FX, Goodarzi F, Rahnamaye-Farzami M, Sabourian R, Bakhshi B. Genetic approach toward linkage of Iran 2012-2016 cholera outbreaks with 7th pandemic Vibrio cholerae. BMC Microbiol 2024; 24:33. [PMID: 38254012 PMCID: PMC10801964 DOI: 10.1186/s12866-024-03185-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 01/08/2024] [Indexed: 01/24/2024] Open
Abstract
Vibrio cholerae, as a natural inhabitant of the marine environment, is among the world-leading causes of diarrheal diseases. The present study aimed to investigate the genetic relatedness of Iran 2012-2016 V. cholerae outbreaks with 7th pandemic cholera and to further characterize the non-ST69/non-ST75 sequence types strains by whole-genome sequencing (WGS). Twenty V. cholerae isolates related to 2012, 2013, 2015 and 2016 cholera outbreaks were studied by two genotyping methods, Pulsed-field Gel Electrophoresis (PFGE) and Multi-locus Sequence Typing (MLST), and by antimicrobial susceptibility testing. Seven sequence types (STs) and sixteen pulsotypes were detected. Sequence type 69 was the most abundant ST confirming that most (65%, 13/20) of the studied isolates collected in Iran between 2012 and 2016 belonged to the 7th pandemic clone. All these ST69 isolates (except two) exhibited similar pulsotypes. ST75 was the second most abundant ST. It was identified in 2015 and 2016. ST438, ST178, ST579 and STs of 983 and 984 (as newfound STs) each were only detected in one isolate. All strains collected in 2016 appeared as distinct STs and pulsotypes indicative of probable different originations. All ST69 strains were resistant to nalidixic acid. Moreover, resistance to nalidixic acid, trimethoprim-sulfamethoxazole and tetracycline was only observed in strains of ST69. These properties propose the ST69 as a unique genotype derived from a separate lineage with distinct resistance properties. The circulation of V. cholerae ST69 and its traits in recent years in Iran proposes the 7th pandemic strains as the ongoing causes of cholera outbreaks in this country, although the role of ST75 as the probable upcoming dominant ST should not be ignored. Genomic analysis of non-ST69/non-ST75 strains in this study showed ST579 is the most similar ST type to 7th pandemic sequence types, due to the presence of wild type-El Tor sequences of tcpA and VC-1319, VC-1320, VC-1577, VC-1578 genes (responsible for polymyxin resistance in El Tor biotype), the traits of rstC of RS1 phage in one strain of this ST type and the presence of VPI-1 and VSP-I islands in ST579 and ST178 strains. In silico analysis showed no significant presence of resistance genes/cassettes/plasmids within non-ST69/non-ST75 strains genomes. Overall, these data indicate the higher susceptibility of V. cholerae non-ST69/non-ST75 strains in comparison with more ubiquitous and more circulating ST69 and ST75 strains. In conclusion, the occurrence of small outbreaks and sporadic cholera cases due to V. cholerae ST69 in recent years in Iran shows the 7th pandemic strains as the persistent causes of cholera outbreaks in this country, although the role of ST75 as the second most contributed ST should not be ignored. The occurrence of non-ST69/non-ST75 sequence types with some virulence factors characteristics in border provinces in recent years is noteworthy, and further studies together with surveillance efforts are expected to determine their likely route of transport.
Affiliation(s)
- Fatemeh Jalalizadeh
  - Department of Bacteriology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Forough Goodarzi
  - Department of Bacteriology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Bita Bakhshi
  - Department of Bacteriology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran.
8
Rahi P, Mühle E, Scandola C, Touak G, Clermont D. Genome sequence-based identification of Enterobacter strains and description of Enterobacter pasteurii sp. nov. Microbiol Spectr 2024; 12:e0315023. [PMID: 38099614 PMCID: PMC10783019 DOI: 10.1128/spectrum.03150-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 11/16/2023] [Indexed: 01/13/2024] Open
Abstract
IMPORTANCE Accurate taxonomy is essential for microbial biological resource centers, since the microbial resources are often used to support new discoveries and subsequent research. Here, we used genome sequence data, alongside matrix-assisted laser desorption/ionization time-of-flight mass spectrometer biotyper-based protein profiling, to accurately identify six Enterobacter cloacae complex strains. This approach effectively identified distinct species within the E. cloacae complex, including Enterobacter asburiae, "Enterobacter xiangfangensis," and Enterobacter quasihormaechei. Moreover, the study revealed the existence of a novel species within the Enterobacter genus, for which we proposed the name Enterobacter pasteurii sp. nov. In summary, this study demonstrates the significance of adopting a genome sequence-driven taxonomy approach for the precise identification of bacterial strains in a biological resource center and expands our understanding of the E. cloacae complex.
Affiliation(s)
- Praveen Rahi
  - Collection of Institut Pasteur (CIP), Institut Pasteur, Université Paris Cité, Paris, France
- Estelle Mühle
  - Collection of Institut Pasteur (CIP), Institut Pasteur, Université Paris Cité, Paris, France
- Cyril Scandola
  - Ultrastructural Bioimaging Unit, Institut Pasteur, Université Paris Cité, Paris, France
- Gerald Touak
  - Collection of Institut Pasteur (CIP), Institut Pasteur, Université Paris Cité, Paris, France
- Dominique Clermont
  - Collection of Institut Pasteur (CIP), Institut Pasteur, Université Paris Cité, Paris, France
9
Sato R, Kondo Y, Agarie S. The first released available genome of the common ice plant (Mesembryanthemum crystallinum L.) extended the research region on salt tolerance, C3-CAM photosynthetic conversion, and halophilism. F1000Res 2024; 12:448. [PMID: 38618020 PMCID: PMC11016173 DOI: 10.12688/f1000research.129958.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/03/2024] [Indexed: 04/16/2024] Open
Abstract
Background The common ice plant (Mesembryanthemum crystallinum L.) is an annual herb belonging to the genus Mesembryanthemum of the family Aizoaceae, native to Southern Africa. Methods We performed shotgun genome paired-end sequencing using the Illumina platform to determine the genome sequence of the ice plant. We assembled the whole genome sequences using the genome assemblers "ALGA" and "Redundans", then released them as available genomic information. Finally, we estimated the potential genomic function mainly by homology search. Results A draft genome was generated with a total length of 286 Mb corresponding to 79.2% of the estimated genome size (361 Mb), consisting of 49,782 contigs. It encompassed 93.49% of the genes of terrestrial higher plants, 99.5% of the ice plant transcriptome, and 100% of known DNA sequences. In addition, 110.9 Mb (38.8%) of repetitive sequences and untranslated regions, 971 tRNA, and 100 miRNA loci were identified, and their effects on stress tolerance and photosynthesis were investigated. Molecular phylogenetic analysis based on ribosomal DNA among 26 plant species revealed genetic similarity between the ice plant and poplar, which have salt tolerance. Overall, 35,702 protein-coding regions were identified in the genome, of which 56.05% to 82.59% were annotated and submitted to domain searches and gene ontology (GO) analyses, which found that eighteen GO terms stood out among five plant species. These terms were related to biological defense, growth, reproduction, transcription, post-transcription, and intermembrane transportation, regarded as one of the fundamental results of using the ice plant genome. Conclusions The information that we characterized is useful for elucidation of the mechanism of growth promotion under salinity and reversible conversion of the photosynthetic type from C3 to Crassulacean Acid Metabolism (CAM).
Affiliation(s)
- Ryoma Sato
  - Graduate school of Bioresource and Bioenvironmental Sciences, Kyushu University, 744 Motooka Nishi-ku Fukuoka, 819-0395, Japan
- Yuri Kondo
  - Graduate school of Bioresource and Bioenvironmental Sciences, Kyushu University, 744 Motooka Nishi-ku Fukuoka, 819-0395, Japan
- Sakae Agarie
  - Faculty of Agriculture, Kyushu University, 744 Motooka Nishi-ku Fukuoka, 819-0395, Japan
10
Zhang X, Duan XM, Cheng J, Qiao HJ, Dai YM. Hymenobacter endophyticus sp. nov., isolated from wheat leaf tissue. Int J Syst Evol Microbiol 2023; 73. [PMID: 38059799 DOI: 10.1099/ijsem.0.006197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023] Open
Abstract
A bacterium, designated strain ZK17L-C2T, was isolated from the leaf tissues of wheat (Triticum aestivum) collected in Chengdu, Sichuan Province, PR China. It is aerobic, non-motile, Gram-negative, rod-shaped and red-to-pink in colour. Phylogenetic analysis based on 16S rRNA gene sequences showed that strain ZK17L-C2T belonged to the genus Hymenobacter and was most closely related to Hymenobacter rigui KCTC 12533T (98.68 %) and Hymenobacter metallilatus 9PBR-2T (98.19 %). Digital DNA-DNA hybridization (dDDH) values between strain ZK17L-C2T and these two type strains were 26.6 and 26.5 %, and average nucleotide identity (ANI) values were 84.9 and 84.8 %, respectively; these values are lower than the proposed and generally accepted species boundaries for dDDH and ANI. The genomic DNA G+C content of strain ZK17L-C2T was 59.4 mol%. It can grow at pH 5.5-7.5 and 15-30 °C, which is different from the closely related type strains. The major fatty acids of strain ZK17L-C2T were iso-C15:0, C16:0 and C18:0. Overall, the results from biochemical, chemical taxonomy and phylogenetic analyses indicate that strain ZK17L-C2T (=CGMCC 1.19373T=KCTC 92184T) represents a new species of the genus Hymenobacter, for which the name Hymenobacter endophyticus sp. nov. is proposed.
Affiliation(s)
- Xue Zhang
  - College of Animal Science and Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066600, PR China
- Xue-Mei Duan
  - Key Laboratory of Environmental and Applied Microbiology, Environmental Microbiology Key Laboratory of Sichuan Province, Chengdu Institute of Biology, Chinese Academy of Sciences, Sichuan 610041, PR China
  - University of Chinese Academy of Sciences, Beijing, 100049, PR China
- Jin Cheng
  - College of Animal Science and Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066600, PR China
- Hong-Jiao Qiao
  - College of Animal Science and Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066600, PR China
- Yu-Mei Dai
  - College of Animal Science and Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066600, PR China
11
Elizondo EC, Faircloth BC, Brumfield RT, Shakya SB, Ellis VA, Schmidt CJ, Kovach AI, Gregory Shriver W. A high-quality de novo genome assembly for clapper rail (Rallus crepitans). G3 (BETHESDA, MD.) 2023; 13:jkad097. [PMID: 37130071 PMCID: PMC10484055 DOI: 10.1093/g3journal/jkad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Revised: 03/26/2022] [Accepted: 03/10/2023] [Indexed: 05/03/2023]
Abstract
The clapper rail (Rallus crepitans), of the family Rallidae, is a secretive marsh bird species that is adapted for high salinity habitats. Clapper rails are very similar in appearance to the closely related king rail (R. elegans), but while king rails are limited primarily to freshwater marshes, clapper rails are highly adapted to tolerate salt marshes. Both species can be found in brackish marshes where they freely hybridize, but the distribution of their respective habitats precludes the formation of a continuous hybrid zone and secondary contact can occur repeatedly. This system, thus, provides unique opportunities to investigate the underlying mechanisms driving their differential salinity tolerance as well as the maintenance of the species boundary between the 2 species. To facilitate these studies, we assembled a de novo reference genome for a female clapper rail. Chicago and HiC libraries were prepared as input for the Dovetail HiRise pipeline to scaffold the genome. The pipeline, however, did not recover the Z chromosome, so a custom script was used to assemble it. We generated a near-chromosome-level assembly with a total length of 994.8 Mb comprising 13,226 scaffolds. The assembly had a scaffold N50 of 82.7 Mb, an L50 of four, and a BUSCO completeness score of 92%. This assembly is one of the most contiguous genomes for species in the family Rallidae. It will serve as an important tool in future studies on avian salinity tolerance, interspecific hybridization, and speciation.
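For readers unfamiliar with the contiguity statistics quoted in this abstract, the sketch below computes N50 and L50 under their standard definitions: after sorting scaffolds from longest to shortest, N50 is the length of the scaffold at which the running total first reaches half of the assembly length, and L50 is how many scaffolds that takes. The function name and the toy scaffold lengths are hypothetical and are not data from the clapper rail assembly.

```python
def n50_l50(lengths: list[int]) -> tuple[int, int]:
    """Return (N50, L50) for a list of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'rank' is the number of scaffolds used (L50).
            return length, rank
    raise ValueError("no scaffold lengths given")


# Hypothetical toy assembly: total length 290, so the halfway point is 145.
toy_lengths = [50, 80, 20, 70, 40, 30]
toy_n50, toy_l50 = n50_l50(toy_lengths)
```

Read this way, the reported scaffold N50 of 82.7 Mb with an L50 of four means that the four longest scaffolds together cover at least half of the 994.8 Mb assembly.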
Affiliation(s)
- Elisa C Elizondo
- Department of Entomology and Wildlife Ecology, University of Delaware, Newark, DE 19716, USA
| | - Brant C Faircloth
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Robb T Brumfield
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Subir B Shakya
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Vincenzo A Ellis
- Department of Entomology and Wildlife Ecology, University of Delaware, Newark, DE 19716, USA
| | - Carl J Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, DE 19716, USA
| | - Adrienne I Kovach
- Department of Natural Resources, University of New Hampshire, Durham, NH 03824, USA
| | - W Gregory Shriver
- Department of Entomology and Wildlife Ecology, University of Delaware, Newark, DE 19716, USA
|
12
|
Yan L, Yin Z, Zhang H, Zhao Z, Wang M, Müller A, Kallenborn F, Wichmann A, Wei Y, Niu B, Schmidt B, Liu W. RabbitQCPlus 2.0: More efficient and versatile quality control for sequencing data. Methods 2023; 216:39-50. [PMID: 37330158 DOI: 10.1016/j.ymeth.2023.06.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 05/26/2023] [Accepted: 06/12/2023] [Indexed: 06/19/2023] Open
Abstract
Assessing the quality of sequencing data plays a crucial role in downstream data analysis. However, existing tools often achieve sub-optimal efficiency, especially when dealing with compressed files or performing complicated quality control operations such as over-representation analysis and error correction. We present RabbitQCPlus, an ultra-efficient quality control tool for modern multi-core systems. RabbitQCPlus uses vectorization, memory copy reduction, parallel (de)compression, and optimized data structures to achieve substantial performance gains. It is 1.1 to 5.4 times faster when performing basic quality control operations compared to state-of-the-art applications yet requires fewer compute resources. Moreover, RabbitQCPlus is at least 4 times faster than other applications when processing gzip-compressed FASTQ files and 1.3 times faster with the error correction module turned on. Furthermore, it takes less than 4 minutes to process 280 GB of plain FASTQ sequencing data, while other applications take at least 22 minutes on a 48-core server when enabling the per-read over-representation analysis. C++ sources are available at https://github.com/RabbitBio/RabbitQCPlus.
Affiliation(s)
- Lifeng Yan
- School of Software, Shandong University, Jinan, China
| | - Zekun Yin
- School of Software, Shandong University, Jinan, China.
| | - Hao Zhang
- School of Software, Shandong University, Jinan, China
| | - Zhan Zhao
- School of Software, Shandong University, Jinan, China
| | - Mingkai Wang
- School of Software, Shandong University, Jinan, China
| | - André Müller
- Institute for Computer Science, Johannes Gutenberg University, Mainz, Germany
| | - Felix Kallenborn
- Institute for Computer Science, Johannes Gutenberg University, Mainz, Germany
| | - Alexander Wichmann
- Institute for Computer Science, Johannes Gutenberg University, Mainz, Germany
| | - Yanjie Wei
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Beifang Niu
- Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
| | - Bertil Schmidt
- Institute for Computer Science, Johannes Gutenberg University, Mainz, Germany
| | - Weiguo Liu
- School of Software, Shandong University, Jinan, China
|
13
|
Vorel J, Kmentová N, Hahn C, Bureš P, Kašný M. An insight into the functional genomics and species classification of Eudiplozoon nipponicum (Monogenea, Diplozoidae), a haematophagous parasite of the common carp Cyprinus carpio. BMC Genomics 2023; 24:363. [PMID: 37380941 DOI: 10.1186/s12864-023-09461-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 06/16/2023] [Indexed: 06/30/2023] Open
Abstract
BACKGROUND Monogenea (Platyhelminthes, Neodermata) are the most species-rich class within the Neodermata superclass of primarily fish parasites. Despite their economic and ecological importance, monogenean research tends to focus on their morphological, phylogenetic, and population characteristics, while comprehensive omics analyses aimed at describing functionally important molecules are few and far between. We present a molecular characterisation of monogenean representative Eudiplozoon nipponicum, an obligate haematophagous parasite infecting the gills of the common carp. We report its nuclear and mitochondrial genomes, present a functional annotation of protein molecules relevant to the molecular and biochemical aspect of physiological processes involved in interactions with the fish hosts, and re-examine the taxonomic position of Eudiplozoon species within the Diplozoidae family. RESULTS We have generated 50.81 Gbp of raw sequencing data (Illumina and Oxford Nanopore reads), bioinformatically processed, and de novo assembled them into a genome draft 0.94 Gbp long, consisting of 21,044 contigs (N50 = 87 kbp). The final assembly represents 57% of the estimated total genome size (~1.64 Gbp), whereby repetitive and low-complexity regions account for ~64% of the assembled length. In total, 36,626 predicted genes encode 33,031 proteins and homology-based annotation of protein-coding genes (PCGs) and proteins characterises 14,785 (44.76%) molecules. We have detected significant representation of functional proteins and known molecular functions. The numbers of peptidases and inhibitors (579 proteins), characterised GO terms (16,016 unique assigned GO terms), and identified KEGG Orthology (4,315 proteins) acting in 378 KEGG pathways demonstrate the variety of mechanisms by which the parasite interacts with hosts on a macromolecular level (immunomodulation, feeding, and development). Comparison between the newly assembled E. nipponicum mitochondrial genome (length of 17,038 bp) and other diplozoid monogeneans confirms the existence of two distinct Eudiplozoon species infecting different fish hosts: Cyprinus carpio and Carassius spp. CONCLUSIONS Although the amount of sequencing data and characterised molecules of monogenean parasites has recently increased, a better insight into their molecular biology is needed. The E. nipponicum nuclear genome presented here, currently the largest described genome of any monogenean parasite, represents a milestone in the study of monogeneans and their molecules but further omics research is needed to understand these parasites' biological nature.
Affiliation(s)
- Jiří Vorel
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, Brno, 611 37, Czech Republic.
| | - Nikol Kmentová
- Research Group Zoology: Biodiversity and Toxicology, Centre for Environmental Sciences, Hasselt University, Agoralaan Gebouw D, Diepenbeek, B-3590, Belgium
| | - Christoph Hahn
- Institute of Biology, University of Graz, Universitätsplatz 2, Graz, A-8010, Austria
| | - Petr Bureš
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, Brno, 611 37, Czech Republic
| | - Martin Kašný
- Department of Botany and Zoology, Faculty of Science, Masaryk University, Kotlářská 2, Brno, 611 37, Czech Republic
|
14
|
Pahayo DG, Cadorna CAE, Quimado MO, Rey JD. The complete chloroplast genome of Calophyllum soulattri Burm. f. (Calophyllaceae). Mitochondrial DNA B Resour 2023; 8:607-611. [PMID: 37250208 PMCID: PMC10215020 DOI: 10.1080/23802359.2023.2215350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 05/02/2023] [Indexed: 05/31/2023] Open
Abstract
Calophyllum soulattri Burm. f. (1768) is an evergreen tree native to Southeast Asia, Australia, and the Solomon Islands. It is known for its medicinal uses and has been utilized in traditional folk medicine. However, genomic resources for this species are still unavailable. In this study, we sequenced and assembled the first complete chloroplast genome of C. soulattri using next-generation sequencing data. The chloroplast genome of C. soulattri is 161,381 bp in length with a total GC content of 36.36%. The chloroplast genome contains a large single copy (LSC) region of 88,680 bp, a small single copy (SSC) region of 17,453 bp, and two inverted repeat (IR) regions of 27,624 bp each. Furthermore, the chloroplast genome has 131 genes, which include 86 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Phylogenetic analysis indicated that C. soulattri is clustered in the same branch with C. inophyllum and is closely related to Mesua ferrea.
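The region lengths and gene counts reported in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative assumptions.

```python
# Region lengths in bp, as quoted in the abstract.
lsc = 88_680   # large single copy region
ssc = 17_453   # small single copy region
ir = 27_624    # one inverted repeat; the genome carries two copies

# A quadripartite chloroplast genome is LSC + SSC + 2 x IR.
total_length = lsc + ssc + 2 * ir

# Gene categories, as quoted in the abstract.
protein_coding, trna, rrna = 86, 37, 8
total_genes = protein_coding + trna + rrna

print(total_length, total_genes)
```

Both sums agree with the totals stated in the abstract (161,381 bp and 131 genes).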
Affiliation(s)
- Dexter G. Pahayo
- Plant Molecular Phylogenetics Laboratory, Institute of Biology, College of Science, University of the Philippines, Diliman, Quezon City, Philippines
| | - Charles Anthon E. Cadorna
- Plant Molecular Phylogenetics Laboratory, Institute of Biology, College of Science, University of the Philippines, Diliman, Quezon City, Philippines
| | - Marilyn O. Quimado
- Forest Biological Sciences, College of Forestry and Natural Resources, University of the Philippines, Los Baños, Laguna, Philippines
| | - Jessica D. Rey
- Plant Molecular Phylogenetics Laboratory, Institute of Biology, College of Science, University of the Philippines, Diliman, Quezon City, Philippines
|
15
|
Yuan Y, Gao F, Chang Y, Zhao Q, He X. Advances of mRNA vaccine in tumor: a maze of opportunities and challenges. Biomark Res 2023; 11:6. [PMID: 36650562 PMCID: PMC9845107 DOI: 10.1186/s40364-023-00449-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 11/03/2022] [Accepted: 01/10/2023] [Indexed: 01/19/2023] Open
Abstract
High-frequency mutations in tumor genomes could be exploited as an asset for developing tumor vaccines. In recent years, with tremendous breakthroughs in genomics and intelligent algorithms and deeper insight into tumor immunology, it has become possible to rapidly target genomic alterations in tumor cells and rationally select vaccine targets. Among a variety of candidate vaccine platforms, the early application of mRNA was limited by instability, low efficiency, and excessive immunogenicity until the successful development of mRNA vaccines against SARS-CoV-2 broke through the technical bottleneck in vaccine preparation, allowing tumor mRNA vaccines to be prepared rapidly and economically with good stability and efficiency. In this review, we systematically summarize the classification and characteristics of tumor antigens, the general process and methods for screening neoantigens, the strategies of vaccine preparation, and advances in clinical trials, and we present the main challenges in current mRNA tumor vaccine development.
Affiliation(s)
- Yuan Yuan
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan, China; Department of Gastroenterology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Fan Gao
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan, China; Department of Gastroenterology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
| | - Ying Chang
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan, China; Hubei Clinical Center and Key Laboratory of Intestinal and Colorectal Diseases, Wuhan, China
| | - Qiu Zhao
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan, China; Hubei Clinical Center and Key Laboratory of Intestinal and Colorectal Diseases, Wuhan, China
| | - Xingxing He
- Department of Gastroenterology, Zhongnan Hospital of Wuhan University, Wuhan, China; Department of Gastroenterology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; Hubei Clinical Center and Key Laboratory of Intestinal and Colorectal Diseases, Wuhan, China
|
16
|
Arcari G, Hennart M, Badell E, Brisse S. Multidrug-resistant toxigenic Corynebacterium diphtheriae sublineage 453 with two novel resistance genomic islands. Microb Genom 2023; 9:mgen000923. [PMID: 36748453 PMCID: PMC9973851 DOI: 10.1099/mgen.0.000923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023] Open
Abstract
Antimicrobial therapy is important for case management of diphtheria, but knowledge on the emergence of multidrug-resistance in Corynebacterium diphtheriae is scarce. We report on the genomic features of two multidrug-resistant toxigenic isolates sampled from wounds in France 3 years apart. Both isolates were resistant to spiramycin, clindamycin, tetracycline, kanamycin and trimethoprim-sulfamethoxazole. Genes ermX, cmx, aph(3')-Ib, aph(6)-Id, aph(3')-Ic, aadA1, dfrA15, sul1, cmlA, cmlR and tet(33) were clustered in two genomic islands, one consisting of two transposons and one integron, the other being flanked by two IS6100 insertion sequences. One isolate additionally presented mutations in gyrA and rpoB and was resistant to ciprofloxacin and rifampicin. Both isolates belonged to sublineage 453 (SL453), together with 25 isolates from 11 other countries (https://bigsdb.pasteur.fr/diphtheria/). SL453 is a cosmopolitan toxigenic sublineage of C. diphtheriae, a subset of which acquired multidrug resistance. Even though penicillin, amoxicillin and erythromycin, recommended as the first line in the treatment of diphtheria, remain active, surveillance of diphtheria should consider the risk of dissemination of multidrug-resistant strains and their genetic elements.
Affiliation(s)
- Gabriele Arcari
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France; Department of Molecular Medicine, Sapienza Università di Roma, Rome, Italy
| | - Mélanie Hennart
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France; Collège doctoral, Sorbonne Université, F-75005 Paris, France
| | - Edgar Badell
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France; Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
| | - Sylvain Brisse
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France; Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
|
17
|
Bernal A, Jacob S, Andresen K, Yemelin A, Hartmann H, Antelo L, Thines E. Identification of the polyketide synthase gene responsible for the synthesis of tanzawaic acids in Penicillium steckii IBWF104-06. Fungal Genet Biol 2023; 164:103750. [PMID: 36379411 DOI: 10.1016/j.fgb.2022.103750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 11/04/2022] [Accepted: 11/06/2022] [Indexed: 11/13/2022]
Abstract
Microorganisms have been used as biological control agents (BCAs) in agriculture for a long time, but their importance has increased dramatically over the last few years. The Penicillium steckii IBWF104-06 strain has presented strong BCA activity in greenhouse experiments performed against phytopathogenic fungi and oomycetes. P. steckii strains generally produce different antifungal tanzawaic acids, compounds of interest whose synthesis is known to be catalyzed by polyketide synthases in other fungi. Since the decalin structure is characteristic of tanzawaic acids, two polyketide synthase genes (PsPKS1 and PsPKS2) were selected for further analysis because they are similar in sequence and gene cluster structure to genes known to be responsible for the biosynthesis of decalin-containing compounds. Subsequently, gene-inactivation mutants of both PsPKS1 and PsPKS2 were generated. The ΔPspks1 mutant can no longer produce tanzawaic acids, whereas reintegration of the original PsPKS1 gene into the genome of ΔPspks1 reestablished tanzawaic acid production. The ΔPspks2 mutant is not altered in tanzawaic acid production. Interestingly, both mutants, ΔPspks1 and ΔPspks2, still display strong BCA activity, indicating that the mechanism of action is not related to the production of tanzawaic acids.
Affiliation(s)
- Azahara Bernal
- Institute of Biotechnology and Drug Research gGmbH (IBWF), Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany
| | - Stefan Jacob
- Institute of Biotechnology and Drug Research gGmbH (IBWF), Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany
| | - Karsten Andresen
- Johannes Gutenberg-University Mainz, Microbiology and Biotechnology at the Institute of Molecular Physiology, Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany
| | - Alexander Yemelin
- Institute of Biotechnology and Drug Research gGmbH (IBWF), Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany
| | | | - Luis Antelo
- Institute of Biotechnology and Drug Research gGmbH (IBWF), Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany; Johannes Gutenberg-University Mainz, Microbiology and Biotechnology at the Institute of Molecular Physiology, Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany.
| | - Eckhard Thines
- Institute of Biotechnology and Drug Research gGmbH (IBWF), Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany; Johannes Gutenberg-University Mainz, Microbiology and Biotechnology at the Institute of Molecular Physiology, Hanns-Dieter-Hüsch-Weg 17, D-55128 Mainz, Germany.
|
18
|
Pottier M, Castagnet S, Gravey F, Leduc G, Sévin C, Petry S, Giard JC, Le Hello S, Léon A. Antimicrobial Resistance and Genetic Diversity of Pseudomonas aeruginosa Strains Isolated from Equine and Other Veterinary Samples. Pathogens 2022; 12:64. [PMID: 36678412 PMCID: PMC9867525 DOI: 10.3390/pathogens12010064] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/26/2022] [Revised: 12/20/2022] [Accepted: 12/26/2022] [Indexed: 01/03/2023] Open
Abstract
Pseudomonas aeruginosa is one of the leading causes of healthcare-associated infections in humans. This bacterium is less represented in veterinary medicine, despite causing difficult-to-treat infections due to its capacity to acquire antimicrobial resistance, produce biofilms, and persist in the environment, along with the limited number of veterinary antibiotic therapies available. Here, we explored susceptibility profiles to antibiotics and to didecyldimethylammonium chloride (DDAC), a quaternary ammonium widely used as a disinfectant, in 168 P. aeruginosa strains isolated from animals, mainly Equidae. A genomic study was performed on 41 of these strains to determine their serotype, sequence type (ST), relatedness, and resistome. Overall, 7.7% of animal strains were resistant to carbapenems, 10.1% presented a multidrug-resistant (MDR) profile, and 11.3% showed decreased susceptibility (DS) to DDAC. Genomic analyses revealed that the study population was diverse, and 4.9% were ST235, which is considered the most relevant human high-risk clone worldwide. This study found P. aeruginosa populations with carbapenem resistance, multidrug resistance, and DS to DDAC in equine and canine isolates. These strains, which are not susceptible to antibiotics used in veterinary and human medicine, warrant the setting up of close clone monitoring, based on that already in place in human medicine, in a one-health approach.
Affiliation(s)
- Marine Pottier
- Research Department, LABÉO, 14053 Caen, France
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
| | - Sophie Castagnet
- Research Department, LABÉO, 14053 Caen, France
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
| | - François Gravey
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
- CHU de Caen, Service de Microbiologie, Avenue de la Côte de Nacre, 14033 Caen, France
| | - Guillaume Leduc
- CHU de Caen, Service de Microbiologie, Avenue de la Côte de Nacre, 14033 Caen, France
| | - Corinne Sévin
- Anses, Normandy Laboratory for Animal Health, Physiopathology and Epidemiology of Equine Diseases Unit, 14430 Goustranville, France
| | - Sandrine Petry
- Anses, Normandy Laboratory for Animal Health, Physiopathology and Epidemiology of Equine Diseases Unit, 14430 Goustranville, France
| | - Jean-Christophe Giard
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
| | - Simon Le Hello
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
- CHU de Caen, Service de Microbiologie, Avenue de la Côte de Nacre, 14033 Caen, France
- CHU de Caen, Service d’Hygiène Hospitalière, Avenue de la Côte de Nacre, 14033 Caen, France
| | - Albertine Léon
- Research Department, LABÉO, 14053 Caen, France
- Inserm UMR 1311, Dynamicure, Normandie University, UNICAEN, UNIROUEN, 14000 Caen, France
|
19
|
Genome-Wide Transposon Mutagenesis Screens Identify Group A Streptococcus Genes Affecting Susceptibility to β-Lactam Antibiotics. J Bacteriol 2022; 204:e0028722. [PMID: 36374114 PMCID: PMC9765115 DOI: 10.1128/jb.00287-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Group A streptococcus (GAS) is a Gram-positive human bacterial pathogen responsible for more than 700 million infections annually worldwide. Beta-lactam antibiotics are the primary agents used to treat GAS infections. Naturally occurring GAS clinical isolates with decreased susceptibility to beta-lactam antibiotics attributed to mutations in PBP2X have recently been documented. This prompted us to perform a genome-wide screen to identify GAS genes that alter beta-lactam susceptibility in vitro. Using saturated transposon mutagenesis, we screened for GAS gene mutations conferring altered in vitro susceptibility to penicillin G and/or ceftriaxone, two beta-lactam antibiotics commonly used to treat GAS infections. In the aggregate, we found that inactivating mutations in 150 GAS genes are associated with altered susceptibility to penicillin G and/or ceftriaxone. Many of the genes identified were previously not known to alter beta-lactam susceptibility or affect cell wall biosynthesis. Using isogenic mutant strains, we confirmed that inactivation of clpX (Clp protease ATP-binding subunit) or cppA (CppA proteinase) resulted in decreased in vitro susceptibility to penicillin G and ceftriaxone. Deletion of murA1 (UDP-N-acetylglucosamine 1-carboxyvinyltransferase) conferred increased susceptibility to ceftriaxone. Our results provide new information about the GAS genes affecting susceptibility to beta-lactam antibiotics. IMPORTANCE Beta-lactam antibiotics are the primary drugs prescribed to treat infections caused by group A streptococcus (GAS), an important human pathogen. However, the molecular mechanisms of GAS interactions with beta-lactam antibiotics are not fully understood. In this study, we performed a genome-wide mutagenesis screen to identify GAS mutations conferring altered susceptibility to beta-lactam antibiotics. In the aggregate, we discovered that mutations in 150 GAS genes were associated with altered beta-lactam susceptibility. 
Many identified genes were previously not known to alter beta-lactam susceptibility or affect cell wall biosynthesis. Our results provide new information about the molecular mechanisms of GAS interaction with beta-lactam antibiotics.
|
20
|
Expósito RR, Martínez-Sánchez M, Touriño J. SparkEC: speeding up alignment-based DNA error correction tools. BMC Bioinformatics 2022; 23:464. [PMID: 36344928 PMCID: PMC9639292 DOI: 10.1186/s12859-022-05013-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Accepted: 10/26/2022] [Indexed: 11/09/2022] Open
Abstract
Background In recent years, huge improvements have been made in the context of sequencing genomic data under what is called Next Generation Sequencing (NGS). However, the DNA reads generated by current NGS platforms are not free of errors, which can affect the quality of downstream analysis. Although error correction can be performed as a preprocessing step to overcome this issue, it usually requires long computational times to analyze the large datasets generated nowadays through NGS. Therefore, new software capable of scaling out on a cluster of nodes with high performance is of great importance. Results In this paper, we present SparkEC, a parallel tool capable of fixing those errors produced during the sequencing process. For this purpose, the algorithms proposed by the CloudEC tool, which has already been shown to perform accurate corrections, have been analyzed and optimized to improve their performance by relying on the Apache Spark framework together with the introduction of other enhancements such as the usage of memory-efficient data structures and the avoidance of any input preprocessing. The experimental results have shown significant improvements in the computational times of SparkEC when compared to CloudEC for all the representative datasets and scenarios under evaluation, providing average and maximum speedups of 4.9× and 11.9×, respectively, over its counterpart. Conclusion As error correction can take excessive computational time, SparkEC provides a scalable solution for correcting large datasets. Due to its distributed implementation, SparkEC speed can increase with respect to the number of nodes in a cluster. Furthermore, the software is freely available under GPLv3 license and is compatible with different operating systems (Linux, Windows and macOS). Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-05013-1.
Affiliation(s)
- Roberto R. Expósito
- Universidade da Coruña, CITIC, Computer Architecture Group, Campus de Elviña, 15071 A Coruña, Spain
| | - Marco Martínez-Sánchez
- Universidade da Coruña, CITIC, Computer Architecture Group, Campus de Elviña, 15071 A Coruña, Spain
| | - Juan Touriño
- Universidade da Coruña, CITIC, Computer Architecture Group, Campus de Elviña, 15071 A Coruña, Spain
|
21
|
Wright JJ, Bruce SA, Sinopoli DA, Palumbo JR, Stewart DJ. Phylogenomic analysis of the bowfin (Amia calva) reveals unrecognized species diversity in a living fossil lineage. Sci Rep 2022; 12:16514. [PMID: 36192509 PMCID: PMC9529906 DOI: 10.1038/s41598-022-20875-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 09/20/2022] [Indexed: 11/24/2022] Open
Abstract
The Bowfin (Amia calva), as currently recognized, represents the sole living member of the family Amiidae, which dates back to approximately 150 Ma. Prior to 1896, 13 species of extant Bowfins had been described, but these were all placed into a single species with no rationale or analysis given. This situation has persisted until the present day, with little attention given to re-evaluation of those previously described nominal forms. Here, we present a phylogenomic analysis based on over 21,000 single nucleotide polymorphisms (SNPs) from 94 individuals that unambiguously demonstrates the presence of at least two independent evolutionary lineages within extant Amia populations that merit species-level standing, as well as the possibility of two more. These findings not only expand the recognizable species diversity in an iconic, ancient lineage, but also demonstrate the utility of such methods in addressing previously intractable questions of molecular systematics and phylogeography in slowly evolving groups of ancient fishes.
Affiliation(s)
- Jeremy J Wright
- Research & Collections, New York State Museum, 3140 Cultural Education Center, Albany, NY, USA.
| | - Spencer A Bruce
- Department of Information Technology Services, University at Albany-State University of New York, Albany, NY, USA
| | - Daniel A Sinopoli
- Department of Biological Sciences, Museum of Natural Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Jay R Palumbo
- Department of Environmental Science & Ecology, State University of New York at Brockport, Brockport, NY, USA
| | - Donald J Stewart
- Department of Environmental Biology, State University of New York College of Environmental Science and Forestry, Syracuse, NY, USA.
|
22
|
Van Daele F, Honnay O, De Kort H. Genomic analyses point to a low evolutionary potential of prospective source populations for assisted migration in a forest herb. Evol Appl 2022; 15:1859-1874. [DOI: 10.1111/eva.13485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 08/13/2022] [Accepted: 09/17/2022] [Indexed: 11/26/2022] Open
Affiliation(s)
- Frederik Van Daele
- Department of Biology, Plant Conservation and Population Biology, KU Leuven, Leuven, Belgium
| | - Olivier Honnay
- Department of Biology, Plant Conservation and Population Biology, KU Leuven, Leuven, Belgium
| | - Hanne De Kort
- Department of Biology, Plant Conservation and Population Biology, KU Leuven, Leuven, Belgium
|
23
|
K-Mer Spectrum-Based Error Correction Algorithm for Next-Generation Sequencing Data. Computational Intelligence and Neuroscience 2022; 2022:8077664. [PMID: 35875730 PMCID: PMC9303089 DOI: 10.1155/2022/8077664] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/18/2022] [Accepted: 06/13/2022] [Indexed: 11/26/2022]
Abstract
In the mid-1970s, the first-generation sequencing technique (Sanger) was created. It used Advanced BioSystems sequencing devices and Beckman's GeXP genetic testing technology. The second-generation sequencing (2GS) technique arrived just several years after the first human genome was published in 2003. 2GS devices are much faster than Sanger sequencing equipment, with considerably cheaper manufacturing costs and far higher throughput in the form of short reads. The third-generation sequencing (3GS) method, initially introduced in 2005, offers further reduced manufacturing costs and higher throughput. Although sequencing technology has passed through several generations, it remains error-prone owing to the large number of reads. The study of this massive amount of data will aid in the decoding of life's secrets, the detection of infections, the development of improved crops, and the improvement of life quality, among other things. This is a challenging task, complicated not just by the large number of reads but also by the occurrence of sequencing mistakes. As a result, error correction is a crucial step in data processing; it entails identifying and correcting read errors. As demonstrated in this work, the performance of various k-spectrum-based error correction algorithms can be influenced by characteristics such as coverage depth, read length, and genome size. Consequently, time and effort must be put into selecting acceptable approaches for error correction of a given NGS dataset.
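The k-spectrum idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in the cited paper: the function names (`kmer_spectrum`, `weak_positions`, `correct_read`), the `min_count` threshold, and the single-substitution search are all illustrative assumptions. The sketch counts every k-mer in the read set, treats k-mers seen fewer than `min_count` times as likely errors, and accepts the first single-base substitution that leaves a read with no such k-mers.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer occurring in the reads (the 'k-spectrum')."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_positions(read, counts, k, min_count):
    """Start indices of k-mers in `read` seen fewer than `min_count` times."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]


def correct_read(read, counts, k, min_count):
    """Return the first single-base substitution that removes all weak k-mers.

    The read is returned unchanged if it has no weak k-mers or if no single
    substitution repairs it (hypothetical, deliberately simple policy).
    """
    if not weak_positions(read, counts, k, min_count):
        return read
    for pos in range(len(read)):
        for base in "ACGT":
            if base == read[pos]:
                continue
            candidate = read[:pos] + base + read[pos + 1:]
            if not weak_positions(candidate, counts, k, min_count):
                return candidate
    return read


if __name__ == "__main__":
    # Toy data: five error-free copies of a read and one copy with an error.
    reads = ["ACGTTGCAAG"] * 5 + ["ACGTTGCTAG"]
    spectrum = kmer_spectrum(reads, k=4)
    print(correct_read("ACGTTGCTAG", spectrum, k=4, min_count=2))
```

The dependence on coverage depth noted in the abstract shows up here directly: at low coverage, genuine k-mers may also fall below `min_count`, so a fixed threshold that works for one dataset may be wrong for another.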
|
24
|
Tang T, Hutvagner G, Wang W, Li J. Simultaneous compression of multiple error-corrected short-read sets for faster data transmission and better de novo assemblies. Brief Funct Genomics 2022; 21:387-398. [PMID: 35848773 DOI: 10.1093/bfgp/elac016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/14/2022] Open
Abstract
Next-Generation Sequencing has produced incredible amounts of short-read sequence data for de novo genome assembly over the last decades. For efficient transmission of these huge datasets, high-performance compression algorithms have been intensively studied. As both de novo assembly and error correction methods utilize the overlaps between reads, a concern is whether the sequencing errors that negatively affect genome assemblies also affect the compression of NGS data. This work addresses two problems: how current error correction algorithms can enable compression algorithms to make the sequence data much more compact, and whether the reads modified by the error-correction algorithms will lead to quality improvement for de novo contig assembly. As multiple sets of short reads are often produced by a single biomedical project in practice, we propose a graph-based method to reorder the files in the collection of multiple sets and then compress them simultaneously for a further compression improvement after error correction. We use examples to illustrate that accurate error correction algorithms can significantly reduce the number of mismatched nucleotides in reference-free compression, and hence can greatly improve the compression performance. Extensive tests on practical collections of multiple short-read sets confirm that the compression performance on the error-corrected data (with unchanged size) significantly outperforms that on the original data, and that the file reordering idea contributes further. The error correction on the original reads has also resulted in quality improvements of the genome assemblies, sometimes remarkably. However, how to combine appropriate error correction methods with an assembly algorithm so that the assembly performance is always significantly improved remains an open question.
Affiliation(s)
- Tao Tang
- Data Science Institute, University of Technology Sydney, 81 Broadway, Ultimo, 2007, NSW, Australia.,School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Qixia District, 210003, Jiangsu, China
| | - Gyorgy Hutvagner
- School of Biomedical Engineering, University of Technology Sydney, 81 Broadway, Ultimo, 2007, NSW, Australia
| | - Wenjian Wang
- School of Computer and Information Technology, Shanxi University, Shanxi Road, 030006, Shanxi, China
| | - Jinyan Li
- Data Science Institute, University of Technology Sydney, 81 Broadway, Ultimo, 2007, NSW, Australia
| |
|
25
|
Bridel S, Bouchez V, Brancotte B, Hauck S, Armatys N, Landier A, Mühle E, Guillot S, Toubiana J, Maiden MCJ, Jolley KA, Brisse S. A comprehensive resource for Bordetella genomic epidemiology and biodiversity studies. Nat Commun 2022; 13:3807. [PMID: 35778384 PMCID: PMC9249784 DOI: 10.1038/s41467-022-31517-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 06/21/2022] [Indexed: 11/09/2022] Open
Abstract
The genus Bordetella includes bacteria that are found in the environment and/or associated with humans and other animals. A few closely related species, including Bordetella pertussis, are human pathogens that cause diseases such as whooping cough. Here, we present a large database of Bordetella isolates and genomes and develop genotyping systems for the genus and for the B. pertussis clade. To generate the database, we merge previously existing databases from Oxford University and Institut Pasteur, import genomes from public repositories, and add 83 newly sequenced B. bronchiseptica genomes. The public database currently includes 2582 Bordetella isolates and their provenance data, and 2085 genomes ( https://bigsdb.pasteur.fr/bordetella/ ). We use core-genome multilocus sequence typing (cgMLST) to develop genotyping systems for the whole genus and for B. pertussis, as well as specific schemes to define antigenic, virulence and macrolide resistance profiles. Phylogenetic analyses allow us to redefine evolutionary relationships among known Bordetella species, and to propose potential new species. Our database provides an expandable resource for genotyping of environmental and clinical Bordetella isolates, thus facilitating evolutionary and epidemiological research on whooping cough and other Bordetella infections.
Affiliation(s)
- Sébastien Bridel
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
| | - Valérie Bouchez
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France
| | - Bryan Brancotte
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015, Paris, France
| | - Sofia Hauck
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
| | - Nathalie Armatys
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France
| | - Annie Landier
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France
| | - Estelle Mühle
- Collection de l'Institut Pasteur, Institut Pasteur, Université Paris Cité, Paris, France
| | - Sophie Guillot
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France
| | - Julie Toubiana
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France.,Department of General Pediatrics and Pediatric Infectious Diseases, Université Paris Cité, Hôpital Necker-Enfants Malades, APHP, Paris, France
| | - Martin C J Maiden
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
| | - Keith A Jolley
- Department of Zoology, University of Oxford, 11a Mansfield Road, Oxford, OX1 3SZ, UK
| | - Sylvain Brisse
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France.,National Reference Center for Whooping Cough and other Bordetella Infections, Institut Pasteur, Paris, France
| |
|
26
|
Validation of Fourier Transform Infrared Spectroscopy for Serotyping of Streptococcus pneumoniae. J Clin Microbiol 2022; 60:e0032522. [PMID: 35699436 PMCID: PMC9297836 DOI: 10.1128/jcm.00325-22] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 02/02/2023] Open
Abstract
Fourier transform infrared (FT-IR) spectroscopy (IR Biotyper; Bruker) allows highly discriminatory fingerprinting of closely related bacterial strains. In this study, FT-IR spectroscopy-based capsular typing of Streptococcus pneumoniae was validated as a rapid, cost-effective, and medium-throughput alternative to the classical phenotypic techniques. A training set of 233 strains was defined, comprising 34 different serotypes and including all 24 vaccine types (VTs) and 10 non-vaccine types (NVTs). The acquired spectra were used to (i) create a dendrogram where strains clustered together according to their serotypes and (ii) train an artificial neural network (ANN) model to predict unknown pneumococcal serotypes. During validation using 153 additional strains, we reached 98.0% accuracy for determining serotypes represented in the training set. Next, the performance of the IR Biotyper was assessed using 124 strains representing 59 non-training set serotypes. In this setting, 42 of 59 serotypes (71.1%) could be accurately categorized as being non-training set serotypes. Furthermore, it was observed that comparability of spectra was affected by the source of the Columbia medium used to grow the pneumococci and that this complicated the robustness and standardization potential of FT-IR spectroscopy. A rigorous laboratory workflow in combination with specific ANN models that account for environmental noise parameters can be applied to overcome this issue in the near future. The IR Biotyper has the potential to be used as a fast, cost-effective, and accurate phenotypic serotyping tool for S. pneumoniae.
|
27
|
Kallenborn F, Cascitti J, Schmidt B. CARE 2.0: reducing false-positive sequencing error corrections using machine learning. BMC Bioinformatics 2022; 23:227. [PMID: 35698033 PMCID: PMC9195321 DOI: 10.1186/s12859-022-04754-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/03/2022] [Accepted: 05/30/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Next-generation sequencing pipelines often perform error correction as a preprocessing step to obtain cleaned input data. State-of-the-art error correction programs are able to reliably detect and correct the majority of sequencing errors. However, they also introduce new errors by making false-positive corrections. These correction mistakes can have a negative impact on downstream analysis, such as k-mer statistics, de-novo assembly, and variant calling. This motivates the need for more precise error correction tools. RESULTS We present CARE 2.0, a context-aware read error correction tool based on multiple sequence alignment targeting Illumina datasets. In addition to a number of newly introduced optimizations, its most significant change is the replacement of CARE 1.0's hand-crafted correction conditions with a novel classifier based on random decision forests trained on Illumina data. This results in up to two orders-of-magnitude fewer false-positive corrections compared to other state-of-the-art error correction software. At the same time, CARE 2.0 is able to achieve high numbers of true-positive corrections comparable to its competitors. On a simulated full human dataset with 914M reads, CARE 2.0 generates only 1.2M false positives (FPs) (and 801.4M true positives (TPs)) at a highly competitive runtime, while the best corrections achieved by other state-of-the-art tools contain at least 3.9M FPs and at most 814.5M TPs. Better de-novo assembly and improved k-mer analysis show the applicability of CARE 2.0 to real-world data. CONCLUSION False-positive corrections can negatively influence down-stream analysis. The precision of CARE 2.0 greatly reduces the number of those corrections compared to other state-of-the-art programs including BFC, Karect, Musket, Bcool, SGA, and Lighter. Thus, higher-quality datasets are produced, which improve k-mer analysis and de-novo assembly in real-world datasets and demonstrate the applicability of machine learning techniques in the context of sequencing read error correction. CARE 2.0 is written in C++/CUDA for Linux systems and can be run on the CPU as well as on CUDA-enabled GPUs. It is available at https://github.com/fkallen/CARE .
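The counts quoted in this abstract can be turned into a precision figure, TP / (TP + FP), with a few lines of Python. This is only arithmetic on the reported numbers, not part of CARE 2.0; the function and variable names are illustrative. One assumption is made explicit in the code: the abstract gives the competitors' lowest FP count and highest TP count as separate bounds, so combining them yields an upper bound on competitor precision, the most favorable case for the other tools.

```python
def precision(true_positives, false_positives):
    """Fraction of all corrections made that were correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no corrections were made")
    return true_positives / total


# Counts reported in the abstract (simulated human dataset, 914M reads).
CARE2_TP, CARE2_FP = 801.4e6, 1.2e6

# Assumption: pairing the largest TP count with the smallest FP count reported
# for the other tools gives an upper bound on their precision; the two extremes
# need not come from the same tool.
OTHERS_MAX_TP, OTHERS_MIN_FP = 814.5e6, 3.9e6

care2_precision = precision(CARE2_TP, CARE2_FP)
others_precision_upper_bound = precision(OTHERS_MAX_TP, OTHERS_MIN_FP)

if __name__ == "__main__":
    print(f"CARE 2.0 precision:           {care2_precision:.4%}")
    print(f"Other tools (upper bound):    {others_precision_upper_bound:.4%}")
```

Even under that favorable pairing, the share of wrong corrections (1 minus precision) is roughly three times higher for the other tools than for CARE 2.0 on this dataset.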
Affiliation(s)
- Felix Kallenborn
- Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany.
| | - Julian Cascitti
- Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
| | - Bertil Schmidt
- Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany
| |
|
28
|
Lefrancq N, Bouchez V, Fernandes N, Barkoff AM, Bosch T, Dalby T, Åkerlund T, Darenberg J, Fabianova K, Vestrheim DF, Fry NK, González-López JJ, Gullsby K, Habington A, He Q, Litt D, Martini H, Piérard D, Stefanelli P, Stegger M, Zavadilova J, Armatys N, Landier A, Guillot S, Hong SL, Lemey P, Parkhill J, Toubiana J, Cauchemez S, Salje H, Brisse S. Global spatial dynamics and vaccine-induced fitness changes of Bordetella pertussis. Sci Transl Med 2022; 14:eabn3253. [PMID: 35476597 DOI: 10.1126/scitranslmed.abn3253] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/02/2022]
Abstract
As with other pathogens, competitive interactions between Bordetella pertussis strains drive infection risk. Vaccines are thought to perturb strain diversity through shifts in immune pressures; however, this has rarely been measured because of inadequate data and analytical tools. We used 3344 sequences from 23 countries to show that, on average, there are 28.1 transmission chains circulating within a subnational region, with the number of chains strongly associated with host population size. It took 5 to 10 years for B. pertussis to be homogeneously distributed throughout Europe, with the same time frame required for the United States. Increased fitness of pertactin-deficient strains after implementation of acellular vaccines, but reduced fitness otherwise, can explain long-term genotype dynamics. These findings highlight the role of vaccine policy in shifting local diversity of a pathogen that is responsible for 160,000 deaths annually.
Affiliation(s)
- Noémie Lefrancq
- Institut Pasteur, Université Paris Cité, Mathematical Modelling of Infectious Diseases Unit, UMR2000, CNRS, 75015 Paris, France.,Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
| | - Valérie Bouchez
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France
| | - Nadia Fernandes
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France
| | - Alex-Mikael Barkoff
- University of Turku UTU, Institute of Biomedicine, Research Center for Infections and Immunity, FI-20520 Turku, Finland
| | - Thijs Bosch
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), 3720 BA Bilthoven, Netherlands
| | - Tine Dalby
- Statens Serum Institut, Bacteria, Parasites and Fungi/Infectious Disease Preparedness, 2300 Copenhagen, Denmark
| | - Thomas Åkerlund
- The Public Health Agency of Sweden, Unit for Laboratory Surveillance of Bacterial Pathogens, SE-171 82 Solna, Sweden
| | - Jessica Darenberg
- The Public Health Agency of Sweden, Unit for Laboratory Surveillance of Bacterial Pathogens, SE-171 82 Solna, Sweden
| | - Katerina Fabianova
- National Institute of Public Health, Department of Infectious Diseases Epidemiology, CZ-10000 Prague, Czech Republic
| | - Didrik F Vestrheim
- Norwegian Institute of Public Health, Department of Infectious Disease Control and Vaccine, N-0213 Oslo, Norway
| | - Norman K Fry
- Respiratory and Vaccine Preventable Bacteria Reference Unit, Public Health England-National Infection Service, London NW9 5EQ, UK.,Immunisation and Countermeasures Division, Public Health England-National Infection Service, London NW9 5EQ, UK
| | - Juan José González-López
- University Hospital Vall d'Hebron, Microbiology Department, 08035 Barcelona, Spain.,Universitat Autònoma de Barcelona, Department of Genetics and Microbiology, 08193 Barcelona, Spain
| | - Karolina Gullsby
- Centre for Research and Development, Uppsala University/Region Gävleborg, 80187 Gävle, Sweden
| | - Adele Habington
- Molecular Microbiology Laboratory, Children's Health Ireland, Crumlin, D12 N512 Dublin, Ireland
| | - Qiushui He
- University of Turku UTU, Institute of Biomedicine, Research Center for Infections and Immunity, FI-20520 Turku, Finland.,InFLAMES Research Flagship Center, University of Turku, FI-20520 Turku, Finland
| | - David Litt
- Respiratory and Vaccine Preventable Bacteria Reference Unit, Public Health England-National Infection Service, London NW9 5EQ, UK
| | - Helena Martini
- Department of Microbiology, National Reference Centre for Bordetella pertussis, Universitair Ziekenhuis Brussel, Vrije Universiteit Brussel (VUB), B-1090 Brussels, Belgium
| | - Denis Piérard
- Department of Microbiology, National Reference Centre for Bordetella pertussis, Universitair Ziekenhuis Brussel, Vrije Universiteit Brussel (VUB), B-1090 Brussels, Belgium
| | - Paola Stefanelli
- Department of Infectious Diseases, Istituto Superiore di Sanità, IT-00161 Rome, Italy
| | - Marc Stegger
- Statens Serum Institut, Bacteria, Parasites and Fungi/Infectious Disease Preparedness, 2300 Copenhagen, Denmark
| | - Jana Zavadilova
- National Institute of Public Health, National Reference Laboratory for Pertussis and Diphtheria, 100 00 Prague, Czech Republic
| | - Nathalie Armatys
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France
| | - Annie Landier
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France
| | - Sophie Guillot
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France
| | - Samuel L Hong
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, 3000 Leuven, Belgium
| | - Philippe Lemey
- Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, 3000 Leuven, Belgium
| | - Julian Parkhill
- Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK
| | - Julie Toubiana
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France.,Université Paris Cité, Department of General Paediatrics and Paediatric Infectious Diseases, Necker-Enfants Malades Hospital, APHP, 75015 Paris, France
| | - Simon Cauchemez
- Institut Pasteur, Université Paris Cité, Mathematical Modelling of Infectious Diseases Unit, UMR2000, CNRS, 75015 Paris, France
| | - Henrik Salje
- Institut Pasteur, Université Paris Cité, Mathematical Modelling of Infectious Diseases Unit, UMR2000, CNRS, 75015 Paris, France.,Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
| | - Sylvain Brisse
- Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75724 Paris, France.,National Reference Center for Whooping Cough and Other Bordetella Infections, 75724 Paris, France
| |
|
29
|
Ma Q, Geng Y, Li Q, Cheng C, Zang R, Guo Y, Wu H, Xu C, Zhang M. Comparative mitochondrial genome analyses reveal conserved gene arrangement but massive expansion/contraction in two closely related Exserohilum pathogens. Comput Struct Biotechnol J 2022; 20:1456-1469. [PMID: 35386100 PMCID: PMC8956966 DOI: 10.1016/j.csbj.2022.03.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 03/16/2022] [Accepted: 03/18/2022] [Indexed: 01/18/2023] Open
Abstract
Exserohilum turcicum and E. rostratum, two closely related fungal species, are both economically important pathogens but have quite different target hosts (specific to plants and cross-kingdom infection, respectively). In the present study, complete circular mitochondrial genomes of the two Exserohilum species were sequenced and de novo assembled; both mainly comprised the same set of 13 core protein-coding genes (PCGs), two rRNAs, and a certain number of tRNAs and unidentified open reading frames (ORFs). Comparative analyses indicated that these two fungi had significant mitogenomic collinearity and consistent mitochondrial gene arrangement, yet vastly different mitogenome sizes, 264,948 bp and 64,620 bp, respectively. By contrast with the 17 introns containing 17 intronic ORFs (one-to-one) in the E. rostratum mitogenome, the E. turcicum mitogenome contained far more introns (70) and intronic ORFs (126), which were considered the main contributing factors to their mitogenome expansion/contraction. Within the generally intron-rich gene cox1, a total of 18 and 10 intron position classes (Pcls) were identified separately in the two mitogenomes. Moreover, 16.16% and 10.85% ratios of intra-mitogenomic repetitive regions were detected in E. turcicum and E. rostratum, respectively. Based on the combined mitochondrial gene dataset, we established a well-supported phylogenetic tree topology of 98 ascomycetes, implying that mitogenomes may act as an effective molecular marker for fungal phylogenetic reconstruction. Our results represent the first report on mitogenomes in the genus Exserohilum and have significant implications for understanding the origin, evolution and pathogenic mechanisms of this fungal lineage.
Affiliation(s)
- Qingzhou Ma
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Yuehua Geng
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Qiang Li
- School of Food and Biological Engineering, Chengdu University, Chengdu, China
| | - Chongyang Cheng
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Rui Zang
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Yashuang Guo
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Haiyan Wu
- Analytical Instrument Center, Henan Agricultural University, Zhengzhou, Henan, China
| | - Chao Xu
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Meng Zhang
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| |
|
30
|
Palma F, Mangone I, Janowicz A, Moura A, Chiaverini A, Torresi M, Garofolo G, Criscuolo A, Brisse S, Di Pasquale A, Cammà C, Radomski N. In vitro and in silico parameters for precise cgMLST typing of Listeria monocytogenes. BMC Genomics 2022; 23:235. [PMID: 35346021 PMCID: PMC8961897 DOI: 10.1186/s12864-022-08437-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/06/2021] [Accepted: 02/28/2022] [Indexed: 02/02/2023] Open
Abstract
Background Whole genome sequencing analyzed by core genome multi-locus sequence typing (cgMLST) is widely used in surveillance of the pathogenic bacterium Listeria monocytogenes. Given the heterogeneity of available bioinformatics tools to define cgMLST alleles, our aim was to identify parameters influencing the precision of cgMLST profiles. Methods We used three L. monocytogenes reference genomes from different phylogenetic lineages and assessed the impact of in vitro (i.e. tested genomes, successive platings, replicates of DNA extraction and sequencing) and in silico parameters (i.e. targeted depth of coverage, depth of coverage, breadth of coverage, assembly metrics, cgMLST workflows, cgMLST completeness) on the precision of a cgMLST scheme of 1748 core loci. Six cgMLST workflows were tested, comprising assembly-based (BIGSdb, INNUENDO, GENPAT, SeqSphere and BioNumerics) and assembly-free (i.e. kmer-based MentaLiST) allele callers. Principal component analyses and generalized linear models were used to identify the parameters with the greatest impact on cgMLST precision. Results The isolate's genetic background, cgMLST workflows, cgMLST completeness, as well as depth and breadth of coverage were the parameters that most affected cgMLST precision (i.e. identical alleles against reference circular genomes). All workflows performed well at ≥40X depth of coverage, with high loci detection (> 99.54% for all, except for BioNumerics with 97.78%), and showed consistent cluster definitions using the reference cut-off of ≤7 allele differences. Conclusions This highlights that bioinformatics workflows dedicated to cgMLST allele calling are largely robust when paired-end reads are of high quality and when the sequencing depth is ≥40X. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08437-4.
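The cluster definition used in this abstract (a cut-off of ≤7 allele differences between cgMLST profiles) can be sketched in Python as follows. This is a hedged illustration, not the procedure of any workflow named above: the profile representation (a dict from locus name to allele ID, with `None` for an uncalled locus), the choice to ignore loci missing in either profile, and the use of single-linkage grouping are all assumptions, and the function names are hypothetical.

```python
def allele_distance(profile_a, profile_b):
    """Number of loci called in both profiles whose allele IDs differ."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def single_linkage_clusters(profiles, cutoff=7):
    """Group isolates linked by chains of pairs with distance <= cutoff."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if allele_distance(profiles[first], profiles[second]) <= cutoff:
                parent[find(first)] = find(second)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Toy profiles over 10 loci (a real scheme here would have 1748).
    base = {f"locus{i}": 1 for i in range(10)}
    near = dict(base, locus0=3, locus1=3, locus2=3)   # 3 differences from base
    far = {f"locus{i}": 2 for i in range(10)}         # 10 differences from both
    print(single_linkage_clusters({"A": base, "B": near, "C": far}))
```

How missing loci are handled matters for the precision question the abstract raises: ignoring them, as done here, can understate distances when cgMLST completeness is low.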
|
31
|
Integrative Reverse Genetic Analysis Identifies Polymorphisms Contributing to Decreased Antimicrobial Agent Susceptibility in Streptococcus pyogenes. mBio 2022; 13:e0361821. [PMID: 35038921 PMCID: PMC8764543 DOI: 10.1128/mbio.03618-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/21/2023] Open
Abstract
Identification of genetic polymorphisms causing increased antibiotic resistance in bacterial pathogens traditionally has proceeded from observed phenotype to defined mutant genotype. The availability of large collections of microbial genome sequences that lack antibiotic susceptibility metadata provides an important resource and opportunity to obtain new information about increased antimicrobial resistance by a reverse genotype-to-phenotype bioinformatic and experimental workflow. We analyzed 26,465 genome sequences of Streptococcus pyogenes, a human pathogen causing 700 million infections annually. The population genomic data identified amino acid changes in penicillin-binding proteins 1A, 1B, 2A, and 2X with signatures of evolution under positive selection as potential candidates for causing decreased susceptibility to β-lactam antibiotics. Construction and analysis of isogenic mutant strains containing individual amino acid replacements in penicillin-binding protein 2X (PBP2X) confirmed that the identified residues produced decreased susceptibility to penicillin. We also discovered the first chimeric PBP2X in S. pyogenes and show that strains containing it have significantly decreased β-lactam susceptibility. The novel integrative reverse genotype-to-phenotype strategy presented is broadly applicable to other pathogens and likely will lead to new knowledge about antimicrobial agent resistance, a massive public health problem worldwide. IMPORTANCE The recent demonstration that naturally occurring amino acid substitutions in Streptococcus pyogenes PBP2X are sufficient to cause severalfold reduced susceptibility to multiple β-lactam antibiotics in vitro raises the concern that these therapeutic agents may become compromised. Substitutions in PBP2X are common first-step mutations that, with the incremental accumulation of additional adaptive mutations within the PBPs, can result in high-level resistance. Because β-lactam susceptibility testing is not routinely performed, the nature and extent of such substitutions within the PBPs of S. pyogenes are poorly characterized. To address this knowledge deficit, polymorphisms in the PBPs were identified among the most comprehensive cohort of S. pyogenes genome sequences investigated to date. The mutational processes and selective forces acting on the PBPs were assessed to identify specific substitutions likely to influence β-lactam susceptibility and to evaluate factors posited to be impediments to resistance emergence.
|
32
|
Population structure analysis and laboratory monitoring of Shigella by core-genome multilocus sequence typing. Nat Commun 2022; 13:551. [PMID: 35087053 PMCID: PMC8795385 DOI: 10.1038/s41467-022-28121-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/29/2021] [Accepted: 01/06/2022] [Indexed: 11/18/2022] Open
Abstract
The laboratory surveillance of bacillary dysentery is based on a standardised Shigella typing scheme that classifies Shigella strains into four serogroups and more than 50 serotypes on the basis of biochemical tests and lipopolysaccharide O-antigen serotyping. Real-time genomic surveillance of Shigella infections has been implemented in several countries, but without the use of a standardised typing scheme. Here, we study over 4000 reference strains and clinical isolates of Shigella, covering all serotypes, with both the current serotyping scheme and the standardised EnteroBase core-genome multilocus sequence typing scheme (cgMLST). The Shigella genomes are grouped into eight phylogenetically distinct clusters, within the E. coli species. The cgMLST hierarchical clustering (HC) analysis at different levels of resolution (HC2000 to HC400) recognises the natural population structure of Shigella. By contrast, the serotyping scheme is affected by horizontal gene transfer, leading to a conflation of genetically unrelated Shigella strains and a separation of genetically related strains. The use of this cgMLST scheme will facilitate the transition from traditional phenotypic typing to routine whole-genome sequencing for the laboratory surveillance of Shigella infections. Lab-based surveillance of Shigella has traditionally been based on serotyping but increasing availability of whole genome sequencing could enable higher resolution typing. Here, the authors apply a core genome multilocus sequence typing scheme to Shigella sequence data and describe its population structure.
|
33
|
Shibuya Y, Belazzougui D, Kucherov G. Set-Min Sketch: A Probabilistic Map for Power-Law Distributions with Application to k-Mer Annotation. J Comput Biol 2022; 29:140-154. [DOI: 10.1089/cmb.2021.0429] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Affiliation(s)
- Yoshihiro Shibuya
- LIGM, Modèles et Algorithmes Group, Université Gustave Eiffel, Marne-la-Vallée, France
| | - Djamal Belazzougui
- CAPA, DTISI, Centre de Recherche sur l'Information Scientifique et Technique, Algiers, Algeria
| | - Gregory Kucherov
- LIGM, CNRS, Modèles et Algorithmes Group, Université Gustave Eiffel, Marne-la-Vallée, France
- Skolkovo Institute of Science and Technology, Moscow, Russia
| |
|
34
|
Sharma A, Jain P, Mahgoub A, Zhou Z, Mahadik K, Chaterji S. Lerna: transformer architectures for configuring error correction tools for short- and long-read genome sequencing. BMC Bioinformatics 2022; 23:25. [PMID: 34991450 PMCID: PMC8734100 DOI: 10.1186/s12859-021-04547-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Accepted: 12/20/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Sequencing technologies are prone to errors, making error correction (EC) necessary for downstream applications. EC tools need to be manually configured for optimal performance. We find that the optimal parameters (e.g., k-mer size) are both tool- and dataset-dependent. Moreover, evaluating the performance (i.e., Alignment-rate or Gain) of a given tool usually relies on a reference genome, but quality reference genomes are not always available. We introduce Lerna for the automated configuration of k-mer-based EC tools. Lerna first creates a language model (LM) of the uncorrected genomic reads, and then, based on this LM, calculates a metric called the perplexity metric to evaluate the corrected reads for different parameter choices. Next, it finds the one that produces the highest alignment rate without using a reference genome. The fundamental intuition of our approach is that the perplexity metric is inversely correlated with the quality of the assembly after error correction. Therefore, Lerna leverages the perplexity metric for automated tuning of k-mer sizes without needing a reference genome. RESULTS First, we show that the best k-mer value can vary for different datasets, even for the same EC tool. This motivates our design that automates k-mer size selection without using a reference genome. Second, we show the gains of our LM using its component attention-based transformers. We show the model's estimation of the perplexity metric before and after error correction. The lower the perplexity after correction, the better the k-mer size. We also show that the alignment rate and assembly quality computed for the corrected reads are strongly negatively correlated with the perplexity, enabling the automated selection of k-mer values for better error correction, and hence, improved assembly quality. We validate our approach on both short and long reads. 
Additionally, we show that our attention-based models have significant runtime improvement for the entire pipeline, 18× faster than previous works, due to parallelizing the attention mechanism and the use of JIT compilation for GPU inferencing. CONCLUSION Lerna improves de novo genome assembly by optimizing EC tools. Our code is made available in a public repository at: https://github.com/icanforce/lerna-genomics.
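The perplexity-based selection described in this abstract can be illustrated with a much simpler model. The sketch below is not Lerna's transformer language model: it assumes a character n-gram model with add-one smoothing (the function names `train_ngram` and `perplexity`, the choice n = 3, and the toy reads are all hypothetical). It only shows the general idea that reads resembling the training reads score a lower perplexity.

```python
import math
from collections import Counter

def train_ngram(reads, n):
    """Count n-grams and their (n-1)-character contexts over all reads."""
    ngram_counts, context_counts = Counter(), Counter()
    for read in reads:
        for i in range(len(read) - n + 1):
            gram = read[i:i + n]
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return ngram_counts, context_counts

def perplexity(reads, model, n, alphabet_size=4):
    """Perplexity = exp(-mean log-probability) under add-one smoothing.

    Assumes at least one read has length >= n.
    """
    ngram_counts, context_counts = model
    log_sum, count = 0.0, 0
    for read in reads:
        for i in range(len(read) - n + 1):
            gram = read[i:i + n]
            p = (ngram_counts[gram] + 1) / (context_counts[gram[:-1]] + alphabet_size)
            log_sum += math.log(p)
            count += 1
    return math.exp(-log_sum / count)

# Toy data (hypothetical): a model trained on one periodic read.
model = train_ngram(["ACGTACGTACGTACGTACGT"], n=3)
similar = perplexity(["ACGTACGTACGT"], model, n=3)
dissimilar = perplexity(["AAAACCCCGGGGTTTT"], model, n=3)
```

In a tuning loop of the kind the abstract describes, one would compute such a score for the reads corrected under each candidate k and prefer the k with the lowest value.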
Affiliation(s)
- Pranjal Jain
- Indian Institute of Technology Bombay, Mumbai, India
| |
|
35
|
Kämpfer P, Busse HJ, Clermont D, Criscuolo A, Glaeser SP. Devosia equisanguinis sp. nov., isolated from horse blood. Int J Syst Evol Microbiol 2021; 71. [PMID: 34788212 DOI: 10.1099/ijsem.0.005090] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022] Open
Abstract
A Gram-stain-negative, aerobic, non-endospore-forming organism isolated from horse blood was studied for its taxonomic allocation. On the basis of 16S rRNA gene sequence similarity comparisons, strain M6-77T grouped within the genus Devosia and was most closely related to Devosia elaeis (97.6 %) and Devosia indica (97.55 %). The 16S rRNA gene sequence similarity to type strains of other Devosia species was below 97.5 %. The average nucleotide identity and digital DNA-DNA hybridization values between the M6-77T genome assembly and those of the closest relative Devosia type strains were <85 and <25 %, respectively. Strain M6-77T grew optimally at 25-37 °C (range: 10-36 °C), at a pH range of pH 6.5-10.5 and in the presence of up to 3 % (w/v) NaCl. The fatty acid profile from whole-cell hydrolysates supported the allocation of the strain to the genus Devosia. Major fatty acids were C18 : 1 ω7c, 11-methyl C18 : 1 ω7c and C16 : 0. The quinone system consisted exclusively of ubiquinone Q-10. The polar lipid profile was composed of the major lipids diphosphatidylglycerol, phosphatidylglycerol and three unidentified glycolipids. In the polyamine pattern, putrescine was predominant and spermidine was detected in moderate amounts. The diamino acid of the peptidoglycan was meso-diaminopimelic acid. In addition, the results of physiological and biochemical tests also allowed phenotypic differentiation of strain M6-77T from the closely related species. Hence, M6-77T represents a new species of the genus Devosia, for which we propose the name Devosia equisanguinis sp. nov., with M6-77T (=CIP 111628T=LMG 30659T=CCM 8868T) as the type strain.
Affiliation(s)
- Peter Kämpfer
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
| | - Hans-Jürgen Busse
- Institut für Mikrobiologie, Veterinärmedizinische Universität, Wien, Austria
| | - Dominique Clermont
- CIP - Collection of Institut Pasteur, Institut Pasteur, Université de Paris, F-75015 Paris, France
| | - Alexis Criscuolo
- Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Université de Paris, F-75015 Paris, France
| | - Stefanie P Glaeser
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
| |
|
36
|
Guglielmini J, Hennart M, Badell E, Toubiana J, Criscuolo A, Brisse S. Genomic Epidemiology and Strain Taxonomy of Corynebacterium diphtheriae. J Clin Microbiol 2021; 59:e0158121. [PMID: 34524891 PMCID: PMC8601238 DOI: 10.1128/jcm.01581-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/17/2021] [Accepted: 09/03/2021] [Indexed: 12/13/2022] Open
Abstract
Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient. Sporadic cases or small clusters are observed in high-vaccination settings. The phylogeography and short timescale evolution of C. diphtheriae are not well understood, in part due to a lack of harmonized analytical approaches of genomic surveillance and strain tracking. We combined 1,305 genes with highly reproducible allele calls into a core genome multilocus sequence typing (cgMLST) scheme. We analyzed cgMLST gene diversity among 602 isolates from sporadic clinical cases, small clusters, or large outbreaks. We defined sublineages based on the phylogenetic structure within C. diphtheriae and strains based on the highest number of cgMLST mismatches within documented outbreaks. We performed time-scaled phylogenetic analyses of major sublineages. The cgMLST scheme showed high allele call rate in C. diphtheriae and the closely related species C. belfantii and C. rouxii. We demonstrate its utility to delineate epidemiological case clusters and outbreaks using a 25 mismatches threshold and reveal a number of cryptic transmission chains, most of which are geographically restricted to one or a few adjacent countries. Subcultures of the vaccine strain PW8 differed by up to 20 cgMLST mismatches. Phylogenetic analyses revealed a short-timescale evolutionary gain or loss of the diphtheria toxin and biovar-associated genes. We devised a genomic taxonomy of strains and deeper sublineages (defined using a 500-cgMLST-mismatch threshold), currently comprising 151 sublineages, only a few of which are geographically widespread based on current sampling. The cgMLST genotyping tool and nomenclature was made publicly accessible (https://bigsdb.pasteur.fr/diphtheria). Standardized genome-scale strain genotyping will help tracing transmission and geographic spread of C. diphtheriae. The unified genomic taxonomy of C. diphtheriae strains provides a common language for studies of ecology, evolution, and virulence heterogeneity among C. diphtheriae sublineages.
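To make the idea of grouping isolates by a cgMLST mismatch threshold concrete, here is a minimal sketch. It assumes single-linkage grouping over allelic profiles and treats missing allele calls (`None`) as uninformative; the abstract does not state the exact clustering procedure, and the function names, profiles, and threshold below are hypothetical.

```python
def allele_mismatches(a, b):
    """Count loci where both profiles have an allele call and the calls differ."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)

def single_linkage_clusters(profiles, threshold):
    """Group isolates linked by chains of pairs with <= threshold mismatches."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if allele_mismatches(profiles[a], profiles[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())

# Toy allelic profiles over four loci (hypothetical data).
profiles = {
    "s1": [1, 1, 1, 1],
    "s2": [1, 1, 1, 2],
    "s3": [5, 6, 7, 8],
}
clusters = single_linkage_clusters(profiles, threshold=1)
```

With real data the profiles would span the 1,305 loci of the scheme and the threshold would be 25 for case clusters or 500 for sublineages, as quoted in the abstract.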
Affiliation(s)
- Julien Guglielmini
- Institut Pasteur, Université de Paris, Bioinformatics and Biostatistics Hub, Department of Computational Biology, Paris, France
| | - Melanie Hennart
- Institut Pasteur, Université de Paris, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
- Sorbonne Université, Collège Doctoral, Paris, France
| | - Edgar Badell
- Institut Pasteur, Université de Paris, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
- National Reference Center for the Corynebacteria of the Diphtheriae Complex, Paris, France
| | - Julie Toubiana
- Institut Pasteur, Université de Paris, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
- National Reference Center for the Corynebacteria of the Diphtheriae Complex, Paris, France
- Université de Paris, Service de Pédiatrie Générale et Maladies Infectieuses, Hôpital Necker-Enfants Malades, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Alexis Criscuolo
- Institut Pasteur, Université de Paris, Bioinformatics and Biostatistics Hub, Department of Computational Biology, Paris, France
| | - Sylvain Brisse
- Institut Pasteur, Université de Paris, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
- National Reference Center for the Corynebacteria of the Diphtheriae Complex, Paris, France
| |
|
37
|
Ma Q, Wu H, Geng Y, Li Q, Zang R, Guo Y, Xu C, Zhang M. Mitogenome-wide comparison and phylogeny reveal group I intron dynamics and intraspecific diversification within the phytopathogen Corynespora cassiicola. Comput Struct Biotechnol J 2021; 19:5987-5999. [PMID: 34849203 PMCID: PMC8598970 DOI: 10.1016/j.csbj.2021.11.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/12/2021] [Revised: 11/01/2021] [Accepted: 11/02/2021] [Indexed: 12/20/2022] Open
Abstract
Corynespora cassiicola, the causal agent of an extensive range of plant diseases worldwide, is a momentous fungus with diverse lifestyles and rich in intraspecies variations. In the present study, a total of 56 mitochondrial genomes of C. cassiicola were assembled (except two available online) and analyzed, of which 16 mitogenomes were newly sequenced here. All these circular mitochondrial DNA (mtDNA) molecules, ranging from 39,223 bp to 45,786 bp in length, comprised the same set of 13 core protein-coding genes (PCGs), two rRNAs and 27 tRNAs arranged in identical order. Across the above conserved genes, nad3 had the largest genetic distance between different isolates and was possibly subjected to positive selection pressure. Comparative mitogenomic analysis indicated that seven group I (IB, IC1, and IC2) introns with a length range of 1013-1876 bp were differentially inserted in three core PCGs (cox1, nad1, and nad5), resulting in the varied mitogenome sizes among C. cassiicola isolates. In combination with dynamic distribution of the introns, a well-supported mitogenome-wide phylogeny of the 56 C. cassiicola isolates revealed eight phylogenetic groups, which only had weak correlations with host range and toxin class. Different groups of isolates exhibited obvious differences in length and GC content of some genes, while a degree of variance in codon usage and tRNA structure was also observed. This research served as the first report on mitogenomic comparisons within C. cassiicola, and could provide new insights into its intraspecific microevolution and genetic diversity.
Affiliation(s)
- Qingzhou Ma
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Haiyan Wu
- Analytical Instrument Center, Henan Agricultural University, Zhengzhou, Henan, China
| | - Yuehua Geng
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Qiang Li
- School of Food and Biological Engineering, Chengdu University, Chengdu, China
| | - Rui Zang
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Yashuang Guo
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Chao Xu
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| | - Meng Zhang
- Department of Plant Pathology, Henan Agricultural University, Zhengzhou, Henan, China
| |
|
38
|
Del-Rio G, Rego MA, Whitney BM, Schunck F, Silveira LF, Faircloth BC, Brumfield RT. Displaced clines in an avian hybrid zone (Thamnophilidae: Rhegmatorhina) within an Amazonian interfluve. Evolution 2021; 76:455-475. [PMID: 34626500 DOI: 10.1111/evo.14377] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/20/2020] [Revised: 08/25/2021] [Accepted: 09/01/2021] [Indexed: 12/24/2022]
Abstract
Secondary contact between species often results in the formation of a hybrid zone, with the eventual fates of the hybridizing species dependent on evolutionary and ecological forces. We examine this process in the Amazon Basin by conducting the first genomic and phenotypic characterization of the hybrid zone formed after secondary contact between two obligate army-ant-followers: the White-breasted Antbird (Rhegmatorhina hoffmannsi) and the Harlequin Antbird (Rhegmatorhina berlepschi). We found a major geographic displacement (∼120 km) between the mitochondrial and nuclear clines, and we explore potential hypotheses for the displacement, including sampling error, genetic drift, and asymmetric cytonuclear incompatibilities. We cannot exclude roles for sampling error and genetic drift in contributing to the discordance; however, the data suggest expansion and unidirectional introgression of hoffmannsi into the distribution of berlepschi.
Affiliation(s)
- Glaucia Del-Rio
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Marco A Rego
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Bret M Whitney
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Museu de Zoologia, Universidade de São Paulo, São Paulo, SP, 04263-000, Brazil
| | - Fabio Schunck
- Museu de Zoologia, Universidade de São Paulo, São Paulo, SP, 04263-000, Brazil
| | - Luís F Silveira
- Museu de Zoologia, Universidade de São Paulo, São Paulo, SP, 04263-000, Brazil
| | - Brant C Faircloth
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| | - Robb T Brumfield
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, 70803; Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, 70803
| |
|
39
|
Leinonen M, Salmela L. Extraction of long k-mers using spaced seeds. IEEE/ACM Trans Comput Biol Bioinform 2021; PP:1-1. [PMID: 34529572 DOI: 10.1109/tcbb.2021.3113131] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
The extraction of k-mers from reads is an important task in many bioinformatics applications, such as all DNA sequence analysis methods based on de Bruijn graphs. These methods tend to be more accurate when the used k-mers are unique in the analyzed DNA, and thus the use of longer k-mers is preferred. When the read lengths of short read sequencing technologies increase, the error rate will become the determining factor for the largest possible value of k. Here we propose LoMeX which uses spaced seeds to extract long k-mers accurately even in the presence of sequencing errors. Our experiments show that LoMeX can extract long k-mers from current Illumina reads with a similar or higher recall than a standard k-mer counting tool. Furthermore, our experiments on simulated data show that when the read length further increases enabling even longer k-mers, the performance of standard k-mer counters declines, whereas LoMeX still extracts long k-mers successfully.
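The spaced-seed idea mentioned in this abstract can be illustrated briefly. This is a sketch of the general concept only, not LoMeX's algorithm: a spaced seed is a mask of "care" (1) and "don't care" (0) positions, so two windows that differ only at a don't-care position map to the same key and can be grouped despite a sequencing error there. The mask, reads, and function names are hypothetical.

```python
from collections import defaultdict

def spaced_key(window, mask):
    """Keep only the characters at positions where the mask has '1'."""
    return "".join(c for c, m in zip(window, mask) if m == "1")

def group_by_spaced_seed(reads, mask):
    """Group every window of len(mask) in the reads by its spaced-seed key."""
    k = len(mask)
    groups = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            window = read[i:i + k]
            groups[spaced_key(window, mask)].append(window)
    return groups

mask = "1101011"                 # positions 2 and 4 are "don't care"
reads = ["ACGTACG", "ACTTACG"]   # the reads differ only at position 2
groups = group_by_spaced_seed(reads, mask)
```

Both windows share the key "ACTCG", so they land in one group; a consensus over each group could then recover the full-length k-mer.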
|
40
|
Genomic and Experimental Investigations of Auriscalpium and Strobilurus Fungi Reveal New Insights into Pinecone Decomposition. J Fungi (Basel) 2021; 7:jof7080679. [PMID: 34436218 PMCID: PMC8401616 DOI: 10.3390/jof7080679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/13/2021] [Revised: 08/16/2021] [Accepted: 08/19/2021] [Indexed: 11/16/2022] Open
Abstract
Saprophytic fungi (SPF) play vital roles in ecosystem dynamics and decomposition. However, because of the complexity of living systems, our understanding of how SPF interact with each other to decompose organic matter is very limited. Here we studied the roles and interactions of fungi from two genera, Auriscalpium and Strobilurus, which sequentially colonize fallen pinecones of the same plant, in the decomposition of this highly specialized substrate. We obtained the genome sequences from seven fungal species with three pairs: A. orientale-S. luchuensis, A. vulgare-S. stephanocystis and A. microsporum-S. pachycystidiatus/S. orientalis on cones of Pinus yunnanensis, P. sylvestris and P. armandii, respectively, and the organic profiles of substrate during decomposition. Our analyses revealed evidence for both competition and cooperation between the two groups of fungi during decomposition, enabling efficient utilization of substrates with complementary profiles of carbohydrate active enzymes (CAZymes). The Auriscalpium fungi are highly effective at utilizing the primary organic carbon, such as lignin and hemicellulose, in freshly fallen cones, facilitating the invasion and colonization by Strobilurus fungi. The Strobilurus fungi have genes coding for abundant CAZymes to utilize the remaining organic compounds and for producing an arsenal of secondary metabolites such as strobilurins that can inhibit other fungi from colonizing the pinecones.
|
41
|
Functional Insights into the High-Molecular-Mass Penicillin-Binding Proteins of Streptococcus agalactiae Revealed by Gene Deletion and Transposon Mutagenesis Analysis. J Bacteriol 2021; 203:e0023421. [PMID: 34124943 DOI: 10.1128/jb.00234-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/01/2023] Open
Abstract
High-molecular-mass penicillin-binding proteins (PBPs) are enzymes that catalyze the biosynthesis of bacterial cell wall peptidoglycan. The Gram-positive bacterial pathogen Streptococcus agalactiae (group B streptococcus [GBS]) produces five high-molecular-mass PBPs, namely, PBP1A, PBP1B, PBP2A, PBP2B, and PBP2X. Among these, only PBP2X is essential for cell viability, whereas the other four PBPs are individually dispensable. The biological function of the four nonessential PBPs is poorly characterized in GBS. We deleted the pbp1a, pbp1b, pbp2a, and pbp2b genes individually from a genetically well-characterized serotype V GBS strain and studied the phenotypes of the four isogenic mutant strains. Compared to the wild-type parental strain, (i) none of the pbp isogenic mutant strains had a significant growth defect in Todd-Hewitt broth supplemented with 0.2% yeast extract (THY) rich medium, (ii) isogenic mutant Δpbp1a and Δpbp1b strains had significantly increased susceptibility to penicillin and ampicillin, and (iii) isogenic mutant Δpbp1a and Δpbp2b strains had significantly longer chain lengths. Using saturated transposon mutagenesis and transposon insertion site sequencing, we determined the genes essential for the viability of the wild-type GBS strain and each of the four isogenic pbp deletion mutant strains in THY rich medium. The pbp1a gene is essential for cell viability in the pbp2b deletion background. Reciprocally, pbp2b is essential in the pbp1a deletion background. Moreover, the gene encoding RodA, a peptidoglycan polymerase that works in conjunction with PBP2B, is also essential in the pbp1a deletion background. Together, our results suggest functional overlap between PBP1A and the PBP2B-RodA complex in GBS cell wall peptidoglycan biosynthesis. IMPORTANCE High-molecular-mass penicillin-binding proteins (HMM PBPs) are enzymes required for bacterial cell wall biosynthesis. Bacterial pathogen group B streptococcus (GBS) produces five distinct HMM PBPs. 
The biological functions of these proteins are not well characterized in GBS. In this study, we performed a comprehensive deletion analysis of genes encoding HMM PBPs in GBS. We found that deleting certain PBP-encoding genes altered bacterial susceptibility to beta-lactam antibiotics, cell morphology, and the essentiality of other enzymes involved in cell wall peptidoglycan synthesis. The results of our study shed new light on the biological functions of PBPs in GBS.
|
42
|
Thi Phan N, Besnard G, Ouazahrou R, Sánchez WS, Gil L, Manzi S, Bellafiore S. Genome sequence of the coffee root-knot nematode Meloidogyne exigua. J Nematol 2021; 53:e2021-65. [PMID: 34296190 PMCID: PMC8290501 DOI: 10.21307/jofnem-2021-065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2021] [Indexed: 11/16/2022] Open
Abstract
Root-knot nematodes (Meloidogyne spp.) cause serious damage to most crops. Here, we report a high-quality genome sequence of Meloidogyne exigua (population Mex1, Costa Rica), a major pathogen of coffee. Its mitogenome (20,974 bp) was first assembled and annotated. The nuclear genome was then constructed, consisting of 206 contigs, with an N50 length of 1.89 Mb and a total assembly length of 42.1 Mb.
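For readers unfamiliar with the N50 statistic quoted above: it is the contig length L such that contigs of length L or longer together cover at least half of the total assembly. A minimal sketch follows; the function name and the toy lengths are illustrative only and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one contig")

# Toy example: total length 20; contigs 6 and 5 together reach 11 >= 10.
example = n50([2, 3, 4, 5, 6])
```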
Affiliation(s)
- Ngan Thi Phan
- PHIM Plant Health Institute, University of Montpellier, IRD, CIRAD, INRAE, Institut Agro, Montpellier, France
| | - Guillaume Besnard
- CNRS-UPS-IRD, UMR5174, EDB, 118 route de Narbonne, Université Paul Sabatier, 31062 Toulouse, France
| | - Lisa Gil
- US 1426, GeT-PlaGe, Genotoul, INRAE, Castanet-Tolosan, France
| | - Sophie Manzi
- CNRS-UPS-IRD, UMR5174, EDB, 118 route de Narbonne, Université Paul Sabatier, 31062 Toulouse, France
| | - Stéphane Bellafiore
- PHIM Plant Health Institute, University of Montpellier, IRD, CIRAD, INRAE, Institut Agro, Montpellier, France
| |
|
43
|
Aujoulat F, Mazuet C, Criscuolo A, Popoff MR, Enault C, Diancourt L, Jumas-Bilak E, Lavigne JP, Marchandin H. Peptoniphilus nemausensis sp. nov. A new Gram-positive anaerobic coccus isolated from human clinical samples, an emendated description of the genus Peptoniphilus and an evaluation of the taxonomic status of Peptoniphilus species with not validly published names. Syst Appl Microbiol 2021; 44:126235. [PMID: 34385044 DOI: 10.1016/j.syapm.2021.126235] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/15/2021] [Revised: 07/05/2021] [Accepted: 07/09/2021] [Indexed: 11/27/2022]
Abstract
A Gram-positive, anaerobic coccus isolated from a human surgical site infection was previously shown to belong to an unknown species of the genus Peptoniphilus initially proposed as 'Peptoniphilus nemausus' sp. nov., based on both 16S rRNA gene sequence identity of 97.9% with the most closely related species Peptoniphilus coxii and an individualized phylogenetic branching within the genus Peptoniphilus. A polyphasic characterization of the novel species is proposed herein. Whole genome sequence analysis showed an average nucleotide identity value of 84.75% and digital DNA-DNA hybridization value of 28.9% against P. coxii type strain. The strain displayed unique features among members of the genus Peptoniphilus, as it was able to hydrolyze aesculin, and produced acetate as the major metabolic end-product without associated production of butyrate. Growth was observed under microaerophilic conditions. From all these data, the isolate is confirmed as belonging to a new Peptoniphilus species, for which the name Peptoniphilus nemausensis sp. nov. is proposed. The type strain is 1804121828T (=LMG 31466T = CECT 9935T). A database survey using a highly polymorphic partial sequence of the 16S rRNA gene of P. nemausensis revealed P. nemausensis to be a particularly rare skin-associated species in humans. An emendated description of the Peptoniphilus genus is proposed based on a review of the characteristics of the 12 new species with validly published names since the genus description in 2001 and of P. nemausensis. Finally, the relationships between members of the genus Peptoniphilus were explored based on whole genome sequence analysis in order to clarify the taxonomic status of not yet validly published species showing that three pairs of species should be considered as synonyms: Peptoniphilus timonensis and 'Peptoniphilus phoceensis', Peptoniphilus lacydonensis and 'Peptoniphilus rhinitidis', Peptoniphilus tyrrelliae and Peptoniphilus senegalensis.
Affiliation(s)
- Fabien Aujoulat
- HydroSciences Montpellier, CNRS, IRD, Univ Montpellier, Montpellier, France
| | - Christelle Mazuet
- Centre National de Référence bactéries anaérobies et botulisme, Institut Pasteur, Paris, France
| | - Alexis Criscuolo
- Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
| | - Michel R Popoff
- Unité des Toxines Bactériennes, UMR CNRS 2001, Institut Pasteur, Paris, France
| | - Cécilia Enault
- Service de Microbiologie et Hygiène Hospitalière, CHU Nîmes, Nîmes, France
| | - Laure Diancourt
- Centre National de Référence bactéries anaérobies et botulisme, Institut Pasteur, Paris, France
| | - Estelle Jumas-Bilak
- HydroSciences Montpellier, CNRS, IRD, Univ Montpellier, Département d'Hygiène Hospitalière, CHU Montpellier, Montpellier, France
| | - Jean-Philippe Lavigne
- VBIC, INSERM U1047, Univ Montpellier, Service de Microbiologie et Hygiène Hospitalière, CHU Nîmes, Nîmes, France
| | - Hélène Marchandin
- HydroSciences Montpellier, CNRS, IRD, Univ Montpellier, Service de Microbiologie et Hygiène Hospitalière, CHU Nîmes, Nîmes, France.
| |
|
44
|
Burkholderia from Fungus Gardens of Fungus-Growing Ants Produces Antifungals That Inhibit the Specialized Parasite Escovopsis. Appl Environ Microbiol 2021; 87:e0017821. [PMID: 33962985 DOI: 10.1128/aem.00178-21] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/20/2023] Open
Abstract
Within animal-associated microbiomes, the functional roles of specific microbial taxa are often uncharacterized. Here, we use the fungus-growing ant system, a model for microbial symbiosis, to determine the potential defensive roles of key bacterial taxa present in the ants' fungus gardens. Fungus gardens serve as an external digestive system for the ants, with mutualistic fungi in the genus Leucoagaricus converting the plant substrate into energy for the ants. The fungus garden is host to specialized parasitic fungi in the genus Escovopsis. Here, we examine the potential role of Burkholderia spp. that occur within ant fungus gardens in inhibiting Escovopsis. We isolated members of the bacterial genera Burkholderia and Paraburkholderia from 50% of the 52 colonies sampled, indicating that members of the family Burkholderiaceae are common inhabitants in the fungus gardens of a diverse range of fungus-growing ant genera. Using antimicrobial inhibition bioassays, we found that 28 out of 32 isolates inhibited at least one Escovopsis strain with a zone of inhibition greater than 1 cm. Genomic assessment of fungus garden-associated Burkholderiaceae indicated that isolates with strong inhibition all belonged to the genus Burkholderia and contained biosynthetic gene clusters that encoded the production of two antifungals: burkholdine1213 and pyrrolnitrin. Organic extracts of cultured isolates confirmed that these compounds are responsible for antifungal activities that inhibit Escovopsis but, at equivalent concentrations, not Leucoagaricus spp. Overall, these new findings, combined with previous evidence, suggest that members of the fungus garden microbiome play an important role in maintaining the health and function of fungus-growing ant colonies. IMPORTANCE Many organisms partner with microbes to defend themselves against parasites and pathogens. 
Fungus-growing ants must protect Leucoagaricus spp., the fungal mutualist that provides sustenance for the ants, from a specialized fungal parasite, Escovopsis. The ants take multiple approaches, including weeding their fungus gardens to remove Escovopsis spores, as well as harboring Pseudonocardia spp., bacteria that produce antifungals that inhibit Escovopsis. In addition, a genus of bacteria commonly found in fungus gardens, Burkholderia, is known to produce secondary metabolites that inhibit Escovopsis spp. In this study, we isolated Burkholderia spp. from fungus-growing ants, assessed the isolates' ability to inhibit Escovopsis spp., and identified two compounds responsible for inhibition. Our findings suggest that Burkholderia spp. are often found in fungus gardens, adding another possible mechanism within the fungus-growing ant system to suppress the growth of the specialized parasite Escovopsis.
|
45
|
Kallenborn F, Hildebrandt A, Schmidt B. CARE: context-aware sequencing read error correction. Bioinformatics 2021; 37:889-895. [PMID: 32818262 DOI: 10.1093/bioinformatics/btaa738] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/08/2020] [Revised: 07/14/2020] [Accepted: 08/14/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Error correction is a fundamental pre-processing step in many Next-Generation Sequencing (NGS) pipelines, in particular for de novo genome assembly. However, existing error correction methods either suffer from high false-positive rates since they break reads into independent k-mers or do not scale efficiently to large amounts of sequencing reads and complex genomes. RESULTS We present CARE, an alignment-based scalable error correction algorithm for Illumina data using the concept of minhashing. Minhashing allows for efficient similarity search within large sequencing read collections which enables fast computation of high-quality multiple alignments. Sequencing errors are corrected by detailed inspection of the corresponding alignments. Our performance evaluation shows that CARE generates significantly fewer false-positive corrections than state-of-the-art tools (Musket, SGA, BFC, Lighter, Bcool, Karect) while maintaining a competitive number of true positives. When used prior to assembly it can achieve superior de novo assembly results for a number of real datasets. CARE is also the first multiple sequence alignment-based error corrector that is able to process a human genome Illumina NGS dataset in only 4 h on a single workstation using GPU acceleration. AVAILABILITY AND IMPLEMENTATION CARE is open-source software written in C++ (CPU version) and in CUDA/C++ (GPU version). It is licensed under GPLv3 and can be downloaded at https://github.com/fkallen/CARE. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
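The minhashing concept named in this abstract can be sketched in a few lines. This is a generic illustration, not CARE's implementation: each read is reduced to a signature holding, for each of several hash functions, the minimum hash over its k-mers, and the fraction of matching signature entries estimates the Jaccard similarity of the two k-mer sets. Using salted BLAKE2b as the hash family, k = 4, and 16 hashes are assumptions made for this example.

```python
import hashlib

def kmers(seq, k):
    """Return the set of k-mers in seq (assumes len(seq) >= k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(seq, k=4, num_hashes=16):
    """One minimum per salted hash function over the read's k-mers."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(km.encode(), digest_size=8, salt=salt).digest(),
                "big")
            for km in kmers(seq, k)))
    return signature

def estimated_similarity(sig_a, sig_b):
    """Fraction of matching entries, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

read_a = "ACGTTGCATGCCGATA"
read_b = "ACGTTGCATGCCGATT"   # differs from read_a in the last base
same = estimated_similarity(minhash_signature(read_a), minhash_signature(read_a))
close = estimated_similarity(minhash_signature(read_a), minhash_signature(read_b))
```

Reads whose signatures agree in many entries are candidates for alignment, which avoids comparing every pair of reads directly.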
Affiliation(s)
- Felix Kallenborn
- Department of Computer Science, Johannes Gutenberg University, Mainz 55122, Germany
- Andreas Hildebrandt
- Department of Computer Science, Johannes Gutenberg University, Mainz 55122, Germany
- Bertil Schmidt
- Department of Computer Science, Johannes Gutenberg University, Mainz 55122, Germany
46
Kolora SRR, Gysi DM, Schaffer S, Grimm-Seyfarth A, Szabolcs M, Faria R, Henle K, Stadler PF, Schlegel M, Nowick K. Accelerated Evolution of Tissue-Specific Genes Mediates Divergence Amidst Gene Flow in European Green Lizards. Genome Biol Evol 2021; 13:6275683. [PMID: 33988711 PMCID: PMC8382678 DOI: 10.1093/gbe/evab109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/13/2021] [Indexed: 11/12/2022]
Abstract
The European green lizards of the Lacerta viridis complex consist of two closely related species, L. viridis and Lacerta bilineata, which split less than 7 million years ago in the presence of gene flow. Recently, a third lineage, referred to as the “Adriatic”, was described within the L. viridis complex distributed from Slovenia to Greece. However, whether gene flow between the Adriatic lineage and L. viridis or L. bilineata has occurred and the evolutionary processes involved in their diversification are currently unknown. We hypothesized that divergence occurred in the presence of gene flow between multiple lineages and involved tissue-specific gene evolution. In this study, we sequenced the whole genome of an individual of the Adriatic lineage and tested for the presence of gene flow amongst L. viridis, L. bilineata, and Adriatic. Additionally, we sequenced transcriptomes from multiple tissues to understand tissue-specific effects. The species tree supports that the Adriatic lineage is a sister taxon to L. bilineata. We detected gene flow between the Adriatic lineage and L. viridis, suggesting that the evolutionary history of the L. viridis complex is likely shaped by gene flow. Interestingly, we observed topological differences between the autosomal and Z-chromosome phylogenies, with a few fast-evolving genes on the Z-chromosome. Genes highly expressed in the ovaries and strongly co-expressed in the brain experienced accelerated evolution, presumably contributing to establishing reproductive isolation in the L. viridis complex.
Affiliation(s)
- Sree Rohit Raj Kolora
- German Centre for Integrative Biodiversity Research (iDiv) Halle Jena Leipzig, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; Molecular Evolution & Animal Systematics, University of Leipzig, Leipzig, Germany; Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
- Deisy Morselli Gysi
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; Swarm Intelligence and Complex Systems Group, Faculty of Mathematics and Computer Science, University of Leipzig, Leipzig, Germany; Center for Complex Networks Research, Northeastern University, Boston, MA, USA
- Stefan Schaffer
- German Centre for Integrative Biodiversity Research (iDiv) Halle Jena Leipzig, Leipzig, Germany; Molecular Evolution & Animal Systematics, University of Leipzig, Leipzig, Germany
- Annegret Grimm-Seyfarth
- Department of Conservation Biology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
- Márton Szabolcs
- Department of Tisza River Research, Danube Research Institute, Centre for Ecological Research, Hungarian Academy of Sciences, Debrecen, Hungary
- Rui Faria
- CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Laboratório Associado, Universidade do Porto, Vairão, Portugal
- Klaus Henle
- German Centre for Integrative Biodiversity Research (iDiv) Halle Jena Leipzig, Leipzig, Germany; Department of Conservation Biology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
- Peter F Stadler
- German Centre for Integrative Biodiversity Research (iDiv) Halle Jena Leipzig, Leipzig, Germany; Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany; Competence Center for Scalable Data Services and Solutions Dresden/Leipzig, Universität Leipzig, Leipzig, Germany; Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany; Department of Theoretical Chemistry, University of Vienna, Wien, Austria; Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia; Santa Fe Institute, New Mexico, USA
- Martin Schlegel
- German Centre for Integrative Biodiversity Research (iDiv) Halle Jena Leipzig, Leipzig, Germany; Molecular Evolution & Animal Systematics, University of Leipzig, Leipzig, Germany
- Katja Nowick
- Institute for Biology, Freie Universität Berlin, Berlin, Germany
47
Nagy NA, Rácz R, Rimington O, Póliska S, Orozco-terWengel P, Bruford MW, Barta Z. Draft genome of a biparental beetle species, Lethrus apterus. BMC Genomics 2021; 22:301. [PMID: 33902445 PMCID: PMC8074431 DOI: 10.1186/s12864-021-07627-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2020] [Accepted: 04/13/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND: The lack of an understanding of the genomic architecture underpinning parental behaviour in subsocial insects displaying simple parental behaviours prevents the development of a full understanding of the evolutionary origin of sociality. Lethrus apterus is one of the few insect species that has biparental care. Division of labour can be observed between parents during the reproductive period in order to provide food and protection for their offspring. RESULTS: Here, we report the draft genome of L. apterus, the first genome in the family Geotrupidae. The final assembly consisted of 286.93 Mbp in 66,933 scaffolds. Completeness analysis found the assembly contained 93.5% of the Endopterygota core BUSCO gene set. Ab initio gene prediction resulted in 25,385 coding genes, whereas homology-based analyses predicted 22,551 protein-coding genes. After merging, 20,734 were found during functional annotation. Compared to other publicly available beetle genomes, 23,528 genes among the predicted genes were assigned to orthogroups, of which 1664 were in species-specific groups. Additionally, reproduction-related genes were found among the predicted genes, and on this basis a reduction in the number of odorant- and pheromone-binding proteins was detected. CONCLUSIONS: These genes can be used in further comparative and functional genomic research, which can advance our understanding of the genetic basis and hence the evolution of parental behaviour.
Affiliation(s)
- Nikoletta A Nagy
- MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Rita Rácz
- MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Szilárd Póliska
- Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Zoltán Barta
- MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
48
Kämpfer P, Glaeser SP, McInroy JA, Clermont D, Criscuolo A, Busse HJ. Pseudomonas carbonaria sp. nov., isolated from charcoal. Int J Syst Evol Microbiol 2021; 71. [PMID: 33835910 DOI: 10.1099/ijsem.0.004750] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/22/2023]
Abstract
A beige-pigmented, oxidase-positive bacterial isolate, Wesi-4T, isolated from charcoal in 2012, was examined in detail by applying a polyphasic taxonomic approach. Cells of the isolate were rod shaped and Gram-stain negative. Examination of the 16S rRNA gene sequence of the isolate revealed highest sequence similarities to the type strains of Pseudomonas matsuisoli and Pseudomonas nosocomialis (both 97.3 %). Phylogenetic analyses on the basis of the 16S rRNA gene sequences indicated a separate position of Wesi-4T, which was confirmed by multilocus sequence analyses (MLSA) based on the three loci gyrB, rpoB and rpoD and a core genome-based phylogenetic tree. Genome sequence-based comparison of Wesi-4T and the type strains of P. matsuisoli and P. nosocomialis yielded average nucleotide identity values <95 % and in silico DNA-DNA hybridization values <70 %, respectively. The polyamine pattern contains the major amines putrescine, cadaverine and spermidine. The quinone system contains predominantly ubiquinone Q-9, and in the polar lipid profile diphosphatidylglycerol, phosphatidylglycerol and phosphatidylethanolamine are the major lipids. The fatty acid profile contains predominantly C16 : 0, summed feature 3 (C16 : 1ω7c and/or C16 : 1ω6c) and summed feature 8 (C18 : 1ω7c and/or C18 : 1ω6c). In addition, physiological and biochemical tests revealed a clear phenotypic difference from P. matsuisoli. These cumulative data indicate that the isolate represents a novel species of the genus Pseudomonas, for which the name Pseudomonas carbonaria sp. nov. is proposed, with Wesi-4T (=DSM 110367T=CIP 111764T=CCM 9017T) as the type strain.
Affiliation(s)
- Peter Kämpfer
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- S P Glaeser
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- John A McInroy
- Department of Entomology and Plant Pathology, Auburn University, Alabama, USA
- Alexis Criscuolo
- Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
- Hans-Jürgen Busse
- Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
49
Heo Y, Manikandan G, Ramachandran A, Chen D. Comprehensive Evaluation of Error-Correction Methodologies for Genome Sequencing Data. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
50
Kämpfer P, Busse HJ, Glaeser SP, Clermont D, Criscuolo A, Mietke H. Jeotgalicoccus meleagridis sp. nov. isolated from bioaerosol from emissions of a turkey fattening plant and reclassification of Jeotgalicoccus halophilus Liu et al. 2011 as a later heterotypic synonym of Jeotgalicoccus aerolatus Martin et al. 2011. Int J Syst Evol Microbiol 2021; 71. [PMID: 33724175 DOI: 10.1099/ijsem.0.004745] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022]
Abstract
A Gram-stain-positive, non-motile, non-spore-forming coccus (strain Do 184T) was isolated from exhaust air of a turkey fattening plant on mannitol salt agar. The strain shared high 16S rRNA gene sequence similarity to the type strains of Jeotgalicoccus aerolatus (98.0%), followed by Jeotgalicoccus marinus (97.2%) and Jeotgalicoccus huakuii (97.1%). All other 16S rRNA gene sequence similarities to species of the genus Jeotgalicoccus were below 97%. The average nucleotide identities (ANI) between the Do 184T genome assembly and those of type strains of species of the genus Jeotgalicoccus were far below the 95% species delineation cutoff value, ranging from 79.47% (J. marinus DSM 19772T) to 75.30% (J. pinnipedialis CIP 107946T). The quinone system of Do 184T, the polar lipid profile, the polyamine pattern and the fatty acid profile were in congruence with those reported for other species of the genus Jeotgalicoccus and thus supported the affiliation of Do 184T to this genus. Do 184T represents a novel species, for which the name Jeotgalicoccus meleagridis sp. nov. is proposed, with the type strain Do 184T (=LMG 31100T=CCM 8918T=CIP 111649T). In addition, data on genome sequences of Jeotgalicoccus halophilus C1-52T=CGMCC 1.8911T=NBRC 105788T and Jeotgalicoccus aerolatus MPA-33T=CCM 7679T=CCUG 57953T=DSM 22420T=CIP 111750T indicate that both isolates represent the same species. Pairwise ANI between the genomes of these two strains yielded similarities of 98.98-99.05 %. These results indicate that these strains represent members of the same species. Due to priority of publication, it is proposed that Jeotgalicoccus halophilus is reclassified as Jeotgalicoccus aerolatus.
Affiliation(s)
- Peter Kämpfer
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- Hans-Jürgen Busse
- Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
- Stefanie P Glaeser
- Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- Alexis Criscuolo
- Hub de Bioinformatique et Biostatistique ‒ Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
- Henriette Mietke
- Staatliche Betriebsgesellschaft für Umwelt und Landwirtschaft, D-01683 Nossen, Germany