26
Kariithi HM, Volkening JD, Chiwanga GH, Pantin-Jackwood MJ, Msoffe PLM, Suarez DL. Genome Sequences and Characterization of Chicken Astrovirus and Avian Nephritis Virus from Tanzanian Live Bird Markets. Viruses 2023; 15:1247. [PMID: 37376547 DOI: 10.3390/v15061247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 05/22/2023] [Accepted: 05/24/2023] [Indexed: 06/29/2023]
Abstract
The enteric chicken astrovirus (CAstV) and avian nephritis virus (ANV) are the type species of the genus Avastrovirus (AAstV; Astroviridae family), capable of causing considerable production losses in poultry. Using next-generation sequencing of a cloacal swab from a backyard chicken in Tanzania, we assembled genome sequences of ANV and CAstV (6918 nt and 7318 nt in length, respectively, excluding poly(A) tails), which have a typical AAstV genome architecture (5'-UTR-ORF1a-ORF1b-ORF2-3'-UTR). They are most similar to strains ck/ANV/BR/RS/6R/15 (82.72%) and ck/CAstV/PL/G059/14 (82.23%), respectively. Phylogenetic and sequence analyses of the genomes and the three open reading frames (ORFs) grouped the Tanzanian ANV and CAstV strains with Eurasian ANV-5 and CAstV-Aii viruses, respectively. Compared to other AAstVs, the Tanzanian strains have numerous amino acid variations (substitutions, insertions and deletions) in the spike region of the capsid protein. Furthermore, CAstV-A has a 4018 nt recombinant fragment in the ORF1a/1b genomic region, predicted to be from Eurasian CAstV-Bi and Bvi parental strains. These data should inform future epidemiological studies and options for AAstV diagnostics and vaccines.
27
Cai X, Lan T, Ping P, Oliver B, Li J. Intra-Host Co-Existing Strains of SARS-CoV-2 Reference Genome Uncovered by Exhaustive Computational Search. Viruses 2023; 15:v15051065. [PMID: 37243151 DOI: 10.3390/v15051065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 04/24/2023] [Accepted: 04/24/2023] [Indexed: 05/28/2023]
Abstract
The COVID-19 pandemic caused by SARS-CoV-2 has had a severe impact on people worldwide. The reference genome of the virus has been widely used as a template for designing mRNA vaccines to combat the disease. In this study, we present a computational method aimed at identifying co-existing intra-host strains of the virus from RNA-sequencing data of short reads that were used to assemble the original reference genome. Our method consisted of five key steps: extraction of relevant reads, error correction for the reads, identification of within-host diversity, phylogenetic study, and protein binding affinity analysis. Our study revealed that multiple strains of SARS-CoV-2 can coexist in both the viral sample used to produce the reference sequence and a wastewater sample from California. Additionally, our workflow demonstrated its capability to identify within-host diversity in foot-and-mouth disease virus (FMDV). Through our research, we were able to shed light on the binding affinity and phylogenetic relationships of these strains with the published SARS-CoV-2 reference genome, SARS-CoV, variants of concern (VOC) of SARS-CoV-2, and some closely related coronaviruses. These insights have important implications for future research efforts aimed at identifying within-host diversity, understanding the evolution and spread of these viruses, as well as the development of effective treatments and vaccines against them.
28
Niu Y, Zhang T, Chen M, Chen G, Liu Z, Yu R, Han X, Chen K, Huang A, Chen C, Yang Y. Analysis of the Complete Mitochondrial Genome of the Bitter Gourd (Momordica charantia). PLANTS (BASEL, SWITZERLAND) 2023; 12:1686. [PMID: 37111909 PMCID: PMC10143269 DOI: 10.3390/plants12081686] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/13/2023] [Revised: 03/16/2023] [Accepted: 04/06/2023] [Indexed: 06/19/2023]
Abstract
Bitter gourd (Momordica charantia L.) is a significant vegetable. Although it has a special bitter taste, it is still popular with the public. The industrialization of bitter gourd could be hampered by a lack of genetic resources. The bitter gourd's mitochondrial and chloroplast genomes have not been extensively studied. In the present study, the mitochondrial genome of bitter gourd was sequenced and assembled, and its substructure was investigated. The mitochondrial genome of bitter gourd is 331,440 bp with 24 unique core genes, 16 variable genes, 3 rRNAs, and 23 tRNAs. We identified 134 SSRs and 15 tandem repeats in the entire mitochondrial genome of bitter gourd. Moreover, 402 pairs of repeats with a length greater than or equal to 30 were observed in total. The longest palindromic repeat was 523 bp, and the longest forward repeat was 342 bp. We found 20 homologous DNA fragments in bitter gourd, with a total insert length of 19,427 bp, accounting for 5.86% of the mitochondrial genome. We predicted a total of 447 potential RNA editing sites in 39 unique PCGs and also discovered that the ccmFN gene was edited most often, at 38 sites. This study provides a basis for a better understanding and analysis of differences in the evolution and inheritance patterns of cucurbit mitochondrial genomes.
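The 5.86% figure quoted in this abstract can be checked directly from the two lengths it reports. The snippet below is only an arithmetic sketch; the function name `percent_of_genome` is introduced here for illustration and does not come from the paper.

```python
def percent_of_genome(insert_bp, genome_bp):
    """Return the share of the genome covered by the inserts, in percent."""
    return 100.0 * insert_bp / genome_bp


# Values taken from the abstract: 19,427 bp of homologous fragments
# in a 331,440 bp mitochondrial genome.
share = percent_of_genome(19_427, 331_440)
print(f"{share:.2f}%")
```

Rounded to two decimals, the result agrees with the percentage stated in the abstract.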
29
Akai K, Asano K, Suzuki C, Shimosaka E, Tamiya S, Suzuki T, Takeuchi T, Ohki T. De novo genome assembly of the partial homozygous dihaploid potato identified PVY resistance gene (Rychc) derived from Solanum chacoense. BREEDING SCIENCE 2023; 73:168-179. [PMID: 37404346 PMCID: PMC10316315 DOI: 10.1270/jsbbs.22078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Accepted: 12/11/2022] [Indexed: 07/06/2023]
Abstract
The isolation of disease resistance genes introduced from wild or related cultivated species is essential for understanding their mechanisms, spectrum and risk of breakdown. To identify target genes not included in reference genomes, genomic sequences with the target locus must be reconstructed. However, de novo assembly approaches of the entire genome, such as those used for constructing reference genomes, are complicated in higher plants. Moreover, in the autotetraploid potato, the heterozygous regions and repetitive structures located around disease resistance gene clusters fragment the genomes into short contigs, making it challenging to identify resistance genes. In this study, we report that a de novo assembly approach of a target gene-specific homozygous dihaploid developed through haploid induction was suitable for gene isolation in potatoes using the potato virus Y resistance gene Rychc as a model. The assembled contig containing Rychc-linked markers was 3.3 Mb in length and could be joined with gene location information from the fine mapping analysis. Rychc was successfully identified in a repeated island located on the distal end of the long arm of chromosome 9 as a Toll/interleukin-1 receptor-nucleotide-binding site-leucine rich repeat (TIR-NBS-LRR) type resistance gene. This approach will be practical for other gene isolation projects in potatoes.
30
Black AN, Bondo KJ, Mularo A, Hernandez A, Yu Y, Stein CM, Gregory A, Fricke KA, Prendergast J, Sullins D, Haukos D, Whitson M, Grisham B, Lowe Z, DeWoody JA. A highly-contiguous and annotated genome assembly of the Lesser Prairie-Chicken (Tympanuchus pallidicinctus). Genome Biol Evol 2023; 15:7077021. [PMID: 36916502 PMCID: PMC10118296 DOI: 10.1093/gbe/evad043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 03/01/2023] [Accepted: 03/04/2023] [Indexed: 03/14/2023]
Abstract
The Lesser Prairie-Chicken (Tympanuchus pallidicinctus; LEPC) is an iconic North American prairie grouse, renowned for ornate and spectacular breeding season displays. Unfortunately, the species has disappeared across much of its historical range, with corresponding precipitous declines in contemporary population abundance, largely due to climatic and anthropogenic factors. These declines led to a 2022 U.S. Fish and Wildlife decision to identify and list two Distinct Population Segments (i.e., Northern and Southern DPSs) as threatened or endangered under the 1973 Endangered Species Act. Herein, we describe an annotated reference genome that was generated from a LEPC sample collected from the Southern DPS. We chose a representative from the Southern DPS because of the potential for introgression in the Northern DPS, where some populations hybridize with the Greater Prairie-Chicken (Tympanuchus cupido). This new LEPC reference assembly consists of 206 scaffolds, an N50 of 45 Mb, and 15,563 predicted protein-coding genes. We demonstrate the utility of this new genome assembly by estimating genome-wide heterozygosity in a representative LEPC and in related species. Heterozygosity in a LEPC sample was 0.0024, near the middle of the range (0.0003-0.0050) of related species. Overall, this new assembly provides a valuable resource that will enhance evolutionary and conservation genetic research in prairie grouse.
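For readers unfamiliar with the N50 statistic cited in this abstract, the following sketch shows the usual definition: the length of the scaffold at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly size. The scaffold lengths used are toy values chosen for illustration, not data from the LEPC assembly, and the function name `n50` is an assumption of this sketch.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to at least
        # half of the total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_scaffolds)
print(toy_n50)
```

With the toy values, the total is 30, so the running sum reaches the halfway point of 15 at the second-longest scaffold.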
31
Kang P, Yoo YH, Kim DI, Yim JH, Lee H. De Novo Transcriptome Assembly and Comparative Analysis of Differentially Expressed Genes Involved in Cold Acclimation and Freezing Tolerance of the Arctic Moss Aulacomnium turgidum (Wahlenb.) Schwaegr. PLANTS (BASEL, SWITZERLAND) 2023; 12:1250. [PMID: 36986936 PMCID: PMC10054522 DOI: 10.3390/plants12061250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 03/06/2023] [Accepted: 03/07/2023] [Indexed: 06/18/2023]
Abstract
Cold acclimation refers to a phenomenon in which plants become more tolerant to freezing after exposure to non-lethal low temperatures. Aulacomnium turgidum (Wahlenb.) Schwaegr is a moss found in the Arctic that can be used to study the freezing tolerance of bryophytes. To improve our understanding of the cold acclimation effect on the freezing tolerance of A. turgidum, we compared the electrolyte leakage of protonema grown at 25 °C (non-acclimation; NA) and at 4 °C (cold acclimation; CA). Freezing damage was significantly lower in CA plants frozen at -12 °C (CA-12) than in NA plants frozen at -12 °C (NA-12). During recovery at 25 °C, CA-12 demonstrated a more rapid and greater level of the maximum photochemical efficiency of photosystem II than NA-12, indicating a greater recovery capacity for CA-12 compared to NA-12. For the comparative analysis of the transcriptome between NA-12 and CA-12, six cDNA libraries were constructed in triplicate, and RNA-seq reads were assembled into 45,796 unigenes. The differential gene expression analysis showed that a significant number of AP2 transcription factor genes and pentatricopeptide repeat protein-coding genes related to abiotic stress and the sugar metabolism pathway were upregulated in CA-12. Furthermore, starch and maltose concentrations increased in CA-12, suggesting that cold acclimation increases freezing tolerance and protects photosynthetic efficiency through the accumulation of starch and maltose in A. turgidum. A de novo assembled transcriptome can be used to explore genetic sources in non-model organisms.
32
Mak QXC, Wick RR, Holt JM, Wang JR. Polishing De Novo Nanopore Assemblies of Bacteria and Eukaryotes With FMLRC2. Mol Biol Evol 2023; 40:7069220. [PMID: 36869750 PMCID: PMC10015616 DOI: 10.1093/molbev/msad048] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/14/2022] [Revised: 01/20/2023] [Accepted: 02/21/2023] [Indexed: 03/05/2023]
Abstract
As the accuracy and throughput of nanopore sequencing improve, it is increasingly common to perform long-read first de novo genome assemblies followed by polishing with accurate short reads. We briefly introduce FMLRC2, the successor to the original FM-index Long Read Corrector (FMLRC), and illustrate its performance as a fast and accurate de novo assembly polisher for both bacterial and eukaryotic genomes.
33
Liang J, Kong L, Hu X, Fu C, Bai S. Chromosomal-level genome assembly of the high-quality Xian/Indica rice (Oryza sativa L.) Xiangyaxiangzhan. BMC PLANT BIOLOGY 2023; 23:94. [PMID: 36782126 PMCID: PMC9926808 DOI: 10.1186/s12870-023-04114-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 02/10/2023] [Indexed: 06/18/2023]
Abstract
The indica rice variety XYXZ carries elite traits including appearance and eating quality. Here, we report the de novo assembly of XYXZ using Illumina paired-end whole-genome shotgun sequencing and Nanopore sequencing. We annotated 39,722 protein-coding genes in the 395.04 Mb assembly. In comparison to other cultivars, XYXZ showed a larger gene size, including the transcripts and introns, and more exons per gene. Hundreds of ultra-long genes were also detected. A total of 4362 complete LTRs were annotated, and among them, many were located next to or in protein-coding genes, including several genes related to rice quality. We observed different distributions of LTRs in these genes among XYXZ, Nipponbare, and R498, implying that these LTRs might affect the expression of the proximal genes and rice quality. Overall, this chromosome-length genome assembly of XYXZ provides a valuable resource for gene discovery, genetic variation and evolution, and the breeding of high-quality rice.
34
Nowlan JP, Sies AN, Britney SR, Cameron ADS, Siah A, Lumsden JS, Russell S. Genomics of Tenacibaculum Species in British Columbia, Canada. Pathogens 2023; 12:pathogens12010101. [PMID: 36678448 PMCID: PMC9864904 DOI: 10.3390/pathogens12010101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 12/30/2022] [Accepted: 01/04/2023] [Indexed: 01/11/2023]
Abstract
Tenacibaculum is a genus of Gram-negative filamentous bacteria with a cosmopolitan distribution. The research describing Tenacibaculum genomes stems primarily from Norway and Chile due to their impacts on salmon aquaculture. Canadian salmon aquaculture also experiences mortality events related to the presence of Tenacibaculum spp., yet no Canadian Tenacibaculum genomes are publicly available. Ribosomal DNA sequencing of 16S and four species-specific 16S quantitative-PCR assays were used to select isolates cultured from Atlantic salmon with mouthrot in British Columbia (BC), Canada. Ten isolates representing four known and two unknown species of Tenacibaculum were selected for shotgun whole genome sequencing using the Oxford Nanopore's MinION platform. The genome assemblies achieved closed circular chromosomes for seven isolates and long contigs for the remaining three isolates. Average nucleotide identity analysis identified T. ovolyticum, T. maritimum, T. dicentrarchi, two genomovars of T. finnmarkense, and two proposed novel species T. pacificus sp. nov. type strain 18-2881-AT and T. retecalamus sp. nov. type strain 18-3228-7BT. Annotation in most of the isolates predicted putative virulence and antimicrobial resistance genes, most-notably toxins (i.e., hemolysins), type-IX secretion systems, and oxytetracycline resistance. Comparative analysis with the T. maritimum type-strain predicted additional toxins and numerous C-terminal secretion proteins, including an M12B family metalloprotease in the T. maritimum isolates from BC. The genomic prediction of virulence-associated genes provides important targets for studies of mouthrot disease, and the annotation of the antimicrobial resistance genes provides targets for surveillance and diagnosis in veterinary medicine.
35
CRISPR/Cas9-Mediated Enrichment Coupled to Nanopore Sequencing Provides a Valuable Tool for the Precise Reconstruction of Large Genomic Target Regions. Int J Mol Sci 2023; 24:ijms24021076. [PMID: 36674592 PMCID: PMC9863143 DOI: 10.3390/ijms24021076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/30/2022] [Revised: 12/23/2022] [Accepted: 12/24/2022] [Indexed: 01/09/2023]
Abstract
Complete and accurate identification of genetic variants associated with specific phenotypes can be challenging when there is a high level of genomic divergence between individuals in a study and the corresponding reference genome. We have applied the Cas9-mediated enrichment coupled to nanopore sequencing to perform a targeted de novo assembly and accurately reconstruct a genomic region of interest. This approach was used to reconstruct a 250-kbp target region on chromosome 5 of the common bean genome (Phaseolus vulgaris) associated with the shattering phenotype. Comparing a non-shattering cultivar (Midas) with the reference genome revealed many single-nucleotide variants and structural variants in this region. We cut five 50-kbp tiled sub-regions of Midas genomic DNA using Cas9, followed by sequencing on a MinION device and de novo assembly, generating a single contig spanning the whole 250-kbp region. This assembly increased the number of Illumina reads mapping to genes in the region, improving their genotypability for downstream analysis. The Cas9 tiling approach for target enrichment and sequencing is a valuable alternative to whole-genome sequencing for the assembly of ultra-long regions of interest, improving the accuracy of downstream genotype-phenotype association analysis.
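The tiling idea described in this abstract (a 250-kbp target cut into five 50-kbp sub-regions) can be sketched as a simple coordinate calculation. This is only an illustration: the region start of 0 is a placeholder rather than the real chromosome 5 coordinate, the tiles are assumed to be adjacent and non-overlapping, and the function name `tile_region` is hypothetical. The abstract does not state how the actual cut sites were chosen.

```python
def tile_region(start, length, tile_size):
    """Split [start, start + length) into consecutive tiles of tile_size.

    Returns a list of (tile_start, tile_end) half-open intervals.
    The last tile is shortened if length is not a multiple of tile_size.
    """
    tiles = []
    pos = start
    end = start + length
    while pos < end:
        tile_end = min(pos + tile_size, end)
        tiles.append((pos, tile_end))
        pos = tile_end
    return tiles


# Placeholder start coordinate; sizes follow the abstract (250 kbp, 50 kbp).
tiles = tile_region(start=0, length=250_000, tile_size=50_000)
for tile_start, tile_end in tiles:
    print(tile_start, tile_end)
```

In practice, guide RNA positions would depend on the local sequence, so real tile boundaries would likely deviate from this even spacing.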
36
Tang YL, Kong YH, Qin S, Merchant A, Shi JZ, Zhou XG, Li MW, Wang Q. Transcriptomic dissection of termite gut microbiota following entomopathogenic fungal infection. Front Physiol 2023; 14:1194370. [PMID: 37153226 PMCID: PMC10161392 DOI: 10.3389/fphys.2023.1194370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 04/14/2023] [Indexed: 05/09/2023]
Abstract
Termites are social insects that live in the soil or in decaying wood, where exposure to pathogens should be common. However, these pathogens rarely cause mortality in established colonies. In addition to social immunity, the gut symbionts of termites are expected to assist in protecting their hosts, though the specific contributions are unclear. In this study, we examined this hypothesis in Odontotermes formosanus, a fungus-growing termite in the family Termitidae, by 1) disrupting its gut microbiota with the antibiotic kanamycin, 2) challenging O. formosanus with the entomopathogenic fungus Metarhizium robertsii, and finally 3) sequencing the resultant gut transcriptomes. As a result, 142,531 transcripts and 73,608 unigenes were obtained, and unigenes were annotated against the NR, NT, KO, Swiss-Prot, PFAM, GO, and KOG databases. Among them, a total of 3,814 differentially expressed genes (DEGs) were identified between M. robertsii-infected termites with or without antibiotic treatment. Given the lack of annotated genes in O. formosanus transcriptomes, we examined the expression profiles of the top 20 most significantly differentially expressed genes using qRT-PCR. Several of these genes, including APOA2, Calpain-5, and Hsp70, were downregulated in termites exposed to both antibiotics and pathogen but upregulated in those exposed only to the pathogen, suggesting that gut microbiota might buffer their hosts against infection by fine-tuning physiological and biochemical processes, including innate immunity, protein folding, and ATP synthesis. Overall, our combined results imply that stabilization of gut microbiota can assist termites in maintaining physiological and biochemical homeostasis when foreign pathogenic fungi invade.
37
Wang Y, Li F, Zhang F, Wu L, Xu N, Sun Q, Chen H, Yu Z, Lu J, Jiang K, Wang X, Wen S, Zhou Y, Zhao H, Jiang Q, Wang J, Jia R, Sun J, Tang L, Xu H, Hu W, Xu Z, Chen W, Guo A, Xu Q. Time-ordering japonica/geng genomes analysis indicates the importance of large structural variants in rice breeding. PLANT BIOTECHNOLOGY JOURNAL 2023; 21:202-218. [PMID: 36196761 PMCID: PMC9829401 DOI: 10.1111/pbi.13938] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/09/2022] [Revised: 08/23/2022] [Accepted: 09/29/2022] [Indexed: 06/16/2023]
Abstract
Temperate japonica/geng (GJ) rice yield has significantly improved due to intensive breeding efforts, dramatically enhancing global food security. However, little is known about the underlying genomic structural variations (SVs) responsible for this improvement. In the present study, we compared 58 long-read assemblies comprising cultivated and wild rice species, revealing 156 319 SVs. The phylogenomic analysis based on the SV dataset detected the putatively selected region of GJ sub-populations. A significant portion of the detected SVs that overlapped with genic regions were found to influence the expression of the genes involved within GJ assemblies. Integrating the SVs and causal genetic variants underlying agronomic traits into the analysis enables the precise identification of breeding signatures resulting from complex breeding histories aimed at stress tolerance, yield potential and quality improvement. Further, the results provided genomic and genetic evidence that the SV in the promoter of LTG1 accounts for chilling sensitivity, and that the increased copy numbers of GNP1 were associated with positive effects on grain number. In summary, the current study provides genomic resources for retracing the properties of SV-shaped agronomic traits during previous breeding procedures, which will assist future genetic, genomic and breeding research on rice.
38
Comparison of Long-Read Methods for Sequencing and Assembly of Lepidopteran Pest Genomes. Int J Mol Sci 2022; 24:ijms24010649. [PMID: 36614092 PMCID: PMC9820851 DOI: 10.3390/ijms24010649] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2022] [Revised: 12/15/2022] [Accepted: 12/24/2022] [Indexed: 01/03/2023]
Abstract
Lepidopteran species are mostly pests, causing serious annual economic losses. High-quality genome sequencing and assembly uncover the genetic foundation of pest occurrence and provide guidance for pest control measures. Advances in long-read sequencing technology and assembly algorithms have improved the ability to produce high-quality genomes in a timely manner. Lepidoptera includes a wide variety of insects with high genetic diversity and heterozygosity. Therefore, the selection of an appropriate sequencing and assembly strategy to obtain high-quality genomic information is urgently needed. This research used silkworm as a model to test genome sequencing and assembly through high-coverage datasets by de novo assemblies. We report the first nearly complete telomere-to-telomere reference genome of silkworm Bombyx mori (P50T strain) produced by Pacific Biosciences (PacBio) HiFi sequencing, and highly contiguous and complete genome assemblies of two other silkworm strains by Oxford Nanopore Technologies (ONT) or PacBio continuous long-reads (CLR) that were unrepresented in the public database. Assembly quality was evaluated using BUSCO, Inspector, and EagleC. It is necessary to choose an appropriate assembler for draft genome construction, especially for low-depth datasets. For PacBio CLR and ONT sequencing, NextDenovo is superior. For PacBio HiFi sequencing, hifiasm is better. Quality assessment is essential for genome assembly and can provide better and more accurate results. For chromosome-level high-quality genome construction, we recommend using 3D-DNA with EagleC evaluation. Our study provides a reference for obtaining and evaluating high-quality genome assemblies, and is a resource for biological control, comparative genomics, and evolutionary studies of Lepidopteran pests and related species.
39
Liu C, Wang Y, Peng J, Fan B, Xu D, Wu J, Cao Z, Gao Y, Wang X, Li S, Su Q, Zhang Z, Wang S, Wu X, Shang Q, Shi H, Shen Y, Wang B, Tian J. High-quality genome assembly and pan-genome studies facilitate genetic discovery in mung bean and its improvement. PLANT COMMUNICATIONS 2022; 3:100352. [PMID: 35752938 PMCID: PMC9700124 DOI: 10.1016/j.xplc.2022.100352] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/14/2022] [Revised: 05/31/2022] [Accepted: 06/22/2022] [Indexed: 05/29/2023]
Abstract
Mung bean is an economically important legume crop species that is used as a food, consumed as a vegetable, and used as an ingredient and even as a medicine. To explore the genomic diversity of mung bean, we assembled a high-quality reference genome (Vrad_JL7) that was ∼479.35 Mb in size, with a contig N50 length of 10.34 Mb. A total of 40,125 protein-coding genes were annotated, representing ∼96.9% of the genetic region. We also sequenced 217 accessions, mainly landraces and cultivars from China, and identified 2,229,343 high-quality single-nucleotide polymorphisms (SNPs). Population structure revealed that the Chinese accessions diverged into two groups and were distinct from non-Chinese lines. Genetic diversity analysis based on genomic data from 750 accessions in 23 countries supported the hypothesis that mung bean was first domesticated in south Asia and introduced to east Asia probably through the Silk Road. We constructed the first pan-genome of mung bean germplasm and assembled 287.73 Mb of non-reference sequences. Among the genes, 83.1% were core genes and 16.9% were variable. Presence/absence variation (PAV) events of nine genes involved in the regulation of the photoperiodic flowering pathway were identified as being under selection during the adaptation process to promote early flowering in the spring. Genome-wide association studies (GWASs) revealed 2,912 SNPs and 259 gene PAV events associated with 33 agronomic traits, including a SNP in the coding region of the SWEET10 homolog (jg24043) involved in crude starch content and a PAV event in a large fragment containing 11 genes for color-related traits. This high-quality reference genome and pan-genome will provide insights into mung bean breeding.
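The core/variable split reported in this abstract comes from a gene presence/absence matrix. A minimal sketch of that kind of classification is shown below, under the common assumption that a core gene is one present in every accession and a variable gene is one missing from at least one. The paper's exact thresholds are not given in the abstract, and the gene and accession names here are invented toy data.

```python
def classify_genes(presence, accessions):
    """Split genes into core (present in all accessions) and variable.

    presence: dict mapping gene name -> set of accessions carrying it.
    accessions: set of all accession names in the panel.
    """
    core = sorted(g for g, carriers in presence.items() if accessions <= carriers)
    core_set = set(core)
    variable = sorted(g for g in presence if g not in core_set)
    return core, variable


# Hypothetical toy panel, not data from the study.
accessions = {"acc1", "acc2", "acc3"}
presence = {
    "geneA": {"acc1", "acc2", "acc3"},
    "geneB": {"acc1", "acc2"},
    "geneC": {"acc1", "acc2", "acc3"},
    "geneD": {"acc3"},
}
core, variable = classify_genes(presence, accessions)
core_share = 100.0 * len(core) / len(presence)
print(core, variable, f"{core_share:.1f}% core")
```

Real pan-genome analyses often relax the definition (for example, treating genes present in nearly all accessions as core), so the proportions depend on the chosen cutoff.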
40
Moglad E, Alanazi N, Altayb HN. Genomic Study of Chromosomally and Plasmid-Mediated Multidrug Resistance and Virulence Determinants in Klebsiella Pneumoniae Isolates Obtained from a Tertiary Hospital in Al-Kharj, KSA. Antibiotics (Basel) 2022; 11:1564. [PMID: 36358219 PMCID: PMC9686629 DOI: 10.3390/antibiotics11111564] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/09/2022] [Revised: 10/28/2022] [Accepted: 11/03/2022] [Indexed: 07/21/2023]
Abstract
Klebsiella pneumoniae is an emergent pathogen causing respiratory tract, bloodstream, and urinary tract infections in humans. This study reports the genomic sequence data and the genotypic and phenotypic characterization of K. pneumoniae clinically isolated from Al-Kharj, KSA. Whole-genome analysis of four K. pneumoniae strains was performed, including de novo assembly, functional annotation, whole-genome phylogenetic analysis, antibiotic resistance gene identification, prophage region detection, virulence factor identification, and pan-genome analysis. The results showed that K6 and K7 strains were MDR and ESBL producers, K16 was an ESBL producer, and K8 was sensitive to all tested drugs except ampicillin. K6 and K7 were identified with sequence type (ST) 23, while K16 and K8 were identified with STs 353 and 592, respectively. K6 and K7 were identified with the K1 (wzi1 genotype) capsule and O1 serotype, while K8 was identified with the K57 (wzi206 genotype) capsule and O3b. K6 isolates harbored 10 antimicrobial resistance genes (ARGs) associated with four different plasmids; the chloramphenicol acetyltransferase (catB3), blaOXA-1 and aac(6')-Ib-cr genes were detected in plasmid pB-8922_OXA-48. K6 and K7 also carried a similar gene cassette in plasmid pC1K6P0122-2; the cassette comprised the trimethoprim resistance gene (dfrA14), integron integrase (IntI1), insertion sequence (IS1), transposase protein, and replication initiation protein (RepE). Two hypervirulent plasmids were reported in isolates K6 and K7 that carried synthesis genes (iucA, iucB, iucC, iucD, and iutA) and iron siderophore genes (iroB, iroC, iroD, and iroN). The presence of these plasmids in high-risk clones suggests their dissemination in our region, which represents a serious health problem.
41
Kariithi HM, Christy N, Decanini EL, Lemiere S, Volkening JD, Afonso CL, Suarez DL. Detection and Genome Sequence Analysis of Avian Metapneumovirus Subtype A Viruses Circulating in Commercial Chicken Flocks in Mexico. Vet Sci 2022; 9:vetsci9100579. [PMID: 36288192 PMCID: PMC9612082 DOI: 10.3390/vetsci9100579] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/27/2022] [Revised: 10/14/2022] [Accepted: 10/17/2022] [Indexed: 11/11/2022]
Abstract
Avian metapneumoviruses (aMPV subtypes A-D) are respiratory and reproductive pathogens of poultry. Since aMPV-A was initially reported in Mexico in 2014, there have been no additional reports of its detection in the country. Using nontargeted next-generation sequencing (NGS) of FTA card-spotted respiratory samples from commercial chickens in Mexico, seven full genome sequences of aMPV-A (lengths of 13,288-13,381 nucleotides) were de novo assembled. Additionally, complete coding sequences of genes N (n = 2), P and M (n = 7 each), F and L (n = 1 each), M2 (n = 6), SH (n = 5) and G (n = 2) were reference-based assembled from another seven samples. The Mexican isolates phylogenetically group with, but in a distinct clade separate from, other aMPV-A strains. The genome and G-gene nt sequences of the Mexican aMPVs are closest to strain UK/8544/06 (97.22-97.47% and 95.07-95.83%, respectively). Various amino acid variations distinguish the Mexican isolates from each other, and other aMPV-A strains, most of which are in the G (n = 38), F (n = 12), and L (n = 19) proteins. Using our sequence data and publicly available aMPV-A data, we revised a previously published rRT-PCR test, which resulted in different cycling and amplification conditions for aMPV-A to make it more compatible with other commonly used rRT-PCR diagnostic cycling conditions. This is the first comprehensive sequence analysis of aMPVs in Mexico and demonstrates the value of nontargeted NGS to identify pathogens where targeted virus surveillance is likely not routinely performed.
|
42
|
De Novo Transcriptome Assembly, Gene Annotations, and Characterization of Functional Profiling Reveal Key Genes for Lead Alleviation in the Pb Hyperaccumulator Greek Mustard (Hirschfeldia incana L.). Curr Issues Mol Biol 2022; 44:4658-4675. [PMID: 36286033 PMCID: PMC9600276 DOI: 10.3390/cimb44100318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 09/22/2022] [Accepted: 09/27/2022] [Indexed: 11/16/2022] Open
Abstract
Lead (Pb) contamination is a widespread environmental problem due to its toxicity to living organisms. Hirschfeldia incana L., a member of the Brassicaceae family, commonly found in the Mediterranean regions, is characterized by its ability to tolerate and accumulate Pb in soils and hydroponic cultures. This plant has been reported as an excellent model to assess the response of plants to Pb. However, the lack of genomic data for H. incana hinders research at the molecular level. In the present study, we carried out RNA deep transcriptome sequencing (RNA-seq) of H. incana under two conditions, control without Pb(NO3)2 and treatment with 100 µM of Pb(NO3)2 for 15 days. A total of 797.83 million reads were generated using Illumina sequencing technology. We assembled 77,491 transcript sequences with an average length of 959 bp and N50 of 1330 bp. Sequence similarity analyses and annotation of these transcripts were performed against the Arabidopsis thaliana nr protein database, Gene Ontology (GO), and KEGG databases. As a result, 13,046 GO terms and 138 KEGG maps were created. Under Pb stress, 577 and 270 genes were differentially expressed in roots and aboveground parts, respectively. Detailed elucidation of regulation of metal transporters, transcription factors (TFs), and plant hormone genes described the role of actors that allow the plant to fine-tune Pb stress responses. Our study revealed that several genes related to jasmonic acid biosynthesis and alpha-linoleic acid were upregulated, suggesting these components’ implication in Hirschfeldia incana L. responses to Pb stress. This study provides data for further genomic analyses of the biological and molecular mechanisms leading to Pb tolerance and accumulation in Hirschfeldia incana L.
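For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is commonly computed: sequences are sorted from longest to shortest, and N50 is the length at which the running total first reaches half of the summed length. The function name and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths in bp (not taken from the paper).
toy_lengths = [2000, 1500, 1330, 900, 600, 400, 300]
print(n50(toy_lengths))
```

Because N50 is weighted by length, a handful of long sequences can raise it well above the mean length, which is why the abstract reports both values.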
|
43
|
Boštjančić LL, Francesconi C, Rutz C, Hoffbeck L, Poidevin L, Kress A, Jussila J, Makkonen J, Feldmeyer B, Bálint M, Schwenk K, Lecompte O, Theissinger K. Host-pathogen coevolution drives innate immune response to Aphanomyces astaci infection in freshwater crayfish: transcriptomic evidence. BMC Genomics 2022; 23:600. [PMID: 35989333 PMCID: PMC9394032 DOI: 10.1186/s12864-022-08571-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/08/2021] [Accepted: 04/20/2022] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND For over a century, scientists have studied host-pathogen interactions between the crayfish plague disease agent Aphanomyces astaci and freshwater crayfish. It has been hypothesised that North American crayfish hosts are disease-resistant due to the long-lasting coevolution with the pathogen. Similarly, the increasing number of latent infections reported in the historically sensitive European crayfish hosts seems to indicate that similar coevolutionary processes are occurring between European crayfish and A. astaci. Our current understanding of these host-pathogen interactions is largely focused on the innate immunity processes in the crayfish haemolymph and cuticle, but the molecular basis of the observed disease-resistance and susceptibility remain unclear. To understand how coevolution is shaping the host's molecular response to the pathogen, susceptible native European noble crayfish and invasive disease-resistant marbled crayfish were challenged with two A. astaci strains of different origin: a haplogroup A strain (introduced to Europe at least 50 years ago, low virulence) and a haplogroup B strain (signal crayfish in lake Tahoe, USA, high virulence). Here, we compare the gene expression profiles of the hepatopancreas, an integrated organ of crayfish immunity and metabolism. RESULTS We characterised several novel innate immune-related gene groups in both crayfish species. Across all challenge groups, we detected 412 differentially expressed genes (DEGs) in the noble crayfish, and 257 DEGs in the marbled crayfish. In the noble crayfish, a clear immune response was detected to the haplogroup B strain, but not to the haplogroup A strain. In contrast, in the marbled crayfish we detected an immune response to the haplogroup A strain, but not to the haplogroup B strain. CONCLUSIONS We highlight the hepatopancreas as an important hub for the synthesis of immune molecules in the response to A. astaci. A clear distinction between the innate immune response in the marbled crayfish and the noble crayfish is the capability of the marbled crayfish to mobilise a higher variety of innate immune response effectors. With this study we outline that the type and strength of the host immune response to the pathogen is strongly influenced by the coevolutionary history of the crayfish with specific A. astaci strains.
|
44
|
Pseudo-Chromosomal Genome Assembly in Combination with Comprehensive Transcriptome Analysis in Agaricus bisporus Strain KMCC00540 Reveals Mechanical Stimulus Responsive Genes Associated with Browning Effect. J Fungi (Basel) 2022; 8:jof8080886. [PMID: 36012874 PMCID: PMC9410529 DOI: 10.3390/jof8080886] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/09/2022] [Accepted: 08/18/2022] [Indexed: 11/17/2022] Open
Abstract
Agaricus bisporus is one of the world’s most popular edible mushrooms, including in South Korea. We performed de novo genome assembly with a South Korean white-colored cultivar of A. bisporus, KMCC00540. After generating a scaffold-level genomic sequence, we inferred chromosome-level assembly by genomic synteny analysis with the representative A. bisporus strains H97 and H39. The KMCC00540 genome had 13 pseudochromosomes comprising 33,030,236 bp mostly covering both strains. A comparative genomic analysis with cultivar H97 indicated that most genomic regions and annotated proteins were shared (over 90%), ensuring that our cultivar could be used as a representative genome. However, A. bisporus suffers from browning even from only a slight mechanical stimulus during transportation, which significantly lowers its commercial value. To identify which genes respond to a mechanical stimulus that induces browning, we performed a time-course transcriptome analysis based on the de novo assembled genome. Mechanical stimulus induces up-regulation in long fatty acid ligase activity-related genes, as well as melanin biosynthesis genes, especially at early time points. In summary, we assembled the chromosome-level genomic information on a Korean strain of A. bisporus and identified which genes respond to a mechanical stimulus, which provided key hints for improving the post-harvest biological control of A. bisporus.
|
45
|
Hu G, Cheng L, Cheng Y, Mao W, Qiao Y, Lan Y. Pan-genome analysis of three main Chinese chestnut varieties. FRONTIERS IN PLANT SCIENCE 2022; 13:916550. [PMID: 35958219 PMCID: PMC9358723 DOI: 10.3389/fpls.2022.916550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Accepted: 07/05/2022] [Indexed: 05/02/2023]
Abstract
Chinese chestnut (Castanea mollissima Blume) is one of the earliest domesticated plants of high nutritional and ecological value, yet mechanisms of C. mollissima underlying its growth and development are poorly understood. Although individual chestnut species differ greatly, the molecular basis of the formation of their characteristic traits remains unknown. Though the draft genomes of chestnut have been previously released, the pan-genome of different variety needs to be studied. We report the genome sequence of three cultivated varieties of chestnut herein, namely Hei-Shan-Zhai-7 (H7, drought-resistant variety), Yan-Hong (YH, easy-pruning variety), and Yan-Shan-Zao-Sheng (ZS, early-maturing variety), to expedite convenience and efficiency in its genetics-based breeding. We obtained three chromosome-level chestnut genome assemblies through a combination of Oxford Nanopore technology, Illumina HiSeq X, and Hi-C mapping. The final genome assemblies are 671.99 Mb (YH), 790.99 Mb (ZS), and 678.90 Mb (H7), across 12 chromosomes, with scaffold N50 sizes of 50.50 Mb (YH), 65.05 Mb (ZS), and 52.16 Mb (H7). Through the identification of homologous genes and the cluster analysis of gene families, we found that H7, YH and ZS had 159, 131, and 91 unique gene families, respectively, and there were 13,248 single-copy direct homologous genes in the three chestnut varieties. For the convenience of research, the chestnut genome database was constructed. Based on the results of gene family identification, the presence/absence variations (PAVs) information of the three sample genes was calculated, and a total of 2,364, 2,232, and 1,475 unique genes were identified in H7, YH and ZS, respectively. Our results suggest that the GBSS II-b gene family underwent expansion in chestnut (relative to nearest source species). Overall, we developed high-quality and well-annotated genome sequences of three C. mollissima varieties, which will facilitate clarifying the molecular mechanisms underlying important traits, and shortening the breeding process.
|
46
|
Poates A, Truong J, Lindsey R, Griswold T, Williams-Newkirk AJ, Carleton H, Trees E. Sequencing of Enteric Bacteria: Library Preparation Procedure Matters for Accurate Identification and Characterization. Foodborne Pathog Dis 2022; 19:569-578. [PMID: 35861967 DOI: 10.1089/fpd.2022.0017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Enzymatic library preparation kits are increasingly used for bacterial whole genome sequencing. While they offer a rapid workflow, the transposases used in the kits are recognized to be somewhat biased. The aim of this study was to optimize and validate a protocol for the Illumina DNA Prep kit (formerly Nextera DNA Flex) for sequencing enteric pathogens and compare its performance against the Nextera XT kit. One hundred forty-three strains of Campylobacter, Escherichia, Listeria, Salmonella, Shigella, and Vibrio were prepared with both methods and sequenced on the Illumina MiSeq using 300 and/or 500 cycle chemistries. Sequences were compared using core genome multilocus sequence typing (cgMLST), 7-gene multilocus sequence typing (MLST), and detection of markers encoding serotype, virulence, and antimicrobial resistance. Sequences for one Escherichia strain were downsampled to determine the minimum coverage required for the analyses. While organism-specific differences were observed, the Prep libraries generated longer average read lengths and less fragmented assemblies compared to the XT libraries. In downstream analysis, the most notable difference between the kits was observed for Escherichia, particularly for the 300 cycle sequences. The O group was not predicted in 32% and 4% of XT sequences when using blast and kmer algorithms, respectively, while the O group was predicted from all Prep sequences regardless of the algorithm. In addition, the ehxA gene was not detected in 6% of XT sequences and 34% were missing one or more of the type III secretion systems and/or plasmid-associated genes, which were detected in the Prep sequences. The coverage downsampling revealed that acceptable assembly quality and allele detection was achieved at 30 × coverage with the Prep libraries, whereas 40-50 × coverage was required for the XT libraries. The better performance of the Prep libraries was attributed to more even coverage, particularly in genome regions low in GC content.
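The coverage downsampling described above can be illustrated with a short arithmetic sketch: sequencing depth is total sequenced bases divided by genome size, and the share of reads to retain for a target depth is the ratio of target to current depth. The genome size, read count, and read length below are hypothetical values chosen for round numbers; they are not taken from the study, and the study's actual downsampling tool is not stated in the abstract.

```python
def coverage(total_bases, genome_size):
    """Mean sequencing depth: sequenced bases per genome position."""
    return total_bases / genome_size


def downsample_fraction(current_cov, target_cov):
    """Fraction of reads to keep so that depth drops to target_cov."""
    if target_cov >= current_cov:
        return 1.0  # already at or below the target; keep everything
    return target_cov / current_cov


# Hypothetical run: 2.4 million reads of 250 bp on a 5 Mb genome.
genome_size = 5_000_000
total_bases = 2_400_000 * 250
current = coverage(total_bases, genome_size)

for target in (30, 40, 50):
    print(target, downsample_fraction(current, target))
```

In practice the retained reads are drawn at random (and in pairs for paired-end data) so that the subsample reflects the coverage profile of the full library.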
|
47
|
Liu JJ, Zamany A, Cartwright C, Xiang Y, Shamoun SF, Rancourt B. Transcriptomic Reprogramming and Genetic Variations Contribute to Western Hemlock Defense and Resistance Against Annosus Root and Butt Rot Disease. FRONTIERS IN PLANT SCIENCE 2022; 13:908680. [PMID: 35845706 PMCID: PMC9279933 DOI: 10.3389/fpls.2022.908680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 06/01/2022] [Indexed: 06/15/2023]
Abstract
Western hemlock (Tsuga heterophylla) is highly susceptible to Annosus root and butt rot disease, caused by Heterobasidion occidentale across its native range in western North America. Understanding molecular mechanisms of tree defense and dissecting genetic components underlying disease resistance will facilitate forest breeding and disease control management. The aim of this study was to profile host transcriptome reprogramming in response to pathogen infection using RNA-seq analysis. Inoculated seedlings were clearly grouped into three types: quantitative resistant (QR), susceptible (Sus), and un-infected (Uif), based on profiles of H. occidentale genes expressed in host tissues. Following de novo assembly of a western hemlock reference transcriptome with more than 33,000 expressed genes, the defensive transcriptome reprogramming was characterized and a set of differentially expressed genes (DEGs) were identified with gene ontology (GO) annotation. The QR seedlings showed controlled and coordinated molecular defenses against biotic stressors with enhanced biosynthesis of terpenoids, cinnamic acids, and other secondary metabolites. The Sus seedlings showed defense responses to abiotic stimuli with a few biological processes enhanced (such as DNA replication and cell wall organization), while others were suppressed (such as killing of cells of other organism). Furthermore, non-synonymous single nucleotide polymorphisms (ns-SNPs) of the defense- and resistance-related genes were characterized with high genetic variability. Both phylogenetic analysis and principal coordinate analysis (PCoA) revealed distinct evolutionary distances among the samples. The QR and Sus seedlings were well separated and grouped into different phylogenetic clades. This study provides initial insight into molecular defense and genetic components of western hemlock resistance against the Annosus root and butt rot disease. Identification of a large number of genes and their DNA variations with annotated functions in plant resistance and defense promotes the development of genomics-based breeding strategies for improved western hemlock resistance to H. occidentale.
|
48
|
Ma L, Ouyang H, Su A, Zhang Y, Pang D, Zhang T, Sun R, Wang W, Xie Z, Lv D. AbSE Workflow: Rapid Identification of the Coding Sequence and Linear Epitope of the Monoclonal Antibody at the Single-cell Level. ACS Synth Biol 2022; 11:1856-1864. [PMID: 35503752 DOI: 10.1021/acssynbio.2c00018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Monoclonal antibody (mAb) has been widely used in immunity research and disease diagnosis and therapy. Antibody sequence and epitope are the prerequisites and basis of mAb applications, which determine the properties of antibodies and make the preparation of antibody-based molecules controllable and reliable. Here, we present the antibody sequence and epitope identification (AbSE) workflow, a time-saving and cost-effective route for rapid determination of antibody sequence and linear epitope of mAb even at the single-cell level. The feasibility and accuracy of the AbSE workflow were demonstrated through the identification and validation of the coding sequence and epitope of antihuman serum albumin (antiHSA) mAb. It can be inferred that the AbSE workflow is a powerful and universal approach for paired antibody-epitope sequence identification. It may characterize antibodies not only on a single hybridoma cell but also on any other antibody-secreting cells.
|
49
|
Zhang X, Liu CG, Yang SH, Wang X, Bai FW, Wang Z. Benchmarking of long-read sequencing, assemblers and polishers for yeast genome. Brief Bioinform 2022; 23:6576452. [PMID: 35511110 DOI: 10.1093/bib/bbac146] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/27/2021] [Revised: 03/26/2022] [Accepted: 03/31/2022] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND The long reads of the third-generation sequencing significantly benefit the quality of the de novo genome assembly. However, its relatively high single-base error rate has been criticized. Currently, sequencing accuracy and throughput continue to improve, and many advanced tools are constantly emerging. PacBio HiFi sequencing and Oxford Nanopore Technologies (ONT) PromethION are two up-to-date platforms with low error rates and ultralong high-throughput reads. Therefore, it is urgently needed to select the appropriate sequencing platforms, depths and genome assembly tools for high-quality genomes in the era of explosive data production. METHODS We performed 455 (7 assemblers with 4 polishing pipelines or without polishing on 13 subsets with different depths) and 88 (4 assemblers with or without polishing on 11 subsets with different depths) de novo assemblies of Yeast S288C on high-coverage ONT and HiFi datasets, respectively. The assembly quality was evaluated by Quality Assessment Tool (QUAST), Benchmarking Universal Single-Copy Orthologs (BUSCO) and the newly proposed Comprehensive_score (C_score). In addition, we applied four preferable pipelines to assemble the genome of nonreference yeast strains. RESULTS The assembler plays an essential role in genome construction, especially for low-depth datasets. For ONT datasets, Flye is superior to other tools through C_score evaluation. Polishing by Pilon and Medaka improve accuracy and continuity of the preassemblies, respectively, and their combination pipeline worked well in most quality metrics. For HiFi datasets, Flye and NextDenovo performed better than other tools, and polishing is also necessary. Enough data depth is required for high-quality genome construction by ONT (>80X) and HiFi (>20X) datasets.
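The assembly counts in the METHODS sentence follow from a full factorial design, which the sketch below reproduces. It assumes that "4 polishing pipelines or without polishing" means five polishing options per assembler and that "with or without polishing" means two; this reading is an inference from the abstract, and the function name is hypothetical.

```python
from itertools import product


def count_runs(n_assemblers, n_polish_options, n_subsets):
    """Enumerate every (assembler, polishing option, depth subset) combination."""
    runs = list(product(range(n_assemblers),
                        range(n_polish_options),
                        range(n_subsets)))
    return len(runs)


# ONT: 7 assemblers x (4 polishing pipelines + unpolished) x 13 depth subsets
ont_runs = count_runs(7, 4 + 1, 13)
# HiFi: 4 assemblers x (polished + unpolished) x 11 depth subsets
hifi_runs = count_runs(4, 1 + 1, 11)

print(ont_runs, hifi_runs)
```

Under that reading the products match the 455 and 88 assemblies reported in the abstract.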
|
50
|
Boostrom I, Portal EAR, Spiller OB, Walsh TR, Sands K. Comparing Long-Read Assemblers to Explore the Potential of a Sustainable Low-Cost, Low-Infrastructure Approach to Sequence Antimicrobial Resistant Bacteria With Oxford Nanopore Sequencing. Front Microbiol 2022; 13:796465. [PMID: 35308384 PMCID: PMC8928191 DOI: 10.3389/fmicb.2022.796465] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/16/2021] [Accepted: 01/26/2022] [Indexed: 12/24/2022] Open
Abstract
Long-read sequencing (LRS) can resolve repetitive regions, a limitation of short read (SR) data. Reduced cost and instrument size has led to a steady increase in LRS across diagnostics and research. Here, we re-basecalled FAST5 data sequenced between 2018 and 2021 and analyzed the data in relation to gDNA across a large dataset (n = 200) spanning a wide GC content (25-67%). We examined whether re-basecalled data would improve the hybrid assembly, and, for a smaller cohort, compared long read (LR) assemblies in the context of antimicrobial resistance (AMR) genes and mobile genetic elements. We included a cost analysis when comparing SR and LR instruments. We compared the R9 and R10 chemistries and reported not only a larger yield but increased read quality with R9 flow cells. There were often discrepancies with ARG presence/absence and/or variant detection in LR assemblies. Flye-based assemblies were generally efficient at detecting the presence of ARG on both the chromosome and plasmids. Raven performed more quickly but inconsistently recovered small plasmids, notably a ∼15-kb Col-like plasmid harboring blaKPC. Canu assemblies were the most fragmented, with genome sizes larger than expected. LR assemblies failed to consistently determine multiple copies of the same ARG as identified by the Unicycler reference. Even with improvements to ONT chemistry and basecalling, long-read assemblies can lead to misinterpretation of data. If LR data are currently being relied upon, it is necessary to perform multiple assemblies, although this is resource (computing) intensive and not yet readily available/useable.
|