1
Mielecki D, Detman A, Aleksandrzak-Piekarczyk T, Widomska M, Chojnacka A, Stachurska-Skrodzka A, Walczak P, Grzesiuk E, Sikora A. Unlocking the genome of the non-sourdough Kazachstania humilis MAW1: insights into inhibitory factors and phenotypic properties. Microb Cell Fact 2024; 23:111. [PMID: 38622625 PMCID: PMC11017505 DOI: 10.1186/s12934-024-02380-7] [Received: 12/29/2023] [Accepted: 03/22/2024] [Indexed: 04/17/2024]
Abstract
BACKGROUND Ascomycetous budding yeasts are ubiquitous environmental microorganisms important in food production and medicine. Due to recent intensive genomic research, the taxonomy of yeast is becoming more organized based on the identification of monophyletic taxa. This includes genera important to humans, such as Kazachstania. Until now, Kazachstania humilis (previously Candida humilis) was regarded as a sourdough-specific yeast. In addition, no antibacterial activity has been associated with this species. RESULTS Previously, we isolated a yeast strain that impaired bio-hydrogen production in a dark fermentation bioreactor and inhibited the growth of Gram-positive and Gram-negative bacteria. Here, using next-generation sequencing technologies, we sequenced the genome of this strain, named K. humilis MAW1. This is the first genome of a K. humilis isolate not originating from a fermented food. We used a novel phylogenetic approach employing the 18S-ITS-D1-D2 region to show the placement of K. humilis MAW1 among other members of the Kazachstania genus. This strain was examined by global phenotypic profiling, including carbon sources utilized and the influence of stress conditions on growth. Using the well-recognized bacterial model Escherichia coli AB1157, we show that K. humilis MAW1 cultivated in an acidic medium inhibits bacterial growth by disturbing cell division, manifested by filament formation. To gain a greater understanding of the inhibitory effect of K. humilis MAW1, we selected 23 yeast proteins with recognized toxic activity against bacteria and used them for BLAST searches of the K. humilis MAW1 genome assembly. The resulting panel of genes present in the K. humilis MAW1 genome included those encoding the 1,3-β-glucan glycosidase and the 1,3-β-glucan synthesis inhibitor that might disturb the bacterial cell envelope structures. CONCLUSIONS We characterized a non-sourdough-derived strain of K. humilis, including its genome sequence and physiological aspects. MAW1, together with other K. humilis strains, shows a new organization of the mating-type locus. The pH-dependent ability to inhibit bacterial growth revealed here has not been previously recognized in this species. Our study contributes to the building of genome sequence-based classification systems, a better understanding of K. humilis as a cell factory in fermentation processes, and the exploration of bacteria-yeast interactions in microbial communities.
Affiliation(s)
- Damian Mielecki
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
  - Mossakowski Medical Research Institute, Polish Academy of Sciences, Pawińskiego 5, Warsaw, 02-106, Poland
- Anna Detman
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
- Małgorzata Widomska
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
- Aleksandra Chojnacka
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
  - Institute of Biology, Warsaw University of Life Sciences, Nowoursynowska 159, Warsaw, 02-776, Poland
- Paulina Walczak
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
- Elżbieta Grzesiuk
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
- Anna Sikora
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, Warsaw, 02-106, Poland
2
Koren S, Bao Z, Guarracino A, Ou S, Goodwin S, Jenike KM, Lucas J, McNulty B, Park J, Rautiainen M, Rhie A, Roelofs D, Schneiders H, Vrijenhoek I, Nijbroek K, Ware D, Schatz MC, Garrison E, Huang S, McCombie WR, Miga KH, Wittenberg AH, Phillippy AM. Gapless assembly of complete human and plant chromosomes using only nanopore sequencing. bioRxiv 2024:2024.03.15.585294. [PMID: 38529488 PMCID: PMC10962732 DOI: 10.1101/2024.03.15.585294] [Indexed: 03/27/2024]
Abstract
The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.
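The abstract pairs a base accuracy above 99.999% with Q50. As a minimal sketch of that arithmetic, assuming the standard Phred definition Q = -10 * log10(error rate), the conversion can be checked as follows; the function names are illustrative and not taken from the paper.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred-scaled Q value."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(error)


def accuracy_from_phred(q: float) -> float:
    """Convert a Phred-scaled Q value back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # 99.999% accuracy is one error per 100,000 bases, i.e. about Q50.
    print(f"Q = {phred_from_accuracy(0.99999):.2f}")
    print(f"accuracy at Q50 = {accuracy_from_phred(50):.5%}")
```

Under this definition each additional 10 Q units corresponds to a tenfold lower error rate, which is why Q50 and 99.999% are used interchangeably.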
Affiliation(s)
- Sergey Koren
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Zhigui Bao
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Baden-Württemberg, Germany
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
  - Human Technopole, Milan, Italy
- Shujun Ou
  - Ohio State University, Columbus, OH, USA
- Sara Goodwin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Katharine M. Jenike
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Julian Lucas
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Brandy McNulty
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Jimin Park
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Mikko Rautiainen
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Arang Rhie
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Dick Roelofs
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Ilse Vrijenhoek
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Koen Nijbroek
  - KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Michael C. Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Sanwen Huang
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
  - State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan, China
- Karen H. Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Adam M. Phillippy
  - Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
3
Cackett G, Sýkora M, Portugal R, Dulson C, Dixon L, Werner F. Transcription termination and readthrough in African swine fever virus. Front Immunol 2024; 15:1350267. [PMID: 38545109 PMCID: PMC10965686 DOI: 10.3389/fimmu.2024.1350267] [Received: 12/05/2023] [Accepted: 01/30/2024] [Indexed: 04/13/2024]
Abstract
Introduction African swine fever virus (ASFV) is a nucleocytoplasmic large DNA virus (NCLDV) that encodes its own host-like RNA polymerase (RNAP) and factors required to produce mature mRNA. The formation of accurate mRNA 3' ends by ASFV RNAP depends on transcription termination, likely enabled by a combination of sequence motifs and transcription factors, although these are poorly understood. The termination of any RNAP is rarely 100% efficient, and the transcriptional "readthrough" at terminators can generate long mRNAs which may interfere with the expression of downstream genes. ASFV transcriptome analyses reveal a landscape of heterogeneous mRNA 3' termini, likely a combination of bona fide termination sites and the result of mRNA degradation and processing. While short-read sequencing (SRS) like 3' RNA-seq indicates an accumulation of mRNA 3' ends at specific sites, it cannot inform about which promoters and transcription start sites (TSSs) directed their synthesis, i.e., information about the complete and unprocessed mRNAs at nucleotide resolution. Methods Here, we report a rigorous analysis of full-length ASFV transcripts using long-read sequencing (LRS). We systematically compared transcription termination sites predicted from SRS 3' RNA-seq with 3' ends mapped by LRS during early and late infection. Results Using in vitro transcription assays, we show that recombinant ASFV RNAP terminates transcription at polyT stretches in the non-template strand, similar to the archaeal RNAP or eukaryotic RNAPIII, unaided by secondary RNA structures or predicted viral termination factors. Our results cement this T-rich motif (U-rich in the RNA) as a universal transcription termination signal in ASFV. Many genes share the usage of the same terminators, while genes can also use a range of terminators to generate transcript isoforms varying enormously in length. A key factor in the latter phenomenon is the highly abundant terminator readthrough we observed, which is more prevalent during late than during early infection. Discussion This indicates that ASFV mRNAs under the control of late gene promoters utilize different termination mechanisms and factors from those under early promoters and/or that cellular factors influence the viral transcriptome landscape differently during the late stages of infection.
Affiliation(s)
- Gwenny Cackett
  - Institute for Structural and Molecular Biology, University College London, London, United Kingdom
- Michal Sýkora
  - Institute for Structural and Molecular Biology, University College London, London, United Kingdom
- Christopher Dulson
  - Institute for Structural and Molecular Biology, University College London, London, United Kingdom
- Linda Dixon
  - Pirbright Institute, Pirbright, Surrey, United Kingdom
- Finn Werner
  - Institute for Structural and Molecular Biology, University College London, London, United Kingdom
4
Jousheghani ZZ, Patro R. Oarfish: Enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification. bioRxiv 2024:2024.02.28.582591. [PMID: 38464200 PMCID: PMC10925290 DOI: 10.1101/2024.02.28.582591] [Indexed: 03/12/2024]
Abstract
Motivation Long read sequencing technology is becoming an increasingly indispensable tool in genomic and transcriptomic analysis. In transcriptomics in particular, long reads offer the possibility of sequencing full-length isoforms, which can vastly simplify the identification of novel transcripts and transcript quantification. However, despite this promise, the focus of much long read method development to date has been on transcript identification, with comparatively little attention paid to quantification. Yet, due to differences in the underlying protocols and technologies, lower throughput (i.e. fewer reads sequenced per sample compared to short read technologies), as well as technical artifacts, long read quantification remains a challenge, motivating the continued development and assessment of quantification methods tailored to this increasingly prevalent type of data. Results We introduce a new method and software tool for long read transcript quantification called oarfish. Our model incorporates a novel and innovative coverage score, which affects the conditional probability of fragment assignment in the underlying probabilistic model. We demonstrate that by accounting for this coverage information, oarfish is able to produce more accurate quantification estimates than existing long read quantification methods, particularly when one considers the primary isoforms present in a particular cell line or tissue type. Availability and Implementation Oarfish is implemented in the Rust programming language, and is made available as free and open-source software under the BSD 3-clause license. The source code is available at https://www.github.com/COMBINE-lab/oarfish.
Affiliation(s)
- Zahra Zare Jousheghani
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, 20742, Maryland, USA
- Rob Patro
  - Department of Computer Science, University of Maryland, College Park, 20742, Maryland, USA
5
Spohr P, Scharf S, Rommerskirchen A, Henrich B, Jäger P, Klau GW, Haas R, Dilthey A, Pfeffer K. Insights into gut microbiomes in stem cell transplantation by comprehensive shotgun long-read sequencing. Sci Rep 2024; 14:4068. [PMID: 38374282 PMCID: PMC10876974 DOI: 10.1038/s41598-024-53506-1] [Received: 08/23/2023] [Accepted: 02/01/2024] [Indexed: 02/21/2024]
Abstract
The gut microbiome is a diverse ecosystem, dominated by bacteria; however, fungi, phages/viruses, archaea, and protozoa are also important members of the gut microbiota. Exploration of taxonomic compositions beyond bacteria, as well as an understanding of the interaction of the bacteriome with the other members, is limited when using 16S rDNA sequencing. Here, we developed a pipeline enabling the simultaneous interrogation of the gut microbiome (bacteriome, mycobiome, archaeome, eukaryome, DNA virome) and of antibiotic resistance genes based on optimized long-read shotgun metagenomics protocols and custom bioinformatics. Using our pipeline, we investigated the longitudinal composition of the gut microbiome in an exploratory clinical study in patients undergoing allogeneic hematopoietic stem cell transplantation (alloHSCT; n = 31). Pre-transplantation microbiomes exhibited a 3-cluster structure, characterized by Bacteroides spp./Phocaeicola spp., mixed composition, and Enterococcus abundances. We revealed substantial inter-individual and temporal variabilities of microbial domain compositions, human DNA, and antibiotic resistance genes during the course of alloHSCT. Interestingly, viruses and fungi accounted for substantial proportions of microbiome content in individual samples. In the course of HSCT, bacterial strains were stable or newly acquired. Our results demonstrate the disruptive potential of alloHSCT on the gut microbiome and pave the way for future comprehensive microbiome studies based on long-read metagenomics.
Affiliation(s)
- Philipp Spohr
  - Chair Algorithmic Bioinformatics, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  - Center for Digital Medicine, Düsseldorf, Germany
- Sebastian Scharf
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
- Anna Rommerskirchen
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
- Birgit Henrich
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
- Paul Jäger
  - Department of Hematology, Immunology, and Clinical Immunology, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
- Gunnar W Klau
  - Chair Algorithmic Bioinformatics, Faculty of Mathematics and Natural Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  - Center for Digital Medicine, Düsseldorf, Germany
- Rainer Haas
  - Department of Hematology, Immunology, and Clinical Immunology, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
- Alexander Dilthey
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
  - Center for Digital Medicine, Düsseldorf, Germany
- Klaus Pfeffer
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Düsseldorf, University Hospital Düsseldorf, Düsseldorf, Germany
6
Penumarthi LR, Baptista RP, Beaudry MS, Glenn TC, Kissinger JC. A new chromosome-level genome assembly and annotation of Cryptosporidium meleagridis. bioRxiv 2024:2024.02.16.580748. [PMID: 38405792 PMCID: PMC10888889 DOI: 10.1101/2024.02.16.580748] [Indexed: 02/27/2024]
Abstract
Cryptosporidium spp. are medically and scientifically relevant protozoan parasites that cause severe diarrheal illness in infants and immunosuppressed populations as well as animals. Although most human Cryptosporidium infections are caused by C. parvum and C. hominis, there are several other human-infecting species including C. meleagridis, which is commonly observed in developing countries. Here, we polished and annotated a long-read genome sequence assembly for C. meleagridis TU1867, a species which infects birds and humans. The genome sequence was generated using a combination of whole genome amplification (WGA) and long-read Oxford Nanopore Technologies sequencing. The assembly was then polished with Illumina data. The chromosome-level genome assembly is 9.2 Mbp with a contig N50 of 1.1 Mb. Annotation revealed 3,923 protein-coding genes. A BUSCO analysis indicates a completeness of 96.6% (n=446), including 430 (96.4%) single-copy and 1 (0.224%) duplicated apicomplexan conserved gene(s). The new C. meleagridis genome assembly is nearly gap-free and provides a valuable new resource for the Cryptosporidium community and future studies on evolution and host-specificity.
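The BUSCO figures quoted above can be checked by simple arithmetic. The sketch below assumes that completeness is the sum of single-copy and duplicated genes over the n = 446 genes searched; the function name is hypothetical, and only the counts come from the abstract.

```python
def busco_percentages(single_copy: int, duplicated: int, total: int) -> dict:
    """Return BUSCO-style percentages from raw gene counts.

    Assumes 'complete' = single-copy + duplicated, expressed over `total`.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    complete = single_copy + duplicated
    return {
        "complete": 100.0 * complete / total,
        "single_copy": 100.0 * single_copy / total,
        "duplicated": 100.0 * duplicated / total,
    }


if __name__ == "__main__":
    # Counts reported in the abstract: 430 single-copy, 1 duplicated, n = 446.
    for name, value in busco_percentages(430, 1, 446).items():
        print(f"{name}: {value:.3f}%")
```

With these counts the three percentages round to the values given in the abstract, so the reported numbers are internally consistent.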
Affiliation(s)
- Lasya R Penumarthi
  - Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA
  - Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30602, USA
- Rodrigo P Baptista
  - Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA
  - Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30602, USA
- Megan S Beaudry
  - Department of Environmental Health Science, University of Georgia, Athens, GA, USA
- Travis C Glenn
  - Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA
  - Department of Environmental Health Science, University of Georgia, Athens, GA, USA
  - Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
- Jessica C Kissinger
  - Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, USA
  - Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, Georgia 30602, USA
  - Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
7
Restrepo-Benavides M, Lozano-Arce D, Gonzalez-Garcia LN, Báez-Aguirre F, Ariza-Aranguren G, Faccini D, Zambrano MM, Jiménez P, Fernández-Bravo A, Restrepo S, Guevara-Suarez M. Unveiling potential virulence determinants in Vibrio isolates from Anadara tuberculosa through whole genome analyses. Microbiol Spectr 2024; 12:e0292823. [PMID: 38189292 PMCID: PMC10846245 DOI: 10.1128/spectrum.02928-23] [Received: 07/25/2023] [Accepted: 11/14/2023] [Indexed: 01/09/2024]
Abstract
The genus Vibrio includes pathogenic bacteria able to cause disease in humans and aquatic organisms, leading to disease outbreaks and significant economic losses in the fishery industry. Despite much work on Vibrio in several marine organisms, no specific studies have been conducted on Anadara tuberculosa. This is a commercially important bivalve species, known as "piangua hembra," along Colombia's Pacific coast. Therefore, this study aimed to identify and characterize the genomes of Vibrio isolates obtained from A. tuberculosa. Bacterial isolates were obtained from 14 A. tuberculosa specimens collected from two locations along the Colombian Pacific coast, of which 17 were identified as Vibrio: V. parahaemolyticus (n = 12), V. alginolyticus (n = 3), V. fluvialis (n = 1), and V. natriegens (n = 1). Whole-genome sequencing of these isolates was performed using Oxford Nanopore Technologies (ONT). The analysis revealed the presence of genes conferring resistance to β-lactams, tetracyclines, chloramphenicol, and macrolides, indicating potential resistance to these antimicrobial agents. Genes associated with virulence were also found, suggesting the potential pathogenicity of these Vibrio isolates, as well as genes for Type III Secretion Systems (T3SS) and Type VI Secretion Systems (T6SS), which play crucial roles in delivering virulence factors and in interbacterial competition. This study represents the first genomic analysis of bacteria within A. tuberculosa, shedding light on Vibrio genetic factors and contributing to a comprehensive understanding of the pathogenic potential of these Vibrio isolates. IMPORTANCE This study presents the first comprehensive report on the whole genome analysis of Vibrio isolates obtained from Anadara tuberculosa, a bivalve species of great social and economic significance on the Pacific coast of Colombia. Research findings have significant implications for the field, as they provide crucial information on the genetic factors and possible pathogenicity of Vibrio isolates associated with A. tuberculosa. The identification of antimicrobial resistance genes and virulence factors within these isolates emphasizes the potential risks they pose to both human and animal health. Furthermore, the presence of genes associated with Type III and Type VI Secretion Systems suggests their critical role in virulence and interbacterial competition. Understanding the genetic factors that contribute to Vibrio bacterial virulence and survival strategies within their ecological niche is of utmost importance for the effective prevention and management of diseases in aquaculture practices.
Affiliation(s)
- Mariana Restrepo-Benavides
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
  - Unit of Microbiology, Department of Basic Health Sciences, Faculty of Medicine and Health Sciences, IISPV, University Rovira i Virgili, Reus, Spain
- Daniela Lozano-Arce
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
- Laura Natalia Gonzalez-Garcia
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
  - Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá, Colombia
  - UMR DIADE, Institut de Recherche pour le Développement, Université de Montpellier, Montpellier, France
- Felipe Báez-Aguirre
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
- Gabriela Ariza-Aranguren
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
- Daniel Faccini
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
- Pedro Jiménez
  - Laboratorio de Fitopatología, Facultad de Ciencias Básicas y Aplicadas, Universidad Militar Nueva Granada, Cajicá, Colombia
- Ana Fernández-Bravo
  - Unit of Microbiology, Department of Basic Health Sciences, Faculty of Medicine and Health Sciences, IISPV, University Rovira i Virgili, Reus, Spain
- Silvia Restrepo
  - Departamento de Ingeniería Química y de Alimentos, Laboratorio de Micología y Fitopatología, Universidad de los Andes, Bogotá, Colombia
- Marcela Guevara-Suarez
  - Applied Genomics Research Group, Vicerrectoría de Investigación y Creación, Universidad de los Andes, Bogotá, Colombia
8
Wöhner TW, Emeriewen OF, Wittenberg AHJ, Nijbroek K, Wang RP, Blom EJ, Schneiders H, Keilwagen J, Berner T, Hoff KJ, Gabriel L, Thierfeldt H, Almolla O, Barchi L, Schuster M, Lempe J, Peil A, Flachowsky H. The structure of the tetraploid sour cherry 'Schattenmorelle' (Prunus cerasus L.) genome reveals insights into its segmental allopolyploid nature. Front Plant Sci 2023; 14:1284478. [PMID: 38107002 PMCID: PMC10722297 DOI: 10.3389/fpls.2023.1284478] [Received: 08/28/2023] [Accepted: 10/31/2023] [Indexed: 12/19/2023]
Abstract
Sour cherry (Prunus cerasus L.) is an important allotetraploid cherry species that evolved in the Caspian Sea and Black Sea regions from a hybridization of the tetraploid ground cherry (Prunus fruticosa Pall.) and an unreduced pollen of the diploid sweet cherry (P. avium L.) ancestor. Details of when and where the evolution of this species occurred are unclear, as is the effect of hybridization on the genome structure. To gain insight, the genome of the sour cherry cultivar 'Schattenmorelle' was sequenced using Illumina NovaSeq™ and Oxford Nanopore long-read technologies, resulting in a ~629-Mbp pseudomolecule reference genome. The genome could be separated into two subgenomes, with subgenome PceS_a originating from P. avium and subgenome PceS_f originating from P. fruticosa. The genome also showed size reduction compared to ancestral species and traces of homoeologous sequence exchanges throughout. Comparative analysis confirmed that the genome of sour cherry is segmental allotetraploid and evolved very recently.
Affiliation(s)
- Thomas W. Wöhner
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
- Ofere F. Emeriewen
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
- Jens Keilwagen
  - Institute for Biosafety in Plant Biotechnology, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Quedlinburg, Saxony-Anhalt, Germany
- Thomas Berner
  - Institute for Biosafety in Plant Biotechnology, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Quedlinburg, Saxony-Anhalt, Germany
- Katharina J. Hoff
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Mecklenburg-Western Pomerania, Germany
- Lars Gabriel
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Mecklenburg-Western Pomerania, Germany
- Hannah Thierfeldt
  - Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Mecklenburg-Western Pomerania, Germany
- Omar Almolla
  - Dipartimento di Scienze Agrarie, Forestali e Alimentari (DISAFA) – Plant Genetics, University of Turin, Grugliasco, Italy
- Lorenzo Barchi
  - Dipartimento di Scienze Agrarie, Forestali e Alimentari (DISAFA) – Plant Genetics, University of Turin, Grugliasco, Italy
- Mirko Schuster
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
- Janne Lempe
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
- Andreas Peil
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
- Henryk Flachowsky
  - Institute for Breeding Research on Fruit Crops, Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Dresden, Saxony, Germany
9
Schall PZ, Winkler PA, Petersen-Jones SM, Yuzbasiyan-Gurkan V, Kidd JM. Genome-wide methylation patterns from canine nanopore assemblies. G3 (Bethesda) 2023; 13:jkad203. [PMID: 37681359 PMCID: PMC10627269 DOI: 10.1093/g3journal/jkad203] [Received: 07/27/2023] [Revised: 08/23/2023] [Accepted: 08/29/2023] [Indexed: 09/09/2023]
Abstract
Recent advances in long-read sequencing have enabled the creation of reference-quality genome assemblies for multiple individuals within a species. In particular, 8 long-read genome assemblies have recently been published for the canine model (dogs and wolves). These assemblies were created using a range of sequencing and computational approaches, with only limited comparisons described among subsets of the assemblies. Here we present 3 high-quality de novo reference assemblies based upon Oxford Nanopore long-read sequencing: 2 Bernese Mountain Dogs (BD & OD) and a Cairn terrier (CA611). These breeds are of particular interest due to the enrichment of unresolved genetic disorders. Leveraging advances in software, we utilized published data of Labrador Retriever (Yella) to generate a new assembly, resulting in a ∼280-fold increase in continuity (N50 size of 91 kbp vs 25.75 Mbp). In conjunction with these 4 new assemblies, we uniformly assessed 8 existing assemblies for generalized quality metrics, sequence divergence, and a detailed BUSCO assessment. We identified a set of ∼400 conserved genes during the BUSCO analysis missing in all assemblies. Genome-wide methylation profiles were generated from the nanopore sequencing, resulting in broad concordance with existing whole-genome and reduced-representation bisulfite sequencing, while highlighting superior coverage of mobile elements. These analyses demonstrate the ability of Nanopore sequencing to resolve the sequence and epigenetic profile of canine genomes.
Affiliation(s)
- Peter Z Schall
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Paige A Winkler
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine, Michigan State University, East Lansing, MI 48824, USA
| | - Simon M Petersen-Jones
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine, Michigan State University, East Lansing, MI 48824, USA
| | - Vilma Yuzbasiyan-Gurkan
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine, Michigan State University, East Lansing, MI 48824, USA
- Department of Microbiology and Molecular Genetics, College of Veterinary Medicine, Michigan State University, East Lansing, MI 48824, USA
| | - Jeffrey M Kidd
- Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
10
|
Sumner JT, Andrasz CL, Johnson CA, Wax S, Anderson P, Keeling EL, Davidson JM. De novo genome assembly and comparative genomics for the colonial ascidian Botrylloides violaceus. G3 (Bethesda) 2023; 13:jkad181. [PMID: 37555394 PMCID: PMC10542563 DOI: 10.1093/g3journal/jkad181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 01/25/2023] [Accepted: 07/12/2023] [Indexed: 08/10/2023]
Abstract
Ascidians have the potential to reveal fundamental biological insights related to coloniality, regeneration, immune function, and the evolution of these traits. This study implements a hybrid assembly technique to produce a genome assembly and annotation for the botryllid ascidian, Botrylloides violaceus. A hybrid genome assembly was produced using Illumina short-read and Oxford Nanopore Technologies long-read sequencing. The resulting assembly comprises 831 contigs, has a total length of 121 Mbp, N50 of 1 Mbp, and a BUSCO score of 96.1%. Genome annotation identified 13 K protein-coding genes. Comparative genomic analysis with other tunicates reveals patterns of conservation and divergence within orthologous gene families even among closely related species. Characterization of the Wnt gene family, encoding signaling ligands involved in development and regeneration, reveals conserved patterns of subfamily presence and gene copy number among botryllids. This supports the use of genomic data from nonmodel organisms in the investigation of biological phenomena.
Affiliation(s)
- Jack T Sumner
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Cassidy L Andrasz
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Christine A Johnson
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Sarah Wax
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Paul Anderson
- Department of Computer Science and Software Engineering, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Elena L Keeling
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| | - Jean M Davidson
- Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, CA 93407, USA
| |
|
11
|
Xing F, Xia Y, Lu Q, Lo SKF, Lau SKP, Woo PCY. Rapid diagnosis of fatal Nocardia kroppenstedtii bacteremic pneumonia and empyema thoracis by next-generation sequencing: a case report. Front Med (Lausanne) 2023; 10:1226126. [PMID: 37534314 PMCID: PMC10392123 DOI: 10.3389/fmed.2023.1226126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Accepted: 06/27/2023] [Indexed: 08/04/2023] Open
Abstract
Nocardia species do not replicate as rapidly as other pyogenic bacteria and nocardial infections can be highly fatal, particularly in immunocompromised patients. Here, we present the first report of fatal Nocardia kroppenstedtii bacteremic pneumonia and empyema thoracis diagnosed by next-generation sequencing (NGS) using the Oxford Nanopore Technologies' MinION device. The bacterium was not identified by matrix-assisted laser desorption ionization-time of flight mass spectrometry. Due to its low equipment cost, short turn-around-time, and portable size, the Oxford Nanopore Technologies' MinION device is a useful platform for NGS in routine clinical microbiology laboratories.
Affiliation(s)
- Fanfan Xing
- Department of Clinical Microbiology and Infection Control, The University of Hong Kong—Shenzhen Hospital, Shenzhen, China
| | - Yao Xia
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Qianyun Lu
- Department of Clinical Microbiology and Infection Control, The University of Hong Kong—Shenzhen Hospital, Shenzhen, China
| | - Simon K. F. Lo
- Department of Clinical Microbiology and Infection Control, The University of Hong Kong—Shenzhen Hospital, Shenzhen, China
| | - Susanna K. P. Lau
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
| | - Patrick C. Y. Woo
- Department of Microbiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Doctoral Program in Translational Medicine and Department of Life Sciences, National Chung Hsing University, Taichung, Taiwan
- The iEGG and Animal Biotechnology Research Center, National Chung Hsing University, Taichung, Taiwan
| |
|
12
|
Gurgul A, Jasielczuk I, Szmatoła T, Sawicki S, Semik-Gurgul E, Długosz B, Bugno-Poniewierska M. Application of Nanopore Sequencing for High Throughput Genotyping in Horses. Animals (Basel) 2023; 13:2227. [PMID: 37444025 DOI: 10.3390/ani13132227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 07/03/2023] [Accepted: 07/04/2023] [Indexed: 07/15/2023] Open
Abstract
Nanopore sequencing is a third-generation biopolymer sequencing technique that relies on monitoring the changes in an electrical current that occur as nucleic acids are passed through a protein nanopore. Increasing quality of reads generated by nanopore sequencing systems encourages their application in genome-wide polymorphism detection and genotyping. In this study, we employed nanopore sequencing to identify genome-wide polymorphisms in the horse genome. To reduce the size and complexity of genome fragments for sequencing in a simple and cost-efficient manner, we amplified random DNA fragments using a modified DOP-PCR and sequenced the resulting products using the MinION system. After initial filtering, this generated 28,426 polymorphisms, which were validated at a 3% error rate. Upon further filtering for polymorphism and reproducibility, we identified 9495 SNPs that reflected the horse population structure. To conclude, the use of nanopore sequencing, in conjunction with a genome enrichment step, is a promising tool that can be practical in a variety of applications, including genotyping, population genomics, association studies, linkage mapping, and potentially genomic selection.
Affiliation(s)
- Artur Gurgul
- Center of Experimental and Innovative Medicine, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| | - Igor Jasielczuk
- Center of Experimental and Innovative Medicine, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| | - Tomasz Szmatoła
- Center of Experimental and Innovative Medicine, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| | - Sebastian Sawicki
- Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| | - Ewelina Semik-Gurgul
- Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083 Balice, Poland
| | - Bogusława Długosz
- Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| | - Monika Bugno-Poniewierska
- Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Krakow, al. Mickiewicza 24/28, 30-059 Krakow, Poland
| |
|
13
|
Velasco-Amo MP, Arias-Giraldo LF, Román-Écija M, Fuente LDL, Marco-Noales E, Moralejo E, Navas-Cortés JA, Landa BB. Complete Circularized Genome Resources of Seven Strains of Xylella fastidiosa subsp. fastidiosa Using Hybrid Assembly Reveals Unknown Plasmids. Phytopathology 2023; 113:1128-1132. [PMID: 36441872 DOI: 10.1094/phyto-10-22-0396-a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Xylella fastidiosa is a vascular plant pathogenic bacterium native to the Americas that is causing significant epidemics and economic losses in olive and almond in Europe, where it is a quarantine pathogen. Since its first detection in 2013 in Italy, mandatory surveys across Europe have also revealed the presence of the bacterium in France, Spain, and Portugal. Combining Oxford Nanopore Technologies and Illumina sequencing data, we assembled high-quality complete genomes of seven X. fastidiosa subsp. fastidiosa strains isolated from different plants in Spain, the United States, and Mexico. Comparative genomic analyses discovered differences in plasmid content among strains, including plasmids that had been overlooked previously when using the Illumina sequencing platform alone. Interestingly, in strain CFBP8073, intercepted in France from plants imported from Mexico, three plasmids were identified, including two (plasmids pXF-P1.CFBP8073 and pXF-P2.CFBP8073) not previously described in X. fastidiosa and one (pXF5823.CFBP8073) almost identical to a plasmid described in an X. fastidiosa strain from citrus. Plasmids found in the Spanish strains here were similar to those described previously in other strains from the same subspecies and ST1 isolated in the Balearic Islands and the United States. The genome resources from this work will assist in further studies on the role of plasmids in the epidemiology, ecology, and evolution of this plant pathogen.
Affiliation(s)
- María Pilar Velasco-Amo
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas (CSIC), Córdoba, Spain
| | - Luis F Arias-Giraldo
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas (CSIC), Córdoba, Spain
| | - Miguel Román-Écija
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas (CSIC), Córdoba, Spain
| | - Leonardo De La Fuente
- Department of Entomology and Plant Pathology, Auburn University, Auburn, AL 36849, U.S.A
| | - Ester Marco-Noales
- Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), Moncada, Spain
| | - Eduardo Moralejo
- Tragsa, Empresa de Transformación Agraria, Delegación de Baleares, 07005 Palma, Spain
| | - Juan A Navas-Cortés
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas (CSIC), Córdoba, Spain
| | - Blanca B Landa
- Instituto de Agricultura Sostenible, Consejo Superior de Investigaciones Científicas (CSIC), Córdoba, Spain
| |
|
14
|
Kaur A, Rana R, Bansal K, Patel HK, Sonti RV, Patil PB. Insights into the Diversity of Transcription Activator-Like Effectors in Indian Pathotype Strains of Xanthomonas oryzae pv. oryzae. Phytopathology 2023; 113:953-959. [PMID: 36441870 DOI: 10.1094/phyto-08-22-0304-sc] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Xanthomonas oryzae pv. oryzae (Xoo) is a major rice pathogen, and its genome harbors extensive inter-strain and inter-lineage variations. The emergence of highly virulent pathotypes of Xoo that can overcome major resistance (R) genes deployed in rice breeding programs is a grave threat to rice cultivation. The present study reports on a long-read Oxford Nanopore-based complete genomic investigation of Xoo isolates from 11 pathotypes that are reported based on their reaction toward 10 R genes. The investigation revealed remarkable variation in the genome structure in the strains belonging to different pathotypes. Furthermore, transcription activator-like effector (TALE) proteins secreted by the type III secretion system display marked variation in content, genomic location, classes, and DNA-binding domain. We also found tal genes in the vicinity of regions with genome structural variations. Furthermore, in silico analysis of the genome-wide rice targets of TALEs allowed us to understand the emergence of pathotypes compatible with major R genes. Long-read, cost-effective sequencing technologies such as nanopore can be a game changer in the surveillance of major and emerging pathotypes. The resource and findings will be invaluable in the management of Xoo and in appropriate deployment of R genes in rice breeding programs.
Affiliation(s)
- Amandeep Kaur
- Bacterial Genomics and Evolution Laboratory, CSIR-Institute of Microbial Technology, Chandigarh, India
| | - Rekha Rana
- Bacterial Genomics and Evolution Laboratory, CSIR-Institute of Microbial Technology, Chandigarh, India
- The Academy of Scientific and Innovative Research, Ghaziabad, India
| | - Kanika Bansal
- Bacterial Genomics and Evolution Laboratory, CSIR-Institute of Microbial Technology, Chandigarh, India
| | | | - Ramesh V Sonti
- International Centre for Genetic Engineering and Biotechnology, New Delhi, India
| | - Prabhu B Patil
- Bacterial Genomics and Evolution Laboratory, CSIR-Institute of Microbial Technology, Chandigarh, India
| |
|
15
|
De La Cerda GY, Landis JB, Eifler E, Hernandez AI, Li F, Zhang J, Tribble CM, Karimi N, Chan P, Givnish T, Strickler SR, Specht CD. Balancing read length and sequencing depth: Optimizing Nanopore long-read sequencing for monocots with an emphasis on the Liliales. Appl Plant Sci 2023; 11:e11524. [PMID: 37342170 PMCID: PMC10278932 DOI: 10.1002/aps3.11524] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/30/2022] [Revised: 01/20/2023] [Accepted: 01/30/2023] [Indexed: 06/22/2023]
Abstract
PREMISE We present approaches used to generate long-read Nanopore sequencing reads for the Liliales and demonstrate how modifications to standard protocols directly impact read length and total output. The goal is to help those interested in generating long-read sequencing data determine which steps may be necessary for optimizing output and results. METHODS Four species of Calochortus (Liliaceae) were sequenced. Modifications made to sodium dodecyl sulfate (SDS) extractions and cleanup protocols included grinding with a mortar and pestle, using cut or wide-bore tips, chloroform cleaning, bead cleaning, eliminating short fragments, and using highly purified DNA. RESULTS Steps taken to maximize read length can decrease overall output. Notably, the number of pores in a flow cell is correlated with the overall output, yet we did not see an association between the pore number and the read length or the number of reads produced. DISCUSSION Many factors contribute to the overall success of a Nanopore sequencing run. We showed the direct impact that several modifications to the DNA extraction and cleaning steps have on the total sequencing output, read size, and number of reads generated. We show a tradeoff between read length and the number of reads and, to a lesser extent, the total sequencing output, all of which are important factors for successful de novo genome assembly.
Affiliation(s)
- Gisel Y. De La Cerda
- School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York 14853, USA
| | - Jacob B. Landis
- School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York 14853, USA
- BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
| | - Evan Eifler
- Department of Botany, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
| | - Adriana I. Hernandez
- School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York 14853, USA
| | - Fay‐Wei Li
- BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
| | - Jing Zhang
- BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
| | - Carrie M. Tribble
- School of Life Sciences, University of Hawaiʻi, Mānoa, Honolulu, Hawaiʻi 96822, USA
| | - Nisa Karimi
- Department of Botany, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
| | - Patricia Chan
- Department of Botany, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
| | - Thomas Givnish
- Department of Botany, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
| | - Susan R. Strickler
- BTI Computational Biology Center, Boyce Thompson Institute, Ithaca, New York 14853, USA
- Present address: Plant Science and Conservation, Chicago Botanic Garden, Glencoe, Illinois 60022, USA
- Present address: Plant Biology and Conservation Program, Northwestern University, Evanston, Illinois 60208, USA
| | - Chelsea D. Specht
- School of Integrative Plant Science, Section of Plant Biology and the L. H. Bailey Hortorium, Cornell University, Ithaca, New York 14853, USA
| |
|
16
|
Hotaling S, Wilcox ER, Heckenhauer J, Stewart RJ, Frandsen PB. Highly accurate long reads are crucial for realizing the potential of biodiversity genomics. BMC Genomics 2023; 24:117. [PMID: 36927511 PMCID: PMC10018877 DOI: 10.1186/s12864-023-09193-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/09/2022] [Accepted: 02/17/2023] [Indexed: 03/18/2023] Open
Abstract
BACKGROUND Generating the most contiguous, accurate genome assemblies given available sequencing technologies is a long-standing challenge in genome science. With the rise of long-read sequencing, assembly challenges have shifted from merely increasing contiguity to correctly assembling complex, repetitive regions of interest, ideally in a phased manner. At present, researchers largely choose between two types of long read data: longer, but less accurate sequences, or highly accurate, but shorter reads (i.e., >Q20 or 99% accurate). To better understand how these types of long-read data as well as scale of data (i.e., mean length and sequencing depth) influence genome assembly outcomes, we compared genome assemblies for a caddisfly, Hesperophylax magnus, generated with longer, but less accurate, Oxford Nanopore (ONT) R9.4.1 and highly accurate PacBio HiFi (HiFi) data. Next, we expanded this comparison to consider the influence of highly accurate long-read sequence data on genome assemblies across 6750 plant and animal genomes. For this broader comparison, we used HiFi data as a surrogate for highly accurate long-reads broadly as we could identify when they were used from GenBank metadata. RESULTS HiFi reads outperformed ONT reads in all assembly metrics tested for the caddisfly data set and allowed for accurate assembly of the repetitive ~ 20 Kb H-fibroin gene. Across plants and animals, genome assemblies that incorporated HiFi reads were also more contiguous. For plants, the average HiFi assembly was 501% more contiguous (mean contig N50 = 20.5 Mb) than those generated with any other long-read data (mean contig N50 = 4.1 Mb). For animals, HiFi assemblies were 226% more contiguous (mean contig N50 = 20.9 Mb) versus other long-read assemblies (mean contig N50 = 9.3 Mb). In plants, we also found limited evidence that HiFi may offer a unique solution for overcoming genomic complexity that scales with assembly size. 
CONCLUSIONS Highly accurate long-reads generated with HiFi or analogous technologies represent a key tool for maximizing genome assembly quality for a wide swath of plants and animals. This finding is particularly important when resources only allow for one type of sequencing data to be generated. Ultimately, to realize the promise of biodiversity genomics, we call for greater uptake of highly accurate long-reads in future studies.
Affiliation(s)
- Scott Hotaling
- Department of Watershed Sciences, Utah State University, Logan, UT, USA
| | - Edward R Wilcox
- DNA Sequencing Center, Department of Biology, Brigham Young University, Provo, UT, USA
| | - Jacqueline Heckenhauer
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
- Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325, Frankfurt, Germany
| | - Russell J Stewart
- Department of Biomedical Engineering, University of Utah, Salt Lake City, UT, USA
| | - Paul B Frandsen
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
- Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT, USA
- Data Science Lab, Smithsonian Institution, Washington, DC, USA
| |
|
17
|
Nazarenko MS, Sleptcov AA, Zarubin AA, Salakhov RR, Shevchenko AI, Tmoyan NA, Elisaphenko EA, Zubkova ES, Zheltysheva NV, Ezhov MV, Kukharchuk VV, Parfyonova YV, Zakian SM, Zakharova IS. Calling and Phasing of Single-Nucleotide and Structural Variants of the LDLR Gene Using Oxford Nanopore MinION. Int J Mol Sci 2023; 24. [PMID: 36901902 DOI: 10.3390/ijms24054471] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/05/2022] [Revised: 01/27/2023] [Accepted: 02/22/2023] [Indexed: 03/12/2023] Open
Abstract
The LDLR locus has clinical significance for lipid metabolism, Mendelian familial hypercholesterolemia (FH), and common lipid metabolism-related diseases (coronary artery disease and Alzheimer's disease), but its intronic and structural variants are underinvestigated. The aim of this study was to design and validate a method for nearly complete sequencing of the LDLR gene using long-read Oxford Nanopore sequencing technology (ONT). Five PCR amplicons from LDLR of three patients with compound heterozygous FH were analyzed. We used standard workflows of EPI2ME Labs for variant calling. All rare missense and small deletion variants detected previously by massively parallel sequencing and Sanger sequencing were identified using ONT. One patient had a 6976 bp deletion (exons 15 and 16) that was detected by ONT with precisely located breakpoints between AluY and AluSx1. Trans-heterozygous associations between mutation c.530C>T and c.1054T>C, c.2141-966_2390-330del, and c.1327T>C, and between mutations c.1246C>T and c.940+3_940+6del of LDLR, were confirmed. We demonstrated the ability of ONT to phase variants, thereby enabling haplotype assignment for LDLR with personalized resolution. The ONT-based method was able to detect exonic variants with the additional benefit of intronic analysis in one run. This method can serve as an efficient and cost-effective tool for diagnosing FH and conducting research on extended LDLR haplotype reconstruction.
|
18
|
Fast KM, Rakestraw AW, Sandel MW. Complete mitochondrial genome of a livebearing freshwater fish (Cyprinodontiformes: Poeciliidae): Poecilia parae. Mitochondrial DNA B Resour 2023; 8:215-219. [PMID: 36761101 PMCID: PMC9904314 DOI: 10.1080/23802359.2023.2171246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023] Open
Abstract
Members of the fish family Poeciliidae (livebearing 'tooth-carps') have historically been used as models in medical research, behavior ecology, and biological control. This group of primarily freshwater fishes is highly tolerant to environmental factors such as salinity and warm temperatures and includes some invasive species. Here, we present the mitochondrial genome of Poecilia parae. A representative of this species was obtained from Suriname. The complete mitochondrial genome was sequenced using Oxford Nanopore technology and is 16,559 bp long. The genome contains 13 protein-coding genes, two ribosomal RNAs (rRNAs), 22 transfer RNAs (tRNAs), and one control region (D-loop). Phylogenetic analysis yielded topologies similar to those previously published. The data generated here will be useful in future studies of comparative biology and those utilizing environmental DNA (eDNA).
Affiliation(s)
- Kayla M. Fast
- Department of Biological and Environmental Sciences, The University of West Alabama, Livingston, AL, USA
| | - Alex W. Rakestraw
- Department of Biological and Environmental Sciences, The University of West Alabama, Livingston, AL, USA
| | - Michael W. Sandel
- Department of Wildlife, Fisheries, and Aquaculture, Mississippi State University, Mississippi State, MS, USA
| |
|
19
|
Delmiglio C, Waite DW, Lilly ST, Yan J, Elliott CE, Pattemore J, Guy PL, Thompson JR. New Virus Diagnostic Approaches to Ensuring the Ongoing Plant Biosecurity of Aotearoa New Zealand. Viruses 2023; 15:v15020418. [PMID: 36851632 PMCID: PMC9964515 DOI: 10.3390/v15020418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 01/25/2023] [Accepted: 01/26/2023] [Indexed: 02/05/2023] Open
Abstract
To protect New Zealand's unique ecosystems and primary industries, imported plant materials must be constantly monitored at the border for high-threat pathogens. Techniques adopted for this purpose must be robust, accurate, rapid, and sufficiently agile to respond to new and emerging threats. Polymerase chain reaction (PCR), especially real-time PCR, remains an essential diagnostic tool but it is now being complemented by high-throughput sequencing using both Oxford Nanopore and Illumina technologies, allowing unbiased screening of whole populations. The demand for and value of Point-of-Use (PoU) technologies, which allow for in situ screening, are also increasing. Isothermal PoU molecular diagnostics based on recombinase polymerase amplification (RPA) and loop-mediated amplification (LAMP) do not require expensive equipment and can reach PCR-comparable levels of sensitivity. Recent advances in PoU technologies offer opportunities for increased specificity, accuracy, and sensitivities which makes them suitable for wider utilization by frontline or border staff. National and international activities and initiatives are adopted to improve both the plant virus biosecurity infrastructure and the integration, development, and harmonization of new virus diagnostic technologies.
Affiliation(s)
- Catia Delmiglio
- Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
- Correspondence: (C.D.); (J.R.T.)
| | - David W. Waite
- Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
| | - Sonia T. Lilly
- Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
| | - Juncong Yan
- Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
| | - Candace E. Elliott
- Science and Surveillance Group, Post Entry Quarantine, Department of Agriculture, Fisheries and Forestry, Mickleham, VIC 3064, Australia
| | - Julie Pattemore
- Science and Surveillance Group, Post Entry Quarantine, Department of Agriculture, Fisheries and Forestry, Mickleham, VIC 3064, Australia
| | - Paul L. Guy
- Department of Botany, University of Otago, P.O. Box 56, Dunedin 9054, New Zealand
| | - Jeremy R. Thompson
- Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
- Correspondence: (C.D.); (J.R.T.)
| |
|
20
|
Winter S, de Raad J, Wolf M, Coimbra RTF, de Jong MJ, Schöneberg Y, Christoph M, von Klopotek H, Bach K, Foroush BP, Hanack W, Kauffeldt AH, Milz T, Ngetich EK, Wenz C, Sonnewald M, Nilsson MA, Janke A. A chromosome-scale reference genome assembly of the great sand eel, Hyperoplus lanceolatus. J Hered 2023; 114:189-194. [PMID: 36661278 PMCID: PMC10078159 DOI: 10.1093/jhered/esad003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/24/2022] [Accepted: 01/18/2023] [Indexed: 01/21/2023] Open
Abstract
Despite increasing sequencing efforts, numerous fish families still lack a reference genome, which complicates genetic research. One such understudied family is the sand lances (Ammodytidae, literally: 'sand burrower'), a globally distributed clade of over 30 fish species that tend to avoid tidal currents by burrowing into the sand. Here, we present the first annotated chromosome-level genome assembly of the great sand eel (Hyperoplus lanceolatus). The genome assembly was generated using Oxford Nanopore Technologies long sequencing reads and Illumina short reads for polishing. The final assembly has a total length of 808.5 Mbp, of which 97.1% were anchored into 24 chromosome-scale scaffolds using proximity-ligation scaffolding. It is highly contiguous with a scaffold and contig N50 of 33.7 Mbp and 31.3 Mbp, respectively, and has a BUSCO completeness score of 96.9%. The presented genome assembly is a valuable resource for future studies of sand lances, as this family is of great ecological and commercial importance and may also contribute to studies aiming to resolve the suprafamiliar taxonomy of bony fishes.
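The scaffold and contig N50 values quoted in this abstract follow the usual definition: the length of the sequence at which the cumulative sum, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_lengths = [40, 30, 20, 10]
print(n50(toy_lengths))  # 40 covers 40 of 100; 40 + 30 covers 70, so N50 is 30
```

The same function applies to contigs or scaffolds; only the input list of lengths changes.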
Affiliation(s)
- Sven Winter
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany; Research Institute for Wildlife Ecology, University of Veterinary Medicine, Vienna, Austria
- Jordi de Raad
  - Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany; LOEWE-Centre for Translational Biodiversity Genomics, 60325 Frankfurt am Main, Germany
- Magnus Wolf
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany
- Raphael T F Coimbra
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany
- Menno J de Jong
  - Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany
- Yannis Schöneberg
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany
- Maria Christoph
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Hagen von Klopotek
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Katharina Bach
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Behgol Pashm Foroush
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Wiebke Hanack
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Aaron Hagen Kauffeldt
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Tim Milz
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Emmanuel Kipruto Ngetich
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Christian Wenz
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany
- Moritz Sonnewald
  - Senckenberg Research Institute, Department of Marine Zoology, Section Ichthyology, 60325 Frankfurt am Main, Germany
- Maria Anna Nilsson
  - Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany; LOEWE-Centre for Translational Biodiversity Genomics, 60325 Frankfurt am Main, Germany
- Axel Janke
  - Institute for Ecology, Evolution, and Diversity, Goethe University, 60428 Frankfurt am Main, Germany; Senckenberg Biodiversity and Climate Research Centre, 60325 Frankfurt am Main, Germany; LOEWE-Centre for Translational Biodiversity Genomics, 60325 Frankfurt am Main, Germany

21
Salakhov RR, Golubenko MV, Valiakhmetov NR, Pavlyukova EN, Zarubin AA, Babushkina NP, Kucher AN, Sleptcov AA, Nazarenko MS. Application of Long-Read Nanopore Sequencing to the Search for Mutations in Hypertrophic Cardiomyopathy. Int J Mol Sci 2022; 23. [PMID: 36555486 DOI: 10.3390/ijms232415845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 12/10/2022] [Accepted: 12/10/2022] [Indexed: 12/15/2022] Open
Abstract
Increasing evidence suggests that both coding and non-coding regions of sarcomeric protein genes can contribute to hypertrophic cardiomyopathy (HCM). Here, we introduce an experimental workflow (tested on four patients) for complete sequencing of the most common HCM genes (MYBPC3, MYH7, TPM1, TNNT2, and TNNI3) via long-range PCR, Oxford Nanopore Technologies (ONT) sequencing, and bioinformatic analysis. We applied Illumina and Sanger sequencing to validate the results; FastQC, Qualimap, and MultiQC for quality evaluations; minimap2 to align data; Clair3 to call and phase variants; and ANNOVAR's tools and CADD to assess pathogenicity of variants. We could not amplify the region encompassing exons 6-12 of MYBPC3. A higher sequencing error rate was observed with ONT (6.86-6.92%) than with Illumina technology (1.14-1.35%), mostly for small indels. Pathogenic variant p.Gln1233Ter and benign polymorphism p.Arg326Gln in MYBPC3 in a heterozygous state were found in one patient. We demonstrated the ability of ONT to phase single-nucleotide variants, enabling direct haplotype determination for genes TNNT2 and TPM1. These findings highlight the importance of long-range PCR efficiency, as well as lower accuracy of variant calling by ONT than by Illumina technology; these differences should be clarified prior to clinical application of the ONT method.

22
Nelson TM, Ghosh S, Postler TS. L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data. Int J Mol Sci 2022; 23:ijms232415851. [PMID: 36555493 PMCID: PMC9781625 DOI: 10.3390/ijms232415851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 12/07/2022] [Accepted: 12/11/2022] [Indexed: 12/15/2022] Open
Abstract
Long-read sequencing (LRS) has been adopted to meet a wide variety of research needs, ranging from the construction of novel transcriptome annotations to the rapid identification of emerging virus variants. Amongst other advantages, LRS preserves more information about RNA at the transcript level than conventional high-throughput sequencing, including far more accurate and quantitative records of splicing patterns. New studies with LRS datasets are being published at an exponential rate, generating a vast reservoir of information that can be leveraged to address a host of different research questions. However, mining such publicly available data in a tailored fashion is currently not easy, as the available software tools typically require familiarity with the command-line interface, which constitutes a significant obstacle to many researchers. Additionally, different research groups utilize different software packages to perform LRS analysis, which often prevents a direct comparison of published results across different studies. To address these challenges, we have developed the Long-Read Analysis Pipeline for Transcriptomics (L-RAPiT), a user-friendly, free pipeline requiring no dedicated computational resources or bioinformatics expertise. L-RAPiT can be implemented directly through Google Colaboratory, a system based on the open-source Jupyter notebook environment, and allows for the direct analysis of transcriptomic reads from Oxford Nanopore and PacBio LRS machines. This new pipeline enables the rapid, convenient, and standardized analysis of publicly available or newly generated LRS datasets.

23
Conlin LK, Aref-Eshghi E, McEldrew DA, Luo M, Rajagopalan R. Long-read sequencing for molecular diagnostics in constitutional genetic disorders. Hum Mutat 2022; 43:1531-1544. [PMID: 36086952 PMCID: PMC9561063 DOI: 10.1002/humu.24465] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/23/2022] [Revised: 09/03/2022] [Accepted: 09/06/2022] [Indexed: 11/11/2022]
Abstract
Long-read sequencing (LRS) has been around for more than a decade, but widespread adoption of the technology has been slow due to the perceived high error rates and high sequencing cost. This is changing due to recent advances that produce highly accurate sequences and to falling costs. LRS promises significant improvement over short-read sequencing in four major areas: (1) better detection of structural variation, (2) better resolution of highly repetitive or nonunique regions, (3) accurate long-range haplotype phasing, and (4) the detection of base modifications natively from the sequencing data. Several successful applications of LRS have demonstrated its ability to resolve molecular diagnoses where short-read sequencing fails to identify a cause. However, the argument for increased diagnostic yield from LRS remains to be validated. Larger cohort studies may be required to establish the realistic boundaries of LRS's clinical utility and analytical validity, as well as the development of standards for clinical applications. We discuss the limitations of the current standard of care, contrast them with the applications and advantages of two major LRS platforms, PacBio and Oxford Nanopore, for molecular diagnostics of constitutional disorders, and present a critical argument about the potential of LRS in diagnostic settings.
Affiliation(s)
- Laura K. Conlin
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Erfan Aref-Eshghi
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Deborah A. McEldrew
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Minjie Luo
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Ramakrishnan Rajagopalan
  - Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA 19104
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104

24
Waite DW, Liefting L, Delmiglio C, Chernyavtseva A, Ha HJ, Thompson JR. Development and Validation of a Bioinformatic Workflow for the Rapid Detection of Viruses in Biosecurity. Viruses 2022; 14:v14102163. [PMID: 36298719 PMCID: PMC9610911 DOI: 10.3390/v14102163] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/15/2022] [Accepted: 09/25/2022] [Indexed: 11/05/2022] Open
Abstract
The field of biosecurity has greatly benefited from the widespread adoption of high-throughput sequencing technologies, for its ability to deeply query plant and animal samples for pathogens for which no tests exist. However, the bioinformatics analysis tools designed for rapid analysis of these sequencing datasets are not developed with this application in mind, limiting the ability of diagnosticians to standardise their workflows using published tool kits. We sought to assess previously published bioinformatic tools for their ability to identify plant- and animal-infecting viruses while distinguishing from the host genetic material. We discovered that many of the current generation of virus-detection pipelines are not adequate for this task, being outperformed by more generic classification tools. We created synthetic MinION and HiSeq libraries simulating plant and animal infections of economically important viruses and assessed a series of tools for their suitability for rapid and accurate detection of infection, and further tested the top performing tools against the VIROMOCK Challenge dataset to ensure that our findings were reproducible when compared with international standards. Our work demonstrated that several methods provide sensitive and specific detection of agriculturally important viruses in a timely manner and provides a key piece of ground truthing for method development in this space.
Affiliation(s)
- David W. Waite
  - Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
  - Correspondence:
- Lia Liefting
  - Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
- Catia Delmiglio
  - Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand
- Hye Jeong Ha
  - Animal Health Laboratory, Ministry for Primary Industries, Upper Hutt 5018, New Zealand
- Jeremy R. Thompson
  - Plant Health and Environment Laboratory, Ministry for Primary Industries, P.O. Box 2095, Auckland 1140, New Zealand

25
Zaitsev SS, Khizhnyakova MA, Feodorova VA. Retrospective Investigation of the Whole Genome of the Hypovirulent Listeria monocytogenes Strain of ST201, CC69, Lineage III, Isolated from a Piglet with Fatal Neurolisteriosis. Microorganisms 2022; 10:microorganisms10071442. [PMID: 35889161 PMCID: PMC9324732 DOI: 10.3390/microorganisms10071442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/14/2022] [Revised: 07/12/2022] [Accepted: 07/15/2022] [Indexed: 02/06/2023] Open
Abstract
Listeria monocytogenes (Lm), the causative agent for both human and animal listeriosis, is considered to be a rare but potentially fatal foodborne pathogen. While Lm strains associated with current cases of human listeriosis are now being intensely investigated, our knowledge of this microorganism which has caused listerial infection in the past is still extremely limited. The objective of this study was a retrospective whole-genome sequence analysis of the Lm collection strain, 4/52-1953, isolated in the middle of the 20th century from a piglet with listerial neuroinfection. The multi-locus sequence typing (MLST) analysis based on seven housekeeping genes (abcZ, bglA, cat, dapE, dat, ldh, and lhkA) showed that the Lm strain 4/52-1953 was assigned to the sequence type 201 (ST201), clonal complex 69 (CC69), and phylogenetic lineage III. The strain 4/52-1953, similarly to other ST201 strains, probably originated from the ST9, CC69 via ST157. At least eight different STs, ST69, ST72, ST130, ST136, ST148, ST469, ST769, and ST202, were identified as the descendants of the first generation and a single one, ST2290, was proved to be the descendant of the second generation. Among them there were strains either associated with some sporadic cases of human and animal listerial infection in the course of more than 60 years worldwide or isolated from food samples, fish and dairy products, or migratory birds. Phylogenetic analysis based on whole genomes of all the Lm strains available in the NCBI GenBank (n = 256) demonstrated that the strain 4/52-1953 belonged to minor Cluster I, represented by lineage III only, while two other major Clusters, II and III, were formed by lineages I and II. In the genome of the strain 4/52-1953, 41 virulence-associated genes, including the Listeria pathogenicity island 1 (LIPI-1), and LIPI-2 represented by two internalin genes, the inlA and inlB genes, and five genes related to antibiotic resistance, were found. 
These findings can help to make the emergence of both hyper- and hypovirulent variants, including those bearing antibiotic resistance genes, more visible and aid the aims of molecular epidemiology as well.
Affiliation(s)
- Sergey S Zaitsev
  - Federal Research Center for Virology and Microbiology, Branch in Saratov, 410028 Saratov, Russia
- Mariya A Khizhnyakova
  - Federal Research Center for Virology and Microbiology, Branch in Saratov, 410028 Saratov, Russia
- Valentina A Feodorova
  - Federal Research Center for Virology and Microbiology, Branch in Saratov, 410028 Saratov, Russia

26
Jabeen MF, Sanderson ND, Foster D, Crook DW, Cane JL, Borg C, Connolly C, Thulborn S, Pavord ID, Klenerman P, Street TL, Hinks TSC. Identifying Bacterial Airways Infection in Stable Severe Asthma Using Oxford Nanopore Sequencing Technologies. Microbiol Spectr 2022; 10:e0227921. [PMID: 35323032 PMCID: PMC9045196 DOI: 10.1128/spectrum.02279-21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/19/2021] [Accepted: 03/02/2022] [Indexed: 12/11/2022] Open
Abstract
Previous metagenomic studies in asthma have been limited by inadequate sequencing depth for species-level bacterial identification and by heterogeneity in clinical phenotyping. We hypothesize that chronic bacterial airways infection is a key "treatable trait" whose prevalence, clinical phenotype and reliable biomarkers need definition. In this study, we have applied a method for Oxford Nanopore sequencing for the unbiased metagenomic characterization of severe asthma. We optimized methods to compare performance of Illumina MiSeq, Nanopore sequencing, and RT-qPCR on total sputum DNA extracts against culture/MALDI-TOF for analysis of induced sputum samples from highly phenotyped severe asthma during clinical stability. In participants with severe asthma (n = 23) H. influenzae was commonly cultured (n = 8) and identified as the dominant bacterial species by metagenomic sequencing using an optimized method for Illumina MiSeq and Oxford Nanopore. Alongside superior operational characteristics, Oxford Nanopore achieved near complete genome coverage of H. influenzae and demonstrated a high level of agreement with Illumina MiSeq data. Clinically significant infection was confirmed with validated H. influenzae plasmid-based quantitative PCR assay. H. influenzae positive patients were found to have sputum neutrophilia and lower FeNO. In conclusion, using an optimized method of direct sequencing of induced sputum samples, H. influenzae was identified as a clinically relevant pathogen in severe asthma and was identified reliably using metagenomic sequencing. Application of these protocols in ongoing analysis of large patient cohorts will allow full characterization of this clinical phenotype. IMPORTANCE The human airways were once thought sterile in health. Now metagenomic techniques suggest bacteria may be present, but their role in asthma is not understood. 
Traditional culture lacks sensitivity and current sequencing techniques are limited by operational problems and limited ability to identify pathogens at species level. We optimized a new sequencing technique, Oxford Nanopore Technologies (ONT) sequencing, for use on human sputum samples and compared it with existing methods. We found ONT was effective for rapidly analyzing samples and could identify bacteria at the species level. We used this to show Haemophilus influenzae was a dominant bacterium in the airways in people with severe asthma. The presence of Haemophilus was associated with a "neutrophilic" form of asthma, a subgroup for which we currently lack specific treatments. Therefore, this technique could be used to target chronic antibiotic therapy and in research to characterize the full breadth of bacteria in the airways.
Affiliation(s)
- Maisha F. Jabeen
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Nicholas D. Sanderson
  - Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Dona Foster
  - Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Derrick W. Crook
  - Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Jennifer L. Cane
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Catherine Borg
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Clare Connolly
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Samantha Thulborn
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Ian D. Pavord
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Paul Klenerman
  - Peter Medawar Building for Pathogen Research and Translational Gastroenterology Unit, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
- Teresa L. Street
  - Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom
- Timothy S. C. Hinks
  - Respiratory Medicine Unit, Experimental Medicine Division, Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
  - National Institute for Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford, United Kingdom

27
Czmil A, Wronski M, Czmil S, Sochacka-Pietal M, Cmil M, Gawor J, Wołkowicz T, Plewczynski D, Strzalka D, Pietal M. NanoForms: an integrated server for processing, analysis and assembly of raw sequencing data of microbial genomes, from Oxford Nanopore technology. PeerJ 2022; 10:e13056. [PMID: 35368340 PMCID: PMC8973472 DOI: 10.7717/peerj.13056] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 02/13/2022] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND Next Generation Sequencing (NGS) techniques dominate today's landscape of genetics and genomics research. Though Illumina still dominates worldwide sequencing, Oxford Nanopore is one of the leading technologies currently being used by biologists, medics and geneticists across various applications. Oxford Nanopore is automated and relatively simple for conducting experiments, but generates gigabytes of raw data that must be processed by an often ambiguous set of alternative bioinformatics command-line tools and genomics frameworks, which require a knowledge of bioinformatics to run. RESULTS We established an inter-collegiate collaboration across experimentalists and bioinformaticians in order to provide a novel bioinformatics tool, free for academics. This tool allows people without extensive bioinformatics knowledge to simply process their raw genome sequencing data. Currently, for ICT resource maintenance reasons, our server is only capable of handling small genomes (up to 15 Mb). In this paper, we introduce our tool, NanoForms: an intuitive and integrated web server for the processing and analysis of raw prokaryotic genome data coming from Oxford Nanopore. NanoForms is freely available for academics at the following locations: http://nanoforms.tech (webserver) and https://github.com/czmilanna/nanoforms (GitHub source repository).
Affiliation(s)
- Anna Czmil
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Wronski
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Sylwester Czmil
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Marta Sochacka-Pietal
  - Department of Biotechnology and Bioinformatics, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Cmil
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Jan Gawor
  - DNA Sequencing and Oligonucleotide Synthesis Laboratory, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Masovian, Poland
- Tomasz Wołkowicz
  - Department of Bacteriology and Biocontamination Control, National Institute of Public Health-National Institute of Hygiene, Warsaw, Masovian, Poland
- Dariusz Plewczynski
  - Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Masovian, Poland; Laboratory of Bioinformatics and Computational Genomics, Warsaw University of Technology, Warsaw, Masovian, Poland
- Dominik Strzalka
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland
- Michal Pietal
  - Department of Complex Systems, Rzeszow University of Technology, Rzeszow, Subcarpathian, Poland

28
Ong CT, Ross EM, Boe-Hansen GB, Turni C, Hayes BJ, Tabor AE. Technical note: overcoming host contamination in bovine vaginal metagenomic samples with nanopore adaptive sequencing. J Anim Sci 2022; 100:skab344. [PMID: 34791313 PMCID: PMC8722758 DOI: 10.1093/jas/skab344] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/11/2021] [Accepted: 11/10/2021] [Indexed: 12/11/2022] Open
Abstract
Animal metagenomic studies, in which host-associated microbiomes are profiled, are an increasingly important contribution to our understanding of the physiological functions, health and susceptibility to diseases of livestock. One of the major challenges in these studies is host DNA contamination, which limits the sequencing capacity for metagenomic content and reduces the accuracy of metagenomic profiling. This is the first study comparing the effectiveness of different sequencing methods for profiling bovine vaginal metagenomic samples. We compared the new method of Oxford Nanopore Technologies (ONT) adaptive sequencing, which can be used to target or eliminate defined genetic sequences, to standard ONT sequencing, Illumina 16S rDNA amplicon sequencing, and Illumina shotgun sequencing. The efficiency of each method in recovering the metagenomic data and recalling the metagenomic profiles was assessed. ONT adaptive sequencing yielded a higher amount of metagenomic data than the other methods per 1 Gb of sequence data. The increased sequencing efficiency of ONT adaptive sequencing consequently reduced the amount of raw data needed to provide sufficient coverage for the metagenomic samples with a high host-to-microbe DNA ratio. Additionally, the long reads generated by ONT adaptive sequencing retained the continuity of read information, which benefited the in-depth annotations for both taxonomical and functional profiles of the metagenome. The different methods resulted in the identification of different taxa. The genus Clostridium, which was identified at low abundance and categorized under Order "Unclassified Clostridiales" when using the 16S rDNA amplicon sequencing method, was identified as the dominant genus in the sample when sequenced with the three other methods. Additionally, higher numbers of annotated genes were identified with ONT adaptive sequencing, which also produced high coverage on most of the commonly annotated genes. 
This study illustrates the advantages of ONT adaptive sequencing in improving the amount of metagenomic data derived from microbiome samples with high host-to-microbe DNA ratio and the advantage of long reads in preserving intact information for accurate annotations.
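The comparison "per 1 Gb of sequence data" in this abstract can be read as a simple normalization: the non-host bases recovered, scaled to one gigabase of total output. The sketch below illustrates that arithmetic only; the function name `non_host_bases_per_gb` and the example numbers are hypothetical and are not values reported by the study.

```python
def non_host_bases_per_gb(total_bases, host_bases):
    """Return non-host (candidate microbial) bases per 1 Gb of sequence data.

    total_bases: all bases produced by the run.
    host_bases:  bases assigned to the host genome.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if not 0 <= host_bases <= total_bases:
        raise ValueError("host_bases must lie between 0 and total_bases")
    non_host = total_bases - host_bases
    # Scale to one gigabase (1e9 bases) of total sequence output.
    return non_host * 1_000_000_000 / total_bases


# Hypothetical run: 2 Gb of output, of which 1.9 Gb maps to the host.
print(non_host_bases_per_gb(2_000_000_000, 1_900_000_000))  # 50000000.0
```

Normalizing this way lets runs of different sizes be compared on the same footing, which is the sense in which one method can yield more metagenomic data than another.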
Affiliation(s)
- Chian Teng Ong
  - Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland 4072, Australia
- Elizabeth M Ross
  - Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland 4072, Australia
- Gry B Boe-Hansen
  - Faculty of Science, School of Veterinary Science, The University of Queensland, Queensland 4072, Australia
- Conny Turni
  - Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland 4072, Australia
- Ben J Hayes
  - Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland 4072, Australia
- Ala E Tabor
  - Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland 4072, Australia
  - Faculty of Science, School of Chemistry and Molecular Bioscience, The University of Queensland, Queensland 4072, Australia
Collapse
|
29
|
Khezri A, Avershina E, Ahmad R. Hybrid Assembly Provides Improved Resolution of Plasmids, Antimicrobial Resistance Genes, and Virulence Factors in Escherichia coli and Klebsiella pneumoniae Clinical Isolates. Microorganisms 2021; 9:2560. [PMID: 34946161 PMCID: PMC8704702 DOI: 10.3390/microorganisms9122560] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/03/2021] [Revised: 12/03/2021] [Accepted: 12/06/2021] [Indexed: 12/28/2022] Open
Abstract
Emerging sequencing technologies have given researchers a unique opportunity to study factors related to microbial pathogenicity, such as antimicrobial resistance (AMR) genes and virulence factors. However, the use of whole-genome sequence (WGS) data requires good knowledge of the bioinformatics involved, as well as the necessary techniques. In this study, a total of nine Escherichia coli and Klebsiella pneumoniae isolates from Norwegian clinical samples were sequenced using both MinION and Illumina platforms. Three out of nine samples were sequenced directly from blood culture, and one sample was sequenced from a mixed-blood culture. For genome assembly, several long-read (Canu, Flye, Unicycler, and Miniasm), short-read (ABySS, Unicycler, and SPAdes), and hybrid (Unicycler, hybridSPAdes, and MaSuRCA) assemblers were tested. Assembled genomes from the best-performing assemblers (according to quality checks using QUAST and BUSCO) were subjected to downstream analyses. Flye and Unicycler performed best for the assembly of long and short reads, respectively. For hybrid assembly, Unicycler was the top-performing assembler and produced more circularized and complete genome assemblies. Hybrid assemblies performed substantially better than MinION-only and Illumina-only assemblies in downstream analyses to predict putative plasmids, AMR genes and β-lactamase gene variants. Thus, hybrid assembly has the potential to reveal factors related to microbial pathogenicity in clinical and mixed samples.
Affiliation(s)
- Abdolrahman Khezri
- Department of Biotechnology, Inland Norway University of Applied Sciences, 2318 Hamar, Norway
- Ekaterina Avershina
- Department of Biotechnology, Inland Norway University of Applied Sciences, 2318 Hamar, Norway
- Rafi Ahmad
- Department of Biotechnology, Inland Norway University of Applied Sciences, 2318 Hamar, Norway
- Faculty of Health Sciences, Institute of Clinical Medicine, UiT-The Arctic University of Norway, Hansine Hansens veg 18, 9019 Tromsø, Norway
|
30
|
Turner L, Backenstose NJC, Brandl S, Bernal MA. Range expansion and complete mitochondrial genome of the highfin blenny (Lupinoblennius nicholsi). Mol Biol Rep 2021; 49:1587-1591. [PMID: 34773549 DOI: 10.1007/s11033-021-06932-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2021] [Accepted: 11/04/2021] [Indexed: 10/19/2022]
Abstract
BACKGROUND The highfin blenny, Lupinoblennius nicholsi, is a marine fish species reported in reef and rocky inshore habitats with a disjunct distribution in the southern Gulf of Mexico. Overall, there are very few studies on this species and there is a scarcity of molecular resources for genetic comparisons. We set out to report the first mitochondrial genome for L. nicholsi and a range expansion for the species. METHODS AND RESULTS An individual of L. nicholsi was collected from the coast of Dauphin Island, Alabama. The mitochondrial genome was sequenced, assembled, and annotated. The fragment corresponding to cytochrome oxidase I (COI) was used to compare this sample to other cryptobenthic species of the Atlantic. Finding a mature individual on the coast of Alabama implies that this species has a continuous distribution throughout the northern Gulf of Mexico. The mitochondrial genome of L. nicholsi is 16,416 bp in length and comprises 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and a non-coding D-loop. Comparisons using COI support the identification of the specimen as L. nicholsi and separate it from other cryptobenthic fishes found in the area. CONCLUSIONS This study presents the first mitochondrial genome for L. nicholsi, serving as a reference for future comparative studies with marine fishes. By reporting the range expansion of this species, this study provides insights into the fish diversity of the Gulf of Mexico.
Affiliation(s)
- Logan Turner
- Department of Biological Sciences, College of Science and Mathematics, Auburn University, Auburn, AL, 36930, USA
- Nathan J C Backenstose
- Department of Biological Sciences, College of Arts and Science, State University of New York at Buffalo, Buffalo, NY, 14260, USA
- Simon Brandl
- Department of Marine Science, College of Natural Science, University of Texas at Austin, Port Aransas, TX, 78373, USA
- Moisés A Bernal
- Department of Biological Sciences, College of Science and Mathematics, Auburn University, Auburn, AL, 36930, USA
|
31
|
Manouana GP, Maloum MN, Bikangui R, Oye Bingono SO, Ondo GN, Honkpehedji JY, Rossatanga EG, Assoumou SZ, Pallerla SR, Rachakonda S, Ndong RM, Lekana-Douki JB, Siawaya JFD, Borrmann S, Kremsner PG, Lell B, Velavan TP, Adegnika AA. Emergence of B.1.1.318 SARS-CoV-2 viral lineage and high incidence of alpha B.1.1.7 variant of concern in Republic of Gabon. Int J Infect Dis 2021; 114:151-154. [PMID: 34742926 PMCID: PMC8563502 DOI: 10.1016/j.ijid.2021.10.057] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/23/2021] [Revised: 10/09/2021] [Accepted: 10/29/2021] [Indexed: 11/26/2022] Open
Abstract
OBJECTIVE Variants of concern (VOCs) associated with relatively high transmissibility appear to be spreading rapidly in Gabon. Therefore, it is imperative to understand the distribution of several variants of concern in the population, which could have implications for transmissibility and vaccine efficacy. METHODS Between February and May 2021, SARS-CoV-2 genomes were sequenced using the Oxford Nanopore MinION method and the respective genome diversity was elucidated. Phylogenetic analysis was performed and genomes were classified using pangolin lineages. RESULTS The results highlight the increase (46%) of the alpha variant of concern (B.1.1.7) in the Gabonese population over the study period. In addition, an increase (31%) in the B.1.1.318 lineage, which is associated with high transmission and impaired vaccine efficacy (D614G+E484K+Y144del), was detected. CONCLUSION With the second wave ongoing, our findings highlight the need for surveillance of the SARS-CoV-2 genome in the Republic of Gabon and should provide useful guidance to policy makers in selecting an appropriate vaccine for the population.
Affiliation(s)
- Gédéon Prince Manouana
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany
- Rodrigue Bikangui
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Ecole doctorale de Franceville, Franceville, Gabon
- Samira Zoa Assoumou
- Laboratoire Professeur Daniel Gahouma, Libreville, Gabon; Département de Bactériologie-Virologie, Université des Sciences de la Santé, Libreville, Gabon
- Steffen Borrmann
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany; German Center for Infection Research (DZIF), Tübingen, Germany
- Peter G Kremsner
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany; German Center for Infection Research (DZIF), Tübingen, Germany
- Bertrand Lell
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany; Medical University of Vienna, Vienna, Austria
- Thirumalaisamy P Velavan
- Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany; Vietnamese-German Center for Medical Research, VG-CARE, Hanoi, Vietnam
- Ayola Akim Adegnika
- Centre de Recherches Médicales de Lambaréné, Lambaréné, Gabon; Institute of Tropical Medicine, Universitätsklinikum Tübingen, Tübingen, Germany; German Center for Infection Research (DZIF), Tübingen, Germany; Fondation pour la Recherche Scientifique, Cotonou, Bénin
|
32
|
Bao Y, Wadden J, Erb-Downward JR, Ranjan P, Zhou W, McDonald TL, Mills RE, Boyle AP, Dickson RP, Blaauw D, Welch JD. SquiggleNet: real-time, direct classification of nanopore signals. Genome Biol 2021; 22:298. [PMID: 34706748 PMCID: PMC8548853 DOI: 10.1186/s13059-021-02511-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 03/05/2021] [Accepted: 10/04/2021] [Indexed: 11/17/2022] Open
Abstract
We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements.
Affiliation(s)
- Yuwei Bao
- Department of Computer Science and Engineering, University of Michigan, Ann Arbor, 48109, MI, USA
- Jack Wadden
- Department of Computer Science and Engineering, University of Michigan, Ann Arbor, 48109, MI, USA; Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, 48109, MI, USA
- John R Erb-Downward
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, USA
- Piyush Ranjan
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, USA
- Weichen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, MI, USA
- Torrin L McDonald
- Department of Human Genetics, University of Michigan Medical, Ann Arbor, 48109, MI, USA
- Ryan E Mills
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, MI, USA; Department of Human Genetics, University of Michigan Medical, Ann Arbor, 48109, MI, USA
- Alan P Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, MI, USA; Department of Human Genetics, University of Michigan Medical, Ann Arbor, 48109, MI, USA
- Robert P Dickson
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, USA; Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, 48109, MI, USA; Michigan Center for Integrative Research in Critical Care, Ann Arbor, 48109, MI, USA
- David Blaauw
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, USA
- Joshua D Welch
- Department of Computer Science and Engineering, University of Michigan, Ann Arbor, 48109, MI, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, MI, USA
|
33
|
Mantri SS, Negri T, Sales-Ortells H, Angelov A, Peter S, Neidhardt H, Oelmann Y, Ziemert N. Metagenomic Sequencing of Multiple Soil Horizons and Sites in Close Vicinity Revealed Novel Secondary Metabolite Diversity. mSystems 2021; 6:e0101821. [PMID: 34636675 DOI: 10.1128/mSystems.01018-21] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/04/2023] Open
Abstract
Discovery of novel antibiotics is crucial for combating rapidly spreading antimicrobial resistance and new infectious diseases. Most of the clinically used antibiotics are natural products—secondary metabolites produced by soil microbes that can be cultured in the lab. Rediscovery of these secondary metabolites during discovery expeditions costs both time and resources. Metagenomics approaches can overcome this challenge by capturing both culturable and unculturable hidden microbial diversity. To be effective, such an approach should address questions like the following. Which sequencing method is better at capturing the microbial diversity and biosynthesis potential? What part of the soil should be sampled? Can patterns and correlations from such big-data explorations guide future novel natural product discovery surveys? Here, we address these questions by a paired amplicon and shotgun metagenomic sequencing survey of samples from soil horizons of multiple forest sites very close to each other. Metagenome mining identified numerous novel biosynthetic gene clusters (BGCs) and enzymatic domain sequences. Hybrid assembly of both long reads and short reads improved the metagenomic assembly and resulted in better BGC annotations. A higher percentage of novel domains was recovered from shotgun metagenome data sets than from amplicon data sets. Overall, in addition to revealing the biosynthetic potential of soil microbes, our results suggest the importance of sampling not only different soils but also their horizons to capture microbial and biosynthetic diversity and highlight the merits of metagenome sequencing methods. IMPORTANCE This study helped uncover the biosynthesis potential of forest soils via exploration of shotgun metagenome and amplicon sequencing methods and showed that both methods are needed to expose the full microbial diversity in soil. 
Based on our metagenome mining results, we suggest revising the historical strategy of sampling soils from far-flung places, as we found a significant number of novel and diverse BGCs and domains even in different soils that are very close to each other. Furthermore, sampling of different soil horizons can reveal additional diversity that often remains hidden and is mainly caused by differences in key environmental parameters such as soil pH and nutrient content. This paired metagenomic survey identified diversity patterns and correlations, a step toward developing a rational approach for future natural product discovery surveys.
|
34
|
Johnson LK, Sahasrabudhe R, Gill JA, Roach JL, Froenicke L, Brown CT, Whitehead A. Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish. Gigascience 2021; 9:5859380. [PMID: 32556169 PMCID: PMC7301629 DOI: 10.1093/gigascience/giaa067] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/04/2019] [Revised: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. FINDINGS Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented, with high numbers of contigs, while ONT-only assemblies were error-prone, with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. CONCLUSIONS High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated from long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
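The coverage figures quoted in this abstract (30-45× and 50-160×) follow from the usual definition of sequencing depth: total sequenced bases divided by genome size. The following is a minimal Python sketch of that arithmetic, assuming read lengths are already available as a list of integers; the function name and the toy numbers are hypothetical and are not taken from the study.

```python
def mean_coverage(read_lengths, genome_size):
    """Return mean depth of coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Toy example (hypothetical values): three reads over a 200 bp "genome".
toy_depth = mean_coverage([1000, 2000, 3000], 200)
print(f"{toy_depth:.1f}x")
```

In practice the genome size is itself an estimate, so a depth computed this way is only as reliable as that estimate.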
Affiliation(s)
- Lisa K Johnson
- Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Ruta Sahasrabudhe
- DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- James Anthony Gill
- Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Jennifer L Roach
- Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Lutz Froenicke
- DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- C Titus Brown
- Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Andrew Whitehead
- Correspondence address: Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
|
35
|
Flint A, Reaume S, Harlow J, Hoover E, Weedmark K, Nasheri N. Genomic analysis of human noroviruses using combined Illumina-Nanopore data. Virus Evol 2021; 7:veab079. [PMID: 35186325 PMCID: PMC8570145 DOI: 10.1093/ve/veab079] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/01/2021] [Revised: 08/23/2021] [Accepted: 09/13/2021] [Indexed: 07/23/2023] Open
Abstract
Whole-genome sequence analysis of noroviruses is routinely performed by employing a metagenomic approach. While this methodology has several advantages, such as allowing for the examination of co-infection, it has some limitations, such as the requirement of high viral load to achieve full-length or near full-length genomic sequences. In this study, we used a pre-amplification step to obtain full-length genomic amplicons from 39 Canadian GII isolates, followed by deep sequencing on Illumina and Oxford Nanopore platforms. This approach significantly reduced the required viral titre to obtain full-genome coverage. Herein, we compared the coverage and sequences obtained by both platforms and provided an in-depth genomic analysis of the obtained sequences, including the presence of single-nucleotide variants and recombination events.
Affiliation(s)
- Annika Flint
- Genomics Laboratory, Bureau of Microbial Hazards, Health Canada, Ottawa, ON, Canada
- Spencer Reaume
- National Food Virology Reference Centre, Bureau of Microbial Hazards, Health Canada, Ottawa, ON, Canada
- Jennifer Harlow
- National Food Virology Reference Centre, Bureau of Microbial Hazards, Health Canada, Ottawa, ON, Canada
- Emily Hoover
- Genomics Laboratory, Bureau of Microbial Hazards, Health Canada, Ottawa, ON, Canada
- Kelly Weedmark
- Genomics Laboratory, Bureau of Microbial Hazards, Health Canada, Ottawa, ON, Canada
|
36
|
Mahmoud M, Doddapaneni H, Timp W, Sedlazeck FJ. PRINCESS: comprehensive detection of haplotype resolved SNVs, SVs, and methylation. Genome Biol 2021; 22:268. [PMID: 34521442 PMCID: PMC8442460 DOI: 10.1186/s13059-021-02486-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/01/2021] [Accepted: 09/02/2021] [Indexed: 12/11/2022] Open
Abstract
Long-read sequencing has been shown to have advantages in structural variation (SV) detection and methylation calling. Many studies focus on only one of SVs, methylation, or phasing of SNVs; however, only the combination of variant types provides comprehensive insight into the sample and thus enables novel findings in biology or medicine. PRINCESS is a structured workflow that takes raw sequence reads and generates a fully phased SNV, SV, and methylation call set within a few hours. PRINCESS achieves high accuracy and long phasing even on low-coverage datasets and can resolve repetitive, complex, medically relevant genes that often escape detection. PRINCESS is publicly available at https://github.com/MeHelmy/princess under the MIT license.
Affiliation(s)
- Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
|
37
|
Moolhuijzen P, See PT, Moffat CS. The first genome assembly of fungal pathogen Pyrenophora tritici-repentis race 1 isolate using Oxford Nanopore MinION sequencing. BMC Res Notes 2021; 14:334. [PMID: 34454585 PMCID: PMC8403381 DOI: 10.1186/s13104-021-05751-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/09/2021] [Accepted: 08/19/2021] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVES The assembly of fungal genomes from short reads is hampered by long repetitive and low-GC regions. However, long-read sequencing technologies, such as PacBio and Oxford Nanopore, are able to resolve many problematic regions, thereby providing an opportunity to improve fragmented genome assemblies derived from short reads only. Here, isolate 134 (Ptr134) of the necrotrophic fungal pathogen Pyrenophora tritici-repentis (Ptr), which causes tan spot disease on wheat, was sequenced on an Oxford Nanopore Technologies (ONT) MinION to improve on a previous Illumina short-read genome assembly and provide a more complete genome resource for pan-genomic analyses of Ptr. RESULTS The MinION-sequenced Ptr134 genome was assembled into 28 contiguous sequences with a total length of 40.79 Mb and a GC content of 50.81%. Compared with the previous short-read assembly, the long-read assembly provided 6.79 Mb of new sequence and 2846 additional annotated protein-coding genes. This improved genome sequence represents near-complete chromosomes and is an important resource for large-scale and pan-genomic comparative analyses.
Affiliation(s)
- Paula Moolhuijzen
- Centre for Crop Disease and Management, School of Molecular Life Sciences, Curtin University, Bentley, WA, 6102, Australia
- Pao Theen See
- Centre for Crop Disease and Management, School of Molecular Life Sciences, Curtin University, Bentley, WA, 6102, Australia
- Caroline S Moffat
- Centre for Crop Disease and Management, School of Molecular Life Sciences, Curtin University, Bentley, WA, 6102, Australia
|
38
|
Eastis AN, Fast KM, Sandel MW. The complete mitochondrial genome of the Variable Platyfish Xiphophorus variatus. Mitochondrial DNA B Resour 2021; 6:2640-2642. [PMID: 34409164 PMCID: PMC8366644 DOI: 10.1080/23802359.2021.1963339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
We present the complete mitochondrial genome sequence of the Variable Platyfish, Xiphophorus variatus (Meek 1904) (Cyprinodontiformes: Poeciliidae). The genome consists of 16,624 bp and encodes 13 protein-coding genes, 22 transfer RNAs, 2 ribosomal RNAs, and 1 control region. Genome-wide nucleotide composition is 27.79% adenine, 31.11% cytosine, 15.63% guanine, and 25.48% thymine. The X. variatus mitochondrial genome shares similar GC content and identical gene order and gene strand location with other members of Poeciliidae. The sequence presented herein will be useful for future phylogenetic and biomedical research and for designing primers for species detection from environmental DNA samples.
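The nucleotide composition reported in this abstract implies a GC content of 31.11% + 15.63% = 46.74%. The following is a minimal Python sketch of how such percentages can be computed from a sequence string, assuming a plain uppercase or lowercase DNA string as input; the function names and the toy sequence are hypothetical, and only the four reported percentages come from the abstract.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(composition):
    """GC content is the sum of the G and C percentages."""
    return composition["G"] + composition["C"]


# Percentages reported in the abstract.
reported = {"A": 27.79, "C": 31.11, "G": 15.63, "T": 25.48}
print(f"GC content from reported values: {gc_content(reported):.2f}%")

# Toy sequence (hypothetical) to exercise the counting function.
toy = base_composition("AACCCGGT")
print(toy, gc_content(toy))
```

Ambiguity codes such as N are ignored here, which is one possible convention; other tools may count them in the denominator.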
Affiliation(s)
- Anna Nicole Eastis
- Department of Biological and Environmental Sciences, University of West Alabama, Livingston, AL, USA
- Kayla Marie Fast
- Department of Biological and Environmental Sciences, University of West Alabama, Livingston, AL, USA
- Michael Warren Sandel
- Department of Biological and Environmental Sciences, University of West Alabama, Livingston, AL, USA
|
39
|
Frei D, Veekman E, Grogg D, Stoffel-Studer I, Morishima A, Shimizu-Inatsugi R, Yates S, Shimizu KK, Frey JE, Studer B, Copetti D. Ultralong Oxford Nanopore Reads Enable the Development of a Reference-Grade Perennial Ryegrass Genome Assembly. Genome Biol Evol 2021; 13:evab159. [PMID: 34247248 PMCID: PMC8358221 DOI: 10.1093/gbe/evab159] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Accepted: 07/02/2021] [Indexed: 12/24/2022] Open
Abstract
Despite the progress made in DNA sequencing over the last decade, reconstructing telomere-to-telomere genome assemblies of large and repeat-rich eukaryotic genomes is still difficult. More accurate basecalls or longer reads could address this issue, but no current sequencing platform can provide both simultaneously. Perennial ryegrass (Lolium perenne L.) is an example of an important species for which the lack of a reference genome assembly hindered a swift adoption of genomics-based methods into breeding programs. To fill this gap, we optimized the Oxford Nanopore Technologies sequencing protocol, obtaining sequencing reads with an N50 of 62 kb, a very high value for a plant sample. The assembly of such reads produced a highly complete (2.3 of 2.7 Gb), correct (QV 45), and contiguous (contig N50 and N90 of 11.74 and 3.34 Mb, respectively) genome assembly. We show how read length was key in determining the assembly contiguity. Sequence annotation revealed the dominance of transposable elements and repeated sequences (81.6% of the assembly) and identified 38,868 protein-coding genes. Almost 90% of the bases could be anchored to seven pseudomolecules, providing the first high-quality haploid reference assembly for perennial ryegrass. This protocol will enable the production of longer Oxford Nanopore Technologies reads for more plant samples and help usher forage grasses into modern genomics-assisted breeding programs.
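The read N50 (62 kb) and the contig N50 and N90 values quoted in this abstract are all instances of the same Nx statistic: the length L such that sequences of length L or longer contain at least x% of the total bases. The following is a minimal Python sketch under that standard definition; the function name and the toy lengths are hypothetical and are not taken from the study, and assembly evaluation tools may differ in edge-case handling.

```python
def nx(lengths, x=50):
    """Return Nx: the length L such that sequences of length >= L
    together contain at least x% of the total bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare in integers to avoid floating-point rounding issues.
        if running * 100 >= total * x:
            return length
    return min(lengths)  # not reached for valid input


# Toy contig lengths (hypothetical), 30 bases in total.
contigs = [10, 8, 6, 4, 2]
print("N50:", nx(contigs, 50))
print("N90:", nx(contigs, 90))
```

Because N90 requires 90% of the bases to be covered, it is never larger than N50 for the same set of lengths, which matches the ordering of the values reported above.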
Affiliation(s)
- Daniel Frei
- Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics, Wädenswil, Switzerland
- Daniel Grogg
- Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
- Ingrid Stoffel-Studer
- Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
- Aki Morishima
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Rie Shimizu-Inatsugi
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Steven Yates
- Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
- Kentaro K Shimizu
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
- Kihara Institute for Biological Research, Yokohama City University, Maioka, Totsuka-ward, Yokohama, Japan
- Jürg E Frey
- Agroscope, Research Group Molecular Diagnostics, Genomics and Bioinformatics, Wädenswil, Switzerland
- Bruno Studer
- Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
- Dario Copetti
- Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
|
40
|
Hotaling S, Sproul JS, Heckenhauer J, Powell A, Larracuente AM, Pauls SU, Kelley JL, Frandsen PB. Long Reads Are Revolutionizing 20 Years of Insect Genome Sequencing. Genome Biol Evol 2021; 13:evab138. [PMID: 34152413 PMCID: PMC8358217 DOI: 10.1093/gbe/evab138] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Accepted: 06/10/2021] [Indexed: 12/15/2022] Open
Abstract
The first insect genome assembly (Drosophila melanogaster) was published two decades ago. Today, nuclear genome assemblies are available for a staggering 601 insect species representing 20 orders. In this study, we analyzed the most-contiguous assembly for each species and provide a "state-of-the-field" perspective, emphasizing taxonomic representation, assembly quality, gene completeness, and sequencing technologies. Relative to species richness, genomic efforts have been biased toward four orders (Diptera, Hymenoptera, Collembola, and Phasmatodea), Coleoptera are underrepresented, and 11 orders still lack a publicly available genome assembly. The average insect genome assembly is 439.2 Mb in length with 87.5% of single-copy benchmarking genes intact. Most notable has been the impact of long-read sequencing; assemblies that incorporate long reads are ∼48× more contiguous than those that do not. We offer four recommendations as we collectively continue building insect genome resources: 1) seek better integration between independent research groups and consortia, 2) balance future sampling between filling taxonomic gaps and generating data for targeted questions, 3) take advantage of long-read sequencing technologies, and 4) expand and improve gene annotations.
Affiliation(s)
- Scott Hotaling
- School of Biological Sciences, Washington State University, Pullman, Washington, USA
| | - John S Sproul
- Department of Biology, University of Rochester, New York, USA
| | - Jacqueline Heckenhauer
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE‐TBG), Frankfurt, Germany
- Department of Terrestrial Zoology, Entomology III, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
| | - Ashlyn Powell
- Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, USA
| | | | - Steffen U Pauls
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE‐TBG), Frankfurt, Germany
- Department of Terrestrial Zoology, Entomology III, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Institute for Insect Biotechnology, Justus-Liebig-University, Giessen, Germany
| | - Joanna L Kelley
- School of Biological Sciences, Washington State University, Pullman, Washington, USA
| | - Paul B Frandsen
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE‐TBG), Frankfurt, Germany
- Department of Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, USA
- Data Science Lab, Smithsonian Institution, Washington, District of Columbia, USA
|
41
|
McCartney AM, Hilario E, Choi S, Guhlin J, Prebble JM, Houliston G, Buckley TR, Chagné D. An exploration of assembly strategies and quality metrics on the accuracy of the rewarewa (Knightia excelsa) genome. Mol Ecol Resour 2021; 21:2125-2144. [PMID: 33955186 PMCID: PMC8362059 DOI: 10.1111/1755-0998.13406] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/23/2020] [Revised: 03/18/2021] [Accepted: 04/20/2021] [Indexed: 12/17/2022]
Abstract
We used long read sequencing data generated from Knightia excelsa, a nectar-producing Proteaceae tree endemic to Aotearoa (New Zealand), to explore how sequencing data type, volume and workflows can impact final assembly accuracy and chromosome reconstruction. Establishing a high-quality genome for this species has specific cultural importance to Māori and commercial importance to honey producers in Aotearoa. Assemblies were produced by five long read assemblers using data subsampled based on read lengths, two polishing strategies and two Hi-C mapping methods. Our results from subsampling the data by read length showed that each assembler tested performed differently depending on the coverage and the read length of the data. Subsampling highlighted that input data with longer read lengths but perhaps lower coverage constructed more contiguous, kmers and gene-complete assemblies than short read length input data with higher coverage. The final genome assembly was constructed into 14 pseudochromosomes using an initial flye long read assembly, a racon/medaka/pilon combined polishing strategy, salsa2 and allhic scaffolding, juicebox curation, and Macadamia linkage map validation. We highlighted the importance of developing assembly workflows based on the volume and read length of sequencing data and established a robust set of quality metrics for generating high-quality assemblies. Scaffolding analyses highlighted that problems found in the initial assemblies could not be resolved accurately by Hi-C data and that assembly scaffolding was more successful when the underlying contig assembly was of higher accuracy. These findings provide insight into how quality assessment tools can be implemented throughout genome assembly pipelines to inform the de novo reconstruction of a high-quality genome assembly for nonmodel organisms.
Affiliation(s)
- Ann M. McCartney
- Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
- Genomics Aotearoa, Dunedin, New Zealand
| | - Elena Hilario
- Genomics Aotearoa, Dunedin, New Zealand
- The New Zealand Institute for Plant and Food Research (Plant & Food Research), Sandringham, New Zealand
| | - Seung‐Sub Choi
- Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
- Genomics Aotearoa, Dunedin, New Zealand
- School of Biological Sciences, The University of Auckland, Auckland, New Zealand
| | - Joseph Guhlin
- Genomics Aotearoa, Dunedin, New Zealand
- University of Otago, Dunedin, New Zealand
| | - Jessica M. Prebble
- Genomics Aotearoa, Dunedin, New Zealand
- Manaaki Whenua Landcare Research, Lincoln, New Zealand
| | - Gary Houliston
- Genomics Aotearoa, Dunedin, New Zealand
- Manaaki Whenua Landcare Research, Lincoln, New Zealand
| | - Thomas R. Buckley
- Manaaki Whenua ‐ Landcare Research, Auckland, New Zealand
- Genomics Aotearoa, Dunedin, New Zealand
- School of Biological Sciences, The University of Auckland, Auckland, New Zealand
| | - David Chagné
- Genomics Aotearoa, Dunedin, New Zealand
- Plant & Food Research, Fitzherbert, Palmerston North, New Zealand
|
42
|
Lama M, Chanakya PP, Khamari B, Peketi ASK, Kumar P, Muddu GK, Nagaraja V, Bulagonda EP. Genomic analysis of a multidrug-resistant Brucella anthropi strain isolated from a 4-day-old neonatal sepsis patient. J Glob Antimicrob Resist 2021; 26:227-229. [PMID: 34273590 DOI: 10.1016/j.jgar.2021.06.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2021] [Revised: 06/07/2021] [Accepted: 06/30/2021] [Indexed: 10/20/2022] Open
Abstract
OBJECTIVES Brucella anthropi is a Gram-negative, aerobic, motile, oxidase-positive, non-fermentative Alphaproteobacteria belonging to the family Brucellaceae. It is most commonly found in soil but is an emerging, opportunistic, nosocomial human pathogen. The objective of this study was to understand the genome features of a drug-resistant B. anthropi (SOA01) isolated from a blood culture of a 4-day-old neonate and to determine its antimicrobial resistance and pathogenic potential. METHODS Hybrid genome assembly of B. anthropi strain SOA01 was generated using quality-trimmed short Illumina and long MinION reads. Identification and antimicrobial susceptibility profile were determined by MALDI-TOF, in silico ribosomal multilocus sequence typing (rMLST) and VITEK®2, respectively. PATRIC webserver and VFDB were used to identify antimicrobial resistance (AMR), virulence factor (VF) and transporter genes. RESULTS Multidrug-resistant B. anthropi strain SOA01 has a genome of 4 975 830 bp with a G+C content of 56.29%. Several AMR, VF and transporter genes were identified in the genome. Antimicrobial susceptibility testing revealed resistance to different classes of antibiotics in strain SOA01. CONCLUSION Brucella anthropi SOA01 is a multidrug-resistant strain. Several AMR and VF genes were identified in the genome, revealing the potential threat posed by this pathogen. The genome data generated in this study are likely to be useful in better understanding its AMR mechanisms, pathogenic potential and successful adaptation from its primary habitat of soil to the human system. Since it is often misidentified as Brucella melitensis or Brucella suis, genome characterisation and detailed understanding of its biology are crucial.
Affiliation(s)
- Manmath Lama
- AMR Laboratory, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India
| | - Pachi Pulusu Chanakya
- AMR Laboratory, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India
| | - Balaram Khamari
- AMR Laboratory, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India
| | - Arun Sai Kumar Peketi
- AMR Laboratory, Department of Biosciences, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India
| | - Prakash Kumar
- Department of Microbiology, Sri Sathya Sai Institute of Higher Medical Sciences, Prasanthigram, India
| | - Gopi Krishna Muddu
- Department of Paediatrics, Sri Sathya Sai General Hospital, Puttaparthi, India
| | - Valakunja Nagaraja
- Department of Microbiology and Cell Biology, Indian Institute of Science, Bengaluru, India; Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur, Bengaluru, India.
|
43
|
Liu HH, Wang J, Wu PH, Lu MYJ, Li JY, Shen YM, Tzeng MN, Kuo CH, Lin YH, Chang HX. Whole-Genome Sequence Resource of Calonectria ilicicola, the Casual Pathogen of Soybean Red Crown Rot. Mol Plant Microbe Interact 2021; 34:848-851. [PMID: 33683143 DOI: 10.1094/mpmi-11-20-0315-a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Calonectria ilicicola (anamorph: Cylindrocladium parasiticum) is a soilborne plant-pathogenic fungus with a broad host range, and it can cause red crown rot of soybean and Cylindrocladium black rot of peanut, which has become an emerging threat to crop production worldwide. Limited molecular studies have focused on Calonectria ilicicola and one of the possible difficulties is the lack of genomic resources. This study presents the first high quality and near-completed genome of C. ilicicola, using the Oxford Nanopore GridION sequencing platform. A total of 16 contigs were assembled and the genome of C. ilicicola isolate F018 was estimated to have 11 chromosomes. Currently, the C. ilicicola F018 genome represents the most contiguous assembly, which has the lowest contig number and the highest contig N50 among all Calonectria genome resources. Putative protein-coding sequences and secretory proteins were estimated to be 17,308 and 1,930 in the C. ilicicola F018 genome, respectively; and the prediction was close to other plant-pathogenic fungi, such as Fusarium species, within the Nectriaceae family. The availability of this high-quality genome resource is expected to facilitate research on fungal biology and genetics of C. ilicicola and to support advanced understanding of pathogen virulence and disease management. Copyright © 2021 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
Affiliation(s)
- Hsien-Hao Liu
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City 10617, Taiwan
| | - Jie Wang
- Center for Genomics-Enabled Plant Science, Michigan State University, East Lansing, MI, U.S.A
| | - Ping-Hu Wu
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City 10617, Taiwan
| | - Mei-Yeh Jade Lu
- NGS High Throughput Genomics Core, Biodiversity Research Center, Academia Sinica, Taipei City 11529, Taiwan
| | - Jeng-Yi Li
- NGS High Throughput Genomics Core, Biodiversity Research Center, Academia Sinica, Taipei City 11529, Taiwan
| | - Yuan-Min Shen
- Taichung District Agricultural Research and Extension Station, Council of Agriculture, Changhua County 51544, Taiwan
| | - Min-Nan Tzeng
- Kaohsiung District Agricultural Research and Extension Station, Council of Agriculture, Pingtung County 90846, Taiwan
| | - Chang-Hsin Kuo
- Department of Plant Medicine, National Chiayi University, Chiayi City 60004, Taiwan
| | - Ying-Hong Lin
- Department of Plant Medicine, National Pingtung University of Science and Technology, Pingtung 912301, Taiwan
- Plant Medicine Teaching Hospital, General Research Service Center, National Pingtung University of Science and Technology, Pingtung, Taiwan
| | - Hao-Xun Chang
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City 10617, Taiwan
|
44
|
Tayyrov A, Gillis-Germitsch N, Tritten L, Schnyder M. Genome sequence of the cardiopulmonary canid nematode Angiostrongylus vasorum reveals species-specific genes with potential involvement in coagulopathy. Genomics 2021; 113:2695-2701. [PMID: 34118383 DOI: 10.1016/j.ygeno.2021.06.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/02/2020] [Revised: 05/21/2021] [Accepted: 06/07/2021] [Indexed: 11/22/2022]
Abstract
Angiostrongylus vasorum is an emerging parasitic nematode of canids and causes respiratory distress, bleeding, and other signs in dogs. Despite its clinical importance, the molecular toolbox allowing the study of the parasite is incomplete. To address this gap, we have sequenced its nuclear genome using Oxford nanopore sequencing, polished with Illumina reads. The size of the final genome is 280 Mb comprising 468 contigs, with an N50 value of 1.68 Mb and a BUSCO score of 93.5%. Ninety-three percent of 13,766 predicted genes were assigned to putative functions. Three folate carriers were found exclusively in A. vasorum, with potential involvement in host coagulopathy. A screen for previously identified vaccine candidates, the aminopeptidase H11 and the somatic protein rHc23, revealed homologs in A. vasorum. The genome sequence will provide a foundation for the development of new tools against canine angiostrongylosis, supporting the identification of potential drug and vaccine targets.
|
45
|
Abstract
Plasmids can provide a selective advantage for microorganisms to survive and adapt to new environmental conditions. Plasmid-encoded traits, such as antimicrobial resistance (AMR) or virulence, impact the ecology and evolution of bacteria and can significantly influence the burden of infectious diseases. Insight about the identity and functions encoded on plasmids on the global scale are largely lacking. Here, we investigate the plasmidome of 24 samples (22 countries, 5 continents) from the global sewage surveillance project. We obtained 105-Gbp Oxford Nanopore and 167-Gbp Illumina NextSeq DNA sequences from plasmid DNA preparations and assembled 165,302 contigs (159,322 circular). Of these, 58,429 carried genes encoding for plasmid-related and 11,222 for virus/phage-related proteins. About 90% of the circular DNA elements did not have any similarity to known plasmids. Those that exhibited similarity had similarity to plasmids whose hosts were previously detected in these sewage samples (e.g., Acinetobacter, Escherichia, Moraxella, Enterobacter, Bacteroides, and Klebsiella). Some AMR classes were detected at a higher abundance in plasmidomes (e.g., macrolide-lincosamide-streptogramin B, macrolide, and quinolone) compared to the respective complex sewage samples. In addition to AMR genes, a range of functions were encoded on the candidate plasmids, including plasmid replication and maintenance, mobilization, and conjugation. In summary, we describe a laboratory and bioinformatics workflow for the recovery of plasmids and other potential extrachromosomal DNA elements from complex microbiomes. Moreover, the obtained data could provide further valuable insight into the ecology and evolution of microbiomes, knowledge about AMR transmission, and the discovery of novel functions. IMPORTANCE This is, to the best of our knowledge, the first study to investigate plasmidomes at a global scale using long read sequencing from complex untreated domestic sewage. 
Previous metagenomic surveys have detected AMR genes in a variety of environments, including sewage. However, it is unknown whether the AMR genes were present on the microbial chromosome or located on extrachromosomal elements, such as plasmids. Using our approach, we recovered a large number of plasmids, of which most appear novel. We identified distinct AMR genes that were preferentially located on plasmids, potentially contributing to their transmissibility. Overall, plasmids are of great importance for the biology of microorganisms in their natural environments (free-living and host-associated), as well as for molecular biology and biotechnology. Plasmidome collections may therefore be valuable resources for the discovery of fundamental biological mechanisms and novel functions useful in a variety of contexts.
|
46
|
Lama M, Chanakya PP, Khamari B, Peketi ASK, Kumar P, Nagaraja V, Bulagonda EP. Genomic and phylogenetic analysis of a multidrug-resistant Burkholderia contaminans strain isolated from a patient with ocular infection. J Glob Antimicrob Resist 2021; 25:323-325. [PMID: 33965629 DOI: 10.1016/j.jgar.2021.04.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/25/2021] [Revised: 03/25/2021] [Accepted: 04/08/2021] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVES The genus Burkholderia comprises rod-shaped, non-spore-forming, obligately aerobic Gram-negative bacteria that are found across diverse ecological niches. Burkholderia contaminans, an emerging pathogen associated with cystic fibrosis, is frequently isolated from contaminated medical devices in hospital settings. The aim of this study was to understand the genomic characteristics, antimicrobial resistance profile and virulence determinants of B. contaminans strain SBC01 isolated from the eye of a patient hit by a cow's tail. METHODS A hybrid sequence of isolate SBC01 was generated using Illumina HiSeq and Oxford Nanopore Technology platforms. Unicycler was used to assemble the hybrid genomic sequence. The draft genome was annotated using the NCBI Prokaryotic Genome Annotation Pipeline. Antimicrobial susceptibility testing was performed by VITEK®2. Antimicrobial resistance and virulence genes were identified using validated bioinformatics tools. RESULTS The assembled genome size is 8 841 722 bp with a G+C content of 66.33% distributed in 19 contigs. Strain SBC01 was found to possess several antimicrobial resistance and efflux pump genes. The isolate was susceptible to tetracyclines, meropenem and ceftazidime. Many genes encoding potential virulence factors were identified. CONCLUSION Burkholderia contaminans SBC01 belonging to sequence type 482 (ST482) is a multidrug-resistant strain containing diverse antimicrobial resistance genes, revealing the risks associated with infections by new Burkholderia spp. The large G+C-rich genome has a myriad of virulence factors, highlighting its pathogenic potential. Thus, while providing insights into the antimicrobial resistance and virulence potential of this uncommon species, the present analysis will aid in understanding the evolution and speciation in the Burkholderia genus.
|
47
|
Halliwell JA, Baker D, Judge K, Quail MA, Oliver K, Betteridge E, Skelton J, Andrews PW, Barbaric I. Nanopore Sequencing Indicates That Tandem Amplification of Chromosome 20q11.21 in Human Pluripotent Stem Cells Is Driven by Break-Induced Replication. Stem Cells Dev 2021; 30:578-586. [PMID: 33757297 PMCID: PMC8165465 DOI: 10.1089/scd.2021.0013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/19/2022] Open
Abstract
Copy number variants (CNVs) are genomic rearrangements implicated in numerous congenital and acquired diseases, including cancer. The appearance of culture-acquired CNVs in human pluripotent stem cells (PSCs) has prompted concerns for their use in regenerative medicine. A particular problem in PSC is the frequent occurrence of CNVs in the q11.21 region of chromosome 20. However, the exact mechanism of origin of this amplicon remains elusive due to the difficulty in delineating its sequence and breakpoints. Here, we have addressed this problem using long-read Nanopore sequencing of two examples of this CNV, present as duplication and as triplication. In both cases, the CNVs were arranged in a head-to-tail orientation, with microhomology sequences flanking or overlapping the proximal and distal breakpoints. These breakpoint signatures point to a mechanism of microhomology-mediated break-induced replication in CNV formation, with surrounding Alu sequences likely contributing to the instability of this genomic region.
Affiliation(s)
- Jason A Halliwell
- Department of Biomedical Science, University of Sheffield, Sheffield, United Kingdom
| | - Duncan Baker
- Sheffield Diagnostic Genetic Services, Sheffield Children's Hospital, Sheffield, United Kingdom
| | - Kim Judge
- Department of Sequencing R & D, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - Michael A Quail
- Department of Sequencing R & D, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - Karen Oliver
- Department of Sequencing R & D, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - Emma Betteridge
- Department of Sequencing R & D, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - Jason Skelton
- Department of Sequencing R & D, Wellcome Sanger Institute, Hinxton, United Kingdom
| | - Peter W Andrews
- Department of Biomedical Science, University of Sheffield, Sheffield, United Kingdom
| | - Ivana Barbaric
- Department of Biomedical Science, University of Sheffield, Sheffield, United Kingdom
|
48
|
Martínez P, Robledo D, Taboada X, Blanco A, Moser M, Maroso F, Hermida M, Gómez-Tato A, Álvarez-Blázquez B, Cabaleiro S, Piferrer F, Bouza C, Lien S, Viñas AM. A genome-wide association study, supported by a new chromosome-level genome assembly, suggests sox2 as a main driver of the undifferentiatiated ZZ/ZW sex determination of turbot (Scophthalmus maximus). Genomics 2021; 113:1705-1718. [PMID: 33838278 DOI: 10.1016/j.ygeno.2021.04.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 01/19/2021] [Revised: 03/20/2021] [Accepted: 04/05/2021] [Indexed: 01/10/2023]
Abstract
BACKGROUND Understanding sex determination (SD) across taxa is a major challenge for evolutionary biology. The new genomic tools are paving the way to identify genomic features underlying SD in fish, a group frequently showing limited sex chromosome differentiation and high SD evolutionary turnover. Turbot (Scophthalmus maximus) is a commercially important flatfish with an undifferentiated ZW/ZZ SD system and remarkable sexual dimorphism. Here we describe a new long-read turbot genome assembly used to disentangle the genetic architecture of turbot SD by combining genomics and classical genetics approaches. RESULTS The new turbot genome assembly consists of 145 contigs (N50 = 22.9 Mb), 27 of them representing >95% of its estimated genome size. A genome wide association study (GWAS) identified a ~ 6.8 Mb region on chromosome 12 associated with sex in 69.4% of the 36 families analyzed. The highest associated markers flanked sox2, the only gene in the region showing differential expression between sexes before gonad differentiation. A single SNP showed consistent differences between Z and W chromosomes. The analysis of a broad sample of families suggested the presence of additional genetic and/or environmental factors on turbot SD. CONCLUSIONS The new chromosome-level turbot genome assembly, one of the most contiguous fish assemblies to date, facilitated the identification of sox2 as a consistent candidate gene putatively driving SD in this species. This chromosome SD system barely showed any signs of differentiation, and other factors beyond the main QTL seem to control SD in a certain proportion of families.
Affiliation(s)
- Paulino Martínez
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Veterinary, Universidade de Santiago de Compostela, Campus de Lugo, 27002 Lugo, Spain.
| | - Diego Robledo
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK.
| | - Xoana Taboada
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Biology, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Andrés Blanco
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Veterinary, Universidade de Santiago de Compostela, Campus de Lugo, 27002 Lugo, Spain.
| | - Michel Moser
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway.
| | - Francesco Maroso
- Department of Life Science and Biotechnology, University of Ferrara, 44121 Ferrara, Italy
| | - Miguel Hermida
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Veterinary, Universidade de Santiago de Compostela, Campus de Lugo, 27002 Lugo, Spain.
| | - Antonio Gómez-Tato
- Department of Mathematics, Faculty of Mathematics, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain.
| | - Blanca Álvarez-Blázquez
- Instituto Español de Oceanografía (IEO), Centro Oceanográfico de Vigo, Cabo Estay-Canido, 36280 Vigo, Spain.
| | - Santiago Cabaleiro
- Cluster de Acuicultura de Galicia (Punta do Couso), Aguiño-Ribeira, 15695 A Coruña, Spain.
| | - Francesc Piferrer
- Institut de Ciències del Mar, Consejo Superior de Investigaciones Científicas (CSIC), 08003 Barcelona, Spain.
| | - Carmen Bouza
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Veterinary, Universidade de Santiago de Compostela, Campus de Lugo, 27002 Lugo, Spain.
| | - Sigbjørn Lien
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway.
| | - Ana M Viñas
- Department of Zoology, Genetics and Physical Anthropology, Faculty of Biology, Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain.
|
49
|
Kopec PM, Mikolajczyk K, Jajor E, Perek A, Nowakowska J, Obermeier C, Chawla HS, Korbas M, Bartkowiak-Broda I, Karlowski WM. Local Duplication of TIR-NBS-LRR Gene Marks Clubroot Resistance in Brassica napus cv. Tosca. Front Plant Sci 2021; 12:639631. [PMID: 33936130 PMCID: PMC8082685 DOI: 10.3389/fpls.2021.639631] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/09/2020] [Accepted: 03/09/2021] [Indexed: 06/12/2023]
Abstract
Clubroot, caused by Plasmodiophora brassicae infection, is a disease of growing importance in cruciferous crops, including oilseed rape (Brassica napus). The affected plants exhibit prominent galling of the roots that impairs their capacity for water and nutrient uptake, which leads to growth retardation, wilting, premature ripening, or death. Due to the scarcity of effective means of protection against the pathogen, breeding of resistant varieties remains a crucial component of disease control measures. The key aspect of the breeding process is the identification of genetic factors associated with variable response to the pathogen exposure. Although numerous clubroot resistance loci have been described in Brassica crops, continuous updates on the sources of resistance are necessary. Many of the resistance genes are pathotype-specific, moreover, resistance breakdowns have been reported. In this study, we characterize the clubroot resistance locus in the winter oilseed rape cultivar "Tosca." In a series of greenhouse experiments, we evaluate the disease severity of P. brassicae-challenged "Tosca"-derived population of doubled haploids, which we genotype with Brassica 60 K array and a selection of SSR/SCAR markers. We then construct a genetic map and narrow down the resistance locus to the 0.4 cM fragment on the A03 chromosome, corresponding to the region previously described as Crr3. Using Oxford Nanopore long-read genome resequencing and RNA-seq we review the composition of the locus and describe a duplication of TIR-NBS-LRR gene. Further, we explore the transcriptomic differences of the local genes between the clubroot resistant and susceptible, inoculated and control DH lines. We conclude that the duplicated TNL gene is a promising candidate for the resistance factor. This study provides valuable resources for clubroot resistance breeding programs and lays a foundation for further functional studies on clubroot resistance.
Affiliation(s)
- Piotr M. Kopec
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University Poznan, Poznan, Poland
| | - Katarzyna Mikolajczyk
- Department of Genetics and Breeding of Oilseed Crops, Plant Breeding and Acclimatization Institute-National Research Institute, Poznan, Poland
| | - Ewa Jajor
- Institute of Plant Protection - National Research Institute, Poznan, Poland
| | - Agnieszka Perek
- Institute of Plant Protection - National Research Institute, Poznan, Poland
| | - Joanna Nowakowska
- Department of Genetics and Breeding of Oilseed Crops, Plant Breeding and Acclimatization Institute-National Research Institute, Poznan, Poland
| | - Christian Obermeier
- Department of Plant Breeding, Justus-Liebig-Universitaet Giessen, Giessen, Germany
| | - Harmeet Singh Chawla
- Department of Plant Breeding, Justus-Liebig-Universitaet Giessen, Giessen, Germany
- Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Marek Korbas
- Institute of Plant Protection - National Research Institute, Poznan, Poland
| | - Iwona Bartkowiak-Broda
- Department of Genetics and Breeding of Oilseed Crops, Plant Breeding and Acclimatization Institute-National Research Institute, Poznan, Poland
| | - Wojciech M. Karlowski
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University Poznan, Poznan, Poland
|
50
|
Harkess A, McLoughlin F, Bilkey N, Elliott K, Emenecker R, Mattoon E, Miller K, Czymmek K, Vierstra RD, Meyers BC, Michael TP. Improved Spirodela polyrhiza genome and proteomic analyses reveal a conserved chromosomal structure with high abundance of chloroplastic proteins favoring energy production. J Exp Bot 2021; 72:2491-2500. [PMID: 33454741 DOI: 10.1093/jxb/erab006] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/08/2020] [Accepted: 01/13/2021] [Indexed: 05/11/2023]
Abstract
Duckweeds are a monophyletic group of rapidly reproducing aquatic monocots in the Lemnaceae family. Given their clonal, exponentially fast reproduction, a key question is whether genome structure is conserved across the species in the absence of meiotic recombination. Here, we studied the genome and proteome of Spirodela polyrhiza, or greater duckweed, which has the largest body plan yet the smallest genome size in the family (1C=150 Mb). Using Oxford Nanopore sequencing combined with Hi-C scaffolding, we generated a highly contiguous, chromosome-scale assembly of S. polyrhiza line Sp7498 (Sp7498_HiC). Both the Sp7498_HiC and Sp9509 genome assemblies reveal large chromosomal misorientations relative to a recent PacBio assembly of Sp7498, highlighting the need for orthogonal long-range scaffolding techniques such as Hi-C and BioNano optical mapping. Shotgun proteomics of Sp7498 verified the expression of ~2250 proteins and revealed a high abundance of proteins involved in photosynthesis and carbohydrate metabolism among other functions. In addition, a strong increase in chloroplast proteins was observed that correlated to chloroplast density. This Sp7498_HiC genome was generated cheaply and quickly with a single Oxford Nanopore MinION flow cell and one Hi-C library in a classroom setting. Combining these data with a mass spectrometry-generated proteome illustrates the utility of duckweed as a model for genomics- and proteomics-based education.
Affiliation(s)
- Alex Harkess
- Donald Danforth Plant Science Center, St Louis, MO, USA
| | | | - Natasha Bilkey
- Department of Biology, Washington University, St Louis, MO, USA
| | - Kiona Elliott
- Department of Biology, Washington University, St Louis, MO, USA
| | - Ryan Emenecker
- Department of Biology, Washington University, St Louis, MO, USA
| | - Erin Mattoon
- Department of Biology, Washington University, St Louis, MO, USA
| | - Kari Miller
- Department of Biology, Washington University, St Louis, MO, USA
| | - Kirk Czymmek
- Donald Danforth Plant Science Center, St Louis, MO, USA
| | | | - Blake C Meyers
- Donald Danforth Plant Science Center, St Louis, MO, USA
- Division of Plant Sciences, University of Missouri, Columbia, MO, USA
| | - Todd P Michael
- Department of Informatics, J. Craig Venter Institute (JCVI), San Diego, CA, USA
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
|