1
Safar HA, Alatar F, Mustafa AS. Three Rounds of Read Correction Significantly Improve Eukaryotic Protein Detection in ONT Reads. Microorganisms 2024; 12:247. [PMID: 38399651 PMCID: PMC10893331 DOI: 10.3390/microorganisms12020247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/25/2024] Open
Abstract
BACKGROUND: Eukaryotes' whole-genome sequencing is crucial for species identification, gene detection, and protein annotation. Oxford Nanopore Technology (ONT) is an affordable and rapid platform for sequencing eukaryotes; however, the relatively higher error rates require computational and bioinformatic efforts to produce more accurate genome assemblies. Here, we evaluated the effect of read correction tools on eukaryote genome completeness, gene detection and protein annotation. METHODS: Reads generated by ONT of four eukaryotes, C. albicans, C. gattii, S. cerevisiae, and P. falciparum, were assembled using minimap2 and underwent three rounds of read correction using flye, medaka and racon. The generated consensus FASTA files were compared for total length (bp), genome completeness, gene detection, and protein annotation by QUAST, BUSCO, BRAKER1 and InterProScan, respectively. RESULTS: Genome completeness was dependent on the assembly method rather than on the read correction tool; however, medaka performed better than flye and racon. Racon performed significantly better than flye and medaka in gene detection, while both racon and medaka performed significantly better than flye in protein annotation. CONCLUSION: We show that three rounds of read correction significantly affect gene detection and protein annotation, which are dependent on assembly quality in preference to assembly completeness.
Affiliation(s)
- Hussain A. Safar
- OMICS Research Unit, Health Science Centre, Kuwait University, Kuwait City 13110, Kuwait;
- Fatemah Alatar
- Serology and Molecular Microbiology Reference Laboratory, Mubarak Al-Kabeer Hospital, Ministry of Health, Kuwait City 13110, Kuwait;
- Abu Salim Mustafa
- Department of Microbiology, Faculty of Medicine, Kuwait University, Kuwait City 13110, Kuwait
2
Drown MK, DeLiberto AN, Flack N, Doyle M, Westover AG, Proefrock JC, Heilshorn S, D’Alessandro E, Crawford DL, Faulk C, Oleksiak MF. Sequencing Bait: Nuclear and Mitogenome Assembly of an Abundant Coastal Tropical and Subtropical Fish, Atherinomorus stipes. Genome Biol Evol 2022; 14:6648392. [PMID: 35866575 PMCID: PMC9348626 DOI: 10.1093/gbe/evac111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/13/2022] [Indexed: 02/01/2023] Open
Abstract
Genetic data from nonmodel species can inform ecology and physiology, giving insight into a species' distribution and abundance as well as its responses to changing environments, all of which are important for species conservation and management. Moreover, reduced sequencing costs and improved long-read sequencing technology allow researchers to readily generate genomic resources for nonmodel species. Here, we apply Oxford Nanopore long-read sequencing and low-coverage (∼1x) whole genome short-read sequencing technology (Illumina) to assemble a genome and examine population genetics of an abundant tropical and subtropical fish, the hardhead silverside (Atherinomorus stipes). These fish are found in shallow coastal waters and are frequently included in ecological models because they serve as abundant prey for commercially and ecologically important species. Despite their importance in subtropical and tropical ecosystems, little is known about their population connectivity and genetic diversity. Our A. stipes genome assembly is about 1.2 Gb with comparable repetitive element content (∼47%), number of protein duplication events, and DNA methylation patterns to other teleost fish species. Among five sampled populations spanning 43 km of South Florida and the Florida Keys, we find little population structure, suggesting high population connectivity.
Affiliation(s)
- Nicole Flack
- Department of Veterinary and Biomedical Sciences, University of Minnesota, Minnesota, USA
- Meghan Doyle
- The Rosenstiel School, University of Miami, Florida, USA
3
Shaikhutdinov N, Gusev O. Chironomid midges (Diptera) provide insights into genome evolution in extreme environments. Curr Opin Insect Sci 2022; 49:101-107. [PMID: 34990872 DOI: 10.1016/j.cois.2021.12.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/11/2021] [Revised: 12/11/2021] [Accepted: 12/29/2021] [Indexed: 06/14/2023]
Abstract
Extremophiles often undergo marked changes in genomic architecture, likely as a result of adaptation to the harsh environments they inhabit. These changes can involve gene duplications that affect subsequent gene evolution and the regulation of gene expression. Excellent examples of this are provided by two non-biting chironomid midges (Diptera, Chironomidae): Polypedilum vanderplanki, which in its larval form can withstand almost complete water loss, and Belgica antarctica, which exhibits freeze tolerance. This review presents recent studies on the molecular adaptations and evolutionary features of these and other extremophile chironomid genomes, as well as biotechnological applications of a cell line derived from P. vanderplanki that can survive air-drying. We highlight the importance of genomics in identifying molecular pathways and genomic modifications associated with adaptation to extreme environmental conditions.
Affiliation(s)
- Nurislam Shaikhutdinov
- Extreme Biology Laboratory, Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, 420012, Russia; Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
- Oleg Gusev
- Extreme Biology Laboratory, Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, 420012, Russia; Graduate School of Medicine, Juntendo University, Tokyo, 113-8421, Japan; RIKEN Center for Integrative Medical Sciences, RIKEN, Yokohama, 230-004, Japan.
4
Lamb HJ, Hayes BJ, Randhawa IAS, Nguyen LT, Ross EM. Genomic prediction using low-coverage portable Nanopore sequencing. PLoS One 2021; 16:e0261274. [PMID: 34910782 PMCID: PMC8673642 DOI: 10.1371/journal.pone.0261274] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/03/2021] [Accepted: 11/26/2021] [Indexed: 11/18/2022] Open
Abstract
Most traits in livestock, crops and humans are polygenic, that is, a large number of loci contribute to genetic variation. Effects at these loci lie along a continuum ranging from common low-effect to rare high-effect variants that cumulatively contribute to the overall phenotype. Statistical methods to calculate the effect of these loci have been developed and can be used to predict phenotypes in new individuals. In agriculture, these methods are used to select superior individuals using genomic breeding values; in humans, these methods are used to quantitatively measure an individual’s disease risk, termed polygenic risk scores. Both fields typically use SNP array genotypes for the analysis. Recently, genotyping-by-sequencing has become popular, due to lower cost and greater genome coverage (including structural variants). Oxford Nanopore Technologies’ (ONT) portable sequencers have the potential to combine the benefits of genotyping-by-sequencing with portability and decreased turn-around time. This introduces the potential for in-house clinical genetic disease risk screening in humans or calculating genomic breeding values on-farm in agriculture. Here we demonstrate the potential of the latter by calculating genomic breeding values for four traits in cattle using low-coverage ONT sequence data and comparing these breeding values to breeding values calculated from SNP arrays. At sequencing coverages between 2x and 4x, the correlation between ONT breeding values and SNP array-based breeding values was > 0.92 when imputation was used and > 0.88 when no imputation was used. With an average sequencing coverage of 0.5x, the correlation between the two methods was between 0.85 and 0.92 using imputation, depending on the trait. This suggests that ONT sequencing has potential for in-clinic or on-farm genomic prediction; however, further work is needed to validate these findings in a larger population.
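As an illustration of the comparison described in this abstract, the agreement between two sets of breeding values can be summarised with a correlation coefficient. The sketch below assumes a Pearson correlation, which the abstract does not specify; the function name `pearson` and the toy values are hypothetical and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical breeding values for five animals (illustration only).
ont_values = [0.12, -0.40, 0.85, 0.30, -0.10]  # from low-coverage ONT data
snp_values = [0.10, -0.35, 0.90, 0.20, -0.05]  # from SNP array genotypes

print(round(pearson(ont_values, snp_values), 3))
```

A value close to 1 would indicate that the two genotyping approaches rank animals similarly for the trait in question.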
Affiliation(s)
- Harrison J. Lamb
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Ben J. Hayes
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Imtiaz A. S. Randhawa
- School of Veterinary Science, The University of Queensland, Brisbane, QLD, Australia
- Loan T. Nguyen
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Elizabeth M. Ross
- Centre for Animal Science, Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
5
Maggiori C, Raymond-Bouchard I, Brennan L, Touchette D, Whyte L. MinION sequencing from sea ice cryoconites leads to de novo genome reconstruction from metagenomes. Sci Rep 2021; 11:21041. [PMID: 34702846 PMCID: PMC8548342 DOI: 10.1038/s41598-021-00026-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/29/2021] [Accepted: 09/30/2021] [Indexed: 01/04/2023] Open
Abstract
Genome reconstruction from metagenomes enables detailed study of individual community members, their metabolisms, and their survival strategies. Obtaining high-quality metagenome-assembled genomes (MAGs) is particularly valuable in extreme environments like sea ice cryoconites, where the native consortia are recalcitrant to culture and are strong astrobiology analogues. We evaluated three separate approaches for MAG generation from Allen Bay, Nunavut sea ice cryoconites (HiSeq-only, MinION-only, and hybrid HiSeq + MinION), where field MinION sequencing yielded a reliable metagenome. The hybrid assembly produced longer contigs, more coding sequences, and more total MAGs, revealing a microbial community dominated by Bacteroidetes. The hybrid MAGs also had the highest completeness, lowest contamination, and highest N50. A putatively novel species of Octadecabacter is among the hybrid MAGs produced, containing the genus's only known instances of genomic potential for nitrate reduction, denitrification, sulfate reduction, and fermentation. This study shows that the inclusion of MinION reads in traditional short-read datasets leads to higher-quality metagenomes and MAGs for more accurate descriptions of novel microorganisms in this extreme, transient habitat, and it has produced the first hybrid MAGs from an extreme environment.
Affiliation(s)
- Catherine Maggiori
- Department of Natural Resource Sciences, Faculty of Agricultural and Environmental Sciences, McGill University, 21 111 Lakeshore Road, Macdonald Stewart Building, Room MS3-053, Ste. Anne-de-Bellevue, Quebec, H9X 3V9, Canada.
- Isabelle Raymond-Bouchard
- Department of Natural Resource Sciences, Faculty of Agricultural and Environmental Sciences, McGill University, 21 111 Lakeshore Road, Macdonald Stewart Building, Room MS3-053, Ste. Anne-de-Bellevue, Quebec, H9X 3V9, Canada
- Laura Brennan
- Department of Natural Resource Sciences, Faculty of Agricultural and Environmental Sciences, McGill University, 21 111 Lakeshore Road, Macdonald Stewart Building, Room MS3-053, Ste. Anne-de-Bellevue, Quebec, H9X 3V9, Canada
- David Touchette
- Department of Natural Resource Sciences, Faculty of Agricultural and Environmental Sciences, McGill University, 21 111 Lakeshore Road, Macdonald Stewart Building, Room MS3-053, Ste. Anne-de-Bellevue, Quebec, H9X 3V9, Canada
- Lyle Whyte
- Department of Natural Resource Sciences, Faculty of Agricultural and Environmental Sciences, McGill University, 21 111 Lakeshore Road, Macdonald Stewart Building, Room MS3-053, Ste. Anne-de-Bellevue, Quebec, H9X 3V9, Canada
6
Tang M, He S, Gong X, Lü P, Taha RH, Chen K. High-Quality de novo Chromosome-Level Genome Assembly of a Single Bombyx mori With BmNPV Resistance by a Combination of PacBio Long-Read Sequencing, Illumina Short-Read Sequencing, and Hi-C Sequencing. Front Genet 2021; 12:718266. [PMID: 34603381 PMCID: PMC8481875 DOI: 10.3389/fgene.2021.718266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Accepted: 08/05/2021] [Indexed: 12/17/2022] Open
Abstract
The reference genomes of Bombyx mori (B. mori), Silkworm Knowledge-based database (SilkDB) and SilkBase, have served as the gold standard for nearly two decades. Their use has fundamentally shaped model organisms and accelerated relevant studies on lepidoptera. However, the current reference genomes of B. mori do not accurately represent the full set of genes for any single strain. As new genome-wide sequencing technologies have emerged and the cost of high-throughput sequencing technology has fallen, it is now possible for standard laboratories to perform full-genome assembly for specific strains. Here we present a high-quality de novo chromosome-level genome assembly of a single B. mori with nuclear polyhedrosis virus (BmNPV) resistance through the integration of PacBio long-read sequencing, Illumina short-read sequencing, and Hi-C sequencing. In addition, regular bioinformatics analyses, such as gene family, phylogenetic, and divergence analyses, were performed. The sample was from our unique B. mori species (NB), which has strong inborn resistance to BmNPV. Our genome assembly showed good collinearity with SilkDB and SilkBase and particular regions. To the best of our knowledge, this is the first genome assembly with BmNPV resistance, which should be a more accurate insect model for resistance studies.
Affiliation(s)
- Min Tang
- School of Life Sciences, Jiangsu University, Zhenjiang, China
- Suqun He
- School of Life Sciences, Jiangsu University, Zhenjiang, China
- Xun Gong
- Institute of Clinical Pharmacology, Anhui Medical University, Hefei, China; Department of Medical Rheumatology, Columbia University, New York, NY, United States
- Peng Lü
- School of Life Sciences, Jiangsu University, Zhenjiang, China
- Rehab H Taha
- Department of Sericulture, Plant Protection Research Institute, Agricultural Research Center, Giza, Egypt
- Keping Chen
- School of Life Sciences, Jiangsu University, Zhenjiang, China
7
Genome sequence of the cardiopulmonary canid nematode Angiostrongylus vasorum reveals species-specific genes with potential involvement in coagulopathy. Genomics 2021; 113:2695-2701. [PMID: 34118383 DOI: 10.1016/j.ygeno.2021.06.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/02/2020] [Revised: 05/21/2021] [Accepted: 06/07/2021] [Indexed: 11/22/2022]
Abstract
Angiostrongylus vasorum is an emerging parasitic nematode of canids and causes respiratory distress, bleeding, and other signs in dogs. Despite its clinical importance, the molecular toolbox allowing the study of the parasite is incomplete. To address this gap, we have sequenced its nuclear genome using Oxford Nanopore sequencing and polished the assembly with Illumina reads. The size of the final genome is 280 Mb comprising 468 contigs, with an N50 value of 1.68 Mb and a BUSCO score of 93.5%. Ninety-three percent of 13,766 predicted genes were assigned to putative functions. Three folate carriers were found exclusively in A. vasorum, with potential involvement in host coagulopathy. A screen for previously identified vaccine candidates, the aminopeptidase H11 and the somatic protein rHc23, revealed homologs in A. vasorum. The genome sequence will provide a foundation for the development of new tools against canine angiostrongylosis, supporting the identification of potential drug and vaccine targets.
8
Genetic Evaluation of Nosocomial Candida auris Transmission. J Clin Microbiol 2021; 59:JCM.02252-20. [PMID: 33472901 DOI: 10.1128/jcm.02252-20] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/27/2020] [Accepted: 12/28/2020] [Indexed: 11/20/2022] Open
Abstract
Whole-genome sequences of Candida auris isolates from nosocomial and nonnosocomial infections were compared. The average numbers of single nucleotide variations were different between the two groups. The small amount of genetic variability between intra- or interhost isolates suggests that recovery of all colonizing or infecting genomes for comparison is required in outbreak investigations.
9
Liu H, Wu S, Li A, Ruan J. SMARTdenovo: a de novo assembler using long noisy reads. GigaByte 2021; 2021:gigabyte15. [PMID: 36824332 PMCID: PMC9632051 DOI: 10.46471/gigabyte.15] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 11/19/2020] [Accepted: 03/05/2021] [Indexed: 12/11/2022] Open
Abstract
Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. It has also been widely used to study structural variants, phase haplotypes and more. Here, we introduce the assembler SMARTdenovo, a single-molecule sequencing (SMS) assembler that follows the overlap-layout-consensus (OLC) paradigm. SMARTdenovo (RRID: SCR_017622) was designed to be a rapid assembler, which, unlike contemporaneous SMS assemblers, does not require highly accurate raw reads for error correction. It has performed well in the evaluation of congeneric assemblers and has been successfully used for various assembly projects. It is compatible with Canu for assembling high-quality genomes, and several of the assembly strategies in this program have been incorporated into subsequent popular assemblers. The assembler has been in use since 2015; here we provide information on the development of SMARTdenovo and how to implement its algorithms in current projects.
Affiliation(s)
- Hailin Liu
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Shigang Wu
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Alun Li
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Jue Ruan
- Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China. Corresponding author.
10
Xie Y, Zhong Y, Chang J, Kwan HS. Chromosome-level de novo assembly of Coprinopsis cinerea A43mut B43mut pab1-1 #326 and genetic variant identification of mutants using Nanopore MinION sequencing. Fungal Genet Biol 2020; 146:103485. [PMID: 33253902 DOI: 10.1016/j.fgb.2020.103485] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/13/2020] [Revised: 10/22/2020] [Accepted: 11/13/2020] [Indexed: 11/26/2022]
Abstract
The homokaryotic Coprinopsis cinerea strain A43mut B43mut pab1-1 #326 is a widely used experimental model for developmental studies in mushroom-forming fungi. It can grow on defined artificial media and complete the whole life cycle within two weeks. The mutations in mating type factors A and B result in the special feature of clamp formation and fruiting without mating. This feature allows investigations and manipulations with a homokaryotic genetic background. The current genome assembly of strain #326 was based on short-read sequencing data and was highly fragmented, leading to bias in gene annotation and downstream analyses. Here, we report a chromosome-level genome assembly of strain #326. Oxford Nanopore Technology (ONT) MinION sequencing was used to obtain long reads. Illumina short reads were used to polish the sequences. A combined assembly yielded 13 chromosomes and a mitochondrial genome as individual scaffolds. The assembly has 15,250 annotated genes with high synteny with the C. cinerea strain Okayama-7 #130. This assembly greatly improves contiguity and annotation. It is a suitable reference for further genomic studies, especially for genetic, genomic and transcriptomic analyses using ONT long reads. Single nucleotide variants and structural variants in six mutagenized and cisplatin-screened mutants could be identified and validated. A 66 bp deletion in Ras GTPase-activating protein (RasGAP) was found in all mutants. To make better use of the ONT sequencing platform, we modified a high-molecular-weight genomic DNA isolation protocol based on magnetic beads for filamentous fungi. This study showed the use of MinION to construct a fungal reference genome and to perform downstream studies in an individual laboratory. An experimental workflow was proposed, from DNA isolation and whole genome sequencing to genome assembly and variant calling. Our results provide solutions and parameters for fungal genomic analysis on the MinION sequencing platform.
Affiliation(s)
- Yichun Xie
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region
- Yiyi Zhong
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region
- Jinhui Chang
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region; The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, China
- Hoi Shan Kwan
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region.
11
Evaluation of assembly methods combining long-reads and short-reads to obtain Paenibacillus sp. R4 high-quality complete genome. 3 Biotech 2020; 10:480. [PMID: 33094089 DOI: 10.1007/s13205-020-02474-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2020] [Accepted: 10/07/2020] [Indexed: 10/23/2022] Open
Abstract
We sequenced Paenibacillus sp. R4 using Oxford Nanopore Technology (ONT), single molecule real-time (SMRT) technology from Pacific Biosciences (PacBio), and Illumina technologies to investigate the application of nanopore reads in de novo sequencing of bacterial genomes. We compared the genome sequences of assemblies built from nanopore and PacBio reads, focusing on differences in the prediction of coding sequences. The results indicated that, for more accurate predictions of open reading frames, contigs in the assemblies using only PacBio reads also needed to be corrected using short reads with high-quality bases, and that repeat regions in genomes did not significantly affect the increase in mispredicted coding sequences via genome polishing. In assemblies using only nanopore reads, genome polishing was essential, but many repeat regions in genomes might increase the number of mispredicted coding sequences via genome polishing. The hybrid assembly combining the long reads and short reads represents the best result for coding sequence predictions in genome assemblies using nanopore reads.
12
Kraft F, Kurth I. Long-read sequencing to understand genome biology and cell function. Int J Biochem Cell Biol 2020; 126:105799. [PMID: 32629027 DOI: 10.1016/j.biocel.2020.105799] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/02/2020] [Revised: 06/29/2020] [Accepted: 07/02/2020] [Indexed: 02/08/2023]
Abstract
Determining the sequence of DNA and RNA molecules has a huge impact on the understanding of cell biology and function. Recent advancements in next-generation short-read sequencing (NGS) technologies, drops in cost and a resolution down to the single-cell level shaped our current view on genome structure and function. Third-generation sequencing (TGS) methods further complete the knowledge about these processes based on long reads and the ability to analyze DNA or RNA at single molecule level. Long-read sequencing provides additional possibilities to study genome architecture and the composition of highly complex regions and to determine epigenetic modifications of nucleotide bases at a genome-wide level. We discuss the principles and advancements of long-read sequencing and its applications in genome biology.
Affiliation(s)
- Florian Kraft
- Institute of Human Genetics, Medical Faculty, RWTH Aachen University, Aachen, Germany.
- Ingo Kurth
- Institute of Human Genetics, Medical Faculty, RWTH Aachen University, Aachen, Germany.
13
Stadler M, Lambert C, Wibberg D, Kalinowski J, Cox RJ, Kolařík M, Kuhnert E. Intragenomic polymorphisms in the ITS region of high-quality genomes of the Hypoxylaceae (Xylariales, Ascomycota). Mycol Prog 2020. [DOI: 10.1007/s11557-019-01552-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Indexed: 01/31/2023]
Abstract
The internal transcribed spacer (ITS) region of the ribosomal DNA (rDNA) has been established (and is generally accepted) as a primary “universal” genetic barcode for fungi for many years, but its actual value for taxonomy has been heavily disputed among mycologists. Recently, twelve draft genome sequences, mainly derived from type species of the family Hypoxylaceae (Xylariales, Ascomycota) and the ex-epitype strain of Xylaria hypoxylon, have become available during the course of a large phylogenomic study that was primarily aimed at establishing a correlation between the existing multi-gene-based genealogy and a genome-based phylogeny, and at the discovery of novel biosynthetic gene clusters encoding secondary metabolites. The genome sequences were obtained using combinations of Illumina and Oxford Nanopore Technologies or PacBio sequencing, respectively, and resulted in high-quality sequences with an average N50 of 3.2 Mbp. While the main results will be published concurrently in a separate paper, the current case study was dedicated to the detection of ITS nrDNA copies in the genomes, in an attempt to explain certain incongruities and apparent mismatches between phenotypes and genotypes that had been observed during previous polyphasic studies. The results revealed that all of the studied strains had at least three copies of rDNA in their genomes, with Hypoxylon fragiforme having at least 19 copies of the ITS region, followed by Xylaria hypoxylon with at least 13 copies. Several of the genomes contained 2–3 copies that were nearly identical, but in some cases drastic differences, below 97% identity, were observed. In one case, ascribable to the presence of a pseudogene, the deviations of the ITS sequences from the same genome resulted in only ca. 90% of overall homology. These results are discussed in the scope of the current trends to use ITS data for species recognition and segregation of fungi. We propose that additional genomes should be checked for such ITS polymorphisms to reassess the validity of this non-coding part of the fungal DNA for molecular identification.
14
Fauver JR, Martin J, Weil GJ, Mitreva M, Fischer PU. De novo Assembly of the Brugia malayi Genome Using Long Reads from a Single MinION Flowcell. Sci Rep 2019; 9:19521. [PMID: 31863009 PMCID: PMC6925183 DOI: 10.1038/s41598-019-55908-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/21/2019] [Accepted: 11/28/2019] [Indexed: 11/15/2022] Open
Abstract
Filarial nematode infections cause a substantial global disease burden. Genomic studies of filarial worms can improve our understanding of their biology and epidemiology. However, genomic information from field isolates is limited and available reference genomes are often discontinuous. Single molecule sequencing technologies can reduce the cost of genome sequencing, and long reads produced from these devices can improve the contiguity and completeness of genome assemblies. In addition, these new technologies can make generation and analysis of large numbers of field isolates feasible. In this study, we assessed the performance of the Oxford Nanopore Technologies MinION for sequencing and assembling the genome of Brugia malayi, a human parasite widely used in filariasis research. Using data from a single MinION flowcell, a 90.3 Mb nuclear genome was assembled into 202 contigs with an N50 of 2.4 Mb. This assembly covered 96.9% of the well-defined B. malayi reference genome with 99.2% identity. The complete mitochondrial genome was obtained with individual reads, and the nearly complete genome of the endosymbiotic bacterium Wolbachia was assembled alongside the nuclear genome. Long-read data from the MinION produced an assembly that approached the quality of a well-established reference genome using comparatively fewer resources.
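The N50 reported in this abstract is a standard contiguity statistic: the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the toy contig lengths are hypothetical and are not taken from the B. malayi assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given the lengths of its contigs."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in arbitrary units (illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 300; 80 + 70 = 150 reaches half, so this prints 70
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstract also reports reference coverage and identity.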
Affiliation(s)
- Joseph R Fauver
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States.
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, United States.
- John Martin
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, United States
- Gary J Weil
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
- Makedonka Mitreva
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
- McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, United States
- Peter U Fischer
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States