51
|
Malmberg MM, Spangenberg GC, Daetwyler HD, Cogan NOI. Assessment of low-coverage nanopore long read sequencing for SNP genotyping in doubled haploid canola (Brassica napus L.). Sci Rep 2019; 9:8688. [PMID: 31213642 PMCID: PMC6582154 DOI: 10.1038/s41598-019-45131-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/31/2019] [Accepted: 05/28/2019] [Indexed: 11/16/2022] Open
Abstract
Despite the high accuracy of short read sequencing (SRS), there are still issues with attaining accurate single nucleotide polymorphism (SNP) genotypes at low sequencing coverage and in highly duplicated genomes due to misalignment. Long read sequencing (LRS) systems, including the Oxford Nanopore Technologies (ONT) MinION, have become popular options for de novo genome assembly and structural variant characterisation. The current high error rate often requires substantial post-sequencing correction and would appear to prevent the adoption of this system for SNP genotyping, but nanopore sequencing errors are largely random. The use of low-coverage ONT MinION sequencing for genotyping of pre-validated SNP loci was examined in 9 canola doubled haploids. The MinION genotypes were compared to the Illumina sequences to determine the extent and nature of genotype discrepancies between the two systems. The significant increase in read length improved alignment to the genome, and the absence of classical SRS biases resulted in a more even representation of the genome. Sequencing errors are present, primarily in the form of heterozygous genotypes, which can be removed in completely homozygous backgrounds but require more advanced bioinformatics in heterozygous genomes. Developments in this technology are promising for routine genotyping in the future.
Affiliation(s)
- M M Malmberg
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- G C Spangenberg
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- H D Daetwyler
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
- N O I Cogan
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
|
52
|
The Impact of Centromeres on Spatial Genome Architecture. Trends Genet 2019; 35:565-578. [PMID: 31200946 DOI: 10.1016/j.tig.2019.05.003] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 03/20/2019] [Revised: 05/06/2019] [Accepted: 05/09/2019] [Indexed: 01/01/2023]
Abstract
The development of new technologies and experimental techniques is enabling researchers to see what was once unable to be seen. For example, the centromere was first seen as the mediator between spindle fiber and chromosome during mitosis and meiosis. Although this continues to be its most prominent role, we now know that the centromere functions beyond cellular division with important roles in genome organization and chromatin regulation. Here we aim to share the structures and functions of centromeres in various organisms beginning with the diversity of their DNA sequence anatomies. We zoom out to describe their position in the nucleus and ultimately detail the different ways they contribute to genome organization and regulation at the spatial level.
|
53
|
Yoshimura J, Ichikawa K, Shoura MJ, Artiles KL, Gabdank I, Wahba L, Smith CL, Edgley ML, Rougvie AE, Fire AZ, Morishita S, Schwarz EM. Recompleting the Caenorhabditis elegans genome. Genome Res 2019; 29:1009-1022. [PMID: 31123080 PMCID: PMC6581061 DOI: 10.1101/gr.244830.118] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Received: 10/04/2018] [Accepted: 03/11/2019] [Indexed: 01/14/2023]
Abstract
Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.
Affiliation(s)
- Jun Yoshimura
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
- Kazuki Ichikawa
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
- Massa J Shoura
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
- Karen L Artiles
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
- Idan Gabdank
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Lamia Wahba
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
- Cheryl L Smith
  - Department of Pathology, Stanford University, Stanford, California 94305, USA; Department of Genetics, Stanford University, Stanford, California 94305, USA
- Mark L Edgley
  - Department of Zoology and Michael Smith Laboratories, University of British Columbia, Vancouver V6T 1Z3, British Columbia, Canada
- Ann E Rougvie
  - Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota 55454, USA
- Andrew Z Fire
  - Department of Pathology, Stanford University, Stanford, California 94305, USA; Department of Genetics, Stanford University, Stanford, California 94305, USA
- Shinichi Morishita
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
- Erich M Schwarz
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
|
54
|
Chang CH, Chavan A, Palladino J, Wei X, Martins NMC, Santinello B, Chen CC, Erceg J, Beliveau BJ, Wu CT, Larracuente AM, Mellone BG. Islands of retroelements are major components of Drosophila centromeres. PLoS Biol 2019; 17:e3000241. [PMID: 31086362 PMCID: PMC6516634 DOI: 10.1371/journal.pbio.3000241] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 10/17/2018] [Accepted: 04/08/2019] [Indexed: 12/24/2022] Open
Abstract
Centromeres are essential chromosomal regions that mediate kinetochore assembly and spindle attachments during cell division. Despite their functional conservation, centromeres are among the most rapidly evolving genomic regions and can shape karyotype evolution and speciation across taxa. Although significant progress has been made in identifying centromere-associated proteins, the highly repetitive centromeres of metazoans have been refractory to DNA sequencing and assembly, leaving large gaps in our understanding of their functional organization and evolution. Here, we identify the sequence composition and organization of the centromeres of Drosophila melanogaster by combining long-read sequencing, chromatin immunoprecipitation for the centromeric histone CENP-A, and high-resolution chromatin fiber imaging. Contrary to previous models that heralded satellite repeats as the major functional components, we demonstrate that functional centromeres form on islands of complex DNA sequences enriched in retroelements that are flanked by large arrays of satellite repeats. Each centromere displays distinct size and arrangement of its DNA elements but is similar in composition overall. We discover that a specific retroelement, G2/Jockey-3, is the most highly enriched sequence in CENP-A chromatin and is the only element shared among all centromeres. G2/Jockey-3 is also associated with CENP-A in the sister species D. simulans, revealing an unexpected conservation despite the reported turnover of centromeric satellite DNA. Our work reveals the DNA sequence identity of the active centromeres of a premier model organism and implicates retroelements as conserved features of centromeric DNA.
Affiliation(s)
- Ching-Ho Chang
  - Department of Biology, University of Rochester, Rochester, New York, United States of America
- Ankita Chavan
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Jason Palladino
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Xiaolu Wei
  - Department of Biomedical Genetics, University of Rochester Medical Center, Rochester, New York, United States of America
- Nuno M. C. Martins
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Bryce Santinello
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Chin-Chi Chen
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
- Jelena Erceg
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Brian J. Beliveau
  - Wyss Institute for Biologically Inspired Engineering, Harvard Medical School, Boston, Massachusetts, United States of America
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
  - Department of Genome Sciences, University of Washington Seattle, Seattle, Washington, United States of America
- Chao-Ting Wu
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Amanda M. Larracuente
  - Department of Biology, University of Rochester, Rochester, New York, United States of America
- Barbara G. Mellone
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, United States of America
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, United States of America
|
55
|
Taking Advantage of the Genomics Revolution for Monitoring and Conservation of Chondrichthyan Populations. DIVERSITY-BASEL 2019. [DOI: 10.3390/d11040049] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
Abstract
Chondrichthyes (sharks, rays, skates and chimaeras) are among the oldest extant predators and are vital to top-down regulation of oceanic ecosystems. They are an ecologically diverse group occupying a wide range of habitats and are thus exploited by coastal, pelagic and deep-water fishing industries. Chondrichthyes are among the most data-deficient vertebrate species groups, making design and implementation of regulatory and conservation measures challenging. High-throughput sequencing technologies have significantly propelled ecological investigations and understanding of marine and terrestrial species’ populations, but there remains a paucity of NGS-based research on chondrichthyan populations. We present a brief review of current methods to access genomic and metagenomic data from Chondrichthyes and discuss applications of these datasets to increase our understanding of chondrichthyan taxonomy, evolution, ecology and population structures. Last, we consider opportunities and challenges offered by genomic studies for conservation and management of chondrichthyan populations.
|
56
|
Shabardina V, Kischka T, Manske F, Grundmann N, Frith MC, Suzuki Y, Makałowski W. NanoPipe-a web server for nanopore MinION sequencing data analysis. Gigascience 2019; 8:giy169. [PMID: 30689855 PMCID: PMC6377397 DOI: 10.1093/gigascience/giy169] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/10/2018] [Revised: 12/10/2018] [Accepted: 12/23/2018] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: The fast-moving progress of the third-generation long-read sequencing technologies will soon bring the biological and medical sciences to a new era of research. Altogether, the technique and experimental procedures are becoming more straightforward and available to biologists from diverse fields, even without any profound experience in DNA sequencing. Thus, the introduction of the MinION device by Oxford Nanopore Technologies promises to "bring sequencing technology to the masses" and also allows quick and operative analysis in field studies. However, the convenience of this sequencing technology dramatically contrasts with the available analysis tools, which may significantly reduce enthusiasm of a "regular" user. To really bring the sequencing technology to every biologist, we need a set of user-friendly tools that can perform a powerful analysis in an automatic manner.
FINDINGS: NanoPipe was developed in consideration of the specifics of the MinION sequencing technologies, providing accordingly adjusted alignment parameters. The range of the target species/sequences for the alignment is not limited, and the descriptive usage page of NanoPipe helps a user to succeed with NanoPipe analysis. The results contain alignment statistics, consensus sequence, polymorphisms data, and visualization of the alignment. Several test cases are used to demonstrate the efficiency of the tool.
CONCLUSIONS: Freely available NanoPipe software allows effortless and reliable analysis of MinION sequencing data for experienced bioinformaticians, as well as for wet-lab biologists with minimum bioinformatics knowledge. Moreover, for the latter group, we describe the basic algorithm necessary for MinION sequencing analysis from the first to last step.
Affiliation(s)
- Victoria Shabardina
  - Institute of Bioinformatics, University of Muenster, Niels-Stensen-Strasse 14, Muenster, 48149, Germany
- Tabea Kischka
  - Institute of Bioinformatics, University of Muenster, Niels-Stensen-Strasse 14, Muenster, 48149, Germany
- Felix Manske
  - Institute of Bioinformatics, University of Muenster, Niels-Stensen-Strasse 14, Muenster, 48149, Germany
- Norbert Grundmann
  - Institute of Bioinformatics, University of Muenster, Niels-Stensen-Strasse 14, Muenster, 48149, Germany
- Martin C Frith
  - Artificial Intelligence Research Center, AIST, 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
  - Department of Computational Biology and Medical Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
  - AIST-Waseda University Computational Bio Big Data Open Innovation Laboratory, 3-4-1 Ookubo, Shinjuku-ku, Tokyo, 169-8555, Japan
- Yutaka Suzuki
  - Laboratory of Systems Genomics, Department of Computational Biology and Medical Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
- Wojciech Makałowski
  - Institute of Bioinformatics, University of Muenster, Niels-Stensen-Strasse 14, Muenster, 48149, Germany
|
57
|
Highly Contiguous Genome Assemblies of 15 Drosophila Species Generated Using Nanopore Sequencing. G3-GENES GENOMES GENETICS 2018; 8:3131-3141. [PMID: 30087105 PMCID: PMC6169393 DOI: 10.1534/g3.118.200160] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Indexed: 12/16/2022]
Abstract
The Drosophila genus is a unique group containing a wide range of species that occupy diverse ecosystems. In addition to the most widely studied species, Drosophila melanogaster, many other members in this genus also possess a well-developed set of genetic tools. Indeed, high-quality genomes exist for several species within the genus, facilitating studies of the function and evolution of cis-regulatory regions and proteins by allowing comparisons across at least 50 million years of evolution. Yet, the available genomes still fail to capture much of the substantial genetic diversity within the Drosophila genus. We have therefore tested protocols to rapidly and inexpensively sequence and assemble the genome from any Drosophila species using single-molecule sequencing technology from Oxford Nanopore. Here, we use this technology to present highly contiguous genome assemblies of 15 Drosophila species: 10 of the 12 originally sequenced Drosophila species (ananassae, erecta, mojavensis, persimilis, pseudoobscura, sechellia, simulans, virilis, willistoni, and yakuba), four additional species that had previously reported assemblies (biarmipes, bipectinata, eugracilis, and mauritiana), and one novel assembly (triauraria). Genomes were generated from an average of 29x depth-of-coverage data that after assembly resulted in an average contig N50 of 4.4 Mb. Subsequent alignment of contigs from the published reference genomes demonstrates that our assemblies could be used to close over 60% of the gaps present in the currently published reference genomes. Importantly, the materials and reagents cost for each genome was approximately $1,000 (USD). This study demonstrates the power and cost-effectiveness of long-read sequencing for genome assembly in Drosophila and provides a framework for the affordable sequencing and assembly of additional Drosophila genomes.
|