51
|
Knot IE, Zouganelis GD, Weedall GD, Wich SA, Rae R. DNA Barcoding of Nematodes Using the MinION. Front Ecol Evol 2020. [DOI: 10.3389/fevo.2020.00100] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/23/2023] Open
|
52
|
Yuan Z, Liu Y, Dai M, Yi X, Wang C. Controlling DNA Translocation Through Solid-state Nanopores. Nanoscale Res Lett 2020; 15:80. [PMID: 32297032 PMCID: PMC7158975 DOI: 10.1186/s11671-020-03308-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/14/2019] [Accepted: 03/24/2020] [Indexed: 05/14/2023]
Abstract
Compared with the status of bio-nanopores, there are still several challenges that need to be overcome before solid-state nanopores can be applied in commercial DNA sequencing. Low spatial and low temporal resolution are the two major challenges. Owing to restrictions on nanopore length and the solid-state nanopores' surface properties, there is still room for improving the spatial resolution. Meanwhile, DNA translocation is too fast under an electrical force, which results in the acquisition of few valid data points. The temporal resolution of solid-state nanopores could thus be enhanced if the DNA translocation speed is well controlled. In this mini-review, we briefly summarize the methods of improving spatial resolution and concentrate on controllable methods to promote the resolution of nanopore detection. In addition, we provide a perspective on the development of DNA sequencing by nanopores.
Affiliation(s)
- Zhishan Yuan
  - School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, 510006 China
- Youming Liu
  - School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, 510006 China
- Min Dai
  - School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, 510006 China
- Xin Yi
  - School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, 510006 China
- Chengyong Wang
  - School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, 510006 China
|
53
|
Franco‐Sierra ND, Díaz‐Nieto JF. Rapid mitochondrial genome sequencing based on Oxford Nanopore Sequencing and a proxy for vertebrate species identification. Ecol Evol 2020; 10:3544-3560. [PMID: 32274008 PMCID: PMC7141017 DOI: 10.1002/ece3.6151] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/23/2019] [Revised: 02/09/2020] [Accepted: 02/12/2020] [Indexed: 02/06/2023] Open
Abstract
Molecular information is crucial for species identification when facing challenging morphology-based specimen identifications. The use of DNA barcodes partially solves this problem, but in some cases when PCR is not an option (e.g., primers are not available, problems in reaction standardization), amplification-free approaches could be an optimal alternative. Recent advances in DNA sequencing, like the MinION device from Oxford Nanopore Technologies (ONT), make it possible to obtain genomic data with low laboratory and technical requirements, and at a relatively low cost. In this study, we explore ONT sequencing for molecular species identification from a total DNA sample obtained from a neotropical rodent and we also test the technology for complete mitochondrial genome reconstruction via genome skimming. We were able to obtain "de novo" the complete mitogenome of a specimen from the genus Melanomys (Cricetidae: Sigmodontinae) with an average depth of coverage of 78X using ONT-only data and by combining multiple assembly routines. Our pipeline for automated species identification was able to identify the sample using unassembled (raw) sequence data in a reasonable computing time, which was substantially reduced when a priori information related to the organism identity was known. Our findings suggest ONT sequencing as a suitable candidate to solve species identification problems in metazoan nonmodel organisms and generate complete mtDNA datasets.
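The 78X figure in this abstract is an average depth of coverage, which is conventionally the total number of sequenced bases divided by the length of the genome. The following is a minimal Python sketch of that arithmetic, not the authors' pipeline; the function name `mean_depth`, the read lengths, and the genome length are all hypothetical values chosen for illustration.

```python
def mean_depth(read_lengths, genome_length):
    """Average depth of coverage = total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


# Hypothetical inputs for illustration only (not data from the study).
reads = [4000, 6500, 12000, 9500]  # read lengths in bases
mitogenome = 16000                 # genome length in bases
print(f"{mean_depth(reads, mitogenome):.1f}X")  # prints "2.0X"
```

In practice only the bases that actually align to the target genome would be counted, so this simple ratio is an upper bound when the sample is total DNA.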
Affiliation(s)
- Nicolás D. Franco‐Sierra
  - Grupo de investigación en Biodiversidad, Evolución y Conservación (BEC), Departamento de Ciencias Biológicas, Escuela de Ciencias, Universidad EAFIT, Medellín, Colombia
- Juan F. Díaz‐Nieto
  - Grupo de investigación en Biodiversidad, Evolución y Conservación (BEC), Departamento de Ciencias Biológicas, Escuela de Ciencias, Universidad EAFIT, Medellín, Colombia
|
54
|
Pollo SMJ, Reiling SJ, Wit J, Workentine ML, Guy RA, Batoff GW, Yee J, Dixon BR, Wasmuth JD. Benchmarking hybrid assemblies of Giardia and prediction of widespread intra-isolate structural variation. Parasit Vectors 2020; 13:108. [PMID: 32111234 PMCID: PMC7048089 DOI: 10.1186/s13071-020-3968-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/03/2019] [Accepted: 02/13/2020] [Indexed: 01/02/2023] Open
Abstract
Background Currently available short read genome assemblies of the tetraploid protozoan parasite Giardia intestinalis are highly fragmented, highlighting the need for improved genome assemblies at a reasonable cost. Long nanopore reads are well suited to resolve repetitive genomic regions resulting in better quality assemblies of eukaryotic genomes. Subsequent addition of highly accurate short reads to long-read assemblies further improves assembly quality. Using this hybrid approach, we assembled genomes for three Giardia isolates, two with published assemblies and one novel, to evaluate the improvement in genome quality gained from long reads. We then used the long reads to predict structural variants to examine this previously unexplored source of genetic variation in Giardia. Methods With MinION reads for each isolate, we assembled genomes using several assemblers specializing in long reads. Assembly metrics, gene finding, and whole genome alignments to the reference genomes enabled direct comparison to evaluate the performance of the nanopore reads. Further improvements from adding Illumina reads to the long-read assemblies were evaluated using gene finding. Structural variants were predicted from alignments of the long reads to the best hybrid genome for each isolate and enrichment of key genes was analyzed using random genome sampling and calculation of percentiles to find thresholds of significance. Results Our hybrid assembly method generated reference quality genomes for each isolate. Consistent with previous findings based on SNPs, examination of heterozygosity using the structural variants found that Giardia BGS was considerably more heterozygous than the other isolates that are from Assemblage A. Further, each isolate was shown to contain structural variant regions enriched for variant-specific surface proteins, a key class of virulence factor in Giardia. 
Conclusions The ability to generate reference quality genomes from a single MinION run and a multiplexed MiSeq run enables future large-scale comparative genomic studies within the genus Giardia. Further, prediction of structural variants from long reads allows for more in-depth analyses of major sources of genetic variation within and between Giardia isolates that could have effects on both pathogenicity and host range.
Affiliation(s)
- Stephen M J Pollo
  - Department of Ecosystem and Public Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
  - Host-Parasite Interactions Training Program, University of Calgary, Calgary, AB, Canada
- Sarah J Reiling
  - Bureau of Microbial Hazards, Food Directorate, Health Canada, Ottawa, ON, Canada
- Janneke Wit
  - Host-Parasite Interactions Training Program, University of Calgary, Calgary, AB, Canada
  - Department of Comparative Biology and Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
- Matthew L Workentine
  - Department of Ecosystem and Public Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
- Rebecca A Guy
  - Division of Enteric Diseases, National Microbiology Laboratory, Public Health Agency of Canada, Guelph, ON, Canada
- G William Batoff
  - Department of Biology, Biochemistry and Molecular Biology Program, Trent University, Peterborough, ON, Canada
- Janet Yee
  - Department of Biology, Biochemistry and Molecular Biology Program, Trent University, Peterborough, ON, Canada
- Brent R Dixon
  - Bureau of Microbial Hazards, Food Directorate, Health Canada, Ottawa, ON, Canada
- James D Wasmuth
  - Department of Ecosystem and Public Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB, Canada
  - Host-Parasite Interactions Training Program, University of Calgary, Calgary, AB, Canada
|
55
|
Zhao Y, Long L, Wan J, Biliya S, Brady SC, Lee D, Ojemakinde A, Andersen EC, Vannberg FO, Lu H, McGrath PT. A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of Caenorhabditis elegans. PLoS Genet 2020; 16:e1008606. [PMID: 32092052 PMCID: PMC7058356 DOI: 10.1371/journal.pgen.1008606] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/12/2019] [Revised: 03/05/2020] [Accepted: 01/11/2020] [Indexed: 01/02/2023] Open
Abstract
Over long evolutionary timescales, major changes to the copy number, function, and genomic organization of genes occur; however, our understanding of the individual mutational events responsible for these changes is lacking. In this report, we study the genetic basis of adaptation of two strains of C. elegans to laboratory food sources using competition experiments on a panel of 89 recombinant inbred lines (RIL). Unexpectedly, we identified a single RIL with higher relative fitness than either of the parental strains. This strain also displayed a novel behavioral phenotype, resulting in a higher propensity to explore bacterial lawns. Using bulk-segregant analysis and short-read resequencing of this RIL, we mapped the change in exploration behavior to a spontaneous, complex rearrangement of the rcan-1 gene that occurred during construction of the RIL panel. We resolved this rearrangement into five unique tandem inversion/duplications using Oxford Nanopore long-read sequencing. rcan-1 encodes an ortholog of the human RCAN1/DSCR1 calcipressin gene, which has been implicated as a causal gene for Down syndrome. The genomic rearrangement in rcan-1 creates two complete and two truncated versions of the rcan-1 coding region, with a variety of modified 5’ and 3’ non-coding regions. While most copy-number variations (CNVs) are thought to act by increasing expression of duplicated genes, these changes to rcan-1 ultimately result in the reduction of its whole-body expression due to changes in the upstream regions. By backcrossing this rearrangement into a common genetic background to create a near isogenic line (NIL), we demonstrate that both the competitive advantage and exploration behavioral changes are linked to this complex genetic variant. This NIL strain does not phenocopy a strain containing an rcan-1 loss-of-function allele, which suggests that the residual expression of rcan-1 is necessary for its fitness effects.
Our results demonstrate how colonization of new environments, such as those encountered in the laboratory, can create evolutionary pressure to modify gene function. This evolutionary mismatch can be resolved by an unexpectedly complex genetic change that simultaneously duplicates and diversifies a gene into two uniquely regulated genes. Our work shows how complex rearrangements can act to modify gene expression in ways besides increased gene dosage.

Evolution acts on genetic variants that modify phenotypes that increase the likelihood of staying alive and passing on these genetic changes to subsequent generations (i.e. fitness). There is general interest in understanding the types of genetic variants that can increase fitness in specific environments. One route by which fitness can be increased is through changes in behavior, such as finding new food sources. Here, we identify a spontaneous genetic change that increases exploration behavior and fitness of animals in laboratory environments. Interestingly, this genetic change is not a simple genetic change that deletes or changes the sequence of a protein product, but rather a complex structural variant that simultaneously duplicates the rcan-1 gene and also modifies its expression in a number of tissues. Our work demonstrates how a complex structural change can duplicate a gene, modify the DNA control regions that determine its cellular sites of action, and confer a fitness advantage that could lead to its spread in a population.
Affiliation(s)
- Yuehui Zhao
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Lijiang Long
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - Interdisciplinary Graduate Program in Quantitative Biosciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Jason Wan
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, Georgia, United States of America
- Shweta Biliya
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Shannon C. Brady
  - Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, United States of America
- Daehan Lee
  - Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, United States of America
- Akinade Ojemakinde
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Erik C. Andersen
  - Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, United States of America
- Fredrik O. Vannberg
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - Parker H. Petit Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Hang Lu
  - Parker H. Petit Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Patrick T. McGrath
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - Interdisciplinary Graduate Program in Quantitative Biosciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia, United States of America
|
56
|
Abstract
Individuals within a species can exhibit vast variation in copy number of repetitive DNA elements. This variation may contribute to complex traits such as lifespan and disease, yet it is only infrequently considered in genotype-phenotype associations. Although the possible importance of copy number variation is widely recognized, accurate copy number quantification remains challenging. Here, we assess the technical reproducibility of several major methods for copy number estimation as they apply to the large repetitive ribosomal DNA array (rDNA). rDNA encodes the ribosomal RNAs and exists as a tandem gene array in all eukaryotes. Repeat units of rDNA are kilobases in size, often with several hundred units comprising the array, making rDNA particularly intractable to common quantification techniques. We evaluate pulsed-field gel electrophoresis, droplet digital PCR, and Nextera-based whole genome sequencing as approaches to copy number estimation, comparing techniques across model organisms and spanning wide ranges of copy numbers. Nextera-based whole genome sequencing, though commonly used in recent literature, produced high error. We explore possible causes for this error and provide recommendations for best practices in rDNA copy number estimation. We present a resource of high-confidence rDNA copy number estimates for a set of S. cerevisiae and C. elegans strains for future use. We furthermore explore the possibility for FISH-based copy number estimation, an alternative that could potentially characterize copy number on a cellular level.
|
57
|
Vasudevan K, Devanga Ragupathi NK, Jacob JJ, Veeraraghavan B. Highly accurate-single chromosomal complete genomes using IonTorrent and MinION sequencing of clinical pathogens. Genomics 2020; 112:545-551. [DOI: 10.1016/j.ygeno.2019.04.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 01/30/2019] [Revised: 03/13/2019] [Accepted: 04/09/2019] [Indexed: 10/27/2022]
|
58
|
MinION sequencing of Streptococcus suis allows for functional characterization of bacteria by multilocus sequence typing and antimicrobial resistance profiling. J Microbiol Methods 2019; 169:105817. [PMID: 31881288 DOI: 10.1016/j.mimet.2019.105817] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/08/2019] [Revised: 12/20/2019] [Accepted: 12/23/2019] [Indexed: 01/31/2023]
Abstract
In recent years, high-throughput sequencing has revolutionized disease diagnosis by its powerful ability to provide high resolution genomic information. The Oxford Nanopore MinION sequencer has unparalleled potential as a rapid disease diagnostic tool due to its high mobility, accessibility, and short turnaround time. However, there is a lack of rigorous quality assessment and control processes standardizing the testing on the MinION, which is necessary for incorporation into a diagnostic workflow. Thus, our study examined the use of the MinION sequencer for bacterial whole genome generation and characterization. Using Streptococcus suis as a model, we optimized DNA isolation and treatments to be used for MinION sequencing and standardized de novo assembly to quickly generate a full-length consensus sequence achieving a 99.4% average accuracy. The consensus genomes from MinION sequencing were able to accurately predict the multilocus sequence type in 8 out of 10 samples and identified antimicrobial resistance profiles for 100% of the samples, despite the concern of a high error rate. The inability to unequivocally predict sequence types was due to difficulty in differentiating high identity alleles, which was overcome by applying additional error correction methods to increase consensus accuracy. This manuscript provides methods for the use of MinION sequencing for identification of S. suis genome sequence, sequence type, and antibiotic resistance profile that can be used as a framework for identification and classification of other pathogens.
|
59
|
Petersen LM, Martin IW, Moschetti WE, Kershaw CM, Tsongalis GJ. Third-Generation Sequencing in the Clinical Laboratory: Exploring the Advantages and Challenges of Nanopore Sequencing. J Clin Microbiol 2019; 58:e01315-19. [PMID: 31619531 PMCID: PMC6935936 DOI: 10.1128/jcm.01315-19] [Citation(s) in RCA: 125] [Impact Index Per Article: 25.0] [Indexed: 02/08/2023] Open
Abstract
Metagenomic sequencing for infectious disease diagnostics is an important tool that holds promise for use in the clinical laboratory. Challenges for implementation so far include high cost, the length of time to results, and the need for technical and bioinformatics expertise. However, the recent technological innovation of nanopore sequencing from Oxford Nanopore Technologies (ONT) has the potential to address these challenges. ONT sequencing is an attractive platform for clinical laboratories to adopt due to its low cost, rapid turnaround time, and user-friendly bioinformatics pipelines. However, this method still faces the problem of base-calling accuracy compared to other platforms. This review highlights the general challenges of pathogen detection in clinical specimens by metagenomic sequencing, the advantages and disadvantages of the ONT platform, and how research to date supports the potential future use of nanopore sequencing in infectious disease diagnostics.
Affiliation(s)
- Lauren M Petersen
  - Dartmouth-Hitchcock Medical Center, Department of Pathology and Laboratory Medicine, Lebanon, New Hampshire, USA
- Isabella W Martin
  - Dartmouth-Hitchcock Medical Center, Department of Pathology and Laboratory Medicine, Lebanon, New Hampshire, USA
- Wayne E Moschetti
  - Dartmouth-Hitchcock Medical Center, Department of Orthopaedics and Sports Medicine, Lebanon, New Hampshire, USA
- Colleen M Kershaw
  - Dartmouth-Hitchcock Medical Center, Department of Infectious Disease and International Health, Lebanon, New Hampshire, USA
- Gregory J Tsongalis
  - Dartmouth-Hitchcock Medical Center, Department of Pathology and Laboratory Medicine, Lebanon, New Hampshire, USA
|
60
|
Fauver JR, Martin J, Weil GJ, Mitreva M, Fischer PU. De novo Assembly of the Brugia malayi Genome Using Long Reads from a Single MinION Flowcell. Sci Rep 2019; 9:19521. [PMID: 31863009 PMCID: PMC6925183 DOI: 10.1038/s41598-019-55908-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/21/2019] [Accepted: 11/28/2019] [Indexed: 11/15/2022] Open
Abstract
Filarial nematode infections cause a substantial global disease burden. Genomic studies of filarial worms can improve our understanding of their biology and epidemiology. However, genomic information from field isolates is limited and available reference genomes are often discontinuous. Single molecule sequencing technologies can reduce the cost of genome sequencing and long reads produced from these devices can improve the contiguity and completeness of genome assemblies. In addition, these new technologies can make generation and analysis of large numbers of field isolates feasible. In this study, we assessed the performance of the Oxford Nanopore Technologies MinION for sequencing and assembling the genome of Brugia malayi, a human parasite widely used in filariasis research. Using data from a single MinION flowcell, a 90.3 Mb nuclear genome was assembled into 202 contigs with an N50 of 2.4 Mb. This assembly covered 96.9% of the well-defined B. malayi reference genome with 99.2% identity. The complete mitochondrial genome was obtained with individual reads and the nearly complete genome of the endosymbiotic bacteria Wolbachia was assembled alongside the nuclear genome. Long-read data from the MinION produced an assembly that approached the quality of a well-established reference genome using comparably fewer resources.
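The contiguity statistic quoted here (an N50 of 2.4 Mb across 202 contigs) follows the standard definition: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The following is a minimal Python sketch of that definition, not the authors' tooling; the function name `n50` and the toy contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths for illustration only (not data from the study).
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 40: 50 + 40 = 90 >= 150 / 2
```

Because N50 depends only on the sorted length distribution, it rewards fewer, longer contigs but says nothing about whether those contigs are correctly assembled, which is why the abstract also reports reference coverage and identity.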
Affiliation(s)
- Joseph R Fauver
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  - Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, United States
- John Martin
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, United States
- Gary J Weil
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
- Makedonka Mitreva
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, United States
- Peter U Fischer
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, United States
|
61
|
Sevim V, Lee J, Egan R, Clum A, Hundley H, Lee J, Everroad RC, Detweiler AM, Bebout BM, Pett-Ridge J, Göker M, Murray AE, Lindemann SR, Klenk HP, O'Malley R, Zane M, Cheng JF, Copeland A, Daum C, Singer E, Woyke T. Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies. Sci Data 2019; 6:285. [PMID: 31772173 PMCID: PMC6879543 DOI: 10.1038/s41597-019-0287-z] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 01/07/2019] [Accepted: 10/31/2019] [Indexed: 11/17/2022] Open
Abstract
Metagenomic sequence data from defined mock communities is crucial for the assessment of sequencing platform performance and downstream analyses, including assembly, binning and taxonomic assignment. We report a comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms. Our synthetic microbial community BMock12 consists of 12 bacterial strains with genome sizes spanning 3.2–7.2 Mbp, 40–73% GC content, and 1.5–7.3% repeats. Size selection of both PacBio and ONT sequencing libraries prior to sequencing was essential to yield comparable relative abundances of organisms among all sequencing technologies. While the Illumina-based metagenome assembly yielded good coverage with few misassemblies, both the Illumina + ONT and Illumina + PacBio hybrid assemblies greatly improved contiguity but increased misassemblies, most notably in genomes with high sequence similarity to each other. Our resulting datasets allow evaluation and benchmarking of bioinformatics software on Illumina, PacBio and ONT platforms in parallel.
Measurement(s): metagenomic data • sequence_assembly
Technology Type(s): ONT MinION • Illumina sequencing • PacBio RS II
Factor Type(s): sequencing platform
Sample Characteristic - Organism: Bacteria
Sample Characteristic - Environment: mock community
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.10260740
Affiliation(s)
- Volkan Sevim
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Juna Lee
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Robert Egan
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Alicia Clum
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Hope Hundley
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Janey Lee
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- R Craig Everroad
  - NASA Ames Research Center, Exobiology Branch, Moffett Field, CA, 94035, USA
- Angela M Detweiler
  - NASA Ames Research Center, Exobiology Branch, Moffett Field, CA, 94035, USA
  - Bay Area Environmental Research Institute, Moffett Field, CA, 94035, USA
- Brad M Bebout
  - NASA Ames Research Center, Exobiology Branch, Moffett Field, CA, 94035, USA
- Jennifer Pett-Ridge
  - Lawrence Livermore National Laboratory, Nuclear and Chemical Science Division, 7000 East Ave, Livermore, CA, 94550-9234, USA
- Markus Göker
  - Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Inhoffenstraße 7B, 38124, Braunschweig, Germany
- Alison E Murray
  - Desert Research Institute, Division of Earth and Ecosystem Sciences, 2215 Raggio Pkwy, Reno, NV, 89512, USA
- Hans-Peter Klenk
  - Newcastle University, School of Natural and Environmental Sciences, Ridley Building 2, Newcastle upon Tyne, NE1 7RU, UK
- Ronan O'Malley
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Matthew Zane
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Jan-Fang Cheng
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Alex Copeland
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Christopher Daum
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
- Esther Singer
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
  - Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Tanja Woyke
  - DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, 94598, USA
|
62
|
Rapid, multiplexed, whole genome and plasmid sequencing of foodborne pathogens using long-read nanopore technology. Sci Rep 2019; 9:16350. [PMID: 31704961 PMCID: PMC6841976 DOI: 10.1038/s41598-019-52424-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 02/24/2019] [Accepted: 10/16/2019] [Indexed: 12/11/2022] Open
Abstract
U.S. public health agencies have employed next-generation sequencing (NGS) as a tool to quickly identify foodborne pathogens during outbreaks. Although established short-read NGS technologies are known to provide highly accurate data, long-read sequencing is still needed to resolve highly repetitive genomic regions and genomic arrangement, and to close the sequences of bacterial chromosomes and plasmids. Here, we report the use of long-read nanopore sequencing to simultaneously sequence the entire chromosome and plasmid of Salmonella enterica subsp. enterica serovar Bareilly and Escherichia coli O157:H7. We developed a rapid and random sequencing approach coupled with de novo genome assembly within a customized data analysis workflow that uses publicly available tools. In sequencing runs as short as four hours, using the MinION instrument, we obtained full-length genomes with an average identity of 99.87% for Salmonella Bareilly and 99.89% for E. coli in comparison to the respective MiSeq references. These nanopore-only assemblies provided readily available information on serotype, virulence factors, and antimicrobial resistance genes. We also demonstrate the potential of nanopore sequencing assemblies for rapid preliminary phylogenetic inference. Nanopore sequencing provides additional advantages, such as very low capital investment and a small footprint, and a shorter turnaround time (10 hours for library preparation and sequencing) compared to other NGS technologies.
|
63
|
Liechti N, Schürch N, Bruggmann R, Wittwer M. Nanopore sequencing improves the draft genome of the human pathogenic amoeba Naegleria fowleri. Sci Rep 2019; 9:16040. [PMID: 31690847 PMCID: PMC6831594 DOI: 10.1038/s41598-019-52572-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/08/2019] [Accepted: 10/12/2019] [Indexed: 11/09/2022] Open
Abstract
Naegleria fowleri is an environmental protist found in soil and warm freshwater sources worldwide and is known for its ability to infect humans and cause a rapid and mostly fatal primary amoebic meningoencephalitis. When contaminated water enters the nose, the facultative parasite follows the olfactory nerve and enters the brain by crossing the cribriform plate, where it causes tissue damage and haemorrhagic necrosis. Although N. fowleri has been studied for several years, the mechanisms of pathogenicity are still poorly understood. Furthermore, there is a lack of knowledge on the genomic level and the current reference assembly is limited in contiguity. To improve the draft genome and to investigate pathogenicity factors, we sequenced the genome of N. fowleri using Oxford Nanopore Technology (ONT). Assembly and polishing of the long reads resulted in a high-quality draft genome whose N50 is 18 times higher than that of the previously published genome. The prediction of potentially secreted proteins revealed a large proportion of enzymes with a hydrolysing function, which could play an important role during the pathogenesis and account for the destructive nature of primary amoebic meningoencephalitis. The improved genome provides the basis for further investigation unravelling the biology and the pathogenic potential of N. fowleri.
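As a hedged illustration of the N50 contiguity statistic mentioned in this abstract: N50 is the contig length such that contigs of that length or longer contain at least half of all assembled bases. The sketch below uses made-up toy contig lengths, not data from the study, and `n50` is a hypothetical helper name.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy values (assumption, for illustration only).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

An "18 times higher" N50 therefore means that the contig length at the half-assembly mark grew 18-fold, which is a statement about contiguity rather than base-level accuracy.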
Affiliation(s)
- Nicole Liechti
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, Switzerland
- Spiez Laboratory, Federal Office for Civil Protection, Austrasse, Spiez, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
| | - Nadia Schürch
- Spiez Laboratory, Federal Office for Civil Protection, Austrasse, Spiez, Switzerland
| | - Rémy Bruggmann
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, Switzerland
| | - Matthias Wittwer
- Spiez Laboratory, Federal Office for Civil Protection, Austrasse, Spiez, Switzerland.
| |
|
64
|
Pjeta R, Wunderer J, Bertemes P, Hofer T, Salvenmoser W, Lengerer B, Coassin S, Erhart G, Beisel C, Sobral D, Kremser L, Lindner H, Curini-Galletti M, Stelzer CP, Hess MW, Ladurner P. Temporary adhesion of the proseriate flatworm Minona ileanae. Philos Trans R Soc Lond B Biol Sci 2019; 374:20190194. [PMID: 31495318 PMCID: PMC6745481 DOI: 10.1098/rstb.2019.0194] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Accepted: 05/28/2019] [Indexed: 01/05/2023] Open
Abstract
Flatworms can very rapidly attach to and detach from many substrates. In the presented work, we analysed the adhesive system of the marine proseriate flatworm Minona ileanae. We used light-, scanning- and transmission electron microscopy to analyse the morphology of the adhesive organs, which are located at the ventral side of the tail-plate. We performed transcriptome sequencing and differential RNA-seq for the identification of tail-specific transcripts. Using in situ hybridization expression screening, we identified nine transcripts that were expressed in the cells of the adhesive organs. Knock-down of five of these transcripts by RNA interference led to a reduction of the animal's attachment capacity. Adhesive proteins in footprints were confirmed using mass spectrometry and antibody staining. Additionally, lectin labelling of footprints revealed the presence of several sugar moieties. Furthermore, we determined a genome size of about 560 Mb for M. ileanae. We demonstrated the potential of Oxford Nanopore sequencing of genomic DNA as a cost-effective tool for identifying the number of repeats within an adhesive protein and for combining transcripts that were fragments of larger genes. A better understanding of the molecules involved in flatworm bioadhesion can pave the way towards developing innovative glues with reversible adhesive properties. This article is part of the theme issue 'Transdisciplinary approaches to the study of adhesion and adhesives in biological systems'.
Affiliation(s)
- Robert Pjeta
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| | - Julia Wunderer
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| | - Philip Bertemes
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| | - Teresa Hofer
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| | - Willi Salvenmoser
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| | - Birgit Lengerer
- Biology of Marine Organisms and Biomimetics, Research Institute for Biosciences, University of Mons, 7000 Mons, Belgium
| | - Stefan Coassin
- Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Gertraud Erhart
- Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Christian Beisel
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | | | - Leopold Kremser
- Division of Clinical Biochemistry, Biocenter, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Herbert Lindner
- Division of Clinical Biochemistry, Biocenter, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | | | - Claus-Peter Stelzer
- Research Institute for Limnology, University of Innsbruck, 5310 Mondsee, Austria
| | - Michael W. Hess
- Division of Histology and Embryology, Medical University of Innsbruck, 6020 Innsbruck, Austria
| | - Peter Ladurner
- Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, 6020 Innsbruck, Austria
| |
|
65
|
Douglas GM, Langille MGI. Current and Promising Approaches to Identify Horizontal Gene Transfer Events in Metagenomes. Genome Biol Evol 2019; 11:2750-2766. [PMID: 31504488 PMCID: PMC6777429 DOI: 10.1093/gbe/evz184] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Accepted: 08/19/2019] [Indexed: 12/16/2022] Open
Abstract
High-throughput shotgun metagenomics sequencing has enabled the profiling of myriad natural communities. These data are commonly used to identify gene families and pathways that were potentially gained or lost in an environment and which may be involved in microbial adaptation. Despite the widespread interest in these events, there are no established best practices for identifying gene gain and loss in metagenomics data. Horizontal gene transfer (HGT) represents several mechanisms of gene gain that are especially of interest in clinical microbiology due to the rapid spread of antibiotic resistance genes in natural communities. Several additional mechanisms of gene gain and loss, including gene duplication, gene loss-of-function events, and de novo gene birth are also important to consider in the context of metagenomes but have been less studied. This review is largely focused on detecting HGT in prokaryotic metagenomes, but methods for detecting these other mechanisms are first discussed. For this article to be self-contained, we provide a general background on HGT and the different possible signatures of this process. Lastly, we discuss how improved assembly of genomes from metagenomes would be the most straight-forward approach for improving the inference of gene gain and loss events. Several recent technological advances could help improve metagenome assemblies: long-read sequencing, determining the physical proximity of contigs, optical mapping of short sequences along chromosomes, and single-cell metagenomics. The benefits and limitations of these advances are discussed and open questions in this area are highlighted.
Affiliation(s)
- Gavin M Douglas
- Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Morgan G I Langille
- Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
| |
|
66
|
Howe KL. A new reference genome sequence for Caenorhabditis elegans? Lab Anim (NY) 2019; 48:267-268. [DOI: 10.1038/s41684-019-0371-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
67
|
Edwards HS, Krishnakumar R, Sinha A, Bird SW, Patel KD, Bartsch MS. Real-Time Selective Sequencing with RUBRIC: Read Until with Basecall and Reference-Informed Criteria. Sci Rep 2019; 9:11475. [PMID: 31391493 PMCID: PMC6685950 DOI: 10.1038/s41598-019-47857-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 02/19/2019] [Accepted: 07/09/2019] [Indexed: 12/12/2022] Open
Abstract
The Oxford MinION, the first commercial nanopore sequencer, is also the first to implement molecule-by-molecule real-time selective sequencing or “Read Until”. As DNA transits a MinION nanopore, real-time pore current data can be accessed and analyzed to provide active feedback to that pore. Fragments of interest are sequenced by default, while DNA deemed non-informative is rejected by reversing the pore bias to eject the strand, providing a novel means of background depletion and/or target enrichment. In contrast to the previously published pattern-matching Read Until approach, our RUBRIC method is the first example of real-time selective sequencing where on-line basecalling enables alignment against conventional nucleic acid references to provide the basis for sequence/reject decisions. We evaluate RUBRIC performance across a range of optimizable parameters, apply it to mixed human/bacteria and CRISPR/Cas9-cut samples, and present a generalized model for estimating real-time selection performance as a function of sample composition and computing configuration.
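The sequence/reject decision described in this abstract can be sketched as a toy, under loud assumptions: this is not the RUBRIC implementation or any Oxford Nanopore API. A simple shared k-mer count stands in for real-time basecalling followed by alignment, and `decide`, `REFERENCE`, `k`, and `min_hits` are hypothetical names and values chosen for illustration.

```python
def decide(prefix: str, reference: str, k: int = 8, min_hits: int = 2) -> str:
    """Toy sequence/reject rule for the first bases of a read.

    A k-mer lookup stands in for alignment (assumption): keep the read if
    its basecalled prefix shares at least `min_hits` k-mers with the target.
    """
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    hits = sum(
        1
        for i in range(len(prefix) - k + 1)
        if prefix[i:i + k] in ref_kmers
    )
    # "reject" models reversing the pore bias to eject the strand.
    return "sequence" if hits >= min_hits else "reject"


# Toy target reference and read prefixes (made-up sequences).
REFERENCE = "ACGTTGCATGCCGATAGCTAGGCTAACG"
on_target = REFERENCE[5:20]   # prefix drawn from the target
off_target = "T" * 16         # prefix with no similarity to the target

print(decide(on_target, REFERENCE))
print(decide(off_target, REFERENCE))
```

In this toy, rejecting off-target reads corresponds to target enrichment; inverting the rule would model background depletion instead.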
Affiliation(s)
- Harrison S Edwards
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA; Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Canada
| | - Raga Krishnakumar
- Systems Biology Dept., Sandia National Laboratories, Livermore, CA, USA
| | - Anupama Sinha
- Systems Biology Dept., Sandia National Laboratories, Livermore, CA, USA
| | - Sara W Bird
- Biotechnology & Bioengineering Dept., Sandia National Laboratories, Livermore, CA, USA; uBiome, San Francisco, CA, USA
| | - Kamlesh D Patel
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA; Purdue Partnerships Dept., Sandia National Laboratories, Albuquerque, NM, USA
| | - Michael S Bartsch
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA.
| |
|
68
|
Constructing a Reference Genome in a Single Lab: The Possibility to Use Oxford Nanopore Technology. PLANTS 2019; 8:plants8080270. [PMID: 31390788 PMCID: PMC6724115 DOI: 10.3390/plants8080270] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/12/2019] [Revised: 07/29/2019] [Accepted: 08/04/2019] [Indexed: 12/19/2022]
Abstract
Whole genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages in comparison with other Next-Generation Sequencing (NGS) platforms: it is relatively inexpensive, portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence information is released in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes when compared to animals or microorganisms. As a result, complete genome sequencing is difficult for plant species. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated from NGS by minimizing the generation of gaps and covering the repetitive sequences that appear in plant genomes. Here, we sequenced the genome of S. bicolor cv. BTx623 using the MinION platform and obtained 895,678 reads and 17.9 gigabases (Gb) (ca. 25× coverage of the reference) of long-read sequence data. In de novo assemblies using two different toolchains, Canu generated a total of 6124 contigs (covering 45.9%) and Minimap and Miniasm with Racon generated a total of 2661 contigs (covering 50%); the assembled contigs were mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform and a clue for determining the total sequencing scale needed for optimal coverage based on various genome sizes.
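The "ca. 25× coverage" figure in this abstract follows from dividing the sequencing yield by the genome size. A minimal sketch of that arithmetic, assuming the reported 17.9 Gb yield counts sequenced bases and using the genome size of about 730 Mb given above; the function names are illustrative, not from the paper.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def bases_needed(target_coverage: float, genome_size: float) -> float:
    """Bases required to reach a target average depth."""
    return target_coverage * genome_size


# Values taken from the abstract: 17.9 Gb of reads, ~730 Mb genome.
coverage = fold_coverage(17.9e9, 730e6)
print(f"{coverage:.1f}x")
```

The same relation, run in reverse with `bases_needed`, is one way to estimate the sequencing scale required for a given target depth on a genome of a different size.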
|
69
|
Soneson C, Yao Y, Bratus-Neuenschwander A, Patrignani A, Robinson MD, Hussain S. A comprehensive examination of Nanopore native RNA sequencing for characterization of complex transcriptomes. Nat Commun 2019; 10:3359. [PMID: 31366910 PMCID: PMC6668388 DOI: 10.1038/s41467-019-11272-z] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 03/11/2019] [Accepted: 07/04/2019] [Indexed: 11/29/2022] Open
Abstract
A platform for highly parallel direct sequencing of native RNA strands was recently described by Oxford Nanopore Technologies, but despite initial efforts it remains crucial to further investigate the technology for quantification of complex transcriptomes. Here we undertake native RNA sequencing of polyA+ RNA from two human cell lines, analysing ~5.2 million aligned native RNA reads. To enable informative comparisons, we also perform relevant ONT direct cDNA- and Illumina-sequencing. We find that while native RNA sequencing does enable some of the anticipated advantages, key unexpected aspects currently hamper its performance, most notably the quite frequent inability to obtain full-length transcripts from single reads, as well as difficulties in unambiguously inferring their true transcript of origin. While characterising issues that need to be addressed when investigating more complex transcriptomes, our study highlights that with some defined improvements, native RNA sequencing could be an important addition to the mammalian transcriptomics toolbox.
Affiliation(s)
- Charlotte Soneson
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland.
- Friedrich Miescher Institute for Biomedical Research and SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Yao Yao
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland
| | | | - Andrea Patrignani
- Functional Genomics Centre Zurich, ETHZ/University of Zurich, 8057, Zurich, Switzerland
| | - Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland.
| | - Shobbir Hussain
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK.
| |
|
70
|
Prabakar RK, Xu L, Hicks J, Smith AD. SMURF-seq: efficient copy number profiling on long-read sequencers. Genome Biol 2019; 20:134. [PMID: 31287019 PMCID: PMC6615205 DOI: 10.1186/s13059-019-1732-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/30/2018] [Accepted: 06/06/2019] [Indexed: 12/21/2022] Open
Abstract
We present SMURF-seq, a protocol to efficiently sequence short DNA molecules on a long-read sequencer by randomly ligating them to form long molecules. Applying SMURF-seq using the Oxford Nanopore MinION yields up to 30 fragments per read, providing an average of 6.2 and up to 7.5 million mappable fragments per run, increasing information throughput for read-counting applications. We apply SMURF-seq on the MinION to generate copy number profiles. A comparison with profiles from Illumina sequencing reveals that SMURF-seq attains similar accuracy. More broadly, SMURF-seq expands the utility of long-read sequencers for read-counting applications.
Affiliation(s)
- Rishvanth K. Prabakar
- Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, 1050 Childs Way, Los Angeles, 90089 USA
| | - Liya Xu
- Michelson Center for Convergent Bioscience, University of Southern California, 1002 Childs Way, Los Angeles, 90089 USA
| | - James Hicks
- Michelson Center for Convergent Bioscience, University of Southern California, 1002 Childs Way, Los Angeles, 90089 USA
| | - Andrew D. Smith
- Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, 1050 Childs Way, Los Angeles, 90089 USA
| |
|
71
|
Saxena AS, Salomon MP, Matsuba C, Yeh SD, Baer CF. Evolution of the Mutational Process under Relaxed Selection in Caenorhabditis elegans. Mol Biol Evol 2019; 36:239-251. [PMID: 30445510 DOI: 10.1093/molbev/msy213] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 12/13/2022] Open
Abstract
The mutational process varies at many levels, from within genomes to among taxa. Many mechanisms have been linked to variation in mutation, but understanding of the evolution of the mutational process is rudimentary. Physiological condition is often implicated as a source of variation in microbial mutation rate and may contribute to mutation rate variation in multicellular organisms. Deleterious mutations are a ubiquitous source of variation in condition. We test the hypothesis that the mutational process depends on the underlying mutation load in two groups of Caenorhabditis elegans mutation accumulation (MA) lines that differ in their starting mutation loads. "First-order MA" (O1MA) lines maintained under minimal selection for ∼250 generations were divided into high-fitness and low-fitness groups and sets of "second-order MA" (O2MA) lines derived from each O1MA line were maintained for ∼150 additional generations. Genomes of 48 O2MA lines and their progenitors were sequenced. There is significant variation among O2MA lines in base-substitution rate (µbs), but no effect of initial fitness; the indel rate is greater in high-fitness O2MA lines. Overall, µbs is positively correlated with recombination and proximity to short tandem repeats and negatively correlated with 10 bp and 1 kb GC content. However, probability of mutation is sufficiently predicted by the three-nucleotide motif alone. Approximately 90% of the variance in standing nucleotide variation is explained by mutability. Total mutation rate increased in the O2MA lines, as predicted by the "drift barrier" model of mutation rate evolution. These data, combined with experimental estimates of fitness, suggest that epistasis is synergistic.
Affiliation(s)
| | - Matthew P Salomon
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
| | - Chikako Matsuba
- Department of Biology, University of Florida, Gainesville, FL
- Department of Molecular Oncology, John Wayne Cancer Institute, Santa Monica, CA
| | - Shu-Dan Yeh
- Department of Biology, University of Florida, Gainesville, FL
- Department of Life Sciences, National Central University, Taoyuan, Taiwan
| | - Charles F Baer
- Department of Biology, University of Florida, Gainesville, FL
- University of Florida Genetics Institute
| |
|
72
|
Malmberg MM, Spangenberg GC, Daetwyler HD, Cogan NOI. Assessment of low-coverage nanopore long read sequencing for SNP genotyping in doubled haploid canola (Brassica napus L.). Sci Rep 2019; 9:8688. [PMID: 31213642 PMCID: PMC6582154 DOI: 10.1038/s41598-019-45131-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/31/2019] [Accepted: 05/28/2019] [Indexed: 11/16/2022] Open
Abstract
Despite the high accuracy of short read sequencing (SRS), there are still issues with attaining accurate single nucleotide polymorphism (SNP) genotypes at low sequencing coverage and in highly duplicated genomes due to misalignment. Long read sequencing (LRS) systems, including the Oxford Nanopore Technologies (ONT) MinION, have become popular options for de novo genome assembly and structural variant characterisation. The current high error rate often requires substantial post-sequencing correction and would appear to prevent the adoption of this system for SNP genotyping, but nanopore sequencing errors are largely random. The use of low-coverage ONT MinION sequencing for genotyping of pre-validated SNP loci was examined in 9 canola doubled haploids. The MinION genotypes were compared to the Illumina sequences to determine the extent and nature of genotype discrepancies between the two systems. The significant increase in read length improved alignment to the genome, and the absence of classical SRS biases resulted in a more even representation of the genome. Sequencing errors are present, primarily in the form of heterozygous genotypes, which can be removed in completely homozygous backgrounds but require more advanced bioinformatics in heterozygous genomes. Developments in this technology are promising for routine genotyping in the future.
Affiliation(s)
- M M Malmberg
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
| | - G C Spangenberg
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
| | - H D Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
| | - N O I Cogan
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria, 3083, Australia; School of Applied Systems Biology, La Trobe University, Bundoora, Victoria, 3086, Australia
| |
|
73
|
Kim C, Kim J, Kim S, Cook DE, Evans KS, Andersen EC, Lee J. Long-read sequencing reveals intra-species tolerance of substantial structural variations and new subtelomere formation in C. elegans. Genome Res 2019; 29:1023-1035. [PMID: 31123081 PMCID: PMC6581047 DOI: 10.1101/gr.246082.118] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 12/19/2018] [Accepted: 04/22/2019] [Indexed: 12/05/2022]
Abstract
Long-read sequencing technologies have contributed greatly to comparative genomics among species and can also be applied to study genomics within a species. In this study, to determine how substantial genomic changes are generated and tolerated within a species, we sequenced a C. elegans strain, CB4856, which is one of the most genetically divergent strains compared to the N2 reference strain. For this comparison, we used the Pacific Biosciences (PacBio) RSII platform (80×, N50 read length 11.8 kb) and generated de novo genome assembly to the level of pseudochromosomes containing 76 contigs (N50 contig = 2.8 Mb). We identified structural variations that affected as many as 2694 genes, most of which are at chromosome arms. Subtelomeric regions contained the most extensive genomic rearrangements, which even created new subtelomeres in some cases. The subtelomere structure of Chromosome VR implies that ancestral telomere damage was repaired by alternative lengthening of telomeres even in the presence of a functional telomerase gene and that a new subtelomere was formed by break-induced replication. Our study demonstrates that substantial genomic changes including structural variations and new subtelomeres can be tolerated within a species, and that these changes may accumulate genetic diversity within a species.
Affiliation(s)
- Chuna Kim
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Korea 08826
- Department of Biological Sciences, Seoul National University, Seoul, Korea 08826
| | - Jun Kim
- Department of Biological Sciences, Seoul National University, Seoul, Korea 08826
- Research Institute of Basic Sciences, Seoul National University, Seoul, Korea 08826
| | - Sunghyun Kim
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Korea 08826
- Department of Molecular and Computational Biology, University of Southern California, Los Angeles, California 90089, USA
| | - Daniel E Cook
- Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
| | - Kathryn S Evans
- Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
| | - Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, Illinois 60208, USA
| | - Junho Lee
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Korea 08826
- Department of Biological Sciences, Seoul National University, Seoul, Korea 08826
- Research Institute of Basic Sciences, Seoul National University, Seoul, Korea 08826
| |
|
74
|
Yoshimura J, Ichikawa K, Shoura MJ, Artiles KL, Gabdank I, Wahba L, Smith CL, Edgley ML, Rougvie AE, Fire AZ, Morishita S, Schwarz EM. Recompleting the Caenorhabditis elegans genome. Genome Res 2019; 29:1009-1022. [PMID: 31123080 PMCID: PMC6581061 DOI: 10.1101/gr.244830.118] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 10/04/2018] [Accepted: 03/11/2019] [Indexed: 01/14/2023]
Abstract
Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.
Affiliation(s)
- Jun Yoshimura
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
| | - Kazuki Ichikawa
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
| | - Massa J Shoura
- Department of Pathology, Stanford University, Stanford, California 94305, USA
| | - Karen L Artiles
- Department of Pathology, Stanford University, Stanford, California 94305, USA
| | - Idan Gabdank
- Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Lamia Wahba
- Department of Pathology, Stanford University, Stanford, California 94305, USA
| | - Cheryl L Smith
- Department of Pathology, Stanford University, Stanford, California 94305, USA; Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Mark L Edgley
- Department of Zoology and Michael Smith Laboratories, University of British Columbia, Vancouver V6T 1Z3, British Columbia, Canada
| | - Ann E Rougvie
- Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota 55454, USA
| | - Andrew Z Fire
- Department of Pathology, Stanford University, Stanford, California 94305, USA; Department of Genetics, Stanford University, Stanford, California 94305, USA
| | - Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan
| | - Erich M Schwarz
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| |
|
75
|
Locating and Characterizing a Transgene Integration Site by Nanopore Sequencing. G3-GENES GENOMES GENETICS 2019; 9:1481-1486. [PMID: 30837263 PMCID: PMC6505145 DOI: 10.1534/g3.119.300582] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/26/2022]
Abstract
The introduction of foreign DNA into cells and organisms has facilitated much of modern biological research, and it promises to become equally important in clinical practice. Locating sites of foreign DNA incorporation in mammalian genomes has proven burdensome, so the genomic location of most transgenes remains unknown. To address this challenge, we applied nanopore sequencing in search of the site of integration of Tg(Pou5f1-EGFP)2Mnn (also known as Oct4:EGFP), a widely used fluorescent reporter in mouse germ line research. Using this nanopore-based approach, we identified the site of Oct4:EGFP transgene integration near the telomere of Chromosome 9. This methodology simultaneously yielded an estimate of transgene copy number, provided direct evidence of transgene inversions, revealed contaminating E. coli genomic DNA within the transgene array, validated the integrity of neighboring genes, and enabled definitive genotyping. We suggest that such an approach provides a rapid, cost-effective method for identifying and analyzing transgene integration sites.
|
76
|
Kono N, Arakawa K. Nanopore sequencing: Review of potential applications in functional genomics. Dev Growth Differ 2019; 61:316-326. [DOI: 10.1111/dgd.12608] [Citation(s) in RCA: 164] [Impact Index Per Article: 32.8] [Received: 03/18/2019] [Revised: 03/26/2019] [Accepted: 03/26/2019] [Indexed: 12/17/2022]
Affiliation(s)
- Nobuaki Kono
- Institute for Advanced Biosciences Keio University Tsuruoka Yamagata Japan
| | - Kazuharu Arakawa
- Institute for Advanced Biosciences Keio University Tsuruoka Yamagata Japan
| |
|
77
|
Zeeshan S, Xiong R, Liang BT, Ahmed Z. 100 Years of evolving gene-disease complexities and scientific debutants. Brief Bioinform 2019; 21:885-905. [PMID: 30972412 DOI: 10.1093/bib/bbz038] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/09/2019] [Revised: 03/06/2019] [Accepted: 03/08/2019] [Indexed: 12/22/2022] Open
Abstract
It has been over 100 years since the word 'gene' was coined, and the concept has evolved progressively in several scientific directions. Successive technological advancements have revolutionized the field of genomics, for example through triple code development, gene number proposition, genetic mapping, data banks, gene-disease maps, catalogs of human genes and genetic disorders, CRISPR/Cas9, big data and next generation sequencing. In this manuscript, we present the progress of genomics from pea plant genetics to the human genome project and highlight the molecular, technical and computational developments. Studying the genome and epigenome has led to the fundamentals of the development and progression of human diseases, which include chromosomal, monogenic, multifactorial and mitochondrial diseases. The World Health Organization has classified, standardized and maintained all human diseases, while many academic and commercial online systems share information about genes and link them to associated diseases. To efficiently fathom the wealth of this biological data, there is a crucial need to generate appropriate gene annotation repositories and resources. Our focus has been on how many gene-disease databases are available worldwide and which sources are authentic, regularly updated and recommended for research and clinical purposes. In this manuscript, we have discussed and compared 43 such databases and bioinformatics applications, which enable users to connect, explore and, if possible, download gene-disease data.
Affiliation(s)
- Saman Zeeshan
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
| | - Ruoyun Xiong
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Bruce T Liang
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA; Pat and Jim Calhoun Cardiology Center, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Zeeshan Ahmed
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| |
|
78
|
Taking Advantage of the Genomics Revolution for Monitoring and Conservation of Chondrichthyan Populations. DIVERSITY-BASEL 2019. [DOI: 10.3390/d11040049] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
Abstract
Chondrichthyes (sharks, rays, skates and chimaeras) are among the oldest extant predators and are vital to top-down regulation of oceanic ecosystems. They are an ecologically diverse group occupying a wide range of habitats and are thus exploited by coastal, pelagic and deep-water fishing industries. Chondrichthyes are among the most data-deficient vertebrate species groups, making design and implementation of regulatory and conservation measures challenging. High-throughput sequencing technologies have significantly propelled ecological investigations and understanding of marine and terrestrial species’ populations, but there remains a paucity of NGS-based research on chondrichthyan populations. We present a brief review of current methods to access genomic and metagenomic data from Chondrichthyes and discuss applications of these datasets to increase our understanding of chondrichthyan taxonomy, evolution, ecology and population structures. Last, we consider opportunities and challenges offered by genomic studies for conservation and management of chondrichthyan populations.
|
79
|
Lim A, Naidenov B, Bates H, Willyerd K, Snider T, Couger MB, Chen C, Ramachandran A. Nanopore ultra-long read sequencing technology for antimicrobial resistance detection in Mannheimia haemolytica. J Microbiol Methods 2019; 159:138-147. [PMID: 30849421 DOI: 10.1016/j.mimet.2019.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 08/25/2018] [Revised: 03/02/2019] [Accepted: 03/04/2019] [Indexed: 02/02/2023]
Abstract
Disruptive innovations in long-range, cost-effective direct template nucleic acid sequencing are transforming clinical and diagnostic medicine. A multidrug resistant strain and a pan-susceptible strain of Mannheimia haemolytica, isolated from pneumonic bovine lung samples, were sequenced at 146× and 111× coverage, respectively, with the Oxford Nanopore Technologies MinION. De novo assembly produced a complete genome for the non-resistant strain and a nearly complete assembly for the drug resistant strain. Functional annotation using RAST (Rapid Annotations using Subsystems Technology), CARD (Comprehensive Antibiotic Resistance Database) and ResFinder databases identified genes conferring resistance to different classes of antibiotics including β-lactams, tetracyclines, lincosamides, phenicols, aminoglycosides, sulfonamides and macrolides. Resistance phenotypes of the M. haemolytica strains were determined by minimum inhibitory concentration (MIC) of the antibiotics. Sequencing with the highly portable MinION device agreed with the MIC assays: most of the antimicrobial resistance determinants were identified with as few as 5437 reads, except for the genes responsible for resistance to fluoroquinolones. The resulting quality assemblies and AMR gene annotation highlight the efficiency of ultra-long read, whole-genome sequencing (WGS) as a valuable tool in diagnostic veterinary medicine.
Affiliation(s)
- Alexander Lim
- Department of Biochemistry and Molecular Biology, Oklahoma State University, 246 Noble Research Center, Stillwater, OK 74078, United States
| | - Bryan Naidenov
- Department of Biochemistry and Molecular Biology, Oklahoma State University, 246 Noble Research Center, Stillwater, OK 74078, United States
| | - Haley Bates
- Oklahoma Animal Disease Diagnostic Laboratory, Center for Veterinary Health Sciences, 1950 W. Farm Road, Stillwater, OK 74078, United States
| | - Karyn Willyerd
- Department of Biochemistry and Molecular Biology, Oklahoma State University, 246 Noble Research Center, Stillwater, OK 74078, United States
| | - Timothy Snider
- Oklahoma Animal Disease Diagnostic Laboratory, Center for Veterinary Health Sciences, 1950 W. Farm Road, Stillwater, OK 74078, United States
| | - Matthew Brian Couger
- Department of Microbiology and Molecular Genetics, Oklahoma State University, 307 Life Sciences East, Stillwater, OK 74078, United States
| | - Charles Chen
- Department of Biochemistry and Molecular Biology, Oklahoma State University, 246 Noble Research Center, Stillwater, OK 74078, United States.
| | - Akhilesh Ramachandran
- Oklahoma Animal Disease Diagnostic Laboratory, Center for Veterinary Health Sciences, 1950 W. Farm Road, Stillwater, OK 74078, United States.
| |
|
80
|
Abstract
Affordable, high-throughput DNA sequencing has accelerated the pace of genome assembly over the past decade. Genome assemblies from high-throughput, short-read sequencing, however, are often not as contiguous as the first generation of genome assemblies. Whereas early genome assembly projects were often aided by clone maps or other mapping data, many current assembly projects forego these scaffolding data and only assemble genomes into smaller segments. Recently, new technologies have been invented that allow chromosome-scale assembly at a lower cost and faster speed than traditional methods. Here, we give an overview of the problem of chromosome-scale assembly and traditional methods for tackling this problem. We then review new technologies for chromosome-scale assembly and recent genome projects that used these technologies to create highly contiguous genome assemblies at low cost.
Affiliation(s)
- Edward S. Rice
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
| | - Richard E. Green
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Dovetail Genomics, LLC, Santa Cruz, California 95060, USA
| |
|
81
|
Fu S, Wang A, Au KF. A comparative evaluation of hybrid error correction methods for error-prone long reads. Genome Biol 2019; 20:26. [PMID: 30717772 PMCID: PMC6362602 DOI: 10.1186/s13059-018-1605-z] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 03/20/2018] [Accepted: 12/05/2018] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Third-generation sequencing technologies have advanced the progress of biological research by generating reads that are substantially longer than those of second-generation sequencing technologies. However, their notoriously high error rate impedes straightforward data analysis and limits their application. A handful of error correction methods for these error-prone long reads have been developed to date. The output data quality is very important for downstream analysis, whereas computing resources could limit the utility of some computing-intense tools. There is a lack of standardized assessments for these long-read error-correction methods. RESULTS Here, we present a comparative performance assessment of ten state-of-the-art error-correction methods for long reads. We established a common set of benchmarks for performance assessment, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads: de novo assembly and resolving haplotype sequences. CONCLUSIONS Taking into account all of these metrics, we provide a suggestive guideline for method choice based on available data size, computing resources, and individual research goals.
Affiliation(s)
- Shuhua Fu
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA
| | - Anqi Wang
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA
| | - Kin Fai Au
- Department of Internal Medicine, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biostatistics, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA.
| |
|
82
|
Subirana JA, Messeguer X. How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans. Genes (Basel) 2018; 9:genes9100500. [PMID: 30332836 PMCID: PMC6210790 DOI: 10.3390/genes9100500] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/18/2018] [Revised: 10/11/2018] [Accepted: 10/11/2018] [Indexed: 11/16/2022] Open
Abstract
Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.
Affiliation(s)
- Juan A Subirana
- Department of Computer Science, Universitat Politècnica de Catalunya, Jordi Girona 1-3, 08034 Barcelona, Spain.
- Evolutionary Genomics Group, Research Program on Biomedical Informatics (GRIB)-Hospital del Mar Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Dr. Aiguader 86, 08003 Barcelona, Spain.
| | - Xavier Messeguer
- Department of Computer Science, Universitat Politècnica de Catalunya, Jordi Girona 1-3, 08034 Barcelona, Spain.
| |
|
83
|
Salazar AN, Abeel T. Approximate, simultaneous comparison of microbial genome architectures via syntenic anchoring of quiver representations. Bioinformatics 2018; 34:i732-i742. [PMID: 30423098 PMCID: PMC6129293 DOI: 10.1093/bioinformatics/bty614] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022] Open
Abstract
Motivation: A long-standing limitation in comparative genomic studies is the dependency on a reference genome, which hinders the spectrum of genetic diversity that can be identified across a population of organisms. This is especially true in the microbial world where genome architectures can significantly vary. There is therefore a need for computational methods that can simultaneously analyze the architectures of multiple genomes without introducing bias from a reference. Results: In this article, we present Ptolemy: a novel method for studying the diversity of genome architectures (such as structural variation and pan-genomes) across a collection of microbial assemblies without the need for a reference. Ptolemy is a 'top-down' approach to compare whole genome assemblies. Genomes are represented as labeled multi-directed graphs (known as quivers), which are then merged into a single, canonical quiver by identifying 'gene anchors' via synteny analysis. The canonical quiver represents an approximate, structural alignment of all genomes in a given collection, encoding structural variation across (sub-) populations within the collection. We highlight various applications of Ptolemy by analyzing structural variation and the pan-genomes of different datasets composed of Mycobacterium, Saccharomyces, Escherichia and Shigella species. Our results show that Ptolemy is flexible and can handle both conserved and highly dynamic genome architectures. Ptolemy is user-friendly (it requires only a FASTA-formatted assembly along with a corresponding GFF-formatted file) and resource-friendly (it can align 24 genomes in ∼10 mins with four CPUs and <2 GB of RAM). Availability and implementation: Github: https://github.com/AbeelLab/ptolemy. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alex N Salazar
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Thomas Abeel
- Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| |
|
84
|
Patel A, Belykh E, Miller EJ, George LL, Martirosyan NL, Byvaltsev VA, Preul MC. MinION rapid sequencing: Review of potential applications in neurosurgery. Surg Neurol Int 2018; 9:157. [PMID: 30159201 PMCID: PMC6094492 DOI: 10.4103/sni.sni_55_18] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/18/2018] [Accepted: 05/22/2018] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Gene sequencing has played an integral role in the advancement and understanding of disease pathology and treatment. Although historically expensive and time consuming, new sequencing technologies improve our capability to obtain the genetic information in an accurate and timely manner. Within neurosurgery, gene sequencing is routinely used in the diagnosis and treatment of neurosurgical diseases, primarily for brain tumors. This paper reviews nanopore sequencing, an innovation utilized by MinION and outlines its potential use for neurosurgery. METHODS A literature search was conducted for publications containing the keywords of Oxford MinION, nanopore sequencing, brain tumor, glioma, whole genome sequencing (WGS), epigenomics, molecular neuropathology, and next-generation sequencing (NGS). In total, 64 articles were selected and used for this review. RESULTS The Oxford MinION nanopore sequencing technology has had successful applications within clinical microbiology, human genome sequencing, and cancer genotyping across multiple specialties. Technical details, methodology, and current use of MinION sequencing are discussed through the prism of potential applications to solve neurosurgery-related scientific and diagnostic questions. The MinION device has proven to provide rapid and accurate reads with longer read lengths when compared with NGS. For applications within neurosurgery, the MinION device is capable of providing critical diagnostic information for central nervous system (CNS) tumors within a single day. CONCLUSIONS MinION provides rapid and accurate gene sequencing with better affordability and convenience compared with current NGS methods. Widespread success of the MinION nanopore sequencing technology in providing accurate, rapid, and convenient gene sequencing suggests a promising future within research laboratories and to improve care for neurosurgical patients.
Affiliation(s)
- Arpan Patel
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
- College of Medicine-Phoenix, University of Arizona, Phoenix, Arizona, USA
| | - Evgenii Belykh
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
- Department of Neurosurgery, Irkutsk State Medical University, Irkutsk, Russia
| | - Eric J. Miller
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
- College of Medicine-Phoenix, University of Arizona, Phoenix, Arizona, USA
| | - Laeth L. George
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
- College of Medicine-Phoenix, University of Arizona, Phoenix, Arizona, USA
| | - Nikolay L. Martirosyan
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
| | - Vadim A. Byvaltsev
- Department of Neurosurgery, Irkutsk State Medical University, Irkutsk, Russia
| | - Mark C. Preul
- Department of Neurosurgery, Barrow Neurological Institute, St. Joseph's Hospital and Medical Center, Phoenix, Arizona, USA
| |
|
85
|
van Dijk EL, Jaszczyszyn Y, Naquin D, Thermes C. The Third Revolution in Sequencing Technology. Trends Genet 2018; 34:666-681. [PMID: 29941292 DOI: 10.1016/j.tig.2018.05.008] [Citation(s) in RCA: 561] [Impact Index Per Article: 93.5] [Received: 03/23/2018] [Revised: 05/18/2018] [Accepted: 05/29/2018] [Indexed: 12/16/2022]
Abstract
Forty years ago the advent of Sanger sequencing was revolutionary as it allowed complete genome sequences to be deciphered for the first time. A second revolution came when next-generation sequencing (NGS) technologies appeared, which made genome sequencing much cheaper and faster. However, NGS methods have several drawbacks and pitfalls, most notably their short reads. Recently, third-generation/long-read methods appeared, which can produce genome assemblies of unprecedented quality. Moreover, these technologies can directly detect epigenetic modifications on native DNA and allow whole-transcript sequencing without the need for assembly. This marks the third revolution in sequencing technology. Here we review and compare the various long-read methods. We discuss their applications and their respective strengths and weaknesses and provide future perspectives.
Affiliation(s)
- Erwin L van Dijk
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Université Paris-Sud, Université Paris-Saclay, 9198 Gif sur Yvette Cedex, France.
| | - Yan Jaszczyszyn
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Université Paris-Sud, Université Paris-Saclay, 9198 Gif sur Yvette Cedex, France
| | - Delphine Naquin
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Université Paris-Sud, Université Paris-Saclay, 9198 Gif sur Yvette Cedex, France
| | - Claude Thermes
- Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Université Paris-Sud, Université Paris-Saclay, 9198 Gif sur Yvette Cedex, France
| |
|
86
|
Pitta DW, Indugu N, Baker L, Vecchiarelli B, Attwood G. Symposium review: Understanding diet-microbe interactions to enhance productivity of dairy cows. J Dairy Sci 2018; 101:7661-7679. [PMID: 29859694 DOI: 10.3168/jds.2017-13858] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/18/2017] [Accepted: 04/30/2018] [Indexed: 11/19/2022]
Abstract
Ruminants are dependent on the microbiota (bacteria, protozoa, archaea, and fungi) that inhabit the reticulo-rumen for digestion of feedstuffs. Nearly 70% of energy and 50% of protein requirements for dairy cows are met by microbial fermentation in the rumen, emphasizing the need to characterize the role of microbes in feed breakdown and nutrient utilization. Over the past 2 decades, next-generation sequencing technologies have allowed for rapid expansion of knowledge concerning microbial populations and alterations in response to forages, concentrates, supplements, and probiotics in the rumen. Advances in gene sequencing and emerging bioinformatic tools have allowed for increased throughput of data to aid in our understanding of the functional relevance of microbial genomes. In particular, metagenomics can identify specific genes involved in metabolic pathways, and metatranscriptomics can describe the transcriptional activity of microbial genes. These powerful approaches help untangle the complex interactions between microbes and dietary nutrients so that we can more fully understand the physiology of feed digestion in the rumen. Application of genomics-based approaches offers promise in unraveling microbial niches and respective gene repertoires to potentiate fiber and nonfiber carbohydrate digestion, microbial protein synthesis, and healthy biohydrogenation. New information on microbial genomics and interactions with dietary components will more clearly define pathways in the rumen to positively influence milk yield and components.
Affiliation(s)
- Dipti W Pitta
- Department of Clinical Studies, School of Veterinary Medicine, University of Pennsylvania, Kennett Square 19348.
| | - Nagaraju Indugu
- Department of Clinical Studies, School of Veterinary Medicine, University of Pennsylvania, Kennett Square 19348
| | - Linda Baker
- Department of Clinical Studies, School of Veterinary Medicine, University of Pennsylvania, Kennett Square 19348
| | - Bonnie Vecchiarelli
- Department of Clinical Studies, School of Veterinary Medicine, University of Pennsylvania, Kennett Square 19348
| | - Graeme Attwood
- Rumen Microbial Genomics, Ag Research, Palmerston North, New Zealand 11222
| |
|
87
|
Fuselli S, Baptista RP, Panziera A, Magi A, Guglielmi S, Tonin R, Benazzo A, Bauzer LG, Mazzoni CJ, Bertorelle G. A new hybrid approach for MHC genotyping: high-throughput NGS and long read MinION nanopore sequencing, with application to the non-model vertebrate Alpine chamois (Rupicapra rupicapra). Heredity (Edinb) 2018; 121:293-303. [PMID: 29572469 PMCID: PMC6133961 DOI: 10.1038/s41437-018-0070-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/27/2017] [Revised: 01/24/2018] [Accepted: 02/25/2018] [Indexed: 12/13/2022] Open
Abstract
The major histocompatibility complex (MHC) acts as an interface between the immune system and infectious diseases. Accurate characterization and genotyping of the extremely variable MHC loci are challenging especially without a reference sequence. We designed a combination of long-range PCR, Illumina short-reads, and Oxford Nanopore MinION long-reads approaches to capture the genetic variation of the MHC II DRB locus in an Italian population of the Alpine chamois (Rupicapra rupicapra). We utilized long-range PCR to generate a 9 Kb fragment of the DRB locus. Amplicons from six different individuals were fragmented, tagged, and simultaneously sequenced with Illumina MiSeq. One of these amplicons was sequenced with the MinION device, which produced long reads covering the entire amplified fragment. A pipeline that combines short and long reads resolved several short tandem repeats and homopolymers and produced a de novo reference, which was then used to map and genotype the short reads from all individuals. The assembled DRB locus showed a high level of polymorphism and the presence of a recombination breakpoint. Our results suggest that an amplicon-based NGS approach coupled with single-molecule MinION nanopore sequencing can efficiently achieve both the assembly and the genotyping of complex genomic regions in multiple individuals in the absence of a reference sequence.
Affiliation(s)
- S Fuselli
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy.
| | - R P Baptista
- Center for Tropical & Emerging Global Diseases, University of Georgia, 107 Paul D. Coverdell Center, 500 D. W. Brooks Drive, Athens, GA, 30602-7394, USA
| | - A Panziera
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy; Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, Via Edmund Mach 1, San Michele all'Adige, I-38010, Italy
| | - A Magi
- Department of Experimental and Clinical Medicine, University of Florence, Largo Brambilla, Florence, 3-50134, Italy
| | - S Guglielmi
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy
| | - R Tonin
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy; Faculty of Science and Technology, Free University of Bozen-Bolzano, Piazza Università 5, Bolzano, Italy
| | - A Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy
| | - L G Bauzer
- Laboratório de Fisiologia e Controle de Artrópodes Vetores, Instituto Oswaldo Cruz, Fiocruz, Rio de Janeiro, Brazil; Berlin Center for Genomics in Biodiversity Research, Königin-Luise-Str. 6-8, Berlin, 14195, Germany
| | - C J Mazzoni
- Berlin Center for Genomics in Biodiversity Research, Königin-Luise-Str. 6-8, Berlin, 14195, Germany
| | - G Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, Ferrara, 44121, Italy
| |
|
88
|
D'Argenio V. The High-Throughput Analyses Era: Are We Ready for the Data Struggle? High Throughput 2018; 7:E8. [PMID: 29498666 PMCID: PMC5876534 DOI: 10.3390/ht7010008] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 12/28/2017] [Revised: 02/16/2018] [Accepted: 02/27/2018] [Indexed: 12/23/2022] Open
Abstract
Recent and rapid technological advances in molecular sciences have dramatically increased the ability to carry out high-throughput studies characterized by big data production. This, in turn, has had the negative effect of exposing a gap between data yield and data analysis. Indeed, big data management is becoming an increasingly important aspect of many fields of molecular research including the study of human diseases. Now, the challenge is to identify, within the huge amount of data obtained, that which is of clinical relevance. In this context, issues related to data interpretation, sharing and storage need to be assessed and standardized. Once this is achieved, the integration of data from different -omic approaches will improve the diagnosis, monitoring and therapy of diseases by allowing the identification of novel, potentially actionable biomarkers in view of personalized medicine.
Affiliation(s)
- Valeria D'Argenio
- CEINGE-Biotecnologie Avanzate, via G. Salvatore 486, 80145 Naples, Italy.
- Department of Molecular Medicine and Medical Biotechnologies, University of Naples Federico II, via Pansini 5, 80131 Naples, Italy.
| |
|
89
|
Michael TP, Jupe F, Bemm F, Motley ST, Sandoval JP, Lanz C, Loudet O, Weigel D, Ecker JR. High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nat Commun 2018; 9:541. [PMID: 29416032 PMCID: PMC5803254 DOI: 10.1038/s41467-018-03016-2] [Citation(s) in RCA: 167] [Impact Index Per Article: 27.8] [Received: 06/23/2017] [Accepted: 01/11/2018] [Indexed: 12/17/2022] Open
Abstract
The handheld Oxford Nanopore MinION sequencer generates ultra-long reads with minimal cost and time requirements, which makes sequencing genomes at the bench feasible. Here, we sequence the gold standard Arabidopsis thaliana genome (KBS-Mac-74 accession) on the bench with the MinION sequencer, and assemble the genome using typical consumer computing hardware (4 Cores, 16 Gb RAM) into chromosome arms (62 contigs with an N50 length of 12.3 Mb). We validate the contiguity and quality of the assembly with two independent single-molecule technologies, Bionano optical genome maps and Pacific Biosciences Sequel sequencing. The new A. thaliana KBS-Mac-74 genome enables resolution of a quantitative trait locus that had previously been recalcitrant to a Sanger-based BAC sequencing approach. In summary, we demonstrate that even when the purpose is to understand complex structural variation at a single region of the genome, complete genome assembly is becoming the simplest way to achieve this goal.
Affiliation(s)
- Florian Jupe
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
  - Monsanto Company, Creve Coeur, MO, 63141, USA
- Felix Bemm
  - Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
- Justin P Sandoval
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Christa Lanz
  - Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
- Olivier Loudet
  - Institut Jean-Pierre Bourgin, INRA, AgroParisTech, CNRS, Université Paris-Saclay, 78000, Versailles, France
- Detlef Weigel
  - Max Planck Institute for Developmental Biology, 72076, Tübingen, Germany
- Joseph R Ecker
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
  - Howard Hughes Medical Institute, The Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
|
90
|
Eccles D, Chandler J, Camberis M, Henrissat B, Koren S, Le Gros G, Ewbank JJ. De novo assembly of the complex genome of Nippostrongylus brasiliensis using MinION long reads. BMC Biol 2018; 16:6. [PMID: 29325570 PMCID: PMC5765664 DOI: 10.1186/s12915-017-0473-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 09/12/2017] [Accepted: 12/14/2017] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Eukaryotic genome assembly remains a challenge in part due to the prevalence of complex DNA repeats. This is a particularly acute problem for holocentric nematodes because of the large number of satellite DNA sequences found throughout their genomes. These have been recalcitrant to most genome sequencing methods. At the same time, many nematodes are parasites and some represent a serious threat to human health. There is a pressing need for better molecular characterization of animal and plant parasitic nematodes. The advent of long-read DNA sequencing methods offers the promise of resolving complex genomes.
RESULTS: Using Nippostrongylus brasiliensis as a test case, applying improved base-calling algorithms and assembly methods, we demonstrate the feasibility of de novo genome assembly matching current community standards using only MinION long reads. In doing so, we uncovered an unexpected diversity of very long and complex DNA sequences repeated throughout the N. brasiliensis genome, including massive tandem repeats of tRNA genes.
CONCLUSION: Base-calling and assembly methods have improved sufficiently that de novo genome assembly of large complex genomes is possible using only long reads. The method has the added advantage of preserving haplotypic variants and so has the potential to be used in population analyses.
Affiliation(s)
- David Eccles
  - Malaghan Institute of Medical Research, Wellington, New Zealand
- Jodie Chandler
  - Malaghan Institute of Medical Research, Wellington, New Zealand
- Mali Camberis
  - Malaghan Institute of Medical Research, Wellington, New Zealand
- Bernard Henrissat
  - Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - CNRS UMR 7257, Aix-Marseille University, Marseille, France
  - INRA, USC 1408 AFMB, Marseille, France
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Graham Le Gros
  - Malaghan Institute of Medical Research, Wellington, New Zealand
- Jonathan J Ewbank
  - Malaghan Institute of Medical Research, Wellington, New Zealand
  - Centre d'Immunologie de Marseille-Luminy, Aix-Marseille University, CNRS, INSERM, Marseille, France
|
91
|
|