1
|
Hook PW, Timp W. Beyond assembly: the increasing flexibility of single-molecule sequencing technology. Nat Rev Genet 2023; 24:627-641. [PMID: 37161088 PMCID: PMC10169143 DOI: 10.1038/s41576-023-00600-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 03/30/2023] [Indexed: 05/11/2023]
Abstract
The maturation of high-throughput short-read sequencing technology over the past two decades has shaped the way genomes are studied. Recently, single-molecule, long-read sequencing has emerged as an essential tool in deciphering genome structure and function, including filling gaps in the human reference genome, measuring the epigenome and characterizing splicing variants in the transcriptome. With recent technological developments, these single-molecule technologies have moved beyond genome assembly and are being used in a variety of ways, including to selectively sequence specific loci with long reads, measure chromatin state and protein-DNA binding in order to investigate the dynamics of gene regulation, and rapidly determine copy number variation. These increasingly flexible uses of single-molecule technologies highlight a young and fast-moving part of the field that is leading to a more accessible era of nucleic acid sequencing.
Collapse
Affiliation(s)
- Paul W Hook
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Winston Timp
- Department of Biomedical Engineering, Molecular Biology and Genetics, and Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
| |
Collapse
|
2
|
Spealman P, De T, Chuong JN, Gresham D. Best Practices in Microbial Experimental Evolution: Using Reporters and Long-Read Sequencing to Identify Copy Number Variation in Experimental Evolution. J Mol Evol 2023; 91:356-368. [PMID: 37012421 PMCID: PMC10275804 DOI: 10.1007/s00239-023-10102-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2022] [Accepted: 02/21/2023] [Indexed: 04/05/2023]
Abstract
Copy number variants (CNVs), comprising gene amplifications and deletions, are a pervasive class of heritable variation. CNVs play a key role in rapid adaptation in both natural, and experimental, evolution. However, despite the advent of new DNA sequencing technologies, detection and quantification of CNVs in heterogeneous populations has remained challenging. Here, we summarize recent advances in the use of CNV reporters that provide a facile means of quantifying de novo CNVs at a specific locus in the genome, and nanopore sequencing, for resolving the often complex structures of CNVs. We provide guidance for the engineering and analysis of CNV reporters and practical guidelines for single-cell analysis of CNVs using flow cytometry. We summarize recent advances in nanopore sequencing, discuss the utility of this technology, and provide guidance for the bioinformatic analysis of these data to define the molecular structure of CNVs. The combination of reporter systems for tracking and isolating CNV lineages and long-read DNA sequencing for characterizing CNV structures enables unprecedented resolution of the mechanisms by which CNVs are generated and their evolutionary dynamics.
Collapse
Affiliation(s)
- Pieter Spealman
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - Titir De
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - Julie N Chuong
- Department of Biology, New York University, New York, NY, 10003, USA
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA
| | - David Gresham
- Department of Biology, New York University, New York, NY, 10003, USA.
- Center for Genomics and Systems Biology, New York University, New York, NY, 10003, USA.
| |
Collapse
|
3
|
Martin S, Heavens D, Lan Y, Horsfield S, Clark MD, Leggett RM. Nanopore adaptive sampling: a tool for enrichment of low abundance species in metagenomic samples. Genome Biol 2022; 23:11. [PMID: 35067223 PMCID: PMC8785595 DOI: 10.1186/s13059-021-02582-x] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2021] [Accepted: 12/20/2021] [Indexed: 12/13/2022] Open
Abstract
Adaptive sampling is a method of software-controlled enrichment unique to nanopore sequencing platforms. To test its potential for enrichment of rarer species within metagenomic samples, we create a synthetic mock community and construct sequencing libraries with a range of mean read lengths. Enrichment is up to 13.87-fold for the least abundant species in the longest read length library; factoring in reduced yields from rejecting molecules the calculated efficiency raises this to 4.93-fold. Finally, we introduce a mathematical model of enrichment based on molecule length and relative abundance, whose predictions correlate strongly with mock and complex real-world microbial communities.
Collapse
Affiliation(s)
- Samuel Martin
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
| | - Darren Heavens
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
| | - Yuxuan Lan
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
| | | | | | | |
Collapse
|
4
|
Payne A, Holmes N, Clarke T, Munro R, Debebe BJ, Loose M. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat Biotechnol 2021; 39:442-450. [PMID: 33257864 PMCID: PMC7610616 DOI: 10.1038/s41587-020-00746-x] [Citation(s) in RCA: 160] [Impact Index Per Article: 40.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2020] [Revised: 10/21/2020] [Accepted: 10/21/2020] [Indexed: 01/07/2023]
Abstract
Nanopore sequencers can be used to selectively sequence certain DNA molecules in a pool by reversing the voltage across individual nanopores to reject specific sequences, enabling enrichment and depletion to address biological questions. Previously, we achieved this using dynamic time warping to map the signal to a reference genome, but the method required substantial computational resources and did not scale to gigabase-sized references. Here we overcome this limitation by using graphical processing unit (GPU) base-calling. We show enrichment of specific chromosomes from the human genome and of low-abundance organisms in mixed populations without a priori knowledge of sample composition. Finally, we enrich targeted panels comprising 25,600 exons from 10,000 human genes and 717 genes implicated in cancer, identifying PML-RARA fusions in the NB4 cell line in <15 h sequencing. These methods can be used to efficiently screen any target panel of genes without specialized sample preparation using any computer and a suitable GPU. Our toolkit, readfish, is available at https://www.github.com/looselab/readfish .
Collapse
Affiliation(s)
- Alexander Payne
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
| | - Nadine Holmes
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
| | - Thomas Clarke
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
| | - Rory Munro
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
| | - Bisrat J Debebe
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
| | - Matthew Loose
- DeepSeq, School of Life Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK.
| |
Collapse
|
5
|
Sagita R, Quax WJ, Haslinger K. Current State and Future Directions of Genetics and Genomics of Endophytic Fungi for Bioprospecting Efforts. Front Bioeng Biotechnol 2021; 9:649906. [PMID: 33791289 PMCID: PMC8005728 DOI: 10.3389/fbioe.2021.649906] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2021] [Accepted: 02/16/2021] [Indexed: 12/16/2022] Open
Abstract
The bioprospecting of secondary metabolites from endophytic fungi received great attention in the 1990s and 2000s, when the controversy around taxol production from Taxus spp. endophytes was at its height. Since then, hundreds of reports have described the isolation and characterization of putative secondary metabolites from endophytic fungi. However, only very few studies also report the genetic basis for these phenotypic observations. With low sequencing cost and fast sample turnaround, genetics- and genomics-based approaches have risen to become comprehensive approaches to study natural products from a wide-range of organisms, especially to elucidate underlying biosynthetic pathways. However, in the field of fungal endophyte biology, elucidation of biosynthetic pathways is still a major challenge. As a relatively poorly investigated group of microorganisms, even in the light of recent efforts to sequence more fungal genomes, such as the 1000 Fungal Genomes Project at the Joint Genome Institute (JGI), the basis for bioprospecting of enzymes and pathways from endophytic fungi is still rather slim. In this review we want to discuss the current approaches and tools used to associate phenotype and genotype to elucidate biosynthetic pathways of secondary metabolites in endophytic fungi through the lens of bioprospecting. This review will point out the reported successes and shortcomings, and discuss future directions in sampling, and genetics and genomics of endophytic fungi. Identifying responsible biosynthetic genes for the numerous secondary metabolites isolated from endophytic fungi opens the opportunity to explore the genetic potential of producer strains to discover novel secondary metabolites and enhance secondary metabolite production by metabolic engineering resulting in novel and more affordable medicines and food additives.
Collapse
Affiliation(s)
| | | | - Kristina Haslinger
- Groningen Institute of Pharmacy, Chemical and Pharmaceutical Biology, University of Groningen, Groningen, Netherlands
| |
Collapse
|
6
|
Edwards HS, Krishnakumar R, Sinha A, Bird SW, Patel KD, Bartsch MS. Real-Time Selective Sequencing with RUBRIC: Read Until with Basecall and Reference-Informed Criteria. Sci Rep 2019; 9:11475. [PMID: 31391493 PMCID: PMC6685950 DOI: 10.1038/s41598-019-47857-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2019] [Accepted: 07/09/2019] [Indexed: 12/12/2022] Open
Abstract
The Oxford MinION, the first commercial nanopore sequencer, is also the first to implement molecule-by-molecule real-time selective sequencing or “Read Until”. As DNA transits a MinION nanopore, real-time pore current data can be accessed and analyzed to provide active feedback to that pore. Fragments of interest are sequenced by default, while DNA deemed non-informative is rejected by reversing the pore bias to eject the strand, providing a novel means of background depletion and/or target enrichment. In contrast to the previously published pattern-matching Read Until approach, our RUBRIC method is the first example of real-time selective sequencing where on-line basecalling enables alignment against conventional nucleic acid references to provide the basis for sequence/reject decisions. We evaluate RUBRIC performance across a range of optimizable parameters, apply it to mixed human/bacteria and CRISPR/Cas9-cut samples, and present a generalized model for estimating real-time selection performance as a function of sample composition and computing configuration.
Collapse
Affiliation(s)
- Harrison S Edwards
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA.,Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Canada
| | - Raga Krishnakumar
- Systems Biology Dept., Sandia National Laboratories, Livermore, CA, USA
| | - Anupama Sinha
- Systems Biology Dept., Sandia National Laboratories, Livermore, CA, USA
| | - Sara W Bird
- Biotechnology & Bioengineering Dept., Sandia National Laboratories, Livermore, CA, USA.,uBiome, San Francisco, CA, USA
| | - Kamlesh D Patel
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA.,Purdue Partnerships Dept., Sandia National Laboratories, Albuquerque, NM, USA
| | - Michael S Bartsch
- Exploratory Systems Dept., Sandia National Laboratories, Livermore, CA, USA.
| |
Collapse
|