1
Nakanishi H, Takada A, Yoneyama K, Hara M, Sakai K, Saito K. Estimating bloodstain age in the short term based on DNA fragment length using nanopore sequencer. Forensic Sci Int 2024; 358:112010. [PMID: 38581825 DOI: 10.1016/j.forsciint.2024.112010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 03/21/2024] [Accepted: 03/25/2024] [Indexed: 04/08/2024]
Abstract
We used a nanopore sequencer to quantify DNA fragments > 10,000 bp in size and then evaluated their relationship with short-term bloodstain age. Moreover, DNA degradation was investigated after bloodstains were wetted once with water. Bloodstain samples on cotton gauze were stored at room temperature and low humidity for up to 6 months. Bloodstains stored for 1 day were wetted with nuclease-free water, allowed to dry, and stored at room temperature and low humidity for up to 1 week. The proportion of fragments > 20,000 bp in dry bloodstains tended to decrease over time, particularly for fragments > 50,000 bp in size. This trend was modeled using a power approximation curve, with the highest R² value (0.6475) noted for fragments > 50,000 bp in size; lower values were recorded for shorter fragments. The proportion of longer fragments was significantly reduced in bloodstains that were dried after being wetted once, and there was a significant difference in fragments > 50,000 bp between dry and once-wetted conditions. This result suggests that even temporary exposure to water causes significant DNA fragmentation, but not extensive degradation. Thus, bloodstains that appear fresh but have a low proportion of long DNA fragments may have been wetted previously. Our results indicate that evaluating the proportion of long DNA fragments yields information on both bloodstain age and the environment in which they were stored.
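The two quantities this abstract relies on, the proportion of reads above a length threshold and a power-curve fit with its R², can be sketched as follows. This is a minimal illustration and not the authors' pipeline. The function names, the log-log least-squares fitting method, the choice to report R² in log space, and the data points are all assumptions; the data are synthetic and are not measurements from the study.

```python
import math


def fraction_longer_than(read_lengths, threshold_bp):
    """Return the fraction of reads strictly longer than threshold_bp."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    return sum(1 for n in read_lengths if n > threshold_bp) / len(read_lengths)


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on log-log axes.

    Returns (a, b, r2), where r2 is computed on the log-transformed data
    (an assumption; the abstract does not say how R^2 was computed).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic storage times (days) and long-fragment fractions that follow an
# exact power law, used only to show that the fit recovers a and b.
days = [1, 7, 30, 90, 180]
fractions = [0.20 * d ** -0.5 for d in days]
a, b, r2 = fit_power_law(days, fractions)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.4f}")

# Toy read lengths in bp: two of the four reads exceed 50,000 bp.
print(fraction_longer_than([100, 60_000, 25_000, 51_000], 50_000))
```

With real bloodstain data the points scatter around the curve, so R² falls below 1, as the reported value of 0.6475 illustrates.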
Affiliation(s)
- Hiroaki Nakanishi
- Department of Forensic Medicine, Juntendo University School of Medicine, 2-1-1, Hongo, Bunkyo-Ku, Tokyo 113-8421, Japan; Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama 350-0495, Japan.
- Aya Takada
- Department of Forensic Medicine, Juntendo University School of Medicine, 2-1-1, Hongo, Bunkyo-Ku, Tokyo 113-8421, Japan; Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama 350-0495, Japan; Tokyo Medical Examiner's Office, Tokyo Metropolitan Government, 4-21-18, Otsuka, Bunkyo-Ku, Tokyo 112-0012, Japan
- Katsumi Yoneyama
- Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama 350-0495, Japan
- Masaaki Hara
- Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama 350-0495, Japan
- Kentaro Sakai
- Department of Forensic Medicine, Juntendo University School of Medicine, 2-1-1, Hongo, Bunkyo-Ku, Tokyo 113-8421, Japan; Tokyo Medical Examiner's Office, Tokyo Metropolitan Government, 4-21-18, Otsuka, Bunkyo-Ku, Tokyo 112-0012, Japan
- Kazuyuki Saito
- Department of Forensic Medicine, Juntendo University School of Medicine, 2-1-1, Hongo, Bunkyo-Ku, Tokyo 113-8421, Japan; Department of Forensic Medicine, Saitama Medical University, 38 Morohongo, Moroyama, Saitama 350-0495, Japan; Tokyo Medical Examiner's Office, Tokyo Metropolitan Government, 4-21-18, Otsuka, Bunkyo-Ku, Tokyo 112-0012, Japan
2
Chen J, Xu F. Application of Nanopore Sequencing in the Diagnosis and Treatment of Pulmonary Infections. Mol Diagn Ther 2023; 27:685-701. [PMID: 37563539 PMCID: PMC10590290 DOI: 10.1007/s40291-023-00669-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/18/2023] [Indexed: 08/12/2023]
Abstract
This review provides an in-depth discussion of the development, principles and utility of nanopore sequencing technology and its diverse applications in the identification of various pulmonary pathogens. We examined the emergence and advancements of nanopore sequencing as a significant player in this field. We illustrate the challenges faced in diagnosing mixed infections and further scrutinize the use of nanopore sequencing in the identification of single pathogens, including viruses (with a focus on its use in epidemiology, outbreak investigation, and viral resistance), bacteria (emphasizing 16S targeted sequencing, rare bacterial lung infections, and antimicrobial resistance studies), fungi (employing internal transcribed spacer sequencing), tuberculosis, and atypical pathogens. Furthermore, we discuss the role of nanopore sequencing in metagenomics and its potential for unbiased detection of all pathogens in a clinical setting, emphasizing its advantages in sequencing genome repeat areas and structural variant regions. We discuss the limitations in dealing with host DNA removal, the inherent high error rate of nanopore sequencing technology, along with the complexity of operation and processing, while acknowledging the possibilities provided by recent technological improvements. We compared nanopore sequencing with the BioFire system, a rapid molecular diagnostic system based on polymerase chain reaction. Although the BioFire system serves well for the rapid screening of known and common pathogens, it falls short in the identification of unknown or rare pathogens and in providing comprehensive genome analysis. As technological advancements continue, it is anticipated that the role of nanopore sequencing technology in diagnosing and treating lung infections will become increasingly significant.
Affiliation(s)
- Jie Chen
- Department of Infectious Diseases, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310009, Zhejiang, China
- Feng Xu
- Department of Infectious Diseases, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310009, Zhejiang, China.
3
Hwang HY, Wang J. Effect of recombination on genetic diversity of Caenorhabditis elegans. Sci Rep 2023; 13:16425. [PMID: 37777524 PMCID: PMC10542817 DOI: 10.1038/s41598-023-42600-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 09/12/2023] [Indexed: 10/02/2023] Open
Abstract
Greater molecular divergence and genetic diversity are present in regions of high recombination in many species. Studies describing the correlation between variant abundance and recombination rate have long focused on recombination in the context of linked selection models, whereby interference between linked sites under positive or negative selection reduces genetic diversity in regions of low recombination. Here, we show that indels, especially those of intermediate sizes, are enriched relative to single nucleotide polymorphisms in regions of high recombination in C. elegans. To explain this phenomenon, we reintroduce an alternative model that emphasizes the mutagenic effect of recombination. To extend the analysis, we examine the variants with a phylogenetic context and discuss how different models could be examined together. The number of variants generated by recombination in natural populations could be substantial including possibly the majority of some indel subtypes. Our work highlights the potential importance of a mutagenic effect of recombination, which could have a significant role in the shaping of natural genetic diversity.
Affiliation(s)
- Ho-Yon Hwang
- Department of Biochemistry and Molecular Biology, Bloomberg School of Public Health, Department of Neuroscience, School of Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA.
- Jiou Wang
- Department of Biochemistry and Molecular Biology, Bloomberg School of Public Health, Department of Neuroscience, School of Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA.
4
Moya ND, Stevens L, Miller IR, Sokol CE, Galindo JL, Bardas AD, Koh ESH, Rozenich J, Yeo C, Xu M, Andersen EC. Novel and improved Caenorhabditis briggsae gene models generated by community curation. BMC Genomics 2023; 24:486. [PMID: 37626289 PMCID: PMC10463891 DOI: 10.1186/s12864-023-09582-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/19/2023] [Accepted: 08/12/2023] [Indexed: 08/27/2023] Open
Abstract
BACKGROUND The nematode Caenorhabditis briggsae has been used as a model in comparative genomics studies with Caenorhabditis elegans because of their striking morphological and behavioral similarities. However, the potential of C. briggsae for comparative studies is limited by the quality of its genome resources. The genome resources for the C. briggsae laboratory strain AF16 have not been developed to the same extent as C. elegans. The recent publication of a new chromosome-level reference genome for QX1410, a C. briggsae wild strain closely related to AF16, has provided the first step to bridge the gap between C. elegans and C. briggsae genome resources. Currently, the QX1410 gene models consist of software-derived gene predictions that contain numerous errors in their structure and coding sequences. In this study, a team of researchers manually inspected over 21,000 gene models and underlying transcriptomic data to repair software-derived errors. RESULTS We designed a detailed workflow to train a team of nine students to manually curate gene models using RNA read alignments. We manually inspected the gene models, proposed corrections to the coding sequences of over 8,000 genes, and modeled thousands of putative isoforms and untranslated regions. We exploited the conservation of protein sequence length between C. briggsae and C. elegans to quantify the improvement in protein-coding gene model quality and showed that manual curation led to substantial improvements in the protein sequence length accuracy of QX1410 genes. Additionally, collinear alignment analysis between the QX1410 and AF16 genomes revealed over 1,800 genes affected by spurious duplications and inversions in the AF16 genome that are now resolved in the QX1410 genome. CONCLUSIONS Community-based, manual curation using transcriptome data is an effective approach to improve the quality of software-derived protein-coding genes. 
The detailed protocols provided in this work can be useful for future large-scale manual curation projects in other species. Our manual curation efforts have brought the QX1410 gene models to a level of quality comparable to that of the extensively curated AF16 gene models. The improved genome resources for C. briggsae provide reliable tools for the study of Caenorhabditis biology and other related nematodes.
Affiliation(s)
- Nicolas D Moya
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL, 60208, USA
- Lewis Stevens
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Isabella R Miller
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Chloe E Sokol
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Joseph L Galindo
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Alexandra D Bardas
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Edward S H Koh
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Justine Rozenich
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Cassia Yeo
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Maryanne Xu
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA
- Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, 4619 Silverman Hall 2205 Tech Drive, Evanston, IL, 60208, USA.
5
Lee YC, Ke HM, Liu YC, Lee HH, Wang MC, Tseng YC, Kikuchi T, Tsai IJ. Single-worm long-read sequencing reveals genome diversity in free-living nematodes. Nucleic Acids Res 2023; 51:8035-8047. [PMID: 37526286 PMCID: PMC10450198 DOI: 10.1093/nar/gkad647] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/19/2023] [Revised: 07/10/2023] [Accepted: 07/21/2023] [Indexed: 08/02/2023] Open
Abstract
Obtaining sufficient genetic material from a limited biological source is currently the primary operational bottleneck in studies investigating biodiversity and genome evolution. In this study, we employed multiple displacement amplification (MDA) and Smartseq2 to amplify nanograms of genomic DNA and mRNA, respectively, from individual Caenorhabditis elegans. Although reduced genome coverage was observed in repetitive regions, we produced assemblies covering 98% of the reference genome using long-read sequences generated with Oxford Nanopore Technologies (ONT). Annotation with the sequenced transcriptome coupled with the available assembly revealed that gene predictions were more accurate, complete and contained far fewer false positives than de novo transcriptome assembly approaches. We sampled and sequenced the genomes and transcriptomes of 13 nematodes from early-branching species in Chromadoria, Dorylaimia and Enoplia. The basal Chromadoria and Enoplia species had larger genome sizes, ranging from 136.6 to 738.8 Mb, compared with those in the other clades. Nine mitogenomes were fully assembled, and displayed a complete lack of synteny to other species. Phylogenomic analyses based on the new annotations revealed strong support for Enoplia as sister to the rest of Nematoda. Our result demonstrates the robustness of MDA in combination with ONT, paving the way for the study of genome diversity in the phylum Nematoda and beyond.
Affiliation(s)
- Yi-Chien Lee
- Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan
- Biodiversity Program, Taiwan International Graduate Program, Academia Sinica and National Taiwan Normal University, Taipei, Taiwan
- Department of Life Science, National Taiwan Normal University, 116 Wenshan, Taipei, Taiwan
- Huei-Mien Ke
- Department of Microbiology, Soochow University, Taipei, Taiwan
- Yu-Ching Liu
- Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan
- Hsin-Han Lee
- Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan
- Min-Chen Wang
- Marine Research Station (MRS), Institute of Cellular and Organismic Biology, Academia Sinica, 262 I-Lan County, Taiwan
- Yung-Che Tseng
- Marine Research Station (MRS), Institute of Cellular and Organismic Biology, Academia Sinica, 262 I-Lan County, Taiwan
- Taisei Kikuchi
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Isheng Jason Tsai
- Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan
- Biodiversity Program, Taiwan International Graduate Program, Academia Sinica and National Taiwan Normal University, Taipei, Taiwan
6
Braley LE, Jewell JB, Figueroa J, Humann JL, Main D, Mora-Romero GA, Moroz N, Woodhall JW, White RA, Tanaka K. Nanopore Sequencing with GraphMap for Comprehensive Pathogen Detection in Potato Field Soil. Plant Disease 2023; 107:2288-2295. [PMID: 36724099 DOI: 10.1094/pdis-01-23-0052-sr] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Early detection of causal pathogens is important to prevent crop loss from diseases. However, some diseases, such as soilborne diseases, are difficult to diagnose due to the absence of visible or characteristic symptoms. In the present study, the use of the Oxford Nanopore MinION sequencer as a molecular diagnostic tool was assessed due to its long-read sequencing capabilities and portability. Nucleotide samples (DNA or RNA) from potato field soils were sequenced and analyzed using a locally curated pathogen database, followed by identification via sequence mapping. We performed computational speed tests of three commonly used mapping/annotation tools (BLAST, BWA-BLAST, and BWA-GraphMap) and found BWA-GraphMap to be the fastest tool for local searching against our curated pathogen database. The data collected demonstrate the high potential of Nanopore sequencing as a minimally biased diagnostic tool for comprehensive pathogen detection in soil from potato fields. Our GraphMap-based MinION sequencing method could be useful as a predictive approach for disease management by identifying pathogens present in field soil prior to planting. Although this method still needs further experimentation with a larger sample size for practical use, the data analysis pipeline presented can be applied to other cropping systems and diagnostics for detecting multiple pathogens.
Affiliation(s)
- Lauren E Braley
- Department of Plant Pathology, Washington State University, Pullman, WA 99164-6430, U.S.A
- Jeremy B Jewell
- Department of Plant Pathology, Washington State University, Pullman, WA 99164-6430, U.S.A
- Jose Figueroa
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, U.S.A
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Kannapolis, NC 28081, U.S.A
- Jodi L Humann
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, U.S.A
- Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, U.S.A
- Guadalupe A Mora-Romero
- Unidad de Investigación en Ambiente y Salud, Universidad Autónoma de Occidente, Los Mochis, Sinaloa 81223, México
- Natalia Moroz
- Department of Plant Pathology, Washington State University, Pullman, WA 99164-6430, U.S.A
- James W Woodhall
- Parma Research and Extension Center, University of Idaho, Parma, ID 83660-6699, U.S.A
- Richard Allen White
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, NC 28223, U.S.A
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Kannapolis, NC 28081, U.S.A
- Kiwamu Tanaka
- Department of Plant Pathology, Washington State University, Pullman, WA 99164-6430, U.S.A
7
Fang S, Yin B, Xie W, He S, Liang L, Tang P, Tian R, Weng T, Yuan J, Wang D. Low-noise and high-speed trans-impedance amplifier for nanopore sensor. The Review of Scientific Instruments 2023; 94:074704. [PMID: 37439626 DOI: 10.1063/5.0155192] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Accepted: 06/23/2023] [Indexed: 07/14/2023]
Abstract
The small-current detection circuit is the core component for accurate detection in a nanopore sensor. In this paper, a compact, low-noise, and high-speed trans-impedance amplifier is built for the nanopore detection system. The amplifier consists of two amplification stages. The first stage performs low-noise trans-impedance amplification using the ADA4530-1, a high-performance FET operational amplifier, and a high-value feedback resistor of 1 GΩ. The high-pass shelf filter in the second stage recovers the frequency content above the 3 dB cutoff of the first stage, extending the maximum bandwidth up to 50 kHz. The amplifier shows low noise, below 2 pA rms, when tuned to a bandwidth of around 5 kHz. It also guarantees a stable frequency response in the nanopore sensor.
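For orientation, the first-stage figures quoted in this abstract can be related through two textbook formulas: the ideal trans-impedance gain, V_out = -I_in × R_f, and the thermal (Johnson) current noise of the feedback resistor, sqrt(4·k_B·T·B / R_f). The sketch below is an idealized estimate and not the authors' noise analysis. The function names, the 300 K temperature, the 100 pA example current, and the brick-wall bandwidth are assumptions, and op-amp noise and input capacitance are ignored.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def tia_output_voltage(current_a, r_feedback_ohm):
    """Ideal inverting trans-impedance stage: V_out = -I_in * R_f."""
    return -current_a * r_feedback_ohm


def resistor_current_noise_rms(r_ohm, bandwidth_hz, temp_k=300.0):
    """RMS thermal current noise of a resistor over a brick-wall bandwidth."""
    return math.sqrt(4.0 * K_B * temp_k * bandwidth_hz / r_ohm)


R_F = 1e9           # 1 GOhm feedback resistor, from the abstract
BANDWIDTH = 5e3     # 5 kHz tuned bandwidth, from the abstract

# A hypothetical 100 pA pore current maps to about -0.1 V at the output.
v_out = tia_output_voltage(100e-12, R_F)

# Thermal noise floor of the feedback resistor alone (temperature assumed).
noise_rms = resistor_current_noise_rms(R_F, BANDWIDTH)

print(f"V_out = {v_out:.3f} V")
print(f"resistor noise = {noise_rms * 1e12:.3f} pA rms")
```

Under these assumptions the resistor alone contributes roughly 0.29 pA rms, which is only a lower bound; it is consistent with, but does not explain, the reported total noise below 2 pA rms.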
Affiliation(s)
- Shaoxi Fang
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Bohua Yin
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Changchun University of Science and Technology, Jilin Province, Changchun 130022, People's Republic of China
- Wanyi Xie
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Shixuan He
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Liyuan Liang
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Peng Tang
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Rong Tian
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Ting Weng
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Jiahu Yuan
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Deqiang Wang
- Chongqing Key Laboratory of Multi-scale Manufacturing Technology, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, People's Republic of China
- Changchun University of Science and Technology, Jilin Province, Changchun 130022, People's Republic of China
8
Ohta T, Shiwa Y. Hybrid Genome Assembly of Short and Long Reads in Galaxy. Methods Mol Biol 2023; 2632:15-30. [PMID: 36781718 DOI: 10.1007/978-1-0716-2996-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Galaxy is a web browser-based data analysis platform that is widely used in biology. Public Galaxy instances allow the analysis of data and interpretation of results without requiring software installation. NanoGalaxy is a public Galaxy instance with tools and workflows for nanopore data analysis. This chapter describes the steps involved in performing genome assembly using short and long reads in NanoGalaxy.
Affiliation(s)
- Tazro Ohta
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Mishima, Shizuoka, Japan
- Yuh Shiwa
- Laboratory of Bioinformatics, Department of Molecular Microbiology, Faculty of Life Sciences, Tokyo University of Agriculture, Setagaya, Tokyo, Japan.
9
GALA: a computational framework for de novo chromosome-by-chromosome assembly with long reads. Nat Commun 2023; 14:204. [PMID: 36639368 PMCID: PMC9839709 DOI: 10.1038/s41467-022-35670-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 12/16/2022] [Indexed: 01/15/2023] Open
Abstract
High-quality genome assembly has wide applications in genetics and medical studies. However, it is still very challenging to achieve gap-free chromosome-scale assemblies using current workflows for long-read platforms. Here we report on GALA (Gap-free long-read Assembly tool), a computational framework for chromosome-based sequencing data separation and de novo assembly implemented through a multi-layer graph that identifies discordances within preliminary assemblies and partitions the data into chromosome-scale scaffolding groups. The subsequent independent assembly of each scaffolding group generates a gap-free assembly likely free from the mis-assembly errors which usually hamper existing workflows. This flexible framework also allows us to integrate data from various technologies, such as Hi-C, genetic maps, and even motif analyses to generate gap-free chromosome-scale assemblies. As a proof of principle we de novo assemble the C. elegans genome using combined PacBio and Nanopore sequencing data and a rice cultivar genome using Nanopore sequencing data from publicly available datasets. We also demonstrate the proposed method's applicability with a gap-free assembly of the human genome using PacBio high-fidelity (HiFi) long reads. Thus, our method enables straightforward assembly of genomes with multiple data sources and overcomes barriers that at present restrict the application of de novo genome assembly technology.
10
Ding Q, Ren X, Li R, Chan L, Ho VWS, Bi Y, Xie D, Zhao Z. Highly efficient transgenesis with miniMos in Caenorhabditis briggsae. G3 (Bethesda, Md.) 2022; 12:jkac254. [PMID: 36171682 PMCID: PMC9713419 DOI: 10.1093/g3journal/jkac254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 08/29/2022] [Indexed: 06/13/2023]
Abstract
Caenorhabditis briggsae, as a companion species for Caenorhabditis elegans, has played an increasingly important role in the study of the evolution of development, genome, and gene regulation. Aided by the isolation of its sister species, it has recently been established as a model for speciation study. To take full advantage of the species for comparative study, an effective transgenesis method, especially one with single-copy insertion, is important for functional comparison. Here, we improved a transposon-based transgenesis methodology that had been originally developed in C. elegans but worked marginally in C. briggsae. By incorporation of a heat shock step, the transgenesis efficiency in C. briggsae with a single-copy insertion is comparable to that in C. elegans. We used the method to generate 54 independent insertions mostly consisting of an mCherry tag over the C. briggsae genome. We demonstrated the use of the tags in identifying interacting loci responsible for hybrid male sterility between C. briggsae and Caenorhabditis nigoni when combined with the GFP tags we generated previously. Finally, we demonstrated that C. briggsae tolerates the C. elegans toxin, PEEL-1, but not SUP-35, making the latter a potential negative selection marker against extrachromosomal array.
Affiliation(s)
- Qiutao Ding
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Xiaoliang Ren
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Runsheng Li
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong, China
- Luyan Chan
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Vincy W S Ho
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Yu Bi
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Dongying Xie
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- Zhongying Zhao
- Department of Biology, Hong Kong Baptist University, Hong Kong, China
- State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong SAR, China
11
ONT-Based Alternative Assemblies Impact on the Annotations of Unique versus Repetitive Features in the Genome of a Romanian Strain of Drosophila melanogaster. Int J Mol Sci 2022; 23:ijms232314892. [PMID: 36499217 PMCID: PMC9741293 DOI: 10.3390/ijms232314892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/21/2022] [Accepted: 11/24/2022] [Indexed: 11/29/2022] Open
Abstract
To date, different strategies of whole-genome sequencing (WGS) have been developed in order to understand the genome structure and functions. However, the analysis of genomic sequences obtained from natural populations is challenging and the biological interpretation of sequencing data remains the main issue. The MinION device developed by Oxford Nanopore Technologies (ONT) is able to generate long reads with minimal costs and time requirements. These valuable assets qualify it as a suitable method for performing WGS, especially in small laboratories. The long reads produced by this sequencing approach can cover large structural variants and repetitive sequences commonly present in the genomes of eukaryotes. Using MinION, we performed two WGS assessments of a Romanian local strain of Drosophila melanogaster, referred to as Horezu_LaPeri (Horezu). In total, 1,317,857 reads with a size of 8.9 gigabytes (Gb) were generated. Canu and Flye de novo assembly tools were employed to obtain four distinct assemblies with both unfiltered and filtered reads, achieving maximum reference genome coverages of 94.8% (Canu) and 91.4% (Flye). In order to test the quality of these assemblies, we performed a two-step evaluation. Firstly, we considered the BUSCO scores and searched for a supplemental set of genes using BLAST. Subsequently, we appraised the total content of natural transposons (NTs) relative to the reference genome (ISO1 strain) and mapped the mdg1 retroelement as a resolution assayer. Our results reveal that filtered data provide only slightly enhanced results for gene identification, but the use of unfiltered data had a consistent positive impact on the global evaluation of the NTs content. Our comparative studies also revealed differences between Flye and Canu assemblies regarding the annotation of unique versus repetitive genomic features.
In our hands, Flye proved to be moderately better for gene identification, while Canu clearly outperformed Flye for NTs analysis. Data concerning the NTs content were compared to those obtained with ONT for the D. melanogaster ISO1 strain, revealing that our strategy led to better results. Additionally, the parameters of our ONT reads and assemblies are similar to those reported for ONT experiments performed on various model organisms, revealing that our assembly data are appropriate for a proficient annotation of the Horezu genome.
12
Liew YJM, Chua KO, Yong HS, Song SL, Chan KG. Complete chloroplast genome of Boesenbergia rotunda and a comparative analysis with members of the family Zingiberaceae. REVISTA BRASILEIRA DE BOTANICA : BRAZILIAN JOURNAL OF BOTANY 2022; 45:1209-1222. [PMID: 36320930 PMCID: PMC9607705 DOI: 10.1007/s40415-022-00845-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 09/15/2022] [Accepted: 10/16/2022] [Indexed: 06/16/2023]
Abstract
Boesenbergia rotunda (L.) Mansf. is a medically important ginger species of the family Zingiberaceae, but genomic information for its molecular phylogeny and identification is scarce. In this work, the chloroplast genome of B. rotunda was sequenced, characterized and compared to those of other Zingiberaceae species to provide chloroplast genetic resources and to determine its phylogenetic position in the family. The chloroplast genome of B. rotunda was 163,817 bp in length and consisted of a large single-copy (LSC) region of 88,302 bp, a small single-copy (SSC) region of 16,023 bp and a pair of inverted repeats (IRA and IRB) of 29,746 bp each. The chloroplast genome contained 113 unique genes, including 79 protein-coding genes, 30 transfer RNA (tRNA) genes and four ribosomal RNA (rRNA) genes. Several genes had atypical start codons, while most amino acids exhibited biased usage of synonymous codons. Comparative analyses with various chloroplast genomes of Zingiberaceae taxa revealed several highly variable regions (psbK-psbI, trnT-GGU-psbD, rbcL-accD, ndhF-rpl32, and ycf1) in the LSC and SSC regions of the chloroplast genome of B. rotunda that could be utilized as molecular markers for DNA barcoding and species delimitation. Phylogenetic analyses based on shared protein-coding genes revealed that B. rotunda formed a distinct lineage with B. kingii Mood & L.M.Prince, in a subclade that also contained the genera Kaempferia and Zingiber. These findings constitute the first chloroplast genome information for B. rotunda and could serve as a reference for phylogenetic analysis and identification of the genus Boesenbergia within the Zingiberaceae family. Supplementary information: The online version contains supplementary material available at 10.1007/s40415-022-00845-w.
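The region sizes quoted in this abstract can be checked against the reported total with a few lines of arithmetic. The sketch below assumes the usual quadripartite layout (one LSC, one SSC, two inverted-repeat copies); the function name is hypothetical and only the four numbers come from the abstract.

```python
def quadripartite_length(lsc: int, ssc: int, inverted_repeat: int) -> int:
    """Total length of a plastome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * inverted_repeat


# Region sizes (bp) as reported in the abstract above.
LSC_BP = 88_302
SSC_BP = 16_023
IR_BP = 29_746
REPORTED_TOTAL_BP = 163_817

total = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
print(total, total == REPORTED_TOTAL_BP)
```

The computed sum matches the reported genome length, so the four figures in the abstract are internally consistent.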
Affiliation(s)
- Yvonne Jing Mei Liew
  - University of Malaya Centre for Proteomics Research, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
  - Deputy Vice Chancellor’s Office (Research and Innovation), Universiti Malaya, 50603 Kuala Lumpur, Malaysia
- Kah-Ooi Chua
  - Centre for Research in Biotechnology for Agriculture, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
- Hoi-Sen Yong
  - Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
- Sze-Looi Song
  - Institute for Advanced Studies, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
- Kok-Gan Chan
  - Institute of Biological Sciences, Faculty of Science, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
  - International Genome Centre, Jiangsu University, Zhenjiang, China
  - Institute of Marine Sciences, Shantou University, Shantou, 515063 China
13
Ahmed YW, Alemu BA, Bekele SA, Gizaw ST, Zerihun MF, Wabalo EK, Teklemariam MD, Mihrete TK, Hanurry EY, Amogne TG, Gebrehiwot AD, Berga TN, Haile EA, Edo DO, Alemu BD. Epigenetic tumor heterogeneity in the era of single-cell profiling with nanopore sequencing. Clin Epigenetics 2022; 14:107. [PMID: 36030244 PMCID: PMC9419648 DOI: 10.1186/s13148-022-01323-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/27/2022] [Accepted: 08/12/2022] [Indexed: 11/29/2022] Open
Abstract
Nanopore sequencing has carried sequencing technology into its next generation. This has been achieved through research on pore efficiency, mechanisms to control DNA translocation, signal-to-noise ratio enhancement, and extension to long read ranges. Epigenetic heterogeneity can be broad, and alterations in the epigenome raise new challenges in cancer research. Epigenetic enzymes that catalyze DNA methylation and histone modification are dysregulated in cancer cells and cause numerous heterogeneous clones to evolve. Detecting the heterogeneity among these clones plays an indispensable role in the treatment of various cancer types. Combined with single-cell profiling, nanopore sequencing could provide simple long-read sequencing and is expected to be used soon at the bedside or in the doctor's office. Here, we review the advancements of nanopore sequencing and its use in the detection of epigenetic heterogeneity in cancer.
Affiliation(s)
- Yohannis Wondwosen Ahmed
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Berhan Ababaw Alemu
  - Department of Medical Biochemistry, School of Medicine, St. Paul's Hospital, Millennium Medical College, Addis Ababa, Ethiopia
- Sisay Addisu Bekele
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Solomon Tebeje Gizaw
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Muluken Fekadie Zerihun
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Endriyas Kelta Wabalo
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Maria Degef Teklemariam
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Tsehayneh Kelemu Mihrete
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Endris Yibru Hanurry
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Tensae Gebru Amogne
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Assaye Desalegne Gebrehiwot
  - Department of Medical Anatomy, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Tamirat Nida Berga
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Ebsitu Abate Haile
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Dessiet Oma Edo
  - Department of Medical Biochemistry, School of Medicine, College of Health Sciences, Addis Ababa University, P.O. Box: 9086, Addis Ababa, Ethiopia
- Bizuwork Derebew Alemu
  - Department of Statistics, College of Natural and Computational Sciences, Mizan Tepi University, Tepi, Ethiopia
14
El Mouridi S, Alkhaldi F, Frøkjær-Jensen C. Modular safe-harbor transgene insertion for targeted single-copy and extrachromosomal array integration in Caenorhabditis elegans. G3 (BETHESDA, MD.) 2022; 12:jkac184. [PMID: 35900171 PMCID: PMC9434227 DOI: 10.1093/g3journal/jkac184] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/30/2022] [Accepted: 06/29/2022] [Indexed: 12/02/2022]
Abstract
Efficient and reproducible transgenesis facilitates and accelerates research using genetic model organisms. Here, we describe a modular safe-harbor transgene insertion (MosTI) for use in Caenorhabditis elegans which improves targeted insertion of single-copy transgenes by homology-directed repair and targeted integration of extrachromosomal arrays by nonhomologous end-joining. MosTI allows easy conversion between selection markers at the insertion site and provides a collection of universal targeting vectors with commonly used promoters and fluorophores. Insertions are targeted at three permissive safe-harbor intergenic locations and transgenes are reproducibly expressed in somatic and germ cells. Chromosomal integration is mediated by CRISPR/Cas9, and positive selection is based on a set of split markers (unc-119, hygroR, and gfp) where only animals with chromosomal insertions are rescued, resistant to antibiotics, or fluorescent, respectively. Single-copy insertion is efficient using either constitutive or heat-shock inducible Cas9 expression (25-75%) and insertions can be generated from a multiplexed injection mix. Extrachromosomal array integration is also efficient (7-44%) at modular safe-harbor transgene insertion landing sites or at the endogenous unc-119 locus. We use short-read sequencing to estimate the plasmid copy numbers for 8 integrated arrays (6-37 copies) and long-read Nanopore sequencing to determine the structure and size (5.4 Mb) of 1 array. Using universal targeting vectors, standardized insertion strains, and optimized protocols, it is possible to construct complex transgenic strains which should facilitate the study of increasingly complex biological problems in C. elegans.
Affiliation(s)
- Sonia El Mouridi
  - Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Faisal Alkhaldi
  - Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Christian Frøkjær-Jensen
  - Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
15
Maeda Y, Kobayashi R, Watanabe K, Yoshino T, Bowler C, Matsumoto M, Tanaka T. Chromosome-Scale Genome Assembly of the Marine Oleaginous Diatom Fistulifera solaris. MARINE BIOTECHNOLOGY (NEW YORK, N.Y.) 2022; 24:788-800. [PMID: 35915286 DOI: 10.1007/s10126-022-10147-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Accepted: 07/15/2022] [Indexed: 06/15/2023]
Abstract
Microalgae, including diatoms, are of interest for environmentally friendly manufacturing such as the production of biofuels, chemicals, and materials. The highly oil-accumulating marine diatom Fistulifera solaris has been studied as a promising host organism for these applications. Recently reported large-scale genetic engineering based on episomal vectors for diatoms could further enhance the potential of F. solaris, but a better understanding of the mode of action of diatom centromeres is needed to rationally design episomal vectors for stable extrachromosomal maintenance. Our previous genome analysis with pyrosequencing (short-read sequencing) generated fragmented scaffolds that were not useful for predicting centromeres on each chromosome. Here, we report the almost complete chromosomal structure of the genome of F. solaris using the long-read nanopore sequencing platform MinION. From a single run on one MinION flow cell, a chromosome-scale assembly with telomere-to-telomere resolution was achieved for 41 out of 44 chromosomes. Putative centromere regions were predicted for 16 chromosomes, and we discovered putative consensus motifs in the predicted centromeres. Similar motif searches have been performed in model diatoms, but no consensus motif was found. Therefore, this is the first study to successfully estimate consensus motifs in diatom centromeres. The chromosome-scale assembly also suggests the potential existence of multi-copy mini-chromosomes and tandemly repeated lipogenesis genes related to the oleaginous phenotype of F. solaris. The findings of this study are useful for understanding and further engineering the oleaginous phenotype of F. solaris.
Affiliation(s)
- Yoshiaki Maeda
  - Division of Biotechnology and Life Science, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16, Koganei, Tokyo, 184-8588, Japan
  - Faculty of Life and Environmental Sciences, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki, 305-8572, Japan
- Ryosuke Kobayashi
  - Division of Biotechnology and Life Science, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16, Koganei, Tokyo, 184-8588, Japan
- Kahori Watanabe
  - Division of Biotechnology and Life Science, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16, Koganei, Tokyo, 184-8588, Japan
- Tomoko Yoshino
  - Division of Biotechnology and Life Science, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16, Koganei, Tokyo, 184-8588, Japan
- Chris Bowler
  - Institut de Biologie de L'Ecole Normale Supérieure (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 75005, Paris, France
- Mitsufumi Matsumoto
  - Biotechnology Laboratory, Electric Power Development Co., Ltd., 1, Yanagisaki-machi, Wakamatsu-ku, Kitakyusyu, Fukuoka, 808-0111, Japan
- Tsuyoshi Tanaka
  - Division of Biotechnology and Life Science, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16, Koganei, Tokyo, 184-8588, Japan
16
Lee Y, Ha U, Moon S. Ongoing endeavors to detect mobilization of transposable elements. BMB Rep 2022. [PMID: 35725016 PMCID: PMC9340088 DOI: 10.5483/bmbrep.2022.55.7.088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Transposable elements (TEs) are DNA sequences capable of mobilization from one location to another in the genome. Since the discovery of the ‘Dissociation (Ds) locus’ by Barbara McClintock in maize (1), mounting evidence in the era of genomics indicates that a significant fraction of most eukaryotic genomes is composed of TE sequences, which are involved in various biological processes such as development, physiology, disease and evolution. Although technical advances in genomics have uncovered numerous functional impacts of TEs across species, our understanding of TEs is still incomplete owing to challenges resulting from the complexity and abundance of TEs in the genome. In this mini-review, we briefly summarize the biology of TEs and their impacts on the host genome, emphasizing the importance of understanding the TE landscape in the genome. Then, we introduce recent endeavors, especially in vivo retrotransposition assays and long-read sequencing technology, for identifying de novo insertions and TE polymorphisms, which will broaden our knowledge of the extraordinary relationship between genomic cohabitants and their host.
Affiliation(s)
- Yujeong Lee
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
- Una Ha
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
- Sungjin Moon
  - Department of Biological Sciences, Kangwon National University, Chuncheon 24341, Korea
17
Xie H, Li W, Hu Y, Yang C, Lu J, Guo Y, Wen L, Tang F. De novo assembly of human genome at single-cell levels. Nucleic Acids Res 2022; 50:7479-7492. [PMID: 35819189 PMCID: PMC9303314 DOI: 10.1093/nar/gkac586] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/10/2022] [Revised: 05/17/2022] [Accepted: 06/24/2022] [Indexed: 12/12/2022] Open
Abstract
Genome assembly has benefited from long-read sequencing technologies with higher accuracy and higher continuity. However, most human genome assemblies require large amounts of DNA from homogeneous cell lines and do not preserve cell heterogeneity, which could profoundly affect haplotype assembly results. Herein, using single-cell genome long-read sequencing technology (SMOOTH-seq), we have sequenced K562 and HG002 cells on PacBio HiFi and Oxford Nanopore Technologies (ONT) platforms and conducted de novo genome assembly. For the first time, we have completed a human genome assembly with high continuity (with an NG50 of ∼2 Mb using 95 individual K562 cells) at single-cell levels, and explored the impact of different assemblers and sequencing strategies on genome assembly. With sequencing data from 30 diploid individual HG002 cells of relatively high genome coverage (average coverage ∼41.7%) on the ONT platform, the NG50 can reach over 1.3 Mb. Furthermore, with the assembled genome from the K562 single-cell dataset, a more complete and accurate set of insertion events and complex structural variations could be identified. This study opens a new chapter in the practice of single-cell genome de novo assembly.
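The abstract reports continuity as NG50 without defining it. As a minimal sketch, assuming the common definition (the contig length at which contigs taken longest first reach half of a stated genome size), the metric can be computed as follows; the function names and toy contig lengths are illustrative and are not taken from the study.

```python
from typing import Optional, Sequence


def ng50(contig_lengths: Sequence[int], genome_size: int) -> Optional[int]:
    """Contig length at which the running total of contigs, taken longest
    first, first reaches half of genome_size. Returns None if the assembly
    never covers half of the genome size."""
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if 2 * running_total >= genome_size:
            return length
    return None


def n50(contig_lengths: Sequence[int]) -> Optional[int]:
    """N50 is the same statistic with the assembly's own total as the size."""
    return ng50(contig_lengths, sum(contig_lengths))


# Hypothetical contig lengths, in arbitrary units.
toy_contigs = [50, 40, 30, 20, 10]
print(ng50(toy_contigs, genome_size=200))  # 30
print(n50(toy_contigs))                    # 40
```

Because NG50 is normalised by genome size instead of assembly size, an incomplete assembly scores lower on NG50 than on N50, as the toy example shows.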
Affiliation(s)
- Haoling Xie
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Peking University-Tsinghua University-National Institute of Biological Sciences Joint Graduate Program (PTN), School of Life Sciences, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Wen Li
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Yuqiong Hu
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Cheng Yang
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Jiansen Lu
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Yuqing Guo
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Lu Wen
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Fuchou Tang
  - School of Life Sciences, Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
  - Peking University-Tsinghua University-National Institute of Biological Sciences Joint Graduate Program (PTN), School of Life Sciences, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
18
Stevens L, Moya ND, Tanny RE, Gibson SB, Tracey A, Na H, Chitrakar R, Dekker J, Walhout AJ, Baugh LR, Andersen EC. Chromosome-level reference genomes for two strains of Caenorhabditis briggsae: an improved platform for comparative genomics. Genome Biol Evol 2022; 14:6554914. [PMID: 35348662 PMCID: PMC9011032 DOI: 10.1093/gbe/evac042] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Accepted: 03/21/2022] [Indexed: 11/13/2022] Open
Abstract
The publication of the Caenorhabditis briggsae reference genome in 2003 enabled the first comparative genomics studies between C. elegans and C. briggsae, shedding light on the evolution of genome content and structure in the Caenorhabditis genus. However, despite being widely used, the currently available C. briggsae reference genome is substantially less complete and structurally accurate than the C. elegans reference genome. Here, we used high-coverage Oxford Nanopore long-read and chromosome conformation capture data to generate chromosome-level reference genomes for two C. briggsae strains: QX1410, a new reference strain closely related to the laboratory AF16 strain, and VX34, a highly divergent strain isolated in China. We also sequenced 99 recombinant inbred lines (RILs) generated from reciprocal crosses between QX1410 and VX34 to create a recombination map and identify chromosomal domains. Additionally, we used both short- and long-read RNA sequencing (RNA-seq) data to generate high-quality gene annotations. By comparing these new reference genomes to the current reference, we reveal that hyper-divergent haplotypes cover large portions of the C. briggsae genome, similar to recent reports in C. elegans and C. tropicalis. We also show that the genomes of selfing Caenorhabditis species have undergone more rearrangement than their outcrossing relatives, which has biased previous estimates of rearrangement rate in Caenorhabditis. These new genomes provide a substantially improved platform for comparative genomics in Caenorhabditis and narrow the gap between the quality of genomic resources available for C. elegans and C. briggsae.
Affiliation(s)
- Lewis Stevens
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Nicolas D. Moya
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
  - Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL 60208, USA
- Robyn E. Tanny
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Sophia B. Gibson
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Alan Tracey
  - Tree of Life, Wellcome Sanger Institute, Cambridge, UK
- Huimin Na
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Job Dekker
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Albertha J.M. Walhout
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- L. Ryan Baugh
  - Department of Biology, Duke University, Durham, NC, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Erik C. Andersen
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
19
Ding Q, Li R, Ren X, Chan LY, Ho VWS, Xie D, Ye P, Zhao Z. Genomic architecture of 5S rDNA cluster and its variations within and between species. BMC Genomics 2022; 23:238. [PMID: 35346033 PMCID: PMC8961926 DOI: 10.1186/s12864-022-08476-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Accepted: 03/16/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Ribosomal DNAs (rDNAs) are arranged in purely tandem repeats, preventing them from being reliably assembled onto chromosomes during generation of a genome assembly. The uncertainty of rDNA genomic structure presents a significant barrier to studying their function and evolution. Results: Here we generate ultra-long Oxford Nanopore Technologies (ONT) and short NGS reads to delineate the architecture and variation of the 5S rDNA cluster in different strains of C. elegans and C. briggsae. We classify the individual rDNA repeating units into 25 types based on the unique sequence variations in each unit of C. elegans (N2). We next assemble the cluster by taking advantage of the long reads that carry these units, which led to an assembly of the 5S rDNA cluster consisting of up to 167 consecutive 5S rDNA units in the N2 strain. The ordering and copy number of various rDNA units are consistent with the separation time between strains. Surprisingly, we observed a drastically reduced level of variation in the unit composition of the 5S rDNA cluster in the C. elegans CB4856 and C. briggsae AF16 strains relative to the C. elegans N2 strain, suggesting that N2, a widely used reference strain, is likely to be defective in maintaining 5S rDNA cluster stability compared with other wild isolates of C. elegans or C. briggsae. Conclusions: The results demonstrate that Nanopore DNA sequencing reads are capable of generating assemblies of highly repetitive sequences, and that rDNA units are highly dynamic both within and between population(s) of the same species in terms of sequence and copy number. The detailed structure and variation of the 5S rDNA units within the rDNA cluster pave the way for functional and evolutionary studies.
Affiliation(s)
- Qiutao Ding
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Runsheng Li
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
  - Department of Infectious Diseases and Public Health, City University of Hong Kong, Hong Kong SAR, China
- Xiaoliang Ren
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Lu-Yan Chan
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Vincy W S Ho
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Dongying Xie
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Pohao Ye
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
- Zhongying Zhao
  - Department of Biology, Hong Kong Baptist University, Hong Kong SAR, China
  - State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Hong Kong SAR, China
20
Abstract
The nematode Caenorhabditis elegans has shed light on many aspects of eukaryotic biology, including genetics, development, cell biology, and genomics. A major factor in the success of C. elegans as a model organism has been the availability, since the late 1990s, of an essentially gap-free and well-annotated nuclear genome sequence, divided among 6 chromosomes. In this review, we discuss the structure, function, and biology of C. elegans chromosomes and then provide a general perspective on chromosome biology in other diverse nematode species. We highlight malleable chromosome features including centromeres, telomeres, and repetitive elements, as well as the remarkable process of programmed DNA elimination (historically described as chromatin diminution) that induces loss of portions of the genome in somatic cells of a handful of nematode species. An exciting future prospect is that nematode species may enable experimental approaches to study chromosome features and to test models of chromosome evolution. In the long term, fundamental insights regarding how speciation is integrated with chromosome biology may be revealed.
Affiliation(s)
- Peter M Carlton
  - Graduate School of Biostudies, Kyoto University, Kyoto 606-8501, Japan
- Richard E Davis
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Shawn Ahmed
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
21
Chernyavskaya Y, Zhang X, Liu J, Blackburn J. Long-read sequencing of the zebrafish genome reorganizes genomic architecture. BMC Genomics 2022; 23:116. [PMID: 35144548 PMCID: PMC8832730 DOI: 10.1186/s12864-022-08349-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/30/2021] [Accepted: 01/28/2022] [Indexed: 12/31/2022] Open
Abstract
Background: Nanopore sequencing technology has revolutionized the field of genome biology with its ability to generate extra-long reads that can resolve regions of the genome that were previously inaccessible to short-read sequencing platforms. Over 50% of the zebrafish genome consists of difficult to map, highly repetitive, low complexity elements that pose inherent problems for short-read sequencers and assemblers. Results: We used long-read nanopore sequencing to generate a de novo assembly of the zebrafish genome and compared our assembly to the current reference genome, GRCz11. The new assembly identified 1697 novel insertions and deletions over one kilobase in length and placed 106 previously unlocalized scaffolds. We also discovered additional sites of retrotransposon integration previously unreported in GRCz11 and observed the expression of these transposable elements in adult zebrafish under physiologic conditions, implying they have active mobility in the zebrafish genome and contribute to the ever-changing genomic landscape. Conclusions: We used nanopore sequencing to improve upon and resolve the issues plaguing the current zebrafish reference assembly, GRCz11. Zebrafish is a prominent model of human disease, and our corrected assembly will be useful for studies relying on interspecies comparisons and precise linkage of genetic events to disease phenotypes. Supplementary information: The online version contains supplementary material available at 10.1186/s12864-022-08349-3.
Affiliation(s)
- Yelena Chernyavskaya
  - Department of Cellular & Molecular Biochemistry, University of Kentucky, Lexington, KY, 40536, USA
  - Markey Cancer Center at the University of Kentucky, Lexington, KY, 40536, USA
- Xiaofei Zhang
  - Markey Cancer Center at the University of Kentucky, Lexington, KY, 40536, USA
  - Department of Computer Science, University of Kentucky, Lexington, KY, 40536, USA
- Jinze Liu
  - Department of Biostatistics, Virginia Commonwealth University, Richmond, USA
- Jessica Blackburn
  - Department of Cellular & Molecular Biochemistry, University of Kentucky, Lexington, KY, 40536, USA
  - Markey Cancer Center at the University of Kentucky, Lexington, KY, 40536, USA
22
Li IC, Yu GY, Huang JF, Chen ZW, Chou CH. Comparison of Reference-Based Assembly and De Novo Assembly for Bacterial Plasmid Reconstruction and AMR Gene Localization in Salmonella enterica Serovar Schwarzengrund Isolates. Microorganisms 2022; 10:microorganisms10020227. [PMID: 35208682 PMCID: PMC8874696 DOI: 10.3390/microorganisms10020227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 01/18/2022] [Accepted: 01/18/2022] [Indexed: 12/02/2022] Open
Abstract
It is well established that plasmids carrying multiple antimicrobial resistance (AMR) genes can be easily transferred among bacterial isolates by horizontal gene transfer. Previous studies have shown that a combination of short- and long-read approaches is effective in reconstructing accurate plasmids. However, high-quality Illumina short reads mapped onto the long reads in the context of an AMR hybrid monitoring strategy have not yet been explored. Hence, this study aimed to improve the reconstruction of plasmids, including the localization of AMR genes, using the above-described parameters on whole-genome sequencing (WGS) results. To the best of our knowledge, this study is the first to use S1 nuclease pulsed-field gel electrophoresis (S1-PFGE) to confirm the number and sizes of plasmids detected by in silico-based predictions in Salmonella strains. Our results showed that de novo assembly did not detect the number of bacterial plasmids more accurately than reference-based assembly did. As this new hybrid mapping strategy surpassed de novo assembly in bacterial reconstruction, it was further used to identify the presence and genomic location of AMR genes among three Salmonella enterica serovar Schwarzengrund isolates. The AMR genes identified in the bacterial chromosome among the three Salmonella enterica serovar Schwarzengrund isolates included AAC(3)-IV, AAC(6′)-Iy, aadA2, APH(4)-Ia, cmlA1, golS, mdsA, mdsB, mdsC, mdtK, qacH, sdiA, sul2, sul3, and TEM-1. Moreover, the presence of the TEM-1, AAC(3)-IV, aadA2, APH(4)-Ia, cmlA1, dfrA12, floR, sul1, sul3, and tet(A) genes within three IncFIB plasmids and one IncX1 plasmid highlights their possible transmission into the environment, which is a public health risk. In conclusion, the data generated using this new hybrid mapping strategy will contribute to the improvement of AMR monitoring and support the risk assessment of AMR dissemination.
Affiliation(s)
- I-Chen Li
  - Zoonoses Research Center and School of Veterinary Medicine, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei City 106, Taiwan
- Gine-Ye Yu
  - Animal Technology Research Center, Agricultural Technology Research Institute, No. 52, Kedong 2nd Rd., Zhunan Township, Miaoli County 350, Taiwan; (G.-Y.Y.); (J.-F.H.)
- Jing-Fang Huang
  - Animal Technology Research Center, Agricultural Technology Research Institute, No. 52, Kedong 2nd Rd., Zhunan Township, Miaoli County 350, Taiwan; (G.-Y.Y.); (J.-F.H.)
- Zeng-Weng Chen
  - Animal Technology Research Center, Agricultural Technology Research Institute, No. 52, Kedong 2nd Rd., Zhunan Township, Miaoli County 350, Taiwan; (G.-Y.Y.); (J.-F.H.)
  - Correspondence: (Z.-W.C.); (C.-H.C.); Tel.: +886-37-585-851 (Z.-W.C.); +886-2-3366-3861 (C.-H.C.); Fax: +886-2-2364-9154 (C.-H.C.)
- Chung-Hsi Chou
  - Zoonoses Research Center and School of Veterinary Medicine, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei City 106, Taiwan
  - Correspondence: (Z.-W.C.); (C.-H.C.); Tel.: +886-37-585-851 (Z.-W.C.); +886-2-3366-3861 (C.-H.C.); Fax: +886-2-2364-9154 (C.-H.C.)
|
23
|
Boysen G, Nookaew I. Current and Future Methodology for Quantitation and Site-Specific Mapping the Location of DNA Adducts. TOXICS 2022; 10:toxics10020045. [PMID: 35202232 PMCID: PMC8876591 DOI: 10.3390/toxics10020045] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 01/12/2022] [Accepted: 01/15/2022] [Indexed: 02/01/2023]
Abstract
Formation of DNA adducts is a key event for a genotoxic mode of action, and their presence is often used as a surrogate for mutation and increased cancer risk. Interest in DNA adducts is twofold: first, to demonstrate exposure, and second, to link DNA adduct location to subsequent mutations or altered gene regulation. Methods have been established to quantitate DNA adducts with high chemical specificity and to visualize the location of DNA adducts, and elegant bio-analytical methods have been devised utilizing enzymes, various chemistries, and molecular biology methods. Traditionally, these highly specific methods cannot be combined, and the results are incomparable. Initially developed for single-molecule DNA sequencing, nanopore-type technologies are expected to enable simultaneous quantitation and location of DNA adducts across the genome. Herein, we briefly summarize the current methodologies for state-of-the-art quantitation of DNA adduct levels and mapping of DNA adducts and describe novel single-molecule DNA sequencing technologies to achieve both measures. Emerging technologies are expected to soon provide a comprehensive picture of the exposome and identify gene regions susceptible to DNA adduct formation.
Affiliation(s)
- Gunnar Boysen
  - Department Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
  - The Winthrop P. Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
  - Correspondence:
- Intawat Nookaew
  - The Winthrop P. Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
  - Department Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
|
24
|
Treitli SC, Peña-Diaz P, Hałakuc P, Karnkowska A, Hampl V. High quality genome assembly of the amitochondriate eukaryote Monocercomonoides exilis. Microb Genom 2021; 7. [PMID: 34951395 PMCID: PMC8767320 DOI: 10.1099/mgen.0.000745] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022] Open
Abstract
Monocercomonoides exilis is considered the first known eukaryote to completely lack mitochondria. This conclusion is based primarily on a genomic and transcriptomic study which failed to identify any mitochondrial hallmark proteins. However, the available genome assembly has limited contiguity and around 1.5 % of the genome sequence is represented by unknown bases. To improve the contiguity, we re-sequenced the genome and transcriptome of M. exilis using Oxford Nanopore Technology (ONT). The resulting draft genome is assembled in 101 contigs with an N50 value of 1.38 Mbp, almost 20 times higher than the previously published assembly. Using a newly generated ONT transcriptome, we further improve the gene prediction and add high quality untranslated region (UTR) annotations, in which we identify two putative polyadenylation signals present in the 3′UTR regions and characterise the Kozak sequence in the 5′UTR regions. All these improvements are reflected by higher BUSCO genome completeness values. Regardless of an overall more complete genome assembly without missing bases and a better gene prediction, we still failed to identify any mitochondrial hallmark genes, thus further supporting the hypothesis on the absence of mitochondrion.
Affiliation(s)
- Sebastian Cristian Treitli
  - Department of Parasitology, Faculty of Science, Charles University, BIOCEV, Průmyslová 595, 252 42 Vestec, Czech Republic
- Priscila Peña-Diaz
  - Department of Parasitology, Faculty of Science, Charles University, BIOCEV, Průmyslová 595, 252 42 Vestec, Czech Republic
- Paweł Hałakuc
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Anna Karnkowska
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Vladimír Hampl
  - Department of Parasitology, Faculty of Science, Charles University, BIOCEV, Průmyslová 595, 252 42 Vestec, Czech Republic
|
25
|
Xie S, Leung AWS, Zheng Z, Zhang D, Xiao C, Luo R, Luo M, Zhang S. Applications and potentials of nanopore sequencing in the (epi)genome and (epi)transcriptome era. Innovation (N Y) 2021; 2:100153. [PMID: 34901902 PMCID: PMC8640597 DOI: 10.1016/j.xinn.2021.100153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/09/2021] [Accepted: 08/09/2021] [Indexed: 02/08/2023] Open
Abstract
The Human Genome Project opened an era of (epi)genomic research, and also provided a platform for the development of new sequencing technologies. During and after the project, several sequencing technologies have come to dominate nucleic acid sequencing markets. Currently, Illumina (short-read), PacBio (long-read), and Oxford Nanopore (long-read) are the most popular sequencing technologies. Unlike PacBio and the popular short-read sequencers before it, which, as examples of the second-generation or so-called Next-Generation Sequencing platforms, need to synthesize DNA while sequencing, nanopore technology directly sequences native DNA and RNA molecules. Nanopore sequencing therefore avoids converting mRNA into cDNA molecules, which not only allows for the sequencing of extremely long native DNA and full-length RNA molecules but also documents modifications that have been made to those native DNA or RNA bases. In this review on direct DNA sequencing and direct RNA sequencing using Oxford Nanopore technology, we focus on their development and application achievements, discussing their challenges and future perspectives. We also address the problems researchers may encounter when applying these approaches to their research topics, and how to resolve them.
- Nanopore-seq can dissect native DNA/RNA molecules from any organism at unlimited length.
- A wide variety of algorithms greatly increase the accuracy of signal decoding in Nanopore-Seq.
- Nanopore-Seq significantly facilitates genome assembly and structural variant calling, and can simultaneously detect base modifications.
- These advantages ensure its great potential in future medical and agricultural practices.
Affiliation(s)
- Shangqian Xie
  - Key Laboratory of Ministry of Education for Genetics and Germplasm Innovation of Tropical Special Trees and Ornamental Plants, College of Forestry, Hainan University, Haikou 570228, China
- Amy Wing-Sze Leung
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Zhenxian Zheng
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Dake Zhang
  - Beijing Advanced Innovation Centre for Biomedical Engineering, Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Chuanle Xiao
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou 510060, China
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Ming Luo
  - Agriculture and Biotechnology Research Center, Guangdong Provincial Key Laboratory of Applied Botany, Center of Economic Botany, Core Botanical Gardens, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
- Shoudong Zhang
  - School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
  - Center for Soybean Research of the State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
|
26
|
Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 476] [Impact Index Per Article: 158.7] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
|
27
|
Dohál M, Porvazník I, Solovič I, Mokrý J. Whole Genome Sequencing in the Management of Non-Tuberculous Mycobacterial Infections. Microorganisms 2021; 9:microorganisms9112237. [PMID: 34835363 PMCID: PMC8621650 DOI: 10.3390/microorganisms9112237] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/30/2021] [Revised: 10/22/2021] [Accepted: 10/25/2021] [Indexed: 12/20/2022] Open
Abstract
Infections caused by non-tuberculous mycobacteria (NTM) have been a public health problem in recent decades and contribute significantly to the clinical and economic burden globally. Diagnosis of these infections is difficult and time-consuming; in addition, conventional diagnostic tests lack sufficient discriminatory power for species identification owing to cross-reactions and probes that are not fully specific. However, technological advances have been made, and whole genome sequencing (WGS) has been shown to be an essential part of routine diagnostics in clinical mycobacteriology laboratories. The use of this technology has contributed to the characterization of new species of mycobacteria, as well as the identification of gene mutations encoding resistance and virulence factors. Sequencing data have also made it possible to track global outbreaks of nosocomial NTM infections caused by M. abscessus complex and M. chimaera. To highlight the utility of WGS, we summarize recent scientific studies on WGS as a tool suitable for the management of NTM-induced infections in clinical practice.
Affiliation(s)
- Matúš Dohál
  - Biomedical Center Martin, Department of Pharmacology, Jessenius Faculty of Medicine, Comenius University, 036 01 Martin, Slovakia
  - Correspondence: ; Tel.: +42-19-0252-4199
- Igor Porvazník
  - National Institute of Tuberculosis, Lung Diseases and Thoracic Surgery, 059 81 Vyšné Hágy, Slovakia; (I.P.); (I.S.)
  - Faculty of Health, Catholic University, 034 01 Ružomberok, Slovakia
- Ivan Solovič
  - National Institute of Tuberculosis, Lung Diseases and Thoracic Surgery, 059 81 Vyšné Hágy, Slovakia; (I.P.); (I.S.)
  - Faculty of Health, Catholic University, 034 01 Ružomberok, Slovakia
- Juraj Mokrý
  - Biomedical Center Martin, Department of Pharmacology, Jessenius Faculty of Medicine, Comenius University, 036 01 Martin, Slovakia
|
28
|
Johnson LK, Sahasrabudhe R, Gill JA, Roach JL, Froenicke L, Brown CT, Whitehead A. Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish. Gigascience 2021; 9:5859380. [PMID: 32556169 PMCID: PMC7301629 DOI: 10.1093/gigascience/giaa067] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/04/2019] [Revised: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. FINDINGS: Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. CONCLUSIONS: High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
Affiliation(s)
- Lisa K Johnson
  - Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
  - Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Ruta Sahasrabudhe
  - DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- James Anthony Gill
  - Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Jennifer L Roach
  - Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Lutz Froenicke
  - DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- C Titus Brown
  - Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, 1 Shields Avenue, Davis, CA 95616, USA
- Andrew Whitehead
  - Correspondence address: Andrew Whitehead, Department of Environmental Toxicology, University of California, 1 Shields Avenue, Davis, CA 95616, USA. E-mail:
|
29
|
Choi J, Jia Z, Riahipour R, McKinney CJ, Amarasekara CA, Weerakoon-Ratnayake KM, Soper SA, Park S. Label-Free Identification of Single Mononucleotides by Nanoscale Electrophoresis. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2021; 17:e2102567. [PMID: 34558175 PMCID: PMC8542607 DOI: 10.1002/smll.202102567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 07/15/2021] [Indexed: 06/13/2023]
Abstract
Nanoscale electrophoresis allows for unique separations of single molecules, such as DNA/RNA nucleobases, and thus has the potential to serve as a single-molecule sensor for exonuclease sequencing. For this to be realized, label-free detection of the nucleotides is needed to determine their electrophoretic mobility (i.e., time-of-flight, TOF) for highly accurate identification. Here, for the first time, a novel nanosensor is shown that discriminates the four 2'-deoxyribonucleoside 5'-monophosphate (dNMP) molecules in a label-free manner by nanoscale electrophoresis. This is made possible by positioning two sub-10 nm in-plane pores at both ends of a nanochannel column used for nanoscale electrophoresis and measuring the longitudinal transient current during translocation of the molecules. The dual nanopore TOF sensor with nanochannel column lengths of 0.5, 1, and 5 µm discriminates different dNMPs with a mean accuracy of 55, 66, and 94%, respectively. This nanosensor format can be broadly applicable to label-free detection and discrimination of other single molecules, vesicles, and particles by changing the dimensions of the nanochannel column and in-plane nanopores and integrating different pre- and postprocessing units into the nanosensor. This is simple to accomplish because the nanosensor is contained within a fluidic network made in plastic via replication.
Affiliation(s)
- Junseo Choi
  - Department of Mechanical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Zheng Jia
  - Department of Mechanical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Ramin Riahipour
  - Department of Mechanical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Collin J. McKinney
  - Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Charuni A. Amarasekara
  - Department of Chemistry, University of Kansas, Lawrence, KS 66047, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Kumuditha M. Weerakoon-Ratnayake
  - Department of Chemistry, University of Kansas, Lawrence, KS 66047, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
- Steven A. Soper
  - Department of Chemistry, University of Kansas, Lawrence, KS 66047, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
  - Bioengineering Program, University of Kansas, Lawrence, KS 66047, USA
  - Department of Kansas Biology and KUCC, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Sunggook Park
  - Department of Mechanical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
  - Center of Bio-Modular Multiscale Systems for Precision Medicine, USA
|
30
|
Lin Z, Xie Y, Nong W, Ren X, Li R, Zhao Z, Hui JHL, Yuen KWY. Formation of artificial chromosomes in Caenorhabditis elegans and analyses of their segregation in mitosis, DNA sequence composition and holocentromere organization. Nucleic Acids Res 2021; 49:9174-9193. [PMID: 34417622 PMCID: PMC8450109 DOI: 10.1093/nar/gkab690] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/21/2020] [Revised: 07/23/2021] [Accepted: 07/30/2021] [Indexed: 11/14/2022] Open
Abstract
To investigate how exogenous DNA concatemerizes to form episomal artificial chromosomes (ACs), acquire equal segregation ability and maintain stable holocentromeres, we injected DNA sequences with different features, including sequences that are repetitive or complex, and sequences with different AT-contents, into the gonad of Caenorhabditis elegans to form ACs in embryos, and monitored AC mitotic segregation. We demonstrated that AT-poor sequences (26% AT-content) delayed the acquisition of segregation competency of newly formed ACs. We also co-injected fragmented Saccharomyces cerevisiae genomic DNA, differentially expressed fluorescent markers and a ubiquitously expressed selectable marker to construct a less repetitive, more complex AC. We sequenced the whole genome of a strain that propagates this AC through multiple generations, and de novo assembled the AC sequences. We discovered that CENP-A(HCP-3) domains/peaks are distributed along the AC, as in endogenous chromosomes, suggesting a holocentric architecture. We found that CENP-A(HCP-3) binds to the unexpressed marker genes and many fragmented yeast sequences, but is excluded from the extremely high-AT-content yeast centromeric and mitochondrial DNA (> 83% AT-content) on the AC. We identified A-rich motifs in CENP-A(HCP-3) domains/peaks on the AC and on endogenous chromosomes, which have some similarity with each other and with some non-germline transcription factor binding sites.
Affiliation(s)
- Zhongyang Lin
  - School of Biological Sciences, the University of Hong Kong, Kadoorie Biological Sciences Building, Pokfulam Road, Hong Kong
- Yichun Xie
  - School of Life Sciences, Simon F.S. Li Marine Science Laboratory, State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong
- Wenyan Nong
  - School of Life Sciences, Simon F.S. Li Marine Science Laboratory, State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong
- Xiaoliang Ren
  - Department of Biology, Baptist University of Hong Kong, Sir Run Run Shaw Building, Ho Sin Hang Campus, Kowloon Tong, Hong Kong
- Runsheng Li
  - Department of Biology, Baptist University of Hong Kong, Sir Run Run Shaw Building, Ho Sin Hang Campus, Kowloon Tong, Hong Kong
- Zhongying Zhao
  - Department of Biology, Baptist University of Hong Kong, Sir Run Run Shaw Building, Ho Sin Hang Campus, Kowloon Tong, Hong Kong
- Jerome Ho Lam Hui
  - School of Life Sciences, Simon F.S. Li Marine Science Laboratory, State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Hong Kong
- Karen Wing Yee Yuen
  - School of Biological Sciences, the University of Hong Kong, Kadoorie Biological Sciences Building, Pokfulam Road, Hong Kong
|
31
|
Comprehensive Wet-Bench and Bioinformatics Workflow for Complex Microbiota Using Oxford Nanopore Technologies. mSystems 2021; 6:e0075021. [PMID: 34427527 PMCID: PMC8407471 DOI: 10.1128/msystems.00750-21] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/26/2022] Open
Abstract
The advent of high-throughput sequencing techniques has recently provided an astonishing insight into the composition and function of the human microbiome. Next-generation sequencing (NGS) has become the gold standard for advanced microbiome analysis; however, 3rd generation real-time sequencing, such as Oxford Nanopore Technologies (ONT), enables rapid sequencing from several kilobases to >2 Mb with high resolution. Despite the wide availability and the enormous potential for clinical and translational applications, ONT is poorly standardized in terms of sampling and storage conditions, DNA extraction, library creation, and bioinformatic classification. Here, we present a comprehensive analysis pipeline with sampling, storage, DNA extraction, library preparation, and bioinformatic evaluation for complex microbiomes sequenced with ONT. Our findings from buccal and rectal swabs and DNA extraction experiments indicate that methods that were approved for NGS microbiome analysis cannot be simply adapted to ONT. We recommend using swabs and DNA extraction protocols with extended washing steps. Both 16S rRNA and metagenomic sequencing achieved reliable and reproducible results. Our benchmarking experiments reveal thresholds for analysis parameters that achieved excellent precision, recall, and area under the precision-recall curve values and are superior to existing classifiers (Kraken2, Kaiju, and MetaMaps). Hence, our workflow provides an experimental and bioinformatic pipeline to perform a highly accurate analysis of complex microbial structures from buccal and rectal swabs.
IMPORTANCE Advanced microbiome analysis relies on sequencing of short DNA fragments from microorganisms like bacteria, fungi, and viruses. More recently, long-fragment DNA sequencing with 3rd generation platforms has gained increasing importance and can be conducted within a few hours due to its potential for real-time sequencing. However, the analysis and correct identification of the microbiome rely on a multitude of factors, such as the method of sampling, DNA extraction, sequencing, and bioinformatic analysis. Scientists have used different protocols in the past that do not allow us to compare results across different studies and research fields. Here, we provide a comprehensive workflow spanning DNA extraction, sequencing, and bioinformatic analysis that allows rapid and accurate analysis of human buccal and rectal swabs with reproducible protocols. This workflow can be readily applied by many scientists from various research fields who aim to use long-fragment microbiome sequencing.
|
32
|
Morris C, Lee YS, Yoon S. Adventitious agent detection methods in bio-pharmaceutical applications with a focus on viruses, bacteria, and mycoplasma. Curr Opin Biotechnol 2021; 71:105-114. [PMID: 34325176 DOI: 10.1016/j.copbio.2021.06.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/16/2021] [Revised: 06/22/2021] [Accepted: 06/29/2021] [Indexed: 10/20/2022]
Abstract
Adventitious agents present significant complications to biopharmaceutical manufacturing. Adventitious agents include numerous lifeforms such as bacteria, fungi, viruses, mycoplasma, and others that are inadvertently introduced into biological systems. They present significant problems to the stability of cell cultures and the sterility of manufacturing products. In this review, detection methods for bacteria, viruses, and mycoplasma are comprehensively addressed. Detection methods for viruses include traditional culture-based methods, electron microscopy studies, in vitro molecular and antibody assays, sequencing methods (massively parallel or next generation sequencing), and degenerate PCR (polymerase chain reaction). Bacteria, on the other hand, can be detected with culture-based approaches, PCR, and biosensor-based methods. Mycoplasma can be detected via PCR (including specific kits), microbiological culture methods, and enzyme-linked immunosorbent assays (ELISA). This review highlights the advantages and weaknesses of current detection methods while exploring potential avenues for further development and improvement of novel detection methods. Additionally, a brief evaluation of the transition of these methods into the gene therapy production realm, with a focus on viral titer monitoring, is presented.
Affiliation(s)
- Caitlin Morris
  - Pharmaceutical Sciences, University of Massachusetts Lowell, Lowell, MA 01854, USA
- Yong Suk Lee
  - Pharmaceutical Sciences, University of Massachusetts Lowell, Lowell, MA 01854, USA
- Seongkyu Yoon
  - Chemical Engineering, University of Massachusetts Lowell, Lowell, MA 01854, USA
|
33
|
Genome sequence of the cardiopulmonary canid nematode Angiostrongylus vasorum reveals species-specific genes with potential involvement in coagulopathy. Genomics 2021; 113:2695-2701. [PMID: 34118383 DOI: 10.1016/j.ygeno.2021.06.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/02/2020] [Revised: 05/21/2021] [Accepted: 06/07/2021] [Indexed: 11/22/2022]
Abstract
Angiostrongylus vasorum is an emerging parasitic nematode of canids and causes respiratory distress, bleeding, and other signs in dogs. Despite its clinical importance, the molecular toolbox allowing the study of the parasite is incomplete. To address this gap, we have sequenced its nuclear genome using Oxford nanopore sequencing, polished with Illumina reads. The size of the final genome is 280 Mb comprising 468 contigs, with an N50 value of 1.68 Mb and a BUSCO score of 93.5%. Ninety-three percent of 13,766 predicted genes were assigned to putative functions. Three folate carriers were found exclusively in A. vasorum, with potential involvement in host coagulopathy. A screen for previously identified vaccine candidates, the aminopeptidase H11 and the somatic protein rHc23, revealed homologs in A. vasorum. The genome sequence will provide a foundation for the development of new tools against canine angiostrongylosis, supporting the identification of potential drug and vaccine targets.
|
34
|
Re-examination of two diatom reference genomes using long-read sequencing. BMC Genomics 2021; 22:379. [PMID: 34030633 PMCID: PMC8147415 DOI: 10.1186/s12864-021-07666-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/22/2020] [Accepted: 04/26/2021] [Indexed: 12/03/2022] Open
Abstract
Background The marine diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum are valuable model organisms for exploring the evolution, diversity and ecology of this important algal group. Their reference genomes, published in 2004 and 2008, respectively, were the product of traditional Sanger sequencing. In the case of T. pseudonana, optical restriction site mapping was employed to further clarify and contextualize chromosome-level scaffolds. While both genomes are considered highly accurate and reasonably contiguous, they still contain many unresolved regions and unordered/unlinked scaffolds. Results We have used Oxford Nanopore Technologies long-read sequencing to update and validate the quality and contiguity of the T. pseudonana and P. tricornutum genomes. Fine-scale assessment of our long-read derived genome assemblies allowed us to resolve previously uncertain genomic regions, further characterize complex structural variation, and re-evaluate the repetitive DNA content of both genomes. We also identified 1862 previously undescribed genes in T. pseudonana. In P. tricornutum, we used transposable element detection software to identify 33 novel copia-type LTR-RT insertions, indicating ongoing activity and rapid expansion of this superfamily as the organism continues to be maintained in culture. Finally, Bionano optical mapping of P. tricornutum chromosomes was combined with long-read sequence data to explore the potential of long-read sequencing and optical mapping for resolving haplotypes. Conclusion Despite its potential to yield highly contiguous scaffolds, long-read sequencing is not a panacea. Even for relatively small nuclear genomes such as those investigated herein, repetitive DNA sequences cause problems for current genome assembly algorithms. Determining whether a long-read derived genomic assembly is ‘better’ than one produced using traditional sequence data is not straightforward. Our revised reference genomes for P. tricornutum and T. pseudonana nevertheless provide additional insight into the structure and evolution of both genomes, thereby providing a more robust foundation for future diatom research. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07666-3.
|
35
|
Sun J, Li R, Chen C, Sigwart JD, Kocot KM. Benchmarking Oxford Nanopore read assemblers for high-quality molluscan genomes. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200160. [PMID: 33813888 PMCID: PMC8059532 DOI: 10.1098/rstb.2020.0160] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Accepted: 12/31/2020] [Indexed: 12/14/2022] Open
Abstract
Choosing the optimum assembly approach is essential to achieving a high-quality genome assembly suitable for comparative and evolutionary genomic investigations. Significant recent progress in long-read sequencing technologies such as PacBio and Oxford Nanopore Technologies (ONT) has also brought about a large variety of assemblers. Although these have been extensively tested on model species such as Homo sapiens and Drosophila melanogaster, such benchmarking has not been done in Mollusca, which lacks widely adopted model species. Molluscan genomes are notoriously rich in repeats and are often highly heterozygous, making their assembly challenging. Here, we benchmarked 10 assemblers based on ONT raw reads from two published molluscan genomes of differing properties, the gastropod Chrysomallon squamiferum (356.6 Mb, 1.59% heterozygosity) and the bivalve Mytilus coruscus (1593 Mb, 1.94% heterozygosity). By optimizing the assembly pipeline, we greatly improved both genomes from previously published versions. Our results suggested that 40-50X of ONT reads are sufficient for high-quality genomes, with Flye being the recommended assembler for compact and less heterozygous genomes exemplified by C. squamiferum, while NextDenovo excelled for more repetitive and heterozygous molluscan genomes exemplified by M. coruscus. A phylogenomic analysis using the two updated genomes with 32 other published high-quality lophotrochozoan genomes resulted in maximum support across all nodes, and we show that improved genome quality also leads to more complete matrices for phylogenomic inferences. Our benchmarking will ensure efficiency in future assemblies for molluscs and perhaps also for other marine phyla with few genomes available. This article is part of the Theo Murphy meeting issue 'Molluscan genomics: broad insights and future directions for a neglected phylum'.
Affiliation(s)
- Jin Sun
- Institute of Evolution and Marine Biodiversity, Key Laboratory of Mariculture (Ministry of Education), Ocean University of China, Qingdao 266003, People's Republic of China
| | - Runsheng Li
- Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Kowloon, Hong Kong, People's Republic of China
| | - Chong Chen
- X-STAR, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), 2–15 Natsushima-cho, Yokosuka, Kanagawa Prefecture 237-0061, Japan
| | - Julia D. Sigwart
- Senckenberg Museum, 60325 Frankfurt, Germany
- Marine Laboratory Queen's University Belfast, Portaferry, BT22 1PF, Northern Ireland
| | - Kevin M. Kocot
- Department of Biological Sciences and Alabama Museum of Natural History, University of Alabama, Tuscaloosa, AL 35487, USA
|
36
|
Brown E, Freimanis G, Shaw AE, Horton DL, Gubbins S, King D. Characterising Foot-and-Mouth Disease Virus in Clinical Samples Using Nanopore Sequencing. Front Vet Sci 2021; 8:656256. [PMID: 34079833 PMCID: PMC8165188 DOI: 10.3389/fvets.2021.656256] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/20/2021] [Accepted: 04/06/2021] [Indexed: 11/13/2022] Open
Abstract
The sequencing of viral genomes provides important data for the prevention and control of foot-and-mouth disease (FMD) outbreaks. Sequence data can be used for strain identification, outbreak tracing, and aiding the selection of the most appropriate vaccine for the circulating strains. At present, sequencing of FMD virus (FMDV) relies upon the time-consuming transport of samples to well-resourced laboratories. The Oxford Nanopore Technologies' MinION portable sequencer has the potential to allow sequencing in remote, decentralised laboratories closer to the outbreak location. In this study, we investigated the utility of the MinION to generate sequence data of sufficient quantity and quality for the characterisation of FMDV serotypes O, A, Asia 1. Prior to sequencing, a universal two-step RT-PCR was used to amplify parts of the 5'UTR, as well as the leader, capsid and parts of the 2A encoding regions of FMDV RNA extracted from three sample matrices: cell culture supernatant, tongue epithelial suspension and oral swabs. The resulting consensus sequences were compared with reference sequences generated on the Illumina MiSeq platform. Consensus sequences with an accuracy of 100% were achieved within 10 and 30 min from the start of the sequencing run when using RNA extracted from cell culture supernatants and tongue epithelial suspensions, respectively. In contrast, sequencing from swabs required up to 2.5 h. Together these results demonstrated that the MinION sequencer can be used to accurately and rapidly characterise serotypes A, O, and Asia 1 of FMDV using amplicons amplified from a variety of different sample matrices.
Affiliation(s)
- Emma Brown
- Department of Transmission Biology, The Pirbright Institute, Woking, United Kingdom
- Faculty of Health and Medical Science, School of Veterinary Medicine, University of Surrey, Guildford, United Kingdom
| | - Graham Freimanis
- Department of Bioinformatics, Sequencing & Proteomics, The Pirbright Institute, Woking, United Kingdom
| | - Andrew E. Shaw
- Vesicular Disease Reference Laboratory, The Pirbright Institute, Woking, United Kingdom
| | - Daniel L. Horton
- Faculty of Health and Medical Science, School of Veterinary Medicine, University of Surrey, Guildford, United Kingdom
| | - Simon Gubbins
- Department of Transmission Biology, The Pirbright Institute, Woking, United Kingdom
| | - David King
- Vesicular Disease Reference Laboratory, The Pirbright Institute, Woking, United Kingdom
- Department of Microbial and Cellular Sciences, School of Biosciences and Medicine, Faculty of Health and Medical Sciences, Stag Hill campus, University of Surrey, Guildford, United Kingdom
|
37
|
Multiplex PCR-Based Nanopore Sequencing and Epidemiological Surveillance of Hantaan orthohantavirus in Apodemus agrarius, Republic of Korea. Viruses 2021; 13:847. [PMID: 34066592 PMCID: PMC8148566 DOI: 10.3390/v13050847] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/12/2021] [Revised: 05/01/2021] [Accepted: 05/04/2021] [Indexed: 01/02/2023] Open
Abstract
Whole-genome sequencing of infectious agents enables the identification and characterization of emerging viruses. The MinION device is a portable sequencer that allows real-time sequencing in fields or hospitals. Hantaan orthohantavirus (Hantaan virus, HTNV), harbored by Apodemus agrarius, causes hemorrhagic fever with renal syndrome (HFRS) and poses a critical public health threat worldwide. In this study, we aimed to evaluate the feasibility of using nanopore sequencing for whole-genome sequencing of HTNV from samples having different viral copy numbers. Amplicon-based next-generation sequencing was performed in A. agrarius lung tissues collected from the Republic of Korea. Genomic sequences of HTNV were analyzed based on the viral RNA copy numbers. Amplicon-based nanopore sequencing provided nearly full-length genomic sequences of HTNV and showed sufficient read depth for phylogenetic analysis after 8 h of sequencing. Compared with sequences generated on the Illumina MiSeq, the HTNV genome sequences from the nanopore sequencer showed average identities of 99.8% (L and M segments) and 99.7% (S segment). This study highlights the potential of the portable nanopore sequencer for rapid generation of accurate genomic sequences of HTNV for quicker decision making in point-of-care testing of HFRS patients during a hantavirus outbreak.
|
38
|
Rhie A, McCarthy SA, Fedrigo O, Damas J, Formenti G, Koren S, Uliano-Silva M, Chow W, Fungtammasan A, Kim J, Lee C, Ko BJ, Chaisson M, Gedman GL, Cantin LJ, Thibaud-Nissen F, Haggerty L, Bista I, Smith M, Haase B, Mountcastle J, Winkler S, Paez S, Howard J, Vernes SC, Lama TM, Grutzner F, Warren WC, Balakrishnan CN, Burt D, George JM, Biegler MT, Iorns D, Digby A, Eason D, Robertson B, Edwards T, Wilkinson M, Turner G, Meyer A, Kautt AF, Franchini P, Detrich HW, Svardal H, Wagner M, Naylor GJP, Pippel M, Malinsky M, Mooney M, Simbirsky M, Hannigan BT, Pesout T, Houck M, Misuraca A, Kingan SB, Hall R, Kronenberg Z, Sović I, Dunn C, Ning Z, Hastie A, Lee J, Selvaraj S, Green RE, Putnam NH, Gut I, Ghurye J, Garrison E, Sims Y, Collins J, Pelan S, Torrance J, Tracey A, Wood J, Dagnew RE, Guan D, London SE, Clayton DF, Mello CV, Friedrich SR, Lovell PV, Osipova E, Al-Ajli FO, Secomandi S, Kim H, Theofanopoulou C, Hiller M, Zhou Y, Harris RS, Makova KD, Medvedev P, Hoffman J, Masterson P, Clark K, Martin F, Howe K, Flicek P, Walenz BP, Kwak W, Clawson H, Diekhans M, Nassar L, Paten B, Kraus RHS, Crawford AJ, Gilbert MTP, Zhang G, Venkatesh B, Murphy RW, Koepfli KP, Shapiro B, Johnson WE, Di Palma F, Marques-Bonet T, Teeling EC, Warnow T, Graves JM, Ryder OA, Haussler D, O'Brien SJ, Korlach J, Lewin HA, Howe K, Myers EW, Durbin R, Phillippy AM, Jarvis ED. Towards complete and error-free genome assemblies of all vertebrate species. Nature 2021; 592:737-746. [PMID: 33911273 PMCID: PMC8081667 DOI: 10.1038/s41586-021-03451-0] [Citation(s) in RCA: 793] [Impact Index Per Article: 264.3] [Received: 05/22/2020] [Accepted: 03/12/2021] [Indexed: 02/02/2023]
Abstract
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1-4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
Affiliation(s)
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Shane A McCarthy
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
| | - Olivier Fedrigo
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA, USA
| | - Giulio Formenti
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Marcela Uliano-Silva
- Leibniz Institute for Zoo and Wildlife Research, Department of Evolutionary Genetics, Berlin, Germany
- Berlin Center for Genomics in Biodiversity Research, Berlin, Germany
| | | | | | - Juwan Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
| | - Mark Chaisson
- University of Southern California, Los Angeles, CA, USA
| | - Gregory L Gedman
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Lindsey J Cantin
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Iliana Bista
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
| | | | - Bettina Haase
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
| | | | - Sylke Winkler
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- DRESDEN-concept Genome Center, Dresden, Germany
| | - Sadye Paez
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | | | - Sonja C Vernes
- Neurogenetics of Vocal Communication Group, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- School of Biology, University of St Andrews, St Andrews, UK
| | - Tanya M Lama
- University of Massachusetts Cooperative Fish and Wildlife Research Unit, Amherst, MA, USA
| | - Frank Grutzner
- School of Biological Science, The Environment Institute, University of Adelaide, Adelaide, South Australia, Australia
| | - Wesley C Warren
- Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
| | | | - Dave Burt
- UQ Genomics, University of Queensland, Brisbane, Queensland, Australia
| | - Julia M George
- Department of Biological Sciences, Clemson University, Clemson, SC, USA
| | - Matthew T Biegler
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
| | - David Iorns
- The Genetic Rescue Foundation, Wellington, New Zealand
| | - Andrew Digby
- Kākāpō Recovery, Department of Conservation, Invercargill, New Zealand
| | - Daryl Eason
- Kākāpō Recovery, Department of Conservation, Invercargill, New Zealand
| | - Bruce Robertson
- Department of Zoology, University of Otago, Dunedin, New Zealand
| | | | - Mark Wilkinson
- Department of Life Sciences, Natural History Museum, London, UK
| | - George Turner
- School of Natural Sciences, Bangor University, Gwynedd, UK
| | - Axel Meyer
- Department of Biology, University of Konstanz, Konstanz, Germany
| | - Andreas F Kautt
- Department of Biology, University of Konstanz, Konstanz, Germany
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Paolo Franchini
- Department of Biology, University of Konstanz, Konstanz, Germany
| | - H William Detrich
- Department of Marine and Environmental Sciences, Northeastern University Marine Science Center, Nahant, MA, USA
| | - Hannes Svardal
- Department of Biology, University of Antwerp, Antwerp, Belgium
- Naturalis Biodiversity Center, Leiden, The Netherlands
| | - Maximilian Wagner
- Institute of Biology, Karl-Franzens University of Graz, Graz, Austria
| | - Gavin J P Naylor
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
| | - Martin Pippel
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology, Dresden, Germany
| | - Milan Malinsky
- Wellcome Sanger Institute, Cambridge, UK
- Zoological Institute, University of Basel, Basel, Switzerland
| | | | | | | | - Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | | | | | | | | | | | - Ivan Sović
- Pacific Biosciences, Menlo Park, CA, USA
- Digital BioLogic, Ivanić-Grad, Croatia
| | | | - Zemin Ning
- Wellcome Sanger Institute, Cambridge, UK
| | | | - Joyce Lee
- Bionano Genomics, San Diego, CA, USA
| | | | - Richard E Green
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Dovetail Genomics, Santa Cruz, CA, USA
| | | | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Jay Ghurye
- Dovetail Genomics, Santa Cruz, CA, USA
- Department of Computer Science, University of Maryland College Park, College Park, MD, USA
| | - Erik Garrison
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Ying Sims
- Wellcome Sanger Institute, Cambridge, UK
| | | | | | | | | | | | | | - Dengfeng Guan
- Department of Genetics, University of Cambridge, Cambridge, UK
- School of Computer Science and Technology, Center for Bioinformatics, Harbin Institute of Technology, Harbin, China
| | - Sarah E London
- Department of Psychology, Institute for Mind and Biology, University of Chicago, Chicago, IL, USA
| | - David F Clayton
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
| | - Claudio V Mello
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Samantha R Friedrich
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Peter V Lovell
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, USA
| | - Ekaterina Osipova
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Center for Systems Biology, Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - Farooq O Al-Ajli
- Monash University Malaysia Genomics Facility, School of Science, Selangor Darul Ehsan, Malaysia
- Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, Selangor Darul Ehsan, Malaysia
- Qatar Falcon Genome Project, Doha, Qatar
| | | | - Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- eGnome, Inc., Seoul, Republic of Korea
| | | | - Michael Hiller
- LOEWE Centre for Translational Biodiversity Genomics, Frankfurt, Germany
- Senckenberg Research Institute, Frankfurt, Germany
- Goethe-University, Faculty of Biosciences, Frankfurt, Germany
| | | | - Robert S Harris
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
| | - Paul Medvedev
- Center for Medical Genomics, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
| | - Jinna Hoffman
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Karen Clark
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD, USA
| | - Fergal Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Kevin Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
| | - Brian P Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Woori Kwak
- eGnome, Inc., Seoul, Republic of Korea
- Hoonygen, Seoul, Korea
| | - Hiram Clawson
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Luis Nassar
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Robert H S Kraus
- Department of Biology, University of Konstanz, Konstanz, Germany
- Department of Migration, Max Planck Institute of Animal Behavior, Radolfzell, Germany
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes, Bogotá, Colombia
| | - M Thomas P Gilbert
- Center for Evolutionary Hologenomics, The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- University Museum, NTNU, Trondheim, Norway
| | - Guojie Zhang
- China National Genebank, BGI-Shenzhen, Shenzhen, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Byrappa Venkatesh
- Institute of Molecular and Cell Biology, A*STAR, Biopolis, Singapore, Singapore
| | - Robert W Murphy
- Centre for Biodiversity, Royal Ontario Museum, Toronto, Ontario, Canada
| | - Klaus-Peter Koepfli
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Warren E Johnson
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
- The Walter Reed Biosystematics Unit, Museum Support Center MRC-534, Smithsonian Institution, Suitland, MD, USA
- Walter Reed Army Institute of Research, Silver Spring, MD, USA
| | - Federica Di Palma
- Department of Biological Sciences, Earlham Institute, University of East Anglia, Norwich, UK
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
- Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Emma C Teeling
- School of Biology and Environmental Science, University College Dublin, Dublin, Ireland
| | - Tandy Warnow
- Department of Computer Science, The University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | | | - Oliver A Ryder
- San Diego Zoo Global, Escondido, CA, USA
- Department of Evolution, Behavior, and Ecology, University of California San Diego, La Jolla, CA, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Stephen J O'Brien
- Laboratory of Genomics Diversity-Center for Computer Technologies, ITMO University, St. Petersburg, Russian Federation
- Guy Harvey Oceanographic Center, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Fort Lauderdale, FL, USA
| | | | - Harris A Lewin
- The Genome Center, University of California Davis, Davis, CA, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA, USA
| | | | - Eugene W Myers
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany.
- Center for Systems Biology, Dresden, Germany.
- Faculty of Computer Science, Technical University Dresden, Dresden, Germany.
| | - Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge, UK.
- Wellcome Sanger Institute, Cambridge, UK.
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
| | - Erich D Jarvis
- Vertebrate Genome Lab, The Rockefeller University, New York, NY, USA.
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA.
- Howard Hughes Medical Institute, Chevy Chase, MD, USA.
|
39
|
Prall TM, Neumann EK, Karl JA, Shortreed CG, Baker DA, Bussan HE, Wiseman RW, O'Connor DH. Consistent ultra-long DNA sequencing with automated slow pipetting. BMC Genomics 2021; 22:182. [PMID: 33711930 DOI: 10.1186/s12864-021-07500-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2020] [Accepted: 03/02/2021] [Indexed: 05/28/2023] Open
Abstract
BACKGROUND Oxford Nanopore Technologies' instruments can sequence reads of great length. Long reads improve sequence assemblies by unambiguously spanning repetitive elements of the genome. Sequencing reads of significant length requires the preservation of long DNA template molecules through library preparation by pipetting reagents as slowly as possible to minimize shearing. This process is time-consuming and inconsistent at preserving read length as even small changes in volumetric flow rate can result in template shearing. RESULTS We have designed SNAILS (Slow Nucleic Acid Instrument for Long Sequences), a 3D-printable instrument that automates slow pipetting of reagents used in long read library preparation for Oxford Nanopore sequencing. Across six sequencing libraries, SNAILS preserved more reads exceeding 100 kilobases in length and increased its libraries' average read length over manual slow pipetting. CONCLUSIONS SNAILS is a low-cost, easily deployable solution for improving sequencing projects that require reads of significant length. By automating the slow pipetting of library preparation reagents, SNAILS increases the consistency and throughput of long read Nanopore sequencing.
Affiliation(s)
- Trent M Prall
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Emma K Neumann
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Julie A Karl
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Cecilia G Shortreed
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - David A Baker
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Hailey E Bussan
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Roger W Wiseman
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
- Wisconsin National Primate Research Center, University of Wisconsin, Madison, USA
| | - David H O'Connor
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA.
- Wisconsin National Primate Research Center, University of Wisconsin, Madison, USA.
| |
Collapse
|
40
|
Prall TM, Neumann EK, Karl JA, Shortreed CG, Baker DA, Bussan HE, Wiseman RW, O'Connor DH. Consistent ultra-long DNA sequencing with automated slow pipetting. BMC Genomics 2021; 22:182. [PMID: 33711930 PMCID: PMC7953553 DOI: 10.1186/s12864-021-07500-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/17/2020] [Accepted: 03/02/2021] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Oxford Nanopore Technologies' instruments can sequence reads of great length. Long reads improve sequence assemblies by unambiguously spanning repetitive elements of the genome. Sequencing reads of significant length requires the preservation of long DNA template molecules through library preparation by pipetting reagents as slowly as possible to minimize shearing. This process is time-consuming and inconsistent at preserving read length as even small changes in volumetric flow rate can result in template shearing. RESULTS We have designed SNAILS (Slow Nucleic Acid Instrument for Long Sequences), a 3D-printable instrument that automates slow pipetting of reagents used in long read library preparation for Oxford Nanopore sequencing. Across six sequencing libraries, SNAILS preserved more reads exceeding 100 kilobases in length and increased its libraries' average read length over manual slow pipetting. CONCLUSIONS SNAILS is a low-cost, easily deployable solution for improving sequencing projects that require reads of significant length. By automating the slow pipetting of library preparation reagents, SNAILS increases the consistency and throughput of long read Nanopore sequencing.
Affiliation(s)
- Trent M Prall
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Emma K Neumann
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Julie A Karl
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Cecilia G Shortreed
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - David A Baker
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Hailey E Bussan
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
| | - Roger W Wiseman
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA
- Wisconsin National Primate Research Center, University of Wisconsin, Madison, USA
| | - David H O'Connor
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, USA.
- Wisconsin National Primate Research Center, University of Wisconsin, Madison, USA.
| |
|
41
|
van Belzen IAEM, Schönhuth A, Kemmeren P, Hehir-Kwa JY. Structural variant detection in cancer genomes: computational challenges and perspectives for precision oncology. NPJ Precis Oncol 2021; 5:15. [PMID: 33654267 PMCID: PMC7925608 DOI: 10.1038/s41698-021-00155-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 08/17/2020] [Accepted: 01/12/2021] [Indexed: 01/31/2023] Open
Abstract
Cancer is generally characterized by acquired genomic aberrations in a broad spectrum of types and sizes, ranging from single nucleotide variants to structural variants (SVs). At least 30% of cancers have a known pathogenic SV used in diagnosis or treatment stratification. However, research into the role of SVs in cancer has been limited due to difficulties in detection. Biological and computational challenges confound SV detection in cancer samples, including intratumor heterogeneity, polyploidy, and distinguishing tumor-specific SVs from germline and somatic variants present in healthy cells. Classification of tumor-specific SVs is challenging due to inconsistencies in detected breakpoints, derived variant types and biological complexity of some rearrangements. Full-spectrum SV detection with high recall and precision requires integration of multiple algorithms and sequencing technologies to rescue variants that are difficult to resolve through individual methods. Here, we explore current strategies for integrating SV callsets and to enable the use of tumor-specific SVs in precision oncology.
Affiliation(s)
| | - Alexander Schönhuth
- Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany
| | - Patrick Kemmeren
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands
| | - Jayne Y Hehir-Kwa
- Princess Máxima Center for Pediatric Oncology, Utrecht, The Netherlands.
| |
|
42
|
Begum G, Albanna A, Bankapur A, Nassir N, Tambi R, Berdiev BK, Akter H, Karuvantevida N, Kellam B, Alhashmi D, Sung WWL, Thiruvahindrapuram B, Alsheikh-Ali A, Scherer SW, Uddin M. Long-Read Sequencing Improves the Detection of Structural Variations Impacting Complex Non-Coding Elements of the Genome. Int J Mol Sci 2021; 22:2060. [PMID: 33669700 PMCID: PMC7923155 DOI: 10.3390/ijms22042060] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/16/2020] [Revised: 01/21/2021] [Accepted: 01/27/2021] [Indexed: 12/17/2022] Open
Abstract
The advent of long-read sequencing offers a new assessment method of detecting genomic structural variation (SV) in numerous rare genetic diseases. For autism spectrum disorders (ASD) cases where pathogenic variants fail to be found in the protein-coding genic regions along chromosomes, we proposed a scalable workflow to characterize the risk factor of SVs impacting non-coding elements of the genome. We applied whole-genome sequencing on an Emirati family having three children with ASD using long and short-read sequencing technology. A series of analytical pipelines were established to identify a set of SVs with high sensitivity and specificity. At 15-fold coverage, we observed that long-read sequencing technology (987 variants) detected a significantly higher number of SVs when compared to variants detected using short-read technology (509 variants) (p-value < 1.1020 × 10⁻⁵⁷). Further comparison showed 97.9% of long-read sequencing variants were spanning within the 1-100 kb size range (p-value < 9.080 × 10⁻⁶⁷) and impacting over 5000 genes. Moreover, long-read variants detected 604 non-coding RNAs (p-value < 9.02 × 10⁻⁹), comprising 58% microRNA, 31.9% lncRNA, and 9.1% snoRNA. Even at low coverage, long-read sequencing has been shown to be a reliable technology in detecting SVs impacting complex elements of the genome.
Affiliation(s)
- Ghausia Begum
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Ammar Albanna
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
- Department of Psychiatry, Al Jalila Children’s Specialty Hospital, Dubai 7662, United Arab Emirates
| | - Asma Bankapur
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Nasna Nassir
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Richa Tambi
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Bakhrom K. Berdiev
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Hosneara Akter
- Genetics and Genomic Medicine Centre, NeuroGen Children’s Healthcare, Dhaka 1205, Bangladesh;
- Department of Biochemistry and Molecular Biology, Dhaka University, Dhaka 1000, Bangladesh
| | - Noushad Karuvantevida
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
- Department of Biotechnology, Bharathidasan University, Tiruchirappalli 620024, India
| | - Barbara Kellam
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5S 1A1, Canada; (B.K.); (W.W.L.S.); (B.T.)
| | - Deena Alhashmi
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Wilson W. L. Sung
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5S 1A1, Canada; (B.K.); (W.W.L.S.); (B.T.)
| | - Bhooma Thiruvahindrapuram
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5S 1A1, Canada; (B.K.); (W.W.L.S.); (B.T.)
| | - Alawi Alsheikh-Ali
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| | - Stephen W. Scherer
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5S 1A1, Canada; (B.K.); (W.W.L.S.); (B.T.)
- Department of Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada
- McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, ON M5S, Canada
| | - Mohammed Uddin
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai 505055, United Arab Emirates; (G.B.); (A.A.); (A.B.); (N.N.); (R.T.); (B.K.B.); (N.K.); (D.A.); (A.A.-A.)
| |
|
43
|
Sorokin DY, Mosier D, Zorz JK, Dong X, Strous M. Wenzhouxiangella Strain AB-CW3, a Proteolytic Bacterium From Hypersaline Soda Lakes That Preys on Cells of Gram-Positive Bacteria. Front Microbiol 2020; 11:597686. [PMID: 33281797 PMCID: PMC7691419 DOI: 10.3389/fmicb.2020.597686] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/21/2020] [Accepted: 10/26/2020] [Indexed: 11/13/2022] Open
Abstract
A new haloalkaliphilic species of Wenzhouxiangella, strain AB-CW3, was isolated from a system of hypersaline alkaline soda lakes in the Kulunda Steppe using cells of Staphylococcus aureus as growth substrate. AB-CW3's complete, circular genome was assembled from combined nanopore and Illumina sequencing and its proteome was determined for three different experimental conditions. AB-CW3 is an aerobic gammaproteobacterium feeding mainly on proteins and peptides. Unique among Wenzhouxiangella, it uses a flagellum for motility, fimbria for cell attachment and is capable of complete denitrification. AB-CW3 can use proteins derived from living or dead cells of Staphylococcus and other Gram-positive bacteria as the carbon and energy source. It encodes and expresses production of a novel Lantibiotic, a class of antimicrobial peptides which have so far only been found to be produced by Gram-positive bacteria. AB-CW3 likely excretes this peptide via a type I secretion system encoded upstream of the genes for production of the Lanthipeptide. Comparison of AB-CW3's genome to 18 other Wenzhouxiangella genomes from marine, hypersaline, and soda lake habitats indicated one or two transitions from marine to soda lake environments followed by a transition of W. marina back to the oceans. Only 19 genes appear to set haloalkaliphilic Wenzhouxiangella apart from their neutrophilic relatives. As strain AB-CW3 is only distantly related to other members of the genus, we propose to provisionally name it "Wenzhouxiangella alkaliphila".
Affiliation(s)
- Dimitry Y Sorokin
- Winogradsky Institute of Microbiology, Federal Research Centre for Biotechnology, Russian Academy of Sciences, Moscow, Russia
- Department of Biotechnology, Delft University of Technology, Delft, Netherlands
| | - Damon Mosier
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
| | - Jackie K Zorz
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
| | - Xiaoli Dong
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
| | - Marc Strous
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
| |
|
44
|
Xie Y, Zhong Y, Chang J, Kwan HS. Chromosome-level de novo assembly of Coprinopsis cinerea A43mut B43mut pab1-1 #326 and genetic variant identification of mutants using Nanopore MinION sequencing. Fungal Genet Biol 2020; 146:103485. [PMID: 33253902 DOI: 10.1016/j.fgb.2020.103485] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/13/2020] [Revised: 10/22/2020] [Accepted: 11/13/2020] [Indexed: 11/26/2022]
Abstract
The homokaryotic Coprinopsis cinerea strain A43mut B43mut pab1-1 #326 is a widely used experimental model for developmental studies in mushroom-forming fungi. It can grow on defined artificial media and complete the whole lifecycle within two weeks. The mutations in mating type factors A and B result in the special feature of clamp formation and fruiting without mating. This feature allows investigations and manipulations with a homokaryotic genetic background. The current genome assembly of strain #326 was based on short-read sequencing data and was highly fragmented, leading to bias in gene annotation and downstream analyses. Here, we report a chromosome-level genome assembly of strain #326. Oxford Nanopore Technologies (ONT) MinION sequencing was used to obtain long reads. Illumina short reads were used to polish the sequences. A combined assembly yielded 13 chromosomes and a mitochondrial genome as individual scaffolds. The assembly has 15,250 annotated genes with high synteny with the C. cinerea strain Okayama-7 #130. This assembly greatly improves contiguity and annotation. It is a suitable reference for further genomic studies, especially for genetic, genomic and transcriptomic analyses using ONT long reads. Single nucleotide variants and structural variants in six mutagenized and cisplatin-screened mutants could be identified and validated. A 66 bp deletion in Ras GTPase-activating protein (RasGAP) was found in all mutants. To make better use of the ONT sequencing platform, we modified a high-molecular-weight genomic DNA isolation protocol based on magnetic beads for filamentous fungi. This study showed the use of MinION to construct a fungal reference genome and to perform downstream studies in an individual laboratory. An experimental workflow was proposed, from DNA isolation and whole genome sequencing, to genome assembly and variant calling. Our results provided solutions and parameters for fungal genomic analysis on the MinION sequencing platform.
Affiliation(s)
- Yichun Xie
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region
| | - Yiyi Zhong
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region
| | - Jinhui Chang
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region
- The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, China
| | - Hoi Shan Kwan
- School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong Special Administrative Region.
| |
|
45
|
Prodanov T, Bansal V. Sensitive alignment using paralogous sequence variants improves long-read mapping and variant calling in segmental duplications. Nucleic Acids Res 2020; 48:e114. [PMID: 33035301 PMCID: PMC7641771 DOI: 10.1093/nar/gkaa829] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/15/2020] [Revised: 08/31/2020] [Accepted: 09/22/2020] [Indexed: 02/07/2023] Open
Abstract
The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies can potentially overcome this limitation, long segmental duplications with high sequence identity pose challenges for long-read mapping. We describe a probabilistic method, DuploMap, designed to improve the accuracy of long-read mapping in segmental duplications. It analyzes reads mapped to segmental duplications using existing long-read aligners and leverages paralogous sequence variants (PSVs)—sequence differences between paralogous sequences—to distinguish between multiple alignment locations. On simulated datasets, DuploMap increased the percentage of correctly mapped reads with high confidence for multiple long-read aligners including Minimap2 (74.3–90.6%) and BLASR (82.9–90.7%) while maintaining high precision. Across multiple whole-genome long-read datasets, DuploMap aligned an additional 8–21% of the reads in segmental duplications with high confidence relative to Minimap2. Using DuploMap-aligned PacBio circular consensus sequencing reads, an additional 8.9 Mb of DNA sequence was mappable, variant calling achieved a higher F1 score and 14 713 additional variants supported by linked-read data were identified. Finally, we demonstrate that a significant fraction of PSVs in segmental duplications overlaps with variants and adversely impacts short-read variant calling.
Affiliation(s)
- Timofey Prodanov
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA
| | - Vikas Bansal
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
| |
|
46
|
Gigante CM, Yale G, Condori RE, Costa NC, Long NV, Minh PQ, Chuong VD, Tho ND, Thanh NT, Thin NX, Hanh NTH, Wambura G, Ade F, Mito O, Chuchu V, Muturi M, Mwatondo A, Hampson K, Thumbi SM, Thomae BG, de Paz VH, Meneses S, Munyua P, Moran D, Cadena L, Gibson A, Wallace RM, Pieracci EG, Li Y. Portable Rabies Virus Sequencing in Canine Rabies Endemic Countries Using the Oxford Nanopore MinION. Viruses 2020; 12:v12111255. [PMID: 33158200 PMCID: PMC7694271 DOI: 10.3390/v12111255] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 09/15/2020] [Revised: 10/21/2020] [Accepted: 10/26/2020] [Indexed: 12/18/2022] Open
Abstract
As countries with endemic canine rabies progress towards elimination by 2030, it will become necessary to employ techniques to help plan, monitor, and confirm canine rabies elimination. Sequencing can provide critical information to inform control and vaccination strategies by identifying genetically distinct virus variants that may have different host reservoir species or geographic distributions. However, many rabies testing laboratories lack the resources or expertise for sequencing, especially in remote or rural areas where human rabies deaths are highest. We developed a low-cost, high throughput rabies virus sequencing method using the Oxford Nanopore MinION portable sequencer. A total of 259 sequences were generated from diverse rabies virus isolates in public health laboratories lacking rabies virus sequencing capacity in Guatemala, India, Kenya, and Vietnam. Phylogenetic analysis provided valuable insight into rabies virus diversity and distribution in these countries and identified a new rabies virus lineage in Kenya, the first published canine rabies virus sequence from Guatemala, evidence of rabies spread across an international border in Vietnam, and importation of a rabid dog into a state working to become rabies-free in India. Taken together, our evaluation highlights the MinION's potential for low-cost, high volume sequencing of pathogens in locations with limited resources.
Affiliation(s)
- Crystal M. Gigante
- Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA; (C.M.G.); (R.E.C.); (R.M.W.); (E.G.P.)
| | - Gowri Yale
- Mission Rabies, Tonca, Panjim, Goa 403001, India;
| | - Rene Edgar Condori
- Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA; (C.M.G.); (R.E.C.); (R.M.W.); (E.G.P.)
| | - Niceta Cunha Costa
- Disease Investigation Unit, Directorate of Animal Health and Veterinary Services, Patto, Panjim, Goa 403001, India;
| | - Nguyen Van Long
- Vietnam Department of Animal Health, Hanoi 100000, Vietnam; (N.V.L.); (P.Q.M.); (V.D.C.)
| | - Phan Quang Minh
- Vietnam Department of Animal Health, Hanoi 100000, Vietnam; (N.V.L.); (P.Q.M.); (V.D.C.)
| | - Vo Dinh Chuong
- Vietnam Department of Animal Health, Hanoi 100000, Vietnam; (N.V.L.); (P.Q.M.); (V.D.C.)
| | - Nguyen Dang Tho
- National Center for Veterinary Diseases, Hanoi 100000, Vietnam;
| | - Nguyen Tat Thanh
- Sub-Department of Animal Health, Phú Thọ Province 35000, Vietnam; (N.T.T.); (N.X.T.); (N.T.H.H.)
| | - Nguyen Xuan Thin
- Sub-Department of Animal Health, Phú Thọ Province 35000, Vietnam; (N.T.T.); (N.X.T.); (N.T.H.H.)
| | - Nguyen Thi Hong Hanh
- Sub-Department of Animal Health, Phú Thọ Province 35000, Vietnam; (N.T.T.); (N.X.T.); (N.T.H.H.)
| | - Gati Wambura
- Center for Global Health Research, Kenya Medical Research Institute, Nairobi 00100, Kenya; (G.W.); (F.A.); (O.M.); (V.C.); (S.M.T.)
| | - Frederick Ade
- Center for Global Health Research, Kenya Medical Research Institute, Nairobi 00100, Kenya; (G.W.); (F.A.); (O.M.); (V.C.); (S.M.T.)
| | - Oscar Mito
- Center for Global Health Research, Kenya Medical Research Institute, Nairobi 00100, Kenya; (G.W.); (F.A.); (O.M.); (V.C.); (S.M.T.)
| | - Veronicah Chuchu
- Center for Global Health Research, Kenya Medical Research Institute, Nairobi 00100, Kenya; (G.W.); (F.A.); (O.M.); (V.C.); (S.M.T.)
- Department of Public Health, Pharmacology and Toxicology, University of Nairobi, Nairobi 00100, Kenya
| | - Mathew Muturi
- Zoonotic Disease Unit, Ministry of Health, Ministry of Agriculture, Livestock and Fisheries, Nairobi 00100, Kenya; (M.M.); (A.M.)
| | - Athman Mwatondo
- Zoonotic Disease Unit, Ministry of Health, Ministry of Agriculture, Livestock and Fisheries, Nairobi 00100, Kenya; (M.M.); (A.M.)
| | - Katie Hampson
- Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK;
| | - Samuel M. Thumbi
- Center for Global Health Research, Kenya Medical Research Institute, Nairobi 00100, Kenya; (G.W.); (F.A.); (O.M.); (V.C.); (S.M.T.)
- University of Nairobi Institute of Tropical and Infectious Diseases, Nairobi 00100, Kenya
- Paul G. Allen School for Global Animal Health, Washington State University, Pullman, WA 99164, USA
| | - Byron G. Thomae
- Ministry of Agriculture Livestock and Food, Guatemala City 01013, Guatemala;
| | - Victor Hugo de Paz
- National Health Laboratory, MSPAS, Villa Nueva 01064, Guatemala; (V.H.d.P.); (S.M.)
| | - Sergio Meneses
- National Health Laboratory, MSPAS, Villa Nueva 01064, Guatemala; (V.H.d.P.); (S.M.)
| | - Peninah Munyua
- Division of Global Health Protection, Centers for Disease Control, Nairobi 00100, Kenya;
| | - David Moran
- University del Valle de Guatemala, Guatemala City 01015, Guatemala;
| | - Loren Cadena
- Division of Global Health Protection, Centers for Disease Control, Guatemala City 01001, Guatemala;
| | - Andrew Gibson
- The Roslin Institute and The Royal (Dick) School of Veterinary Studies, Division of Genetics and Genomics, The University of Edinburgh, Easter Bush Veterinary Centre, Roslin, Midlothian EH25 9RG, UK;
| | - Ryan M. Wallace
- Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA; (C.M.G.); (R.E.C.); (R.M.W.); (E.G.P.)
| | - Emily G. Pieracci
- Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA; (C.M.G.); (R.E.C.); (R.M.W.); (E.G.P.)
| | - Yu Li
- Poxvirus and Rabies Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA; (C.M.G.); (R.E.C.); (R.M.W.); (E.G.P.)
| |
|
47
|
Georgieva D, Liu Q, Wang K, Egli D. Detection of base analogs incorporated during DNA replication by nanopore sequencing. Nucleic Acids Res 2020; 48:e88. [PMID: 32710620 PMCID: PMC7470954 DOI: 10.1093/nar/gkaa517] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/25/2019] [Revised: 05/28/2020] [Accepted: 06/05/2020] [Indexed: 01/23/2023] Open
Abstract
DNA synthesis is a fundamental requirement for cell proliferation and DNA repair, but no single method can identify the location, direction and speed of replication forks with high resolution. Mammalian cells have the ability to incorporate thymidine analogs along with the natural A, T, G and C bases during DNA synthesis, which allows for labeling of replicating or repaired DNA. Here, we demonstrate the use of the Oxford Nanopore Technologies MinION to detect 11 different thymidine analogs including CldU, BrdU, IdU as well as EdU alone or coupled to Biotin and other bulky adducts in synthetic DNA templates. We also show that the large adduct Biotin can be distinguished from the smaller analog IdU, which opens the possibility of using analog combinations to identify the location and direction of DNA synthesis. Furthermore, we detect IdU label on single DNA molecules in the genome of mouse pluripotent stem cells and using CRISPR/Cas9-mediated enrichment, determine replication rates using newly synthesized DNA strands in human mitochondrial DNA. We conclude that this novel method, termed Replipore sequencing, has the potential for on target examination of DNA replication in a wide range of biological contexts.
Affiliation(s)
- Daniela Georgieva
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY 10032, USA
- Naomi Berrie Diabetes Center, Columbia University, New York, NY 10032, USA
- Columbia Stem Cell Initiative, Columbia University, New York, NY 10032, USA
| | - Qian Liu
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Dieter Egli
- Naomi Berrie Diabetes Center, Columbia University, New York, NY 10032, USA
- Columbia Stem Cell Initiative, Columbia University, New York, NY 10032, USA
- Department of Pediatrics and Department of Obstetrics and Gynecology, Columbia University, New York, NY 10032, USA
| |
|
48
|
Latorre-Pérez A, Pascual J, Porcar M, Vilanova C. A lab in the field: applications of real-time, in situ metagenomic sequencing. Biol Methods Protoc 2020; 5:bpaa016. [PMID: 33134552 PMCID: PMC7585387 DOI: 10.1093/biomethods/bpaa016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/18/2020] [Revised: 08/07/2020] [Accepted: 08/18/2020] [Indexed: 01/18/2023] Open
Abstract
High-throughput metagenomic sequencing is considered one of the main technologies fostering the development of microbial ecology. Widely used second-generation sequencers have enabled the analysis of extremely diverse microbial communities, the discovery of novel gene functions, and the comprehension of the metabolic interconnections established among microbial consortia. However, the high cost of the sequencers and the complexity of library preparation and sequencing protocols still hamper the application of metagenomic sequencing in a vast range of real-life applications. In this context, the emergence of portable, third-generation sequencers is becoming a popular alternative for the rapid analysis of microbial communities in particular scenarios, due to their low cost, simplicity of operation, and rapid yield of results. This review discusses the main applications of real-time, in situ metagenomic sequencing developed to date, highlighting the relevance of this technology in current challenges (such as the management of global pathogen outbreaks) and in the near future of industry and clinical diagnosis.
Affiliation(s)
| | | | - Manuel Porcar
- Darwin Bioprospecting Excellence SL, Valencia, Spain
- Institute for Integrative Systems Biology, I2SysBio, University of Valencia-CSIC, Valencia, Spain
| | | |
|
49
|
Abstract
Brugia pahangi is a zoonotic parasite that is closely related to human-infecting filarial nematodes. Here, we report the nearly complete genome of Brugia pahangi, including assemblies of four autosomes and an X chromosome, with only seven gaps. The Y chromosome is still not completely assembled.
|
50
|
Dohm JC, Peters P, Stralis-Pavese N, Himmelbauer H. Benchmarking of long-read correction methods. NAR Genom Bioinform 2020; 2:lqaa037. [PMID: 33575591 PMCID: PMC7671305 DOI: 10.1093/nargab/lqaa037] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 03/29/2020] [Revised: 05/02/2020] [Accepted: 05/15/2020] [Indexed: 01/25/2023] Open
Abstract
Third-generation sequencing technologies provided by Pacific Biosciences and Oxford Nanopore Technologies generate reads with lengths on the scale of kilobase pairs. However, these reads display high error rates, and correction steps are necessary to realize their great potential in genomics and transcriptomics. Here, we compare properties of PacBio and Nanopore data and assess correction methods by Canu, MARVEL and proovread in various combinations. We found total error rates of around 13% in the raw datasets. PacBio reads showed a high rate of insertions (around 8%), whereas Nanopore reads showed similar rates for substitutions, insertions and deletions of around 4% each. In data from both technologies, the errors were uniformly distributed along reads apart from noisy 5' ends, and homopolymers appeared among the most over-represented k-mers relative to a reference. Consensus correction using read overlaps reduced error rates to about 1% when using Canu or MARVEL after patching. The lowest error rate in Nanopore data (0.45%) was achieved by applying proovread on MARVEL-patched data including Illumina short reads, and the lowest error rate in PacBio data (0.42%) was the result of Canu correction with minimap2 alignment after patching. Our study provides valuable insights and benchmarks regarding long-read data and correction methods.
Affiliation(s)
- Juliane C Dohm
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
- Philipp Peters
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
- Nancy Stralis-Pavese
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
- Heinz Himmelbauer
- Institute of Computational Biology, Department of Biotechnology, University of Life Sciences and Natural Resources, Vienna (BOKU), Muthgasse 18, 1190 Vienna, Austria
|