1
Dyshlovoy SA, Paigin S, Afflerbach AK, Lobermeyer A, Werner S, Schüller U, Bokemeyer C, Schuh AH, Bergmann L, von Amsberg G, Joosse SA. Applications of Nanopore sequencing in precision cancer medicine. Int J Cancer 2024. [PMID: 39031959 DOI: 10.1002/ijc.35100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 04/25/2024] [Accepted: 06/25/2024] [Indexed: 07/22/2024]
Abstract
Oxford Nanopore Technologies sequencing, also referred to as Nanopore sequencing, stands at the forefront of a revolution in clinical genetics, offering the potential for rapid, long read, and real-time DNA and RNA sequencing. This technology is currently making sequencing more accessible and affordable. In this comprehensive review, we explore its potential regarding precision cancer diagnostics and treatment. We encompass a critical analysis of clinical cases where Nanopore sequencing was successfully applied to identify point mutations, splice variants, gene fusions, epigenetic modifications, non-coding RNAs, and other pivotal biomarkers that defined subsequent treatment strategies. Additionally, we address the challenges of clinical applications of Nanopore sequencing and discuss the current efforts to overcome them.
Affiliation(s)
- Sergey A Dyshlovoy
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Oxford, UK
- Department of Oncology, Hematology and Bone Marrow Transplantation with Section Pneumology, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Stefanie Paigin
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Institute of Pathology and Neuropathology, University Hospital Tübingen, Tübingen, Germany
- Ann-Kristin Afflerbach
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Annabelle Lobermeyer
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Stefan Werner
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Ulrich Schüller
- Research Institute Children's Cancer Center Hamburg, Hamburg, Germany
- Institute for Neuropathology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Department of Paediatric Hematology and Oncology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Carsten Bokemeyer
- Department of Oncology, Hematology and Bone Marrow Transplantation with Section Pneumology, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Anna H Schuh
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Oxford, UK
- Lina Bergmann
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Gunhild von Amsberg
- Department of Oncology, Hematology and Bone Marrow Transplantation with Section Pneumology, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Martini-Klinik, Prostate Cancer Center, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Simon A Joosse
- Department of Tumor Biology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Mildred Scheel Cancer Career Center HaTriCS4, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
2
Berkovich AK, Pyshkina OA, Zorina AA, Rodin VA, Panova TV, Sergeev VG, Zvereva ME. Direct Determination of the Structure of Single Biopolymer Molecules Using Nanopore Sequencing. BIOCHEMISTRY. BIOKHIMIIA 2024; 89:S234-S248. [PMID: 38621753 DOI: 10.1134/s000629792414013x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 08/21/2023] [Accepted: 09/01/2023] [Indexed: 04/17/2024]
Abstract
This review highlights the operational principles, features, and modern aspects of the development of third-generation sequencing technology for biopolymers, focusing on nucleic acid analysis, namely the nanopore sequencing system. The basics of the method and the technical solutions used to implement it are considered, from the first works demonstrating that such systems could be built to the easy-to-handle procedure developed by Oxford Nanopore Technologies. Moreover, this review focuses on applications developed and implemented using Oxford Nanopore Technologies equipment, including whole-genome assembly, metagenomics, and direct detection of modified bases.
Affiliation(s)
- Anna K Berkovich
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia.
- Olga A Pyshkina
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Anna A Zorina
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Vladimir A Rodin
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Tatyana V Panova
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Vladimir G Sergeev
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
- Maria E Zvereva
- Faculty of Chemistry, Lomonosov Moscow State University, Moscow, 119991, Russia
3
Vachon A, Seo GE, Patel NH, Coffin CS, Marinier E, Eyras E, Osiowy C. Hepatitis B virus serum RNA transcript isoform composition and proportion in chronic hepatitis B patients by nanopore long-read sequencing. Front Microbiol 2023; 14:1233178. [PMID: 37645229 PMCID: PMC10461054 DOI: 10.3389/fmicb.2023.1233178] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/01/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023] Open
Abstract
Introduction
Serum hepatitis B virus (HBV) RNA is a promising new biomarker to manage and predict clinical outcomes of chronic hepatitis B (CHB) infection. However, the HBV serum transcriptome within encapsidated particles, which is the biomarker analyte measured in serum, remains poorly characterized. This study aimed to evaluate serum HBV RNA transcript composition and proportionality by PCR-cDNA nanopore sequencing of samples from CHB patients having varied HBV genotype (gt, A to F) and HBeAg status.
Methods
Longitudinal specimens from 3 individuals during and following pregnancy (approximately 7 months between time points) were also investigated. HBV RNA extracted from 16 serum samples obtained from 13 patients (73.3% female, 84.6% Asian) was sequenced and serum HBV RNA isoform detection and quantification were performed using three bioinformatic workflows; FLAIR, RATTLE, and a GraphMap-based workflow within the Galaxy application. A spike-in RNA variant (SIRV) control mix was used to assess run quality and coverage. The proportionality of transcript isoforms was based on total HBV reads determined by each workflow.
Results
All chosen isoform detection workflows showed high agreement in transcript proportionality and composition for most samples. HBV pregenomic RNA (pgRNA) was the most frequently observed transcript isoform (93.8% of patient samples), while other detected transcripts included pgRNA spliced variants, 3' truncated variants and HBx mRNA, depending on the isoform detection method. Spliced variants of pgRNA were primarily observed in HBV gtB, C, E, or F-infected patients, with the Sp1 spliced variant detected most frequently. Twelve other pgRNA spliced variant transcripts were identified, including 3 previously unidentified transcripts, although spliced isoform identification was very dependent on the workflow used to analyze sequence data. Longitudinal sampling among pregnant and post-partum antiviral-treated individuals showed increasing proportions of 3' truncated pgRNA variants over time.
Conclusions
This study demonstrated long-read sequencing as a promising tool for the characterization of the serum HBV transcriptome. However, further studies are needed to better understand how serum HBV RNA isoform type and proportion are linked to CHB disease progression and antiviral treatment response.
Affiliation(s)
- Alicia Vachon
- Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, MB, Canada
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Grace E. Seo
- Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, MB, Canada
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Nishi H. Patel
- Department of Medicine and Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Carla S. Coffin
- Department of Medicine and Department of Microbiology, Immunology, and Infectious Diseases, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Eric Marinier
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
- Eduardo Eyras
- EMBL Australia Partner Laboratory Network at the Australian National University, Canberra, ACT, Australia
- The John Curtin School of Medical Research, ANU College of Health and Medicine, Canberra, ACT, Australia
- Catalan Institution for Research and Advanced Studies, Barcelona, Spain
- Hospital del Mar Medical Research Institute, Barcelona, Spain
- Carla Osiowy
- Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, MB, Canada
- National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, MB, Canada
4
Le QTN, Sugi N, Yamaguchi M, Hirayama T, Kobayashi M, Suzuki Y, Kusano M, Shiba H. Morphological and metabolomics profiling of intraspecific Arabidopsis hybrids in relation to biomass heterosis. Sci Rep 2023; 13:9529. [PMID: 37308530 DOI: 10.1038/s41598-023-36618-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 06/07/2023] [Indexed: 06/14/2023] Open
Abstract
Heterosis contributes greatly to the worldwide agricultural yield. However, the molecular mechanism underlying heterosis remains unclear. This study took advantage of Arabidopsis intraspecific hybrids to identify heterosis-related metabolites. Forty-six intraspecific hybrids were used to examine parental effects on seed area and germination time. The degree of heterosis was evaluated based on biomass: combinations showing high heterosis of F1 hybrids exhibited a biomass increase from 6.1 to 44% over the better parent value (BPV), whereas that of the low- and no-heterosis hybrids ranged from -19.8 to 9.8% over the BPV. Metabolomics analyses of F1 hybrids with high heterosis and those with low heterosis suggested that changes in TCA cycle intermediates are key factors that control growth. Notably, higher fumarate/malate ratios were observed in the high heterosis F1 hybrids, suggesting they provide metabolic support associated with the increased biomass. These hybrids may produce more energy-intensive biomass by speeding up the efficiency of TCA fluxes. However, the expression levels of TCA-process-related genes in F1 hybrids were not associated with the intensity of heterosis, suggesting that the post-transcriptional or post-translational regulation of these genes may affect the productivity of the intermediates in the TCA cycle.
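As an illustrative aside to the abstract above: a percentage "over the better parent value" is conventionally computed as (F1 - BPV) / BPV × 100, where BPV is the larger of the two parental means. The sketch below assumes that convention. The function name and the sample biomass values are hypothetical and are not taken from the paper.

```python
def better_parent_heterosis(f1: float, parent_a: float, parent_b: float) -> float:
    """Percent change of the F1 value relative to the better (larger) parent.

    Assumes the conventional definition (F1 - BPV) / BPV * 100; the cited
    study may define or normalise the quantity differently.
    """
    bpv = max(parent_a, parent_b)  # better parent value
    if bpv <= 0:
        raise ValueError("better parent value must be positive")
    return (f1 - bpv) / bpv * 100.0


# Hypothetical biomass values in arbitrary units, for illustration only.
high = better_parent_heterosis(14.4, 10.0, 9.0)
low = better_parent_heterosis(8.0, 10.0, 9.0)
print(round(high, 1), round(low, 1))
```

A positive result means the hybrid outperforms its better parent. A negative result, like the lower end of the range reported for the low- and no-heterosis hybrids, means it falls short of it.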
Affiliation(s)
- Quynh Thi Ngoc Le
- Graduate School of Life and Environmental Sciences, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan
- Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Viet Nam
- Naoya Sugi
- Graduate School of Life and Environmental Sciences, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan
- Kihara Institute for Biological Research, Yokohama City University, Yokohama, Kanagawa, Japan
- Masaaki Yamaguchi
- Degree Programs in Life and Earth Sciences, Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan
- Touko Hirayama
- Degree Programs in Life and Earth Sciences, Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan
- Makoto Kobayashi
- RIKEN Center for Sustainable Resource Science, Suehiro 1-7-22, Tsurumi, Yokohama, Japan
- Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Japan
- Miyako Kusano
- Degree Programs in Life and Earth Sciences, Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan
- RIKEN Center for Sustainable Resource Science, Suehiro 1-7-22, Tsurumi, Yokohama, Japan
- Hiroshi Shiba
- Degree Programs in Life and Earth Sciences, Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Ten-Nodai, Tsukuba, Ibaraki, Japan.
- Tsukuba-Plant Innovation Research Center, University of Tsukuba, Ten-Nodai 1-1-1, Tsukuba, Ibaraki, Japan.
5
Neoantigens: promising targets for cancer therapy. Signal Transduct Target Ther 2023; 8:9. [PMID: 36604431 PMCID: PMC9816309 DOI: 10.1038/s41392-022-01270-x] [Citation(s) in RCA: 164] [Impact Index Per Article: 164.0] [Received: 10/12/2022] [Revised: 11/14/2022] [Accepted: 11/27/2022] [Indexed: 01/07/2023] Open
Abstract
Recent advances in neoantigen research have accelerated the development and regulatory approval of tumor immunotherapies, including cancer vaccines, adoptive cell therapy and antibody-based therapies, especially for solid tumors. Neoantigens are newly formed antigens generated by tumor cells as a result of various tumor-specific alterations, such as genomic mutation, dysregulated RNA splicing, disordered post-translational modification, and integrated viral open reading frames. Neoantigens are recognized as non-self and trigger an immune response that is not subject to central and peripheral tolerance. The quick identification and prediction of tumor-specific neoantigens have been made possible by the advanced development of next-generation sequencing and bioinformatic technologies. Compared to tumor-associated antigens, the highly immunogenic and tumor-specific neoantigens provide emerging targets for personalized cancer immunotherapies, and serve as prospective predictors for tumor survival prognosis and immune checkpoint blockade responses. The development of cancer therapies will be aided by understanding the mechanism underlying neoantigen-induced anti-tumor immune response and by streamlining the process of neoantigen-based immunotherapies. This review provides an overview on the identification and characterization of neoantigens and outlines the clinical applications of prospective immunotherapeutic strategies based on neoantigens. We also explore their current status, inherent challenges, and clinical translation potential.
6
Yu R, Cai D, Sun Y. AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data. Bioinformatics 2023; 39:6969105. [PMID: 36610711 PMCID: PMC9825286 DOI: 10.1093/bioinformatics/btac827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 11/24/2022] [Accepted: 12/24/2022] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome. RESULTS In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools. AVAILABILITY AND IMPLEMENTATION The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Runzhou Yu
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR 000000, China
- Dehan Cai
- Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR 000000, China
- Yanni Sun
- To whom correspondence should be addressed.
7
Dorney R, Dhungel BP, Rasko JEJ, Hebbard L, Schmitz U. Recent advances in cancer fusion transcript detection. Brief Bioinform 2022; 24:6918739. [PMID: 36527429 PMCID: PMC9851307 DOI: 10.1093/bib/bbac519] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/26/2022] [Revised: 10/11/2022] [Accepted: 10/31/2022] [Indexed: 12/23/2022] Open
Abstract
Extensive investigation of gene fusions in cancer has led to the discovery of novel biomarkers and therapeutic targets. To date, most studies have neglected chromosomal rearrangement-independent fusion transcripts and complex fusion structures such as double or triple-hop fusions, and fusion-circRNAs. In this review, we untangle fusion-related terminology and propose a classification system involving both gene and transcript fusions. We highlight the importance of RNA-level fusions and how long-read sequencing approaches can improve detection and characterization. Moreover, we discuss novel bioinformatic tools to identify fusions in long-read sequencing data and strategies to experimentally validate and functionally characterize fusion transcripts.
Affiliation(s)
- Ryley Dorney
- Department of Molecular & Cell Biology, College of Public Health, Medical & Vet Sciences, James Cook University, Douglas, QLD 4811, Australia; Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- Bijay P Dhungel
- Gene and Stem Cell Therapy Program Centenary Institute, The University of Sydney, Camperdown, NSW 2050, Australia; Faculty of Medicine & Health, The University of Sydney, Camperdown, NSW 2006, Australia; Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- John E J Rasko
- Gene and Stem Cell Therapy Program Centenary Institute, The University of Sydney, Camperdown, NSW 2050, Australia; Faculty of Medicine & Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Lionel Hebbard
- Department of Molecular & Cell Biology, College of Public Health, Medical & Vet Sciences, James Cook University, Douglas, QLD 4811, Australia; Storr Liver Centre, Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, New South Wales, Australia
- Ulf Schmitz
- Corresponding author. Ulf Schmitz, Department of Molecular and Cell Biology, College of Public Health, Medical and Vet Sciences, James Cook University, Douglas, QLD 4811, Australia. E-mail:
8
Ono Y, Hamada M, Asai K. PBSIM3: a simulator for all types of PacBio and ONT long reads. NAR Genom Bioinform 2022; 4:lqac092. [PMID: 36465498 PMCID: PMC9713900 DOI: 10.1093/nargab/lqac092] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/10/2022] [Revised: 11/02/2022] [Accepted: 11/12/2022] [Indexed: 12/03/2022] Open
Abstract
Long-read sequencers, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencers, have improved their read length and accuracy, thereby opening up unprecedented research. Many tools and algorithms have been developed to analyze long reads, and rapid progress in PacBio and ONT has further accelerated their development. Together with the development of high-throughput sequencing technologies and their analysis tools, many read simulators have been developed and effectively utilized. PBSIM is one of the popular long-read simulators. In this study, we developed PBSIM3 with three new functions: error models for long reads, multi-pass sequencing for high-fidelity read simulation and transcriptome sequencing simulation. Therefore, PBSIM3 is now able to meet a wide range of long-read simulation requirements.
Affiliation(s)
- Yukiteru Ono
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa 277-8561, Japan
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 63-520, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Institute for Medical-Oriented Structural Biology, Waseda University, 2-2, Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- Graduate School of Medicine, Nippon Medical School, 1-1-5, Sendagi, Bunkyo-ku, Tokyo, 113-8602, Japan
- Kiyoshi Asai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa 277-8561, Japan
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, 135-0064 Tokyo, Japan
9
MinION Whole-Genome Sequencing in Resource-Limited Settings: Challenges and Opportunities. CURRENT CLINICAL MICROBIOLOGY REPORTS 2022; 9:52-59. [DOI: 10.1007/s40588-022-00183-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/04/2022] [Indexed: 11/18/2022]
Abstract
Purpose of Review
The introduction of MinION whole-genome sequencing technology greatly increased and simplified complete genome sequencing in various fields of science across the globe. Sequences have been generated from complex organisms to microorganisms and are stored in genome databases that are readily accessible by researchers. Various new software for genome analysis, along with upgrades to older software packages, are being generated. New protocols are also being validated that enable WGS technology to be rapidly and increasingly used for sequencing in field settings.
Recent Findings
MinION WGS technology has been implemented in developed countries due to its advantages: portability, real-time analysis, and lower cost compared to other sequencing technologies. While these same advantages are critical in developing countries, MinION WGS technology is still under-utilized in resource-limited settings.
Summary
In this review, we look at the applications, advantages, challenges, and opportunities of using MinION WGS in resource-limited settings.
10
Leshkowitz D, Kedmi M, Fried Y, Pilzer D, Keren-Shaul H, Ainbinder E, Dassa B. Exploring differential exon usage via short- and long-read RNA sequencing strategies. Open Biol 2022; 12:220206. [PMID: 36168804 PMCID: PMC9516339 DOI: 10.1098/rsob.220206] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022] Open
Abstract
Alternative splicing produces various mRNAs, and thereby various protein products, from one gene, impacting a wide range of cellular activities. However, accurate reconstruction and quantification of full-length transcripts using short-reads is limited, due to their length. Long-reads sequencing technologies may provide a solution by sequencing full-length transcripts. We explored the use of both Illumina short-reads and two long Oxford Nanopore Technology (cDNA and Direct RNA) RNA-Seq reads for detecting global differential splicing during mouse embryonic stem cell differentiation, applying several bioinformatics strategies: gene-based, isoform-based and exon-based. We detected the strongest similarity among the sequencing platforms at the gene level compared to exon-based and isoform-based. Furthermore, the exon-based strategy discovered many differential exon usage (DEU) events, mostly in a platform-dependent manner and in non-differentially expressed genes. Thus, the platforms complemented each other in the ability to detect DEUs (i.e. long-reads exhibited an advantage in detecting DEUs at the UTRs, and short-reads detected more DEUs). Exons within 20 genes, detected in one or more platforms, were here validated by PCR, including key differentiation genes, such as Mdb3 and Aplp1. We provide an important analysis resource for discovering transcriptome changes during stem cell differentiation and insights for analysing such data.
Affiliation(s)
- Dena Leshkowitz
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- Merav Kedmi
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- Yael Fried
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- David Pilzer
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- Hadas Keren-Shaul
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- Elena Ainbinder
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
- Bareket Dassa
- Life Sciences Core Facilities, Weizmann Institute of Science, Rehovot 76100, Israel
11
Palos K, Nelson Dittrich AC, Yu L, Brock JR, Railey CE, Wu HYL, Sokolowska E, Skirycz A, Hsu PY, Gregory BD, Lyons E, Beilstein MA, Nelson ADL. Identification and functional annotation of long intergenic non-coding RNAs in Brassicaceae. THE PLANT CELL 2022; 34:3233-3260. [PMID: 35666179 PMCID: PMC9421480 DOI: 10.1093/plcell/koac166] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/10/2021] [Accepted: 05/05/2022] [Indexed: 06/01/2023]
Abstract
Long intergenic noncoding RNAs (lincRNAs) are a large yet enigmatic class of eukaryotic transcripts that can have critical biological functions. The wealth of RNA-sequencing (RNA-seq) data available for plants provides the opportunity to implement a harmonized identification and annotation effort for lincRNAs that enables cross-species functional and genomic comparisons as well as prioritization of functional candidates. In this study, we processed >24 Tera base pairs of RNA-seq data from >16,000 experiments to identify ∼130,000 lincRNAs in four Brassicaceae: Arabidopsis thaliana, Camelina sativa, Brassica rapa, and Eutrema salsugineum. We used nanopore RNA-seq, transcriptome-wide structural information, peptide data, and epigenomic data to characterize these lincRNAs and identify conserved motifs. We then used comparative genomic and transcriptomic approaches to highlight lincRNAs in our data set with sequence or transcriptional conservation. Finally, we used guilt-by-association analyses to assign putative functions to lincRNAs within our data set. We tested this approach on a subset of lincRNAs associated with germination and seed development, observing germination defects for Arabidopsis lines harboring T-DNA insertions at these loci. LincRNAs with Brassicaceae-conserved putative miRNA binding motifs, small open reading frames, or abiotic-stress modulated expression are a few of the annotations that will guide functional analyses into this cryptic portion of the transcriptome.
Affiliation(s)
- Kyle Palos
- The Boyce Thompson Institute, Cornell University, Ithaca, New York, USA
- Li’ang Yu
- The Boyce Thompson Institute, Cornell University, Ithaca, New York, USA
- Jordan R Brock
- Department of Horticulture, Michigan State University, East Lansing, Michigan, USA
- Caylyn E Railey
- The Boyce Thompson Institute, Cornell University, Ithaca, New York, USA
- Hsin-Yen Larry Wu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
- Polly Yingshan Hsu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA
- Brian D Gregory
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Eric Lyons
- The School of Plant Sciences, University of Arizona, Tucson, Arizona, USA
- Mark A Beilstein
- The School of Plant Sciences, University of Arizona, Tucson, Arizona, USA
12
Sakamoto Y, Miyake S, Oka M, Kanai A, Kawai Y, Nagasawa S, Shiraishi Y, Tokunaga K, Kohno T, Seki M, Suzuki Y, Suzuki A. Phasing analysis of lung cancer genomes using a long read sequencer. Nat Commun 2022; 13:3464. [PMID: 35710642 PMCID: PMC9203510 DOI: 10.1038/s41467-022-31133-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/14/2021] [Accepted: 06/02/2022] [Indexed: 12/14/2022] Open
Abstract
Chromosomal backgrounds of cancerous mutations still remain elusive. Here, we conduct the phasing analysis of non-small cell lung cancer specimens of 20 Japanese patients. By the combinatory use of short and long read sequencing data, we obtain long phased blocks of 834 kb in N50 length with >99% concordance rate. By analyzing the obtained phasing information, we reveal that several cancer genomes harbor regions in which mutations are unevenly distributed to either of two haplotypes. Large-scale chromosomal rearrangement events, which resemble chromothripsis events but have smaller scales, occur on only one chromosome, and these events account for the observed biased distributions. Interestingly, the events are characteristic of EGFR mutation-positive lung adenocarcinomas. Further integration of long read epigenomic and transcriptomic data reveals that haploid chromosomes are not always at equivalent transcriptomic/epigenomic conditions. Distinct chromosomal backgrounds are responsible for later cancerous aberrations in a haplotype-specific manner.
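For readers unfamiliar with the N50 statistic quoted in the abstract above, the sketch below shows the standard definition: the length L such that blocks of length L or longer together cover at least half of the total length. It is a generic illustration, not the authors' pipeline. The function name and the example block lengths are hypothetical.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of block (or read/contig) lengths.

    Blocks are visited from longest to shortest; the N50 is the length of
    the block at which the running total first reaches half the overall sum.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical phased-block lengths in kb, for illustration only.
print(n50([100, 200, 300, 400, 500]))  # 400
```

In the example, the total is 1500 kb. The two longest blocks (500 + 400 = 900 kb) are the first to reach half of that, so the N50 is 400 kb.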
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Shuhei Miyake
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Miho Oka
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Ono Pharmaceutical Co., Ltd, Ibaraki, Japan
| | - Akinori Kanai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Yosuke Kawai
- Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
| | - Satoi Nagasawa
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Yuichi Shiraishi
- Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan
| | - Katsushi Tokunaga
- Genome Medical Science Project (Toyama), National Center for Global Health and Medicine, Tokyo, Japan
| | - Takashi Kohno
- Division of Genome Biology, National Cancer Center Research Institute, Tokyo, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| |
|
13
|
Chen Z, He X. Application of third-generation sequencing in cancer research. Medical Review (Berlin, Germany) 2021; 1:150-171. [PMID: 37724303 PMCID: PMC10388785 DOI: 10.1515/mr-2021-0013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/14/2021] [Accepted: 08/09/2021] [Indexed: 09/20/2023]
Abstract
In the past several years, nanopore sequencing technology from Oxford Nanopore Technologies (ONT) and single-molecule real-time (SMRT) sequencing technology from Pacific BioSciences (PacBio) have become available to researchers and are currently being tested for cancer research. These methods offer many advantages over the most widely used high-throughput short-read sequencing approaches and allow the comprehensive analysis of transcriptomes by identifying full-length splice isoforms and several other posttranscriptional events. In addition, these platforms enable structural variation characterization at a previously unparalleled resolution and direct detection of epigenetic marks in native DNA and RNA. Here, we present a comprehensive summary of important applications of these technologies in cancer research, including the identification of complex structural variants, alternatively spliced isoforms, fusion transcript events, and exogenous RNA. Furthermore, we discuss the impact of the newly developed nanopore direct RNA sequencing (RNA-Seq) approach in advancing epitranscriptome research in cancer. Although unique challenges remain for these new single-molecule long-read methods, they will unravel many aspects of cancer genome complexity in unprecedented ways and present an encouraging outlook for continued application in an increasing number of different cancer research settings.
Affiliation(s)
- Zhiao Chen
- Fudan University Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Xianghuo He
- Fudan University Shanghai Cancer Center and Institutes of Biomedical Sciences, Fudan University, Shanghai, China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Key Laboratory of Breast Cancer in Shanghai, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
| |
|
14
|
Wang P, Liu D, Yang FH, Ge H, Zhao X, Chen HG, Du T. Identification of key gene networks controlling vernalization development characteristics of Isatis indigotica by full-length transcriptomes and gene expression profiles. Physiology and Molecular Biology of Plants: An International Journal of Functional Plant Biology 2021; 27:2679-2693. [PMID: 34975240 PMCID: PMC8703213 DOI: 10.1007/s12298-021-01110-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Revised: 11/25/2021] [Accepted: 11/29/2021] [Indexed: 06/14/2023]
Abstract
Isatis indigotica Fort., as a common Chinese medicinal raw material, will lose its medicinal value if it blooms early, so it is highly valuable to clarify the induction mechanism of the vernalization of I. indigotica at low temperature. In this study, the concentrations of soluble sugar, proline, glutathione and zeatin in two germplasms of I. indigotica with different degrees of low temperature tolerance (Y1 and Y2) were determined at 10 days, 20 days and 30 days of low-temperature treatment, and the full-length transcriptome of 24 samples was sequenced by Nanopore sequencing with Oxford Nanopore Technologies (ONT). After that, the data of transcripts involved in the vernalization of I. indigotica at low temperature were obtained, and these transcripts were identified using weighted gene co-expression network analysis (WGCNA). The results revealed the massive accumulation of soluble sugar and proline in Y1 and Y2 after low temperature induction. A total of 18,385 new transcripts, 6168 transcription factors and 470 lncRNAs were obtained. Differential expression analysis showed that gibberellin, flavonoids, fatty acids and some processes related to low temperature response were significantly enriched. Eight key transcripts were identified by WGCNA, among which ONT.14640.1, ONT.9119.1, ONT.13080.2 and ONT.16007.1 encode a flavonoid transporter, 9-cis-epoxycarotenoid dioxygenase 3 (NCED3), a growth factor gene and L-aspartate oxidase in plants, respectively. This indicates that secondary metabolites such as hormones and flavonoids play an important role in the vernalization of I. indigotica. qRT-PCR proved the reliability of the transcriptome results. These results provide important insights into the low-temperature vernalization of I. indigotica, and provide a research basis for analyzing the vernalization mechanism of I. indigotica. Supplementary information: the online version contains supplementary material available at 10.1007/s12298-021-01110-2.
Affiliation(s)
- Pan Wang
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| | - Dong Liu
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| | - Fu-Hong Yang
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
- Pingliang Academy of Agricultural Sciences, Pingliang, 744000 China
| | - Hui Ge
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| | - Xin Zhao
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| | - Hong-Gang Chen
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| | - Tao Du
- Gansu University of Chinese Medicine, Lanzhou, 730000 China
| |
|
15
|
Kuo MC, Liu SCH, Hsu YF, Wu RM. The role of noncoding RNAs in Parkinson's disease: biomarkers and associations with pathogenic pathways. J Biomed Sci 2021; 28:78. [PMID: 34794432 PMCID: PMC8603508 DOI: 10.1186/s12929-021-00775-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 07/31/2021] [Accepted: 11/04/2021] [Indexed: 02/08/2023] Open
Abstract
The discovery of various noncoding RNAs (ncRNAs) and their biological implications is a growing area in cell biology. Increasing evidence has revealed canonical and noncanonical functions of long and small ncRNAs, including microRNAs, long ncRNAs (lncRNAs), circular RNAs, PIWI-interacting RNAs, and tRNA-derived fragments. These ncRNAs have the ability to regulate gene expression and modify metabolic pathways. Thus, they may have important roles as diagnostic biomarkers or therapeutic targets in various diseases, including neurodegenerative disorders, especially Parkinson's disease. Recently, through diverse sequencing technologies and a wide variety of bioinformatic analytical tools, such as reverse transcriptase quantitative PCR, microarrays, next-generation sequencing and long-read sequencing, numerous ncRNAs have been shown to be associated with neurodegenerative disorders, including Parkinson's disease. In this review article, we will first introduce the biogenesis of different ncRNAs, including microRNAs, PIWI-interacting RNAs, circular RNAs, long noncoding RNAs, and tRNA-derived fragments. The pros and cons of the detection platforms of ncRNAs and the reproducibility of bioinformatic analytical tools will be discussed in the second part. Finally, the recent discovery of numerous PD-associated ncRNAs and their association with the diagnosis and pathophysiology of PD are reviewed, and microRNAs and long ncRNAs that are transported by exosomes in biofluids are particularly emphasized.
Affiliation(s)
- Ming-Che Kuo
- Department of Medicine, Section of Neurology, Cancer Center, National Taiwan University Hospital, Taipei, Taiwan
- Department of Neurology, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
| | - Sam Chi-Hao Liu
- Department of Neurology, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
| | - Ya-Fang Hsu
- Graduate Institute of Brain and Mind Sciences, College of Medicine, National Taiwan University, Taipei, Taiwan
| | - Ruey-Meei Wu
- Department of Neurology, National Taiwan University Hospital, College of Medicine, National Taiwan University, Taipei, Taiwan
- Graduate Institute of Brain and Mind Sciences, College of Medicine, National Taiwan University, Taipei, Taiwan
| |
|
16
|
Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 470] [Impact Index Per Article: 156.7] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
|
17
|
Comparative Analysis of PacBio and Oxford Nanopore Sequencing Technologies for Transcriptomic Landscape Identification of Penaeus monodon. Life (Basel) 2021; 11:862. [PMID: 34440606 PMCID: PMC8399832 DOI: 10.3390/life11080862] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/09/2021] [Revised: 08/07/2021] [Accepted: 08/17/2021] [Indexed: 12/16/2022] Open
Abstract
With the advantages that long-read sequencing platforms such as Pacific Biosciences (Menlo Park, CA, USA) (PacBio) and Oxford Nanopore Technologies (Oxford, UK) (ONT) can offer, various research fields such as genomics and transcriptomics can exploit their benefits. Selecting an appropriate sequencing platform is undoubtedly crucial for the success of the research outcome, thus there is a need to compare these long-read sequencing platforms and evaluate them for specific research questions. This study aims to compare the performance of PacBio and ONT platforms for transcriptomic analysis by utilizing transcriptome data from three different tissues (hepatopancreas, intestine, and gonads) of the juvenile black tiger shrimp, Penaeus monodon. We compared three important features: (i) main characteristics of the sequencing libraries and their alignment with the reference genome, (ii) transcript assembly features and isoform identification, and (iii) correlation of the quantification of gene expression levels for both platforms. Our analyses suggest that read-length bias and differences in sequencing throughput are highly influential factors when using long reads in transcriptome studies. These comparisons can provide a guideline when designing a transcriptome study utilizing these two long-read sequencing technologies.
|
18
|
De Paoli-Iseppi R, Gleeson J, Clark MB. Isoform Age - Splice Isoform Profiling Using Long-Read Technologies. Front Mol Biosci 2021; 8:711733. [PMID: 34409069 PMCID: PMC8364947 DOI: 10.3389/fmolb.2021.711733] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/19/2021] [Accepted: 07/19/2021] [Indexed: 01/12/2023] Open
Abstract
Alternative splicing (AS) of RNA is a key mechanism that results in the expression of multiple transcript isoforms from single genes and leads to an increase in the complexity of both the transcriptome and proteome. Regulation of AS is critical for the correct functioning of many biological pathways, while disruption of AS can be directly pathogenic in diseases such as cancer or cause risk for complex disorders. Current short-read sequencing technologies achieve high read depth but are limited in their ability to resolve complex isoforms. In this review we examine how long-read sequencing (LRS) technologies can address this challenge by covering the entire RNA sequence in a single read and thereby distinguish isoform changes that could impact RNA regulation or protein function. Coupling LRS with technologies such as single cell sequencing, targeted sequencing and spatial transcriptomics is producing a rapidly expanding suite of technological approaches to profile alternative splicing at the isoform level with unprecedented detail. In addition, integrating LRS with genotype now allows the impact of genetic variation on isoform expression to be determined. Recent results demonstrate the potential of these techniques to elucidate the landscape of splicing, including in tissues such as the brain where AS is particularly prevalent. Finally, we also discuss how AS can impact protein function, potentially leading to novel therapeutic targets for a range of diseases.
Affiliation(s)
- Michael B. Clark
- Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, Australia
| |
|
19
|
Sakamoto Y, Zaha S, Suzuki Y, Seki M, Suzuki A. Application of long-read sequencing to the detection of structural variants in human cancer genomes. Comput Struct Biotechnol J 2021; 19:4207-4216. [PMID: 34527193 PMCID: PMC8350331 DOI: 10.1016/j.csbj.2021.07.030] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/06/2021] [Revised: 07/20/2021] [Accepted: 07/25/2021] [Indexed: 01/02/2023] Open
Abstract
In recent years, the so-called long-read sequencing technology has had a substantial impact on various aspects of genome sciences. Here, we introduce recent studies of cancerous structural variants (SVs) using long-read sequencing technologies, namely Pacific Biosciences (PacBio) sequencers, Oxford Nanopore Technologies (ONT) sequencers, and linked-read methods. By taking advantage of long-read lengths, these technologies have enabled the precise detection of SVs, including long insertions by transposable elements, such as LINE-1. In addition to SV detection, the epigenome status (including DNA methylation and haplotype information) surrounding SV loci has also been unveiled by long-read sequencing technologies, to identify the effects of SVs. Among the various research fields in which long-read sequencing has been applied, cancer genomics has shown the most remarkable advances. In fact, many studies are beginning to shed light on the detection of SVs and the elucidation of their complex structures in various types of cancer. In the particular case of cancers, we summarize the technical limitations of the application of this technology to the analysis of clinical samples. We will introduce recent achievements from this viewpoint. However, a similar approach will be started for other applications in the near future. Therefore, by complementing the current short-read sequencing analysis, long-read sequencing should reveal the complex nature of human genomes in their healthy and disease states, which will open a new opportunity for a better understanding of disease development and for a novel strategy for drug development.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Suzuko Zaha
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
| |
|
20
|
Transcript Identification Through Long-Read Sequencing. Methods Mol Biol 2021. [PMID: 33835462 DOI: 10.1007/978-1-0716-1307-8_29] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023]
Abstract
RNA-seq using long-read sequencing, such as nanopore and SMRT (Single Molecule, Real-Time) sequencing, enabled the identification of the full-length structure of RNA molecules. Several tools for long-read RNA-seq were developed recently. In this section, we introduce an analytical pipeline of long-read RNA-seq for isoform identification and the estimation of expression levels using minimap2, TranscriptClean, and TALON. We applied this pipeline to the public direct RNA-seq data of the HAP1 and HEK293 cell lines to identify transcript isoforms which can be detected only using long-read RNA-seq data.
|
21
|
Massaiu I, Songia P, Chiesa M, Valerio V, Moschetta D, Alfieri V, Myasoedova VA, Schmid M, Cassetta L, Colombo GI, D’Alessandra Y, Poggio P. Evaluation of Oxford Nanopore MinION RNA-Seq Performance for Human Primary Cells. Int J Mol Sci 2021; 22:6317. [PMID: 34204756 PMCID: PMC8231517 DOI: 10.3390/ijms22126317] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/20/2021] [Revised: 05/17/2021] [Accepted: 06/09/2021] [Indexed: 12/12/2022] Open
Abstract
Transcript sequencing is a crucial tool for gaining a deep understanding of biological processes in diagnostic and clinical medicine. Given their potential to study novel complex eukaryotic transcriptomes, long-read sequencing technologies are able to overcome some limitations of short-read RNA-Seq approaches. Oxford Nanopore Technologies (ONT) offers the ability to generate long-read sequencing data in real time via portable protein nanopore USB devices. This work aimed to provide the user with the number of reads that should be sequenced, through the ONT MinION platform, to reach the desired accuracy level for a human cell RNA study. We sequenced three cDNA libraries prepared from poly-adenosine RNA of human primary cardiac fibroblasts. Since the runs were comparable, they were combined in a total dataset of 48 million reads. Synthetic datasets with different sizes were generated starting from the total and analyzed in terms of the number of identified genes and their expression levels. As expected, sensitivity improved with increasing sequencing depth, particularly for non-coding genes. The reliability of expression levels was assayed by (i) comparison with PCR quantifications of selected genes and (ii) implementation of a user-friendly multiplexing method in a single run.
Affiliation(s)
- Ilaria Massaiu
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Paola Songia
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Mattia Chiesa
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Vincenza Valerio
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
- Dipartimento di Medicina Clinica e Chirurgia, Università degli Studi di Napoli Federico II, 80131 Napoli, Italy
| | - Donato Moschetta
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
- Dipartimento di Scienze Farmacologiche e Biomolecolari, Università degli Studi di Milano, 20133 Milano, Italy
| | - Valentina Alfieri
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Veronika A. Myasoedova
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Michael Schmid
- Genexa AG, Dienerstrasse 7, CH-8004 Zürich, Switzerland
| | - Luca Cassetta
- The Queen’s Medical Research Council Centre for Reproductive Health, University of Edinburgh, Edinburgh EH16 4TJ, UK
| | - Gualtiero I. Colombo
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Yuri D’Alessandra
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| | - Paolo Poggio
- Centro Cardiologico Monzino IRCCS, 20131 Milan, Italy
| |
|
22
|
Halstead MM, Islas-Trejo A, Goszczynski DE, Medrano JF, Zhou H, Ross PJ. Large-Scale Multiplexing Permits Full-Length Transcriptome Annotation of 32 Bovine Tissues From a Single Nanopore Flow Cell. Front Genet 2021; 12:664260. [PMID: 34093657 PMCID: PMC8173071 DOI: 10.3389/fgene.2021.664260] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/04/2021] [Accepted: 04/06/2021] [Indexed: 12/18/2022] Open
Abstract
A comprehensive annotation of transcript isoforms in domesticated species is lacking. Especially considering that transcriptome complexity and splicing patterns are not well-conserved between species, this presents a substantial obstacle to genomic selection programs that seek to improve production, disease resistance, and reproduction. Recent advances in long-read sequencing technology have made it possible to directly extrapolate the structure of full-length transcripts without the need for transcript reconstruction. In this study, we demonstrate the power of long-read sequencing for transcriptome annotation by coupling Oxford Nanopore Technology (ONT) with large-scale multiplexing of 93 samples, comprising 32 tissues collected from adult male and female Hereford cattle. More than 30 million uniquely mapping full-length reads were obtained from a single ONT flow cell, and used to identify and characterize the expression dynamics of 99,044 transcript isoforms at 31,824 loci. Of these predicted transcripts, 21% exactly matched a reference transcript, and 61% were novel isoforms of reference genes, substantially increasing the ratio of transcript variants per gene, and suggesting that the complexity of the bovine transcriptome is comparable to that in humans. Over 7,000 transcript isoforms were extremely tissue-specific, and 61% of these were attributed to testis, which exhibited the most complex transcriptome of all interrogated tissues. Despite profiling over 30 tissues, transcription was only detected at about 60% of reference loci. Consequently, additional studies will be necessary to continue characterizing the bovine transcriptome in additional cell types, developmental stages, and physiological conditions. However, by here demonstrating the power of ONT sequencing coupled with large-scale multiplexing, the task of exhaustively annotating the bovine transcriptome - or any mammalian transcriptome - appears significantly more feasible.
Affiliation(s)
- Pablo J. Ross
- Department of Animal Science, University of California, Davis, Davis, CA, United States
| |
|
23
|
Goldsmith C, Rodríguez-Aguilera JR, El-Rifai I, Jarretier-Yuste A, Hervieu V, Raineteau O, Saintigny P, Chagoya de Sánchez V, Dante R, Ichim G, Hernandez-Vargas H. Low biological fluctuation of mitochondrial CpG and non-CpG methylation at the single-molecule level. Sci Rep 2021; 11:8032. [PMID: 33850190 PMCID: PMC8044111 DOI: 10.1038/s41598-021-87457-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/15/2020] [Accepted: 03/30/2021] [Indexed: 12/16/2022] Open
Abstract
Mammalian cytosine DNA methylation (5mC) is associated with the integrity of the genome and the transcriptional status of nuclear DNA. Due to technical limitations, it has been less clear if mitochondrial DNA (mtDNA) is methylated and whether 5mC has a regulatory role in this context. Here, we used bisulfite-independent single-molecule sequencing of native human and mouse DNA to study mitochondrial 5mC across different biological conditions. We first validated the ability of long-read nanopore sequencing to detect 5mC in CpG (5mCpG) and non-CpG (5mCpH) context in nuclear DNA at expected genomic locations (i.e. promoters, gene bodies, enhancers, and cell type-specific transcription factor binding sites). Next, using high coverage nanopore sequencing we found low levels of mtDNA CpG and CpH methylation (with several exceptions) and little variation across biological processes: differentiation, oxidative stress, and cancer. 5mCpG and 5mCpH were overall higher in tissues compared to cell lines, with small additional variation between cell lines of different origin. Despite generally low levels, global and single-base differences were found in cancer tissues compared to their adjacent counterparts, in particular for 5mCpG. In conclusion, nanopore sequencing is a useful tool for the detection of modified DNA bases on mitochondria that avoids the biases introduced by bisulfite and PCR amplification. Enhanced nanopore basecalling models will provide further resolution on the small effect sizes detected here, as well as rule out the presence of other DNA modifications such as oxidized forms of 5mC.
Affiliation(s)
- Chloe Goldsmith
- Department of Tumor Escape, Resistance and Immunity, TGF-Beta and Immuno-Regulation Team, Cancer Research Centre of Lyon (CRCL), INSERM U 1052, CNRS UMR 5286, UCBL1, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
| | - Jesús Rafael Rodríguez-Aguilera
- Department of Cellular Biology and Development, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México (UNAM), Circuito Exterior s/n, Ciudad Universitaria, Coyoacán, 04510, Mexico City, Mexico
| | - Ines El-Rifai
- Department of Tumor Escape, Resistance and Immunity, TGF-Beta and Immuno-Regulation Team, Cancer Research Centre of Lyon (CRCL), INSERM U 1052, CNRS UMR 5286, UCBL1, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
| | - Adrien Jarretier-Yuste
- Department of Tumor Escape, Resistance and Immunity, TGF-Beta and Immuno-Regulation Team, Cancer Research Centre of Lyon (CRCL), INSERM U 1052, CNRS UMR 5286, UCBL1, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
| | - Valérie Hervieu
- Department of Surgical Pathology, Hospices Civils de Lyon, Groupement Hospitalier Est, Lyon, France
| | - Olivier Raineteau
- Univ Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and Brain Research Institute U1208, Bron, France
| | - Pierre Saintigny
- Univ Lyon, Université Claude Bernard Lyon 1, INSERM 1052, CNRS 5286, Centre Léon Bérard, Centre de Recherche en Cancérologie de Lyon, Lyon, France
- Department of Translational Medicine, Centre Léon Bérard, Lyon, France
| | - Victoria Chagoya de Sánchez
- Department of Cellular Biology and Development, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México (UNAM), Circuito Exterior s/n, Ciudad Universitaria, Coyoacán, 04510, Mexico City, Mexico
| | - Robert Dante
- Dependence Receptors Cancer and Development Laboratory, Department of Signaling of Tumoral Escape, Cancer Research Center of Lyon (CRCL), Inserm U 1052, CNRS UMR 5286, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
| | - Gabriel Ichim
- Cancer Cell Death Laboratory, Part of LabEx DEVweCAN, Université de Lyon, Lyon, France
- Cancer Research Centre of Lyon (CRCL), Inserm U 1052, CNRS UMR 5286, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
| | - Hector Hernandez-Vargas
- Department of Tumor Escape, Resistance and Immunity, TGF-Beta and Immuno-Regulation Team, Cancer Research Centre of Lyon (CRCL), INSERM U 1052, CNRS UMR 5286, UCBL1, Université de Lyon, Centre Léon Bérard, 28 rue Laennec, 69373, Lyon Cedex 08, France
- Department of Translational Medicine, Centre Léon Bérard, Lyon, France
| |
|
24
|
Mitsuhashi S, Nakagawa S, Sasaki-Honda M, Sakurai H, Frith MC, Mitsuhashi H. Nanopore direct RNA sequencing detects DUX4-activated repeats and isoforms in human muscle cells. Hum Mol Genet 2021; 30:552-563. [PMID: 33693705 PMCID: PMC8120133 DOI: 10.1093/hmg/ddab063] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/27/2021] [Revised: 01/27/2021] [Accepted: 02/23/2021] [Indexed: 01/11/2023] Open
Abstract
Facioscapulohumeral muscular dystrophy (FSHD) is an inherited muscle disease caused by misexpression of the DUX4 gene in skeletal muscle. DUX4 is a transcription factor, which is normally expressed in the cleavage-stage embryo and regulates gene expression involved in early embryonic development. Recent studies revealed that DUX4 also activates the transcription of repetitive elements such as endogenous retroviruses (ERVs), mammalian apparent long terminal repeat (LTR)-retrotransposons and pericentromeric satellite repeats (Human Satellite II). DUX4-bound ERV sequences also create alternative promoters for genes or long non-coding RNAs, producing fusion transcripts. To further understand transcriptional regulation by DUX4, we performed nanopore long-read direct RNA sequencing (dRNA-seq) of human muscle cells induced by DUX4, because long reads show whole isoforms with greater confidence. We successfully detected differential expression of known DUX4-induced genes and discovered 61 differentially expressed repeat loci, which are near DUX4–ChIP peaks. We also identified 247 gene–ERV fusion transcripts, of which 216 were not reported previously. In addition, long-read dRNA-seq clearly shows that RNA splicing is a common event in DUX4-activated ERV transcripts. Long-read analysis showed non-LTR transposons including Alu elements are also transcribed from LTRs. Our findings revealed further complexity of DUX4-induced ERV transcripts. This catalogue of DUX4-activated repetitive elements may provide useful information to elucidate the pathology of FSHD. Also, our results indicate that nanopore dRNA-seq has complementary strengths to conventional short-read complementary DNA sequencing.
Affiliation(s)
- Satomi Mitsuhashi
- Department of Genomic Function and Diversity, Tokyo Medical and Dental University, Tokyo 113-8510, Japan
- Department of Human Genetics, Yokohama City University, Yokohama, Kanagawa 236-0004, Japan
| | - So Nakagawa
- Micro/Nano Technology Center, Tokai University, Hiratsuka, Kanagawa 259-1292, Japan
- Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan
| | - Mitsuru Sasaki-Honda
- Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
| | - Hidetoshi Sakurai
- Center for iPS Cell Research and Application (CiRA), Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
| | - Martin C Frith
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 135-0064, Japan
- Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8561, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 169-8555, Japan
| | - Hiroaki Mitsuhashi
- Micro/Nano Technology Center, Tokai University, Hiratsuka, Kanagawa 259-1292, Japan
- Department of Applied Biochemistry, School of Engineering, Tokai University, Hiratsuka, Kanagawa 259-1292, Japan
| |
|
25
|
Huang KK, Huang J, Wu JKL, Lee M, Tay ST, Kumar V, Ramnarayanan K, Padmanabhan N, Xu C, Tan ALK, Chan C, Kappei D, Göke J, Tan P. Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer. Genome Biol 2021; 22:44. [PMID: 33482911 PMCID: PMC7821541 DOI: 10.1186/s13059-021-02261-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/31/2020] [Accepted: 01/04/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA sequencing data with inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality. RESULTS We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which > 66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell line and subtype specific, expressed at lower levels with larger number of exons, with longer isoform/coding sequence lengths. Most novel isoforms utilize an alternate first exon, and compared to other alternative splicing categories, are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information. CONCLUSIONS Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies.
Affiliation(s)
- Kie Kyon Huang
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Jiawen Huang
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Jeanie Kar Leng Wu
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Minghui Lee
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Su Ting Tay
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Vikrant Kumar
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Kalpana Ramnarayanan
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Nisha Padmanabhan
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Chang Xu
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Angie Lay Keng Tan
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
| | - Charlene Chan
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599 Singapore
| | - Dennis Kappei
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599 Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117596 Singapore
| | - Jonathan Göke
- Genome Institute of Singapore, Singapore, 138672 Singapore
| | - Patrick Tan
- Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857 Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599 Singapore
- Genome Institute of Singapore, Singapore, 138672 Singapore
- SingHealth/Duke-NUS Institute of Precision Medicine, National Heart Centre Singapore, Singapore, 169609 Singapore
| |
|
26
|
Li N, Cai Q, Miao Q, Song Z, Fang Y, Hu B. High-Throughput Metagenomics for Identification of Pathogens in the Clinical Settings. Small Methods 2021; 5:2000792. [PMID: 33614906 PMCID: PMC7883231 DOI: 10.1002/smtd.202000792] [Citation(s) in RCA: 88] [Impact Index Per Article: 29.3] [Received: 08/31/2020] [Revised: 10/24/2020] [Indexed: 05/25/2023]
Abstract
The application of sequencing technology is shifting from research to clinical laboratories owing to rapid technological developments and substantially reduced costs. However, although thousands of microorganisms are known to infect humans, identification of the etiological agents for many diseases remains challenging as only a small proportion of pathogens are identifiable by the current diagnostic methods. These challenges are compounded by the emergence of new pathogens. Hence, metagenomic next-generation sequencing (mNGS), an agnostic, unbiased, and comprehensive method for detection and taxonomic characterization of microorganisms, has become an attractive strategy. Although many studies and case reports have confirmed the success of mNGS in improving the diagnosis, treatment, and tracking of infectious diseases, several hurdles must still be overcome. It is, therefore, imperative that practitioners and clinicians understand both the benefits and limitations of mNGS when applying it to clinical practice. Interestingly, the emerging third-generation sequencing technologies may partially offset the disadvantages of mNGS. This review mainly describes: a) the history of sequencing technology; b) various NGS technologies, common platforms, and workflows for clinical applications; c) the application of NGS in pathogen identification; d) the global expert consensus on NGS-related methods in clinical applications; and e) challenges associated with diagnostic metagenomics.
Affiliation(s)
- Na Li
- Department of Infectious Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China
| | - Qingqing Cai
- Genoxor Medical Science and Technology Inc., Zhejiang 317317, China
| | - Qing Miao
- Department of Infectious Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China
| | - Zeshi Song
- Genoxor Medical Science and Technology Inc., Zhejiang 317317, China
| | - Yuan Fang
- Genoxor Medical Science and Technology Inc., Zhejiang 317317, China
| | - Bijie Hu
- Department of Infectious Diseases, Zhongshan Hospital, Fudan University, Shanghai 200032, China
| |
|
27
|
Oka M, Xu L, Suzuki T, Yoshikawa T, Sakamoto H, Uemura H, Yoshizawa AC, Suzuki Y, Nakatsura T, Ishihama Y, Suzuki A, Seki M. Aberrant splicing isoforms detected by full-length transcriptome sequencing as transcripts of potential neoantigens in non-small cell lung cancer. Genome Biol 2021; 22:9. [PMID: 33397462 PMCID: PMC7780684 DOI: 10.1186/s13059-020-02240-8] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 08/13/2020] [Accepted: 12/14/2020] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Long-read sequencing of full-length cDNAs enables the detection of structures of aberrant splicing isoforms in cancer cells. These isoforms are occasionally translated, presented by HLA molecules, and recognized as neoantigens. This study used a long-read sequencer (MinION) to construct a comprehensive catalog of aberrant splicing isoforms in non-small-cell lung cancers, by which novel isoforms and potential neoantigens are identified. RESULTS Full-length cDNA sequencing is performed using 22 cell lines, and a total of 2021 novel splicing isoforms are identified. The protein expression of some of these isoforms is then validated by proteome analysis. Ablations of a nonsense-mediated mRNA decay (NMD) factor, UPF1, and a splicing factor, SF3B1, are found to increase the proportion of aberrant transcripts. NetMHC evaluation of the binding affinities to each type of HLA molecule reveals that some of the isoforms potentially generate neoantigen candidates. We also identify aberrant splicing isoforms in seven non-small-cell lung cancer specimens. An enzyme-linked immune absorbent spot assay indicates that approximately half the peptide candidates have the potential to activate T cell responses through their interaction with HLA molecules. Finally, we estimate the number of isoforms in The Cancer Genome Atlas (TCGA) datasets by referring to the constructed catalog and found that disruption of NMD factors is significantly correlated with the number of splicing isoforms found in the TCGA-Lung Adenocarcinoma data collection. CONCLUSIONS Our results indicate that long-read sequencing of full-length cDNAs is essential for the precise identification of aberrant transcript structures in cancer cells.
Affiliation(s)
- Miho Oka
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Ono Pharmaceutical Co., Ltd., Ibaraki, Japan
| | - Liu Xu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Toshihiro Suzuki
- General Medical Education and Research Center, Teikyo University, Tokyo, Japan
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba, Japan
| | - Toshiaki Yoshikawa
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba, Japan
| | - Hiromi Sakamoto
- Department of Clinical Genomics, National Cancer Center Research Institute, Tokyo, Japan
| | - Hayato Uemura
- Department of Molecular and Cellular BioAnalysis, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
| | - Akiyasu C. Yoshizawa
- Department of Molecular and Cellular BioAnalysis, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Tetsuya Nakatsura
- Division of Cancer Immunotherapy, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba, Japan
| | - Yasushi Ishihama
- Department of Molecular and Cellular BioAnalysis, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| |
|
28
|
Sakamoto Y, Xu L, Seki M, Yokoyama TT, Kasahara M, Kashima Y, Ohashi A, Shimada Y, Motoi N, Tsuchihara K, Kobayashi SS, Kohno T, Shiraishi Y, Suzuki A, Suzuki Y. Long-read sequencing for non-small-cell lung cancer genomes. Genome Res 2020; 30:1243-1257. [PMID: 32887687 PMCID: PMC7545141 DOI: 10.1101/gr.261941.120] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/02/2020] [Accepted: 04/09/2020] [Indexed: 12/23/2022]
Abstract
Here, we report the application of a long-read sequencer, PromethION, for analyzing human cancer genomes. We first conducted whole-genome sequencing on lung cancer cell lines. We found that it is possible to genotype known cancerous mutations, such as point mutations. We also found that long-read sequencing is particularly useful for precisely identifying and characterizing structural aberrations, such as large deletions, gene fusions, and other chromosomal rearrangements. In addition, we identified several medium-sized structural aberrations consisting of complex combinations of local duplications, inversions, and microdeletions. These complex mutations occurred even in key cancer-related genes, such as STK11, NF1, SMARCA4, and PTEN. The biological relevance of those mutations was further revealed by epigenome, transcriptome, and protein analyses of the affected signaling pathways. Such structural aberrations were also found in clinical lung adenocarcinoma specimens. Those structural aberrations were unlikely to be reliably detected by conventional short-read sequencing. Therefore, long-read sequencing may contribute to understanding the molecular etiology of patients for whom causative cancerous mutations remain unknown and therapeutic strategies are elusive.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| | - Liu Xu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| | - Toshiyuki T Yokoyama
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| | - Masahiro Kasahara
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| | - Yukie Kashima
- Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
- Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
| | - Akihiro Ohashi
- Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
| | - Yoko Shimada
- Division of Genome Biology, National Cancer Center Research Institute, Tokyo 104-0045, Japan
| | - Noriko Motoi
- Department of Pathology, National Cancer Center Hospital, Tokyo 104-0045, Japan
| | - Katsuya Tsuchihara
- Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
| | - Susumu S Kobayashi
- Division of Translational Genomics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
| | - Takashi Kohno
- Division of Genome Biology, National Cancer Center Research Institute, Tokyo 104-0045, Japan
| | - Yuichi Shiraishi
- Division of Cellular Signaling, National Cancer Center Research Institute, Tokyo 104-0045, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
- Division of Translational Informatics, Exploratory Oncology Research and Clinical Trial Center, National Cancer Center, Chiba 277-8577, Japan
| | - Yutaka Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan
| |
|
29
|
Oikonomopoulos S, Bayega A, Fahiminiya S, Djambazian H, Berube P, Ragoussis J. Methodologies for Transcript Profiling Using Long-Read Technologies. Front Genet 2020; 11:606. [PMID: 32733532 PMCID: PMC7358353 DOI: 10.3389/fgene.2020.00606] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 01/08/2020] [Accepted: 05/19/2020] [Indexed: 12/28/2022] Open
Abstract
RNA sequencing using next-generation sequencing technologies (NGS) is currently the standard approach for gene expression profiling, particularly for large-scale high-throughput studies. NGS technologies comprise high throughput, cost efficient short-read RNA-Seq, while emerging single molecule, long-read RNA-Seq technologies have enabled new approaches to study the transcriptome and its function. The emerging single molecule, long-read technologies are currently commercially available from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), while new methodologies based on short-read sequencing approaches are also being developed in order to provide long range single molecule level information, for example, the ones represented by the 10x Genomics linked read methodology. The shift toward long-read sequencing technologies for transcriptome characterization is based on current increases in throughput and decreases in cost, making these attractive for de novo transcriptome assembly, isoform expression quantification, and in-depth RNA species analysis. These types of analyses were challenging with standard short sequencing approaches, due to the complex nature of the transcriptome, which consists of variable lengths of transcripts and multiple alternatively spliced isoforms for most genes, as well as the high sequence similarity of highly abundant species of RNA, such as rRNAs. Here we aim to focus on single molecule level sequencing technologies and single-cell technologies that, combined with perturbation tools, allow the analysis of complete RNA species, whether short or long, at high resolution. In parallel, these tools have opened new ways in understanding gene functions at the tissue, network, and pathway levels, as well as their detailed functional characterization.
Analysis of the epi-transcriptome, including RNA methylation and modification and the effects of such modifications on biological systems is now enabled through direct RNA sequencing instead of classical indirect approaches. However, many difficulties and challenges remain, such as methodologies to generate full-length RNA or cDNA libraries from all different species of RNAs, not only poly-A containing transcripts, and the identification of allele-specific transcripts due to current error rates of single molecule technologies, while the bioinformatics analysis on long-read data for accurate identification of 5' and 3' UTRs is still in development.
Affiliation(s)
- Spyros Oikonomopoulos
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Anthony Bayega
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Somayyeh Fahiminiya
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Haig Djambazian
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Pierre Berube
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
| | - Jiannis Ragoussis
- McGill Genome Centre, Department of Human Genetics, McGill University, Montréal, QC, Canada
- Department of Bioengineering, McGill University, Montréal, QC, Canada
| |
|
30
|
Cui J, Shen N, Lu Z, Xu G, Wang Y, Jin B. Analysis and comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome. Plant Methods 2020; 16:85. [PMID: 32536962 PMCID: PMC7291481 DOI: 10.1186/s13007-020-00629-x] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 12/16/2019] [Accepted: 06/06/2020] [Indexed: 05/27/2023]
Abstract
BACKGROUND The number of studies using third-generation sequencing utilising Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) is rapidly increasing in many different research areas. Among them, plant full-length single-molecule transcriptome studies have mostly used PacBio sequencing, whereas ONT is rarely used. Therefore, in this study, we examined ONT RNA sequencing methods in plants. We performed a detailed evaluation of reads from PacBio, Nanopore direct cDNA (ONT Dc), and Nanopore PCR cDNA (ONT Pc) sequencing including characteristics of raw data and identification of transcripts. In addition, matched Illumina data were generated for comparison. RESULTS ONT Pc showed overall better raw data quality, whereas PacBio generated longer read lengths. In the transcriptome analysis, PacBio and ONT Pc performed similarly in transcript identification, simple sequence repeat analysis, and long non-coding RNA prediction. PacBio was superior in identifying alternative splicing events, whereas ONT Pc could estimate transcript expression levels. CONCLUSIONS This paper made a comprehensive comparison of PacBio and nanopore-based RNA sequencing of the Arabidopsis transcriptome. The results indicate that ONT Pc is more cost-effective for generating extremely long reads and can characterise the transcriptome as well as quantify transcript expression. Therefore, ONT Pc is a new cost-effective and worthwhile method for full-length single-molecule transcriptome analysis in plants.
Affiliation(s)
- Jiawen Cui
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009 China
| | - Nan Shen
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009 China
| | - Zhaogeng Lu
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009 China
| | - Guolu Xu
- Biomarker Technologies Corporation, Beijing, 101300 China
| | - Yuyao Wang
- Biomarker Technologies Corporation, Beijing, 101300 China
| | - Biao Jin
- College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009 China
| |
|
31
|
Tang AD, Soulette CM, van Baren MJ, Hart K, Hrabeta-Robinson E, Wu CJ, Brooks AN. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. Nat Commun 2020; 11:1438. [PMID: 32188845 PMCID: PMC7080807 DOI: 10.1038/s41467-020-15171-6] [Citation(s) in RCA: 223] [Impact Index Per Article: 55.8] [Received: 09/24/2018] [Accepted: 02/06/2020] [Indexed: 01/01/2023] Open
Abstract
While splicing changes caused by somatic mutations in SF3B1 are known, identifying full-length isoform changes may better elucidate the functional consequences of these mutations. We report nanopore sequencing of full-length cDNA from CLL samples with and without SF3B1 mutation, as well as normal B cell samples, giving a total of 149 million pass reads. We present FLAIR (Full-Length Alternative Isoform analysis of RNA), a computational workflow to identify high-confidence transcripts, perform differential splicing event analysis, and differential isoform analysis. Using nanopore reads, we demonstrate differential 3' splice site changes associated with SF3B1 mutation, agreeing with previous studies. We also observe a strong downregulation of intron retention events associated with SF3B1 mutation. Full-length transcript analysis links multiple alternative splicing events together and allows for better estimates of the abundance of productive versus unproductive isoforms. Our work demonstrates the potential utility of nanopore sequencing for cancer and splicing research.
Affiliation(s)
- Alison D Tang
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, 95062, USA
| | - Cameron M Soulette
- Department of Molecular Cell & Developmental Biology, University of California, Santa Cruz, CA, 95062, USA
| | - Marijke J van Baren
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, 95062, USA
| | - Kevyn Hart
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, 95062, USA
| | - Eva Hrabeta-Robinson
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, 95062, USA
| | - Catherine J Wu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Angela N Brooks
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA, 95062, USA.
| |
|
32
|
Minervini CF, Cumbo C, Orsini P, Anelli L, Zagaria A, Specchia G, Albano F. Nanopore Sequencing in Blood Diseases: A Wide Range of Opportunities. Front Genet 2020; 11:76. [PMID: 32140171 PMCID: PMC7043087 DOI: 10.3389/fgene.2020.00076] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/06/2019] [Accepted: 01/23/2020] [Indexed: 12/20/2022] Open
Abstract
The molecular pathogenesis of hematological diseases is often driven by genetic and epigenetic alterations. Next-generation sequencing has considerably increased our genomic knowledge of these disorders, becoming ever more widespread in clinical practice. In 2012 Oxford Nanopore Technologies (ONT) released the MinION, the first long-read nanopore-based sequencer, overcoming the main limits of short-read sequence generation. In recent years, several nanopore sequencing approaches have been performed in various "-omic" sciences; this review focuses on the challenge of introducing ONT devices in the hematological field, showing advantages, disadvantages and future perspectives of this technology in the precision medicine era.
Affiliation(s)
- Francesco Albano
- Department of Emergency and Organ Transplantation (D.E.T.O.), Hematology Section, University of Bari, Bari, Italy
| |
|
33
|
Santos A, van Aerle R, Barrientos L, Martinez-Urtaza J. Computational methods for 16S metabarcoding studies using Nanopore sequencing data. Comput Struct Biotechnol J 2020; 18:296-305. [PMID: 32071706 PMCID: PMC7013242 DOI: 10.1016/j.csbj.2020.01.005] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 09/05/2019] [Revised: 01/15/2020] [Accepted: 01/15/2020] [Indexed: 12/23/2022] Open
Abstract
Assessment of bacterial diversity through sequencing of 16S ribosomal RNA (16S rRNA) genes has been an approach widely used in environmental microbiology, particularly since the advent of high-throughput sequencing technologies. An additional innovation introduced by these technologies was the need to develop new strategies to manage and investigate the massive amount of sequencing data generated. This situation stimulated the rapid expansion of the field of bioinformatics with the release of new tools to be applied to the downstream analysis and interpretation of sequencing data mainly generated using Illumina technology. In recent years, a third generation of sequencing technologies has been developed and applied in parallel with, and as a complement to, the former sequencing strategies. In particular, Oxford Nanopore Technologies (ONT) introduced nanopore sequencing, which has become very popular among molecular ecologists. Nanopore technology offers a low price, portability and fast sequencing throughput. This powerful technology has been recently tested for 16S rRNA analyses showing promising results. However, compared with previous technologies, there is a scarcity of bioinformatic tools and protocols designed specifically for the analysis of Nanopore 16S sequences. Due to its notable characteristics, researchers have recently started performing assessments regarding the suitability of MinION for 16S rRNA sequencing studies, and have obtained remarkable results. Here we present a review of the state-of-the-art of MinION technology applied to microbiome studies, the current possible applications and main challenges for its use in 16S rRNA metabarcoding.
Affiliation(s)
- Andres Santos
- Applied and Molecular Biology Laboratory, Centre of Excellence in Translational Medicine, Universidad de La Frontera, Avenida Alemania 0458, 4810296 Temuco, Chile
- Scientific and Technological Bioresource Nucleus, Universidad de La Frontera, Avenida Francisco Salazar 01145, 481123 Temuco, Chile
- Centre for Environment, Fisheries and Aquaculture Science (Cefas), Barrack Road, Weymouth, Dorset DT4 8UB, UK
| | - Ronny van Aerle
- Centre for Environment, Fisheries and Aquaculture Science (Cefas), Barrack Road, Weymouth, Dorset DT4 8UB, UK
| | - Leticia Barrientos
- Applied and Molecular Biology Laboratory, Centre of Excellence in Translational Medicine, Universidad de La Frontera, Avenida Alemania 0458, 4810296 Temuco, Chile
- Scientific and Technological Bioresource Nucleus, Universidad de La Frontera, Avenida Francisco Salazar 01145, 481123 Temuco, Chile
- Centre for Environment, Fisheries and Aquaculture Science (Cefas), Barrack Road, Weymouth, Dorset DT4 8UB, UK
| | - Jaime Martinez-Urtaza
- Centre for Environment, Fisheries and Aquaculture Science (Cefas), Barrack Road, Weymouth, Dorset DT4 8UB, UK
| |
|
34
|
Xu L, Seki M. Recent advances in the detection of base modifications using the Nanopore sequencer. J Hum Genet 2020; 65:25-33. [PMID: 31602005 PMCID: PMC7087776 DOI: 10.1038/s10038-019-0679-0] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 06/17/2019] [Revised: 09/22/2019] [Accepted: 09/26/2019] [Indexed: 12/29/2022]
Abstract
DNA and RNA modifications have important functions, including the regulation of gene expression. Existing methods based on short-read sequencing for the detection of modifications show difficulty in determining the modification patterns of single chromosomes or an entire transcript sequence. Furthermore, the kinds of modifications for which detection methods are available are very limited. The Nanopore sequencer is a single-molecule, long-read sequencer that can directly sequence RNA as well as DNA. Moreover, the Nanopore sequencer detects modifications on long DNA and RNA molecules. In this review, we mainly focus on base modification detection in the DNA and RNA of mammals using the Nanopore sequencer. We summarize current studies of modifications using the Nanopore sequencer, detection tools using statistical tests or machine learning, and applications of this technology, such as analyses of open chromatin, DNA replication, and RNA metabolism.
Affiliation(s)
- Liu Xu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan
| | - Masahide Seki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan.
| |
|
35
|
Sakamoto Y, Sereewattanawoot S, Suzuki A. A new era of long-read sequencing for cancer genomics. J Hum Genet 2020; 65:3-10. [PMID: 31474751 PMCID: PMC6892365 DOI: 10.1038/s10038-019-0658-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 06/16/2019] [Revised: 07/19/2019] [Accepted: 07/21/2019] [Indexed: 02/08/2023]
Abstract
Cancer is a disease largely caused by genomic aberrations. Utilizing many rapidly emerging sequencing technologies, researchers have studied cancer genomes to understand the molecular statuses of cancer cells and to reveal their vulnerabilities, such as driver mutations or gene expression. Long-read technologies enable us to identify and characterize novel types of cancerous mutations, including complicated structural variants in haplotype resolution. In this review, we introduce three representative platforms for long-read sequencing and research trends of cancer genomics with long-read data. Further, we describe how aberrant transcriptome and epigenome statuses, namely fusion transcripts, aberrant transcript isoforms, and the phase information of DNA methylation, can be elucidated by long-read sequencers. Long-read sequencing may shed light on novel types of aberrations in cancer genomics that are being missed by conventional short-read sequencing analyses.
Affiliation(s)
- Yoshitaka Sakamoto
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan
| | - Sarun Sereewattanawoot
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan
| | - Ayako Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan.
| |
|
36
|
Sessegolo C, Cruaud C, Da Silva C, Cologne A, Dubarry M, Derrien T, Lacroix V, Aury JM. Transcriptome profiling of mouse samples using nanopore sequencing of cDNA and RNA molecules. Sci Rep 2019; 9:14908. [PMID: 31624302 PMCID: PMC6797730 DOI: 10.1038/s41598-019-51470-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 03/20/2019] [Accepted: 09/28/2019] [Indexed: 01/27/2023] Open
Abstract
Our vision of DNA transcription and splicing has changed dramatically with the introduction of short-read sequencing. These high-throughput sequencing technologies promised to unravel the complexity of any transcriptome. Generally, gene expression levels are well captured using these technologies, but there are still remaining caveats due to the limited read length and the fact that RNA molecules had to be reverse transcribed before sequencing. Oxford Nanopore Technologies has recently launched a portable sequencer which offers the possibility of sequencing long reads and, most importantly, RNA molecules. Here we generated a full mouse transcriptome from brain and liver using the Oxford Nanopore device. As a comparison, we sequenced RNA (RNA-Seq) and cDNA (cDNA-Seq) molecules using both long- and short-read technologies and tested the TeloPrime preparation kit, dedicated to the enrichment of full-length transcripts. Using spike-in data, we confirmed that expression levels are efficiently captured by cDNA-Seq using short reads. More importantly, Oxford Nanopore RNA-Seq tends to be more efficient, while cDNA-Seq appears to be more biased. We further show that the cDNA library preparation of the Nanopore protocol induces read truncation for transcripts containing internal runs of T's. This bias is marked for runs of at least 15 T's, but is already detectable for runs of at least 9 T's and therefore concerns more than 20% of expressed transcripts in mouse brain and liver. Finally, we outline that bioinformatics challenges remain ahead for quantifying at the transcript level, especially when reads are not full-length. Accurate quantification of repeat-associated genes such as processed pseudogenes also remains difficult, and we show that current mapping protocols which map reads to the genome largely over-estimate their expression, at the expense of their parent gene.
Affiliation(s)
- Camille Sessegolo
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
- EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
| | - Corinne Cruaud
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
| | - Corinne Da Silva
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
| | - Audric Cologne
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
- EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
| | - Marion Dubarry
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France
| | - Thomas Derrien
- Univ Rennes, CNRS, IGDR (Institut de génétique et développement de Rennes) - UMR 6290, F-35000, Rennes, France
| | - Vincent Lacroix
- Univ Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR5558, F-69622, Villeurbanne, France
- EPI ERABLE - Inria Grenoble, Rhône-Alpes, France
| | - Jean-Marc Aury
- Genoscope, Institut de biologie François-Jacob, Commissariat a l'Energie Atomique (CEA), Université Paris-Saclay, F-91057, Evry, France.
| |
|
37
|
Soneson C, Yao Y, Bratus-Neuenschwander A, Patrignani A, Robinson MD, Hussain S. A comprehensive examination of Nanopore native RNA sequencing for characterization of complex transcriptomes. Nat Commun 2019; 10:3359. [PMID: 31366910 PMCID: PMC6668388 DOI: 10.1038/s41467-019-11272-z] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 03/11/2019] [Accepted: 07/04/2019] [Indexed: 11/29/2022] Open
Abstract
A platform for highly parallel direct sequencing of native RNA strands was recently described by Oxford Nanopore Technologies, but despite initial efforts it remains crucial to further investigate the technology for quantification of complex transcriptomes. Here we undertake native RNA sequencing of polyA+ RNA from two human cell lines, analysing ~5.2 million aligned native RNA reads. To enable informative comparisons, we also perform relevant ONT direct cDNA- and Illumina-sequencing. We find that while native RNA sequencing does enable some of the anticipated advantages, key unexpected aspects currently hamper its performance, most notably the quite frequent inability to obtain full-length transcripts from single reads, as well as difficulties in unambiguously inferring their true transcript of origin. While characterising issues that need to be addressed when investigating more complex transcriptomes, our study highlights that with some defined improvements, native RNA sequencing could be an important addition to the mammalian transcriptomics toolbox.
Affiliation(s)
- Charlotte Soneson
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland.
- Friedrich Miescher Institute for Biomedical Research and SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Yao Yao
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland
| | | | - Andrea Patrignani
- Functional Genomics Centre Zurich, ETHZ/University of Zurich, 8057, Zurich, Switzerland
| | - Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 8057, Zurich, Switzerland.
| | - Shobbir Hussain
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK.
| |
|
38
|
Abugessaisa I, Noguchi S, Hasegawa A, Kondo A, Kawaji H, Carninci P, Kasukawa T. refTSS: A Reference Data Set for Human and Mouse Transcription Start Sites. J Mol Biol 2019; 431:2407-2422. [DOI: 10.1016/j.jmb.2019.04.045] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 12/30/2018] [Revised: 04/25/2019] [Accepted: 04/29/2019] [Indexed: 01/22/2023]
|
39
|
Zhao L, Zhang H, Kohnen MV, Prasad KVSK, Gu L, Reddy ASN. Analysis of Transcriptome and Epitranscriptome in Plants Using PacBio Iso-Seq and Nanopore-Based Direct RNA Sequencing. Front Genet 2019; 10:253. [PMID: 30949200 PMCID: PMC6438080 DOI: 10.3389/fgene.2019.00253] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Received: 10/15/2018] [Accepted: 03/06/2019] [Indexed: 12/18/2022] Open
Abstract
Nanopore sequencing from Oxford Nanopore Technologies (ONT) and Pacific BioSciences (PacBio) single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) are revolutionizing the way transcriptomes are analyzed. These methods offer many advantages over the most widely used high-throughput short-read RNA sequencing (RNA-Seq) approaches and allow a comprehensive analysis of transcriptomes in identifying full-length splice isoforms and several other post-transcriptional events. In addition, direct RNA-Seq provides valuable information about RNA modifications, which are lost during the PCR amplification step in other methods. Here, we present a comprehensive summary of important applications of these technologies in plants, including identification of complex alternative splicing (AS), full-length splice variants, fusion transcripts, and alternative polyadenylation (APA) events. Furthermore, we discuss the impact of the newly developed nanopore direct RNA-Seq in advancing epitranscriptome research in plants. Additionally, we summarize computational tools for identifying and quantifying full-length isoforms and other co/post-transcriptional events and discuss some of the limitations of these methods. Sequencing of transcriptomes using these new single-molecule long-read methods will unravel many aspects of transcriptome complexity in unprecedented ways as compared to previous short-read sequencing approaches. Analysis of plant transcriptomes with these new powerful methods that require minimum sample processing is likely to become the norm and is expected to uncover novel co/post-transcriptional gene regulatory mechanisms that control biological outcomes during plant development and in response to various stresses.
Affiliation(s)
- Liangzhen Zhao
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Hangxiao Zhang
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Markus V. Kohnen
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Kasavajhala V. S. K. Prasad
- Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, CO, United States
| | - Lianfeng Gu
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Anireddy S. N. Reddy
- Program in Cell and Molecular Biology, Department of Biology, Colorado State University, Fort Collins, CO, United States
| |
|