1
Haile S, Corbett RD, O’Neill K, Xu J, Smailus DE, Pandoh PK, Bayega A, Bala M, Chuah E, Coope RJN, Moore RA, Mungall KL, Zhao Y, Ma Y, Marra MA, Jones SJM, Mungall AJ. Adaptable and comprehensive approaches for long-read nanopore sequencing of polyadenylated and non-polyadenylated RNAs. Front Genet 2024; 15:1466338. [PMID: 39687742 PMCID: PMC11647301 DOI: 10.3389/fgene.2024.1466338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 11/11/2024] [Indexed: 12/18/2024]
Abstract
The advent of long-read (LR) sequencing technologies has provided a direct opportunity to determine the structure of transcripts, with potential for end-to-end sequencing of full-length RNAs. LR methods that have been described to date include commercial offerings from Oxford Nanopore Technologies (ONT) and Pacific Biosciences. These kits are based on selection of polyadenylated (polyA+) RNAs and/or oligo-dT priming of reverse transcription. Thus, these approaches do not allow comprehensive interrogation of the transcriptome because they exclude non-polyadenylated (polyA-) RNAs. In addition, polyA+ specificity results in 3'-biased measurements of polyA+ RNAs, especially when the RNA input is partially degraded. To address these limitations of current LR protocols, we modified two rRNA depletion protocols that have been used in short-read sequencing, one ligation-based and the other based on template-switch cDNA synthesis, so that they append ONT-specific adaptor sequences and omit any deliberate fragmentation/shearing of RNA/cDNA. Here, we present comparisons with polyA+ RNA-specific versions of the two approaches, including the ONT PCR-cDNA Barcoding kit. The rRNA depletion protocols displayed higher proportions (30%-50%) of intronic content than the polyA-specific protocols (5%-8%). In addition, the rRNA depletion protocols enabled ∼20%-50% higher detection of expressed genes. Other metrics that favoured the rRNA depletion protocols include better coverage of long transcripts, and higher accuracy and reproducibility of expression measurements. Overall, these results indicate that the rRNA depletion-based protocols described here allow the comprehensive characterization of polyadenylated and non-polyadenylated RNAs. While the resulting reads are long enough to help decipher transcript structures, future work is warranted to increase the proportion of individual reads that span transcripts end to end.
Affiliation(s)
- Simon Haile
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Richard D. Corbett
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Kieran O’Neill
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Jing Xu
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Duane E. Smailus
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Pawan K. Pandoh
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Anthony Bayega
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Miruna Bala
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Eric Chuah
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Robin J. N. Coope
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Richard A. Moore
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Karen L. Mungall
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Yongjun Zhao
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Yussanne Ma
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Marco A. Marra
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Steven J. M. Jones
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Andrew J. Mungall
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
2
Chen T, Liu Y, Song S, Bai J, Li C. Full-length transcriptome analysis of the bloom-forming dinoflagellate Akashiwo sanguinea by single-molecule real-time sequencing. Front Microbiol 2022; 13:993914. [PMID: 36325025 PMCID: PMC9618608 DOI: 10.3389/fmicb.2022.993914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022]
Abstract
The dinoflagellate Akashiwo sanguinea is a harmful algal species commonly observed in estuarine and coastal waters around the world. Harmful algal blooms (HABs) caused by this species have had serious environmental impacts in the coastal waters of China since 1998, followed by huge economic losses. However, the full-length transcriptome of A. sanguinea has not been fully explored, which hampers basic genetic and functional studies. Herein, single-molecule real-time (SMRT) sequencing technology was used to characterize the full-length transcriptome of A. sanguinea. In total, 83.03 Gb of clean SMRT sequencing reads were generated, 983,960 circular consensus sequences (CCS) with an average length of 3,061 bp were obtained, and 81.71% (804,016) of the CCS were full-length non-chimeric (FLNC) reads. Furthermore, 26,461 contigs were obtained after correction with Illumina library sequencing, of which 20,037 (75.72%) were successfully annotated in five public databases. A total of 13,441 long non-coding RNA (lncRNA) transcripts, 3,137 alternative splicing (AS) events, 514 putative transcription factor (TF) members from 23 TF families, and 4,397 simple sequence repeats (SSRs) were predicted. Our findings provide substantial insight into the gene sequence characteristics of A. sanguinea, can be used as a reference sequence resource for annotating the A. sanguinea draft genome, and will contribute to further molecular biology research on this harmful bloom-forming alga.
Affiliation(s)
- Tiantian Chen
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, China
- Key Laboratory of Marine Environment and Ecology, Ocean University of China, Qingdao, China
- Yun Liu
- CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
- Laboratory of Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Shuqun Song
- CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
- Laboratory of Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
- Jie Bai
- College of Environmental Science and Engineering, Ocean University of China, Qingdao, China
- Key Laboratory of Marine Environment and Ecology, Ocean University of China, Qingdao, China
- Caiwen Li
- CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
- Laboratory of Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
3
Sun Y, Li H. Chimeric RNAs Discovered by RNA Sequencing and Their Roles in Cancer and Rare Genetic Diseases. Genes (Basel) 2022; 13:741. [PMID: 35627126 PMCID: PMC9140685 DOI: 10.3390/genes13050741] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/09/2022] [Revised: 04/13/2022] [Accepted: 04/20/2022] [Indexed: 12/30/2022]
Abstract
Chimeric RNAs are transcripts that are generated by gene fusion and intergenic splicing events, and thus comprise nucleotide sequences from different parental genes. In the past, Northern blot analysis and RT-PCR were used to detect chimeric RNAs. However, these methods are low-throughput and can be time-consuming, labor-intensive, and cost-prohibitive. With the development of RNA-seq and transcriptome analyses over the past decade, the number of chimeric RNAs identified in cancer as well as in rare inherited diseases has increased dramatically. Chimeric RNAs may be potential diagnostic biomarkers when they are specifically expressed in cancerous cells and/or tissues. Some chimeric RNAs can also play a role in cell proliferation and cancer development, serve as tools for cancer prognosis, and reveal new insights into the cell of origin of tumors. Because it can characterize a whole transcriptome at high sequencing depth and identify intergenically spliced chimeric RNAs produced in the absence of chromosomal rearrangement, RNA sequencing has not only enhanced our ability to diagnose genetic diseases, but also provided us with a deeper understanding of these diseases. Here, we reviewed the mechanisms of chimeric RNA formation and the utility of RNA sequencing for discovering chimeric RNAs in several types of cancer and rare inherited diseases. We also discussed the diagnostic, prognostic, and therapeutic values of chimeric RNAs.
Affiliation(s)
- Yunan Sun
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Hui Li
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA