1
Xu D, Tang L, Zhou J, Wang F, Cao H, Huang Y, Kapranov P. Evidence for widespread existence of functional novel and non-canonical human transcripts. BMC Biol 2023; 21:271. [PMID: 38001496 PMCID: PMC10675921 DOI: 10.1186/s12915-023-01753-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 10/31/2023] [Indexed: 11/26/2023]
Abstract
BACKGROUND The fraction of functional sequence in the human genome remains a key unresolved question in biology and the subject of vigorous debate. While a plethora of studies have connected a significant fraction of human DNA to various biochemical processes, the classical definition of function requires evidence of effects on cellular or organismal fitness that such studies do not provide. Although multiple high-throughput reverse genetics screens have been developed to address this issue, they are limited to annotated genomic elements and suffer from non-specific effects, arguing for a strong need to develop additional functional genomics approaches.
RESULTS In this work, we established a high-throughput lentivirus-based insertional mutagenesis strategy as a forward genetics screening tool in aneuploid cells. Application of this approach to human cell lines in multiple phenotypic screens suggested the presence of many as yet uncharacterized functional elements in the human genome, represented at least in part by novel exons of known and novel genes. The novel transcripts containing these exons can be massively induced, up to thousands-fold, by specific stresses, and at least some can represent bi-cistronic protein-coding mRNAs.
CONCLUSIONS Altogether, these results argue that many unannotated and non-canonical human transcripts, including those that appear as aberrant splice products, have biological relevance under specific biological conditions.
Affiliation(s)
- Dongyang Xu
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Lu Tang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Junjun Zhou
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Fang Wang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Huifen Cao
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Yu Huang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
- Philipp Kapranov
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen, 361021, China
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, 361102, China
2
Xu D, Tang L, Kapranov P. Complexities of mammalian transcriptome revealed by targeted RNA enrichment techniques. Trends Genet 2023; 39:320-333. [PMID: 36681580 DOI: 10.1016/j.tig.2022.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 01/21/2023]
Abstract
Studies using highly sensitive targeted RNA enrichment methods have shown that a large portion of the human transcriptome remains to be discovered and that most of the genome is transcribed in a complex, interleaved fashion characterized by a complex web of transcripts emanating from protein coding and noncoding loci. These results resonate with those from single-cell transcriptome profiling endeavors that reveal the existence of multiple novel, cell type-specific transcripts and clearly demonstrate that our understanding of the complexities of the human transcriptome is far from being complete. Here, we review the current status of the targeted RNA enrichment techniques, their application to the discovery of novel cell type-specific transcripts, and their impact on our understanding of the human genome and transcriptome.
Affiliation(s)
- Dongyang Xu
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China
- Lu Tang
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China
- Philipp Kapranov
  - Institute of Genomics, School of Medicine, Huaqiao University, 668 Jimei Road, Xiamen 361021, China
3
Mathur R, Jha NK, Saini G, Jha SK, Shukla SP, Filipejová Z, Kesari KK, Iqbal D, Nand P, Upadhye VJ, Jha AK, Roychoudhury S, Slama P. Epigenetic factors in breast cancer therapy. Front Genet 2022; 13:886487. [PMID: 36212140 PMCID: PMC9539821 DOI: 10.3389/fgene.2022.886487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Accepted: 07/20/2022] [Indexed: 11/17/2022]
Abstract
Epigenetic modifications are heritable differences in cellular phenotype, such as altered gene expression, that arise during somatic cell divisions (and, in rare circumstances, during germline transmission) without any change to the DNA sequence. Histone alterations, polycomb/trithorax-associated proteins, short non-coding RNAs, long non-coding RNAs (lncRNAs), and DNA methylation are among the biological processes involved in epigenetic events. These modifications are intricately linked. The transcriptional potential of genes is closely conditioned by epigenetic control, which is crucial in normal growth and development. Epigenetic mechanisms transmit genomic adaptation to an environment, resulting in a specific phenotype. The purpose of this systematic review is to examine the roles of estrogen signalling, polycomb/trithorax-associated proteins, and DNA methylation in breast cancer progression, as well as epigenetic mechanisms in breast cancer therapy, with an emphasis on functionality, regulatory factors, therapeutic value, and future challenges.
Affiliation(s)
- Runjhun Mathur
  - Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, India
  - Dr. A.P.J Abdul Kalam Technical University, Lucknow, India
- Niraj Kumar Jha
  - Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, India
  - Department of Biotechnology, School of Applied and Life Sciences (SALS), Uttaranchal University, Dehradun, India
  - Department of Biotechnology Engineering and Food Technology, Chandigarh University, Mohali, India
- Gaurav Saini
  - Department of Civil Engineering, Netaji Subhas University of Technology, Delhi, India
- Saurabh Kumar Jha
  - Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, India
  - Department of Biotechnology Engineering and Food Technology, Chandigarh University, Mohali, India
- Sheo Prasad Shukla
  - Department of Civil Engineering, Rajkiya Engineering College, Banda, India
- Zita Filipejová
  - Small Animal Clinic, University of Veterinary Sciences Brno, Brno, Czechia
- Danish Iqbal
  - Department of Medical Laboratory Sciences, College of Applied Medical Sciences, Majmaah University, Al Majma'ah, Saudi Arabia
  - Health and Basic Sciences Research Center, Majmaah University, Al Majma'ah, Saudi Arabia
- Parma Nand
  - Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, India
- Vijay Jagdish Upadhye
  - Center of Research for Development (CR4D), Parul Institute of Applied Sciences (PIAS), Parul University, Vadodara, Gujarat
- Abhimanyu Kumar Jha
  - Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, India
  - *Correspondence: Abhimanyu Kumar Jha; Shubhadeep Roychoudhury
- Shubhadeep Roychoudhury
  - Department of Life Science and Bioinformatics, Assam University, Silchar, India
  - *Correspondence: Abhimanyu Kumar Jha; Shubhadeep Roychoudhury
- Petr Slama
  - Department of Animal Morphology, Physiology, and Genetics, Faculty of AgriSciences, Mendel University in Brno, Brno, Czechia
4
Mukherjee S, Detroja R, Balamurali D, Matveishina E, Medvedeva Y, Valencia A, Gorohovski A, Frenkel-Morgenstern M. Computational analysis of sense-antisense chimeric transcripts reveals their potential regulatory features and the landscape of expression in human cells. NAR Genom Bioinform 2021; 3:lqab074. [PMID: 34458728 PMCID: PMC8386243 DOI: 10.1093/nargab/lqab074] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/09/2020] [Revised: 07/02/2021] [Accepted: 08/20/2021] [Indexed: 12/11/2022]
Abstract
Many human genes are transcribed from both strands and produce sense-antisense gene pairs. Sense-antisense (SAS) chimeric transcripts are produced upon the coalescing of exons/introns from both sense and antisense transcripts of the same gene. SAS chimera was first reported in prostate cancer cells. Subsequently, numerous SAS chimeras have been reported in the ChiTaRS-2.1 database. However, the landscape of their expression in human cells and functional aspects are still unknown. We found that longer palindromic sequences are a unique feature of SAS chimeras. Structural analysis indicates that a long hairpin-like structure formed by many consecutive Watson-Crick base pairs appears because of these long palindromic sequences, which possibly play a similar role as double-stranded RNA (dsRNA), interfering with gene expression. RNA-RNA interaction analysis suggested that SAS chimeras could significantly interact with their parental mRNAs, indicating their potential regulatory features. Here, 267 SAS chimeras were mapped in RNA-seq data from 16 healthy human tissues, revealing their expression in normal cells. Evolutionary analysis suggested the positive selection favoring sense-antisense fusions that significantly impacted the evolution of their function and structure. Overall, our study provides detailed insight into the expression landscape of SAS chimeras in human cells and identifies potential regulatory features.
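The notion of a palindromic sequence used in this abstract can be illustrated with a minimal sketch. It assumes the usual nucleic-acid sense of "palindrome" (a stretch that equals its own reverse complement and can therefore fold back into a hairpin) and a DNA alphabet; the function name `longest_rc_palindrome` is hypothetical and this is not the method used in the study.

```python
# Hypothetical helper: a DNA alphabet is assumed (swap T for U to handle RNA).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def longest_rc_palindrome(seq: str) -> str:
    """Return the longest substring that equals its own reverse complement."""
    seq = seq.upper()
    best = ""
    # A reverse-complement palindrome always has even length, so we expand
    # outward from every gap between two adjacent bases.
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        while (
            left >= 0
            and right < len(seq)
            and COMPLEMENT.get(seq[left]) == seq[right]
        ):
            left -= 1
            right += 1
        candidate = seq[left + 1:right]
        if len(candidate) > len(best):
            best = candidate
    return best


if __name__ == "__main__":
    print(longest_rc_palindrome("CCGAATTCAA"))
```

The expansion runs in quadratic time in the worst case, which is adequate for single transcripts; unknown bases such as N simply stop the expansion.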
Affiliation(s)
- Sumit Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Rajesh Detroja
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Deepak Balamurali
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Elena Matveishina
  - Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119234, Russian Federation
  - Institute of Bioengineering, Research Centre of Biotechnology, Russian Academy of Sciences, Moscow 117312, Russian Federation
- Yulia A Medvedeva
  - Institute of Bioengineering, Research Centre of Biotechnology, Russian Academy of Sciences, Moscow 117312, Russian Federation
  - Department of Biomedical Physics, Moscow Institute of Technology, Dolgoprudny 141701, Russian Federation
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC), C/ Jordi Girona 29, 08034, Barcelona, Spain
  - ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain
- Alessandro Gorohovski
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
5
Mukherjee S, Heng HH, Frenkel-Morgenstern M. Emerging Role of Chimeric RNAs in Cell Plasticity and Adaptive Evolution of Cancer Cells. Cancers (Basel) 2021; 13:4328. [PMID: 34503137 PMCID: PMC8431553 DOI: 10.3390/cancers13174328] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/16/2021] [Revised: 08/22/2021] [Accepted: 08/23/2021] [Indexed: 12/12/2022]
Abstract
Gene fusions can give rise to somatic alterations in cancers. Fusion genes have the potential to create chimeric RNAs, which can generate the phenotypic diversity of cancer cells, and could be associated with novel molecular functions related to cancer cell survival and proliferation. The expression of chimeric RNAs in cancer cells might impact diverse cancer-related functions, including loss of apoptosis and cancer cell plasticity, and promote oncogenesis. Due to their recurrence in cancers and functional association with oncogenic processes, chimeric RNAs are considered biomarkers for cancer diagnosis. Several recent studies demonstrated that chimeric RNAs could give cancer cells new functionality that confers resistance to drug therapy. Therefore, targeting chimeric RNAs in drug-resistant cancers could be useful for developing precision medicine. Understanding the functional impact of chimeric RNAs in cancer cells from an evolutionary perspective will help to elucidate cancer evolution, which could provide new insight for designing more effective, personalized therapies for cancer patients.
Affiliation(s)
- Sumit Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Henry H. Heng
  - Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201, USA
  - Department of Pathology, Wayne State University School of Medicine, Detroit, MI 48201, USA
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
6
Zhao H, Chen Y, Shen C, Li L, Li Q, Tan K, Huang H, Hu G. Breakpoint mapping of a t(9;22;12) chronic myeloid leukaemia patient with e14a3 BCR-ABL1 transcript using Nanopore sequencing. J Gene Med 2020; 23:e3276. [PMID: 32949441 DOI: 10.1002/jgm.3276] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/15/2020] [Revised: 09/03/2020] [Accepted: 09/14/2020] [Indexed: 01/01/2023]
Abstract
BACKGROUND The genetic changes in chronic myeloid leukaemia (CML) have been well established, although challenges persist in cases with rare fusion transcripts or complex variant translocations. Here, we present a CML patient with an e14a3 BCR-ABL1 transcript and a t(9;22;12) variant Philadelphia (Ph) chromosome.
METHODS Cytogenetic analysis and fluorescence in situ hybridization (FISH) were performed to identify the chromosomal aberrations and gene fusions. The rare fusion transcript was verified by reverse transcription-polymerase chain reaction (RT-PCR). Breakpoints were characterized and validated using Oxford Nanopore Technologies (ONT) (Oxford, UK) and Sanger sequencing, respectively.
RESULTS The karyotype showed the translocation t(9;22;12)(q34;q11.2;q24) [20] and FISH indicated 40% positive BCR-ABL1 fusion signals. The RT-PCR suggested an e14a3-type fusion transcript. The ONT sequencing analysis identified specific positions of translocation breakpoints: chr22:23633040-chr9:133729579, chr12:121567595-chr22:24701405, which were confirmed using Sanger sequencing. The patient achieved molecular remission 3 months after imatinib therapy.
CONCLUSIONS The present study indicates Nanopore sequencing as a valid strategy, which can characterize breakpoints precisely in special clinical cases with atypical structural variations. CML patients with e14a3 transcripts may have a good clinical course in the tyrosine kinase inhibitor era, as reviewed here.
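The breakpoint notation reported in this abstract (`chrA:posA-chrB:posB`) is easy to turn into structured records. The following is a minimal sketch, assuming only that notation; the names `Breakpoint` and `parse_breakpoint` are hypothetical and do not come from the study or from any sequencing toolkit.

```python
import re
from typing import NamedTuple


class Breakpoint(NamedTuple):
    """One translocation junction joining two genomic positions."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


# Assumed notation: chr<name>:<position>-chr<name>:<position>
_PATTERN = re.compile(r"^(chr[0-9XYM]+):(\d+)-(chr[0-9XYM]+):(\d+)$")


def parse_breakpoint(text: str) -> Breakpoint:
    """Parse a string such as 'chr22:23633040-chr9:133729579'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised breakpoint: {text!r}")
    chrom_a, pos_a, chrom_b, pos_b = match.groups()
    return Breakpoint(chrom_a, int(pos_a), chrom_b, int(pos_b))


if __name__ == "__main__":
    reported = "chr22:23633040-chr9:133729579, chr12:121567595-chr22:24701405"
    for item in reported.split(","):
        print(parse_breakpoint(item))
```

Keeping positions as integers makes it straightforward to compare junctions across callers or to check them against gene coordinates, provided the same reference assembly is used throughout.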
Affiliation(s)
- Hu Zhao
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Yuan Chen
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Chanjuan Shen
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Lingshu Li
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Qingzhao Li
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Kui Tan
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Huang Huang
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
- Guoyu Hu
  - Department of Haematology, The Affiliated Zhuzhou Hospital, XiangYa Medical College, Central South University, Zhuzhou, Hunan, China
7
Amelio I, Bernassola F, Candi E. Emerging roles of long non-coding RNAs in breast cancer biology and management. Semin Cancer Biol 2020; 72:36-45. [PMID: 32619506 DOI: 10.1016/j.semcancer.2020.06.019] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 01/29/2020] [Revised: 06/08/2020] [Accepted: 06/25/2020] [Indexed: 01/01/2023]
Abstract
Breast cancer is the most common cancer in women and carries the highest cancer mortality in this population. Although treatment strategies including surgery, hormone therapy and targeted therapy have recently advanced, innovative biomarkers are needed for early detection, treatment and prognosis. An increasing number of non-coding RNAs (ncRNAs) have shown great potential as crucial players in different stages of breast cancer tumorigenesis, influencing cell death, metabolism, epithelial-mesenchymal transition (EMT), metastasis and drug resistance. Long non-coding RNAs (lncRNAs), specifically, are a class of RNA transcripts with a length greater than 200 nucleotides, which have also been shown to exert oncogenic or tumour-suppressive roles in the pathogenesis of breast cancer. LncRNAs are implicated in different molecular mechanisms by regulating gene expression and function at transcriptional, translational, and post-translational levels. Here, we aim to briefly discuss the latest knowledge regarding the key functions and molecular mechanisms of some of the most relevant lncRNAs in the pathogenesis, treatment and prognosis of breast cancer.
Affiliation(s)
- I Amelio
  - Department of Experimental Medicine, TOR, University of Rome "Tor Vergata", Rome, Italy
  - School of Life Sciences, University of Nottingham, Nottingham, UK
- F Bernassola
  - Department of Experimental Medicine, TOR, University of Rome "Tor Vergata", Rome, Italy
- E Candi
  - Department of Experimental Medicine, TOR, University of Rome "Tor Vergata", Rome, Italy
  - Istituto Dermopatico dell'Immacolata, IDI-IRCCS, Rome, Italy
8
Balamurali D, Gorohovski A, Detroja R, Palande V, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS 5.0: the comprehensive database of chimeric transcripts matched with druggable fusions and 3D chromatin maps. Nucleic Acids Res 2020; 48:D825-D834. [PMID: 31747015 PMCID: PMC7145514 DOI: 10.1093/nar/gkz1025] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/15/2019] [Revised: 10/18/2019] [Accepted: 10/26/2019] [Indexed: 12/11/2022]
Abstract
Chimeric RNA transcripts are formed when exons from two genes fuse together, often due to chromosomal translocations, transcriptional errors or trans-splicing effect. While these chimeric RNAs produce functional proteins only in certain cases, they play a significant role in disease phenotyping and progression. ChiTaRS 5.0 (http://chitars.md.biu.ac.il/) is the latest and most comprehensive chimeric transcript repository, with 111 582 annotated entries from eight species, including 23 167 known human cancer breakpoints. The database includes unique information correlating chimeric breakpoints with 3D chromatin contact maps, generated from public datasets of chromosome conformation capture techniques (Hi-C). In this update, we have added curated information on druggable fusion targets matched with chimeric breakpoints, which are applicable to precision medicine in cancers. The introduction of a new section that lists chimeric RNAs in various cell-lines is another salient feature. Finally, using text-mining techniques, novel chimeras in Alzheimer's disease, schizophrenia, dyslexia and other diseases were collected in ChiTaRS. Thus, this improved version is an extensive catalogue of chimeras from multiple species. It extends our understanding of the evolution of chimeric transcripts in eukaryotes and contributes to the analysis of 3D genome conformational changes and the functional role of chimeras in the etiopathogenesis of cancers and other complex diseases.
Affiliation(s)
- Deepak Balamurali
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Alessandro Gorohovski
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Rajesh Detroja
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Vikrant Palande
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Dorith Raviv-Shay
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Milana Frenkel-Morgenstern
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
9
Frenkel-Morgenstern M. Identification of Chimeric RNAs Using RNA-Seq Reads and Protein-Protein Interactions of Translated Chimeras. Methods Mol Biol 2020; 2079:27-40. [PMID: 31728960 DOI: 10.1007/978-1-4939-9904-0_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Chimeric RNA moieties typically consist of exons from two genes expressed from different genomic locations and produced by chromosomal translocations, trans-splicing or transcription errors. Recent advances in next-generation sequencing procedures have opened new horizons for identification of novel chimeric transcripts in various diseases in a personalized manner. Here we describe the detailed computational procedures to identify chimeric transcripts using RNA-seq reads. Moreover, we elaborate on the domain-domain co-occurrence method to detect alterations in chimeric protein-protein interaction (ChiPPI) networks produced by chimeric RNA that are translated to chimeric proteins.
10
Chuang TJ, Chen YJ, Chen CY, Mai TL, Wang YD, Yeh CS, Yang MY, Hsiao YT, Chang TH, Kuo TC, Cho HH, Shen CN, Kuo HC, Lu MY, Chen YH, Hsieh SC, Chiang TW. Integrative transcriptome sequencing reveals extensive alternative trans-splicing and cis-backsplicing in human cells. Nucleic Acids Res 2019; 46:3671-3691. [PMID: 29385530 PMCID: PMC6283421 DOI: 10.1093/nar/gky032] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 08/24/2016] [Accepted: 01/13/2018] [Indexed: 01/16/2023]
Abstract
Transcriptionally non-co-linear (NCL) transcripts can originate from trans-splicing (trans-spliced RNA; 'tsRNA') or cis-backsplicing (circular RNA; 'circRNA'). While numerous circRNAs have been detected in various species, tsRNAs remain largely uninvestigated. Here, we utilize integrative transcriptome sequencing of poly(A)- and non-poly(A)-selected RNA-seq data from diverse human cell lines to distinguish between tsRNAs and circRNAs. We identified 24,498 NCL events and found that a considerable proportion (20-35%) of them arise from both tsRNAs and circRNAs, representing extensive alternative trans-splicing and cis-backsplicing in human cells. We show that sequence generalities of exon circularization are also observed in tsRNAs. Recapitulation of NCL RNAs further shows that inverted Alu repeats can simultaneously promote the formation of tsRNAs and circRNAs. However, tsRNAs and circRNAs exhibit quite different, or even opposite, expression patterns, in terms of correlation with the expression of their co-linear counterparts, expression breadth/abundance, transcript stability, and subcellular localization preference. These results indicate that tsRNAs and circRNAs may play different regulatory roles and analysis of NCL events should take the joint effects of different NCL-splicing types and joint effects of multiple NCL events into consideration. This study describes the first transcriptome-wide analysis of trans-splicing and cis-backsplicing, expanding our understanding of the complexity of the human transcriptome.
Affiliation(s)
- Trees-Juen Chuang
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University, Taipei 10617 & Academia Sinica, Taipei 11529, Taiwan
- Yen-Ju Chen
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University, Taipei 10617 & Academia Sinica, Taipei 11529, Taiwan
- Chia-Ying Chen
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Te-Lun Mai
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Yi-Da Wang
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Chung-Shu Yeh
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
  - Institute of Biochemistry and Molecular Biology, National Yang-Ming University, Taipei 11221, Taiwan
- Min-Yu Yang
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Yu-Ting Hsiao
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Tzu-Chien Kuo
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Hsin-Hua Cho
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Chia-Ning Shen
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Hung-Chih Kuo
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Mei-Yeh Lu
  - High Throughput Genomics Core, Biodiversity Research Center, Academia Sinica, Taipei 11529, Taiwan
- Yi-Hua Chen
  - High Throughput Genomics Core, Biodiversity Research Center, Academia Sinica, Taipei 11529, Taiwan
- Shan-Chi Hsieh
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Tai-Wei Chiang
  - Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
11
Chen CY, Chuang TJ. Comment on "A comprehensive overview and evaluation of circular RNA detection tools". PLoS Comput Biol 2019; 15:e1006158. [PMID: 31150384 PMCID: PMC6544197 DOI: 10.1371/journal.pcbi.1006158] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/27/2017] [Accepted: 03/17/2018] [Indexed: 11/18/2022]
Affiliation(s)
- Chia-Ying Chen
  - Genomics Research Center, Academia Sinica, Taipei, Taiwan
12
Pervouchine DD. Circular exonic RNAs: When RNA structure meets topology. Biochim Biophys Acta Gene Regul Mech 2019; 1862:194384. [PMID: 31102674 DOI: 10.1016/j.bbagrm.2019.05.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/22/2019] [Revised: 05/08/2019] [Accepted: 05/08/2019] [Indexed: 12/12/2022]
Abstract
Although RNA circularization was first documented in the 1990s, the extent to which it occurs was not known until recent advances in high-throughput sequencing enabled the widespread identification of circular RNAs (circRNAs). Despite this, many aspects of circRNA biogenesis, structure, and function yet remain obscure. This review focuses on circular exonic RNAs, a subclass of circRNAs that are generated through backsplicing. Here, I hypothesize that RNA secondary structure can be the common factor that promotes both exon skipping and spliceosomal RNA circularization, and that backsplicing of double-stranded regions could generate topologically linked circRNA molecules. CircRNAs manifest themselves by the presence of tail-to-head exon junctions, which were previously attributed to post-transcriptional exon permutation and repetition. I revisit these observations and argue that backsplicing does not automatically imply RNA circularization because tail-to-head exon junctions give only local information about transcript architecture and, therefore, they are in principle insufficient to determine globally circular topology. This article is part of a Special Issue entitled: RNA structure and splicing regulation edited by Francisco Baralle, Ravindra Singh and Stefan Stamm.
Affiliation(s)
- Dmitri D Pervouchine
  - Skolkovo Institute of Science and Technology, 3 Nobel St, Moscow 143026, Russia
  - Faculty of Bioengineering and Bioinformatics, Moscow State University, Leninskiye Gory 1-73, Moscow 119234, Russia
13
Chen CY, Chuang TJ. NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors. BMC Bioinformatics 2019; 20:3. [PMID: 30606103 PMCID: PMC6318855 DOI: 10.1186/s12859-018-2589-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/22/2017] [Accepted: 12/21/2018] [Indexed: 01/05/2023]
Abstract
BACKGROUND Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and have detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for assessing NCL events identified by existing tools has not been available.
RESULTS We developed NCLcomparator, a software tool for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examines whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCLscore) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCLscore values than the other events, indicating that an NCL event with a higher NCLscore has a higher level of reliability. To help select highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites.
CONCLUSION NCLcomparator provides useful guidelines, requiring as input only the NCL events identified by various detectors and the corresponding paired-end RNA-seq data, to help users select potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator.
Affiliation(s)
- Chia-Ying Chen
- Genomics Research Center, Academia Sinica, Taipei, 11529 Taiwan
14
He Y, Yuan C, Chen L, Lei M, Zellmer L, Huang H, Liao DJ. Transcriptional-Readthrough RNAs Reflect the Phenomenon of "A Gene Contains Gene(s)" or "Gene(s) within a Gene" in the Human Genome, and Thus Are Not Chimeric RNAs. Genes (Basel) 2018; 9:E40. [PMID: 29337901 PMCID: PMC5793191 DOI: 10.3390/genes9010040] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 11/29/2017] [Revised: 12/29/2017] [Accepted: 01/07/2018] [Indexed: 02/06/2023] Open
Abstract
Tens of thousands of chimeric RNAs, i.e., RNAs with sequences of two genes, have been identified in human cells. Most of them are formed by two neighboring genes on the same chromosome and are considered to be derived via transcriptional readthrough, but a true readthrough event still awaits more evidence and trans-splicing that joins two transcripts together remains as a possible mechanism. We regard those genomic loci that are transcriptionally read through as unannotated genes, because their transcriptional and posttranscriptional regulations are the same as those of already-annotated genes, including fusion genes formed due to genetic alterations. Therefore, readthrough RNAs and fusion-gene-derived RNAs are not chimeras. Only those two-gene RNAs formed at the RNA level, likely via trans-splicing, without corresponding genes as genomic parents, should be regarded as authentic chimeric RNAs. However, since in human cells, procedural and mechanistic details of trans-splicing have never been disclosed, we doubt the existence of trans-splicing. Therefore, there are probably no authentic chimeras in humans, after readthrough and fusion-gene derived RNAs are all put back into the group of ordinary RNAs. Therefore, it should be further determined whether in human cells all two-neighboring-gene RNAs are derived from transcriptional readthrough and whether trans-splicing truly exists.
Affiliation(s)
- Yan He
  - Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang 550004, Guizhou, China.
- Chengfu Yuan
  - Department of Biochemistry, China Three Gorges University, Yichang City 443002, Hubei, China.
- Lichan Chen
  - Hormel Institute, University of Minnesota, Austin, MN 55912, USA.
- Mingjuan Lei
  - Hormel Institute, University of Minnesota, Austin, MN 55912, USA.
- Lucas Zellmer
  - Masonic Cancer Center, University of Minnesota, 435 E. River Road, Minneapolis, MN 55455, USA.
- Hai Huang
  - School of Clinical Laboratory Science, Guizhou Medical University, Guiyang 550004, Guizhou, China.
- Dezhong Joshua Liao
  - Key Lab of Endemic and Ethnic Diseases of the Ministry of Education of China in Guizhou Medical University, Guiyang 550004, Guizhou, China.
  - Department of Pathology, Guizhou Medical University Hospital, Guiyang 550004, Guizhou, China.
15
Pintarelli G, Dassano A, Cotroneo CE, Galvan A, Noci S, Piazza R, Pirola A, Spinelli R, Incarbone M, Palleschi A, Rosso L, Santambrogio L, Dragani TA, Colombo F. Read-through transcripts in normal human lung parenchyma are down-regulated in lung adenocarcinoma. Oncotarget 2017; 7:27889-98. [PMID: 27058892 PMCID: PMC5053695 DOI: 10.18632/oncotarget.8556] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/17/2015] [Accepted: 02/18/2016] [Indexed: 12/26/2022] Open
Abstract
Read-through transcripts result from the continuous transcription of adjacent, similarly oriented genes, with the splicing out of the intergenic region. They have been found in several neoplastic and normal tissues, but their pathophysiological significance is unclear. We used high-throughput sequencing of cDNA fragments (RNA-Seq) to identify read-through transcripts in the non-involved lung tissue of 64 surgically treated lung adenocarcinoma patients. A total of 52 distinct read-through species was identified, with 24 patients having at least one read-through event, up to a maximum of 17 such transcripts in one patient. Sanger sequencing validated 28 of these transcripts and identified an additional 15, for a total of 43 distinct read-through events involving 35 gene pairs. Expression levels of 10 validated read-through transcripts were measured by quantitative PCR in pairs of matched non-involved lung tissue and lung adenocarcinoma tissue from 45 patients. Higher expression levels were observed in normal lung tissue than in the tumor counterpart, with median relative quantification ratios between normal and tumor varying from 1.90 to 7.78; the difference was statistically significant (P < 0.001, Wilcoxon's signed-rank test for paired samples) for eight transcripts: ELAVL1–TIMM44, FAM162B–ZUFSP, IFNAR2–IL10RB, INMT–FAM188B, KIAA1841–C2orf74, NFATC3–PLA2G15, SIRPB1–SIRPD, and SHANK3–ACR. This report documents the presence of read-through transcripts in apparently normal lung tissue, with inter-individual differences in patterns and abundance. It also shows their down-regulation in tumors, suggesting that these chimeric transcripts may function as tumor suppressors in lung tissue.
Affiliation(s)
- Giulia Pintarelli
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
- Alice Dassano
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
- Chiara E Cotroneo
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
  - Present Address: UCD School of Biomolecular and Biomedical Science, University College Dublin, Belfield, Dublin, Ireland
- Antonella Galvan
  - Formerly, Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
- Sara Noci
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
- Rocco Piazza
  - Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
  - Hematology and Clinical Research Unit, San Gerardo Hospital, Monza, Italy
- Alessandra Pirola
  - Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Roberta Spinelli
  - Formerly, Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
- Matteo Incarbone
  - Department of Surgery, San Giuseppe Hospital, Multimedica, Milan, Italy
- Alessandro Palleschi
  - Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
- Lorenzo Rosso
  - Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
- Luigi Santambrogio
  - Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
- Tommaso A Dragani
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
- Francesca Colombo
  - Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
16
Ledda F, Paratcha G. Mechanisms regulating dendritic arbor patterning. Cell Mol Life Sci 2017; 74:4511-4537. [PMID: 28735442 PMCID: PMC11107629 DOI: 10.1007/s00018-017-2588-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 12/04/2016] [Revised: 06/14/2017] [Accepted: 07/06/2017] [Indexed: 12/17/2022]
Abstract
The nervous system is populated by diverse types of neurons, each of which has dendritic trees with strikingly different morphologies. These neuron-specific morphologies determine how dendritic trees integrate thousands of synaptic inputs to generate different firing properties. To ensure proper neuronal function and connectivity, it is necessary that dendrite patterns are precisely controlled and coordinated with synaptic activity. Here, we summarize the molecular and cellular mechanisms that regulate the formation of cell type-specific dendrite patterns during development. We focus on different aspects of vertebrate dendrite patterning that are particularly important in determining neuronal function, such as the shape, branching, orientation and size of the arbors as well as the development of dendritic spine protrusions that receive excitatory inputs and compartmentalize postsynaptic responses. Additionally, we briefly comment on the implications of aberrant dendritic morphology for nervous system disease.
Affiliation(s)
- Fernanda Ledda
  - Division of Molecular and Cellular Neuroscience, Institute of Cell Biology and Neuroscience (IBCN)-CONICET, School of Medicine, University of Buenos Aires (UBA), Paraguay 2155, 3rd Floor, CABA, 1121, Buenos Aires, Argentina
- Gustavo Paratcha
  - Division of Molecular and Cellular Neuroscience, Institute of Cell Biology and Neuroscience (IBCN)-CONICET, School of Medicine, University of Buenos Aires (UBA), Paraguay 2155, 3rd Floor, CABA, 1121, Buenos Aires, Argentina
17
Rufflé F, Audoux J, Boureux A, Beaumeunier S, Gaillard JB, Bou Samra E, Megarbane A, Cassinat B, Chomienne C, Alves R, Riquier S, Gilbert N, Lemaitre JM, Bacq-Daian D, Bougé AL, Philippe N, Commes T. New chimeric RNAs in acute myeloid leukemia. F1000Res 2017; 6. [PMID: 29623188 PMCID: PMC5861515 DOI: 10.12688/f1000research.11352.2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 12/05/2017] [Indexed: 12/24/2022] Open
Abstract
Background: High-throughput next generation sequencing (NGS) technologies enable the detection of biomarkers used for tumor classification, disease monitoring and cancer therapy. Whole-transcriptome analysis using RNA-seq is important, not only as a means of understanding the mechanisms responsible for complex diseases but also to efficiently identify novel genes/exons, splice isoforms, RNA editing, allele-specific mutations, differential gene expression and fusion-transcripts or chimeric RNA (chRNA). Methods: We used Crac, a tool that uses genomic locations and local coverage to classify biological events and directly infer splice and chimeric junctions within a single read. Crac’s algorithm extracts transcriptional chimeric events irrespective of annotation with a high sensitivity, and CracTools was used to aggregate, annotate and filter the chRNA reads. The selected chRNA candidates were validated by real time PCR and sequencing. In order to check the tumor specific expression of chRNA, we analyzed a publicly available dataset using a new tag search approach. Results: We present data related to acute myeloid leukemia (AML) RNA-seq analysis. We highlight novel biological cases of chRNA, in addition to previously well characterized leukemia chRNA. We have identified and validated 17 chRNAs among 3 AML patients: 10 from an AML patient with a translocation between chromosomes 15 and 17 (AML-t(15;17)), 4 from a patient with normal karyotype (AML-NK), and 3 from a patient with a chromosome 16 inversion (AML-inv16). The new fusion transcripts can be classified into four groups according to the exon organization. Conclusions: All groups suggest complex but distinct synthesis mechanisms involving either collinear exons of different genes, non-collinear exons, or exons of different chromosomes. Finally, we check tumor-specific expression in a larger RNA-seq AML cohort and identify new AML biomarkers that could improve diagnosis and prognosis of AML.
Affiliation(s)
- Florence Rufflé
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Jerome Audoux
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Anthony Boureux
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Sacha Beaumeunier
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Elias Bou Samra
  - Université Paris Sud, Université Paris-Saclay, Orsay, France
  - Institut Curie, PSL Research University, Paris, France
- Bruno Cassinat
  - Laboratoire de Biologie Cellulaire, Hôpital Saint-Louis, Assistance publique - Hôpitaux de Paris (AP-HP), Paris, France
- Christine Chomienne
  - Laboratoire de Biologie Cellulaire, Hôpital Saint-Louis, Assistance publique - Hôpitaux de Paris (AP-HP), Paris, France
  - Hôpital Saint-Louis, Université Paris Diderot, INSERM UMRS 1131, Paris, France
- Ronnie Alves
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Instituto Tecnológico Vale, Nazaré, Belém, PA, Brazil
- Sebastien Riquier
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Nicolas Gilbert
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Jean-Marc Lemaitre
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Anne Laure Bougé
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Nicolas Philippe
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
- Therese Commes
  - Institut de Biologie Computationnelle, Université Montpellier, Montpellier, France
  - Institut de Médecine Régénératrice et de Biothérapie, INSERM U1183, CHU Montpellier, Montpellier, France
18
Chwalenia K, Facemire L, Li H. Chimeric RNAs in cancer and normal physiology. WILEY INTERDISCIPLINARY REVIEWS-RNA 2017; 8. [DOI: 10.1002/wrna.1427] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 12/07/2016] [Revised: 04/27/2017] [Accepted: 04/28/2017] [Indexed: 12/20/2022]
Affiliation(s)
- Katarzyna Chwalenia
  - Department of Pathology, School of Medicine; University of Virginia; Charlottesville VA USA
- Loryn Facemire
  - Department of Pathology, School of Medicine; University of Virginia; Charlottesville VA USA
- Hui Li
  - Department of Pathology, School of Medicine; University of Virginia; Charlottesville VA USA
  - Department of Biochemistry and Molecular Genetics, School of Medicine; University of Virginia; Charlottesville VA USA
19
Hoff AM, Johannessen B, Alagaratnam S, Zhao S, Nome T, Løvf M, Bakken AC, Hektoen M, Sveen A, Lothe RA, Skotheim RI. Novel RNA variants in colorectal cancers. Oncotarget 2017; 6:36587-602. [PMID: 26474385 PMCID: PMC4742197 DOI: 10.18632/oncotarget.5500] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/01/2015] [Accepted: 09/30/2015] [Indexed: 01/03/2023] Open
Abstract
With an annual estimated incidence of 1.4 million, and a five-year survival rate of 60%, colorectal cancer (CRC) is a major clinical burden. To identify novel RNA variants in CRC, we analyzed exon-level microarray expression data from a cohort of 202 CRCs. We nominated 25 genes with increased expression of their 3′ parts in at least one cancer sample each. To efficiently investigate underlying transcript structures, we developed an approach using rapid amplification of cDNA ends followed by high throughput sequencing (RACE-seq). RACE products from the targeted genes in 23 CRC samples were pooled together and sequenced. We identified VWA2-TCF7L2, DHX35-BPIFA2 and CASZ1-MASP2 as private fusion events, and novel transcript structures for 17 of the 23 other candidate genes. The high-throughput approach facilitated identification of CRC specific RNA variants. These include a recurrent read-through fusion transcript between KLK8 and KLK7, and a splice variant of S100A2. Both of these were overrepresented in CRC tissue and cell lines from external RNA-seq datasets.
Affiliation(s)
- Andreas M Hoff
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Bjarne Johannessen
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Sharmini Alagaratnam
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Sen Zhao
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Torfinn Nome
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Marthe Løvf
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Anne C Bakken
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Merete Hektoen
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Anita Sveen
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Ragnhild A Lothe
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
- Rolf I Skotheim
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Norwegian Radium Hospital, Oslo, Norway
  - KG Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
  - Centre for Cancer Biomedicine, University of Oslo, Oslo, Norway
20
Rodríguez-Martín B, Palumbo E, Marco-Sola S, Griebel T, Ribeca P, Alonso G, Rastrojo A, Aguado B, Guigó R, Djebali S. ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data. BMC Genomics 2017; 18:7. [PMID: 28049418 PMCID: PMC5209911 DOI: 10.1186/s12864-016-3404-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 08/18/2016] [Accepted: 12/09/2016] [Indexed: 11/28/2022] Open
Abstract
Background: Chimeric transcripts are commonly defined as transcripts linking two or more different genes in the genome, and can be explained by various biological mechanisms such as genomic rearrangement, read-through or trans-splicing, but also by technical or biological artefacts. Several studies have shown their importance in cancer, cell pluripotency and motility. Many programs have recently been developed to identify chimeras from Illumina RNA-seq data (mostly fusion genes in cancer). However, outputs of different programs on the same dataset can be widely inconsistent, and tend to include many false positives. Other issues relate to simulated datasets restricted to fusion genes, real datasets with limited numbers of validated cases, result inconsistencies between simulated and real datasets, and gene rather than junction level assessment. Results: Here we present ChimPipe, a modular and easy-to-use method to reliably identify fusion genes and transcription-induced chimeras from paired-end Illumina RNA-seq data. We have also produced realistic simulated datasets for three different read lengths, and enhanced two gold-standard cancer datasets by associating exact junction points to validated gene fusions. Benchmarking ChimPipe together with four other state-of-the-art tools on this data showed ChimPipe to be the top program at identifying exact junction coordinates for both kinds of datasets, and the one showing the best trade-off between sensitivity and precision. Applied to 106 ENCODE human RNA-seq datasets, ChimPipe identified 137 high confidence chimeras connecting the protein coding sequence of their parent genes. In subsequent experiments, three out of four predicted chimeras, two of which were recurrently expressed in a large majority of the samples, could be validated. Cloning and sequencing of the three cases revealed several new chimeric transcript structures, three of which have the potential to encode a chimeric protein for which we hypothesized a new role. Applying ChimPipe to human and mouse ENCODE RNA-seq data led to the identification of 131 recurrent chimeras common to both species, and therefore potentially conserved. Conclusions: ChimPipe combines discordant paired-end reads and split-reads to detect any kind of chimeras, including those originating from polymerase read-through, and shows an excellent trade-off between sensitivity and precision. The chimeras found by ChimPipe can be validated in vitro with high accuracy.
Affiliation(s)
- Bernardo Rodríguez-Martín
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Joint IRB-BSC Program in Computational Biology, Barcelona Supercomputing Center (BSC), Jordi Girona 31, Barcelona, 08034, Spain
- Emilio Palumbo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Santiago Marco-Sola
  - Centro Nacional de Análisis Genómico, Baldiri Reixac, 4, Barcelona Science Park - Tower I, Barcelona, 08028, Spain
- Thasso Griebel
  - Centro Nacional de Análisis Genómico, Baldiri Reixac, 4, Barcelona Science Park - Tower I, Barcelona, 08028, Spain
- Paolo Ribeca
  - Centro Nacional de Análisis Genómico, Baldiri Reixac, 4, Barcelona Science Park - Tower I, Barcelona, 08028, Spain
  - Integrative Biology, The Pirbright Institute, London, GU24 0NF, UK
- Graciela Alonso
  - Centro de Biología Molecular Severo Ochoa (CSIC - UAM), Nicolás Cabrera 1, Cantoblanco, Madrid, 28049, Spain
- Alberto Rastrojo
  - Centro de Biología Molecular Severo Ochoa (CSIC - UAM), Nicolás Cabrera 1, Cantoblanco, Madrid, 28049, Spain
- Begoña Aguado
  - Centro de Biología Molecular Severo Ochoa (CSIC - UAM), Nicolás Cabrera 1, Cantoblanco, Madrid, 28049, Spain
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institut Hospital del Mar d'Investigacions Mediques (IMIM), Barcelona, 08003, Spain
- Sarah Djebali
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - GenPhySE, Université de Toulouse, INRA, INPT, ENVT, Castanet Tolosan, France
21
Gorohovski A, Tagore S, Palande V, Malka A, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS-3.1-the enhanced chimeric transcripts and RNA-seq database matched with protein-protein interactions. Nucleic Acids Res 2016; 45:D790-D795. [PMID: 27899596 PMCID: PMC5210585 DOI: 10.1093/nar/gkw1127] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 10/02/2016] [Revised: 10/26/2016] [Accepted: 10/30/2016] [Indexed: 12/17/2022] Open
Abstract
Discovery of chimeric RNAs, which are produced by chromosomal translocations as well as the joining of exons from different genes by trans-splicing, has added a new level of complexity to our study and understanding of the transcriptome. The enhanced ChiTaRS-3.1 database (http://chitars.md.biu.ac.il) is designed to make widely accessible a wealth of mined data on chimeric RNAs, with easy-to-use analytical tools built-in. The database comprises 34 922 chimeric transcripts along with 11 714 cancer breakpoints. In this latest version, we have included multiple cross-references to GeneCards, iHop, PubMed, NCBI, Ensembl, OMIM, RefSeq and the Mitelman collection for every entry in the ‘Full Collection’. In addition, for every chimera, we have added a predicted chimeric protein–protein interaction (ChiPPI) network, which allows for easy visualization of protein partners of both parental and fusion proteins for all human chimeras. The database contains a comprehensive annotation for 34 922 chimeric transcripts from eight organisms, and includes the manual annotation of 200 sense-antiSense (SaS) chimeras. The current improvements in the content and functionality to the ChiTaRS database make it a central resource for the study of chimeric transcripts and fusion proteins.
Affiliation(s)
- Alessandro Gorohovski
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Somnath Tagore
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Vikrant Palande
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Assaf Malka
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Dorith Raviv-Shay
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Milana Frenkel-Morgenstern
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
22
Latysheva NS, Babu MM. Discovering and understanding oncogenic gene fusions through data intensive computational approaches. Nucleic Acids Res 2016; 44:4487-503. [PMID: 27105842 PMCID: PMC4889949 DOI: 10.1093/nar/gkw282] [Citation(s) in RCA: 110] [Impact Index Per Article: 13.8] [Received: 01/12/2016] [Accepted: 03/24/2016] [Indexed: 12/21/2022] Open
Abstract
Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases; and (iii) clinical research that seeks either to therapeutically target fusion transcripts and proteins or to leverage advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different, yet highly complementary and symbiotic, approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation.
Affiliation(s)
- Natasha S Latysheva
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
- M Madan Babu
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
23
Lai J, An J, Seim I, Walpole C, Hoffman A, Moya L, Srinivasan S, Perry-Keene JL, Wang C, Lehman ML, Nelson CC, Clements JA, Batra J. Fusion transcript loci share many genomic features with non-fusion loci. BMC Genomics 2015; 16:1021. [PMID: 26626734 PMCID: PMC4667522 DOI: 10.1186/s12864-015-2235-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/03/2015] [Accepted: 11/23/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells as previous studies have shown extensively that these cells readily undergo fusion transcription. RESULTS We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from the LNCaP prostate cancer cell line that were mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76%) of these fusion transcripts were 'read-through chimeras' derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76%) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85%) maintained the open reading frame. We assessed whether parental genes of fusion transcripts have the potential to form complementary base pairing between parental genes which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity as non-fusion loci to hybridize. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions.
Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and LNCaP cells treated as above to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes. CONCLUSIONS Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts as they share many genomic features at splice/fusion junctions.
Affiliation(s)
- John Lai
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia; Current address: Genetic Technologies, 60-66 Hanover Street, Melbourne, Australia.
- Jiyuan An
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Inge Seim
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia; Comparative and Endocrine Biology Laboratory, Institute of Health and Biomedical Innovation, Brisbane, Australia; Ghrelin Research Group, Institute of Health and Biomedical Innovation, Brisbane, Australia.
- Carina Walpole
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Andrea Hoffman
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Leire Moya
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Srilakshmi Srinivasan
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Chenwei Wang
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Melanie L Lehman
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Colleen C Nelson
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Judith A Clements
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
- Jyotsna Batra
- Australian Prostate Cancer Research Centre - Queensland, Translational Research Institute, Brisbane, Australia; Cancer and Molecular Medicine Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia.
|
24
|
Chen I, Chen CY, Chuang TJ. Biogenesis, identification, and function of exonic circular RNAs. Wiley Interdiscip Rev RNA 2015; 6:563-79. [PMID: 26230526 PMCID: PMC5042038 DOI: 10.1002/wrna.1294] [Citation(s) in RCA: 298] [Impact Index Per Article: 33.1] [Received: 04/29/2015] [Revised: 06/11/2015] [Accepted: 06/16/2015] [Indexed: 01/20/2023]
Abstract
Circular RNAs (circRNAs) arise during post-transcriptional processes, in which a single-stranded RNA molecule forms a circle through covalent binding. Previously, circRNA products were often regarded to be splicing intermediates, by-products, or products of aberrant splicing. But recently, rapid advances in high-throughput RNA sequencing (RNA-seq) for global investigation of non-co-linear (NCL) RNAs, which comprise sequence segments that are topologically inconsistent with the reference genome, have led to renewed interest in this type of NCL RNA (i.e., circRNA), especially exonic circRNAs (ecircRNAs). Although the biogenesis and function of ecircRNAs are mostly unknown, some ecircRNAs are abundant, highly expressed, or evolutionarily conserved. Some ecircRNAs have been shown to affect microRNA regulation, and probably play roles in regulating parental gene transcription, cell proliferation, and RNA-binding proteins, indicating their functional potential for development as diagnostic tools. To date, thousands of ecircRNAs have been identified in multiple tissues/cell types from diverse species, through analyses of RNA-seq data. However, the detection of ecircRNA candidates involves several major challenges, including discrimination between ecircRNAs and other types of NCL RNAs (e.g., trans-spliced RNAs and genetic rearrangements); removal of sequencing errors, alignment errors, and in vitro artifacts; and the reconciliation of heterogeneous results arising from the use of different bioinformatics methods or sequencing data generated under different treatments. Such challenges may severely hamper the understanding of ecircRNAs. Herein, we review the biogenesis, identification, properties, and function of ecircRNAs, and discuss some unanswered questions regarding ecircRNAs. We also evaluate the accuracy (in terms of sensitivity and precision) of some well-known circRNA-detecting methods.
Affiliation(s)
- Iju Chen
- Genomics Research Center, Academia Sinica, Taipei, Taiwan
- Chia-Ying Chen
- Genomics Research Center, Academia Sinica, Taipei, Taiwan
|
25
|
Peng Z, Yuan C, Zellmer L, Liu S, Xu N, Liao DJ. Hypothesis: Artifacts, Including Spurious Chimeric RNAs with a Short Homologous Sequence, Caused by Consecutive Reverse Transcriptions and Endogenous Random Primers. J Cancer 2015; 6:555-67. [PMID: 26000048 PMCID: PMC4439942 DOI: 10.7150/jca.11997] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 02/26/2015] [Accepted: 04/02/2015] [Indexed: 12/21/2022] Open
Abstract
Recent RNA-sequencing technology and associated bioinformatics have led to identification of tens of thousands of putative human chimeric RNAs, i.e. RNAs containing sequences from two different genes, most of which are derived from neighboring genes on the same chromosome. In this essay, we redefine "two neighboring genes" as those producing individual transcripts, and point out two known mechanisms for chimeric RNA formation, i.e. transcription from a fusion gene or trans-splicing of two RNAs. By our definition, most putative RNA chimeras derived from canonically-defined neighboring genes may either be technical artifacts or be cis-splicing products of 5'- or 3'-extended RNA of either partner that is redefined herein as an unannotated gene, whereas trans-splicing events are rare in human cells. Therefore, most authentic chimeric RNAs result from fusion genes, about 1,000 of which have been identified hitherto. We propose a hypothesis of "consecutive reverse transcriptions (RTs)", i.e. another RT reaction following the previous one, for how most spurious chimeric RNAs, especially those containing a short homologous sequence, may be generated during RT, especially in RNA-sequencing wherein RNAs are fragmented. We also point out that RNA samples contain numerous RNA and DNA shreds that can serve as endogenous random primers for RT and ensuing polymerase chain reactions (PCR), creating artifacts in RT-PCR.
Affiliation(s)
- Zhiyu Peng
- 1. Beijing Genomics Institute at Shenzhen, Building No.11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, P. R. China
- Chengfu Yuan
- 2. Hormel Institute, University of Minnesota, Austin, MN 55912, USA
- Lucas Zellmer
- 2. Hormel Institute, University of Minnesota, Austin, MN 55912, USA
- Siqi Liu
- 3. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Ningzhi Xu
- 4. Laboratory of Cell and Molecular Biology, Cancer Institute, Chinese Academy of Medical Science, Beijing 100021, P. R. China
- D Joshua Liao
- 2. Hormel Institute, University of Minnesota, Austin, MN 55912, USA
|
26
|
St Laurent G, Wahlestedt C, Kapranov P. The Landscape of long noncoding RNA classification. Trends Genet 2015; 31:239-51. [PMID: 25869999 DOI: 10.1016/j.tig.2015.03.007] [Citation(s) in RCA: 810] [Impact Index Per Article: 90.0] [Received: 01/12/2015] [Revised: 03/09/2015] [Accepted: 03/12/2015] [Indexed: 12/12/2022]
Abstract
Advances in the depth and quality of transcriptome sequencing have revealed many new classes of long noncoding RNAs (lncRNAs). lncRNA classification has mushroomed to accommodate these new findings, even though the real dimensions and complexity of the noncoding transcriptome remain unknown. Although evidence of functionality of specific lncRNAs continues to accumulate, conflicting, confusing, and overlapping terminology has fostered ambiguity and lack of clarity in the field in general. The lack of fundamental conceptual unambiguous classification framework results in a number of challenges in the annotation and interpretation of noncoding transcriptome data. It also might undermine integration of the new genomic methods and datasets in an effort to unravel the function of lncRNA. Here, we review existing lncRNA classifications, nomenclature, and terminology. Then, we describe the conceptual guidelines that have emerged for their classification and functional annotation based on expanding and more comprehensive use of large systems biology-based datasets.
Affiliation(s)
- Georges St Laurent
- St. Laurent Institute, 317 New Boston St., Suite 201, Woburn, MA 01801 USA; Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, 185 Meeting Street, Providence, RI 02912, USA
- Claes Wahlestedt
- Center for Therapeutic Innovation and Department of Psychiatry and Behavioral Sciences, University of Miami Miller School of Medicine, 1501 NW 10th Ave, Miami, FL 33136 USA.
- Philipp Kapranov
- Institute of Genomics, School of Biomedical Sciences, Huaqiao University, 668 Jimei Road, Xiamen, China 361021; St. Laurent Institute, 317 New Boston St., Suite 201, Woburn, MA 01801 USA.
|
27
|
Frenkel-Morgenstern M, Gorohovski A, Vucenovic D, Maestre L, Valencia A. ChiTaRS 2.1--an improved database of the chimeric transcripts and RNA-seq data with novel sense-antisense chimeric RNA transcripts. Nucleic Acids Res 2014; 43:D68-75. [PMID: 25414346 PMCID: PMC4383979 DOI: 10.1093/nar/gku1199] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/20/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS 2.1 database of chimeric transcripts and RNA-Seq data (http://chitars.bioinfo.cnio.es/) is the second version of the ChiTaRS database and includes improvements in content and functionality. Chimeras from eight organisms have been collated including novel sense–antisense (SAS) chimeras resulting from the slippage of the sense and anti-sense intragenic regions. The new database version collects more than 29 000 chimeric transcripts and indicates the expression and tissue specificity for 333 entries confirmed by RNA-seq reads mapping the chimeric junction sites. User interface allows for rapid and easy analysis of evolutionary conservation of fusions, literature references and experimental data supporting fusions in different organisms. More than 1428 cancer breakpoints have been automatically collected from public databases and manually verified to identify their correct cross-references, genomic sequences and junction sites. As a result, the ChiTaRS 2.1 collection of chimeras from eight organisms and human cancer breakpoints extends our understanding of the evolution of chimeric transcripts in eukaryotes as well as their functional role in carcinogenic processes.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Alessandro Gorohovski
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Dunja Vucenovic
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Lorena Maestre
- Monoclonal Antibodies Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Alfonso Valencia
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain.
|
28
|
Lloréns-Rico V, Serrano L, Lluch-Senar M. Assessing the hodgepodge of non-mapped reads in bacterial transcriptomes: real or artifactual RNA chimeras? BMC Genomics 2014; 15:633. [PMID: 25070459 PMCID: PMC4122791 DOI: 10.1186/1471-2164-15-633] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/25/2014] [Accepted: 07/17/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND RNA sequencing methods have already altered our view of the extent and complexity of bacterial and eukaryotic transcriptomes, revealing rare transcript isoforms (circular RNAs, RNA chimeras) that could play an important role in their biology. RESULTS We performed an analysis of chimera formation by four different computational approaches, including a custom designed pipeline, to study the transcriptomes of M. pneumoniae and P. aeruginosa, as well as mixtures of both. We found that rare transcript isoforms detected by conventional pipelines of analysis could be artifacts of the experimental procedure used in the library preparation, and that they are protocol-dependent. CONCLUSION By using a customized pipeline we show that optimal library preparation protocol and the pipeline to analyze the results are crucial to identify real chimeric RNAs.
Affiliation(s)
- Luis Serrano
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003 Barcelona, Spain.
|
29
|
Yu CY, Liu HJ, Hung LY, Kuo HC, Chuang TJ. Is an observed non-co-linear RNA product spliced in trans, in cis or just in vitro? Nucleic Acids Res 2014; 42:9410-23. [PMID: 25053845 PMCID: PMC4132752 DOI: 10.1093/nar/gku643] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 12/14/2022] Open
Abstract
Global transcriptome investigations often result in the detection of an enormous number of transcripts composed of non-co-linear sequence fragments. Such ‘aberrant’ transcript products may arise from post-transcriptional events or genetic rearrangements, or may otherwise be false positives (sequencing/alignment errors or in vitro artifacts). Moreover, post-transcriptionally non-co-linear (‘PtNcl’) transcripts can arise from trans-splicing or back-splicing in cis (to generate so-called ‘circular RNA’). Here, we collected previously-predicted human non-co-linear RNA candidates, and designed a validation procedure integrating in silico filters with multiple experimental validation steps to examine their authenticity. We showed that >50% of the tested candidates were in vitro artifacts, even though some had been previously validated by RT-PCR. After excluding the possibility of genetic rearrangements, we distinguished between trans-spliced and circular RNAs, and confirmed that these two splicing forms can share the same non-co-linear junction. Importantly, the experimentally-confirmed PtNcl RNA events and their corresponding PtNcl splicing types (i.e. trans-splicing, circular RNA, or both sharing the same junction) were all expressed in rhesus macaque, and some were even expressed in mouse. Our study thus describes an essential procedure for confirming PtNcl transcripts, and provides further insight into the evolutionary role of PtNcl RNA events, opening up this important, but understudied, class of post-transcriptional events for comprehensive characterization.
Affiliation(s)
- Chun-Ying Yu
- Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Hsiao-Jung Liu
- Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Li-Yuan Hung
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Hung-Chih Kuo
- Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Trees-Juen Chuang
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
|
30
|
Tai PWL, Zaidi SK, Wu H, Grandy RA, Montecino MM, van Wijnen AJ, Lian JB, Stein GS, Stein JL. The dynamic architectural and epigenetic nuclear landscape: developing the genomic almanac of biology and disease. J Cell Physiol 2014; 229:711-27. [PMID: 24242872 PMCID: PMC3996806 DOI: 10.1002/jcp.24508] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 11/07/2013] [Accepted: 11/11/2013] [Indexed: 12/31/2022]
Abstract
Compaction of the eukaryotic genome into the confined space of the cell nucleus must occur faithfully throughout each cell cycle to retain gene expression fidelity. For decades, experimental limitations to study the structural organization of the interphase nucleus restricted our understanding of its contributions towards gene regulation and disease. However, within the past few years, our capability to visualize chromosomes in vivo with sophisticated fluorescence microscopy, and to characterize chromosomal regulatory environments via massively parallel sequencing methodologies have drastically changed how we currently understand epigenetic gene control within the context of three-dimensional nuclear structure. The rapid rate at which information on nuclear structure is unfolding brings challenges to compare and contrast recent observations with historic findings. In this review, we discuss experimental breakthroughs that have influenced how we understand and explore the dynamic structure and function of the nucleus, and how we can incorporate historical perspectives with insights acquired from the ever-evolving advances in molecular biology and pathology.
Affiliation(s)
- Phillip W. L. Tai
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Sayyed K. Zaidi
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Hai Wu
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Rodrigo A. Grandy
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Martin M. Montecino
- Center for Biomedical Research and FONDAP Center for Genome Regulation, Universidad Andres Bello, Santiago, Chile
- André J. van Wijnen
- Departments of Orthopedic Surgery and Biochemistry and Molecular Biology, Mayo Clinic, Rochester, MN
- Jane B. Lian
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Gary S. Stein
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
- Janet L. Stein
- Department of Biochemistry and Vermont Cancer Center, University of Vermont College of Medicine, Burlington, VT
|
31
|
Nitsche A, Doose G, Tafer H, Robinson M, Saha NR, Gerdol M, Canapa A, Hoffmann S, Amemiya CT, Stadler PF. Atypical RNAs in the coelacanth transcriptome. J Exp Zool B Mol Dev Evol 2013; 322:342-51. [PMID: 24174405 DOI: 10.1002/jez.b.22542] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 04/24/2013] [Revised: 07/22/2013] [Accepted: 08/16/2013] [Indexed: 01/15/2023]
Abstract
Circular and apparently trans-spliced RNAs have recently been reported as abundant types of transcripts in mammalian transcriptome data. Both types of non-colinear RNAs are also abundant in RNA-seq of different tissue from both the African and the Indonesian coelacanth. We observe more than 8,000 lincRNAs with normal gene structure and several thousands of circularized and trans-spliced products, showing that such atypical RNAs form a substantial contribution to the transcriptome. Surprisingly, the majority of the circularizing and trans-connecting splice junctions are unique to atypical forms, that is, are not used in normal isoforms.
Affiliation(s)
- Anne Nitsche
- Department of Computer Science, Bioinformatics Group, University of Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
|
32
|
Abstract
The last decade has seen tremendous effort committed to the annotation of the human genome sequence, most notably perhaps in the form of the ENCODE project. One of the major findings of ENCODE, and other genome analysis projects, is that the human transcriptome is far larger and more complex than previously thought. This complexity manifests, for example, as alternative splicing within protein-coding genes, as well as in the discovery of thousands of long noncoding RNAs. It is also possible that significant numbers of human transcripts have not yet been described by annotation projects, while existing transcript models are frequently incomplete. The question as to what proportion of this complexity is truly functional remains open, however, and this ambiguity presents a serious challenge to genome scientists. In this article, we will discuss the current state of human transcriptome annotation, drawing on our experience gained in generating the GENCODE gene annotation set. We highlight the gaps in our knowledge of transcript functionality that remain, and consider the potential computational and experimental strategies that can be used to help close them. We propose that an understanding of the true overlap between transcriptional complexity and functionality will not be gained in the short term. However, significant steps toward obtaining this knowledge can now be taken by using an integrated strategy, combining all of the experimental resources at our disposal.
Affiliation(s)
- Jonathan M Mudge
- Department of Informatics, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom
|
33
|
Wu CS, Yu CY, Chuang CY, Hsiao M, Kao CF, Kuo HC, Chuang TJ. Integrative transcriptome sequencing identifies trans-splicing events with important roles in human embryonic stem cell pluripotency. Genome Res 2013; 24:25-36. [PMID: 24131564 PMCID: PMC3875859 DOI: 10.1101/gr.159483.113] [Citation(s) in RCA: 83] [Impact Index Per Article: 7.5] [Indexed: 01/09/2023]
Abstract
Trans-splicing is a post-transcriptional event that joins exons from separate pre-mRNAs. Detection of trans-splicing is usually severely hampered by experimental artifacts and genetic rearrangements. Here, we develop a new computational pipeline, TSscan, which integrates different types of high-throughput long-/short-read transcriptome sequencing of different human embryonic stem cell (hESC) lines to effectively minimize false positives while detecting trans-splicing. Combining TSscan screening with multiple experimental validation steps revealed that most chimeric RNA products were platform-dependent experimental artifacts of RNA sequencing. We successfully identified and confirmed four trans-spliced RNAs, including the first reported trans-spliced large intergenic noncoding RNA (“tsRMST”). We showed that these trans-spliced RNAs were all highly expressed in human pluripotent stem cells and differentially expressed during hESC differentiation. Our results further indicated that tsRMST can contribute to pluripotency maintenance of hESCs by suppressing lineage-specific gene expression through the recruitment of NANOG and the PRC2 complex factor, SUZ12. Taken together, our findings provide important insights into the role of trans-splicing in pluripotency maintenance of hESCs and help to facilitate future studies into trans-splicing, opening up this important but understudied class of post-transcriptional events for comprehensive characterization.
Affiliation(s)
- Chan-Shuo Wu
- Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
|
34
|
Khaladkar M, Buckley PT, Lee MT, Francis C, Eghbal MM, Chuong T, Suresh S, Kuhn B, Eberwine J, Kim J. Subcellular RNA sequencing reveals broad presence of cytoplasmic intron-sequence retaining transcripts in mouse and rat neurons. PLoS One 2013; 8:e76194. [PMID: 24098440 PMCID: PMC3789819 DOI: 10.1371/journal.pone.0076194] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 05/31/2013] [Accepted: 08/20/2013] [Indexed: 12/03/2022] Open
Abstract
Recent findings have revealed the complexity of the transcriptional landscape in mammalian cells. One recently described class of novel transcripts are the Cytoplasmic Intron-sequence Retaining Transcripts (CIRTs), hypothesized to confer post-transcriptional regulatory function. For instance, the neuronal CIRT KCNMA1i16 contributes to the firing properties of hippocampal neurons. Intronic sub-sequence retention within IL1-β mRNA in anucleate platelets has been implicated in activity-dependent splicing and translation. In a recent study, we showed CIRTs harbor functional SINE ID elements which are hypothesized to mediate dendritic localization in neurons. Based on these studies and others, we hypothesized that CIRTs may be present in a broad set of transcripts and comprise novel signals for post-transcriptional regulation. We carried out a transcriptome-wide survey of CIRTs by sequencing micro-dissected subcellular RNA fractions. We sequenced two batches of 150-300 individually dissected dendrites from primary cultures of hippocampal neurons in rat and three batches from mouse hippocampal neurons. After statistical processing to minimize artifacts, we found a broad prevalence of CIRTs in the neurons in both species (44-60% of the expressed transcripts). The sequence patterns, including stereotypical length, biased inclusion of specific introns, and intron-intron junctions, suggested CIRT-specific nuclear processing. Our analysis also suggested that these cytoplasmic intron-sequence retaining transcripts may serve as a primary transcript for ncRNAs. Our results show that retaining intronic sequences is not isolated to a few loci but may be a genome-wide phenomenon for embedding functional signals within certain mRNA. The results hypothesize a novel source of cis-sequences for post-transcriptional regulation. 
Our results hypothesize two potentially novel splicing pathways: one, within the nucleus for CIRT biogenesis; and another, within the cytoplasm for removing CIRT sequences before translation. We also speculate that release of CIRT sequences prior to translation may form RNA-based signals within the cell potentially comprising a novel class of signaling pathways.
Affiliation(s)
- Mugdha Khaladkar
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Peter T. Buckley
- Penn Genome Frontiers Institute, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Pharmacology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Miler T. Lee
- Department of Genetics, Yale University, New Haven, Connecticut, United States of America
- Chantal Francis
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Mitra M. Eghbal
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Tina Chuong
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Sangita Suresh
- Department of Pediatrics, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Bernhard Kuhn
- Department of Pediatrics, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- James Eberwine
- Penn Genome Frontiers Institute, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Pharmacology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Junhyong Kim
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Penn Genome Frontiers Institute, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
35
|
Common fusion transcripts identified in colorectal cancer cell lines by high-throughput RNA sequencing. Transl Oncol 2013; 6:546-53. [PMID: 24151535 DOI: 10.1593/tlo.13457] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 06/10/2013] [Revised: 07/24/2013] [Accepted: 07/25/2013] [Indexed: 01/01/2023] Open
Abstract
Colorectal cancer (CRC) is the third most common cancer disease in the Western world, and about 40% of the patients die from this disease. The cancer cells are commonly genetically unstable, but only a few low-frequency recurrent fusion genes have so far been reported for this disease. In this study, we present a thorough search for novel fusion transcripts in CRC using high-throughput RNA sequencing. From altogether 220 million paired-end sequence reads from seven CRC cell lines, we identified 3391 candidate fused transcripts. By stringent requirements, we nominated 11 candidate fusion transcripts for further experimental validation, of which 10 were positive by reverse transcription-polymerase chain reaction and Sanger sequencing. Six were intrachromosomal fusion transcripts, and interestingly, three of these, AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2, were present in, respectively, 18, 18, and 20 of 21 analyzed cell lines and in, respectively, 18, 61, and 48 (17%-58%) of 106 primary cancer tissues. These three fusion transcripts were also detected in 2 to 4 of 14 normal colonic mucosa samples (14%-28%). Whole-genome sequencing identified a specific genomic breakpoint in COMMD10-AP3S1 and further indicates that both the COMMD10-AP3S1 and AKAP13-PDE8A fusion transcripts are due to genomic duplications in specific cell lines. In conclusion, we have identified AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2 as novel intrachromosomal fusion transcripts and the most highly recurring chimeric transcripts described for CRC to date. The functional and clinical relevance of these chimeric RNA molecules remains to be elucidated.
|
36
|
Durán E, Djebali S, González S, Flores O, Mercader JM, Guigó R, Torrents D, Soler-López M, Orozco M. Unravelling the hidden DNA structural/physical code provides novel insights on promoter location. Nucleic Acids Res 2013; 41:7220-30. [PMID: 23761436 PMCID: PMC3753636 DOI: 10.1093/nar/gkt511] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022] Open
Abstract
Although protein recognition of DNA motifs in promoter regions has traditionally been considered a critical regulatory element in transcription, locating promoters, and in particular transcription start sites (TSSs), remains a challenge. Here we perform a comprehensive analysis of putative core promoter sequences relative to non-annotated predicted TSSs along the human genome, which were defined by distinct DNA physical properties implemented in our ProStar computational algorithm. A representative sampling of predicted regions was subjected to extensive experimental validation and analyses. Interestingly, the vast majority proved to be transcriptionally active despite the lack of specific sequence motifs, indicating that physical signaling is indeed able to detect promoter activity beyond conventional TSS prediction methods. Furthermore, highly active regions displayed typical chromatin features associated with promoters of housekeeping genes. Our results make it possible to redefine promoter signatures and to analyze the diversity, evolutionary conservation and dynamic regulation of human core promoters at large scale. Moreover, the present study strongly supports the hypothesis of an ancient regulatory mechanism encoded by the intrinsic physical properties of the DNA that may contribute to the complexity of transcription regulation in the human genome.
Affiliation(s)
- Elisa Durán
- Institute for Research in Biomedicine (IRB Barcelona), Barcelona 08028, Spain, Joint IRB-BSC Research Program on Computational Biology, Barcelona 08028, Spain, Bioinformatics and Genomics Group, Center for Genomic Regulation and Universitat Pompeu Fabra, Barcelona 08003, Spain, Barcelona Supercomputing Center, Barcelona 08034, Spain and Department of Biochemistry and Molecular Biology, University of Barcelona, Barcelona 08028, Spain
|
37
|
Teperino R, Lempradl A, Pospisilik JA. Bridging epigenomics and complex disease: the basics. Cell Mol Life Sci 2013; 70:1609-21. [PMID: 23463237 PMCID: PMC11113658 DOI: 10.1007/s00018-013-1299-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 01/29/2013] [Revised: 02/05/2013] [Accepted: 02/05/2013] [Indexed: 12/20/2022]
Abstract
The DNA sequence largely defines gene expression and phenotype. However, it is becoming increasingly clear that an additional chromatin-based regulatory network imparts both stability and plasticity to genome output, modifying phenotype independently of the genetic blueprint. Indeed, alterations in this "epigenetic" control layer underlie, at least in part, the discordance for disease observed between monozygotic twins. Functionally, this regulatory layer comprises chemical modifications of DNA and post-translational modifications of histones, as well as small and large noncoding RNAs. Together these regulate gene expression by changing chromatin organization and DNA accessibility. Successive technological advances over the past decade have enabled researchers to map the chromatin state with increasing accuracy and comprehensiveness, catapulting genetic research into a genome-wide era. Here, aiming particularly at the genomics/epigenomics newcomer, we review the epigenetic basis that has helped drive the technological shift and how this progress is shaping our understanding of complex disease.
Affiliation(s)
- Raffaele Teperino
- Max-Planck Institute of Immunobiology and Epigenetics, Stuebeweg 51, 79108 Freiburg, Germany
| | - Adelheid Lempradl
- Max-Planck Institute of Immunobiology and Epigenetics, Stuebeweg 51, 79108 Freiburg, Germany
| | - J. Andrew Pospisilik
- Max-Planck Institute of Immunobiology and Epigenetics, Stuebeweg 51, 79108 Freiburg, Germany
|
38
|
Papantonis A, Cook PR. Transcription factories: genome organization and gene regulation. Chem Rev 2013; 113:8683-705. [PMID: 23597155 DOI: 10.1021/cr300513p] [Citation(s) in RCA: 162] [Impact Index Per Article: 14.7] [Indexed: 02/06/2023]
Affiliation(s)
- Argyris Papantonis
- Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, United Kingdom
|
39
|
Yuan C, Liu Y, Yang M, Liao DJ. New methods as alternative or corrective measures for the pitfalls and artifacts of reverse transcription and polymerase chain reactions (RT-PCR) in cloning chimeric or antisense-accompanied RNA. RNA Biol 2013; 10:958-67. [PMID: 23618925 PMCID: PMC4111735 DOI: 10.4161/rna.24570] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 02/02/2023] Open
Abstract
We established new methods for cloning cDNA ends that start with reverse transcription (RT) and then promptly proceed to synthesis of the second cDNA strand, avoiding manipulation of fragile RNA. Our 3′-end cloning method involves neither poly-dT primers nor polymerase chain reaction (PCR); it is low in efficiency but high in fidelity and can clone RNAs that lack a poly-A tail. We also established a cDNA protection assay to supersede the RNA protection assay. The protected cDNA can be amplified, cloned and sequenced, enhancing sensitivity and fidelity. We report that the RT product obtained with a gene-specific primer (GSP) cannot be gene- or strand-specific because the RNA sample contains endogenous random primers (ERP). Gene specificity may be improved by adding a linker sequence at the 5′-end of the GSP to prime RT and using the linker as a primer in the ensuing PCR. Strand specificity may be improved by using strand-specific DNA oligos in our protection assay. The CDK4 and TSPAN31 mRNAs are transcribed from opposite DNA strands and overlap at their 3′ ends. Using this relationship as a model, we found that the overlapping sequence might serve as a primer with its antisense as the template to create a wrong-template extension in RT or PCR. We infer that two unrelated RNAs or cDNAs overlapping at the 5′- or 3′-end might create a spurious chimera in this way, and many chimeras with a homologous sequence may be such artifacts. Together, the ERP and overlapping antisense set complex pitfalls of which one should be aware.
Affiliation(s)
- Chengfu Yuan
- Hormel Institute, University of Minnesota, Austin, MN, USA
|
40
|
Di Stefano M, Rosa A, Belcastro V, di Bernardo D, Micheletti C. Colocalization of coregulated genes: a steered molecular dynamics study of human chromosome 19. PLoS Comput Biol 2013; 9:e1003019. [PMID: 23555238 PMCID: PMC3610629 DOI: 10.1371/journal.pcbi.1003019] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 08/14/2012] [Accepted: 02/19/2013] [Indexed: 01/12/2023] Open
Abstract
The connection between chromatin nuclear organization and gene activity is vividly illustrated by the observation that transcriptional coregulation of certain genes appears to be directly influenced by their spatial proximity. This fact poses the more general question of whether it is at all feasible that the numerous genes that are coregulated on a given chromosome, especially those at large genomic distances, might become proximate inside the nucleus. This problem is studied here using steered molecular dynamics simulations in order to enforce the colocalization of thousands of knowledge-based gene sequences on a model for the gene-rich human chromosome 19. Remarkably, it is found that most (≈ 88%) gene pairs can be brought simultaneously into contact. This is made possible by the low degree of intra-chromosome entanglement and the large number of cliques in the gene coregulatory network. A clique is a set of genes coregulated all together as a group. The constrained conformations for the model chromosome 19 are further shown to be organized in spatial macrodomains that are similar to those inferred from recent HiC measurements. The findings indicate that gene coregulation and colocalization are largely compatible and that this relationship can be exploited to draft the overall spatial organization of the chromosome in vivo. The more general validity and implications of these findings could be investigated by applying to other eukaryotic chromosomes the general and transferable computational strategy introduced here.
Affiliation(s)
- Marco Di Stefano
- International School for Advanced Studies (SISSA), Trieste, Italy
| | - Angelo Rosa
- International School for Advanced Studies (SISSA), Trieste, Italy
- * E-mail: (AR); (CM)
| | - Vincenzo Belcastro
- Philip Morris International R&D, Philip Morris Products S.A., Neuchâtel, Switzerland
| | - Diego di Bernardo
- Telethon Institute of Genetics and Medicine (TIGEM), Napoli, Italy
- Department of Informatics and Systems Engineering, University “Federico II”, Napoli, Italy
| | - Cristian Micheletti
- International School for Advanced Studies (SISSA), Trieste, Italy
- * E-mail: (AR); (CM)
|
41
|
Kim T, Reitmair A. Non-Coding RNAs: Functional Aspects and Diagnostic Utility in Oncology. Int J Mol Sci 2013; 14:4934-68. [PMID: 23455466 PMCID: PMC3634484 DOI: 10.3390/ijms14034934] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 12/10/2012] [Revised: 02/09/2013] [Accepted: 02/18/2013] [Indexed: 02/06/2023] Open
Abstract
Noncoding RNAs (ncRNAs) have been found to have roles in a large variety of biological processes. Recent studies indicate that ncRNAs are far more abundant and important than initially imagined, holding great promise for use in diagnostic, prognostic, and therapeutic applications. Within ncRNAs, microRNAs (miRNAs) are the most widely studied and characterized. They have been implicated in the initiation and progression of a variety of human diseases, including major pathologies such as cancers, arthritis, neurodegenerative disorders, and cardiovascular diseases. Their surprising stability in serum and other bodily fluids led to their rapid ascent as a novel class of biomarkers. For example, several properties of stable miRNAs, and perhaps other classes of ncRNAs, make them good candidate biomarkers for early cancer detection and for determining which preneoplastic lesions are likely to progress to cancer. Of particular interest is the identification of biomarker signatures, which may include traditional protein-based biomarkers, to improve risk assessment, detection, and prognosis. Here, we offer a comprehensive review of the ncRNA biomarker literature and discuss state-of-the-art technologies for their detection. Furthermore, we address the challenges present in miRNA detection and quantification, and outline future perspectives for development of next-generation biodetection assays employing multicolor alternating-laser excitation (ALEX) fluorescence spectroscopy.
Affiliation(s)
- Taiho Kim
- Nesher Technologies, Inc., 2100 W. 3rd St. Los Angeles, CA 90057, USA.
|
42
|
Frenkel-Morgenstern M, Valencia A. Novel domain combinations in proteins encoded by chimeric transcripts. Bioinformatics 2013; 28:i67-74. [PMID: 22689780 PMCID: PMC3371848 DOI: 10.1093/bioinformatics/bts216] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 12/18/2022]
Abstract
Motivation: Chimeric RNA transcripts are generated by different mechanisms including pre-mRNA trans-splicing, chromosomal translocations and/or gene fusions. It was shown recently that at least some chimeric transcripts can be translated into functional chimeric proteins. Results: To gain a better understanding of the design principles underlying chimeric proteins, we have analyzed 7,424 chimeric RNAs from humans. We focused on the specific domains present in these proteins, comparing their permutations with those of known human proteins. Our method uses genomic alignments of the chimeras, identification of the gene–gene junction sites and prediction of the protein domains. We found that chimeras contain complete protein domains significantly more often than random data sets do. Specifically, we show that eight different types of domains are over-represented among all chimeras as well as in those chimeras confirmed by RNA-seq experiments. Moreover, we discovered that some chimeras potentially encode proteins with novel and unique domain combinations. Given the observed prevalence of entire protein domains in chimeras, we predict that certain putative chimeras that lack activation domains may actively compete with their parental proteins, thereby exerting dominant negative effects. More generally, the production of chimeric transcripts enables a combinatorial increase in the number of protein products available, which may disturb the function of parental genes and influence their protein–protein interaction network. Availability: Our scripts are available upon request. Contact: avalencia@cnio.es Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
|
43
|
Abstract
In its first production phase, The ENCODE Project Consortium (ENCODE) has generated thousands of genome-scale data sets, resulting in a genomic “parts list” that encompasses transcripts, sites of transcription factor binding, and other functional features that now number in the millions of distinct elements. These data are reshaping many long-held beliefs concerning the information content of the human and other complex genomes, including the very definition of the gene. Here I discuss and place in context many of the leading findings of ENCODE, as well as trends that are shaping the generation and interpretation of ENCODE data. Finally, I consider prospects for the future, including maximizing the accuracy, completeness, and utility of ENCODE data for the community.
Affiliation(s)
- John A Stamatoyannopoulos
- Departments of Genome Sciences and Medicine, University of Washington School of Medicine, Seattle, Washington 98195, USA.
|
44
|
Fang W, Wei Y, Kang Y, Landweber LF. Detection of a common chimeric transcript between human chromosomes 7 and 16. Biol Direct 2012; 7:49. [PMID: 23273016 PMCID: PMC3538553 DOI: 10.1186/1745-6150-7-49] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 09/05/2012] [Accepted: 12/11/2012] [Indexed: 11/17/2022] Open
Abstract
Interchromosomal chimeric RNA molecules are often transcription products of genomic rearrangements in cancerous cells. Here we report the computational detection of an interchromosomal RNA fusion between ZC3HAV1L and CHMP1A from RNA-seq data of normal human mammary epithelial cells, and experimental confirmation of the chimeric transcript in multiple human cells and tissues. Our experimental characterization also detected three variants of the ZC3HAV1L-CHMP1A chimeric RNA, suggesting that these genes are involved in complex splicing. The fusion sequence at the novel exon-exon boundary and the absence of a corresponding DNA rearrangement suggest that this chimeric RNA is likely produced by trans-splicing in human cells. Reviewers: This article was reviewed by Rory Johnson (nominated by Fyodor Kondrashov), Gal Avital and Itai Yanai.
Affiliation(s)
- Wenwen Fang
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
45
|
Affiliation(s)
- Wenwen Fang
- Department of Molecular Biology, Princeton University, Princeton, NJ, USA
|
46
|
Frenkel-Morgenstern M, Gorohovski A, Lacroix V, Rogers M, Ibanez K, Boullosa C, Andres Leon E, Ben-Hur A, Valencia A. ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Nucleic Acids Res 2012; 41:D142-51. [PMID: 23143107 PMCID: PMC3531201 DOI: 10.1093/nar/gks1041] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 11/23/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS database of Chimeric Transcripts and RNA-Sequencing data (http://chitars.bioinfo.cnio.es/) collects more than 16 000 chimeric RNAs from humans, mice and fruit flies, 233 chimeras confirmed by RNA-seq reads and ∼2000 cancer breakpoints. The database indicates the expression and tissue specificity of these chimeras, as confirmed by RNA-seq data, and it includes mass spectrometry results for some human entries at their junctions. Moreover, the database has advanced features to analyze junction consistency and to rank chimeras based on the evidence of repeated junction sites. Finally, ‘Junction Search’ screens through the RNA-seq reads found at the chimeras’ junction sites to identify putative junctions in novel sequences entered by users. Thus, ChiTaRS is an extensive catalog of human, mouse and fruit fly chimeras that will extend our understanding of the evolution of chimeric transcripts in eukaryotes and can be advantageous in the analysis of human cancer breakpoints.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
|
47
|
Howald C, Tanzer A, Chrast J, Kokocinski F, Derrien T, Walters N, Gonzalez JM, Frankish A, Aken BL, Hourlier T, Vogel JH, White S, Searle S, Harrow J, Hubbard TJ, Guigó R, Reymond A. Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome. Genome Res 2012; 22:1698-710. [PMID: 22955982 PMCID: PMC3431487 DOI: 10.1101/gr.134478.111] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 11/07/2011] [Accepted: 05/01/2012] [Indexed: 12/21/2022]
Abstract
Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically validated by experiment. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions were confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq was also efficient for screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive large-scale human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample low-expressed transcripts. Our RT-PCR-seq targeted approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that cataloging all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, like RNA-seq and RT-PCR-seq.
Affiliation(s)
- Cédric Howald
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Andrea Tanzer
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
| | - Jacqueline Chrast
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Thomas Derrien
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
| | - Nathalie Walters
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
| | - Jose M. Gonzalez
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Bronwen L. Aken
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Thibaut Hourlier
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Jan-Hinnerk Vogel
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Simon White
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Stephen Searle
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Tim J. Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
| | - Roderic Guigó
- Centre de Regulacio Genomica, Grup de Recerca en Informatica Biomedica, E-08003 Barcelona, Spain
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
48
|
Valencia A, Hidalgo M. Getting personalized cancer genome analysis into the clinic: the challenges in bioinformatics. Genome Med 2012; 4:61. [PMID: 22839973 PMCID: PMC3580417 DOI: 10.1186/gm362] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022] Open
Abstract
Progress in genomics has raised expectations in many fields, and particularly in personalized cancer research. The new technologies available make it possible to combine information about potential disease markers, altered function and accessible drug targets, which, coupled with pathological and medical information, will help produce more appropriate clinical decisions. The accessibility of such experimental techniques makes it all the more necessary to improve and adapt computational strategies to the new challenges. This review focuses on the critical issues associated with the standard pipeline, which includes: DNA sequencing analysis; analysis of mutations in coding regions; the study of genome rearrangements; extrapolating information on mutations to the functional and signaling level; and predicting the effects of therapies using mouse tumor models. We describe the possibilities, limitations and future challenges of current bioinformatics strategies for each of these issues. Furthermore, we emphasize the need for collaboration between the bioinformaticians who implement the software and use the data resources, the computational biologists who develop the analytical methods, and the clinicians, who are the systems' end users and are ultimately responsible for making medical decisions. Finally, the different steps in cancer genome analysis are illustrated through examples of applications.
Affiliation(s)
- Alfonso Valencia
- Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernández Almagro, 3, E-28029 Madrid, Spain
| | - Manuel Hidalgo
- Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernández Almagro, 3, E-28029 Madrid, Spain
|
49
|
Frenkel-Morgenstern M, Lacroix V, Ezkurdia I, Levin Y, Gabashvili A, Prilusky J, Del Pozo A, Tress M, Johnson R, Guigo R, Valencia A. Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts. Genome Res 2012; 22:1231-42. [PMID: 22588898 PMCID: PMC3396365 DOI: 10.1101/gr.130062.111] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Indexed: 12/15/2022]
Abstract
Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans.
|