1
Chitale GG, Kulkarni SR, Bapat SA. Chimerism: A whole new perspective in gene regulation. Biochim Biophys Acta Gen Subj 2025; 1869:130767. [PMID: 39855315 DOI: 10.1016/j.bbagen.2025.130767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 01/10/2025] [Accepted: 01/16/2025] [Indexed: 01/27/2025]
Abstract
The diversity of molecular entities emerging from a single gene is well recognized. Several studies have thus established the cellular role(s) of transcript variants and protein isoforms. A step ahead in challenging the central dogma towards expanding molecular diversity is the identification of fusion genes, chimeric transcripts and chimeric proteins that harbor sequences from more than one gene. The mechanisms for generation of chimeras largely follow similar patterns across all levels of gene regulation but also show interdependence and mutual exclusivity. Whole genome and RNA-seq technologies, supported by the development of computational algorithms and programs for processing datasets, have increasingly enabled the identification of fusion genes and chimeric transcripts, while the discovery of chimeric proteins is as yet more subtle. Earlier thought to be associated with cellular transformation, chimeric molecules are now also recognized to contribute to normal physiology, where they influence the expression of their parental genes and regulate cellular pathways. This review offers a collective and comprehensive overview of cellular chimeric entities encompassing the mechanisms involved in their generation, insights on their evolution, functions in gene regulation and their current and novel clinical applications.
Affiliation(s)
- Gayatri G Chitale
  - National Centre for Cell Science, Savitribai Phule Pune University, Ganeshkhind, Pune 411007, India
- Shweta R Kulkarni
  - National Centre for Cell Science, Savitribai Phule Pune University, Ganeshkhind, Pune 411007, India
- Sharmila A Bapat
  - National Centre for Cell Science, Savitribai Phule Pune University, Ganeshkhind, Pune 411007, India
2
Shimpi AA, Naegle KM. Linguistic networks uncover grammatical constraints of protein sentences comprised of domain-based words. bioRxiv 2024:2024.12.04.626803. [PMID: 39677636 PMCID: PMC11643033 DOI: 10.1101/2024.12.04.626803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]
Abstract
Evolution has developed a set of principles that determine feasible domain combinations analogous to grammar within natural languages. Treating domains as words and proteins as sentences, made up of words, we apply a linguistic approach to represent the human proteome as an n-gram network. Combining this with network theory and application, we explore the functional language and rules of the human proteome. Additionally, we explored subnetwork languages by focusing on reversible post-translational modifications (PTMs) systems that follow a reader-writer-eraser paradigm. We find that PTM systems appear to sample grammar rules near the onset of the system expansion, but then convergently evolve towards similar grammar rules, which stabilize during the post-metazoan switch. For example, reader and writer domains are typically tightly connected through shared n-grams, but eraser domains are almost always loosely or completely disconnected from readers and writers. Additionally, after grammar fixation, domains with verb-like properties, such as writers and erasers, never appear - consistent with the idea of natural grammar that leads to clarity and limits futile enzymatic cycles. Then, given how some cancer fusion genes represent the possibility for the emergence of novel language, we investigate how cancer fusion genes alter the human proteome n-gram network. We find most cancer fusion genes follow existing grammar rules. Collectively, these results suggest that n-gram based analysis of proteomes is a complement to the more direct protein-protein interaction networks. N-grams can capture abstract functional connections in a more fully described manner, limited only by the definition of domains within the proteome and not by the combinatorial challenge of capturing all protein interaction connections.
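As a rough illustration of the "domains as words, proteins as sentences" idea described in this abstract, the minimal Python sketch below builds a bigram (2-gram) network from ordered domain architectures. The function names, the toy proteome, and the decision to count directed adjacent domain pairs are assumptions made for illustration only; they are not the authors' implementation.

```python
from collections import Counter


def domain_ngrams(domains, n=2):
    """Return the contiguous n-grams of one protein's ordered domain list."""
    return [tuple(domains[i:i + n]) for i in range(len(domains) - n + 1)]


def build_bigram_network(proteome):
    """Count directed edges between adjacent domains across all proteins.

    `proteome` maps a protein name to its ordered list of domain names.
    The result maps (left_domain, right_domain) to the number of times
    that adjacent pair occurs.
    """
    edges = Counter()
    for domains in proteome.values():
        for left, right in domain_ngrams(domains, n=2):
            edges[(left, right)] += 1
    return edges


# Toy domain architectures (hypothetical, not taken from the paper).
toy_proteome = {
    "protein_A": ["SH3", "SH2", "Pkinase"],
    "protein_B": ["SH2", "Pkinase"],
    "protein_C": ["SH2", "SH2", "PTP"],
}

network = build_bigram_network(toy_proteome)
for (left, right), count in sorted(network.items()):
    print(f"{left} -> {right}: {count}")
```

In such a sketch, a fusion protein could be checked against the network by asking whether each of its domain bigrams already exists as an edge, which is one simple way to think about a fusion "following existing grammar rules".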
Affiliation(s)
- Adrian A. Shimpi
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Genome Sciences, University of Virginia, Charlottesville, VA, 22903
- Kristen M. Naegle
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, 22903
  - Department of Genome Sciences, University of Virginia, Charlottesville, VA, 22903
3
Segovia D, Tepes PS. p160 nuclear receptor coactivator family members and their role in rare fusion‑driven neoplasms (Review). Oncol Lett 2024; 27:210. [PMID: 38572059 PMCID: PMC10988192 DOI: 10.3892/ol.2024.14343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 02/22/2024] [Indexed: 04/05/2024] Open
Abstract
Gene fusions with translocations involving nuclear receptor coactivators (NCoAs) are relatively common among fusion-driven malignancies. NCoAs are essential mediators of environmental cues and can modulate the transcription of downstream target genes upon binding to activated nuclear receptors. Therefore, fusion proteins containing NCoAs can become strong oncogenic drivers, affecting the cell transcriptional profile. These tumors show a strong dependency on the fusion oncogene; therefore, the direct pharmacological targeting of the fusion protein becomes an attractive strategy for therapy. Currently, different combinations of chemotherapy regimens are used to treat a variety of NCoA-fusion-driven tumors, but given the frequent tumor reoccurrence, more efficient treatment strategies are needed. Specific approaches directed towards inhibition or silencing of the fusion gene need to be developed while minimizing the interference with the original genes. This review highlights the relevant literature describing the normal function and structure of NCoAs and their oncogenic activity in NCoA-gene fusion-driven cancers, and explores potential strategies that could be effective in targeting these fusions.
Affiliation(s)
- Danilo Segovia
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
  - Stony Brook University, Stony Brook, NY 11794, USA
- Polona Safaric Tepes
  - Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY 11030, USA
4
Mukherjee S, Mukherjee SB, Frenkel-Morgenstern M. Functional and regulatory impact of chimeric RNAs in human normal and cancer cells. Wiley Interdiscip Rev RNA 2023; 14:e1777. [PMID: 36633099 DOI: 10.1002/wrna.1777] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/10/2022] [Revised: 12/21/2022] [Accepted: 12/27/2022] [Indexed: 01/13/2023]
Abstract
Fusions of two genes can lead to the generation of chimeric RNAs, which may have a distinct functional role from their original molecules. Chimeric RNAs could encode novel functional proteins or serve as novel long noncoding RNAs (lncRNAs). The appearance of chimeric RNAs in a cell could help to generate new functionality and phenotypic diversity that might help the cell survive new environmental stress. Several recent studies have demonstrated the functional roles of various chimeric RNAs in cancer progression, and these RNAs are considered biomarkers for cancer diagnosis and sometimes even drug targets. Further, growing evidence has demonstrated the potential functional association of chimeric RNAs with cancer heterogeneity, drug resistance, and cancer evolution. Recent studies highlighted that chimeric RNAs also have functional potential in normal physiological processes. Several potentially functional chimeric RNAs were discovered in human cancer and normal cells in the last two decades. This could indicate that chimeric RNAs are a hidden layer of the human transcriptome that should be explored for functional insights to better understand the functional evolution of the genome and disease development, which could facilitate clinical practice improvements. This review summarizes the current knowledge of chimeric RNAs and highlights their functional, regulatory, and evolutionary impact on different cancers and normal physiological processes. Further, we will discuss the potential functional roles of a recently discovered novel class of chimeric RNAs named sense-antisense/cross-strand chimeric RNAs generated by the fusion of the bi-directional transcripts of the same gene. This article is categorized under: Regulatory RNAs/RNAi/Riboswitches > Regulatory RNAs.
Affiliation(s)
- Sumit Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
  - Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
  - Cancer Data Science Laboratory (CDSL), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Sunanda Biswas Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
5
Mukherjee SB, Mukherjee S, Frenkel-Morgenstern M. Fusion proteins mediate alternation of protein interaction networks in cancers. Adv Protein Chem Struct Biol 2022; 131:165-176. [PMID: 35871889 DOI: 10.1016/bs.apcsb.2022.05.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
Fusions of two different genes could lead to the production of chimeric RNAs, which could be translated into novel fusion (or chimeric) proteins. Fusion proteins often act as oncoproteins and drive cancer development, particularly in leukemia and lymphomas. Fusion proteins modify the existing protein-protein interaction (PPI) networks, which could eliminate some PPIs by removing protein domains in such fusions. This alteration of protein interaction networks could impact the signaling pathways and switch on the cancer-promoting activity that could drive the generation of cancer phenotypes and/or loss of controlled apoptosis. Thus, knowledge of the fusion proteins and their protein interaction networks could facilitate a deeper molecular understanding of cancer development, which could help to design new approaches for cancer therapies. Here, we discuss the structural features of fusion proteins and how they impact the PPI networks in cancers. Further, we discuss how to analyze the fusion protein-mediated alteration of PPI networks in cancers.
Affiliation(s)
- Sunanda Biswas Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
- Sumit Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
6
The Landscape of Novel Expressed Chimeric RNAs in Rheumatoid Arthritis. Cells 2022; 11:cells11071092. [PMID: 35406656 PMCID: PMC8998144 DOI: 10.3390/cells11071092] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/15/2022] [Revised: 03/20/2022] [Accepted: 03/22/2022] [Indexed: 02/06/2023] Open
Abstract
In cancers and other complex diseases, the fusion of two genes can lead to the production of chimeric RNAs, which are associated with disease development. Several recurrent chimeric RNAs are expressed in different cancers and are thus used for clinical cancer diagnosis. Rheumatoid arthritis (RA) is an immune-mediated joint disorder resulting in synovial inflammation and joint destruction. Despite advances in therapy, many patients do not respond to treatment and present persistent inflammation. Understanding the landscape of chimeric RNA expression in RA patients could provide a better insight into RA pathogenesis, which might provide better treatment strategies and tailored therapies. Accordingly, we analyzed the publicly available RNA-seq data of synovium tissue from 151 RA patients and 28 healthy controls and were able to identify 37 recurrent chimeric RNAs found to be expressed in at least 3 RA samples. Furthermore, the parental genes of these 37 recurrent chimeric RNAs were found to be differentially expressed and enriched in immune-related processes, such as adaptive immune response and the positive regulation of B-cell activation. Interestingly, the appearance of 5 coding and 23 non-coding chimeric RNAs might be associated with regulating their parental gene expression, leading to the generation of dysfunctional immune responses, such as inflammation and bone destruction. Therefore, in this paper, we present the first study to demonstrate the novel chimeric RNAs that are highly expressed and functional in RA.
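The recurrence criterion mentioned in this abstract (a chimeric RNA expressed in at least 3 RA samples) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the input format (a mapping from sample ID to the chimeras called in that sample), the function name, and the placeholder gene names are all hypothetical and do not reflect the study's actual pipeline.

```python
from collections import Counter


def recurrent_chimeras(calls_by_sample, min_samples=3):
    """Return chimeras detected in at least `min_samples` distinct samples.

    `calls_by_sample` maps a sample ID to an iterable of chimera labels.
    Each chimera is counted once per sample, even if it is listed twice.
    """
    counts = Counter()
    for chimeras in calls_by_sample.values():
        counts.update(set(chimeras))
    return {name: n for name, n in counts.items() if n >= min_samples}


# Hypothetical calls; the gene names are placeholders, not real findings.
toy_calls = {
    "RA_01": ["GENE1--GENE2", "GENE3--GENE4"],
    "RA_02": ["GENE1--GENE2", "GENE1--GENE2"],
    "RA_03": ["GENE1--GENE2", "GENE5--GENE6"],
    "RA_04": ["GENE3--GENE4"],
}

recurrent = recurrent_chimeras(toy_calls, min_samples=3)
print(recurrent)
```

Counting each chimera once per sample keeps duplicate calls within a single sample from inflating the recurrence count.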
7
Mukherjee S, Frenkel-Morgenstern M. Evolutionary impact of chimeric RNAs on generating phenotypic plasticity in human cells. Trends Genet 2021; 38:4-7. [PMID: 34579972 DOI: 10.1016/j.tig.2021.08.015] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/03/2021] [Revised: 08/29/2021] [Accepted: 08/31/2021] [Indexed: 11/18/2022]
Abstract
Chimeric RNAs are generated by the fusion of the exons or introns of two genes. The generation of chimeric RNAs is important for the functional expansion of cells. Here, we describe the functional implications of chimeric RNAs for generating phenotypic plasticity from an evolutionary perspective.
Affiliation(s)
- Sumit Mukherjee
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
8
Carmi G, Gorohovski A, Frenkel-Morgenstern M. EvoProDom: Evolutionary modeling of protein families by assessing translocations of protein domains. FEBS Open Bio 2021; 11:2507-2524. [PMID: 34196123 PMCID: PMC8409312 DOI: 10.1002/2211-5463.13245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2021] [Revised: 06/22/2021] [Accepted: 06/30/2021] [Indexed: 11/29/2022] Open
Abstract
Here, we introduce a novel ‘evolution of protein domains’ (EvoProDom) model for describing the evolution of proteins based on the ‘mix and merge’ of protein domains. We assembled and integrated genomic and proteomic data comprising protein domain content and orthologous proteins from 109 organisms. In EvoProDom, we characterized evolutionary events, particularly translocations, as reciprocal exchanges of protein domains between orthologous proteins in different organisms. We showed that protein domains that translocate with high frequency are generated by transcripts enriched in trans‐splicing events, that is, the generation of novel transcripts from the fusion of two distinct genes. In EvoProDom, we describe a general method to collate orthologous protein annotation from KEGG, and protein domain content from protein sequences using tools such as KofamKOALA and Pfam. To summarize, EvoProDom presents a novel model for protein evolution based on the ‘mix and merge’ of protein domains rather than DNA‐based evolution models. This confers the advantage of considering chromosomal alterations as drivers of protein evolutionary events.
Affiliation(s)
- Gon Carmi
  - Cancer Genomics and BioComputing of Complex Diseases Lab, The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, Safed, 13195, Israel
- Alessandro Gorohovski
  - Cancer Genomics and BioComputing of Complex Diseases Lab, The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, Safed, 13195, Israel
- Milana Frenkel-Morgenstern
  - Cancer Genomics and BioComputing of Complex Diseases Lab, The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, Safed, 13195, Israel
9
Carmi G, Tagore S, Gorohovski A, Sivan A, Raviv-Shay D, Frenkel-Morgenstern M. Design principles of gene evolution for niche adaptation through changes in protein-protein interaction networks. Sci Rep 2020; 10:15628. [PMID: 32973219 PMCID: PMC7519090 DOI: 10.1038/s41598-020-71976-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/02/2019] [Accepted: 08/24/2020] [Indexed: 12/15/2022] Open
Abstract
In contrast to fossorial and above-ground organisms, subterranean species have adapted to the extreme stresses of living underground. We analyzed the predicted protein–protein interactions (PPIs) of all gene products, including those of stress-response genes, among nine subterranean, ten fossorial, and 13 above-ground species. We considered 10,314 unique orthologous protein families and constructed 5,879,879 PPIs in all organisms using ChiPPI. We found a strong association between PPI network modulation and adaptation to specific habitats, noting that mutations in genes and changes in protein sequences were not linked directly with niche adaptation in the organisms sampled. Thus, orthologous hypoxia, heat-shock, and circadian clock proteins were found to cluster according to habitat, based on PPIs rather than on sequence similarities. Curiously, "ordered" domains were preserved in above-ground species, while "disordered" domains were conserved in subterranean organisms, a finding confirmed for proteins in the DisProt database. Furthermore, proteins with disordered regions were found to adopt significantly less optimal codon usage in subterranean species than in fossorial and above-ground species. These findings reveal design principles of protein networks by means of alterations in protein domains, thus providing insight into deep mechanisms of evolutionary adaptation, generally, and particularly of species to underground living and other confined habitats.
Affiliation(s)
- Gon Carmi
  - The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, 13195, Safed, Israel
- Somnath Tagore
  - The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, 13195, Safed, Israel
  - Department of Systems Biology, Columbia University Medical Center, Herbert Irving Cancer Research Center, New York, USA
- Alessandro Gorohovski
  - The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, 13195, Safed, Israel
- Aviad Sivan
  - The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, 13195, Safed, Israel
- Dorith Raviv-Shay
  - The Azrieli Faculty of Medicine, Bar-Ilan University, 8 Henrietta Szold St, 13195, Safed, Israel
10
Showpnil IA, Miller KR, Taslim C, Pishas KI, Lessnick SL, Theisen ER. Mapping the Structure-Function Relationships of Disordered Oncogenic Transcription Factors Using Transcriptomic Analysis. J Vis Exp 2020. [PMID: 32658189 DOI: 10.3791/61564] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/26/2022] Open
Abstract
Many cancers are characterized by chromosomal translocations which result in the expression of oncogenic fusion transcription factors. Typically, these proteins contain an intrinsically disordered domain (IDD) fused with the DNA-binding domain (DBD) of another protein and orchestrate widespread transcriptional changes to promote malignancy. These fusions are often the sole recurring genomic aberration in the cancers they cause, making them attractive therapeutic targets. However, targeting oncogenic transcription factors requires a better understanding of the mechanistic role that low-complexity IDDs play in their function. The N-terminal domain of EWSR1 is an IDD involved in a variety of oncogenic fusion transcription factors, including EWS/FLI, EWS/ATF, and EWS/WT1. Here, we use RNA-sequencing to investigate the structural features of the EWS domain important for transcriptional function of EWS/FLI in Ewing sarcoma. First, shRNA-mediated depletion of the endogenous fusion from Ewing sarcoma cells is paired with ectopic expression of a variety of EWS-mutant constructs. Then, RNA-sequencing is used to analyze the transcriptomes of cells expressing these constructs to characterize the functional deficits associated with mutations in the EWS domain. By integrating the transcriptomic analyses with previously published information about EWS/FLI DNA binding motifs and genomic localization, as well as functional assays for transforming ability, we were able to identify structural features of EWS/FLI important for oncogenesis and define a novel set of EWS/FLI target genes critical for Ewing sarcoma. This paper demonstrates the use of RNA-sequencing as a method to map the structure-function relationship of the intrinsically disordered domain of oncogenic transcription factors.
Affiliation(s)
- Iftekhar A Showpnil
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
  - Molecular, Cellular, and Developmental Biology Program, The Ohio State University
- Kyle R Miller
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
- Cenny Taslim
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
- Kathleen I Pishas
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
- Stephen L Lessnick
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
  - Division of Pediatric Hematology/Oncology/Blood & Marrow Transplant, The Ohio State University
- Emily R Theisen
  - Center for Childhood Cancer and Blood Diseases, Abigail Wexner Research Institute at Nationwide Children's Hospital
  - Department of Pediatrics, The Ohio State University
11
Balamurali D, Gorohovski A, Detroja R, Palande V, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS 5.0: the comprehensive database of chimeric transcripts matched with druggable fusions and 3D chromatin maps. Nucleic Acids Res 2020; 48:D825-D834. [PMID: 31747015 PMCID: PMC7145514 DOI: 10.1093/nar/gkz1025] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/15/2019] [Revised: 10/18/2019] [Accepted: 10/26/2019] [Indexed: 12/11/2022] Open
Abstract
Chimeric RNA transcripts are formed when exons from two genes fuse together, often due to chromosomal translocations, transcriptional errors or trans-splicing effect. While these chimeric RNAs produce functional proteins only in certain cases, they play a significant role in disease phenotyping and progression. ChiTaRS 5.0 (http://chitars.md.biu.ac.il/) is the latest and most comprehensive chimeric transcript repository, with 111 582 annotated entries from eight species, including 23 167 known human cancer breakpoints. The database includes unique information correlating chimeric breakpoints with 3D chromatin contact maps, generated from public datasets of chromosome conformation capture techniques (Hi-C). In this update, we have added curated information on druggable fusion targets matched with chimeric breakpoints, which are applicable to precision medicine in cancers. The introduction of a new section that lists chimeric RNAs in various cell-lines is another salient feature. Finally, using text-mining techniques, novel chimeras in Alzheimer's disease, schizophrenia, dyslexia and other diseases were collected in ChiTaRS. Thus, this improved version is an extensive catalogue of chimeras from multiple species. It extends our understanding of the evolution of chimeric transcripts in eukaryotes and contributes to the analysis of 3D genome conformational changes and the functional role of chimeras in the etiopathogenesis of cancers and other complex diseases.
Affiliation(s)
- Deepak Balamurali
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Alessandro Gorohovski
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Rajesh Detroja
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Vikrant Palande
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Dorith Raviv-Shay
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Milana Frenkel-Morgenstern
  - Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
12
Frenkel-Morgenstern M. Identification of Chimeric RNAs Using RNA-Seq Reads and Protein-Protein Interactions of Translated Chimeras. Methods Mol Biol 2020; 2079:27-40. [PMID: 31728960 DOI: 10.1007/978-1-4939-9904-0_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Chimeric RNA moieties typically consist of exons from two genes expressed from different genomic locations and produced by chromosomal translocations, trans-splicing or transcription errors. Recent advances in next-generation sequencing procedures have opened new horizons for identification of novel chimeric transcripts in various diseases in a personalized manner. Here we describe the detailed computational procedures to identify chimeric transcripts using RNA-seq reads. Moreover, we elaborate on the domain-domain co-occurrence method to detect alterations in chimeric protein-protein interaction (ChiPPI) networks produced by chimeric RNA that are translated to chimeric proteins.
13
Kim P, Jia P, Zhao Z. Kinase impact assessment in the landscape of fusion genes that retain kinase domains: a pan-cancer study. Brief Bioinform 2019; 19:450-460. [PMID: 28013235 DOI: 10.1093/bib/bbw127] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 09/21/2016] [Indexed: 12/13/2022] Open
Abstract
Assessing the impact of kinase in gene fusion is essential for both identifying driver fusion genes (FGs) and developing molecular targeted therapies. Kinase domain retention is a crucial factor in kinase fusion genes (KFGs), but such a systematic investigation has not been done yet. To this end, we analyzed kinase domain retention (KDR) status in chimeric protein sequences of 914 KFGs covering 312 kinases across 13 major cancer types. Based on 171 kinase domain-retained KFGs including 101 kinases, we studied their recurrence, kinase groups, fusion partners, exon-based expression depth, short DNA motifs around the break points and networks. Our results, such as more KDR than 5'-kinase fusion genes, combinatorial effects between 3'-KDR kinases and their 5'-partners and a signal transduction-specific DNA sequence motif in the break point intronic sequences, supported positive selection on 3'-kinase fusion genes in cancer. We introduced a degree-of-frequency (DoF) score to measure the possible number of KFGs of a kinase. Interestingly, kinases with high DoF scores tended to undergo strong gene expression alteration at the break points. Furthermore, our KDR gene fusion network analysis revealed six of the seven kinases with the highest DoF scores (ALK, BRAF, MET, NTRK1, NTRK3 and RET) were all observed in thyroid carcinoma. Finally, we summarized common features of 'effective' (highly recurrent) kinases in gene fusions such as expression alteration at break point, redundant usage in multiple cancer types and 3'-location tendency. Collectively, our findings are useful for prioritizing driver kinases and FGs and provided insights into KFGs' clinical implications.
Affiliation(s)
- Pora Kim
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
14
Latysheva NS, Babu MM. Molecular Signatures of Fusion Proteins in Cancer. ACS Pharmacol Transl Sci 2019; 2:122-133. [PMID: 32219217 PMCID: PMC7088938 DOI: 10.1021/acsptsci.9b00019] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/05/2019] [Indexed: 01/07/2023]
Abstract
Although gene fusions are recognized as driver mutations in a wide variety of cancers, the general molecular mechanisms underlying oncogenic fusion proteins are insufficiently understood. Here, we employ large-scale data integration and machine learning and (1) identify three functionally distinct subgroups of gene fusions and their molecular signatures; (2) characterize the cellular pathways rewired by fusion events across different cancers; and (3) analyze the relative importance of over 100 structural, functional, and regulatory features of ∼2200 gene fusions. We report subgroups of fusions that likely act as driver mutations and find that gene fusions disproportionately affect pathways regulating cellular shape and movement. Although fusion proteins are similar across different cancer types, they affect cancer type-specific pathways. Key indicators of fusion-forming proteins include high and nontissue specific expression, numerous splice sites, and higher centrality in protein-interaction networks. Together, these findings provide unifying and cancer type-specific trends across diverse oncogenic fusion proteins.
Affiliation(s)
- Natasha S Latysheva
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - M Madan Babu
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
|
15
|
Krzyzanowski PM, Sircoulomb F, Yousif F, Normand J, La Rose J, E Francis K, Suarez F, Beck T, McPherson JD, Stein LD, Rottapel RK. Regional perturbation of gene transcription is associated with intrachromosomal rearrangements and gene fusion transcripts in high grade ovarian cancer. Sci Rep 2019; 9:3590. [PMID: 30837567 PMCID: PMC6401071 DOI: 10.1038/s41598-019-39878-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2018] [Accepted: 01/30/2019] [Indexed: 01/10/2023] Open
Abstract
Genomic rearrangements are a hallmark of cancer biology and progression, allowing cells to rapidly transform through alterations in regulatory structures, changes in expression patterns, reprogramming of signaling pathways, and creation of novel transcripts via gene fusion events. Though functional gene fusions encoding oncogenic proteins are the most dramatic outcomes of genomic rearrangements, we investigated the relationship between rearrangements evidenced by fusion transcripts and local expression changes in cancer using transcriptome data alone. A total of 9,953 gene fusion predictions from 418 primary serous ovarian cancer tumors were analyzed, identifying depletions of gene fusion breakpoints within coding regions of fused genes as well as an N-terminal enrichment of breakpoints within fused genes. We identified 48 genes with significant fusion-associated upregulation and furthermore demonstrate that significant regional overexpression of intact genes in patient transcriptomes occurs within 1 megabase of 78 novel gene fusions that function as central markers of these regions. We reveal that cancer transcriptomes select for gene fusions that preserve protein and protein domain coding potential. The association of gene fusion transcripts with neighboring gene overexpression supports rearrangements as a mechanism through which cancer cells remodel their transcriptomes and identifies a new way to utilize gene fusions as indicators of regional expression changes in diseased cells with only transcriptomic data.
Affiliation(s)
- Paul M Krzyzanowski
- Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada.
| | - Fabrice Sircoulomb
- Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
| | - Fouad Yousif
- Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
| | - Josee Normand
- Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada.,Department of Medical Biophysics, University of Toronto, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Jose La Rose
- Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
| | - Kyle E Francis
- Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
| | - Fernando Suarez
- Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
| | - Tim Beck
- Human Longevity Inc., San Diego, California, USA
| | - John D McPherson
- Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada.,University of California, Davis Medical Center, Sacramento, California, USA
| | - Lincoln D Stein
- Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Robert K Rottapel
- Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada.,Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
|
16
|
Pintarelli G, Dassano A, Cotroneo CE, Galvan A, Noci S, Piazza R, Pirola A, Spinelli R, Incarbone M, Palleschi A, Rosso L, Santambrogio L, Dragani TA, Colombo F. Read-through transcripts in normal human lung parenchyma are down-regulated in lung adenocarcinoma. Oncotarget 2017; 7:27889-98. [PMID: 27058892 PMCID: PMC5053695 DOI: 10.18632/oncotarget.8556] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2015] [Accepted: 02/18/2016] [Indexed: 12/26/2022] Open
Abstract
Read-through transcripts result from the continuous transcription of adjacent, similarly oriented genes, with the splicing out of the intergenic region. They have been found in several neoplastic and normal tissues, but their pathophysiological significance is unclear. We used high-throughput sequencing of cDNA fragments (RNA-Seq) to identify read-through transcripts in the non-involved lung tissue of 64 surgically treated lung adenocarcinoma patients. A total of 52 distinct read-through species was identified, with 24 patients having at least one read-through event, up to a maximum of 17 such transcripts in one patient. Sanger sequencing validated 28 of these transcripts and identified an additional 15, for a total of 43 distinct read-through events involving 35 gene pairs. Expression levels of 10 validated read-through transcripts were measured by quantitative PCR in pairs of matched non-involved lung tissue and lung adenocarcinoma tissue from 45 patients. Higher expression levels were observed in normal lung tissue than in the tumor counterpart, with median relative quantification ratios between normal and tumor varying from 1.90 to 7.78; the difference was statistically significant (P < 0.001, Wilcoxon's signed-rank test for paired samples) for eight transcripts: ELAVL1–TIMM44, FAM162B–ZUFSP, IFNAR2–IL10RB, INMT–FAM188B, KIAA1841–C2orf74, NFATC3–PLA2G15, SIRPB1–SIRPD, and SHANK3–ACR. This report documents the presence of read-through transcripts in apparently normal lung tissue, with inter-individual differences in patterns and abundance. It also shows their down-regulation in tumors, suggesting that these chimeric transcripts may function as tumor suppressors in lung tissue.
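The definition given in this abstract (continuous transcription of adjacent, similarly oriented genes) can be turned into a simple positional filter. The sketch below is an illustration only and not the study's detection pipeline: the `Gene` record, the `max_gap` threshold, and the gene names and coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate


def is_read_through_candidate(upstream: Gene, downstream: Gene,
                              max_gap: int = 50_000) -> bool:
    """Check whether transcription of `upstream` could continue into `downstream`.

    Requires the same chromosome and strand, with `downstream` lying beyond the
    3' end of `upstream` in the direction of transcription. `max_gap` is an
    arbitrary illustrative limit on the intergenic distance.
    """
    if upstream.chrom != downstream.chrom or upstream.strand != downstream.strand:
        return False
    if upstream.strand == "+":
        gap = downstream.start - upstream.end
    else:
        # On the minus strand, transcription runs toward lower coordinates.
        gap = upstream.start - downstream.end
    return 0 < gap <= max_gap


gene_a = Gene("GENE_A", "chr1", "+", 1_000, 5_000)
gene_b = Gene("GENE_B", "chr1", "+", 7_000, 9_000)
print(is_read_through_candidate(gene_a, gene_b))
```

A filter of this kind only nominates gene pairs; evidence for an actual read-through event would still have to come from reads spanning the exon-exon junction between the two genes.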
Affiliation(s)
- Giulia Pintarelli
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
| | - Alice Dassano
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
| | - Chiara E Cotroneo
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy.,Present Address: UCD School of Biomolecular and Biomedical Science, University College Dublin, Belfield, Dublin, Ireland
| | - Antonella Galvan
- Formerly, Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
| | - Sara Noci
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
| | - Rocco Piazza
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy.,Hematology and Clinical Research Unit, San Gerardo Hospital, Monza, Italy
| | - Alessandra Pirola
- Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Roberta Spinelli
- Formerly, Department of Health Sciences, University of Milano-Bicocca, Monza, Italy
| | - Matteo Incarbone
- Department of Surgery, San Giuseppe Hospital, Multimedica, Milan, Italy
| | - Alessandro Palleschi
- Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
| | - Lorenzo Rosso
- Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
| | - Luigi Santambrogio
- Department of Surgery, IRCCS Fondazione Cà Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy
| | - Tommaso A Dragani
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
| | - Francesca Colombo
- Department of Predictive and Prevention Medicine, Fondazione IRCCS, Istituto Nazionale dei Tumori, Milan, Italy
|
17
|
Shapiro JA. Living Organisms Author Their Read-Write Genomes in Evolution. BIOLOGY 2017; 6:E42. [PMID: 29211049 PMCID: PMC5745447 DOI: 10.3390/biology6040042] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 08/23/2017] [Revised: 11/17/2017] [Accepted: 11/28/2017] [Indexed: 12/18/2022]
Abstract
Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with "non-coding" DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called "non-coding" RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations.
Affiliation(s)
- James A Shapiro
- Department of Biochemistry and Molecular Biology, University of Chicago GCIS W123B, 979 E. 57th Street, Chicago, IL 60637, USA.
|
18
|
Frenkel-Morgenstern M, Gorohovski A, Tagore S, Sekar V, Vazquez M, Valencia A. ChiPPI: a novel method for mapping chimeric protein-protein interactions uncovers selection principles of protein fusion events in cancer. Nucleic Acids Res 2017; 45:7094-7105. [PMID: 28549153 PMCID: PMC5499553 DOI: 10.1093/nar/gkx423] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2016] [Accepted: 05/07/2017] [Indexed: 12/20/2022] Open
Abstract
Fusion proteins, comprising peptides deriving from the translation of two parental genes, are produced in cancer by chromosomal aberrations. The expressed fusion protein incorporates domains of both parental proteins. Using a methodology that treats discrete protein domains as binding sites for specific domains of interacting proteins, we have cataloged the protein interaction networks for 11 528 cancer fusions (ChiTaRS-3.1). Here, we present our novel method, chimeric protein–protein interactions (ChiPPI) that uses the domain–domain co-occurrence scores in order to identify preserved interactors of chimeric proteins. Mapping the influence of fusion proteins on cell metabolism and pathways reveals that ChiPPI networks often lose tumor suppressor proteins and gain oncoproteins. Furthermore, fusions often induce novel connections between non-interactors skewing interaction networks and signaling pathways. We compared fusion protein PPI networks in leukemia/lymphoma, sarcoma and solid tumors finding distinct enrichment patterns for each disease type. While certain pathways are enriched in all three diseases (Wnt, Notch and TGF β), there are distinct patterns for leukemia (EGFR signaling, DNA replication and CCKR signaling), for sarcoma (p53 pathway and CCKR signaling) and solid tumors (FGFR and EGFR signaling). Thus, the ChiPPI method represents a comprehensive tool for studying the anomaly of skewed cellular networks produced by fusion proteins in cancer.
Affiliation(s)
- Somnath Tagore
- Faculty of Medicine, Bar-Ilan-University, Henrietta Szold 8, Safed 1311502, Israel
| | - Vaishnovi Sekar
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), M.F.Almagro 3, 28029 Madrid, Spain
| | - Miguel Vazquez
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), M.F.Almagro 3, 28029 Madrid, Spain
| | - Alfonso Valencia
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), M.F.Almagro 3, 28029 Madrid, Spain
|
19
|
Gorohovski A, Tagore S, Palande V, Malka A, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS-3.1-the enhanced chimeric transcripts and RNA-seq database matched with protein-protein interactions. Nucleic Acids Res 2016; 45:D790-D795. [PMID: 27899596 PMCID: PMC5210585 DOI: 10.1093/nar/gkw1127] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2016] [Revised: 10/26/2016] [Accepted: 10/30/2016] [Indexed: 12/17/2022] Open
Abstract
Discovery of chimeric RNAs, which are produced by chromosomal translocations as well as the joining of exons from different genes by trans-splicing, has added a new level of complexity to our study and understanding of the transcriptome. The enhanced ChiTaRS-3.1 database (http://chitars.md.biu.ac.il) is designed to make widely accessible a wealth of mined data on chimeric RNAs, with easy-to-use analytical tools built-in. The database comprises 34 922 chimeric transcripts along with 11 714 cancer breakpoints. In this latest version, we have included multiple cross-references to GeneCards, iHop, PubMed, NCBI, Ensembl, OMIM, RefSeq and the Mitelman collection for every entry in the ‘Full Collection’. In addition, for every chimera, we have added a predicted chimeric protein–protein interaction (ChiPPI) network, which allows for easy visualization of protein partners of both parental and fusion proteins for all human chimeras. The database contains a comprehensive annotation for 34 922 chimeric transcripts from eight organisms, and includes the manual annotation of 200 sense-antiSense (SaS) chimeras. The current improvements in the content and functionality to the ChiTaRS database make it a central resource for the study of chimeric transcripts and fusion proteins.
Affiliation(s)
- Alessandro Gorohovski
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
| | - Somnath Tagore
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
| | - Vikrant Palande
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
| | - Assaf Malka
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
| | - Dorith Raviv-Shay
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
| | - Milana Frenkel-Morgenstern
- Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel. Corresponding author:
|
20
|
Lees JG, Dawson NL, Sillitoe I, Orengo CA. Functional innovation from changes in protein domains and their combinations. Curr Opin Struct Biol 2016; 38:44-52. [DOI: 10.1016/j.sbi.2016.05.016] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2016] [Revised: 05/17/2016] [Accepted: 05/24/2016] [Indexed: 10/21/2022]
|
21
|
Latysheva NS, Babu MM. Discovering and understanding oncogenic gene fusions through data intensive computational approaches. Nucleic Acids Res 2016; 44:4487-503. [PMID: 27105842 PMCID: PMC4889949 DOI: 10.1093/nar/gkw282] [Citation(s) in RCA: 115] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2016] [Accepted: 03/24/2016] [Indexed: 12/21/2022] Open
Abstract
Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases and (iii) clinical research that seeks to either therapeutically target fusion transcripts and proteins or leverages advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different-yet highly complementary and symbiotic-approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation.
Affiliation(s)
- Natasha S Latysheva
- MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
| | - M Madan Babu
- MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
|
22
|
Lei Q, Li C, Zuo Z, Huang C, Cheng H, Zhou R. Evolutionary Insights into RNA trans-Splicing in Vertebrates. Genome Biol Evol 2016; 8:562-77. [PMID: 26966239 PMCID: PMC4824033 DOI: 10.1093/gbe/evw025] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Pre-RNA splicing is an essential step in generating mature mRNA. RNA trans-splicing combines two separate pre-mRNA molecules to form a chimeric non-co-linear RNA, which may exert a function distinct from its original molecules. Trans-spliced RNAs may encode novel proteins or serve as noncoding or regulatory RNAs. These novel RNAs not only increase the complexity of the proteome but also provide new regulatory mechanisms for gene expression. An increasing amount of evidence indicates that trans-splicing occurs frequently in both physiological and pathological processes. In addition, mRNA reprogramming based on trans-splicing has been successfully applied in RNA-based therapies for human genetic diseases. Nevertheless, clarifying the extent and evolution of trans-splicing in vertebrates and developing detection methods for trans-splicing remain challenging. In this review, we summarize previous research, highlight recent advances in trans-splicing, and discuss possible splicing mechanisms and functions from an evolutionary viewpoint.
Affiliation(s)
- Quan Lei
- Department of Genetics, College of Life Sciences, Wuhan University, P.R. China
| | - Cong Li
- Department of Genetics, College of Life Sciences, Wuhan University, P.R. China
| | - Zhixiang Zuo
- Department of Genetics, College of Life Sciences, Wuhan University, P.R. China
| | - Chunhua Huang
- Department of Cell Biology, College of Life Sciences, Wuhan University, P.R. China
| | - Hanhua Cheng
- Department of Cell Biology, College of Life Sciences, Wuhan University, P.R. China
| | - Rongjia Zhou
- Department of Genetics, College of Life Sciences, Wuhan University, P.R. China
|
23
|
Chuang TJ, Wu CS, Chen CY, Hung LY, Chiang TW, Yang MY. NCLscan: accurate identification of non-co-linear transcripts (fusion, trans-splicing and circular RNA) with a good balance between sensitivity and precision. Nucleic Acids Res 2015; 44:e29. [PMID: 26442529 PMCID: PMC4756807 DOI: 10.1093/nar/gkv1013] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2015] [Accepted: 09/24/2015] [Indexed: 12/19/2022] Open
Abstract
Analysis of RNA-seq data often detects numerous ‘non-co-linear’ (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method (‘NCLscan’), which utilized a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives, enabling NCLscan to outperform 18 other publicly-available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, type of intragenic or intergenic NCL event, read depth of coverage, read length or expression level of NCL transcript. Given this high accuracy, NCLscan was applied to distinguishing between trans-spliced, circular and fusion transcripts on the basis of poly(A)- and nonpoly(A)-selected RNA-seq data. We showed that circular RNAs were expressed more ubiquitously, more abundantly and less cell type-specifically than trans-spliced and fusion transcripts. Our study thus describes a robust pipeline for the discovery of NCL transcripts, and sheds light on the fundamental biology of these non-canonical RNA events in the human transcriptome.
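What "topologically inconsistent with the reference genome" means for a single splice junction can be sketched as follows. This is not the NCLscan algorithm, whose stepwise alignment strategy is not reproduced here; the `SpliceSite` record, the three labels, and the coordinates are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpliceSite:
    gene: str
    chrom: str
    strand: str  # "+" or "-"
    pos: int     # genomic coordinate of the splice site


def classify_junction(donor: SpliceSite, acceptor: SpliceSite) -> str:
    """Label a donor->acceptor junction by its consistency with genomic order.

    Returns one of:
    - "colinear":       same gene, acceptor downstream of donor (ordinary splicing)
    - "intragenic_ncl": same gene, acceptor upstream of donor (compatible with
                        back-splicing or intragenic trans-splicing)
    - "intergenic_ncl": different genes (compatible with fusion or trans-splicing)
    """
    if donor.gene != acceptor.gene:
        return "intergenic_ncl"
    if donor.strand == "+":
        downstream = acceptor.pos > donor.pos
    else:
        # Minus-strand genes are transcribed toward lower coordinates.
        downstream = acceptor.pos < donor.pos
    return "colinear" if downstream else "intragenic_ncl"


donor = SpliceSite("GENE_A", "chr1", "+", 5_000)
acceptor = SpliceSite("GENE_A", "chr1", "+", 2_000)
print(classify_junction(donor, acceptor))
```

Coordinates alone cannot separate circular from trans-spliced products, or fusions from trans-splicing; as the abstract notes, that distinction relies on additional evidence such as poly(A)- versus nonpoly(A)-selected data.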
Affiliation(s)
- Trees-Juen Chuang
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Chan-Shuo Wu
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Chia-Ying Chen
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Li-Yuan Hung
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Tai-Wei Chiang
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
| | - Min-Yu Yang
- Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
|
24
|
Frenkel-Morgenstern M, Gorohovski A, Vucenovic D, Maestre L, Valencia A. ChiTaRS 2.1--an improved database of the chimeric transcripts and RNA-seq data with novel sense-antisense chimeric RNA transcripts. Nucleic Acids Res 2014; 43:D68-75. [PMID: 25414346 PMCID: PMC4383979 DOI: 10.1093/nar/gku1199] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS 2.1 database of chimeric transcripts and RNA-Seq data (http://chitars.bioinfo.cnio.es/) is the second version of the ChiTaRS database and includes improvements in content and functionality. Chimeras from eight organisms have been collated including novel sense–antisense (SAS) chimeras resulting from the slippage of the sense and anti-sense intragenic regions. The new database version collects more than 29 000 chimeric transcripts and indicates the expression and tissue specificity for 333 entries confirmed by RNA-seq reads mapping the chimeric junction sites. User interface allows for rapid and easy analysis of evolutionary conservation of fusions, literature references and experimental data supporting fusions in different organisms. More than 1428 cancer breakpoints have been automatically collected from public databases and manually verified to identify their correct cross-references, genomic sequences and junction sites. As a result, the ChiTaRS 2.1 collection of chimeras from eight organisms and human cancer breakpoints extends our understanding of the evolution of chimeric transcripts in eukaryotes as well as their functional role in carcinogenic processes.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Alessandro Gorohovski
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Dunja Vucenovic
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Lorena Maestre
- Monoclonal Antibodies Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
| | - Alfonso Valencia
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain.
|
25
|
Kang B, Gu Q, Tian P, Xiao L, Cao H, Yang W. A chimeric transcript containing Psy1 and a potential mRNA is associated with yellow flesh color in tomato accession PI 114490. PLANTA 2014; 240:1011-21. [PMID: 24663441 DOI: 10.1007/s00425-014-2052-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/29/2013] [Accepted: 03/04/2014] [Indexed: 05/19/2023]
Abstract
Carotenoid content is the primary determinant of fruit color that affects nutritional value and appearance in tomato. Phytoene synthase (PSY) is the key regulatory enzyme in the carotenoid biosynthesis pathway. Absent function of PSY1 in tomato fruit results in yellow flesh phenotype. We, here, report that two different transcripts, a wild-type (Psy1) and a chimeric mRNA (Psy1/Unknown), exist in a yellow-fruited tomato accession PI 114490. Psy1/Unknown is generated by joining exons from two different genes, Psy1 and an unknown gene, transcribed using both complementary DNA strands. The Psy1 shows low expression in the fruit of PI 114490, while the expression of Psy1/Unknown in the fruit of PI 114490 shows the same pattern as Psy1 in red fruit. The PSY1/Unknown has a lower function than PSY1 in a bacterial expression system. Coincidence of one single-nucleotide polymorphism (SNP) in the fourth intron and one simple sequence repeat (SSR) with 19 AT repeats in the downstream sequence of Psy1 gene with Psy1/Unknown in a set of yellow-fruited tomato lines indicates that Psy1/Unknown might be caused by the SNP and/or SSR. One possible explanation of these observations is trans-splicing. Severely reduced Psy1 transcript caused by Psy1/Unknown results in low accumulation of carotenoid and yellow flesh in PI 114490.
Affiliation(s)
- Baoshan Kang
- Beijing Key Laboratory of Growth and Developmental Regulation for Protected Vegetable Crops, Department of Vegetable Science, China Agricultural University, No. 2 Yuanmingyuan Xilu, Beijing, 100193, China
|
26
|
Lloréns-Rico V, Serrano L, Lluch-Senar M. Assessing the hodgepodge of non-mapped reads in bacterial transcriptomes: real or artifactual RNA chimeras? BMC Genomics 2014; 15:633. [PMID: 25070459 PMCID: PMC4122791 DOI: 10.1186/1471-2164-15-633] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2014] [Accepted: 07/17/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: RNA sequencing methods have already altered our view of the extent and complexity of bacterial and eukaryotic transcriptomes, revealing rare transcript isoforms (circular RNAs, RNA chimeras) that could play an important role in their biology. RESULTS: We performed an analysis of chimera formation by four different computational approaches, including a custom designed pipeline, to study the transcriptomes of M. pneumoniae and P. aeruginosa, as well as mixtures of both. We found that rare transcript isoforms detected by conventional pipelines of analysis could be artifacts of the experimental procedure used in the library preparation, and that they are protocol-dependent. CONCLUSION: By using a customized pipeline we show that an optimal library preparation protocol and an appropriate analysis pipeline are crucial for identifying real chimeric RNAs.
Affiliation(s)
- Luis Serrano
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003 Barcelona, Spain
|
27
|
Yu CY, Liu HJ, Hung LY, Kuo HC, Chuang TJ. Is an observed non-co-linear RNA product spliced in trans, in cis or just in vitro? Nucleic Acids Res 2014; 42:9410-23. [PMID: 25053845 PMCID: PMC4132752 DOI: 10.1093/nar/gku643] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Abstract
Global transcriptome investigations often result in the detection of an enormous number of transcripts composed of non-co-linear sequence fragments. Such ‘aberrant’ transcript products may arise from post-transcriptional events or genetic rearrangements, or may otherwise be false positives (sequencing/alignment errors or in vitro artifacts). Moreover, post-transcriptionally non-co-linear (‘PtNcl’) transcripts can arise from trans-splicing or back-splicing in cis (to generate so-called ‘circular RNA’). Here, we collected previously-predicted human non-co-linear RNA candidates, and designed a validation procedure integrating in silico filters with multiple experimental validation steps to examine their authenticity. We showed that >50% of the tested candidates were in vitro artifacts, even though some had been previously validated by RT-PCR. After excluding the possibility of genetic rearrangements, we distinguished between trans-spliced and circular RNAs, and confirmed that these two splicing forms can share the same non-co-linear junction. Importantly, the experimentally-confirmed PtNcl RNA events and their corresponding PtNcl splicing types (i.e. trans-splicing, circular RNA, or both sharing the same junction) were all expressed in rhesus macaque, and some were even expressed in mouse. Our study thus describes an essential procedure for confirming PtNcl transcripts, and provides further insight into the evolutionary role of PtNcl RNA events, opening up this important, but understudied, class of post-transcriptional events for comprehensive characterization.
Affiliation(s)
- Chun-Ying Yu
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Hsiao-Jung Liu
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Li-Yuan Hung
  - Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
- Hung-Chih Kuo
  - Institute of Cellular and Organismic Biology, Academia Sinica, Taipei 11529, Taiwan
- Trees-Juen Chuang
  - Division of Physical and Computational Genomics, Genomics Research Center, Academia Sinica, Taipei 11529, Taiwan
|
28
|
Dasgupta S, Basu G. Evolutionary insights about bacterial GlxRS from whole genome analyses: is GluRS2 a chimera? BMC Evol Biol 2014; 14:26. [PMID: 24521160 PMCID: PMC3927822 DOI: 10.1186/1471-2148-14-26] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/16/2013] [Accepted: 02/07/2014] [Indexed: 12/21/2022]
Abstract
Background: Evolutionary histories of glutamyl-tRNA synthetase (GluRS) and glutaminyl-tRNA synthetase (GlnRS) in bacteria are convoluted. After the divergence of eubacteria and eukarya, bacterial GluRS glutamylated both tRNA(Gln) and tRNA(Glu) until either GlnRS appeared by horizontal gene transfer (HGT) from eukaryotes or a duplicate copy of GluRS (GluRS2) that only glutamylates tRNA(Gln) appeared. The current understanding is based on limited sequence data and is not always compatible with available experimental results. In particular, the origin of GluRS2 is poorly understood. Results: A large database of bacterial GluRS, GlnRS, tRNA(Gln) and the trimeric aminoacyl-tRNA-dependent amidotransferase (gatCAB), constructed from whole genomes by functionally annotating and classifying these enzymes according to their mutual presence and absence in the genome, was analyzed. Phylogenetic analyses showed that the catalytic and the anticodon-binding domains of functional GluRS2 (as in Helicobacter pylori) were independently acquired from evolutionarily distant hosts by HGT. Non-functional GluRS2 (as in Thermotoga maritima), on the other hand, was found to contain an anticodon-binding domain appended to a gene-duplicated catalytic domain. Several genomes were found to possess both GluRS2 and GlnRS, even though they share the common function of aminoacylating tRNA(Gln). GlnRS was widely distributed among bacterial phyla and, although phylogenetic analyses confirmed the origin of most bacterial GlnRS to be through a single HGT from eukarya, many GlnRS sequences also appeared with evolutionarily distant phyla in the phylogenetic tree. A GlnRS pseudogene could be identified in Sorangium cellulosum. Conclusions: Our analysis broadens the current understanding of bacterial GlxRS evolution and highlights the idiosyncratic evolution of GluRS2. Specifically, we show that: i) GluRS2 is a chimera of mismatching catalytic and anticodon-binding domains; ii) GlnRS and GluRS2 can appear in a single bacterial genome, indicating that the evolutionary histories of the two enzymes are distinct; iii) GlnRS is more widespread in bacteria than is believed; iv) bacterial GlnRS appeared both by HGT from eukarya and by intra-bacterial HGT; v) the presence of a GlnRS pseudogene shows that many bacteria could not retain the newly acquired eukaryal GlnRS. The functional annotation of GluRS, without recourse to experiments, performed in this work, demonstrates the inherent and unique advantages of using whole genome over isolated sequence databases.
Affiliation(s)
- Gautam Basu
  - Department of Biophysics, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700054, India.
|
29
|
Hoffmann S, Otto C, Doose G, Tanzer A, Langenberger D, Christ S, Kunz M, Holdt LM, Teupser D, Hackermüller J, Stadler PF. A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection. Genome Biol 2014; 15:R34. [PMID: 24512684 PMCID: PMC4056463 DOI: 10.1186/gb-2014-15-2-r34] [Citation(s) in RCA: 198] [Impact Index Per Article: 18.0] [Received: 04/22/2013] [Accepted: 02/10/2014] [Indexed: 11/25/2022]
Abstract
Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain of up to 40% of recall, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).
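To illustrate the distinction the abstract draws between conventional, circular and trans-spliced or fusion products, here is a minimal sketch that labels a read split into two aligned segments. It is not the segemehl algorithm: the `Segment` type, the `classify_split` function and the label names are hypothetical, and the rule (same chromosome and strand with segments in reversed order suggests a back-splice) is a common simplification that ignores multi-junction reads and genomic rearrangements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a split read (hypothetical structure)."""
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive genomic coordinate
    end: int     # 0-based, exclusive genomic coordinate


def classify_split(first: Segment, second: Segment) -> str:
    """Label a two-segment split read; segments are given in read order."""
    if first.chrom != second.chrom:
        # Candidate trans-splicing or gene fusion.
        return "distant"
    if first.strand != second.strand:
        # Candidate inversion or trans-splicing between strands.
        return "strand_switch"
    # Co-linear: the second segment lies downstream in transcript direction.
    if first.strand == "+":
        colinear = second.start >= first.end
    else:
        colinear = second.end <= first.start
    # Reversed order on the same strand is a candidate back-splice (circular RNA).
    return "colinear" if colinear else "reversed"


if __name__ == "__main__":
    a = Segment("chr1", "+", 100, 150)
    b = Segment("chr1", "+", 500, 550)
    print(classify_split(a, b))  # segments in genomic order
    print(classify_split(b, a))  # segments in reversed order
```

A "reversed" or "distant" label is only a candidate call; as other entries in this list note, many such junctions turn out to be in vitro artifacts and need experimental validation.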
Affiliation(s)
- Steve Hoffmann
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Christian Otto
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Gero Doose
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Andrea Tanzer
  - Department of Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, Austria
- David Langenberger
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
- Sabina Christ
  - RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology – IZI, Perlickstrasse 1, Leipzig, Germany
- Manfred Kunz
  - Department of Dermatology, Venerology and Allergology, Leipzig University, Philipp-Rosenthal-Strasse 23, Leipzig, Germany
- Lesca M Holdt
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Institute of Laboratory Medicine, Ludwig Maximilian University, Marchioninistrasse 15, Munich, Germany
- Daniel Teupser
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Institute of Laboratory Medicine, Ludwig Maximilian University, Marchioninistrasse 15, Munich, Germany
- Jörg Hackermüller
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology – IZI, Perlickstrasse 1, Leipzig, Germany
  - Young Investigators Group Bioinformatics and Transcriptomics, Department of Proteomics, Helmholtz Centre for Environmental Research – UFZ, Permoserstrasse 15, Leipzig, Germany
- Peter F Stadler
  - Junior Research Group Transcriptome Bioinformatics, Leipzig University, Haertelstrasse 16-18, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics and Bioinformatics Group, University Leipzig, Haertelstrasse 16-18, Leipzig, Germany
  - LIFE Research Center for Civilization Diseases, Leipzig University
  - Department of Theoretical Chemistry, University of Vienna, Währinger Strasse 17, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, Germany
  - Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, Frederiksberg, Denmark
  - Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM, USA
|
30
|
Nitsche A, Doose G, Tafer H, Robinson M, Saha NR, Gerdol M, Canapa A, Hoffmann S, Amemiya CT, Stadler PF. Atypical RNAs in the coelacanth transcriptome. J Exp Zool B Mol Dev Evol 2013; 322:342-51. [PMID: 24174405 DOI: 10.1002/jez.b.22542] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 04/24/2013] [Revised: 07/22/2013] [Accepted: 08/16/2013] [Indexed: 01/15/2023]
Abstract
Circular and apparently trans-spliced RNAs have recently been reported as abundant types of transcripts in mammalian transcriptome data. Both types of non-colinear RNAs are also abundant in RNA-seq data from different tissues of both the African and the Indonesian coelacanth. We observe more than 8,000 lincRNAs with normal gene structure and several thousand circularized and trans-spliced products, showing that such atypical RNAs make a substantial contribution to the transcriptome. Surprisingly, the majority of the circularizing and trans-connecting splice junctions are unique to atypical forms, that is, they are not used in normal isoforms.
Affiliation(s)
- Anne Nitsche
  - Department of Computer Science, Bioinformatics Group, University of Leipzig, Leipzig, Germany; Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
|
31
|
Shugay M, Ortiz de Mendíbil I, Vizmanos JL, Novo FJ. Oncofuse: a computational framework for the prediction of the oncogenic potential of gene fusions. Bioinformatics 2013; 29:2539-46. [PMID: 23956304 DOI: 10.1093/bioinformatics/btt445] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Indexed: 01/10/2023]
Abstract
Motivation: Gene fusions resulting from chromosomal aberrations are an important cause of cancer. The complexity of genomic changes in certain cancer types has hampered the identification of gene fusions by molecular cytogenetic methods, especially in carcinomas. This is changing with the advent of next-generation sequencing, which is detecting a substantial number of new fusion transcripts in individual cancer genomes. However, this poses the challenge of identifying those fusions with greater oncogenic potential amid a background of 'passenger' fusion sequences. Results: In the present work, we have used some recently identified genomic hallmarks of oncogenic fusion genes to develop a pipeline for the classification of fusion sequences, namely, Oncofuse. The pipeline predicts the oncogenic potential of novel fusion genes, calculating the probability that a fusion sequence behaves as a 'driver' of the oncogenic process based on features present in known oncogenic fusions. Cross-validation and extensive validation tests on independent datasets suggest a robust behavior with good precision and recall rates. We believe that Oncofuse could become a useful tool to guide experimental validation studies of novel fusion sequences found during next-generation sequencing analysis of cancer transcriptomes. Availability and implementation: Oncofuse is a naive Bayes Network Classifier trained and tested using the Weka machine learning package. The pipeline is executed by running a Java/Groovy script, available for download at www.unav.es/genetica/oncofuse.html.
Affiliation(s)
- Mikhail Shugay
  - Department of Genetics, University of Navarra. 31008 Pamplona, Spain
|
32
|
Mass T, Drake J, Haramaty L, Kim J, Zelzion E, Bhattacharya D, Falkowski P. Cloning and Characterization of Four Novel Coral Acid-Rich Proteins that Precipitate Carbonates In Vitro. Curr Biol 2013; 23:1126-31. [DOI: 10.1016/j.cub.2013.05.007] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.1] [Received: 03/19/2013] [Revised: 04/17/2013] [Accepted: 05/07/2013] [Indexed: 01/09/2023]
|
33
|
Wu CC, Kannan K, Lin S, Yen L, Milosavljevic A. Identification of cancer fusion drivers using network fusion centrality. Bioinformatics 2013; 29:1174-81. [PMID: 23505294 DOI: 10.1093/bioinformatics/btt131] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023]
Abstract
Summary: Gene fusions are being discovered at an increasing rate using massively parallel sequencing technologies. Prioritization of cancer fusion drivers for validation cannot be performed using traditional single-gene-based methods because fusions involve portions of two partner genes. To address this problem, we propose a novel network analysis method called fusion centrality that is specifically tailored for prioritizing gene fusions. We first propose a domain-based fusion model built on the theory of exon/domain shuffling. The model leads to a hypothesis that a fusion is more likely to be an oncogenic driver if its partner genes act like hubs in a network because the fusion mutation can deregulate normal functions of many other genes and their pathways. The hypothesis is supported by the observation that for most known cancer fusion genes, at least one of the fusion partners appears to be a hub in a network, and for many fusions both partners appear to be hubs. Based on this model, we construct fusion centrality, a multi-gene-based network metric, and use it to score fusion drivers. We show that fusion centrality outperforms other single-gene-based methods. Specifically, the method successfully predicts most of 38 newly discovered fusions that had validated oncogenic importance. To the best of our knowledge, this is the first network-based approach for identifying fusion drivers. Availability: Matlab code implementing the fusion centrality method is available upon request from the corresponding authors.
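The hub hypothesis in this abstract can be illustrated with a rough sketch that ranks fusions by how connected their partner genes are in an interaction network. This is not the authors' fusion centrality metric, whose definition is not given here: the code uses plain degree centrality as a stand-in, and the function names, the toy gene labels and the additive score are all assumptions made for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree, neighbors / (n - 1), for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def hub_score(fusion, centrality):
    """Assumed score: sum of both partners' centralities (0 if a gene is absent)."""
    five_prime, three_prime = fusion
    return centrality.get(five_prime, 0.0) + centrality.get(three_prime, 0.0)


def rank_fusions(fusions, edges):
    """Sort fusions from most to least hub-like under the assumed score."""
    centrality = degree_centrality(edges)
    return sorted(fusions, key=lambda f: hub_score(f, centrality), reverse=True)


if __name__ == "__main__":
    # Toy network with placeholder gene names, not real interaction data.
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "D")]
    toy_fusions = [("E", "X"), ("A", "B"), ("E", "D")]
    for fusion in rank_fusions(toy_fusions, toy_edges):
        print(fusion)
```

Summing the two partner scores is the simplest way to let both genes contribute, in line with the abstract's point that single-gene methods miss the second partner; a real analysis would need a curated network and a validated metric.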
Affiliation(s)
- Chia-Chin Wu
  - Department of Genomic Medicine, UT MD Anderson Cancer Center, Houston, TX 77030, USA.
|
34
|
Li S, Heermann DW. Using chimaeric expression sequence tag as the reference to identify three-dimensional chromosome contacts. DNA Res 2012; 20:45-53. [PMID: 23213109 PMCID: PMC3576657 DOI: 10.1093/dnares/dss032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Transcription-induced chimaeric transcripts, the potential post-transcriptional processing products, might reflect the spatial proximity of actively transcribed genes co-localized in transcription factories. A growing volume of expression data deposited in databases provides us with the raw material for screening such chimaeric transcripts and using them as probes to identify interactions between genes in cis or in trans. Based on the high-quality chimaeric transcripts gleaned from human expression sequence tag data with selection criteria, we identified the patterns of inter- and intrachromosomal gene–gene interactions. On top of the contact pattern from interchromosomal interactions, we also observed an exponential behaviour of the intrachromosomal interactions within a certain length scale, which is consistent with the independent experimental results from Hi-C screening and with the Random Loop Model. A compatible result is found for mouse. Transcription-induced chimaeric transcripts, most of which might be accidental products with trivial functions, shed light on the spatial organization of chromosomes. These inter- and intrachromosomal interactions might contribute to the compaction of chromosomes, their segregation and formation of the chromosome territories, and their spatial distribution within the nucleus.
Affiliation(s)
- Songling Li
  - Theoretical Biophysics Group, Institute for Theoretical Physics, University of Heidelberg, Heidelberg, Germany
|
35
|
Frenkel-Morgenstern M, Gorohovski A, Lacroix V, Rogers M, Ibanez K, Boullosa C, Andres Leon E, Ben-Hur A, Valencia A. ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Nucleic Acids Res 2012; 41:D142-51. [PMID: 23143107 PMCID: PMC3531201 DOI: 10.1093/nar/gks1041] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Indexed: 11/23/2022]
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS database of Chimeric Transcripts and RNA-Sequencing data (http://chitars.bioinfo.cnio.es/) collects more than 16 000 chimeric RNAs from humans, mice and fruit flies, 233 chimeras confirmed by RNA-seq reads and ∼2000 cancer breakpoints. The database indicates the expression and tissue specificity of these chimeras, as confirmed by RNA-seq data, and it includes mass spectrometry results for some human entries at their junctions. Moreover, the database has advanced features to analyze junction consistency and to rank chimeras based on the evidence of repeated junction sites. Finally, ‘Junction Search’ screens through the RNA-seq reads found at the chimeras’ junction sites to identify putative junctions in novel sequences entered by users. Thus, ChiTaRS is an extensive catalog of human, mouse and fruit fly chimeras that will extend our understanding of the evolution of chimeric transcripts in eukaryotes and can be advantageous in the analysis of human cancer breakpoints.
Affiliation(s)
- Milana Frenkel-Morgenstern
  - Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
|