1
|
Sorokin M, Rabushko E, Rozenberg JM, Mohammad T, Seryakov A, Sekacheva M, Buzdin A. Clinically relevant fusion oncogenes: detection and practical implications. Ther Adv Med Oncol 2022; 14:17588359221144108. [PMID: 36601633 PMCID: PMC9806411 DOI: 10.1177/17588359221144108] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/09/2022] [Accepted: 11/22/2022] [Indexed: 12/28/2022]
Abstract
Mechanistically, chimeric genes result from DNA rearrangements and include parts of preexisting normal genes combined at the genomic junction site. Some rearranged genes encode pathological proteins with altered molecular functions. Those that can aberrantly promote carcinogenesis are called fusion oncogenes. Their formation is not a rare event in human cancers, and many of them have been documented in numerous study reports and in specific databases. They may have various molecular peculiarities, such as increased stability of an oncogenic part, self-activation of the receptor tyrosine kinase moiety, and altered transcriptional regulation activities. Currently, tens of low-molecular-mass inhibitors are approved as cancer drugs targeting receptor tyrosine kinase (RTK) oncogenic fusion proteins, including those with ALK, ABL, EGFR, FGFR1-3, NTRK1-3, MET, RET, and ROS1 moieties. In these cases, the presence of the respective RTK fusion in the cancer genome is the diagnostic biomarker for drug prescription. However, identification of such fusion oncogenes is challenging because the breakpoint may arise at multiple sites within the gene, and the exact fusion partner is generally unknown. There is no gold standard method for RTK fusion detection, and many alternative experimental techniques are currently employed to solve this issue. Among them, RNA-seq-based methods offer the advantage of unbiased high-throughput analysis of only transcribed RTK fusion genes, and of simultaneously finding both fusion partners in a single RNA-seq read. Here we focus on current knowledge of the biology and clinical aspects of RTK fusion genes, related databases, and laboratory detection methods.
Affiliation(s)
- Elizaveta Rabushko
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia; I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Tharaa Mohammad
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
- Marina Sekacheva
- I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Anton Buzdin
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia; I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; PathoBiology Group, European Organization for Research and Treatment of Cancer (EORTC), Brussels, Belgium
|
2
|
Mirzaei G. GraphChrom: A Novel Graph-Based Framework for Cancer Classification Using Chromosomal Rearrangement Endpoints. Cancers (Basel) 2022; 14:cancers14133060. [PMID: 35804833 PMCID: PMC9265123 DOI: 10.3390/cancers14133060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/18/2022] [Revised: 06/06/2022] [Accepted: 06/18/2022] [Indexed: 11/16/2022]
Abstract
Chromosomal rearrangements are generally a consequence of improperly repaired double-strand breaks in DNA. These genomic aberrations can be a driver of cancers. Here, we investigated the use of chromosomal rearrangements for classification of cancer tumors and the effect of inter- and intrachromosomal rearrangements in cancer classification. We used data from the Catalogue of Somatic Mutations in Cancer (COSMIC) for breast, pancreatic, and prostate cancers, for which the COSMIC dataset reports the highest number of chromosomal aberrations. We developed a framework known as GraphChrom for cancer classification. GraphChrom was developed using a graph neural network which models the complex structure of chromosomal aberrations (CA) and provides local connectivity between the aberrations. The proposed framework makes three important contributions to the field of cancer research. Firstly, it successfully classifies cancer types and subtypes. Secondly, it evolved into a novel data extraction technique which can be used to extract more informative graphs (informative aberrations associated with a sample). Thirdly, it predicts that interCAs (rearrangements between two or more chromosomes) are more effective in cancer prediction than intraCAs (rearrangements within the same chromosome), although intraCAs are three times more likely to occur than interCAs.
Affiliation(s)
- Golrokh Mirzaei
- Department of Computer Science and Engineering, Ohio State University, Marion, OH 403302, USA
|
3
|
Marie L, Symington LS. Mechanism for inverted-repeat recombination induced by a replication fork barrier. Nat Commun 2022; 13:32. [PMID: 35013185 PMCID: PMC8748988 DOI: 10.1038/s41467-021-27443-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/12/2021] [Accepted: 11/22/2021] [Indexed: 01/11/2023]
Abstract
Replication stress and abundant repetitive sequences have emerged as primary conditions underlying genomic instability in eukaryotes. To gain insight into the mechanism of recombination between repeated sequences in the context of replication stress, we used a prokaryotic Tus/Ter barrier designed to induce transient replication fork stalling near inverted repeats in the budding yeast genome. Our study reveals that the replication fork block stimulates a unique recombination pathway dependent on Rad51 strand invasion and Rad52-Rad59 strand annealing activities, Mph1/Rad5 fork remodelers, Mre11/Exo1/Dna2 resection machineries, Rad1-Rad10 nuclease and DNA polymerase δ. Furthermore, we show recombination at stalled replication forks is limited by the Srs2 helicase and Mus81-Mms4/Yen1 nucleases. Physical analysis of the replication-associated recombinants revealed that half are associated with an inversion of sequence between the repeats. Based on our extensive genetic characterization, we propose a model for recombination of closely linked repeats that can robustly generate chromosome rearrangements. Here the authors use a prokaryotic Tus/Ter barrier designed to induce transient replication fork stalling near inverted repeats in the budding yeast genome to support a model for recombination of closely linked repeats at stalled replication forks.
Affiliation(s)
- Léa Marie
- Department of Microbiology & Immunology, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Lorraine S Symington
- Department of Microbiology & Immunology, Columbia University Irving Medical Center, New York, NY, 10032, USA; Department of Genetics & Development, Columbia University Irving Medical Center, New York, NY, 10032, USA
|
4
|
Chen W, Cui W, Qiu Y, Cui D. Research Progress of Chimeric RNA and Health. Health (London) 2021. [DOI: 10.4236/health.2021.134036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
5
|
Human transcription factor and protein kinase gene fusions in human cancer. Sci Rep 2020; 10:14169. [PMID: 32843691 PMCID: PMC7447636 DOI: 10.1038/s41598-020-71040-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/21/2020] [Accepted: 07/30/2020] [Indexed: 11/26/2022]
Abstract
Oncogenic gene fusions are estimated to account for up to 20% of cancer morbidity. Recently, sequence-level studies have established oncofusions throughout all tissue types. However, the functional implications of the identified oncofusions have often not been investigated. In this study, oncofusions identified by a fusion detection approach (DEEPEST) were analyzed in detail. Of the 28,863 oncofusions, we found almost 30% are expected to produce functional proteins with features from both parent genes. Kinases and transcription factors were the main gene families of the protein-producing fusions. Considering their role as initiators, actors, and termination points of cellular signaling pathways, we focused our in-depth analyses on them. Domain architecture of the fusions and their wild-type interactors suggests that the abnormal molecular context of protein domains caused by fusion events may unlock the oncogenic potential of the wild-type counterparts of the fusion proteins. To understand overall oncofusion effects, we performed differential expression analysis using TCGA cancer project samples. Results indicated oncofusion-specific alterations in gene expression levels, and lower expression levels of components of key cellular pathways, in particular signal transduction and transcription regulation. Taken together, the results suggest that kinase and transcription factor oncofusions deregulate cellular signaling, possibly via acquiring novel functions.
|
6
|
Ramezankhani R, Minaei N, Haddadi M, Torabi S, Hesaraki M, Mirzaei H, Vosough M, Verfaillie CM. Gene editing technology for improving life quality: A dream coming true? Clin Genet 2020; 99:67-83. [PMID: 32506418 DOI: 10.1111/cge.13794] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/11/2020] [Revised: 06/02/2020] [Accepted: 06/03/2020] [Indexed: 12/13/2022]
Abstract
The fact that monogenic diseases are related to mutations in one specific gene makes gene correction one of the promising future strategies to treat genetic diseases or alleviate their symptoms. From this perspective, and along with recent advances in technology, genome editing tools have gained momentum and developed fast. In fact, clustered regularly interspaced short palindromic repeats-associated protein 9 (CRISPR/Cas9), transcription activator-like effector nucleases (TALENs), and zinc-finger nucleases (ZFNs) are regarded as novel technologies which are able to correct a number of genetic aberrations in vitro and in vivo. The number of ongoing clinical trials employing these tools has increased, reflecting their encouraging outcomes. However, there are still some major challenges with respect to their safety profile and directed delivery. In this paper, we provide updated information regarding the history, nature, methods of delivery, and application of the above-mentioned gene editing tools along with the meganucleases (an older similar tool) based on published in vitro and in vivo studies, and introduce clinical trials which employed these technologies.
Affiliation(s)
- Roya Ramezankhani
- Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran; Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran; Department of Development and Regeneration, KU Leuven Stem Cell Institute, Leuven, Belgium
- Neda Minaei
- Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran; Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran
- Mahnaz Haddadi
- Department of Embryology, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Shukoofeh Torabi
- Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran; Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran
- Mahdi Hesaraki
- Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran
- Hamed Mirzaei
- Research Center for Biochemistry and Nutrition in Metabolic Diseases, Kashan University of Medical Sciences, Kashan, Iran
- Massoud Vosough
- Department of Stem Cells and Developmental Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran; Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, Academic Center for Education, Culture and Research (ACECR), Tehran, Iran
- Catherine M Verfaillie
- Department of Development and Regeneration, Stem Cell Institute, KU Leuven, Leuven, Belgium
|
7
|
Abstract
Chimeric RNAs are hybrid transcripts containing exons from two separate genes. Chimeric RNAs are traditionally considered to be transcribed from fusion genes caused by chromosomal rearrangement. These canonical chimeric RNAs are well characterized as being expressed in a cancer-unique pattern and/or acting as oncogene products. However, owing to the development of advanced deep sequencing technologies, novel types of non-canonical chimeric RNAs have been discovered that are generated from intergenic splicing without genomic aberrations. They can be formed through trans-splicing or cis-splicing between adjacent genes (cis-SAGe) mechanisms. Non-canonical chimeric RNAs are widely detected in normal physiology, although several have been shown to have a cancer-specific expression pattern. Further studies have indicated that some of them play fundamental roles in controlling cell growth and motility, and may have functions independent of the parental genes. These discoveries are unveiling a new layer of the functional transcriptome and are also raising the possibility of utilizing non-canonical chimeric RNAs as cancer diagnostic markers and therapeutic targets. In this chapter, we will overview different categories of chimeric RNAs and their expression in various types of cancerous and normal samples. Acknowledging that chimeric RNAs are not unique to cancer, we will discuss both bioinformatic and biological methods to identify credible cancer-specific chimeric RNAs. Furthermore, we will describe downstream methods to explore their molecular processing mechanisms and potential functions. A better understanding of the biogenesis mechanisms and functional products of cancer-specific chimeric RNAs will pave the way for the development of novel cancer biomarkers and therapeutic targets.
Affiliation(s)
- Xinrui Shi
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA, United States
- Sandeep Singh
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, United States
- Emily Lin
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, United States
- Hui Li
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA, United States; Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, United States
|
8
|
Balamurali D, Gorohovski A, Detroja R, Palande V, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS 5.0: the comprehensive database of chimeric transcripts matched with druggable fusions and 3D chromatin maps. Nucleic Acids Res 2020; 48:D825-D834. [PMID: 31747015 PMCID: PMC7145514 DOI: 10.1093/nar/gkz1025] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/15/2019] [Revised: 10/18/2019] [Accepted: 10/26/2019] [Indexed: 12/11/2022]
Abstract
Chimeric RNA transcripts are formed when exons from two genes fuse together, often due to chromosomal translocations, transcriptional errors or trans-splicing effect. While these chimeric RNAs produce functional proteins only in certain cases, they play a significant role in disease phenotyping and progression. ChiTaRS 5.0 (http://chitars.md.biu.ac.il/) is the latest and most comprehensive chimeric transcript repository, with 111 582 annotated entries from eight species, including 23 167 known human cancer breakpoints. The database includes unique information correlating chimeric breakpoints with 3D chromatin contact maps, generated from public datasets of chromosome conformation capture techniques (Hi-C). In this update, we have added curated information on druggable fusion targets matched with chimeric breakpoints, which are applicable to precision medicine in cancers. The introduction of a new section that lists chimeric RNAs in various cell-lines is another salient feature. Finally, using text-mining techniques, novel chimeras in Alzheimer's disease, schizophrenia, dyslexia and other diseases were collected in ChiTaRS. Thus, this improved version is an extensive catalogue of chimeras from multiple species. It extends our understanding of the evolution of chimeric transcripts in eukaryotes and contributes to the analysis of 3D genome conformational changes and the functional role of chimeras in the etiopathogenesis of cancers and other complex diseases.
Affiliation(s)
- Deepak Balamurali
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Alessandro Gorohovski
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Rajesh Detroja
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Vikrant Palande
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Dorith Raviv-Shay
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
- Milana Frenkel-Morgenstern
- Laboratory of Cancer Genomics and Biocomputing of Complex Diseases, The Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
|
9
|
Frenkel-Morgenstern M. Identification of Chimeric RNAs Using RNA-Seq Reads and Protein-Protein Interactions of Translated Chimeras. Methods Mol Biol 2020; 2079:27-40. [PMID: 31728960 DOI: 10.1007/978-1-4939-9904-0_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Chimeric RNA moieties typically consist of exons from two genes expressed from different genomic locations and produced by chromosomal translocations, trans-splicing or transcription errors. Recent advances in next-generation sequencing procedures have opened new horizons for identification of novel chimeric transcripts in various diseases in a personalized manner. Here we describe the detailed computational procedures to identify chimeric transcripts using RNA-seq reads. Moreover, we elaborate on the domain-domain co-occurrence method to detect alterations in chimeric protein-protein interaction (ChiPPI) networks produced by chimeric RNA that are translated to chimeric proteins.
|
10
|
Barresi V, Cosentini I, Scuderi C, Napoli S, Di Bella V, Spampinato G, Condorelli DF. Fusion Transcripts of Adjacent Genes: New Insights into the World of Human Complex Transcripts in Cancer. Int J Mol Sci 2019; 20:ijms20215252. [PMID: 31652751 PMCID: PMC6862657 DOI: 10.3390/ijms20215252] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/18/2019] [Revised: 10/18/2019] [Accepted: 10/20/2019] [Indexed: 12/12/2022]
Abstract
The awareness of genome complexity brought a radical approach to the study of the transcriptome, opening eyes to single RNAs generated from two or more adjacent genes according to the present consensus. This kind of transcript was thought to originate only from chromosomal rearrangements, but the discovery of readthrough transcription opened the doors to a new world of fusion RNAs. In recent years, many possible intergenic cis-splicing mechanisms have been proposed, unveiling the origins of transcripts that contain some exons of both the upstream and downstream genes. In some cases, alternative mechanisms, such as trans-splicing and transcriptional slippage, have been proposed. Five databases containing validated and predicted Fusion Transcripts of Adjacent Genes (FuTAGs) are available to the scientific community. A comparative analysis revealed that two of them contain the majority of the results. A complete analysis of the more widely characterized FuTAGs is provided in this review, including their expression pattern in normal tissues and in cancer. Gene structure, intergenic splicing patterns and exon junction sequences have been determined and are reported here for well-characterized FuTAGs. The available functional data and the possible roles in cancer progression are discussed.
Affiliation(s)
- Vincenza Barresi
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Ilaria Cosentini
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Chiara Scuderi
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Salvatore Napoli
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Virginia Di Bella
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Giorgia Spampinato
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
- Daniele Filippo Condorelli
- Department of Biomedical and Biotechnological Sciences, Section of Medical Biochemistry, University of Catania, 95123 Catania, Italy.
|
11
|
ProtFus: A Comprehensive Method Characterizing Protein-Protein Interactions of Fusion Proteins. PLoS Comput Biol 2019; 15:e1007239. [PMID: 31437145 PMCID: PMC6705771 DOI: 10.1371/journal.pcbi.1007239] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/02/2018] [Accepted: 07/03/2019] [Indexed: 01/10/2023]
Abstract
Tailored therapy aims to cure cancer patients effectively and safely, based on the complex interactions between patients' genomic features, disease pathology and drug metabolism. Thus, the continual increase in scientific literature drives the need for efficient methods of data mining to improve the extraction of useful information from texts based on patients' genomic features. An important application of text mining to tailored therapy in cancer encompasses the use of mutations and cancer fusion genes as moieties that change patients' cellular networks to develop cancer, and also affect drug metabolism. Fusion proteins, which are derived from the slippage of two parental genes, are produced in cancer by chromosomal aberrations and trans-splicing. Given that the two parental proteins for predicted fusion proteins are known, we used our previously developed method for identifying chimeric protein-protein interactions (ChiPPIs) associated with the fusion proteins. Here, we present a validation approach that receives fusion proteins of interest, predicts their cellular network alterations by ChiPPI and validates them by our new method, ProtFus, using an online literature search. This process resulted in a set of 358 fusion proteins and their corresponding protein interactions, as a training set for a Naïve Bayes classifier, to identify predicted fusion proteins that have reliable evidence in the literature and that were confirmed experimentally. Next, for a test group of 1817 fusion proteins, we were able to identify from the literature 2908 PPIs in total, across 18 cancer types. The described method, ProtFus, can be used for screening the literature to identify unique cases of fusion proteins and their PPIs, as means of studying alterations of protein networks in cancers. Availability: http://protfus.md.biu.ac.il/.
|
12
|
Tang Y, Ma S, Wang X, Xing Q, Huang T, Liu H, Li Q, Zhang Y, Zhang K, Yao M, Yang GL, Li H, Zang X, Yang B, Guan F. Identification of chimeric RNAs in human infant brains and their implications in neural differentiation. Int J Biochem Cell Biol 2019; 111:19-26. [DOI: 10.1016/j.biocel.2019.03.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/31/2018] [Revised: 03/06/2019] [Accepted: 03/30/2019] [Indexed: 02/07/2023]
|
13
|
Kumar R, Nagpal G, Kumar V, Usmani SS, Agrawal P, Raghava GPS. HumCFS: a database of fragile sites in human chromosomes. BMC Genomics 2019; 19:985. [PMID: 30999860 PMCID: PMC7402404 DOI: 10.1186/s12864-018-5330-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 05/24/2018] [Accepted: 11/29/2018] [Indexed: 11/25/2022]
Abstract
Background: Fragile sites are chromosomal regions that are susceptible to breakage, and their frequency varies among the human population. Based on the frequency of fragile site induction, they are categorized as common and rare fragile sites. Common fragile sites are sensitive to replication stress and often rearranged in cancer. Rare fragile sites are the archetypal trinucleotide repeats. Fragile sites are known to be involved in chromosomal rearrangements in tumors. Human miRNA genes are also present at fragile sites. A better understanding of genes and miRNAs lying in the fragile site regions and their association with disease progression is required. Result: HumCFS is a manually curated database of human chromosomal fragile sites. HumCFS provides useful information on fragile sites such as coordinates on the chromosome, cytoband, their chemical inducers and frequency of fragile site (rare or common), and genes and miRNAs lying in fragile sites. Protein-coding genes in the fragile sites were identified by mapping the coordinates of fragile sites to the human genome in Ensembl (GRCh38/hg38). Genes present in fragile sites were further mapped to the DisGeNET database to understand their possible link with human diseases. Human miRNAs from miRBase were also mapped to fragile site coordinates. In brief, HumCFS provides useful information about 125 human chromosomal fragile sites and their association with 4921 human protein-coding genes and 917 human miRNAs. Conclusion: The user-friendly web interface of HumCFS and hyperlinking with other resources will help researchers to search for genes and miRNAs efficiently and to explore the relationships among them. For easy data retrieval and analysis, we have integrated standard web-based tools, such as JBrowse and BLAST. Users can also download the data in various file formats, such as text, GFF3 and BED files, which can be used in the UCSC browser.
Database URL: http://webs.iiitd.edu.in/raghava/humcfs/
Affiliation(s)
- Rajesh Kumar
- Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, 110020, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
- Gandharva Nagpal
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
- Vinod Kumar
- Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, 110020, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
- Salman Sadullah Usmani
- Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, 110020, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
- Piyush Agrawal
- Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, 110020, India; Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh, 160036, India
- Gajendra P S Raghava
- Center for Computational Biology, Indraprastha Institute of Information Technology, New Delhi, 110020, India
|
14
|
Abstract
Chromosomal rearrangements, including translocations, are early and essential events in the formation of many tumors. Previous studies that defined the genetic requirements for rearrangement formation have identified differences between murine and human cells, most notably in the role of classic and alternative nonhomologous end-joining (NHEJ) factors. We reported that poly(ADP)ribose polymerase 3 (PARP3) promotes chromosomal rearrangements induced by endonucleases in multiple human cell types. We show here that in contrast to classic (c-NHEJ) factors, Parp3 also promotes rearrangements in murine cells, including translocations in murine embryonic stem cells (mESCs), class-switch recombination in primary B cells, and inversions in tail fibroblasts that generate Eml4-Alk fusions. In mESCs, Parp3-deficient cells had shorter deletion lengths at translocation junctions. This was corroborated using next-generation sequencing of Eml4-Alk junctions in tail fibroblasts and is consistent with a role for Parp3 in promoting the processing of DNA double-strand breaks. We confirmed a previous report that Parp1 also promotes rearrangement formation. In contrast with Parp3, rearrangement junctions in the absence of Parp1 had longer deletion lengths, suggesting that Parp1 may suppress double-strand break processing. Together, these data indicate that Parp3 and Parp1 promote rearrangements with distinct phenotypes.
|
15
|
Claussin C, Porubský D, Spierings DCJ, Halsema N, Rentas S, Guryev V, Lansdorp PM, Chang M. Genome-wide mapping of sister chromatid exchange events in single yeast cells using Strand-seq. eLife 2017; 6:e30560. [PMID: 29231811 PMCID: PMC5734873 DOI: 10.7554/elife.30560] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 07/19/2017] [Accepted: 12/08/2017] [Indexed: 01/09/2023]
Abstract
Homologous recombination involving sister chromatids is the most accurate, and thus most frequently used, form of recombination-mediated DNA repair. Despite its importance, sister chromatid recombination is not easily studied because it does not result in a change in DNA sequence, making recombination between sister chromatids difficult to detect. We have previously developed a novel DNA template strand sequencing technique, called Strand-seq, that can be used to map sister chromatid exchange (SCE) events genome-wide in single cells. An increase in the rate of SCE is an indicator of elevated recombination activity and of genome instability, which is a hallmark of cancer. In this study, we have adapted Strand-seq to detect SCE in the yeast Saccharomyces cerevisiae. We provide the first quantifiable evidence that most spontaneous SCE events in wild-type cells are not due to the repair of DNA double-strand breaks.
Affiliation(s)
- Clémence Claussin
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- David Porubský
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Diana CJ Spierings
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Nancy Halsema
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Victor Guryev
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Peter M Lansdorp
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, Canada
- Michael Chang
  - European Research Institute for the Biology of Ageing, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
16
Li Z, Qin F, Li H. Chimeric RNAs and their implications in cancer. Curr Opin Genet Dev 2017; 48:36-43. [PMID: 29100211 DOI: 10.1016/j.gde.2017.10.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 08/12/2017] [Revised: 09/06/2017] [Accepted: 10/02/2017] [Indexed: 11/26/2022]
Abstract
Chimeric RNAs were long believed to be produced solely by gene fusions resulting from chromosomal rearrangement, and thus to be unique features of cancer. Detected chimeric RNAs have also been viewed as surrogates for the presence of gene fusions. However, a growing body of research has demonstrated that chimeric RNAs in general are not a hallmark of cancer, but rather are widely present in non-cancerous cells and tissues. They may also be produced by mechanisms other than chromosomal rearrangement. The field of non-canonical chimeric RNAs is still in its infancy, with many challenges ahead, including the lack of a unified terminology. However, we believe that these non-canonical chimeric RNAs will have significant impacts in cancer detection and treatment.
Affiliation(s)
- Zi Li
  - Department of Pathology, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Orthopedics, The Second Xiangya Hospital, Central South University, Changsha 410011, China
- Fujun Qin
  - Department of Pathology, University of Virginia, Charlottesville, VA 22908, USA
- Hui Li
  - Department of Pathology, University of Virginia, Charlottesville, VA 22908, USA
17
Menschaert G, Fenyö D. Proteogenomics from a bioinformatics angle: A growing field. Mass Spectrom Rev 2017; 36:584-599. [PMID: 26670565 PMCID: PMC6101030 DOI: 10.1002/mas.21483] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 04/20/2015] [Accepted: 09/01/2015] [Indexed: 05/16/2023]
Abstract
Proteogenomics is a research area that combines proteomics and genomics in a multi-omics setup using both mass spectrometry and high-throughput sequencing technologies. Currently, the main goals of the field are to aid genome annotation or to unravel the proteome complexity. Mass spectrometry based identifications of matching or homologous peptides can further refine gene models. The identification of novel proteoforms is also made possible based on detection of novel translation initiation sites (cognate or near-cognate), novel transcript isoforms, sequence variation or novel (small) open reading frames in intergenic or un-translated genic regions by analyzing high-throughput sequencing data from RNAseq or ribosome profiling experiments. Other proteogenomics studies using a combination of proteomics and genomics techniques focus on antibody sequencing, the identification of immunogenic peptides or venom peptides. Over the years, a growing number of bioinformatics tools and databases became available to help streamline these cross-omics studies. Some of these solutions only help in specific steps of the proteogenomics studies, e.g. building custom sequence databases (based on next generation sequencing output) for mass spectrometry fragmentation spectrum matching. Over the last few years a handful of integrative tools also became available that can execute complete proteogenomics analyses. Some of these are presented as stand-alone solutions, whereas others are implemented in a web-based framework such as Galaxy. In this review we aimed at sketching a comprehensive overview of all the bioinformatics solutions that are available for this growing research area. © 2015 Wiley Periodicals, Inc. Mass Spec Rev 36:584-599, 2017.
Affiliation(s)
- Gerben Menschaert
  - Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
  - To whom correspondence should be addressed. Tel: +32 9 264 99 22; Fax: +32 9 264 6220
- David Fenyö
  - Center for Health Informatics and Bioinformatics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York, USA
18
Day TA, Layer JV, Cleary JP, Guha S, Stevenson KE, Tivey T, Kim S, Schinzel AC, Izzo F, Doench J, Root DE, Hahn WC, Price BD, Weinstock DM. PARP3 is a promoter of chromosomal rearrangements and limits G4 DNA. Nat Commun 2017; 8:15110. [PMID: 28447610 PMCID: PMC5414184 DOI: 10.1038/ncomms15110] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 08/09/2016] [Accepted: 02/28/2017] [Indexed: 12/24/2022] Open
Abstract
Chromosomal rearrangements are essential events in the pathogenesis of both malignant and nonmalignant disorders, yet the factors affecting their formation are incompletely understood. Here we develop a zinc-finger nuclease translocation reporter and screen for factors that modulate rearrangements in human cells. We identify UBC9 and RAD50 as suppressors and 53BP1, DDB1 and poly(ADP)ribose polymerase 3 (PARP3) as promoters of chromosomal rearrangements across human cell types. We focus on PARP3 as it is dispensable for murine viability and has druggable catalytic activity. We find that PARP3 regulates G quadruplex (G4) DNA in response to DNA damage, which suppresses repair by nonhomologous end-joining and homologous recombination. Chemical stabilization of G4 DNA in PARP3-/- cells leads to widespread DNA double-strand breaks and synthetic lethality. We propose a model in which PARP3 suppresses G4 DNA and facilitates DNA repair by multiple pathways.
Affiliation(s)
- Tovah A. Day
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Jacob V. Layer
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- J. Patrick Cleary
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Srijoy Guha
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Kristen E. Stevenson
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Trevor Tivey
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Sunhee Kim
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- Anna C. Schinzel
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
- Francesca Izzo
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
- John Doench
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
- David E. Root
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
- William C. Hahn
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
- Brendan D. Price
  - Department of Radiation Oncology, Dana-Farber Cancer Institute, Harvard Medical School, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
- David M. Weinstock
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA
  - Genetic Perturbation Platform, Broad Institute of MIT and Harvard University, 415 Main Street, Cambridge, Massachusetts 02142, USA
19
Gorohovski A, Tagore S, Palande V, Malka A, Raviv-Shay D, Frenkel-Morgenstern M. ChiTaRS-3.1-the enhanced chimeric transcripts and RNA-seq database matched with protein-protein interactions. Nucleic Acids Res 2016; 45:D790-D795. [PMID: 27899596 PMCID: PMC5210585 DOI: 10.1093/nar/gkw1127] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 10/02/2016] [Revised: 10/26/2016] [Accepted: 10/30/2016] [Indexed: 12/17/2022] Open
Abstract
Discovery of chimeric RNAs, which are produced by chromosomal translocations as well as the joining of exons from different genes by trans-splicing, has added a new level of complexity to our study and understanding of the transcriptome. The enhanced ChiTaRS-3.1 database (http://chitars.md.biu.ac.il) is designed to make widely accessible a wealth of mined data on chimeric RNAs, with easy-to-use analytical tools built-in. The database comprises 34 922 chimeric transcripts along with 11 714 cancer breakpoints. In this latest version, we have included multiple cross-references to GeneCards, iHop, PubMed, NCBI, Ensembl, OMIM, RefSeq and the Mitelman collection for every entry in the ‘Full Collection’. In addition, for every chimera, we have added a predicted chimeric protein–protein interaction (ChiPPI) network, which allows for easy visualization of protein partners of both parental and fusion proteins for all human chimeras. The database contains a comprehensive annotation for 34 922 chimeric transcripts from eight organisms, and includes the manual annotation of 200 sense-antiSense (SaS) chimeras. The current improvements in the content and functionality to the ChiTaRS database make it a central resource for the study of chimeric transcripts and fusion proteins.
Affiliation(s)
- Alessandro Gorohovski
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Somnath Tagore
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Vikrant Palande
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Assaf Malka
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Dorith Raviv-Shay
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel
- Milana Frenkel-Morgenstern
  - Faculty of Medicine in Galilee, Bar-Ilan University, Henrietta Szold 8, Safed 13195, Israel. Corresponding author:
20
Lee M, Lee K, Yu N, Jang I, Choi I, Kim P, Jang YE, Kim B, Kim S, Lee B, Kang J, Lee S. ChimerDB 3.0: an enhanced database for fusion genes from cancer transcriptome and literature data mining. Nucleic Acids Res 2016; 45:D784-D789. [PMID: 27899563 PMCID: PMC5210563 DOI: 10.1093/nar/gkw1083] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 09/15/2016] [Revised: 10/24/2016] [Accepted: 10/27/2016] [Indexed: 11/17/2022] Open
Abstract
Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules: The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB is a manually curated knowledgebase of 1066 fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive the fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/.
Affiliation(s)
- Myunggyo Lee
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Kyubum Lee
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Namhee Yu
  - Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Insu Jang
  - Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Ikjung Choi
  - Ewha Research Center for Systems Biology, Ewha Womans University, Seoul 03760, Republic of Korea
- Pora Kim
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Ye Eun Jang
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Byounggun Kim
  - Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, Republic of Korea
- Sunkyu Kim
  - Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, Republic of Korea
- Byungwook Lee
  - Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Jaewoo Kang
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
  - Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, Republic of Korea
- Sanghyuk Lee
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
  - Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
  - Ewha Research Center for Systems Biology, Ewha Womans University, Seoul 03760, Republic of Korea
21
Li R, Liu Y, Li T, Li C. 3Disease Browser: A Web server for integrating 3D genome and disease-associated chromosome rearrangement data. Sci Rep 2016; 6:34651. [PMID: 27734896 PMCID: PMC5062081 DOI: 10.1038/srep34651] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 05/04/2016] [Accepted: 09/16/2016] [Indexed: 11/23/2022] Open
Abstract
Chromosomal rearrangement (CR) events have been implicated in many tumor and non-tumor human diseases. CR events lead to their associated diseases by disrupting gene and protein structures. Also, they can lead to diseases through changes in chromosomal 3D structure and gene expression. In this study, we search for CR-associated diseases potentially caused by chromosomal 3D structure alteration by integrating Hi-C and ChIP-seq data. Our algorithm rediscovers experimentally verified disease-associated CRs (polydactyly diseases) that alter gene expression by disrupting chromosome 3D structure. Interestingly, we find that intellectual disability may be a candidate disease caused by 3D chromosome structure alteration. We also develop a Web server (3Disease Browser, http://3dgb.cbi.pku.edu.cn/disease/) for integrating and visualizing disease-associated CR events and chromosomal 3D structure.
Affiliation(s)
- Ruifeng Li
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, China
- Yifang Liu
  - School of Life Sciences, Tsinghua University, Beijing, China
- Tingting Li
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, China
- Cheng Li
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, China
  - Center for Statistical Science; Center for Bioinformatics, Peking University, Beijing, China
22
Zhao J, Li X, Yao Q, Li M, Zhang J, Ai B, Liu W, Wang Q, Feng C, Liu Y, Bai X, Song C, Li S, Li E, Xu L, Li C. RWCFusion: identifying phenotype-specific cancer driver gene fusions based on fusion pair random walk scoring method. Oncotarget 2016; 7:61054-61068. [PMID: 27506935 PMCID: PMC5308635 DOI: 10.18632/oncotarget.11064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/26/2016] [Accepted: 07/19/2016] [Indexed: 02/05/2023] Open
Abstract
While gene fusions have been increasingly detected in human cancers by methods based on next-generation sequencing (NGS) technologies, these methods have limitations in identifying driver fusions. In addition, the existing methods for identifying driver gene fusions ignore the specificity among different cancers or consider only local rather than global topological features in networks. Here, we propose a novel network-based method, called RWCFusion, to identify phenotype-specific cancer driver gene fusions. To evaluate its performance, we used leave-one-out cross-validation in 35 cancers and achieved a high AUC of 0.925 across all cancers and an average AUC of 0.929 for single cancers. Furthermore, we classified the 35 cancers into two classes, haematological and solid, of which the haematological class achieved a high AUC of up to 0.968. Finally, we applied RWCFusion to breast cancer and found that the top 13 gene fusions, such as BCAS3-BCAS4, NOTCH-NUP214, MED13-BCAS3 and CARM-SMARCA4, had previously been shown to be drivers of breast cancer. Additionally, we inferred 8 of the top 10 remaining candidate gene fusions, such as SULF2-ZNF217, MED1-ACSF2, and ACACA-STAC2, to be potential driver gene fusions in breast cancer.
Affiliation(s)
- Jianmei Zhao
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, 515041, China
- Xuecang Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Qianlan Yao
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Meng Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Jian Zhang
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Bo Ai
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Wei Liu
  - Department of Mathematics, Heilongjiang Institute of Technology, Harbin, 150050, China
- Qiuyu Wang
  - School of Nursing and Pharmacology, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Chenchen Feng
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Yuejuan Liu
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Xuefeng Bai
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Chao Song
  - School of Nursing and Pharmacology, Daqing Campus, Harbin Medical University, Daqing, 163319, China
- Shang Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Enmin Li
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, 515041, China
- Liyan Xu
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, 515041, China
- Chunquan Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163319, China
  - The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, 515041, China
23
Latysheva NS, Babu MM. Discovering and understanding oncogenic gene fusions through data intensive computational approaches. Nucleic Acids Res 2016; 44:4487-503. [PMID: 27105842 PMCID: PMC4889949 DOI: 10.1093/nar/gkw282] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 01/12/2016] [Accepted: 03/24/2016] [Indexed: 12/21/2022] Open
Abstract
Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases; and (iii) clinical research that seeks either to therapeutically target fusion transcripts and proteins or to leverage advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different (yet highly complementary and symbiotic) approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation.
Affiliation(s)
- Natasha S Latysheva
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
- M Madan Babu
  - MRC Laboratory of Molecular Biology, Francis Crick Ave, Cambridge CB2 0QH, United Kingdom
24
Korla PK, Cheng J, Huang CH, Tsai JJP, Liu YH, Kurubanjerdjit N, Hsieh WT, Chen HY, Ng KL. FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events. Database (Oxford) 2015; 2015:bav086. [PMID: 26384373 PMCID: PMC4684693 DOI: 10.1093/database/bav086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/29/2015] [Accepted: 08/18/2015] [Indexed: 01/08/2023]
Abstract
Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain–domain interactions, protein–protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information is also included, for example manually collected exact locations of the first and second breakpoints, and sequences and karyotypes of fusion genes. FARE-CAFE will substantially facilitate the cancer biologist’s mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop ‘novel’ therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE
Affiliation(s)
- Praveen Kumar Korla
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
- Jack Cheng
  - Graduate Institute of Integrated Medicine, College of Chinese Medicine, China Medical University, Taichung 40402, Taiwan
- Chien-Hung Huang
  - Department of Computer Science and Information Engineering, National Formosa University, Yunlin 632, Taiwan
- Jeffrey J P Tsai
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
- Yu-Hsuan Liu
  - Department of Computer Science and Information Engineering, National Formosa University, Yunlin 632, Taiwan
- Wen-Tsong Hsieh
  - Department of Pharmacology, China Medical University, Taichung 40402, Taiwan
- Huey-Yi Chen
  - Department of Obstetrics and Gynecology, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
- Ka-Lok Ng
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
25
Poot M, Haaf T. Mechanisms of Origin, Phenotypic Effects and Diagnostic Implications of Complex Chromosome Rearrangements. Mol Syndromol 2015; 6:110-34. [PMID: 26732513 DOI: 10.1159/000438812] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Accepted: 06/23/2015] [Indexed: 01/08/2023] Open
Abstract
Complex chromosome rearrangements (CCRs) are currently defined as structural genome variations that involve more than 2 chromosome breaks and result in exchanges of chromosomal segments. They are thought to be extremely rare, but their detection rate is rising because of improvements in molecular cytogenetic technology. Their population frequency is also underestimated, since many CCRs may not elicit a phenotypic effect. CCRs may be the result of fork stalling and template switching, microhomology-mediated break-induced repair, breakage-fusion-bridge cycles, or chromothripsis. Patients with chromosomal instability syndromes show elevated rates of CCRs due to impaired DNA double-strand break responses during meiosis. Therefore, the putative functions of the proteins encoded by ATM, BLM, WRN, ATR, MRE11, NBS1, and RAD51 in preventing CCRs are discussed. CCRs may exert a pathogenic effect by either (1) gene dosage-dependent mechanisms, e.g. haploinsufficiency, (2) mechanisms based on disruption of the genomic architecture, such that genes, parts of genes or regulatory elements are truncated, fused or relocated and thus their interactions disturbed - these mechanisms will predominantly affect gene expression - or (3) mixed mutation mechanisms in which a CCR on one chromosome is combined with a different type of mutation on the other chromosome. Such inferred mechanisms of pathogenicity need corroboration by mRNA sequencing. Also, future studies with in vitro models, such as inducible pluripotent stem cells from patients with CCRs, and transgenic model organisms should substantiate current inferences regarding putative pathogenic effects of CCRs. The ramifications of the growing body of information on CCRs for clinical and experimental genetics and future treatment modalities are briefly illustrated with 2 cases, one of which suggests KDM4C (JMJD2C) as a novel candidate gene for mental retardation.
Affiliation(s)
- Martin Poot
  - Department of Human Genetics, University of Würzburg, Würzburg, Germany
- Thomas Haaf
  - Department of Human Genetics, University of Würzburg, Würzburg, Germany
26
Frenkel-Morgenstern M, Gorohovski A, Vucenovic D, Maestre L, Valencia A. ChiTaRS 2.1--an improved database of the chimeric transcripts and RNA-seq data with novel sense-antisense chimeric RNA transcripts. Nucleic Acids Res 2014; 43:D68-75. [PMID: 25414346 PMCID: PMC4383979 DOI: 10.1093/nar/gku1199] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/20/2022] Open
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS 2.1 database of chimeric transcripts and RNA-Seq data (http://chitars.bioinfo.cnio.es/) is the second version of the ChiTaRS database and includes improvements in content and functionality. Chimeras from eight organisms have been collated, including novel sense–antisense (SAS) chimeras resulting from the slippage of the sense and anti-sense intragenic regions. The new database version collects more than 29 000 chimeric transcripts and indicates the expression and tissue specificity for 333 entries confirmed by RNA-seq reads mapping the chimeric junction sites. The user interface allows for rapid and easy analysis of evolutionary conservation of fusions, literature references and experimental data supporting fusions in different organisms. More than 1428 cancer breakpoints have been automatically collected from public databases and manually verified to identify their correct cross-references, genomic sequences and junction sites. As a result, the ChiTaRS 2.1 collection of chimeras from eight organisms and human cancer breakpoints extends our understanding of the evolution of chimeric transcripts in eukaryotes as well as their functional role in carcinogenic processes.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Alessandro Gorohovski
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Dunja Vucenovic
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Lorena Maestre
- Monoclonal Antibodies Unit, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Alfonso Valencia
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
27
Xu C, Zhang J, Wang YP, Deng HW, Li J. Characterization of human chromosomal material exchange with regard to the chromosome translocations using next-generation sequencing data. Genome Biol Evol 2014; 6:3015-24. [PMID: 25349267 PMCID: PMC4255766 DOI: 10.1093/gbe/evu234] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/24/2022]
Abstract
As an important subtype of structural variations, chromosomal translocation is associated with various diseases, especially cancers, by disrupting gene structures and functions. Traditional methods for identifying translocations are time consuming and have limited resolution. Recently, a few studies have employed next-generation sequencing (NGS) technology to characterize chromosomal translocations in the human genome, obtaining high-throughput results at high resolution. However, these studies have mainly focused on mechanism-specific or site-specific translocation mapping. In this study, we conducted a comprehensive genome-wide characterization of human chromosomal material exchange with regard to chromosome translocations. Using NGS data of 1,481 subjects from the 1000 Genomes Project, we identified 15,349,092 translocated DNA fragment pairs, ranging from 65 to 1,886 bp and with an average size of approximately 102 bp. On average, each individual genome carried about 10,364 pairs, covering approximately 0.069% of the genome. We identified 16 translocation hot regions, among which two regions did not contain repetitive fragments. Results of our study overlapped with a majority of previous results, covering approximately 79% of the roughly 2,340 translocations characterized in three available translocation databases. In addition, our study identified five novel potential recurrent chromosomal material exchange regions with detection rates greater than 20%. Our results will be helpful for an accurate characterization of translocations in human genomes, and contribute as a resource for future studies of the roles of translocations in human disease etiology and mechanisms.
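The per-genome average quoted in this abstract follows directly from the two totals it reports. A minimal arithmetic check, using only the figures stated above (the variable names are illustrative):

```python
# Figures quoted in the abstract above.
total_pairs = 15_349_092  # translocated DNA fragment pairs across all subjects
subjects = 1_481          # subjects from the 1000 Genomes Project

# Mean number of translocated fragment pairs carried by one genome.
pairs_per_genome = total_pairs / subjects
print(round(pairs_per_genome))  # 10364
```

The result matches the "about 10,364 pairs" per individual genome reported by the authors.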
Affiliation(s)
- Chao Xu
- Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics, School of Public Health and Tropical Medicine, Tulane University
- Jigang Zhang
- Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics, School of Public Health and Tropical Medicine, Tulane University
- Yu-Ping Wang
- Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics, School of Public Health and Tropical Medicine, Tulane University; Department of Biomedical Engineering, School of Science and Engineering, Tulane University
- Hong-Wen Deng
- Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics, School of Public Health and Tropical Medicine, Tulane University; Third Affiliated Hospital, China Southern Medical University, Guang Zhou, 510000, P. R. China
- Jian Li
- Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics, School of Public Health and Tropical Medicine, Tulane University
28
Wijaya E, Shimizu K, Asai K, Hamada M. Reference-free prediction of rearrangement breakpoint reads. Bioinformatics 2014; 30:2559-67. [PMID: 24876376 DOI: 10.1093/bioinformatics/btu360] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
Motivation: Chromosome rearrangement events are triggered by atypical breaking and rejoining of DNA molecules and are observed in many cancer-related diseases. Rearrangements are typically detected using short reads generated by next-generation sequencing (NGS), combined with knowledge of a reference genome. Because structural variations and genomes differ from one person to another, intermediate comparison via a reference genome may lead to loss of information. Results: In this article, we propose a reference-free method for detecting clusters of breakpoints from chromosomal rearrangements. This is done by directly comparing a set of NGS normal reads with another set that may be rearranged. Our method, SlideSort-BPR (breakpoint reads), is based on a fast algorithm for all-against-all comparisons of short reads and theoretical analyses of the number of neighboring reads. When applied to a dataset with a sequencing depth of 100×, it finds ∼88% of the breakpoints correctly with no false-positive reads. Moreover, evaluation on a real prostate cancer dataset shows that the proposed method predicts more fusion transcripts correctly than previous approaches, and yet produces fewer false-positive reads. To our knowledge, this is the first method to detect breakpoint reads without using a reference genome. Availability and implementation: The source code of SlideSort-BPR can be freely downloaded from https://code.google.com/p/slidesort-bpr/.
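The idea of comparing a possibly rearranged read set directly against a normal read set, without a reference genome, can be illustrated with a toy sketch. This is not the SlideSort-BPR algorithm: it swaps in a brute-force Hamming-distance scan for the fast all-against-all comparison, and both the neighbour criterion and the function names are assumptions made for illustration only.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def candidate_breakpoint_reads(
    case_reads: List[str],
    normal_reads: List[str],
    max_mismatches: int = 1,
) -> List[str]:
    """Return case reads that have no neighbour in the normal read set.

    Assumption for this sketch: a neighbour is a normal read of the same
    length that differs by at most `max_mismatches` substitutions.
    """
    candidates = []
    for read in case_reads:
        has_neighbour = any(
            len(read) == len(normal) and hamming(read, normal) <= max_mismatches
            for normal in normal_reads
        )
        if not has_neighbour:
            candidates.append(read)
    return candidates


# Hypothetical reads: the third case read joins the start of one normal
# read to the end of another, so it has no close neighbour.
normal = ["ACGTACGT", "TTGACCAA", "GGCATCGA"]
case = ["ACGTACGA", "TTGACCAA", "ACGTCCAA"]
print(candidate_breakpoint_reads(case, normal))  # ['ACGTCCAA']
```

The scan is quadratic in the number of reads, so it only suits tiny examples. At realistic sequencing depth a dedicated all-against-all method, as the abstract describes, would be needed.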
Affiliation(s)
- Edward Wijaya
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562 and Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Kana Shimizu
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562 and Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Kiyoshi Asai
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562 and Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
- Michiaki Hamada
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562 and Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo Shinjuku-ku, Tokyo 169-8555, Japan
29
Cheng L, Wang G, Li J, Zhang T, Xu P, Wang Y. SIDD: a semantically integrated database towards a global view of human disease. PLoS One 2013; 8:e75504. [PMID: 24146757 PMCID: PMC3795748 DOI: 10.1371/journal.pone.0075504] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 04/21/2013] [Accepted: 08/15/2013] [Indexed: 01/08/2023]
Abstract
Background: A number of databases have been developed to collect disease-related molecular, phenotypic and environmental features (DR-MPEs), such as genes, non-coding RNAs, genetic variations, drugs, phenotypes and environmental factors. However, each current database focuses on only one or two DR-MPEs. There is an urgent demand for an integrated database that can establish semantic associations among disease-related databases and link them to provide a global view of human disease at the biological level. This database, once developed, will help researchers query various DR-MPEs through disease, and investigate disease mechanisms from different types of data. Methodology: To establish an integrated disease-associated database, disease vocabularies used in different databases are mapped to Disease Ontology (DO) through semantic matching. In total, 4,284 and 4,186 disease terms from Medical Subject Headings (MeSH) and Online Mendelian Inheritance in Man (OMIM), respectively, are mapped to DO. Then, the relationships between DR-MPEs and diseases are extracted and merged from different source databases to reduce data redundancy. Conclusions: A semantically integrated disease-associated database (SIDD) is developed, which integrates 18 disease-associated databases, for researchers to browse multiple types of DR-MPEs in a single view. A web interface allows easy navigation for querying information through browsing a disease ontology tree or searching a disease term. Furthermore, a network visualization tool using the Cytoscape Web plugin has been implemented in SIDD, which makes it easier to view the relationships between diseases and DR-MPEs. The current version of SIDD (Jul 2013) documents 4,465,131 entries relating to 139,365 DR-MPEs and 3,824 human diseases. The database can be freely accessed from: http://mlg.hit.edu.cn/SIDD.
Affiliation(s)
- Liang Cheng
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Guohua Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Jie Li
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Tianjiao Zhang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Peigang Xu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Yadong Wang
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
30
Halper-Stromberg E, Steranka J, Giraldo-Castillo N, Fuller T, Desiderio S, Burns KH. Fine mapping of V(D)J recombinase mediated rearrangements in human lymphoid malignancies. BMC Genomics 2013; 14:565. [PMID: 23957733 PMCID: PMC3846541 DOI: 10.1186/1471-2164-14-565] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/09/2013] [Accepted: 08/06/2013] [Indexed: 12/03/2022]
Abstract
Background: Lymphocytes achieve diversity in antigen recognition in part by rearranging genomic DNA at loci encoding antibodies and cell surface receptors. The process, termed V(D)J recombination, juxtaposes modular coding sequences for antigen binding. Erroneous recombination events causing chromosomal translocations are recognized causes of lymphoid malignancies. Here we show that a hybridization-based method for sequence enrichment can be used to efficiently and selectively capture genomic DNA adjacent to V(D)J recombination breakpoints for massively parallel sequencing. The approach obviates the need for PCR amplification of recombined sequences. Results: Using tailored informatics analyses to resolve alignment and assembly issues in these repetitive regions, we were able to detect numerous recombination events across a panel of cancer cell lines and primary lymphoid tumors, and an EBV-transformed lymphoblast line. With reassembly, breakpoints could be defined to single base pair resolution. The observed events consist of canonical V(D)J or V-J rearrangements, non-canonical rearrangements, and putatively oncogenic reciprocal chromosome translocations. We validated non-canonical and chromosome translocation junctions by PCR and Sanger sequencing. The translocations involved the MYC and BCL-2 loci, and activation of these was consistent with histopathologic features of the respective B-cell tumors. We also show an impressive prevalence of novel erroneous V-V recombination events at sites not incorporated with other downstream coding segments. Conclusions: Our results demonstrate the ability of next-generation sequencing to describe human V(D)J recombinase activity and provide a scalable means to chronicle off-target, unexpressed, and non-amplifiable recombinations occurring in the development of lymphoid cancers.
Affiliation(s)
- Eitan Halper-Stromberg
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.
31
Ludwig D, Carter J, Smith JR, Borsani G, Barlati S, Hafizi S. Functional characterisation of human cells harbouring a novel t(2p;7p) translocation involving TNS3 and EXOC6B genes. BMC Med Genet 2013; 14:65. [PMID: 23809228 PMCID: PMC3728010 DOI: 10.1186/1471-2350-14-65] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/18/2012] [Accepted: 06/24/2013] [Indexed: 01/08/2023]
Abstract
Background: Tensin3 is an intracellular cytoskeleton-regulating protein, the loss of which is associated with increased cell motility, as has been observed in some human cancers. A novel chromosomal translocation, t(2;7)(p13;p12), present in a patient with a complex syndromic phenotype, directly involves the Tensin3 (TNS3) and EXOC6B genes. This translocation could impair the expression of Tensin3 and ExoC6B proteins, and potentially produce two novel fusion transcripts. In the present study, we have investigated the expression and phenotypic features of these potential products in cultured cells from the proband. Methods: Skin fibroblasts isolated from the proband as well as an age-matched control were grown in cell culture. Cells were used for quantitative RT-PCR, western blot and immunofluorescent confocal microscopy, which determined Tensin3 gene and protein expression. Phase-contrast and confocal microscopy additionally revealed cellular phenotype differences. A scratch wound assay monitored by live cell imaging measured cellular migration rates. Results: Tensin3 mRNA and protein levels were both lower in proband cells than in control fibroblasts. Proband cells displayed broader and shorter morphologies than control fibroblasts, and immunofluorescent staining revealed additional Tensin3 expression along cytoskeletal filaments and the cell periphery only in control fibroblasts. In addition, proband fibroblasts showed a significantly higher migration rate than control cells over 24 h. Conclusions: The phenotypic changes observed in proband cells may arise from TNS3 haploinsufficiency, causing partial loss of full-length Tensin3 protein. These results further expose a role for Tensin3 in cytoskeletal organisation and cell motility and may also help to explain the syndromic features observed in the patient.
32
Zhou T, Hu Z, Zhou Z, Guo X, Sha J. Genome-wide analysis of human hotspot intersected genes highlights the roles of meiotic recombination in evolution and disease. BMC Genomics 2013; 14:67. [PMID: 23368819 PMCID: PMC3620679 DOI: 10.1186/1471-2164-14-67] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 09/20/2012] [Accepted: 01/29/2013] [Indexed: 11/21/2022]
Abstract
Background: Meiotic recombination events are not randomly located, but rather cluster at hotspot regions. Recently, the fine-scale mapping of genome-wide human recombination hotspots was performed. Here, we systematically analyzed the evolutionary and disease-associated features of hotspots that overlapped with protein-coding genes. Results: In this study, we defined hotspot intersected genes as HI genes. We found that HI genes were prone to be located in the extracellular part and were functionally enriched in cell-to-cell communication. Tissue-specific genes and genes encoding secreted proteins were overrepresented in HI genes, while housekeeping genes were underrepresented. Compared to slowly evolving housekeeping genes and random genes with lower recombination rates, HI genes evolved faster. The fact that brain- and blood-specific genes were overrepresented in HI genes indicates that they may be involved in the evolution of human intelligence and the immune system. We also found that genes related to disease were enriched in HI genes, especially genes with disease-associated chromosomal rearrangements. Hotspot sequence motifs were overrepresented in common sequences of HI genes and genes with disease-associated chromosomal rearrangements. We further listed repeat elements that were enriched both in hotspots and genes with disease-associated chromosomal rearrangements. Conclusion: HI genes are evolving and may be involved in the generation of key features of humans during evolution. Disease-associated genes may be by-products of meiotic recombination. In addition, hotspot sequence motifs and repeat elements showed the connection between meiotic recombination and genes with disease-associated chromosomal rearrangements at the sequence level. Our study will enable us to better understand the evolutionary and biological significance of human meiotic recombination.
Affiliation(s)
- Tao Zhou
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, 140 Hanzhong Road, Nanjing, Jiangsu Province 210029, People's Republic of China
33
Frenkel-Morgenstern M, Valencia A. Novel domain combinations in proteins encoded by chimeric transcripts. Bioinformatics 2013; 28:i67-74. [PMID: 22689780 PMCID: PMC3371848 DOI: 10.1093/bioinformatics/bts216] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 12/18/2022]
Abstract
Motivation: Chimeric RNA transcripts are generated by different mechanisms including pre-mRNA trans-splicing, chromosomal translocations and/or gene fusions. It was shown recently that at least some chimeric transcripts can be translated into functional chimeric proteins. Results: To gain a better understanding of the design principles underlying chimeric proteins, we have analyzed 7,424 chimeric RNAs from humans. We focused on the specific domains present in these proteins, comparing their permutations with those of known human proteins. Our method uses genomic alignments of the chimeras, identification of the gene–gene junction sites and prediction of the protein domains. We found that chimeras contain complete protein domains significantly more often than in random data sets. Specifically, we show that eight different types of domains are over-represented among all chimeras as well as in those chimeras confirmed by RNA-seq experiments. Moreover, we discovered that some chimeras potentially encode proteins with novel and unique domain combinations. Given the observed prevalence of entire protein domains in chimeras, we predict that certain putative chimeras that lack activation domains may actively compete with their parental proteins, thereby exerting dominant negative effects. More generally, the production of chimeric transcripts enables a combinatorial increase in the number of protein products available, which may disturb the function of parental genes and influence their protein–protein interaction network. Availability: Our scripts are available upon request. Contact: avalencia@cnio.es Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
34
Hussin J, Sinnett D, Casals F, Idaghdour Y, Bruat V, Saillour V, Healy J, Grenier JC, de Malliard T, Busche S, Spinella JF, Larivière M, Gibson G, Andersson A, Holmfeldt L, Ma J, Wei L, Zhang J, Andelfinger G, Downing JR, Mullighan CG, Awadalla P. Rare allelic forms of PRDM9 associated with childhood leukemogenesis. Genome Res 2012; 23:419-30. [PMID: 23222848 PMCID: PMC3589531 DOI: 10.1101/gr.144188.112] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 01/22/2023]
Abstract
One of the most rapidly evolving genes in humans, PRDM9, is a key determinant of the distribution of meiotic recombination events. Mutations in this meiotic-specific gene have previously been associated with male infertility in humans and recent studies suggest that PRDM9 may be involved in pathological genomic rearrangements. In studying genomes from families with children affected by B-cell precursor acute lymphoblastic leukemia (B-ALL), we characterized meiotic recombination patterns within a family with two siblings having hyperdiploid childhood B-ALL and observed unusual localization of maternal recombination events. The mother of the family carries a rare PRDM9 allele, potentially explaining the unusual patterns found. From exomes sequenced in 44 additional parents of children affected with B-ALL, we discovered a substantial and significant excess of rare allelic forms of PRDM9. The rare PRDM9 alleles are transmitted to the affected children in half the cases; nonetheless there remains a significant excess of rare alleles among patients relative to controls. We successfully replicated this latter observation in an independent cohort of 50 children with B-ALL, where we found an excess of rare PRDM9 alleles in aneuploid and infant B-ALL patients. PRDM9 variability in humans is thought to influence genomic instability, and these data support a potential role for PRDM9 variation in risk of acquiring aneuploidies or genomic rearrangements associated with childhood leukemogenesis.
Affiliation(s)
- Julie Hussin
- Department of Biochemistry, Faculty of Medicine, University of Montreal, Montreal H3C 3J7, Canada
35
Frenkel-Morgenstern M, Gorohovski A, Lacroix V, Rogers M, Ibanez K, Boullosa C, Andres Leon E, Ben-Hur A, Valencia A. ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Nucleic Acids Res 2012; 41:D142-51. [PMID: 23143107 PMCID: PMC3531201 DOI: 10.1093/nar/gks1041] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 11/23/2022]
Abstract
Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might represent functional proteins and produce different disease phenotypes. The ChiTaRS database of Chimeric Transcripts and RNA-Sequencing data (http://chitars.bioinfo.cnio.es/) collects more than 16 000 chimeric RNAs from humans, mice and fruit flies, 233 chimeras confirmed by RNA-seq reads and ∼2000 cancer breakpoints. The database indicates the expression and tissue specificity of these chimeras, as confirmed by RNA-seq data, and it includes mass spectrometry results for some human entries at their junctions. Moreover, the database has advanced features to analyze junction consistency and to rank chimeras based on the evidence of repeated junction sites. Finally, ‘Junction Search’ screens through the RNA-seq reads found at the chimeras’ junction sites to identify putative junctions in novel sequences entered by users. Thus, ChiTaRS is an extensive catalog of human, mouse and fruit fly chimeras that will extend our understanding of the evolution of chimeric transcripts in eukaryotes and can be advantageous in the analysis of human cancer breakpoints.
Affiliation(s)
- Milana Frenkel-Morgenstern
- Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
36
Frenkel-Morgenstern M, Lacroix V, Ezkurdia I, Levin Y, Gabashvili A, Prilusky J, Del Pozo A, Tress M, Johnson R, Guigo R, Valencia A. Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts. Genome Res 2012; 22:1231-42. [PMID: 22588898 PMCID: PMC3396365 DOI: 10.1101/gr.130062.111] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Indexed: 12/15/2022]
Abstract
Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans.
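The abundance unit used in this abstract, RPKM (Reads Per Kilobase per Million mapped reads), normalizes a read count by both transcript length and sequencing depth. A minimal sketch of the standard formula follows. The function name and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase per Million mapped reads.

    RPKM = read_count / (length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (length_bp * total_mapped_reads)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: 30 reads on a 2,000 bp transcript in a library
# of 25 million mapped reads.
print(rpkm(30, 2_000, 25_000_000))  # 0.6
```

A value of 0.6 RPKM would fall within the 0.06 to 17 RPKM range the authors report for confirmed chimeric RNAs.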