1
Cheng L, Yang C, Lu J, Huang M, Xie R, Lynch S, Elfman J, Huang Y, Liu S, Chen S, He B, Lin T, Li H, Chen X, Huang J. Oncogenic SLC2A11-MIF fusion protein interacts with polypyrimidine tract binding protein 1 to facilitate bladder cancer proliferation and metastasis by regulating mRNA stability. MedComm (Beijing) 2024; 5:e685. [PMID: 39156764 PMCID: PMC11324686 DOI: 10.1002/mco2.685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 07/03/2024] [Accepted: 07/14/2024] [Indexed: 08/20/2024] Open
Abstract
Chimeric RNAs, distinct from DNA gene fusions, have emerged as promising therapeutic targets with diverse functions in cancer treatment. However, the functional significance and therapeutic potential of most chimeric RNAs remain unclear. Here we identify a novel fusion transcript of solute carrier family 2-member 11 (SLC2A11) and macrophage migration inhibitory factor (MIF). In this study, we investigated the upregulation of SLC2A11-MIF in The Cancer Genome Atlas cohort and a cohort of patients from Sun Yat-Sen Memorial Hospital. Subsequently, functional investigations demonstrated that SLC2A11-MIF enhanced the proliferation, antiapoptotic effects, and metastasis of bladder cancer cells in vitro and in vivo. Mechanistically, the fusion protein encoded by SLC2A11-MIF interacted with polypyrimidine tract binding protein 1 (PTBP1) and regulated the mRNA half-lives of Polo Like Kinase 1, Roundabout guidance receptor 1, and phosphoinositide-3-kinase regulatory subunit 3 in BCa cells. Moreover, PTBP1 knockdown abolished the enhanced impact of SLC2A11-MIF on biological function and mRNA stability. Furthermore, the expression of SLC2A11-MIF mRNA is regulated by CCCTC-binding factor and stabilized through RNA N4-acetylcytidine modification facilitated by N-acetyltransferase 10. Overall, our findings revealed a significant fusion protein orchestrated by the SLC2A11-MIF-PTBP1 axis that governs mRNA stability during the multistep progression of bladder cancer.
Affiliation(s)
- Liang Cheng
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Chenwei Yang
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Junlin Lu
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Ming Huang
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Clinical Research Center for Urological Diseases, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Ruihui Xie
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Clinical Research Center for Urological Diseases, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Sarah Lynch
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, Virginia, USA
- Justin Elfman
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, Virginia, USA
- Yuhang Huang
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Sen Liu
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Siting Chen
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Baoqing He
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Tianxin Lin
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Clinical Research Center for Urological Diseases, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Hui Li
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, Virginia, USA
- Xu Chen
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Clinical Research Center for Urological Diseases, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Jian Huang
- Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
- Guangdong Provincial Clinical Research Center for Urological Diseases, Department of Urology, Sun Yat‐sen Memorial Hospital, Sun Yat‐sen University, Guangzhou, Guangdong, China
2
Piecoro DW, Allison DB. Precision Medicine in Cytopathology. Surg Pathol Clin 2024; 17:329-345. [PMID: 39129134 DOI: 10.1016/j.path.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Over the last decade, cancer diagnostics has undergone a notable transformation with increasing complexity. Minimally invasive diagnostic tests, driven by advanced imaging and early detection protocols, are redefining patient care and reducing the need for more invasive procedures. Modern cytopathologists now safeguard patient samples for vital biomarker and molecular testing. In this article, we explore ancillary testing modalities and the role of biomarkers in organ-specific contexts, underscoring the transformative impact of precision medicine. Finally, the advent of more than 80 Food and Drug Administration-approved predictive biomarkers signals a new era, guiding cancer care toward personalized and targeted strategies.
Affiliation(s)
- Dava W Piecoro
- Department of Pathology and Laboratory Medicine, 800 Rose Street, MS117, University of Kentucky College of Medicine, Lexington, KY 40536, USA
- Derek B Allison
- Department of Pathology and Laboratory Medicine, 800 Rose Street, MS117, University of Kentucky College of Medicine, Lexington, KY 40536, USA; Markey Cancer Center, Lexington, KY 40536, USA; Department of Urology, University of Kentucky College of Medicine, Lexington, KY 40536, USA.
3
Singh S, Shi X, Haddox S, Elfman J, Ahmad SB, Lynch S, Manley T, Piczak C, Phung C, Sun Y, Sharma A, Li H. RTCpredictor: identification of read-through chimeric RNAs from RNA sequencing data. Brief Bioinform 2024; 25:bbae251. [PMID: 38796690 PMCID: PMC11128028 DOI: 10.1093/bib/bbae251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 03/30/2024] [Accepted: 05/09/2024] [Indexed: 05/28/2024] Open
Abstract
Read-through chimeric RNAs are being recognized as a means to expand the functional transcriptome and contribute to cancer tumorigenesis when mis-regulated. However, current software tools often fail to predict them. We have developed RTCpredictor, which uses the fast ripgrep tool to search for all possible exon-exon combinations of parental gene pairs. We also added exonic variants, allowing searches that contain common SNPs. To our knowledge, it is the first read-through chimeric RNA-specific prediction method that also provides breakpoint coordinates. Compared with 10 other popular tools, RTCpredictor achieved high sensitivity on one simulated and three real datasets. In addition, RTCpredictor has lower memory requirements and a faster execution time, making it well suited to large datasets.
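The exon-exon search idea in this abstract can be illustrated with a minimal Python sketch. It is not RTCpredictor's implementation: plain substring matching stands in for ripgrep, and the function names, the `flank` parameter, the exon names, and the toy sequences are all assumptions made up for illustration.

```python
from itertools import product


def junction_probes(upstream_exons, downstream_exons, flank=20):
    """Build one probe per exon-exon pair (hypothetical helper).

    Each probe joins the last `flank` bases of an upstream-gene exon to the
    first `flank` bases of a downstream-gene exon.
    """
    probes = {}
    for (u_name, u_seq), (d_name, d_seq) in product(
        upstream_exons.items(), downstream_exons.items()
    ):
        probes[(u_name, d_name)] = u_seq[-flank:] + d_seq[:flank]
    return probes


def count_supporting_reads(reads, probes):
    """Count reads that contain each junction probe as an exact substring."""
    counts = {pair: 0 for pair in probes}
    for read in reads:
        for pair, probe in probes.items():
            if probe in read:
                counts[pair] += 1
    return counts


# Toy, made-up sequences; real exons and reads are far longer.
upstream = {"geneA_exon1": "ACGTACGTAA", "geneA_exon2": "TTGACCGGTA"}
downstream = {"geneB_exon1": "GGCATTCAGA", "geneB_exon2": "CCATGGAATT"}

probes = junction_probes(upstream, downstream, flank=5)
reads = ["NNNCGGTAGGCATNNN", "ACGTACGT", "NNCGTAACCATGNN"]
counts = count_supporting_reads(reads, probes)

for pair, n in counts.items():
    print(pair, n)
```

Because each probe is tied to a specific exon pair, a hit also tells you which exon boundaries form the junction. Handling common SNPs, as the abstract describes, would require additional probe variants that this sketch omits.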
Affiliation(s)
- Sandeep Singh
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Xinrui Shi
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Samuel Haddox
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Justin Elfman
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Syed Basil Ahmad
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Sarah Lynch
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Tommy Manley
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Claire Piczak
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Christopher Phung
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Yunan Sun
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Aadi Sharma
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Hui Li
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, United States
4
Hamid F, Arora S, Chitkara P, Kumar S. A Protocol for the Detection of Fusion Transcripts Using RNA-Sequencing Data. Methods Mol Biol 2024; 2812:243-258. [PMID: 39068367 DOI: 10.1007/978-1-0716-3886-6_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Fusion transcripts are formed when two genes or their mRNAs fuse to produce a novel gene or chimeric transcript. Fusion genes are well-known cancer biomarkers used for cancer diagnosis and as therapeutic targets. Gene fusions are also found in normal physiology and lead to the evolution of novel genes that contribute to better survival and adaptation for an organism. Various in vitro approaches, such as FISH, PCR, RT-PCR, and chromosome banding techniques, have been used to detect gene fusions. However, all these approaches have low resolution and throughput. With the development of high-throughput next-generation sequencing technologies, the detection of fusion transcripts has become feasible using whole genome sequencing, RNA-Seq data, and bioinformatics tools. This chapter gives an overview of the general computational protocol for fusion transcript detection from RNA-sequencing datasets.
Affiliation(s)
- Fiza Hamid
- Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), New Delhi, India
- Simran Arora
- Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), New Delhi, India
- Pragya Chitkara
- Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), New Delhi, India
- Shailesh Kumar
- Bioinformatics Laboratory, National Institute of Plant Genome Research (NIPGR), New Delhi, India.
5
Vicente-Garcés C, Maynou J, Fernández G, Esperanza-Cebollada E, Torrebadell M, Català A, Rives S, Camós M, Vega-García N. Fusion InPipe, an integrative pipeline for gene fusion detection from RNA-seq data in acute pediatric leukemia. Front Mol Biosci 2023; 10:1141310. [PMID: 37363396 PMCID: PMC10288994 DOI: 10.3389/fmolb.2023.1141310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023] Open
Abstract
RNA sequencing (RNA-seq) is a reliable tool for detecting gene fusions in acute leukemia. Multiple bioinformatics pipelines have been developed to analyze RNA-seq data, but an agreed gold standard has not been established. This study aimed to compare the applicability of 5 fusion calling pipelines (Arriba, deFuse, CICERO, FusionCatcher, and STAR-Fusion), as well as to define and develop an integrative bioinformatics pipeline (Fusion InPipe) to detect clinically relevant gene fusions in acute pediatric leukemia. We analyzed RNA-seq data by each pipeline individually and by Fusion InPipe. Each algorithm individually called most of the fusions with similar sensitivity and precision. However, not all rearrangements were called, suggesting that relying on a single pipeline might miss important fusions. To address this, we integrated the results of the five algorithms into a single pipeline, Fusion InPipe, comparing the output from the agreement of 5/5, 4/5, and 3/5 algorithms. The maximum sensitivity was achieved with the agreement of 3/5 algorithms, with a global sensitivity of 95%, reaching 100% in patient data. Furthermore, we showed the necessity of filtering steps to reduce the false positive detection rate. Here, we demonstrate that Fusion InPipe is an excellent tool for fusion detection in pediatric acute leukemia, with the best performance when selecting those fusions called by at least 3/5 pipelines.
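The agreement rule described in this abstract (keep a fusion only if at least 3 of 5 callers report it) can be sketched in a few lines of Python. This is an illustrative sketch, not the Fusion InPipe code: the function name `consensus_fusions`, the `min_agreement` parameter, and the per-tool call lists are assumptions, and the example calls are invented rather than taken from the study.

```python
from collections import Counter


def consensus_fusions(calls_by_tool, min_agreement=3):
    """Return fusions reported by at least `min_agreement` tools.

    `calls_by_tool` maps a tool name to the fusions it called. Each tool
    contributes at most one vote per fusion, however often it lists it.
    """
    votes = Counter()
    for fusions in calls_by_tool.values():
        votes.update(set(fusions))
    return {fusion for fusion, n in votes.items() if n >= min_agreement}


# Invented example calls for the five pipelines named in the abstract.
calls = {
    "Arriba": ["BCR::ABL1", "ETV6::RUNX1", "KMT2A::AFF1"],
    "deFuse": ["BCR::ABL1", "ETV6::RUNX1"],
    "CICERO": ["BCR::ABL1", "KMT2A::AFF1"],
    "FusionCatcher": ["BCR::ABL1", "ETV6::RUNX1", "GENE1::GENE2"],
    "STAR-Fusion": ["BCR::ABL1"],
}

at_least_3 = consensus_fusions(calls, min_agreement=3)
all_5 = consensus_fusions(calls, min_agreement=5)
print(sorted(at_least_3))
print(sorted(all_5))
```

Lowering `min_agreement` trades precision for sensitivity, which is why the abstract pairs the 3/5 threshold with additional filtering steps. A real pipeline would also need to normalize gene names and breakpoints across tools before voting, which this sketch omits.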
Affiliation(s)
- Clara Vicente-Garcés
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Joan Maynou
- Hospital Sant Joan de Déu Barcelona, Genetics Medicine Section, Esplugues de Llobregat, Spain
- Institut de Recerca Hospital Sant Joan de Déu, Neurogenetics and Molecular Medicine, Esplugues de Llobregat, Spain
- Guerau Fernández
- Hospital Sant Joan de Déu Barcelona, Genetics Medicine Section, Esplugues de Llobregat, Spain
- Institut de Recerca Hospital Sant Joan de Déu, Neurogenetics and Molecular Medicine, Esplugues de Llobregat, Spain
- Elena Esperanza-Cebollada
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Montserrat Torrebadell
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Hospital Sant Joan de Déu Barcelona, Hematology Laboratory, Esplugues de Llobregat, Spain
- Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red De Enfermedades Raras (CIBERER), Madrid, Spain
- Albert Català
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red De Enfermedades Raras (CIBERER), Madrid, Spain
- Pediatric Cancer Center Barcelona (PCCB), Hospital Sant Joan De Déu Barcelona, Leukemia and Lymphoma Unit, Barcelona, Spain
- Susana Rives
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red De Enfermedades Raras (CIBERER), Madrid, Spain
- Pediatric Cancer Center Barcelona (PCCB), Hospital Sant Joan De Déu Barcelona, Leukemia and Lymphoma Unit, Barcelona, Spain
- Mireia Camós
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Hospital Sant Joan de Déu Barcelona, Hematology Laboratory, Esplugues de Llobregat, Spain
- Instituto de Salud Carlos III, Centro de Investigación Biomédica en Red De Enfermedades Raras (CIBERER), Madrid, Spain
- Nerea Vega-García
- Pediatric Cancer Center Barcelona (PCCB), Institut de Recerca Sant Joan de Déu, Leukemia and Pediatric Hematology Disorders, Developmental Tumors Biology Group, Esplugues de Llobregat, Spain
- Hospital Sant Joan de Déu Barcelona, Hematology Laboratory, Esplugues de Llobregat, Spain
6
Haas BJ, Dobin A, Ghandi M, Van Arsdale A, Tickle T, Robinson JT, Gillani R, Kasif S, Regev A. Targeted in silico characterization of fusion transcripts in tumor and normal tissues via FusionInspector. Cell Reports Methods 2023; 3:100467. [PMID: 37323575 PMCID: PMC10261907 DOI: 10.1016/j.crmeth.2023.100467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 02/28/2023] [Accepted: 04/14/2023] [Indexed: 06/17/2023]
Abstract
Here, we present FusionInspector for in silico characterization and interpretation of candidate fusion transcripts from RNA sequencing (RNA-seq) and exploration of their sequence and expression characteristics. We applied FusionInspector to thousands of tumor and normal transcriptomes and identified statistical and experimental features enriched among biologically impactful fusions. Through clustering and machine learning, we identified large collections of fusions potentially relevant to tumor and normal biological processes. We show that biologically relevant fusions are enriched for relatively high expression of the fusion transcript, imbalanced fusion allelic ratios, and canonical splicing patterns, and are deficient in sequence microhomologies between partner genes. We demonstrate that FusionInspector accurately validates fusion transcripts in silico and helps characterize numerous understudied fusions in tumor and normal tissue samples. FusionInspector is freely available as open source for screening, characterization, and visualization of candidate fusions via RNA-seq, and facilitates transparent explanation and interpretation of machine-learning predictions and their experimental sources.
Affiliation(s)
- Brian J. Haas
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Anne Van Arsdale
- Department of Obstetrics and Gynecology and Women’s Health, Albert Einstein Montefiore Medical Center, Bronx, NY 10461, USA
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Timothy Tickle
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- James T. Robinson
- School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Riaz Gillani
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA 02215, USA
- Boston Children’s Hospital, Boston, MA 02115, USA
- Simon Kasif
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Aviv Regev
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
7
Dorney R, Dhungel BP, Rasko JEJ, Hebbard L, Schmitz U. Recent advances in cancer fusion transcript detection. Brief Bioinform 2022; 24:6918739. [PMID: 36527429 PMCID: PMC9851307 DOI: 10.1093/bib/bbac519] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/26/2022] [Revised: 10/11/2022] [Accepted: 10/31/2022] [Indexed: 12/23/2022] Open
Abstract
Extensive investigation of gene fusions in cancer has led to the discovery of novel biomarkers and therapeutic targets. To date, most studies have neglected chromosomal rearrangement-independent fusion transcripts and complex fusion structures such as double or triple-hop fusions, and fusion-circRNAs. In this review, we untangle fusion-related terminology and propose a classification system involving both gene and transcript fusions. We highlight the importance of RNA-level fusions and how long-read sequencing approaches can improve detection and characterization. Moreover, we discuss novel bioinformatic tools to identify fusions in long-read sequencing data and strategies to experimentally validate and functionally characterize fusion transcripts.
Affiliation(s)
- Ryley Dorney
- Department of Molecular & Cell Biology, College of Public Health, Medical & Vet Sciences, James Cook University, Douglas, QLD 4811, Australia; Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- Bijay P Dhungel
- Gene and Stem Cell Therapy Program Centenary Institute, The University of Sydney, Camperdown, NSW 2050, Australia; Faculty of Medicine & Health, The University of Sydney, Camperdown, NSW 2006, Australia; Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- John E J Rasko
- Gene and Stem Cell Therapy Program Centenary Institute, The University of Sydney, Camperdown, NSW 2050, Australia; Faculty of Medicine & Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Lionel Hebbard
- Department of Molecular & Cell Biology, College of Public Health, Medical & Vet Sciences, James Cook University, Douglas, QLD 4811, Australia; Storr Liver Centre, Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, New South Wales, Australia
- Ulf Schmitz
- Corresponding author. Ulf Schmitz, Department of Molecular and Cell Biology, College of Public Health, Medical and Vet Sciences, James Cook University, Douglas, QLD 4811, Australia. E-mail:
8
mRNA Capture Sequencing and RT-qPCR for the Detection of Pathognomonic, Novel, and Secondary Fusion Transcripts in FFPE Tissue: A Sarcoma Showcase. Int J Mol Sci 2022; 23:ijms231911007. [PMID: 36232302 PMCID: PMC9569610 DOI: 10.3390/ijms231911007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 09/12/2022] [Accepted: 09/12/2022] [Indexed: 11/17/2022] Open
Abstract
We assess the performance of mRNA capture sequencing to identify fusion transcripts in FFPE tissue of different sarcoma types, followed by RT-qPCR confirmation. To validate our workflow, six positive control tumors with a specific chromosomal rearrangement were analyzed using the TruSight RNA Pan-Cancer Panel. Fusion transcript calling by FusionCatcher confirmed these aberrations and enabled the identification of both fusion gene partners and breakpoints. Next, whole-transcriptome TruSeq RNA Exome sequencing was applied to 17 fusion gene-negative alveolar rhabdomyosarcoma (ARMS) or undifferentiated round cell sarcoma (URCS) tumors, for whom fluorescence in situ hybridization (FISH) did not identify the classical pathognomonic rearrangements. For six patients, a pathognomonic fusion transcript was readily detected, i.e., PAX3-FOXO1 in two ARMS patients, and EWSR1-FLI1, EWSR1-ERG, or EWSR1-NFATC2 in four URCS patients. For the 11 remaining patients, 11 newly identified fusion transcripts were confirmed by RT-qPCR, including COPS3-TOM1L2, NCOA1-DTNB, WWTR1-LINC01986, PLAA-MOB3B, AP1B1-CHEK2, and BRD4-LEUTX fusion transcripts in ARMS patients. Additionally, recurrently detected secondary fusion transcripts in patients diagnosed with EWSR1-NFATC2-positive sarcoma were confirmed (COPS4-TBC1D9, PICALM-SYTL2, SMG6-VPS53, and UBE2F-ALS2). In conclusion, this study shows that mRNA capture sequencing enhances the detection rate of pathognomonic fusions and enables the identification of novel and secondary fusion transcripts in sarcomas.
9
Saliba J, Church AJ, Rao S, Danos A, Furtado LV, Laetsch T, Zhang L, Nardi V, Lin WH, Ritter DI, Madhavan S, Li MM, Griffith OL, Griffith M, Raca G, Roy A. Standardized evidence-based approach for assessment of oncogenic and clinical significance of NTRK fusions. Cancer Genet 2022; 264-265:50-59. [PMID: 35366592 PMCID: PMC9252326 DOI: 10.1016/j.cancergen.2022.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2021] [Revised: 02/13/2022] [Accepted: 03/07/2022] [Indexed: 11/17/2022]
Abstract
Gene fusions involving the neurotrophic receptor tyrosine kinase genes NTRK1, NTRK2, and NTRK3, are well established oncogenic drivers in a broad range of pediatric and adult tumors. These fusions are also important actionable markers, predicting often dramatic response to FDA approved kinase inhibitors. Accurate interpretation of the clinical significance of NTRK fusions is a high priority for diagnostic laboratories, but remains challenging and time consuming given the rapid pace of new data accumulation, the diversity of fusion partners and tumor types, and heterogeneous and incomplete information in variant databases and knowledgebases. The ClinGen NTRK Fusions Somatic Cancer Variant Curation Expert Panel (SC-VCEP) was formed to systematically address these challenges and create an expert-curated resource to support clinicians, researchers, patients and their families in making accurate interpretations and informed treatment decisions for NTRK fusion-driven tumors. We describe a system for NTRK fusion interpretation (including compilation of key elements and annotations) developed by the NTRK fusions SC-VCEP. We illustrate this stepwise process on examples of LMNA::NTRK1 and KANK1::NTRK2 fusions. Finally, we provide detailed analysis of current representation of NTRK fusions in public fusion databases and the CIViC knowledgebase, performed by the NTRK fusions SC-VCEP to determine existing gaps and prioritize future curation activities.
Affiliation(s)
- Jason Saliba
- Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO, United States
- Alanna J Church
- Department of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA, United States
- Shruti Rao
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington D.C., United States
- Arpad Danos
- Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO, United States
- Larissa V Furtado
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, United States
- Theodore Laetsch
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA; Abramson Cancer Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Liying Zhang
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA, United States
- Valentina Nardi
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, United States
- Wan-Hsin Lin
- Department of Cancer Biology, Mayo Clinic Florida, Jacksonville, FL, United States
- Deborah I Ritter
- Department of Pediatrics, Baylor College of Medicine and Texas Children's Hospital, Houston, TX, United States
- Subha Madhavan
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington D.C., United States; AstraZeneca, Cambridge, United Kingdom
- Marilyn M Li
- Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Obi L Griffith
- Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO, United States
- Malachi Griffith
- Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO, United States
- Gordana Raca
- Department of Pathology and Laboratory Medicine, Children's Hospital of Los Angeles, Los Angeles, CA, United States
- Angshumoy Roy
- Department of Pediatrics, Baylor College of Medicine and Texas Children's Hospital, Houston, TX, United States; Department of Pathology & Immunology, Baylor College of Medicine, Houston, TX, United States; Department of Pathology, Texas Children's Hospital, Houston, TX, United States.
10
Abstract
Distilling biologically meaningful information from cancer genome sequencing data requires comprehensive identification of somatic alterations using rigorous computational methods. As the amount and complexity of sequencing data have increased, so has the number of tools for analysing them. Here, we describe the main steps involved in the bioinformatic analysis of cancer genomes, review key algorithmic developments and highlight popular tools and emerging technologies. These tools include those that identify point mutations, copy number alterations, structural variations and mutational signatures in cancer genomes. We also discuss issues in experimental design, the strengths and limitations of sequencing modalities and methodological challenges for the future.
|
11
|
Sun Y, Li H. Chimeric RNAs Discovered by RNA Sequencing and Their Roles in Cancer and Rare Genetic Diseases. Genes (Basel) 2022; 13:741. [PMID: 35627126 PMCID: PMC9140685 DOI: 10.3390/genes13050741] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/09/2022] [Revised: 04/13/2022] [Accepted: 04/20/2022] [Indexed: 12/30/2022]
Abstract
Chimeric RNAs are transcripts that are generated by gene fusion and intergenic splicing events, thus comprising nucleotide sequences from different parental genes. In the past, Northern blot analysis and RT-PCR were used to detect chimeric RNAs. However, they are low-throughput and can be time-consuming, labor-intensive, and cost-prohibitive. With the development of RNA-seq and transcriptome analyses over the past decade, the number of chimeric RNAs identified in cancer as well as in rare inherited diseases has dramatically increased. Chimeric RNAs may be potential diagnostic biomarkers when they are specifically expressed in cancerous cells and/or tissues. Some chimeric RNAs can also play a role in cell proliferation and cancer development, acting as tools for cancer prognosis and revealing new insights into the cell origin of tumors. Because it can characterize a whole transcriptome at high sequencing depth and identify intergenically spliced chimeric RNAs produced in the absence of chromosomal rearrangement, RNA sequencing has not only enhanced our ability to diagnose genetic diseases, but also provided us with a deeper understanding of these diseases. Here, we reviewed the mechanisms of chimeric RNA formation and the utility of RNA sequencing for discovering chimeric RNAs in several types of cancer and rare inherited diseases. We also discussed the diagnostic, prognostic, and therapeutic values of chimeric RNAs.
Affiliation(s)
- Yunan Sun
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA;
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
| | - Hui Li
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA;
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
| |
|
12
|
Fusion Genes in Prostate Cancer: A Comparison in Men of African and European Descent. BIOLOGY 2022; 11:biology11050625. [PMID: 35625354 PMCID: PMC9137560 DOI: 10.3390/biology11050625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 04/03/2022] [Accepted: 04/06/2022] [Indexed: 11/21/2022]
Abstract
Simple Summary: Men of African origin have a 2–3 times greater chance of developing prostate cancer than those of European origin, and of patients that are diagnosed with the disease, men of African descent are 2 times more likely to die compared to white men. Men of African origin are still greatly underrepresented in genetic studies and clinical trials. This, unfortunately, means that new discoveries in cancer treatment are missing key information on the group with a greater chance of mortality. A fusion gene is a hybrid gene formed from two previously independent genes. Fusion genes have been found to be common in all main types of human cancer. The objective of this study was to increase our knowledge of fusion genes in prostate cancer using computational approaches and to compare fusion genes between men of African and European origin. This identified novel gene fusions unique to men of African origin and suggested that this group has a greater number of fusion genes.
Abstract: Prostate cancer is one of the most prevalent cancers worldwide, particularly affecting men living a western lifestyle and of African descent, suggesting risk factors that are genetic, environmental, and socioeconomic in nature. In the USA, African American (AA) men are disproportionately affected, on average suffering from a higher grade of the disease and at a younger age compared to men of European descent (EA). Fusion genes are chimeric products formed by the merging of two separate genes occurring as a result of chromosomal structural changes, for example, inversion or trans/cis-splicing of neighboring genes. They are known drivers of cancer and have been identified in 20% of cancers. Improvements in genomics technologies such as RNA-sequencing, coupled with better algorithms for prediction of fusion genes, have added to our knowledge of specific gene fusions in cancers. At present, AA men are underrepresented in genomic studies of prostate cancer. The primary goal of this study was to examine molecular differences in predicted fusion genes in a cohort of AA and EA men in the context of prostate cancer using computational approaches. RNA was purified from prostate tissue specimens obtained at surgery from subjects enrolled in the study. Fusion gene predictions were performed using four different fusion gene detection programs. This identified novel putative gene fusions unique to AA men and suggested that the fusion gene burden was higher in AA compared to EA men.
|
13
|
Hafstað V, Søkilde R, Häkkinen J, Larsson M, Vallon-Christersson J, Rovira C, Persson H. Regulatory networks and 5' partner usage of miRNA host gene fusions in breast cancer. Int J Cancer 2022; 151:95-106. [PMID: 35182081 PMCID: PMC9303785 DOI: 10.1002/ijc.33972] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/14/2021] [Revised: 02/07/2022] [Accepted: 02/10/2022] [Indexed: 11/12/2022]
Abstract
Genomic rearrangements in cancer cells can create gene fusions where the juxtaposition of two different genes leads to the production of chimeric proteins or altered gene expression through promoter‐swapping. We have previously shown that fusion transcripts involving microRNA (miRNA) host genes contribute to deregulation of miRNA expression regardless of the protein‐coding potential of these transcripts. Many different genes can also be used as 5′ partners by a miRNA host gene in what we named recurrent miRNA‐convergent fusions. Here, we have explored the properties of 5′ partners in fusion transcripts that involve miRNA hosts in breast tumours from The Cancer Genome Atlas (TCGA). We hypothesised that firstly, 5′ partner genes should belong to pathways and transcriptional programmes that reflect the tumour phenotype and secondly, there should be a selection for fusion events that shape miRNA expression to benefit the tumour cell through the known hallmarks of cancer. We found that the set of 5′ partners in miRNA host fusions is non‐random, with overrepresentation of highly expressed genes in pathways active in cancer including epithelial‐to‐mesenchymal transition, translational regulation and oestrogen signalling. Furthermore, many miRNAs were upregulated in samples with host gene fusions, including established oncogenic miRNAs such as mir‐21 and the mir‐106b~mir‐93~mir‐25 cluster. To the list of mechanisms for deregulation of miRNA expression, we have added fusion transcripts that change the promoter region. We propose that this adds material for genetic selection and tumour evolution in cancer cells and that miRNA host fusions can act as tumour ‘drivers’.
Affiliation(s)
- Völundur Hafstað
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| | - Rolf Søkilde
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| | - Jari Häkkinen
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| | - Malin Larsson
- Department of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Linköping University, Linköping, Sweden
| | - Johan Vallon-Christersson
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| | - Carlos Rovira
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| | - Helena Persson
- Lund University Cancer Centre, Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund, Sweden
| |
|
14
|
Deng W, Murugan S, Lindberg J, Chellappa V, Shen X, Pawitan Y, Vu TN. Fusion Gene Detection Using Whole-Exome Sequencing Data in Cancer Patients. Front Genet 2022; 13:820493. [PMID: 35251131 PMCID: PMC8888970 DOI: 10.3389/fgene.2022.820493] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/23/2021] [Accepted: 01/31/2022] [Indexed: 12/13/2022]
Abstract
Several fusion genes are directly involved in the initiation and progression of cancers. Numerous bioinformatics tools have been developed to detect fusion events, but they are mainly based on RNA-seq data. Whole-exome sequencing (WES) is a powerful technology that is widely used for disease-related DNA variant detection. In this study, we build a novel analysis pipeline called Fuseq-WES to detect fusion genes at the DNA level based on WES data. The same method also applies to targeted panel sequencing data. We assess the method on real datasets of acute myeloid leukemia (AML) and prostate cancer patients. The results show that two of the main AML fusion genes discovered in RNA-seq data, PML-RARA and CBFB-MYH11, are detected in the WES data in 36% and 63% of the available samples, respectively. For the targeted deep-sequencing of prostate cancer patients, detection of the TMPRSS2-ERG fusion, which is the most frequent chimeric alteration in prostate cancer, is 91% concordant with a manually curated procedure based on four other methods. In summary, the overall results indicate that it is challenging to detect fusion genes in WES data with a standard coverage of ∼15–30x, where fusion candidates discovered in the RNA-seq data are often not detected in the WES data and vice versa. A subsampling study of the prostate data suggests that a coverage of at least 75x is necessary to achieve high accuracy.
Affiliation(s)
- Wenjiang Deng
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Sarath Murugan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Johan Lindberg
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Venkatesh Chellappa
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Xia Shen
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Biostatistics Group, Greater Bay Area Institute of Precision Medicine, Fudan University, Guangzhou, China
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
| | - Yudi Pawitan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- *Correspondence: Yudi Pawitan; Trung Nghia Vu
| | - Trung Nghia Vu
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- *Correspondence: Yudi Pawitan; Trung Nghia Vu
| |
|
15
|
Davidson NM, Chen Y, Sadras T, Ryland GL, Blombery P, Ekert PG, Göke J, Oshlack A. JAFFAL: detecting fusion genes with long-read transcriptome sequencing. Genome Biol 2022; 23:10. [PMID: 34991664 PMCID: PMC8739696 DOI: 10.1186/s13059-021-02588-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/26/2021] [Accepted: 12/22/2021] [Indexed: 12/26/2022]
Abstract
In cancer, fusions are important diagnostic markers and targets for therapy. Long-read transcriptome sequencing allows the discovery of fusions with their full-length isoform structure. However, due to higher sequencing error rates, fusion finding algorithms designed for short reads do not work. Here we present JAFFAL, to identify fusions from long-read transcriptome sequencing. We validate JAFFAL using simulations, cell lines, and patient data from Nanopore and PacBio. We apply JAFFAL to single-cell data and find fusions spanning three genes demonstrating transcripts detected from complex rearrangements. JAFFAL is available at https://github.com/Oshlack/JAFFA/wiki.
Affiliation(s)
- Nadia M Davidson
- Peter MacCallum Cancer Centre, Victoria, Australia.
- School of BioSciences, University of Melbourne, Victoria, Australia.
- The Walter and Eliza Hall Institute, Victoria, Australia.
| | - Ying Chen
- Genome Institute of Singapore, Singapore, Singapore
| | - Teresa Sadras
- Peter MacCallum Cancer Centre, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Victoria, Australia
| | - Georgina L Ryland
- Peter MacCallum Cancer Centre, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Victoria, Australia
- Centre for Cancer Research, University of Melbourne, Victoria, Australia
| | - Piers Blombery
- Peter MacCallum Cancer Centre, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Victoria, Australia
| | - Paul G Ekert
- Peter MacCallum Cancer Centre, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Victoria, Australia
- Children's Cancer Institute, Lowy Cancer Centre, UNSW, Sydney, NSW, Australia
- School of Women's and Children's Health, UNSW, Sydney, NSW, Australia
- Murdoch Children's Research Institute, Victoria, Australia
| | - Jonathan Göke
- Genome Institute of Singapore, Singapore, Singapore
- National Cancer Centre Singapore, Singapore, Singapore
| | - Alicia Oshlack
- Peter MacCallum Cancer Centre, Victoria, Australia.
- School of BioSciences, University of Melbourne, Victoria, Australia.
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Victoria, Australia.
| |
|
16
|
Kerbs P, Vosberg S, Krebs S, Graf A, Blum H, Swoboda A, Batcha AMN, Mansmann U, Metzler D, Heckman CA, Herold T, Greif PA. Fusion gene detection by RNA-sequencing complements diagnostics of acute myeloid leukemia and identifies recurring NRIP1-MIR99AHG rearrangements. Haematologica 2022; 107:100-111. [PMID: 34134471 PMCID: PMC8719081 DOI: 10.3324/haematol.2021.278436] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/26/2021] [Accepted: 05/03/2021] [Indexed: 12/04/2022]
Abstract
Identification of fusion genes in clinical routine is mostly based on cytogenetics and targeted molecular genetics, such as metaphase karyotyping, fluorescence in situ hybridization and reverse-transcriptase polymerase chain reaction. However, sequencing technologies are becoming more important in clinical routine as processing time and costs per sample decrease. To evaluate the performance of fusion gene detection by RNA-sequencing compared to standard diagnostic techniques, we analyzed 806 RNA-sequencing samples from patients with acute myeloid leukemia using two state-of-the-art software tools, namely Arriba and FusionCatcher. RNA-sequencing detected 90% of fusion events that were reported by routine with high evidence, while samples in which RNA-sequencing failed to detect fusion genes had overall lower and inhomogeneous sequence coverage. Based on properties of known and unknown fusion events, we developed a workflow with integrated filtering strategies for the identification of robust fusion gene candidates by RNA-sequencing. Thereby, we detected known recurrent fusion events in 26 cases that were not reported by routine and found discrepancies in evidence for known fusion events between routine and RNA-sequencing in three cases. Moreover, we identified 157 fusion genes as novel robust candidates and comparison to entries from ChimerDB or Mitelman Database showed novel recurrence of fusion genes in 14 cases. Finally, we detected the novel recurrent fusion gene NRIP1-MIR99AHG resulting from inv(21)(q11.2;q21.1) in nine patients (1.1%) and LTN1-MX1 resulting from inv(21)(q21.3;q22.3) in two patients (0.25%). We demonstrated that NRIP1-MIR99AHG results in overexpression of the 3' region of MIR99AHG and the disruption of the tricistronic miRNA cluster miR-99a/let-7c/miR-125b-2. Interestingly, upregulation of MIR99AHG and deregulation of the miRNA cluster, residing in the MIR99AHG locus, are known mechanisms of leukemogenesis in acute megakaryoblastic leukemia. Our findings demonstrate that RNA-sequencing has a strong potential to improve the systematic detection of fusion genes in clinical applications and provides a valuable tool for fusion discovery.
Affiliation(s)
- Paul Kerbs
- Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), partner site Munich; and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Sebastian Vosberg
- Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), partner site Munich; and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Stefan Krebs
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, LMU Munich, Munich, Germany
| | - Alexander Graf
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, LMU Munich, Munich, Germany
| | - Helmut Blum
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, LMU Munich, Munich, Germany
| | - Anja Swoboda
- Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Aarif M N Batcha
- Department of Medical Data Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
| | - Ulrich Mansmann
- Department of Medical Data Processing, Biometry and Epidemiology, LMU Munich, Munich, Germany
| | - Dirk Metzler
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Planegg-Martinsried, Germany
| | - Caroline A Heckman
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Tobias Herold
- Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), partner site Munich; and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Philipp A Greif
- Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), partner site Munich; and German Cancer Research Center (DKFZ), Heidelberg, Germany.
| |
|
17
|
Detroja R, Gorohovski A, Giwa O, Baum G, Frenkel-Morgenstern M. ChiTaH: a fast and accurate tool for identifying known human chimeric sequences from high-throughput sequencing data. NAR Genom Bioinform 2021; 3:lqab112. [PMID: 34859212 PMCID: PMC8633610 DOI: 10.1093/nargab/lqab112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/23/2021] [Revised: 10/21/2021] [Accepted: 11/22/2021] [Indexed: 12/16/2022]
Abstract
Fusion genes or chimeras typically comprise sequences from two different genes. The chimeric RNAs of such joined sequences often serve as cancer drivers. Identifying such driver fusions in a given cancer or complex disease is important for diagnosis and treatment. The advent of next-generation sequencing technologies, such as DNA-Seq or RNA-Seq, together with the development of suitable computational tools, has made the global identification of chimeras in tumors possible. However, the testing of over 20 computational methods showed these to be limited in terms of chimera prediction sensitivity, specificity, and accurate quantification of junction reads. These shortcomings motivated us to develop the first ‘reference-based’ approach termed ChiTaH (Chimeric Transcripts from High-throughput sequencing data). ChiTaH uses 43,466 non-redundant known human chimeras as a reference database to map sequencing reads and to accurately identify chimeric reads. We benchmarked ChiTaH and four other methods to identify human chimeras, leveraging both simulated and real sequencing datasets. ChiTaH was found to be the most accurate and fastest method for identifying known human chimeras from simulated and real sequencing datasets. Moreover, ChiTaH uncovered heterogeneity of the BCR-ABL1 chimera in both bulk samples and single cells of the K-562 cell line, which was confirmed experimentally.
Affiliation(s)
- Rajesh Detroja
- Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
| | - Alessandro Gorohovski
- Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
| | - Olawumi Giwa
- Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
| | - Gideon Baum
- Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
| | - Milana Frenkel-Morgenstern
- Cancer Genomics and BioComputing of Complex Diseases Lab, Azrieli Faculty of Medicine, Bar-Ilan University, Safed 1311502, Israel
| |
|
18
|
Liu D, Xia J, Yang Z, Zhao X, Li J, Hao W, Yang X. Identification of Chimeric RNAs in Pig Skeletal Muscle and Transcriptomic Analysis of Chimeric RNA TNNI2-ACTA1 V1. Front Vet Sci 2021; 8:742593. [PMID: 34778431 PMCID: PMC8578878 DOI: 10.3389/fvets.2021.742593] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/16/2021] [Accepted: 09/27/2021] [Indexed: 12/11/2022]
Abstract
Chimeric RNAs were once considered specific markers of cancer. However, recent studies have demonstrated that chimeric RNAs also exist in non-cancerous cells and tissues. Here, we jointly predicted 49 chimeric RNAs using STAR-Fusion and FusionMap. One chimeric RNA, which we named TNNI2-ACTA1, and its eight transcript variants were identified by reverse transcriptase-polymerase chain reaction. Overexpression of TNNI2-ACTA1 V1 inhibited the proliferation of porcine skeletal muscle satellite cells (PSCs) by down-regulating the mRNA expression of the cell cycle-related gene cyclin D1. However, the parental genes TNNI2 and ACTA1 had no such effect. To explore the underlying mechanism for this phenomenon, we used RNA-seq to profile the transcriptomes of PSCs overexpressing these genes. Compared with the negative control group, 1,592 differentially expressed genes (DEGs) were upregulated and 1,077 DEGs downregulated in the TNNI2 group; 1,226 DEGs were upregulated and 902 DEGs downregulated in the ACTA1 group; and 13 DEGs were upregulated and 16 DEGs downregulated in the TNNI2-ACTA1 V1 group. Compared with the parental gene groups, three specific genes were enriched in the TNNI2-ACTA1 V1 group (NCOA3, Radixin, and DDR2). These three genes may be the key to TNNI2-ACTA1 V1 regulating cell proliferation. Taken together, our study explores the role of chimeric RNAs in normal tissues. In addition, our study is the first to provide a foundation for understanding the mechanism by which chimeric RNAs regulate porcine skeletal muscle growth.
Affiliation(s)
- Dongyu Liu
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Jiqiao Xia
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Zewei Yang
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Xuelian Zhao
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Jiaxin Li
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Wanjun Hao
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| | - Xiuqin Yang
- College of Animal Sciences and Technology, Northeast Agricultural University, Harbin, China
| |
|
19
|
Schischlik F. Transcriptional configurations of myeloproliferative neoplasms. INTERNATIONAL REVIEW OF CELL AND MOLECULAR BIOLOGY 2021; 366:25-39. [PMID: 35153005 DOI: 10.1016/bs.ircmb.2021.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Myeloproliferative neoplasms (MPNs) is an umbrella term for several heterogeneous diseases characterized by their stem cell origin, clonal hematopoiesis and an increase in blood cells of the myeloid lineage. The focus here is on the BCR-ABL1-negative MPNs: polycythemia vera (PV), primary myelofibrosis (PMF) and essential thrombocythemia (ET). Seminal findings in the MPN field were driven by genomic analysis focused on dissecting genomic changes in MPN patients. This led to the identification of the major MPN driver genes JAK2, MPL and CALR. Transcriptomic analysis promises to bridge the gap between genetic and phenotypic characterization of each patient's tumor and, with the advent of single-cell sequencing, of each MPN cancer cell. This review focuses on efforts to mine the bulk transcriptome of MPN patients, including analysis of fusion genes and splicing alterations, which can be addressed with RNA-seq technologies. Furthermore, it reviews recent endeavors to elucidate tumor heterogeneity in MPN hematopoietic stem and progenitor cells using single-cell technologies. Finally, it highlights current shortcomings and future applications that could advance MPN biology and improve patient diagnostics using RNA-based assays.
Affiliation(s)
- Fiorella Schischlik
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, United States.
| |
|
20
|
Glenfield C, Innan H. Gene Duplication and Gene Fusion Are Important Drivers of Tumourigenesis during Cancer Evolution. Genes (Basel) 2021; 12:1376. [PMID: 34573358 PMCID: PMC8466788 DOI: 10.3390/genes12091376] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/02/2021] [Revised: 08/27/2021] [Accepted: 08/29/2021] [Indexed: 02/07/2023]
Abstract
Chromosomal rearrangement and genome instability are common features of cancer cells in humans. Consequently, gene duplication and gene fusion events are frequently observed in human malignancies and many of the products of these events are pathogenic, representing significant drivers of tumourigenesis and cancer evolution. In certain subsets of cancers, duplicated and fused genes appear to be essential for initiation of tumour formation, and some even have the capability of transforming normal cells, highlighting the importance of understanding the events that result in their formation. The mechanisms that drive gene duplication and fusion are unregulated in cancer and they facilitate rapid evolution by selective forces akin to Darwinian survival of the fittest on a cellular level. In this review, we examine current knowledge of the landscape and prevalence of gene duplication and gene fusion in human cancers.
Affiliation(s)
| | - Hideki Innan
- Department of Evolutionary Studies of Biosystems, SOKENDAI, The Graduate University for Advanced Studies, Shonan Village, Hayama, Kanagawa 240-0193, Japan
| |
|
21
|
Shibata E, Morita KI, Kayamori K, Tange S, Shibata H, Harazono Y, Michi Y, Ikeda T, Harada H, Imoto I, Yoda T. Detection of novel fusion genes by next-generation sequencing-based targeted RNA sequencing analysis in adenoid cystic carcinoma of head and neck. Oral Surg Oral Med Oral Pathol Oral Radiol 2021; 132:426-433. [PMID: 34413003 DOI: 10.1016/j.oooo.2021.03.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2020] [Revised: 02/12/2021] [Accepted: 03/21/2021] [Indexed: 11/18/2022]
Abstract
OBJECTIVE: Adenoid cystic carcinoma (AdCC) is a rare, indolent salivary gland tumor that is reported to be driven by fusion genes. However, MYB/MYBL1-NFIB fusions have been detected in <60% of all AdCC cases and the oncogenic driver mutations in approximately 40% of AdCC remain unknown. Our aim was to identify novel gene fusions in AdCC.
STUDY DESIGN: We investigated 20 AdCC cases using a targeted RNA sequencing panel to identify gene fusions and performed quantitative real-time reverse transcription polymerase chain reaction to assess MYB, MYBL1, and NFIB expression levels.
RESULTS: A total of 36 fusion transcripts in 15 cases were detected and validated by Sanger sequencing. The MYB-NFIB and MYBL1-NFIB fusion genes were detected in 9 and 3 cases, respectively, in a mutually exclusive manner. Furthermore, novel gene fusions, namely, NFIB-EPB41L2, MAP7-NFIB, NFIB-MCMDC2, MYBL1-C8orf34, C8orf34-NFIB, and NFIB-CASC20, were identified. Among them, NFIB-EPB41L2 and NFIB-MCMDC2 are thought to activate MYB and MYBL1 expression, respectively, through the insertion of a genomic segment in proximity to MYB and MYBL1 genes, respectively.
CONCLUSION: Six novel gene fusions other than MYB/MYBL1-NFIB were identified. The detection of novel fusion genes and investigation of the molecular mechanism will contribute to the development of novel molecular targeted therapies for this disease.
Affiliation(s)
- Eri Shibata
- Department of Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Kei-Ichi Morita
- Department of Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan; Bioresource Research Center, Tokyo Medical and Dental University, Tokyo, Japan.
| | - Kou Kayamori
- Department of Oral Pathology, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Shoichiro Tange
- Department of Medical Genome Sciences, Research Institute for Frontier Medicine, Sapporo Medical University School of Medicine, Sapporo, Japan
| | - Hiroki Shibata
- Division of Genomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
| | - Yosuke Harazono
- Department of Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Yasuyuki Michi
- Department of Oral and Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Tohru Ikeda
- Department of Oral Pathology, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Hiroyuki Harada
- Department of Oral and Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| | - Issei Imoto
- Division of Molecular Genetics, Aichi Cancer Center Research Institute, Aichi, Japan
| | - Tetsuya Yoda
- Department of Maxillofacial Surgery, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
| |
|
22
|
Creason A, Haan D, Dang K, Chiotti KE, Inkman M, Lamb A, Yu T, Hu Y, Norman TC, Buchanan A, van Baren MJ, Spangler R, Rollins MR, Spellman PT, Rozanov D, Zhang J, Maher CA, Caloian C, Watson JD, Uhrig S, Haas BJ, Jain M, Akeson M, Ahsen ME, Stolovitzky G, Guinney J, Boutros PC, Stuart JM, Ellrott K. A community challenge to evaluate RNA-seq, fusion detection, and isoform quantification methods for cancer discovery. Cell Syst 2021; 12:827-838.e5. [PMID: 34146471 PMCID: PMC8376800 DOI: 10.1016/j.cels.2021.05.021] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/01/2019] [Revised: 09/15/2020] [Accepted: 05/25/2021] [Indexed: 02/03/2023]
Abstract
The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Allison Creason
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - David Haan
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | | | - Kami E Chiotti
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Matthew Inkman
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | | | | | - Yin Hu
- Sage Bionetworks, Seattle, WA, USA
| | | | - Alex Buchanan
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Marijke J van Baren
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Ryan Spangler
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - M Rick Rollins
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Paul T Spellman
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Dmitri Rozanov
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
| | - Jin Zhang
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | - Christopher A Maher
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St. Louis, MO 63110, USA
| | - Cristian Caloian
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada
| | - John D Watson
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada
| | - Sebastian Uhrig
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ) and Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Brian J Haas
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Miten Jain
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mark Akeson
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mehmet Eren Ahsen
- Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, New York, NY 1498, USA
| | - Gustavo Stolovitzky
- Icahn School of Medicine at Mount Sinai, Department of Genetics and Genomic Sciences, One Gustave Levy Place, New York, NY 1498, USA; IBM T.J. Watson Research Center, 1101 Kitchawan Road, Route 134, Yorktown Heights, NY 10598, USA
| | | | - Paul C Boutros
- Computational Biology, Ontario Institute for Cancer Research, Toronto, Canada; Departments of Medical Biophysics and Pharmacology & Toxicology, University of Toronto, Toronto, Canada; Departments of Human Genetics and Urology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Joshua M Stuart
- Biomolecular Engineering and UC Santa Cruz Genome Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Kyle Ellrott
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA.
| |
|
23
|
Chebib I, Taylor MS, Nardi V, Rivera MN, Lennerz JK, Cote GM, Choy E, Lozano Calderón SA, Raskin KA, Schwab JH, Mullen JT, Chen YLE, Hung YP, Nielsen GP, Deshpande V. Clinical Utility of Anchored Multiplex Solid Fusion Assay for Diagnosis of Bone and Soft Tissue Tumors. Am J Surg Pathol 2021; 45:1127-1137. [PMID: 34115673 DOI: 10.1097/pas.0000000000001745] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/23/2023]
Abstract
Sarcoma diagnosis has become increasingly complex, requiring a combination of morphology, immunohistochemistry, and molecular studies to derive specific diagnoses. We evaluated the role of anchored multiplex polymerase chain reaction-based gene fusion assay in sarcoma diagnostics. Between 2015 and 2018, bone and soft tissue sarcomas with fusion assay results were compared with the histologic diagnosis. Of 143 sarcomas tested for fusions, 43 (30%) had a detectable fusion. In review, they could be classified into 2 main categories: (1) 31 tumors with concordant morphologic and fusion data; and (2) 12 tumors where the fusion panel identified an unexpected rearrangement that played a significant role in classification. The overall concordance of the fusion assay results with morphology/immunohistochemistry or alternate confirmatory molecular studies was 83%. Collectively, anchored multiplex polymerase chain reaction-based solid fusion assay represents a robust means of detecting targeted fusions with known and novel partners. The predictive value of the panel is highest in tumors that show a monomorphic cell population, round cell tumors, as well as tumors rich in inflammatory cells. However, with an increased ability to discover fusions of uncertain significance, it remains essential to emphasize that the diagnosis of bone and soft tissue neoplasms requires the integration of morphology and immunohistochemical profile with these molecular methods, for accurate diagnosis and optimal clinical management of sarcomas.
Affiliation(s)
| | | | | | | | | | - Gregory M Cote
- Department of Internal Medicine, Division of Hematology/Oncology
| | - Edwin Choy
- Department of Internal Medicine, Division of Hematology/Oncology
| | | | | | | | | | - Yen-Lin E Chen
- Radiation Oncology, Massachusetts General Hospital and Harvard Medical School, Boston, MA
| | - Yin P Hung
- James Homer Wright Pathology Laboratories
| | | | | |
|
24
|
Singh S, Li H. Comparative study of bioinformatic tools for the identification of chimeric RNAs from RNA Sequencing. RNA Biol 2021; 18:254-267. [PMID: 34142643 DOI: 10.1080/15476286.2021.1940047] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/28/2022] Open
Abstract
Chimeric RNAs are gaining more and more attention as they have broad implications in both cancer and normal physiology. To date, over 40 chimeric RNA prediction methods have been developed to facilitate their identification from RNA sequencing data. However, a limited number of studies have been conducted to compare the performance of these tools; additionally, previous studies have become outdated as more software tools have been developed within the last three years. In this study, we benchmarked 16 chimeric RNA prediction software tools, including seven top performers in previous benchmarking studies and nine that were recently developed. Using two simulated and two real RNA-Seq datasets, we compared the 16 tools for their sensitivity, positive predictive value (PPV), and F-measure, and documented their computational requirements (time and memory). We noticed that none of the tools is all-inclusive, and their performance varies depending on the dataset and objects. To increase the detection of true positive events, we also evaluated the pair-wise combination of these methods to suggest the best combination for sensitivity and F-measure. In addition, we compared the performance of the tools for the identification of three classes (read-through, inter-chromosomal and intra-others) of chimeric RNAs. Finally, we performed TOPSIS analyses and ranked the weighted performance of the 16 tools.
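The three accuracy metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes each tool's output and the truth set are reduced to sets of gene-pair tuples, and it models a pair-wise combination as the union of two tools' calls; the abstract does not say how the authors combined tools, so that choice, the function names, and the toy fusion lists are all assumptions made for illustration.

```python
def benchmark_metrics(predicted, truth):
    """Return (sensitivity, PPV, F-measure) for one tool on one dataset."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # true positives: called and real
    fp = len(predicted - truth)   # false positives: called but not real
    fn = len(truth - predicted)   # false negatives: real but missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    total = sensitivity + ppv
    # F-measure is the harmonic mean of sensitivity and PPV.
    f_measure = 2 * sensitivity * ppv / total if total else 0.0
    return sensitivity, ppv, f_measure


def union_calls(calls_a, calls_b):
    """Assumed combination rule: a fusion counts if either tool reports it."""
    return set(calls_a) | set(calls_b)


# Toy data for illustration only; GENE1-GENE2 stands in for a false positive.
truth = {("BCR", "ABL1"), ("EML4", "ALK"), ("TMPRSS2", "ERG"), ("KIAA1549", "BRAF")}
tool_a = {("BCR", "ABL1"), ("EML4", "ALK"), ("GENE1", "GENE2")}
tool_b = {("TMPRSS2", "ERG"), ("BCR", "ABL1")}

metrics_a = benchmark_metrics(tool_a, truth)
metrics_b = benchmark_metrics(tool_b, truth)
metrics_union = benchmark_metrics(union_calls(tool_a, tool_b), truth)

for name, m in [("tool_a", metrics_a), ("tool_b", metrics_b), ("union", metrics_union)]:
    print(f"{name}: sensitivity={m[0]:.3f} PPV={m[1]:.3f} F={m[2]:.3f}")
```

In this toy case the union recovers more true fusions than either tool alone, at the cost of inheriting the false positive from `tool_a`, which is the trade-off a pair-wise evaluation has to weigh.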
Affiliation(s)
- Sandeep Singh
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
| | - Hui Li
- Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA; Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA, USA
| |
|
25
|
Cell-Free Total Nucleic Acid-Based Genotyping of Aggressive Lymphoma: Comprehensive Analysis of Gene Fusions and Nucleotide Variants by Next-Generation Sequencing. Cancers (Basel) 2021; 13:cancers13123032. [PMID: 34204385 PMCID: PMC8235203 DOI: 10.3390/cancers13123032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/02/2021] [Revised: 06/11/2021] [Accepted: 06/15/2021] [Indexed: 12/18/2022] Open
Abstract
Simple Summary: This study aimed to simultaneously demonstrate pathogenic chromosomal translocations and point mutations from both tissue biopsy and peripheral blood (PB) liquid biopsy (LB) samples of aggressive lymphoma patients. Matched samples were analyzed by next-generation sequencing for the same 125 genes. Eight different gene fusions, including the classical BCL2, BCL6, and MYC genes, were detected in the corresponding samples with generally good agreement. In addition, mutations of 29 commonly affected genes, such as BCL2, MYD88, NOTCH2, EZH2, and CD79B, could be identified in the matched samples at a rate of 16/24 (66.7%). Our prospective study demonstrates a non-invasive approach to identify frequent gene fusions and variants in aggressive lymphomas. In conclusion, PB LB sampling substantially supports the oncogenetic diagnostics of lymphomas, especially at anatomically critical sites (such as the central nervous system).
Abstract: Chromosomal translocations and pathogenic nucleotide variants have both gained special clinical importance in lymphoma diagnostics. Non-invasive genotyping from peripheral blood (PB) circulating free nucleic acid has been effectively used to demonstrate cancer-related nucleotide variants, while gene fusions were not covered in the past. Our prospective study aimed to isolate and quantify PB cell-free total nucleic acid (cfTNA) from patients diagnosed with aggressive lymphoma and to compare it with tumor-derived RNA (tdRNA) from the tissue sample of the same patients for both gene fusion and nucleotide variant testing. Matched samples from 24 patients were analyzed by next-generation sequencing following anchored multiplexed polymerase chain reaction (AMP) for 125 gene regions. Eight different gene fusions, including the classical BCL2, BCL6, and MYC genes, were detected in the corresponding tissue biopsy and cfTNA specimens with generally good agreement. Synchronous BCL2 and MYC translocations in double-hit high-grade B-cell lymphomas were obvious from cfTNA. In addition, mutations of 29 commonly affected genes, such as BCL2, MYD88, NOTCH2, EZH2, and CD79B, could be identified in matched cfTNA, and previously described pathogenic variants were detected in 16/24 cases (66.7%). In 3/24 cases (12.5%), only the PB sample was informative. Our prospective study demonstrates a non-invasive approach to identify frequent gene fusions and variants in aggressive lymphomas. cfTNA was found to be a high-value representative reflecting the complexity of the lymphoma aberration landscape.
|
26
|
Jones DC, Ruzzo WL. Polee: RNA-Seq analysis using approximate likelihood. NAR Genom Bioinform 2021; 3:lqab046. [PMID: 34056596 PMCID: PMC8152449 DOI: 10.1093/nargab/lqab046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/10/2021] [Revised: 04/11/2021] [Accepted: 05/11/2021] [Indexed: 12/20/2022] Open
Abstract
The analysis of mRNA transcript abundance with RNA-Seq is a central tool in molecular biology research, but often analyses fail to account for the uncertainty in these estimates, which can be significant, especially when trying to disentangle isoforms or duplicated genes. Preserving uncertainty necessitates a full probabilistic model of all the sequencing reads, which quickly becomes intractable, as experiments can consist of billions of reads. To overcome these limitations, we propose a new method of approximating the likelihood function of a sparse mixture model, using a technique we call the Pólya tree transformation. We demonstrate that substituting this approximation for the real thing achieves most of the benefits with a fraction of the computational costs, leading to more accurate detection of differential transcript expression and transcript coexpression.
Affiliation(s)
- Daniel C Jones
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350, USA
| | - Walter L Ruzzo
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350, USA
- Department of Genome Sciences, University of Washington, Box 355065, Seattle, WA 98195-5065, USA
- Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., P.O. Box 19024, Seattle, WA 98109, USA
| |
|
27
|
Apostolides M, Jiang Y, Husić M, Siddaway R, Hawkins C, Turinsky AL, Brudno M, Ramani AK. MetaFusion: A high-confidence metacaller for filtering and prioritizing RNA-seq gene fusion candidates. Bioinformatics 2021; 37:3144-3151. [PMID: 33944895 DOI: 10.1093/bioinformatics/btab249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2020] [Revised: 03/04/2021] [Accepted: 05/03/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Current fusion detection tools use diverse calling approaches and provide varying results, making selection of the appropriate tool challenging. Ensemble fusion calling techniques appear promising; however, current options have limited accessibility and function. RESULTS MetaFusion is a flexible meta-calling tool that amalgamates outputs from any number of fusion callers. Individual caller results are standardized by conversion into the new file type Common Fusion Format (CFF). Calls are annotated, merged using graph clustering, filtered, and ranked to provide a final output of high confidence candidates. MetaFusion consistently achieves higher precision and recall than individual callers on real and simulated datasets, and reaches up to 100% precision, indicating that ensemble calling is imperative for high confidence results. MetaFusion uses FusionAnnotator to annotate calls with information from cancer fusion databases, and is provided with a benchmarking toolkit to calibrate new callers. AVAILABILITY MetaFusion is freely available at https://github.com/ccmbioinfo/MetaFusion. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
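The merge, filter, and rank steps described in this abstract can be pictured with a small Python sketch. This is not MetaFusion's code and does not use its Common Fusion Format: the record fields (`caller`, `gene5`, `gene3`, `chr5`, `pos5`, `chr3`, `pos3`), the 100 bp breakpoint window, and the two-caller threshold are assumptions chosen for illustration, and the graph clustering is reduced to connected components computed with a union-find structure.

```python
from collections import defaultdict


def merge_calls(calls, window=100, min_callers=2):
    """Cluster calls that describe the same event, keep multi-caller clusters."""
    parent = list(range(len(calls)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def same_event(a, b):
        # Edge rule (assumed): identical gene pair, or both breakpoints close.
        if (a["gene5"], a["gene3"]) == (b["gene5"], b["gene3"]):
            return True
        return (a["chr5"] == b["chr5"] and a["chr3"] == b["chr3"]
                and abs(a["pos5"] - b["pos5"]) <= window
                and abs(a["pos3"] - b["pos3"]) <= window)

    for i in range(len(calls)):
        for j in range(i + 1, len(calls)):
            if same_event(calls[i], calls[j]):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, call in enumerate(calls):
        clusters[find(i)].append(call)

    merged = []
    for members in clusters.values():
        callers = sorted({m["caller"] for m in members})
        if len(callers) >= min_callers:  # filter: require agreement between callers
            merged.append({"genes": (members[0]["gene5"], members[0]["gene3"]),
                           "callers": callers})
    # Rank: clusters supported by more callers come first.
    merged.sort(key=lambda m: (-len(m["callers"]), m["genes"]))
    return merged


# Hypothetical calls with toy coordinates; gene names are placeholders.
calls = [
    {"caller": "caller_1", "gene5": "GENE_A", "gene3": "GENE_B",
     "chr5": "chr1", "pos5": 1000, "chr3": "chr2", "pos3": 5000},
    {"caller": "caller_2", "gene5": "GENE_A", "gene3": "GENE_B",
     "chr5": "chr1", "pos5": 1010, "chr3": "chr2", "pos3": 5020},
    {"caller": "caller_3", "gene5": "GENE_A", "gene3": "GENE_B_AS",
     "chr5": "chr1", "pos5": 1005, "chr3": "chr2", "pos3": 4990},
    {"caller": "caller_1", "gene5": "GENE_C", "gene3": "GENE_D",
     "chr5": "chr3", "pos5": 700, "chr3": "chr4", "pos3": 900},
]

merged = merge_calls(calls)
print(merged)
```

The third call is annotated with a different 3' gene name but lands within the breakpoint window, so it joins the same cluster; the single-caller GENE_C/GENE_D call is filtered out. Requiring agreement between callers is one plausible way an ensemble can trade some recall for precision.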
Affiliation(s)
- Michael Apostolides
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
| | - Yue Jiang
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
| | - Mia Husić
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
| | - Robert Siddaway
- The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada
| | - Cynthia Hawkins
- The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada; Division of Pathology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Andrei L Turinsky
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
| | - Michael Brudno
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; University Health Network, Toronto, ON, Canada
| | - Arun K Ramani
- Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
| |
|
28
|
Liu Z, Chen X, Roberts R, Huang R, Mikailov M, Tong W. Unraveling Gene Fusions for Drug Repositioning in High-Risk Neuroblastoma. Front Pharmacol 2021; 12:608778. [PMID: 33967751 PMCID: PMC8105087 DOI: 10.3389/fphar.2021.608778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2020] [Accepted: 03/23/2021] [Indexed: 11/13/2022] Open
Abstract
High-risk neuroblastoma (NB) remains a significant therapeutic challenge facing current pediatric oncology patients. Structural variants such as gene fusions have shown initial promise in enhancing mechanistic understanding of NB and improving survival rates. In this study, we performed a comprehensive in silico investigation on the translational ability of gene fusions for patient stratification and treatment development for high-risk NB patients. Specifically, three state-of-the-art gene fusion detection algorithms, including ChimeraScan, SOAPfuse, and TopHat-Fusion, were employed to identify the fusion transcripts in an RNA-seq data set of 498 neuroblastoma patients. Then, the 176 high-risk patients were further stratified into four different subgroups based on gene fusion profiles. Furthermore, Kaplan-Meier survival analysis was performed, and differentially expressed genes (DEGs) for the redefined high-risk group were extracted and functionally analyzed. Finally, repositioning candidates were enriched in each patient subgroup with drug transcriptomic profiles from the LINCS L1000 Connectivity Map. We found that the number of identified gene fusions increased from the clinical low-risk stage to the high-risk stage. Although the technical concordance of fusion detection algorithms was suboptimal, they have a similar biological relevance concerning perturbed pathways and regulated DEGs. The gene fusion profiles could be utilized to redefine high-risk patient subgroups with significant onset age of NB, which yielded the improved survival curves (Log-rank p value ≤ 0.05). Out of 48 enriched repositioning candidates, 45 (93.8%) have antitumor potency, and 24 (50%) were confirmed with either on-going clinical trials or literature reports. The gene fusion profiles have a discrimination power for redefining patient subgroups in high-risk NB and facilitate precision medicine-based drug repositioning implementation.
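One common way to quantify the technical concordance between fusion callers mentioned in this abstract is the pairwise Jaccard index of their call sets. The Python sketch below is an assumption about how such a comparison could be set up, not the authors' procedure; the tool names come from the abstract, while the fusion labels F1 to F5 are placeholders rather than real results.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared calls divided by all calls made by either tool."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(calls_by_tool):
    """Jaccard index for every unordered pair of tools."""
    return {(t1, t2): jaccard(calls_by_tool[t1], calls_by_tool[t2])
            for t1, t2 in combinations(sorted(calls_by_tool), 2)}


# Placeholder call sets for illustration only.
calls_by_tool = {
    "ChimeraScan": {"F1", "F2", "F3"},
    "SOAPfuse": {"F2", "F3", "F4"},
    "TopHat-Fusion": {"F3", "F5"},
}

concordance = pairwise_concordance(calls_by_tool)
consensus = set.intersection(*calls_by_tool.values())  # calls made by every tool

for pair, score in concordance.items():
    print(pair, round(score, 2))
print("consensus:", sorted(consensus))
```

A low Jaccard index between tools, as in this toy example, is compatible with the tools still pointing at similar pathways, because different fusion calls can involve genes in the same biological processes.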
Affiliation(s)
- Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Xi Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| | - Ruth Roberts
- ApconiX, BioHub at Alderley Park, Alderley Edge, United Kingdom; University of Birmingham, Edgbaston, Birmingham, United Kingdom
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, United States
| | - Mike Mikailov
- Office of Science and Engineering Labs, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, United States
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States
| |
|
29
|
Morton LM, Karyadi DM, Stewart C, Bogdanova TI, Dawson ET, Steinberg MK, Dai J, Hartley SW, Schonfeld SJ, Sampson JN, Maruvka YE, Kapoor V, Ramsden DA, Carvajal-Garcia J, Perou CM, Parker JS, Krznaric M, Yeager M, Boland JF, Hutchinson A, Hicks BD, Dagnall CL, Gastier-Foster JM, Bowen J, Lee O, Machiela MJ, Cahoon EK, Brenner AV, Mabuchi K, Drozdovitch V, Masiuk S, Chepurny M, Zurnadzhy LY, Hatch M, Berrington de Gonzalez A, Thomas GA, Tronko MD, Getz G, Chanock SJ. Radiation-related genomic profile of papillary thyroid carcinoma after the Chernobyl accident. Science 2021; 372:science.abg2538. [PMID: 33888599 DOI: 10.1126/science.abg2538] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 12/22/2020] [Accepted: 03/25/2021] [Indexed: 12/13/2022]
Abstract
The 1986 Chernobyl nuclear power plant accident increased papillary thyroid carcinoma (PTC) incidence in surrounding regions, particularly for radioactive iodine (131I)-exposed children. We analyzed genomic, transcriptomic, and epigenomic characteristics of 440 PTCs from Ukraine (from 359 individuals with estimated childhood 131I exposure and 81 unexposed children born after 1986). PTCs displayed radiation dose-dependent enrichment of fusion drivers, nearly all in the mitogen-activated protein kinase pathway, and increases in small deletions and simple/balanced structural variants that were clonal and bore hallmarks of nonhomologous end-joining repair. Radiation-related genomic alterations were more pronounced for individuals who were younger at exposure. Transcriptomic and epigenomic features were strongly associated with driver events but not radiation dose. Our results point to DNA double-strand breaks as early carcinogenic events that subsequently enable PTC growth after environmental radiation exposure.
Affiliation(s)
- Lindsay M Morton
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA.
| | - Danielle M Karyadi
- Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Chip Stewart
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Tetiana I Bogdanova
- Laboratory of Morphology of the Endocrine System, V. P. Komisarenko Institute of Endocrinology and Metabolism of the National Academy of Medical Sciences of Ukraine, Kyiv 04114, Ukraine
| | - Eric T Dawson
- Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA; Nvidia Corporation, Santa Clara, CA 95051, USA
| | - Mia K Steinberg
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Jieqiong Dai
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Stephen W Hartley
- Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Sara J Schonfeld
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Joshua N Sampson
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Yosef E Maruvka
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Vidushi Kapoor
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Dale A Ramsden
- Department of Biochemistry and Biophysics, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Juan Carvajal-Garcia
- Curriculum in Genetics and Molecular Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Charles M Perou
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Joel S Parker
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Marko Krznaric
- Department of Surgery and Cancer, Imperial College London, Charing Cross Hospital, London W6 8RF, UK
| | - Meredith Yeager
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Joseph F Boland
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Amy Hutchinson
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Belynda D Hicks
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Casey L Dagnall
- Cancer Genomics Research Laboratory, Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Bethesda, MD 20892, USA
| | - Julie M Gastier-Foster
- Nationwide Children's Hospital, Biospecimen Core Resource, Columbus, OH 43205, USA; Departments of Pathology and Pediatrics, Ohio State University College of Medicine, Columbus, OH 43210, USA
| | - Jay Bowen
- Nationwide Children's Hospital, Biospecimen Core Resource, Columbus, OH 43205, USA
| | - Olivia Lee
- Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Mitchell J Machiela
- Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Elizabeth K Cahoon
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Alina V Brenner
- Radiation Effects Research Foundation, Hiroshima 732-0815, Japan
| | - Kiyohiko Mabuchi
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Vladimir Drozdovitch
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Sergii Masiuk
- Radiological Protection Laboratory, Institute of Radiation Hygiene and Epidemiology, National Research Center for Radiation Medicine of the National Academy of Medical Sciences of Ukraine, Kyiv 04050, Ukraine
| | - Mykola Chepurny
- Radiological Protection Laboratory, Institute of Radiation Hygiene and Epidemiology, National Research Center for Radiation Medicine of the National Academy of Medical Sciences of Ukraine, Kyiv 04050, Ukraine
| | - Liudmyla Yu Zurnadzhy
- Laboratory of Morphology of the Endocrine System, V. P. Komisarenko Institute of Endocrinology and Metabolism of the National Academy of Medical Sciences of Ukraine, Kyiv 04114, Ukraine
| | - Maureen Hatch
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Amy Berrington de Gonzalez
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Gerry A Thomas
- Department of Surgery and Cancer, Imperial College London, Charing Cross Hospital, London W6 8RF, UK
| | - Mykola D Tronko
- Department of Fundamental and Applied Problems of Endocrinology, V. P. Komisarenko Institute of Endocrinology and Metabolism of the National Academy of Medical Sciences of Ukraine, Kyiv 04114, Ukraine
| | - Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Cancer Research and Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA
| | - Stephen J Chanock
- Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA.
| |
|
30
|
Jilani M, Haspel N. Computational Methods for Detecting Large-Scale Structural Rearrangements in Chromosomes. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
31
|
Taniue K, Akimitsu N. Fusion Genes and RNAs in Cancer Development. Noncoding RNA 2021; 7:10. [PMID: 33557176 PMCID: PMC7931065 DOI: 10.3390/ncrna7010010] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/31/2020] [Revised: 02/02/2021] [Accepted: 02/03/2021] [Indexed: 02/07/2023] Open
Abstract
Fusion RNAs are a hallmark of some cancers. They result either from chromosomal rearrangements or from splicing mechanisms that are non-chromosomal rearrangements. Chromosomal rearrangements that result in gene fusions are particularly prevalent in sarcomas and hematopoietic malignancies; they are also common in solid tumors. The splicing process can also give rise to more complex RNA patterns in cells. Gene fusions frequently affect tyrosine kinases, chromatin regulators, or transcription factors, and can cause constitutive activation, enhancement of downstream signaling, and tumor development, as major drivers of oncogenesis. In addition, some fusion RNAs have been shown to function as noncoding RNAs and to affect cancer progression. Fusion genes and RNAs will therefore become increasingly important as diagnostic and therapeutic targets for cancer development. Here, we discuss the function, biogenesis, detection, clinical relevance, and therapeutic implications of oncogenic fusion genes and RNAs in cancer development. Further understanding the molecular mechanisms that regulate how fusion RNAs form in cancers is critical to the development of therapeutic strategies against tumorigenesis.
Affiliation(s)
- Kenzui Taniue
  - Isotope Science Center, The University of Tokyo, 2-11-16, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
  - Cancer Genomics and Precision Medicine, Division of Gastroenterology and Hematology/Oncology, Department of Medicine, Asahikawa Medical University, 2-1 Midorigaoka Higashi, Asahikawa, Hokkaido 078-8510, Japan
- Nobuyoshi Akimitsu
  - Isotope Science Center, The University of Tokyo, 2-11-16, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
|
32
|
Liu Q, Hu Y, Stucky A, Fang L, Zhong JF, Wang K. LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing. BMC Genomics 2020; 21:793. [PMID: 33372596 PMCID: PMC7771079 DOI: 10.1186/s12864-020-07207-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/21/2020] [Accepted: 10/29/2020] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Long-read RNA-Seq techniques can generate reads that encompass a large proportion or the entire mRNA/cDNA molecules, so they are expected to address inherited limitations of short-read RNA-Seq techniques that typically generate < 150 bp reads. However, there is a general lack of software tools for gene fusion detection from long-read RNA-seq data, which takes into account the high basecalling error rates and the presence of alignment errors. RESULTS: In this study, we developed a fast computational tool, LongGF, to efficiently detect candidate gene fusions from long-read RNA-seq data, including cDNA sequencing data and direct mRNA sequencing data. We evaluated LongGF on tens of simulated long-read RNA-seq datasets, and demonstrated its superior performance in gene fusion detection. We also tested LongGF on a Nanopore direct mRNA sequencing dataset and a PacBio sequencing dataset generated on a mixture of 10 cancer cell lines, and found that LongGF achieved better performance to detect known gene fusions over existing computational tools. Furthermore, we tested LongGF on a Nanopore cDNA sequencing dataset on acute myeloid leukemia, and pinpointed the exact location of a translocation (previously known in cytogenetic resolution) in base resolution, which was further validated by Sanger sequencing. CONCLUSIONS: In summary, LongGF will greatly facilitate the discovery of candidate gene fusion events from long-read RNA-Seq data, especially in cancer samples. LongGF is implemented in C++ and is available at https://github.com/WGLab/LongGF .
Affiliation(s)
- Qian Liu
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Yu Hu
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Andres Stucky
  - Department of Otolaryngology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Li Fang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
- Jiang F Zhong
  - Department of Otolaryngology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
|
33
|
Gaonkar KS, Marini F, Rathi KS, Jain P, Zhu Y, Chimicles NA, Brown MA, Naqvi AS, Zhang B, Storm PB, Maris JM, Raman P, Resnick AC, Strauch K, Taroni JN, Rokita JL. annoFuse: an R Package to annotate, prioritize, and interactively explore putative oncogenic RNA fusions. BMC Bioinformatics 2020; 21:577. [PMID: 33317447 PMCID: PMC7737294 DOI: 10.1186/s12859-020-03922-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/14/2020] [Accepted: 12/03/2020] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Gene fusion events are significant sources of somatic variation across adult and pediatric cancers and are some of the most clinically-effective therapeutic targets, yet low consensus of RNA-Seq fusion prediction algorithms makes therapeutic prioritization difficult. In addition, events such as polymerase read-throughs, mis-mapping due to gene homology, and fusions occurring in healthy normal tissue require informed filtering, making it difficult for researchers and clinicians to rapidly discern gene fusions that might be true underlying oncogenic drivers of a tumor and in some cases, appropriate targets for therapy. RESULTS: We developed annoFuse, an R package, and shinyFuse, a companion web application, to annotate, prioritize, and explore biologically-relevant expressed gene fusions, downstream of fusion calling. We validated annoFuse using a random cohort of TCGA RNA-Seq samples (N = 160) and achieved a 96% sensitivity for retention of high-confidence fusions (N = 603). annoFuse uses FusionAnnotator annotations to filter non-oncogenic and/or artifactual fusions. Then, fusions are prioritized if previously reported in TCGA and/or fusions containing gene partners that are known oncogenes, tumor suppressor genes, COSMIC genes, and/or transcription factors. We applied annoFuse to fusion calls from pediatric brain tumor RNA-Seq samples (N = 1028) provided as part of the Open Pediatric Brain Tumor Atlas (OpenPBTA) Project to determine recurrent fusions and recurrently-fused genes within different brain tumor histologies. annoFuse annotates protein domains using the PFAM database, assesses reciprocality, and annotates gene partners for kinase domain retention. As a standard function, reportFuse enables generation of a reproducible R Markdown report to summarize filtered fusions, visualize breakpoints and protein domains by transcript, and plot recurrent fusions within cohorts. Finally, we created shinyFuse for algorithm-agnostic interactive exploration and plotting of gene fusions. CONCLUSIONS: annoFuse provides standardized filtering and annotation for gene fusion calls from STAR-Fusion and Arriba by merging, filtering, and prioritizing putative oncogenic fusions across large cancer datasets, as demonstrated here with data from the OpenPBTA project. We are expanding the package to be widely-applicable to other fusion algorithms and expect annoFuse to provide researchers a method for rapidly evaluating, prioritizing, and translating fusion findings in patient tumors.
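The filter-then-prioritize idea described in this abstract can be illustrated with a minimal Python sketch. It is not annoFuse's R API: the record fields (`gene5`, `gene3`, `annotations`), the artifact tags, the priority gene set, and the example calls are all assumptions made up for illustration.

```python
# Hypothetical illustration only: field names, tags, and gene lists are
# assumptions, not annoFuse's actual interface or reference lists.
ARTIFACT_TAGS = {"readthrough", "normal_tissue"}  # assumed annotation labels
PRIORITY_GENES = {"ALK", "BRAF", "NTRK2"}         # stand-in for curated gene lists


def filter_and_prioritize(calls):
    """Drop calls tagged as likely artifacts, then list priority fusions first."""
    kept = [c for c in calls if not (set(c["annotations"]) & ARTIFACT_TAGS)]

    def is_priority(call):
        return call["gene5"] in PRIORITY_GENES or call["gene3"] in PRIORITY_GENES

    # sorted() is stable, so non-priority calls keep their original order.
    return sorted(kept, key=lambda c: not is_priority(c))


# Toy input, invented for this sketch.
calls = [
    {"gene5": "GENE_A", "gene3": "GENE_B", "annotations": []},
    {"gene5": "KIAA1549", "gene3": "BRAF", "annotations": []},
    {"gene5": "GENE_C", "gene3": "GENE_D", "annotations": ["readthrough"]},
]
ranked = filter_and_prioritize(calls)
```

In this toy run the read-through call is dropped and the call with a partner in the assumed priority set moves to the front; a real workflow would take its tags and gene lists from the annotation resources named in the abstract.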
Affiliation(s)
- Krutika S Gaonkar
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Federico Marini
  - Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - Center for Thrombosis and Hemostasis, Mainz, Germany
- Komal S Rathi
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Payal Jain
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Yuankun Zhu
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Nicholas A Chimicles
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Miguel A Brown
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Ammar S Naqvi
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Bo Zhang
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Phillip B Storm
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- John M Maris
  - Division of Oncology, Children's Hospital of Philadelphia and Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Pichai Raman
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Adam C Resnick
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Konstantin Strauch
  - Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Jaclyn N Taroni
  - Alex's Lemonade Stand Foundation Childhood Cancer Data Lab, Philadelphia, PA, USA
- Jo Lynne Rokita
  - Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA, USA
|
34
|
Tong L, Wu PY, Phan JH, Hassazadeh HR, Tong W, Wang MD. Impact of RNA-seq data analysis algorithms on gene expression estimation and downstream prediction. Sci Rep 2020; 10:17925. [PMID: 33087762 PMCID: PMC7578822 DOI: 10.1038/s41598-020-74567-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/22/2019] [Accepted: 08/27/2020] [Indexed: 11/23/2022] Open
Abstract
To use next-generation sequencing technology such as RNA-seq for medical and health applications, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the impact of the joint effects of RNA-seq pipelines on gene expression estimation as well as the downstream prediction of disease outcomes. First, we developed and applied three metrics (i.e., accuracy, precision, and reliability) to quantitatively evaluate each pipeline's performance on gene expression estimation. We then investigated the correlation between the proposed metrics and the downstream prediction performance using two real-world cancer datasets (i.e., SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly impacted the accuracy of gene expression estimation, and its impact was extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimation tended to perform better in the prediction of disease outcome. In the end, we provided scenarios as guidelines for users to use these three metrics to select sensible RNA-seq pipelines for the improved accuracy, precision, and reliability of gene expression estimation, which lead to the improved downstream gene expression-based prediction of disease outcome.
Affiliation(s)
- Li Tong
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Po-Yen Wu
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- John H Phan
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Hamid R Hassazadeh
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Weida Tong
  - National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- May D Wang
  - Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
|
35
|
Chiu R, Nip KM, Birol I. Fusion-Bloom: fusion detection in assembled transcriptomes. Bioinformatics 2020; 36:2256-2257. [PMID: 31790154 PMCID: PMC7141844 DOI: 10.1093/bioinformatics/btz902] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/26/2019] [Revised: 11/13/2019] [Accepted: 11/27/2019] [Indexed: 11/13/2022] Open
Abstract
Summary: Presence or absence of gene fusions is one of the most important diagnostic markers in many cancer types. Consequently, fusion detection methods using various genomics data types, such as RNA sequencing (RNA-seq) are valuable tools for research and clinical applications. While information-rich RNA-seq data have proven to be instrumental in discovery of a number of hallmark fusion events, bioinformatics tools to detect fusions still have room for improvement. Here, we present Fusion-Bloom, a fusion detection method that leverages recent developments in de novo transcriptome assembly and assembly-based structural variant calling technologies (RNA-Bloom and PAVFinder, respectively). We benchmarked Fusion-Bloom against the performance of five other state-of-the-art fusion detection tools using multiple datasets. Overall, we observed Fusion-Bloom to display a good balance between detection sensitivity and specificity. We expect the tool to find applications in translational research and clinical genomics pipelines. Availability and implementation: Fusion-Bloom is implemented as a UNIX Make utility, available at https://github.com/bcgsc/pavfinder and released under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Readman Chiu
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
- Ka Ming Nip
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6H 3N1, Canada
- Inanc Birol
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 3N1, Canada
|
36
|
Friedrich S, Sonnhammer ELL. Fusion transcript detection using spatial transcriptomics. BMC Med Genomics 2020; 13:110. [PMID: 32753032 PMCID: PMC7437936 DOI: 10.1186/s12920-020-00738-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/22/2019] [Accepted: 06/11/2020] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: Fusion transcripts are involved in tumourigenesis and play a crucial role in tumour heterogeneity, tumour evolution and cancer treatment resistance. However, fusion transcripts have not been studied at high spatial resolution in tissue sections due to the lack of full-length transcripts with spatial information. New high-throughput technologies like spatial transcriptomics measure the transcriptome of tissue sections on almost single-cell level. While this technique does not allow for direct detection of fusion transcripts, we show that they can be inferred using the relative poly(A) tail abundance of the involved parental genes. METHOD: We present a new method STfusion, which uses spatial transcriptomics to infer the presence and absence of poly(A) tails. A fusion transcript lacks a poly(A) tail for the 5' gene and has an elevated number of poly(A) tails for the 3' gene. Its expression level is defined by the upstream promoter of the 5' gene. STfusion measures the difference between the observed and expected number of poly(A) tails with a novel C-score. RESULTS: We verified the STfusion ability to predict fusion transcripts on HeLa cells with known fusions. STfusion and C-score applied to clinical prostate cancer data revealed the spatial distribution of the cis-SAGe SLC45A3-ELK4 in 12 tissue sections with almost single-cell resolution. The cis-SAGe occurred in disease areas, e.g. inflamed, prostatic intraepithelial neoplastic, or cancerous areas, and occasionally in normal glands. CONCLUSIONS: STfusion detects fusion transcripts in cancer cell line and clinical tissue data, and distinguishes chimeric transcripts from chimeras caused by trans-splicing events. With STfusion and the use of C-scores, fusion transcripts can be spatially localised in clinical tissue sections on almost single cell level.
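The poly(A) signature described in this abstract (a deficit for the 5' gene, a surplus for the 3' gene) can be sketched as a simple observed-minus-expected comparison. The Python below is a simplified stand-in, not the published C-score, whose exact definition the abstract does not give; the function names and the example counts are hypothetical.

```python
# Simplified stand-in, NOT the published C-score: the abstract only states
# that STfusion compares observed and expected poly(A) tail counts.
def polya_shift(observed, expected):
    """Observed minus expected poly(A) tail count for one gene at one spot."""
    return observed - expected


def fusion_signature(obs5, exp5, obs3, exp3):
    """True if the 5' gene shows a poly(A) deficit and the 3' gene a surplus."""
    return polya_shift(obs5, exp5) < 0 and polya_shift(obs3, exp3) > 0


# Invented counts: the 5' gene has fewer tails than expected, the 3' gene more.
spot_with_signature = fusion_signature(obs5=2, exp5=10, obs3=25, exp3=10)
spot_without_signature = fusion_signature(obs5=10, exp5=10, obs3=10, exp3=10)
```

How the expected counts are estimated and how the difference is scaled are choices made by the method itself and are not reproduced here.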
Affiliation(s)
- Stefanie Friedrich
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121, Solna, Sweden
- Erik L L Sonnhammer
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121, Solna, Sweden
|
37
|
Yang X, Saito Y, Rao A, Kim HJ, Singh P, Scott E, Larson M, Pan W, Desai M, Hubbell E. Alignment-free filtering for cfNA fusion fragments. Bioinformatics 2020; 35:i225-i232. [PMID: 31510681 PMCID: PMC6612805 DOI: 10.1093/bioinformatics/btz346] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022] Open
Abstract
Motivation: Cell-free nucleic acid (cfNA) sequencing data require improvements to existing fusion detection methods along multiple axes: high depth of sequencing, low allele fractions, short fragment lengths and specialized barcodes, such as unique molecular identifiers. Results: AF4 was developed to address these challenges. It uses a novel alignment-free kmer-based method to detect candidate fusion fragments with high sensitivity and orders of magnitude faster than existing tools. Candidate fragments are then filtered using a max-cover criterion that significantly reduces spurious matches while retaining authentic fusion fragments. This efficient first stage reduces the data sufficiently that commonly used criteria can process the remaining information, or sophisticated filtering policies that may not scale to the raw reads can be used. AF4 provides both targeted and de novo fusion detection modes. We demonstrate both modes in benchmark simulated and real RNA-seq data as well as clinical and cell-line cfNA data. Availability and implementation: AF4 is open sourced, licensed under Apache License 2.0, and is available at: https://github.com/grailbio/bio/tree/master/fusion.
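As a rough illustration of alignment-free, k-mer-based candidate detection in general, the Python sketch below indexes the k-mers of two toy gene sequences and flags a read whose k-mers are drawn from both. It is not AF4's implementation and does not reproduce its max-cover filter; the function names, sequences, and the `min_hits` threshold are assumptions.

```python
from collections import Counter


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(genes, k):
    """Map each k-mer to the set of gene names that contain it."""
    index = {}
    for name, seq in genes.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index


def candidate_fusion(read, index, k, min_hits=2):
    """Return a sorted gene pair if exactly two genes each contribute at
    least min_hits gene-specific k-mers to the read, else None."""
    hits = Counter()
    for km in kmers(read, k):
        owners = index.get(km, set())
        if len(owners) == 1:  # skip k-mers shared between genes
            hits[next(iter(owners))] += 1
    supported = sorted(g for g, n in hits.items() if n >= min_hits)
    return tuple(supported) if len(supported) == 2 else None


# Toy sequences, invented for this sketch.
genes = {"GENE_X": "AACCGGTTACGT", "GENE_Y": "TGCATGGATCCA"}
index = build_index(genes, k=4)
fused = candidate_fusion("AACCGGTGGATCCA", index, k=4)  # X prefix + Y suffix
plain = candidate_fusion("AACCGGTTACG", index, k=4)     # GENE_X only
```

No alignment is performed at any step: membership tests against the k-mer index are the only lookups, which is the property that makes this family of methods fast on deep sequencing data.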
|
38
|
Cai Z, Xue H, Xu Y, Köhler J, Cheng X, Dai Y, Zheng J, Wang H. Fcirc: A comprehensive pipeline for the exploration of fusion linear and circular RNAs. Gigascience 2020; 9:5848590. [PMID: 32470133 PMCID: PMC7259471 DOI: 10.1093/gigascience/giaa054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/06/2019] [Revised: 03/01/2020] [Accepted: 04/29/2020] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND: In cancer cells, fusion genes can produce linear and chimeric fusion-circular RNAs (f-circRNAs), which are functional in gene expression regulation and implicated in malignant transformation, cancer progression, and therapeutic resistance. For specific cancers, proteins encoded by fusion transcripts have been identified as innovative therapeutic targets (e.g., EML4-ALK). Even though RNA sequencing (RNA-Seq) technologies combined with existing bioinformatics approaches have enabled researchers to systematically identify fusion transcripts, specifically detecting f-circRNAs in cells remains challenging owing to their general sparsity and low abundance in cancer cells but also owing to imperfect computational methods. RESULTS: We developed the Python-based workflow "Fcirc" to identify fusion linear and f-circRNAs from RNA-Seq data with high specificity. We applied Fcirc to 3 different types of RNA-Seq data scenarios: (i) actual synthetic spike-in RNA-Seq data, (ii) simulated RNA-Seq data, and (iii) actual cancer cell-derived RNA-Seq data. Fcirc showed significant advantages over existing methods regarding both detection accuracy (i.e., precision, recall, F-measure) and computing performance (i.e., lower runtimes). CONCLUSION: Fcirc is a powerful and comprehensive Python-based pipeline to identify linear and circular RNA transcripts from known fusion events in RNA-Seq datasets with higher accuracy and shorter computing times compared with previously published algorithms. Fcirc empowers the research community to study the biology of fusion RNAs in cancer more effectively.
Affiliation(s)
- Zhaoqing Cai
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Hongzhang Xue
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Yue Xu
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Jens Köhler
  - Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
- Xiaojie Cheng
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Yao Dai
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Jie Zheng
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Haiyun Wang
  - School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China
|
39
|
Jang YE, Jang I, Kim S, Cho S, Kim D, Kim K, Kim J, Hwang J, Kim S, Kim J, Kang J, Lee B, Lee S. ChimerDB 4.0: an updated and expanded database of fusion genes. Nucleic Acids Res 2020; 48:D817-D824. [PMID: 31680157 PMCID: PMC7145594 DOI: 10.1093/nar/gkz1013] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 09/15/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/14/2022] Open
Abstract
Fusion genes represent an important class of biomarkers and therapeutic targets in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data (ChimerSeq) and text mining of publications (ChimerPub) with extensive manual annotations (ChimerKB). In this update, we present all three modules substantially enhanced by incorporating the recent flood of deep sequencing data and related publications. ChimerSeq now covers all 10 565 patients in the TCGA project, with compilation of computational results from two reliable programs of STAR-Fusion and FusionScan with several public resources. In sum, ChimerSeq includes 65 945 fusion candidates, 21 106 of which were predicted by multiple programs (ChimerSeq-Plus). ChimerPub has been upgraded by applying a deep learning method for text mining followed by extensive manual curation, which yielded 1257 fusion genes including 777 cases with experimental supports (ChimerPub-Plus). ChimerKB includes 1597 fusion genes with publication support, experimental evidences and breakpoint information. Importantly, we implemented several new features to aid estimation of functional significance, including the fusion structure viewer with domain information, gene expression plot of fusion positive versus negative patients and a STRING network viewer. The user interface also was greatly enhanced by applying responsive web design. ChimerDB 4.0 is available at http://www.kobic.re.kr/chimerdb/.
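The distinction drawn in this abstract between all fusion candidates and the subset predicted by multiple programs amounts to a consensus count across callers. A minimal Python sketch follows, assuming each program's output has already been reduced to (5' gene, 3' gene) pairs; the function name and the toy call lists are hypothetical and do not reflect ChimerDB's actual pipeline or data.

```python
from collections import Counter


def consensus_subset(calls_by_program, min_programs=2):
    """Return (all candidates, candidates reported by >= min_programs programs)."""
    counts = Counter()
    for fusions in calls_by_program.values():
        counts.update(set(fusions))  # count each program at most once per fusion
    all_candidates = set(counts)
    supported = {f for f, n in counts.items() if n >= min_programs}
    return all_candidates, supported


# Toy call lists, invented for this sketch.
calls_by_program = {
    "STAR-Fusion": [("TMPRSS2", "ERG"), ("EML4", "ALK"), ("GENE_A", "GENE_B")],
    "FusionScan": [("TMPRSS2", "ERG"), ("EML4", "ALK"), ("GENE_C", "GENE_D")],
}
all_candidates, multi_program = consensus_subset(calls_by_program)
```

Treating the gene pair as ordered keeps A-B and B-A distinct; whether reciprocal fusions should be merged is a design decision this sketch leaves open.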
Affiliation(s)
- Ye Eun Jang
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Insu Jang
  - Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Sunkyu Kim
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Subin Cho
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Daehan Kim
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Keonwoo Kim
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Jaewon Kim
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jimin Hwang
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Sangok Kim
  - Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jaesang Kim
  - Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jaewoo Kang
  - Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Byungwook Lee
  - Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Sanghyuk Lee
  - Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
  - Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
|
40
|
Liu C, Zhang Y, Li X, Jia Y, Li F, Li J, Zhang Z. Evidence of constraint in the 3D genome for trans-splicing in human cells. Sci China Life Sci 2020; 63:1380-1393. [PMID: 32221814 DOI: 10.1007/s11427-019-1609-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/25/2019] [Accepted: 12/04/2019] [Indexed: 10/24/2022]
Abstract
Fusion transcripts are commonly found in eukaryotes, and many aberrant fusions are associated with severe diseases, including cancer. One class of fusion transcripts is generated by joining separate transcripts through trans-splicing. However, the mechanism of trans-splicing in mammals remains largely elusive. Here we showed evidence to support an intuitive hypothesis that attributes trans-splicing to the spatial proximity between premature transcripts. A novel trans-splicing detection tool (TSD) was developed to reliably identify intra-chromosomal trans-splicing events (iTSEs) from RNA-seq data. TSD can maintain a remarkable balance between sensitivity and accuracy, thus distinguishing it from most state-of-the-art tools. The accuracy of TSD was experimentally demonstrated by excluding potential false discovery from mosaic genome or template switching during PCR. We showed that iTSEs identified by TSD were frequently found between genomic regulatory elements, which are known to be more prone to interact with each other. Moreover, iTSE sites may be more physically adjacent to each other than random control in the tested human lymphoblastoid cell line according to Hi-C data. Our results suggest that trans-splicing and 3D genome architecture may be coupled in mammals and that our pipeline, TSD, may facilitate investigations of trans-splicing on a systematic and accurate level previously thought impossible.
Affiliation(s)
- Cong Liu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Yiqun Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiaoli Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Yan Jia
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Feifei Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Jing Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Zhihua Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
|
41
|
Singh S, Qin F, Kumar S, Elfman J, Lin E, Pham LP, Yang A, Li H. The landscape of chimeric RNAs in non-diseased tissues and cells. Nucleic Acids Res 2020; 48:1764-1778. [PMID: 31965184 PMCID: PMC7038929 DOI: 10.1093/nar/gkz1223] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 04/05/2019] [Revised: 12/13/2019] [Accepted: 01/20/2020] [Indexed: 12/17/2022] Open
Abstract
Chimeric RNAs and their encoded proteins have been traditionally viewed as unique features of neoplasia, and have been used as biomarkers and therapeutic targets for multiple cancers. Recent studies have demonstrated that chimeric RNAs also exist in non-cancerous cells and tissues, although large-scale, genome-wide studies of chimeric RNAs in non-diseased tissues have been scarce. Here, we explored the landscape of chimeric RNAs in 9495 non-diseased human tissue samples of 53 different tissues from the GTEx project. Further, we established means for classifying chimeric RNAs, and observed enrichment for particular classifications as more stringent filters are applied. We experimentally validated a subset of chimeric RNAs from each classification and demonstrated functional relevance of two chimeric RNAs in non-cancerous cells. Importantly, our list of chimeric RNAs in non-diseased tissues overlaps with some entries in several cancer fusion databases, raising concerns for some annotations. The data from this study provides a large repository of chimeric RNAs present in non-diseased tissues, which can be used as a control dataset to facilitate the identification of true cancer-specific chimeras.
Affiliation(s)
- Sandeep Singh
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Fujun Qin
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Shailesh Kumar
  - National Institute of Plant Genome Research (NIPGR), New Delhi 110067, India
- Justin Elfman
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Emily Lin
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Lam-Phong Pham
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Amy Yang
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
- Hui Li
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
  - Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA
|
42
|
Vellichirammal NN, Albahrani A, Banwait JK, Mishra NK, Li Y, Roychoudhury S, Kling MJ, Mirza S, Bhakat KK, Band V, Joshi SS, Guda C. Pan-Cancer Analysis Reveals the Diverse Landscape of Novel Sense and Antisense Fusion Transcripts. Mol Ther Nucleic Acids 2020; 19:1379-1398. [PMID: 32160708 PMCID: PMC7044684 DOI: 10.1016/j.omtn.2020.01.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/18/2019] [Revised: 01/03/2020] [Accepted: 01/14/2020] [Indexed: 01/26/2023]
Abstract
Gene fusions that contribute to oncogenicity can be explored for identifying cancer biomarkers and potential drug targets. To investigate the nature and distribution of fusion transcripts in cancer, we examined the transcriptome data of about 9,000 primary tumors from 33 different cancers in TCGA (The Cancer Genome Atlas) along with cell line data from CCLE (Cancer Cell Line Encyclopedia) using ChimeRScope, a novel fusion detection algorithm. We identified several fusions with sense (canonical, 39%) or antisense (non-canonical, 61%) transcripts recurrent across cancers. The majority of the recurrent non-canonical fusions found in our study are novel, unexplored, and exhibited highly variable profiles across cancers, with breast cancer and glioblastoma having the highest and lowest rates, respectively. Overall, 4,344 recurrent fusions were identified from TCGA in this study, of which 70% were novel. Additional analysis of 802 tumor-derived cell line transcriptome data across 20 cancers revealed significant variability in recurrent fusion profiles between primary tumors and corresponding cell lines. A subset of canonical and non-canonical fusions was validated by examining the structural variation evidence in whole-genome sequencing (WGS) data or by Sanger sequencing of fusion junctions. Several recurrent fusion genes identified in our study show promise for drug repurposing in basket trials and present opportunities for mechanistic studies.
Affiliation(s)
- Abrar Albahrani
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Jasjit K Banwait
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
  - Bioinformatics and Systems Biology Core, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Nitish K Mishra
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- You Li
  - HitGen, South Keyuan Road 88, Chengdu, China
- Shrabasti Roychoudhury
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Mathew J Kling
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Sameer Mirza
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Kishor K Bhakat
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Vimla Band
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Shantaram S Joshi
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Chittibabu Guda
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE 68198, USA
  - Bioinformatics and Systems Biology Core, University of Nebraska Medical Center, Omaha, NE 68198, USA
|
43
|
Zhao S, Hoff AM, Skotheim RI. ScaR-a tool for sensitive detection of known fusion transcripts: establishing prevalence of fusions in testicular germ cell tumors. NAR Genom Bioinform 2020; 2:lqz025. [PMID: 33575572 PMCID: PMC7671340 DOI: 10.1093/nargab/lqz025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2019] [Revised: 11/20/2019] [Accepted: 12/19/2019] [Indexed: 11/13/2022]
Abstract
Bioinformatics tools for fusion transcript detection from RNA-sequencing data are in general developed for identification of novel fusions, which demands a high number of supporting reads and strict filters to avoid false discoveries. As our knowledge of bona fide fusion genes becomes more saturated, there is a need to establish their prevalence with high sensitivity. We present ScaR, a tool that uses a supervised scaffold realignment approach for sensitive fusion detection in RNA-seq data. ScaR detects a set of 130 synthetic fusion transcripts from simulated data at a higher sensitivity compared to established fusion finders. Applied to fusion transcripts potentially involved in testicular germ cell tumors (TGCTs), ScaR detects the fusions RCC1-ABHD12B and CLEC6A-CLEC4D in 9% and 28% of 150 TGCTs, respectively. The fusions were not detected in any of 198 normal testis tissues. Thus, we demonstrate high prevalence of RCC1-ABHD12B and CLEC6A-CLEC4D in TGCTs, and their cancer specific features. Further, we find that RCC1-ABHD12B and CLEC6A-CLEC4D are predominantly expressed in the seminoma and embryonal carcinoma histological subtypes of TGCTs, respectively. In conclusion, ScaR is useful for establishing the frequency of known and validated fusion transcripts in larger data sets and detecting clinically relevant fusion transcripts with high sensitivity.
Affiliation(s)
- Sen Zhao
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, 0372 Oslo, Norway
- Andreas M Hoff
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, 0372 Oslo, Norway
- Rolf I Skotheim
  - Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, 0372 Oslo, Norway
  - Department of Informatics, Faculty of Natural Science and Mathematics, University of Oslo, 0373 Oslo, Norway
|
44
|
Oliver GR, Jenkinson G, Klee EW. Computational Detection of Known Pathogenic Gene Fusions in a Normal Tissue Database and Implications for Genetic Disease Research. Front Genet 2020; 11:173. [PMID: 32180803 PMCID: PMC7059617 DOI: 10.3389/fgene.2020.00173] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/02/2019] [Accepted: 02/13/2020] [Indexed: 11/13/2022]
Abstract
Several recent studies have demonstrated the utility of RNA-Seq in the diagnosis of rare inherited disease. Diagnostic rates 35% higher than those previously achievable with DNA-Seq alone have been attained. These studies have primarily profiled gene expression and splicing defects; however, some have also shown that fusion transcripts are diagnostic or phenotypically relevant in patients with constitutional disorders. Fusion transcripts have traditionally been studied as oncogenic phenomena, with relevance only to cancer testing. Consequently, fusion detection algorithms were biased toward the detection of well-known oncogenic fusions, hindering their application to rare Mendelian genetic disease studies. A recent methodology published by the authors successfully tailored a traditional algorithm to the detection of pathogenic fusion events in inherited disease. A key mechanism for reducing false positive or biologically benign events was comparison to a database of events detected in normal tissues. This approach is akin to population frequency-based filtering of genetic variants. It is predicated on the idea that pathogenic fusion transcripts are absent from normal tissue. We report on an analysis of RNA-Seq data from the Genotype-Tissue Expression (GTEx) project in which known pathogenic fusions are computationally detected at low levels in normal tissues unassociated with the disease phenotype. Examples include archetypal cancer fusion transcripts, as well as fusions responsible for rare inherited disease. We consider potential explanations for the detectability of such transcripts and discuss the bearing such results have on the future profiling of genetic disease patients for pathogenic gene fusions.
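The normal-tissue filtering idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(sample_id, gene5, gene3)` record layout, and the `max_freq` threshold are all assumptions made for illustration. It uses a frequency cutoff rather than a strict presence/absence test, since the abstract reports that known pathogenic fusions can be detected at low levels in normal tissue.

```python
from collections import Counter


def normal_frequencies(normal_calls, n_samples):
    """Fraction of normal samples in which each fusion pair was called.

    normal_calls: iterable of (sample_id, gene5, gene3) tuples (hypothetical layout).
    n_samples: total number of normal samples screened.
    """
    # Count each fusion at most once per sample.
    seen = {(sample, g5, g3) for sample, g5, g3 in normal_calls}
    counts = Counter((g5, g3) for _, g5, g3 in seen)
    return {pair: count / n_samples for pair, count in counts.items()}


def filter_candidates(candidates, freqs, max_freq=0.01):
    """Keep candidate fusions whose normal-tissue frequency is at most max_freq.

    max_freq is an illustrative threshold, not a value taken from the paper.
    """
    return [pair for pair in candidates if freqs.get(pair, 0.0) <= max_freq]


# Toy data with placeholder gene names.
normal_calls = [
    ("n1", "GENE_A", "GENE_B"),
    ("n1", "GENE_A", "GENE_B"),  # duplicate call in the same sample
    ("n2", "GENE_A", "GENE_B"),
    ("n3", "GENE_C", "GENE_D"),
]
freqs = normal_frequencies(normal_calls, n_samples=4)
kept = filter_candidates(
    [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D"), ("GENE_E", "GENE_F")],
    freqs,
    max_freq=0.25,
)
```

With a strict cutoff of zero, any fusion seen even once in normal tissue would be discarded, which is the failure mode the abstract warns about for known pathogenic fusions.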
Affiliation(s)
- Gavin Robert Oliver
  - Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Garrett Jenkinson
  - Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Eric W Klee
  - Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
|
45
|
Xiong Q, Huang S, Li YH, Lv N, Lv C, Ding Y, Liu WW, Wang LL, Chen Y, Sun L, Zhao Y, Liao SY, Zhang MQ, Zhu BL, Yu L. Single-cell RNA sequencing of t(8;21) acute myeloid leukemia for risk prediction. Oncol Rep 2020; 43:1278-1288. [PMID: 32323795 PMCID: PMC7057921 DOI: 10.3892/or.2020.7507] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2019] [Accepted: 01/22/2020] [Indexed: 12/12/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) of bone marrow or peripheral blood samples from patients with acute myeloid leukemia (AML) enables the characterization of heterogeneous malignant cells. A total of 87 cells from two patients with t(8;21) AML were analyzed using scRNA-seq. Clustering methods were used to separate leukemia cells into different sub-populations, and the expression patterns of specific marker genes were used to annotate these populations. Among the 31 differentially expressed genes in the cells of a patient who relapsed after hematopoietic stem cell transplantation, 13 genes were identified to be associated with leukemia. Furthermore, three genes, namely AT-rich interaction domain 2, lysine methyltransferase 2A and synaptotagmin binding cytoplasmic RNA interacting protein were validated as possible prognostic biomarkers using two bulk expression datasets. Taking advantage of scRNA-seq, the results of the present study may provide clinicians with several possible biomarkers to predict the prognostic outcomes of t(8;21) AML.
Affiliation(s)
- Qian Xiong
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Sai Huang
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Yong-Hui Li
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Na Lv
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Chao Lv
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Yi Ding
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Wen-Wen Liu
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Li-Li Wang
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
- Yang Chen
  - School of Medicine, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, P.R. China
- Liang Sun
  - School of Medicine, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, P.R. China
- Yi Zhao
  - Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R. China
- Sheng-You Liao
  - Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R. China
- Michael Q Zhang
  - School of Medicine, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, P.R. China
- Bao-Li Zhu
  - CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, P.R. China
- Li Yu
  - Department of Hematology and BMT Center, Chinese PLA General Hospital, Beijing 100853, P.R. China
|
46
|
Sorn P, Holtsträter C, Löwer M, Sahin U, Weber D. ArtiFuse-computational validation of fusion gene detection tools without relying on simulated reads. Bioinformatics 2020; 36:373-379. [PMID: 31373612 DOI: 10.1093/bioinformatics/btz613] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/02/2019] [Revised: 07/30/2019] [Accepted: 08/01/2019] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Gene fusions are an important class of transcriptional variants that can influence cancer development and can be predicted from RNA sequencing (RNA-seq) data by multiple existing tools. However, the real-world performance of these tools is unclear due to the lack of known positive and negative events, especially with regard to fusion genes in individual samples. Often simulated reads are used, but these cannot account for all technical biases in RNA-seq data generated from real samples. RESULTS: Here, we present ArtiFuse, a novel approach that simulates fusion genes by sequence modification to the genomic reference, and therefore can be applied to any RNA-seq dataset without the need for any simulated reads. We demonstrate our approach on eight RNA-seq datasets for three fusion gene prediction tools: average recall values peak for all three tools between 0.4 and 0.56 for high-quality and high-coverage datasets. As ArtiFuse affords total control over involved genes and breakpoint position, we also assessed performance with regard to gene-related properties, showing a drop in recall value for low-expressed genes in high-coverage samples and genes with co-expressed paralogues. Overall tool performance assessed from ArtiFusions is lower compared to previously reported estimates on simulated reads. Due to the use of real RNA-seq datasets, we believe that ArtiFuse provides a more realistic benchmark that can be used to develop more accurate fusion gene prediction tools for application in clinical settings. AVAILABILITY AND IMPLEMENTATION: ArtiFuse is implemented in Python. The source code and documentation are available at https://github.com/TRON-Bioinformatics/ArtiFusion. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
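The recall figures quoted in this abstract follow the usual definition: the fraction of known-positive (here, artificially introduced) fusions that a tool reports. The Python sketch below shows that calculation and its average over several datasets. It is not part of ArtiFuse; the function names and the representation of a fusion as a `(gene5, gene3)` tuple are assumptions for illustration.

```python
from statistics import mean


def recall(inserted, predicted):
    """Fraction of inserted (known-positive) fusions that the tool reported."""
    inserted, predicted = set(inserted), set(predicted)
    if not inserted:
        raise ValueError("at least one inserted fusion is required")
    return len(inserted & predicted) / len(inserted)


def average_recall(datasets):
    """Mean recall over (inserted, predicted) pairs, one pair per RNA-seq dataset."""
    return mean(recall(ins, pred) for ins, pred in datasets)


# Toy example with placeholder gene names: 2 of 5 inserted fusions are recovered,
# and one extra prediction does not affect recall.
truth = {("G1", "G2"), ("G3", "G4"), ("G5", "G6"), ("G7", "G8"), ("G9", "G10")}
calls = [("G1", "G2"), ("G3", "G4"), ("X1", "X2")]
single = recall(truth, calls)
average = average_recall([(truth, calls), (truth, truth)])
```

Recall ignores false positives by construction, so a benchmark of this kind would normally be read alongside a precision estimate.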
Affiliation(s)
- Patrick Sorn
  - TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz 55131, Germany
- Christoph Holtsträter
  - TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz 55131, Germany
- Martin Löwer
  - TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz 55131, Germany
- Ugur Sahin
  - TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz 55131, Germany
- David Weber
  - TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz 55131, Germany
|
47
|
Abstract
Many chimeric RNA prediction software packages are available to assist the scientific community in searching for cancer-specific chimeric RNAs. These packages predict a large number of false positive events, which significantly hampers experimental validation of predicted chimeric RNAs. Here, we describe the detailed steps for (1) prediction of chimeric RNAs using EricScript software, (2) characterization of chimeric RNAs to discard most probable false positive events, and (3) in silico validation of chimeric RNA to select the potential cancer-specific events.
Affiliation(s)
- Sandeep Singh
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
- Hui Li
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
|
48
|
Abstract
Chimeric RNAs and their fused protein products have therapeutic applications ranging from diagnostics to use as therapeutic targets. Many algorithms have been developed to identify chimeric RNAs; however, identification and validation of the fused protein products of chimeric RNAs is still an emerging field. These chimeric proteins can be validated by searching for them in publicly available proteomics datasets. Here we describe the detailed steps for (1) downloading and processing publicly available proteomics datasets, (2) developing a fusion peptide database by performing in silico tryptic digestion of chimeric proteins, and (3) using software to identify chimeric peptides in the proteomics data.
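Step (2) of this abstract, in silico tryptic digestion of a chimeric protein to build a fusion peptide database, can be sketched in Python as follows. This is a minimal illustration and not the protocol's actual software: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), and the function names, the `junction` index convention, and the `min_len` cutoff are hypothetical. Only peptides that span the fusion junction are kept, since those are the ones that can distinguish the chimeric protein from its parental proteins.

```python
import re


def tryptic_digest(protein, missed_cleavages=0):
    """Return (start, peptide) pairs from an in silico tryptic digest.

    Cleaves C-terminal to K or R unless the next residue is P.
    """
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", protein)]
    bounds = [0] + [s for s in sites if s < len(protein)] + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Extend each peptide across up to `missed_cleavages` uncut sites.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append((bounds[i], protein[bounds[i]:bounds[j]]))
    return peptides


def junction_peptides(protein, junction, min_len=6, missed_cleavages=0):
    """Keep peptides containing residues from both fusion partners.

    junction: 0-based index of the first residue contributed by the 3' partner
    (hypothetical convention for this sketch).
    """
    return [
        pep
        for start, pep in tryptic_digest(protein, missed_cleavages)
        if start < junction < start + len(pep) and len(pep) >= min_len
    ]


# Toy sequence; residues 0-7 stand in for the 5' partner, 8 onward for the 3' partner.
toy = "MAAKGGRPLLKWWTTR"
digest = tryptic_digest(toy)
spanning = junction_peptides(toy, junction=8)
```

The resulting junction-spanning peptides would then be added to the search database used by whichever proteomics search engine is applied in step (3).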
Affiliation(s)
- Sandeep Singh
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
- Hui Li
  - Department of Pathology, School of Medicine, University of Virginia, Charlottesville, VA, USA
|
49
|
Mahmoud M, Gobet N, Cruz-Dávalos DI, Mounier N, Dessimoz C, Sedlazeck FJ. Structural variant calling: the long and the short of it. Genome Biol 2019; 20:246. [PMID: 31747936 PMCID: PMC6868818 DOI: 10.1186/s13059-019-1828-7] [Citation(s) in RCA: 309] [Impact Index Per Article: 61.8] [Received: 04/11/2019] [Accepted: 09/19/2019] [Indexed: 02/08/2023]
Abstract
Recent research into structural variants (SVs) has established their importance to medicine and molecular biology, elucidating their role in various diseases, regulation of gene expression, ethnic diversity, and large-scale chromosome evolution, giving rise to the differences within populations and among species. Nevertheless, characterizing SVs and determining the optimal approach for a given experimental design remains a computational and scientific challenge. Multiple approaches have emerged to target various SV classes, zygosities, and size ranges. Here, we review these approaches with respect to their ability to infer SVs across the full spectrum of large, complex variations and present computational methods for each approach.
Affiliation(s)
- Medhat Mahmoud
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, USA
- Nastassia Gobet
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Diana Ivette Cruz-Dávalos
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Ninon Mounier
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - University Center for Primary Care and Public Health, Lausanne, Switzerland
- Christophe Dessimoz
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution & Environment, University College London, London, UK
  - Department of Computer Science, University College London, London, UK
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, USA
|
50
|
Haas BJ, Dobin A, Li B, Stransky N, Pochet N, Regev A. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol 2019; 20:213. [PMID: 31639029 PMCID: PMC6802306 DOI: 10.1186/s13059-019-1842-9] [Citation(s) in RCA: 331] [Impact Index Per Article: 66.2] [Received: 05/19/2019] [Accepted: 09/28/2019] [Indexed: 12/15/2022]
Abstract
BACKGROUND: Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly. RESULTS: We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes. CONCLUSION: The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
Affiliation(s)
- Brian J. Haas
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
- Alexander Dobin
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
- Bo Li
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
  - Center for Immunology and Inflammatory Diseases, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129 USA
- Nathalie Pochet
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
  - Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115 USA
- Aviv Regev
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
  - Howard Hughes Medical Institute, and Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02140 USA
|