2
Haas BJ, Dobin A, Ghandi M, Van Arsdale A, Tickle T, Robinson JT, Gillani R, Kasif S, Regev A. Targeted in silico characterization of fusion transcripts in tumor and normal tissues via FusionInspector. Cell Reports Methods 2023; 3:100467. [PMID: 37323575 PMCID: PMC10261907 DOI: 10.1016/j.crmeth.2023.100467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 02/28/2023] [Accepted: 04/14/2023] [Indexed: 06/17/2023]
Abstract
Here, we present FusionInspector for in silico characterization and interpretation of candidate fusion transcripts from RNA sequencing (RNA-seq) and exploration of their sequence and expression characteristics. We applied FusionInspector to thousands of tumor and normal transcriptomes and identified statistical and experimental features enriched among biologically impactful fusions. Through clustering and machine learning, we identified large collections of fusions potentially relevant to tumor and normal biological processes. We show that biologically relevant fusions are enriched for relatively high expression of the fusion transcript, imbalanced fusion allelic ratios, and canonical splicing patterns, and are deficient in sequence microhomologies between partner genes. We demonstrate that FusionInspector accurately validates fusion transcripts in silico and helps characterize numerous understudied fusions in tumor and normal tissue samples. FusionInspector is freely available as open source for screening, characterization, and visualization of candidate fusions via RNA-seq, and facilitates transparent explanation and interpretation of machine-learning predictions and their experimental sources.
Affiliation(s)
- Brian J. Haas
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Anne Van Arsdale
- Department of Obstetrics and Gynecology and Women’s Health, Albert Einstein Montefiore Medical Center, Bronx, NY 10461, USA
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Timothy Tickle
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- James T. Robinson
- School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Riaz Gillani
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA 02215, USA
- Boston Children’s Hospital, Boston, MA 02115, USA
- Simon Kasif
- Graduate Program in Bioinformatics, Boston University, Boston, MA 02215, USA
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Aviv Regev
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
3
LaHaye S, Fitch JR, Voytovich KJ, Herman AC, Kelly BJ, Lammi GE, Arbesfeld JA, Wijeratne S, Franklin SJ, Schieffer KM, Bir N, McGrath SD, Miller AR, Wetzel A, Miller KE, Bedrosian TA, Leraas K, Varga EA, Lee K, Gupta A, Setty B, Boué DR, Leonard JR, Finlay JL, Abdelbaki MS, Osorio DS, Koo SC, Koboldt DC, Wagner AH, Eisfeld AK, Mrózek K, Magrini V, Cottrell CE, Mardis ER, Wilson RK, White P. Discovery of clinically relevant fusions in pediatric cancer. BMC Genomics 2021; 22:872. [PMID: 34863095 PMCID: PMC8642973 DOI: 10.1186/s12864-021-08094-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/11/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022]
Abstract
Background: Pediatric cancers typically have a distinct genomic landscape when compared to adult cancers and frequently carry somatic gene fusion events that alter gene expression and drive tumorigenesis. Sensitive and specific detection of gene fusions through the analysis of next-generation-based RNA sequencing (RNA-Seq) data is computationally challenging and may be confounded by low tumor cellularity or underlying genomic complexity. Furthermore, numerous computational tools are available to identify fusions from supporting RNA-Seq reads, yet each algorithm demonstrates unique variability in sensitivity and precision, and no clearly superior approach currently exists. To overcome these challenges, we have developed an ensemble fusion calling approach to increase the accuracy of identifying fusions.
Results: Our Ensemble Fusion (EnFusion) approach utilizes seven fusion calling algorithms: Arriba, CICERO, FusionMap, FusionCatcher, JAFFA, MapSplice, and STAR-Fusion, which are packaged as a fully automated pipeline using Docker and Amazon Web Services (AWS) serverless technology. This method uses paired end RNA-Seq sequence reads as input, and the output from each algorithm is examined to identify fusions detected by a consensus of at least three algorithms. These consensus fusion results are filtered by comparison to an internal database to remove likely artifactual fusions occurring at high frequencies in our internal cohort, while a “known fusion list” prevents failure to report known pathogenic events. We have employed the EnFusion pipeline on RNA-Seq data from 229 patients with pediatric cancer or blood disorders studied under an IRB-approved protocol. The samples consist of 138 central nervous system tumors, 73 solid tumors, and 18 hematologic malignancies or disorders. The combination of an ensemble fusion-calling pipeline and a knowledge-based filtering strategy identified 67 clinically relevant fusions among our cohort (diagnostic yield of 29.3%), including RBPMS-MET, BCAN-NTRK1, and TRIM22-BRAF fusions. Following clinical confirmation and reporting in the patient’s medical record, both known and novel fusions provided medically meaningful information.
Conclusions: The EnFusion pipeline offers a streamlined approach to discover fusions in cancer, at higher levels of sensitivity and accuracy than single algorithm methods. Furthermore, this method accurately identifies driver fusions in pediatric cancer, providing clinical impact by contributing evidence to diagnosis and, when appropriate, indicating targeted therapies.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08094-z.
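The consensus-and-filter logic described in this abstract can be illustrated with a minimal Python sketch. It is not the EnFusion implementation: the function name `consensus_fusions`, the input layout (a mapping from caller name to a set of fusion labels), and the rule that a known-fusion entry is exempted only from the artifact filter are all assumptions made for illustration. Only the threshold of at least three callers comes from the abstract.

```python
from collections import defaultdict

MIN_CALLERS = 3  # consensus threshold stated in the abstract


def consensus_fusions(calls_by_caller, artifact_fusions, known_fusions,
                      min_callers=MIN_CALLERS):
    """Return {fusion: sorted list of supporting callers}.

    calls_by_caller  -- hypothetical layout: caller name -> set of fusion labels
    artifact_fusions -- fusions flagged as recurrent artifacts (stand-in for
                        the internal database mentioned in the abstract)
    known_fusions    -- stand-in for the "known fusion list"; assumed here to
                        exempt a fusion from the artifact filter only
    """
    # Invert the mapping: which callers support each fusion?
    support = defaultdict(set)
    for caller, fusions in calls_by_caller.items():
        for fusion in fusions:
            support[fusion].add(caller)

    reported = {}
    for fusion, callers in support.items():
        if len(callers) < min_callers:
            continue  # not enough independent callers agree
        if fusion in artifact_fusions and fusion not in known_fusions:
            continue  # likely artifact, and not rescued by the known list
        reported[fusion] = sorted(callers)
    return reported


if __name__ == "__main__":
    # Toy input for illustration only; "GENE_A-GENE_B" is a made-up label.
    calls = {
        "Arriba": {"RBPMS-MET", "GENE_A-GENE_B"},
        "STAR-Fusion": {"RBPMS-MET", "GENE_A-GENE_B"},
        "JAFFA": {"RBPMS-MET", "GENE_A-GENE_B"},
        "FusionCatcher": {"BCAN-NTRK1"},
    }
    print(consensus_fusions(calls, {"GENE_A-GENE_B"}, set()))
```

In this toy input, the made-up artifact is dropped despite three supporting callers, and a fusion seen by a single caller never reaches the filter stage. Whether the real pipeline lets the known fusion list override the consensus threshold as well is not stated in the abstract.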
Affiliation(s)
- Stephanie LaHaye
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- James R Fitch
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Kyle J Voytovich
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Adam C Herman
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Benjamin J Kelly
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Grant E Lammi
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Jeremy A Arbesfeld
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Saranga Wijeratne
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Samuel J Franklin
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Kathleen M Schieffer
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Natalie Bir
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Sean D McGrath
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Anthony R Miller
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Amy Wetzel
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Katherine E Miller
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Tracy A Bedrosian
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Kristen Leraas
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Elizabeth A Varga
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Kristy Lee
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Ajay Gupta
- Division of Hematology, Oncology, Blood and Marrow Transplant, Nationwide Children's Hospital, Columbus, OH, USA
- Bhuvana Setty
- Division of Hematology, Oncology, Blood and Marrow Transplant, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Daniel R Boué
- Department of Pathology, The Ohio State University, Columbus, OH, USA; Department of Pathology, Nationwide Children's Hospital, Columbus, OH, USA
- Jeffrey R Leonard
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA; Section of Neurosurgery, Nationwide Children's Hospital, Columbus, OH, USA
- Jonathan L Finlay
- Division of Hematology, Oncology, Blood and Marrow Transplant, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Mohamed S Abdelbaki
- Division of Hematology, Oncology, Blood and Marrow Transplant, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Diana S Osorio
- Division of Hematology, Oncology, Blood and Marrow Transplant, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Selene C Koo
- Department of Pathology, The Ohio State University, Columbus, OH, USA; Department of Pathology, Nationwide Children's Hospital, Columbus, OH, USA
- Daniel C Koboldt
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Alex H Wagner
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Ann-Kathrin Eisfeld
- Division of Hematology, The Ohio State University, Columbus, OH, USA; Clara D. Bloomfield Center for Leukemia Outcomes Research, The Ohio State University, Columbus, OH, USA; The Ohio State Comprehensive Cancer Center, Columbus, OH, USA
- Krzysztof Mrózek
- Clara D. Bloomfield Center for Leukemia Outcomes Research, The Ohio State University, Columbus, OH, USA; The Ohio State Comprehensive Cancer Center, Columbus, OH, USA
- Vincent Magrini
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Catherine E Cottrell
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA; Department of Pathology, The Ohio State University, Columbus, OH, USA
- Elaine R Mardis
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Richard K Wilson
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Peter White
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University, Columbus, OH, USA
4
Cervera A, Rausio H, Kähkönen T, Andersson N, Partel G, Rantanen V, Paciello G, Ficarra E, Hynninen J, Hietanen S, Carpén O, Lehtonen R, Hautaniemi S, Huhtinen K. FUNGI: Fusion Gene Integration Toolset. Bioinformatics 2021; 37:3353-3355. [PMID: 33772596 PMCID: PMC8504624 DOI: 10.1093/bioinformatics/btab206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2020] [Revised: 02/27/2021] [Accepted: 03/25/2021] [Indexed: 11/17/2022]
Abstract
Motivation: Fusion genes are both useful cancer biomarkers and important drug targets. Finding relevant fusion genes is challenging due to genomic instability resulting in a high number of passenger events. To reveal and prioritize relevant gene fusion events we have developed FUsionN Gene Identification toolset (FUNGI) that uses an ensemble of fusion detection algorithms with prioritization and visualization modules.
Results: We applied FUNGI to an ovarian cancer dataset of 107 tumor samples from 36 patients. Ten out of 11 detected and prioritized fusion genes were validated. Many of the detected fusion genes affect the PI3K-AKT pathway, with a potential role in treatment resistance.
Availability and implementation: FUNGI and its documentation are available at https://bitbucket.org/alejandra_cervera/fungi as standalone or from Anduril at https://www.anduril.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alejandra Cervera
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014
- Heidi Rausio
- Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, Turku, 20014
- Tiia Kähkönen
- Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, Turku, 20014
- Noora Andersson
- Department of Pathology, University of Helsinki and HUS-Diagnostics, Helsinki University Hospital, Helsinki, 00014
- Gabriele Partel
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014
- Ville Rantanen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014
- Elisa Ficarra
- Department of Engineering "Enzo Ferrari", University of Modena and Reggio Emilia (UNIMORE), Reggio Emilia, 42121
- Johanna Hynninen
- Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, Turku, 20521
- Sakari Hietanen
- Department of Obstetrics and Gynecology, University of Turku and Turku University Hospital, Turku, 20521
- Olli Carpén
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014; Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, Turku, 20014; Department of Pathology, University of Helsinki and HUS-Diagnostics, Helsinki University Hospital, Helsinki, 00014
- Rainer Lehtonen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014
- Sampsa Hautaniemi
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014
- Kaisa Huhtinen
- Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, 00014; Cancer Research Unit, Institute of Biomedicine and FICAN West Cancer Centre, University of Turku, Turku, 20014
6
Jang YE, Jang I, Kim S, Cho S, Kim D, Kim K, Kim J, Hwang J, Kim S, Kim J, Kang J, Lee B, Lee S. ChimerDB 4.0: an updated and expanded database of fusion genes. Nucleic Acids Res 2020; 48:D817-D824. [PMID: 31680157 PMCID: PMC7145594 DOI: 10.1093/nar/gkz1013] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 09/15/2019] [Revised: 10/16/2019] [Accepted: 10/17/2019] [Indexed: 12/14/2022]
Abstract
Fusion genes represent an important class of biomarkers and therapeutic targets in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data (ChimerSeq) and text mining of publications (ChimerPub) with extensive manual annotations (ChimerKB). In this update, we present all three modules substantially enhanced by incorporating the recent flood of deep sequencing data and related publications. ChimerSeq now covers all 10 565 patients in the TCGA project, with compilation of computational results from two reliable programs of STAR-Fusion and FusionScan with several public resources. In sum, ChimerSeq includes 65 945 fusion candidates, 21 106 of which were predicted by multiple programs (ChimerSeq-Plus). ChimerPub has been upgraded by applying a deep learning method for text mining followed by extensive manual curation, which yielded 1257 fusion genes including 777 cases with experimental supports (ChimerPub-Plus). ChimerKB includes 1597 fusion genes with publication support, experimental evidences and breakpoint information. Importantly, we implemented several new features to aid estimation of functional significance, including the fusion structure viewer with domain information, gene expression plot of fusion positive versus negative patients and a STRING network viewer. The user interface also was greatly enhanced by applying responsive web design. ChimerDB 4.0 is available at http://www.kobic.re.kr/chimerdb/.
Affiliation(s)
- Ye Eun Jang
- Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Insu Jang
- Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Sunkyu Kim
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Subin Cho
- Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Daehan Kim
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Keonwoo Kim
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Jaewon Kim
- Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jimin Hwang
- Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Sangok Kim
- Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jaesang Kim
- Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea
- Jaewoo Kang
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Byungwook Lee
- Korean Bioinformation Center, Korean Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Sanghyuk Lee
- Department of Bio-Information Science, Ewha Womans University, Seoul 03760, Republic of Korea; Department of Life Science, Ewha Womans University, Seoul 03760, Republic of Korea