1
|
Shafighi S, Geras A, Jurzysta B, Sahaf Naeini A, Filipiuk I, Rączkowska A, Toosi H, Koperski Ł, Thrane K, Engblom C, Mold JE, Chen X, Hartman J, Nowis D, Carbone A, Lagergren J, Szczurek E. Integrative spatial and genomic analysis of tumor heterogeneity with Tumoroscope. Nat Commun 2024; 15:9343. [PMID: 39472583 PMCID: PMC11522407 DOI: 10.1038/s41467-024-53374-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 10/09/2024] [Indexed: 11/02/2024] Open
Abstract
Spatial and genomic heterogeneity of tumors are crucial factors influencing cancer progression, treatment, and survival. However, a technology for directly mapping the clones in the tumor tissue based on somatic point mutations is lacking. Here, we propose Tumoroscope, the first probabilistic model that accurately infers cancer clones and their localization at close to single-cell resolution by integrating pathological images, whole exome sequencing, and spatial transcriptomics data. In contrast to previous methods, Tumoroscope explicitly addresses the problem of deconvoluting the proportions of clones in spatial transcriptomics spots. Applied to a reference prostate cancer dataset and a newly generated breast cancer dataset, Tumoroscope reveals spatial patterns of clone colocalization and mutual exclusion in sub-areas of the tumor tissue. We further infer clone-specific gene expression levels and the most highly expressed genes for each clone. In summary, Tumoroscope enables an integrated study of the spatial, genomic, and phenotypic organization of tumors.
Affiliation(s)
- Shadi Shafighi
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Sorbonne Universite, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative, Paris, France
- Cancer Research UK Cambridge Institute, Cambridge, UK
| | - Agnieszka Geras
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Department of Statistics, Columbia University, New York, NY, 10027, USA
- Irving Institute for Cancer Dynamics, Columbia University, New York, NY, 10027, USA
| | - Barbara Jurzysta
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Alireza Sahaf Naeini
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Igor Filipiuk
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Alicja Rączkowska
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Hosein Toosi
- SciLifeLab, School of EECS, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Łukasz Koperski
- Department of Pathology, Medical University of Warsaw, Warsaw, Poland
| | - Kim Thrane
- Department of Gene Technology, KTH Royal Institute of Technology, SciLifeLab, Stockholm, Sweden
| | - Camilla Engblom
- Department of Cell and Molecular Biology, Karolinska Institutet, Solna, Sweden
- SciLifeLab, Department of Medicine Solna, Center of Molecular Medicine, Karolinska Institute and University Hospital, Stockholm, Sweden
| | - Jeff E Mold
- Department of Cell and Molecular Biology, Karolinska Institutet, Solna, Sweden
| | - Xinsong Chen
- Department of Oncology-Pathology, Karolinska Institutet, Stockholm, Sweden
| | - Johan Hartman
- Department of Oncology-Pathology, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Pathology and Cancer Diagnostics, Karolinska University Hospital, Stockholm, Sweden
| | - Dominika Nowis
- Laboratory of Experimental Medicine, Medical University of Warsaw, Warsaw, Poland
| | - Alessandra Carbone
- Sorbonne Universite, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative, Paris, France
- Institut Universitaire de France, Paris, France
| | - Jens Lagergren
- SciLifeLab, School of EECS, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Ewa Szczurek
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.
- Institute of AI for Health, Helmholtz Munich, German Research Center for Environmental Health, Neuherberg, Germany.
| |
|
2
|
Balasubramaniam NK, Penberthy S, Fenyo D, Viessmann N, Russmann C, Borchers CH. Digitalomics - digital transformation leading to omics insights. Expert Rev Proteomics 2024; 21:337-344. [PMID: 39364775 DOI: 10.1080/14789450.2024.2413107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 09/02/2024] [Accepted: 09/23/2024] [Indexed: 10/05/2024]
Abstract
INTRODUCTION Biomarker discovery is increasingly moving from single omics to multiomics, as well as from multi-cell omics to single-cell omics. These transitions have increasingly adopted digital transformation technologies to accelerate the progression from data to insight. Here, we will discuss the concept of 'digitalomics' and how digital transformation directly impacts biomarker discovery. This will ultimately assist clinicians in personalized therapy and precision-medicine treatment decisions. AREAS COVERED Genotype-to-phenotype-based insight generation involves integrating large amounts of complex multiomic data. This data integration and analysis is aided through digital transformation, leading to better clinical outcomes. We also highlight the challenges and opportunities of Digitalomics, and provide examples of the application of Artificial Intelligence, cloud- and high-performance computing, and use of tensors for multiomic analysis workflows. EXPERT OPINION Biomarker discovery, aided by digital transformation, is having a significant impact on cancer, cardiovascular, infectious, immunological, and neurological diseases, among others. Data insights garnered from multiomic analyses, combined with patient metadata, aid patient stratification and targeted treatment across a broad spectrum of diseases. Digital transformation offers time and cost savings while leading to improved patient healthcare. Here, we highlight the impact of digital transformation on multiomics-based biomarker discovery with specific applications related to oncology.
Affiliation(s)
- Nandha Kumar Balasubramaniam
- PromptBio Inc, Pleasanton, CA, USA
- Health Campus Goettingen/University of Applied Sciences and Arts (HAWK), Göttingen, Germany
| | | | - David Fenyo
- New York University Grossman School of Medicine, New York, NY, USA
| | - Nina Viessmann
- Health Campus Goettingen/University of Applied Sciences and Arts (HAWK), Göttingen, Germany
| | - Christoph Russmann
- Health Campus Goettingen/University of Applied Sciences and Arts (HAWK), Göttingen, Germany
- Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Christoph H Borchers
- Segal Cancer Proteomics Center, Lady Davis Institute for Medical Research, Jewish General Hospital and McGill University, Montreal, QC, Canada
- Gerald Bronfman Department of Oncology, Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC, Canada
- Division of Experimental Medicine, McGill University, Montreal, QC, Canada
- Department of Pathology, McGill University, Montreal, QC, Canada
| |
|
3
|
Zhang N, Ma F, Guo D, Pang Y, Wang C, Zhang Y, Zheng X, Wang M. A novel hypergraph model for identifying and prioritizing personalized drivers in cancer. PLoS Comput Biol 2024; 20:e1012068. [PMID: 38683860 PMCID: PMC11081510 DOI: 10.1371/journal.pcbi.1012068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 05/09/2024] [Accepted: 04/09/2024] [Indexed: 05/02/2024] Open
Abstract
Cancer development is driven by an accumulation of a small number of driver genetic mutations that confer a selective growth advantage on the cell, while most passenger mutations do not contribute to tumor progression. The identification of these driver genes responsible for tumorigenesis is a crucial step in designing effective cancer treatments. Although many computational methods have been developed for this purpose, the majority of existing methods provide only a single driver gene list for the entire cohort of patients, ignoring the high heterogeneity of driver events across patients. It remains challenging to identify personalized driver genes. Here, we propose a novel method (PDRWH), which aims to prioritize the mutated genes of a single patient based on their impact on the abnormal expression of downstream genes across a group of patients who share the co-mutation genes and similar gene expression profiles. Extensive experimental results on 16 cancer datasets from TCGA showed that PDRWH excels in identifying known general driver genes and tumor-specific drivers. In comparative testing across five cancer types, PDRWH outperformed existing individual-level methods as well as cohort-level methods. Our results also demonstrated that PDRWH could identify both common and rare drivers. The personalized driver profiles could improve tumor stratification, providing new insights into tumor heterogeneity and taking a further step toward personalized treatment. We also validated one of our predicted novel personalized driver genes with in vitro cell-based assays, which showed that high expression of low-density lipoprotein receptor-related protein 1 (LRP1) promotes tumor cell proliferation.
Affiliation(s)
- Naiqian Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Fubin Ma
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Dong Guo
- School of Mathematics and Statistics, Shandong University, Weihai, China
- Department of Central Lab, Weihai Municipal Hospital, Shandong University, Weihai, China
| | - Yuxuan Pang
- SDU-ANU Joint Science College, Shandong University, Weihai, China
| | - Chenye Wang
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Xiaoqi Zheng
- Center for Single-Cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Mingyi Wang
- School of Mathematics and Statistics, Shandong University, Weihai, China
- Department of Central Lab, Weihai Municipal Hospital, Shandong University, Weihai, China
| |
|
4
|
Qiao Y, Huang X, Moos PJ, Ahmann JM, Pomicter AD, Deininger MW, Byrd JC, Woyach JA, Stephens DM, Marth GT. A Bayesian framework to study tumor subclone-specific expression by combining bulk DNA and single-cell RNA sequencing data. Genome Res 2024; 34:94-105. [PMID: 38195207 PMCID: PMC10903947 DOI: 10.1101/gr.278234.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 11/22/2023] [Indexed: 01/11/2024]
Abstract
Genetic and gene expression heterogeneity is an essential hallmark of many tumors, allowing the cancer to evolve and to develop resistance to treatment. Currently, the most commonly used data types for studying such heterogeneity are bulk tumor/normal whole-genome or whole-exome sequencing (WGS, WES) and single-cell RNA sequencing (scRNA-seq), respectively. However, tools are currently lacking to link genomic tumor subclonality with transcriptomic heterogeneity by integrating genomic and single-cell transcriptomic data collected from the same tumor. To address this gap, we developed scBayes, a Bayesian probabilistic framework that uses tumor subclonal structure inferred from bulk DNA sequencing data to determine the subclonal identity of cells from single-cell gene expression (scRNA-seq) measurements. Grouping together cells representing the same genetically defined tumor subclones allows comparison of gene expression across different subclones, or investigation of gene expression changes within the same subclone across time (i.e., progression, treatment response, or relapse) or space (i.e., at multiple metastatic sites and organs). We used simulated data sets, in silico synthetic data sets, as well as biological data sets generated from cancer samples to extensively characterize and validate the performance of our method, as well as to show improvements over existing methods. We show the validity and utility of our approach by applying it to published data sets and recapitulating the findings, as well as arriving at novel insights into cancer subclonal expression behavior in our own data sets. We further show that our method is applicable to a wide range of single-cell sequencing technologies including single-cell DNA sequencing as well as Smart-seq and 10x Genomics scRNA-seq protocols.
Affiliation(s)
- Yi Qiao
- Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA
| | - Xiaomeng Huang
- Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA
| | - Philip J Moos
- Department of Pharmacology and Toxicology, University of Utah, Salt Lake City, Utah 84112, USA
| | - Jonathan M Ahmann
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
| | - Anthony D Pomicter
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
| | - Michael W Deininger
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
- Division of Hematology and Hematologic Malignancies, University of Utah, Salt Lake City, Utah 84112, USA
| | - John C Byrd
- The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, USA
| | - Jennifer A Woyach
- The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio 43210, USA
| | - Deborah M Stephens
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
| | - Gabor T Marth
- Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA
| |
|
5
|
Rossi N, Gigante N, Vitacolonna N, Piazza C. Inferring Markov Chains to Describe Convergent Tumor Evolution With CIMICE. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:106-119. [PMID: 38015671 DOI: 10.1109/tcbb.2023.3337258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
The field of tumor phylogenetics focuses on studying the differences within cancer cell populations. Much effort within the scientific community goes into building cancer progression models that try to explain the heterogeneity of such diseases. These models are highly dependent on the kind of data used for their construction; therefore, as the experimental technologies evolve, it is of major importance to exploit their peculiarities. In this work we describe a cancer progression model based on Single Cell DNA Sequencing data. When constructing the model, we focus on tailoring the formalism to the specificity of the data. We operate by defining a minimal set of assumptions needed to reconstruct a flexible DAG-structured model, capable of identifying progression beyond the limitation of the infinite sites assumption. Our proposal is conservative in the sense that we aim to neither discard nor infer knowledge which is not represented in the data. We provide simulations and analytical results to show the features of our model, test it on real data, and show how it can be integrated with other approaches to cope with input noise. Moreover, our framework can be exploited to produce simulated data that follows our theoretical assumptions. Finally, we provide an open source R implementation of our approach, called CIMICE, that is publicly available on Bioconductor.
|
6
|
Quan C, Liu F, Qi L, Tie Y. LRT-CLUSTER: A New Clustering Algorithm Based on Likelihood Ratio Test to Identify Driving Genes. Interdiscip Sci 2023; 15:217-230. [PMID: 36848004 DOI: 10.1007/s12539-023-00554-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 01/31/2023] [Accepted: 02/01/2023] [Indexed: 03/01/2023]
Abstract
Somatic mutations often occur at highly recurrent sites in protein sequences, which indicates that the positional clustering of somatic missense mutations can be used to identify driver genes. However, traditional clustering algorithms over-fit the background signal, are poorly suited to mutation data, and leave room for improvement in identifying genes with low-frequency mutations. In this paper, we propose a linear clustering algorithm based on the likelihood ratio test to identify driver genes. First, the polynucleotide mutation rate is calculated based on the prior knowledge of the likelihood ratio test. Then, the simulation data set is obtained through the background mutation rate model. Finally, the unsupervised peak clustering algorithm is used to evaluate the somatic mutation data and the simulation data, respectively, to identify the driver genes. The experimental results show that our method achieves a better balance of precision and sensitivity. It can also identify the driver genes missed by other methods, making it an effective supplement to other methods. We also discover some potential linkages between genes and between genes and mutation sites, which is of great value to targeted drug therapy research. Method framework: Our proposed model framework is as follows.
a. Count the mutation sites and the number of mutations in tumor gene elements.
b. Count the nucleotide-context mutation frequency based on the likelihood ratio test, and obtain the background mutation rate model.
c. Using Monte Carlo simulation, randomly sample data sets with the same number of mutations as the gene elements to obtain simulated mutation data; the sampling frequency of each mutation site is related to the polynucleotide mutation rate.
d. Cluster the original mutation data and the simulated mutation data after random reconstruction by peak density, respectively, and obtain the corresponding clustering scores.
e. From step d, obtain the clustering information statistics and the score of each gene segment for the original single nucleotide mutation data.
f. From the observed score and the simulated clustering score, calculate the p-value of the corresponding gene fragment.
g. From step d, obtain the clustering information statistics and the score of each gene segment for the simulated single nucleotide mutation data.
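The sampling and scoring steps of this framework amount to a Monte Carlo significance test, which can be sketched as follows. This is a minimal illustration, not the authors' implementation: all function names are hypothetical, the per-site weights stand in for the background mutation rate model, and `toy_cluster_score` is a deliberately simple placeholder for the density-peak clustering score. The add-one correction in the p-value is a common convention and is an assumption here.

```python
import random


def empirical_p_value(observed_score, simulated_scores):
    """Share of simulated scores >= the observed score, with an add-one correction."""
    exceed = sum(1 for s in simulated_scores if s >= observed_score)
    return (exceed + 1) / (len(simulated_scores) + 1)


def simulate_positions(n_mutations, site_weights, rng):
    """Sample mutation sites with probability proportional to a background weight per site."""
    sites = list(range(len(site_weights)))
    return rng.choices(sites, weights=site_weights, k=n_mutations)


def toy_cluster_score(positions):
    """Hypothetical stand-in for a clustering score: the largest mutation count at one site."""
    if not positions:
        return 0
    counts = {}
    for p in positions:
        counts[p] = counts.get(p, 0) + 1
    return max(counts.values())


def segment_p_value(observed_positions, site_weights, n_sim=999, seed=0):
    """Compare the observed score of one gene segment with scores of simulated data sets."""
    rng = random.Random(seed)
    observed = toy_cluster_score(observed_positions)
    simulated = [
        toy_cluster_score(
            simulate_positions(len(observed_positions), site_weights, rng)
        )
        for _ in range(n_sim)
    ]
    return empirical_p_value(observed, simulated)
```

Each simulated data set has the same number of mutations as the observed segment, so the comparison isolates how unusually concentrated the observed mutations are under the background model.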
Affiliation(s)
- Chenxu Quan
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China
- Department of Respiratory and Sleep Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Fenghui Liu
- Department of Respiratory and Sleep Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Lin Qi
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China
| | - Yun Tie
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, China.
| |
|
7
|
Li F, Li H, Shang J, Liu JX, Dai L, Liu X, Li Y. A network-based method for identifying cancer driver genes based on node control centrality. Exp Biol Med (Maywood) 2022; 248:232-241. [PMID: 36573462 PMCID: PMC10107394 DOI: 10.1177/15353702221139201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022] Open
Abstract
Cancer is one of the major contributors to human mortality and has a serious influence on human survival and health. In biomedical research, the identification of cancer driver genes (cancer drivers for short) is an important task; cancer drivers can promote the initiation and progression of cancer. To identify cancer drivers, many methods have been developed. However, these computational models identify only coding cancer drivers, although non-coding drivers likewise play significant roles in the progression of cancer. Hence, we propose a Network-based Method for identifying cancer Driver Genes based on node Control Centrality (NMDGCC), which can identify coding and non-coding cancer driver genes. The process of NMDGCC for identifying driver genes mainly includes the following two steps. In the first step, we construct a gene interaction network by using mRNA and miRNA expression data in the cancer state. In the second step, the control centrality of the node is used to identify cancer drivers in the constructed network. We use the breast cancer dataset from The Cancer Genome Atlas (TCGA) to verify the effectiveness of NMDGCC. Compared with the existing methods of cancer driver gene identification, NMDGCC has a better performance. NMDGCC also identifies 295 miRNAs as non-coding cancer drivers, of which 158 are related to tumorigenesis of BRCA. We also apply NMDGCC to identify driver genes related to the different breast cancer subtypes. The result shows that NMDGCC detects many cancer drivers of specific cancer subtypes.
Affiliation(s)
- Feng Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Han Li
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Lingyun Dai
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Xikui Liu
- Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
| | - Yan Li
- Department of Electrical Engineering and Information Technology, Shandong University of Science and Technology, Jinan 250031, China
| |
|
8
|
Rath S, Chakraborty D, Pradhan J, Imran Khan M, Dandapat J. Epigenomic interplay in tumor heterogeneity: Potential of epidrugs as adjunct therapy. Cytokine 2022; 157:155967. [PMID: 35905624 DOI: 10.1016/j.cyto.2022.155967] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/07/2022] [Revised: 07/11/2022] [Accepted: 07/13/2022] [Indexed: 11/28/2022]
Abstract
"Heterogeneity" in tumor mass has immense importance in cancer progression and therapy. The impact of tumor heterogeneity is an emerging field and not yet fully explored. Tumor heterogeneity is mainly classified as intra-tumor heterogeneity or inter-tumor heterogeneity based on its origin. Intra-tumor heterogeneity refers to the discrepancy within the same cancer mass, while inter-tumor heterogeneity refers to the discrepancy between different patients having the same tumor type. Both of these heterogeneity types lead to variation in the histopathological as well as clinical properties of the cancer mass, which drives disease resistance towards therapeutic approaches. Cancer stem cells (CSCs) act as pinnacle progenitors for heterogeneity development, along with various other genetic and epigenetic parameters that regulate this process. In recent times, epigenetic factors have been among the most studied parameters that drive oxidative stress pathways essential during cancer progression. These epigenetic changes are modulated by various epidrugs and have an impact on tumor heterogeneity. The present review summarizes various aspects of epigenetic regulation in the tumor microenvironment, oxidative stress, and progression towards tumor heterogeneity that creates complications during cancer treatment. This review also explores the possible role of epidrugs in regulating tumor heterogeneity and personalized therapy against drug resistance.
Affiliation(s)
- Suvasmita Rath
- Center of Environment, Climate Change and Public Health, Utkal University, Vani Vihar, Bhubaneswar 751004, Odisha, India
| | - Diptesh Chakraborty
- Department of Biotechnology, Utkal University, Bhubaneswar 751004, Odisha, India
| | - Jyotsnarani Pradhan
- Department of Biotechnology, Utkal University, Bhubaneswar 751004, Odisha, India
| | - Mohammad Imran Khan
- Department of Biochemistry, King Abdulaziz University (KAU), Jeddah 21577, Saudi Arabia; Centre of Artificial Intelligence for Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Jagneshwar Dandapat
- Department of Biotechnology, Utkal University, Bhubaneswar 751004, Odisha, India; Centre of Excellence in Integrated Omics and Computational Biology, Utkal University, Bhubaneswar 751004, Odisha, India.
| |
|
9
|
Bennett C, Carroll C, Wright C, Awad B, Park JM, Farmer M, Brown E(B), Heatherly A, Woodard S. Breast Cancer Genomics: Primary and Most Common Metastases. Cancers (Basel) 2022; 14:3046. [PMID: 35804819 PMCID: PMC9265113 DOI: 10.3390/cancers14133046] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/02/2022] [Revised: 06/17/2022] [Accepted: 06/20/2022] [Indexed: 11/16/2022] Open
Abstract
Specific genomic alterations have been found in primary breast cancer involving driver mutations that result in tumorigenesis. Metastatic breast cancer, which is uncommon at the time of disease onset, variably impacts patients throughout the course of their disease. Both the molecular profiles and diverse genomic pathways vary in the development and progression of metastatic breast cancer. From the most common metastatic site (bone) to rare sites such as orbital, gynecologic, or pancreatic metastases, different levels of gene expression indicate the potential involvement of numerous genes in the development and spread of breast cancer. Knowledge of these alterations can not only help predict future disease but also lead to advancement in breast cancer treatments. This review discusses the somatic landscape of breast primary and metastatic tumors.
Affiliation(s)
- Caroline Bennett
- Birmingham Marnix E. Heersink School of Medicine, The University of Alabama, 1670 University Blvd, Birmingham, AL 35233, USA
| | - Caleb Carroll
- Birmingham Marnix E. Heersink School of Medicine, The University of Alabama, 1670 University Blvd, Birmingham, AL 35233, USA
| | - Cooper Wright
- Birmingham Marnix E. Heersink School of Medicine, The University of Alabama, 1670 University Blvd, Birmingham, AL 35233, USA
| | - Barbara Awad
- Debusk College of Osteopathic Medicine, Lincoln Memorial University, 6965 Cumberland Gap Pkwy, Harrogate, TN 37752, USA
| | - Jeong Mi Park
- Department of Radiology, The University of Alabama at Birmingham, 619 19th Street South, Birmingham, AL 35249, USA
| | - Meagan Farmer
- Department of Genetics, Marnix E. Heersink School of Medicine, The University of Alabama at Birmingham, 1670 University Blvd, Birmingham, AL 35233, USA
| | - Elizabeth (Bryce) Brown
- Laboratory Genetics Counselor, UAB Medical Genomics Laboratory, Kaul Human Genetics Building, 720 20th Street South, Suite 332, Birmingham, AL 35294, USA
| | - Alexis Heatherly
- Department of Genetics, Marnix E. Heersink School of Medicine, The University of Alabama at Birmingham, 1670 University Blvd, Birmingham, AL 35233, USA
| | - Stefanie Woodard
- Department of Radiology, The University of Alabama at Birmingham, 619 19th Street South, Birmingham, AL 35249, USA
| |
|
10
|
Laganà A. Computational Approaches for the Investigation of Intra-tumor Heterogeneity and Clonal Evolution from Bulk Sequencing Data in Precision Oncology Applications. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:101-118. [DOI: 10.1007/978-3-030-91836-1_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
11
|
Laganà A. The Architecture of a Precision Oncology Platform. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:1-22. [DOI: 10.1007/978-3-030-91836-1_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
12
|
Ogundijo OE, Zhu K, Wang X, Anastassiou D. Characterizing Intra-Tumor Heterogeneity From Somatic Mutations Without Copy-Neutral Assumption. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2271-2280. [PMID: 32070995 DOI: 10.1109/tcbb.2020.2973635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Bulk samples of the same patient are heterogeneous in nature, comprising different subpopulations (subclones) of cancer cells. Cells in a tumor subclone are characterized by a unique mutational genotype profile. Resolving tumor heterogeneity by estimating the genotypes, cellular proportions and the number of subclones present in the tumor can help in understanding cancer progression and treatment. We present a novel method, ChaClone2, to efficiently deconvolve the observed variant allele fractions (VAFs), with consideration for possible effects from copy number aberrations at the mutation loci. Our method describes a state-space formulation of the feature allocation model, deconvolving the observed VAFs from samples of the same patient into three matrices: subclonal total and variant copy numbers for mutated genes, and proportions of subclones in each sample. We describe an efficient sequential Monte Carlo (SMC) algorithm to estimate these matrices. Extensive simulation shows that ChaClone2 yields better accuracy when compared with other state-of-the-art methods for addressing similar problems, and it offers scalability to large datasets. In addition, ChaClone2 allows the model parameter estimates to be refined whenever new mutation data from freshly sequenced genomic locations become available. MATLAB code and datasets are available to download at: https://github.com/moyanre/method2.
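To make the three-matrix decomposition concrete, the sketch below computes the VAF expected at one locus in one sample from subclone proportions, variant copy numbers, and total copy numbers. The formula (proportion-weighted variant copies divided by proportion-weighted total copies) is one common way to write this relationship and is an assumption here; the abstract does not state ChaClone2's exact likelihood, and the function name is hypothetical.

```python
def expected_vaf(proportions, variant_cn, total_cn):
    """Expected variant allele fraction at one locus in one sample.

    proportions[k]: fraction of cells in the sample that belong to subclone k
                    (normal cells can be included as a subclone with 0 variant copies)
    variant_cn[k]:  number of variant copies at the locus in subclone k
    total_cn[k]:    total copy number at the locus in subclone k

    Assumed mixture form (not taken from the paper):
        VAF = sum_k w_k * z_k / sum_k w_k * l_k
    """
    variant_copies = sum(w * z for w, z in zip(proportions, variant_cn))
    total_copies = sum(w * l for w, l in zip(proportions, total_cn))
    if total_copies <= 0:
        raise ValueError("total copy number must be positive")
    return variant_copies / total_copies


# Half normal cells (no variant, 2 copies), half tumor cells with 1 variant copy out of 2.
vaf_copy_neutral = expected_vaf([0.5, 0.5], [0, 1], [2, 2])

# Same mixture, but the tumor subclone carries 2 variant copies out of 4 (amplified locus).
vaf_amplified = expected_vaf([0.5, 0.5], [0, 2], [2, 4])
```

The second call shows why dropping the copy-neutral assumption matters: the same cellular proportions give a different expected VAF once the locus is amplified.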
|
13
|
Pham VVH, Liu L, Bracken C, Goodall G, Li J, Le TD. Computational methods for cancer driver discovery: A survey. Theranostics 2021; 11:5553-5568. [PMID: 33859763 PMCID: PMC8039954 DOI: 10.7150/thno.52670] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/01/2020] [Accepted: 01/20/2021] [Indexed: 12/21/2022] Open
Abstract
Identifying the genes responsible for driving cancer is of critical importance for directing treatment. Accordingly, multiple computational tools have been developed to facilitate this task. Due to the different methods employed by these tools, different data considered by the tools, and the rapidly evolving nature of the field, the selection of an appropriate tool for cancer driver discovery is not straightforward. This survey seeks to provide a comprehensive review of the different computational methods for discovering cancer drivers. We categorise the methods into three groups: methods for single driver identification, methods for driver module identification, and methods for identifying personalised cancer drivers. In addition to providing a “one-stop” reference of these methods, by evaluating and comparing their performance, we also provide readers with information about the different capabilities of the methods in identifying biologically significant cancer drivers. The biologically relevant information identified by these tools can be seen through the enrichment of discovered cancer drivers in GO biological processes and KEGG pathways and through our identification of a small cancer-driver cohort that is capable of stratifying patient survival.
|
14
|
Schill R, Solbrig S, Wettig T, Spang R. Modelling cancer progression using Mutual Hazard Networks. Bioinformatics 2020; 36:241-249. [PMID: 31250881 PMCID: PMC6956791 DOI: 10.1093/bioinformatics/btz513] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/12/2018] [Revised: 03/29/2019] [Accepted: 06/25/2019] [Indexed: 12/26/2022]
Abstract
MOTIVATION: Cancer progresses by accumulating genomic events, such as mutations and copy number alterations, whose chronological order is key to understanding the disease but difficult to observe. Instead, cancer progression models use co-occurrence patterns in cross-sectional data to infer epistatic interactions between events and thereby uncover their most likely order of occurrence. State-of-the-art progression models, however, are limited by mathematical tractability and only allow events to interact in directed acyclic graphs, to promote but not inhibit subsequent events, or to be mutually exclusive in distinct groups that cannot overlap. RESULTS: Here we propose Mutual Hazard Networks (MHN), a new Machine Learning algorithm to infer cyclic progression models from cross-sectional data. MHN model events by their spontaneous rate of fixation and by multiplicative effects they exert on the rates of successive events. MHN compared favourably to acyclic models in cross-validated model fit on four datasets tested. In application to the glioblastoma dataset from The Cancer Genome Atlas, MHN proposed a novel interaction in line with consecutive biopsies: IDH1 mutations are early events that promote subsequent fixation of TP53 mutations. AVAILABILITY AND IMPLEMENTATION: Implementation and data are available at https://github.com/RudiSchill/MHN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
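The rate structure described in this abstract (a spontaneous rate per event, multiplied by the effects of events already present) can be sketched in a few lines. This is only an illustration under stated assumptions: the matrix layout (`theta[i][i]` as the spontaneous rate of event `i`, `theta[i][j]` as the multiplicative effect of event `j` on event `i`), the function name `event_rate` and the toy values are hypothetical, and no parameter learning from the paper is reproduced.

```python
def event_rate(theta, state, i):
    """Rate at which event i occurs given the events already present.

    theta[i][i] : assumed spontaneous rate of event i
    theta[i][j] : assumed multiplicative effect of event j on event i
                  (> 1 promotes, < 1 inhibits, 1 is neutral)
    state       : tuple of 0/1 flags, one per event
    """
    rate = theta[i][i]
    for j, present in enumerate(state):
        if present and j != i:
            rate *= theta[i][j]
    return rate


# Toy network with three events: event 0 promotes event 1,
# and event 1 inhibits event 2.
theta = [
    [0.5, 1.0, 1.0],
    [3.0, 0.2, 1.0],
    [1.0, 0.1, 0.4],
]

state = (1, 0, 0)  # only event 0 has occurred so far
rates = {i: event_rate(theta, state, i) for i in range(3) if not state[i]}
print(rates)
```

Because promoting and inhibiting effects are just entries above or below 1 in the same matrix, nothing in this formulation restricts the interactions to an acyclic graph.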
Affiliation(s)
- Rudolf Schill
- Department of Statistical Bioinformatics, Institute of Functional Genomics, Regensburg 93040, Germany
- Stefan Solbrig
- Department of Physics, University of Regensburg, Regensburg 93040, Germany
- Tilo Wettig
- Department of Physics, University of Regensburg, Regensburg 93040, Germany
- Rainer Spang
- Department of Statistical Bioinformatics, Institute of Functional Genomics, Regensburg 93040, Germany
|
15
|
Trevino V. Modeling and analysis of site-specific mutations in cancer identifies known plus putative novel hotspots and bias due to contextual sequences. Comput Struct Biotechnol J 2020; 18:1664-1675. [PMID: 32670506 PMCID: PMC7339035 DOI: 10.1016/j.csbj.2020.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2019] [Revised: 06/10/2020] [Accepted: 06/12/2020] [Indexed: 11/22/2022]
Abstract
In cancer, recurrently mutated sites in DNA and proteins, called hotspots, are thought to arise through positive selection and are therefore important due to their potential functional impact. Although recent evidence for APOBEC enzymatic activity has shown that hotspots in specific types of sequences are likely to be false, the identification of putative hotspots is important to confirm either their functional role or their mechanistic bias. In this work, an algorithm and a statistical model are presented to detect hotspots. The model consists of a beta-binomial component plus fixed effects that efficiently fits the distribution of mutated sites. The algorithm employs an optimal stepwise approach to find the model parameters. Simulations show that the proposed algorithmic model is highly accurate for common hotspots. The approach has been applied to TCGA mutational data from 33 cancer types. The results show that well-known cancer hotspots are easily detected. In addition, novel hotspots are detected. An analysis of the sequence context of detected hotspots shows a preference for TCG sites that may be related to APOBEC or other unknown mechanistic biases. The detected hotspots are available online at http://bioinformatica.mty.itesm.mx/HotSpotsAnnotations.
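As a rough illustration of how a beta-binomial background can flag a recurrently mutated site, the sketch below computes the beta-binomial probability mass function and an upper-tail probability using only the standard library. It covers the beta-binomial component alone: the fixed effects and the stepwise parameter search mentioned in the abstract are omitted, and the function names, the parameters `alpha` and `beta`, and the toy numbers are all assumptions made for this example.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, alpha, beta):
    """P(K = k) for a beta-binomial with n trials and shape (alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def upper_tail(k, n, alpha, beta):
    """P(K >= k): how surprising it is to see k or more mutated samples."""
    return sum(betabinom_pmf(x, n, alpha, beta) for x in range(k, n + 1))


# Hypothetical background: mutations at a single site are rare and overdispersed.
n_samples, alpha, beta = 200, 0.05, 40.0
for observed in (1, 5, 20):
    p = upper_tail(observed, n_samples, alpha, beta)
    print(f"{observed:>2} mutated samples: tail probability = {p:.3e}")
```

A site whose tail probability remains small after multiple-testing correction would be a candidate hotspot under this simplified background model.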
Affiliation(s)
- Victor Trevino
- Tecnologico de Monterrey, Escuela de Medicina, Av Morones Prieto No. 3000, Colonia Los Doctores, Monterrey, Nuevo León Zip Code 64710, Mexico
|
16
|
Kataka E, Zaucha J, Frishman G, Ruepp A, Frishman D. Edgetic perturbation signatures represent known and novel cancer biomarkers. Sci Rep 2020; 10:4350. [PMID: 32152446 PMCID: PMC7062722 DOI: 10.1038/s41598-020-61422-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/24/2019] [Accepted: 02/20/2020] [Indexed: 02/07/2023]
Abstract
Isoform switching is a recently characterized hallmark of cancer, and often translates to the loss or gain of domains mediating protein interactions and thus to the re-wiring of the interactome. Recent computational tools leverage domain-domain interaction data to resolve the condition-specific interaction networks from RNA-Seq data, accounting for the domain content of the primary transcripts expressed. Here, we used The Cancer Genome Atlas RNA-Seq datasets to generate 642 patient-specific pairs of interactomes corresponding to both the tumor and the healthy tissues across 13 cancer types. The comparison of these interactomes provided a list of patient-specific edgetic perturbations of the interactomes associated with the cancerous state. We found that among the identified perturbations, select sets are robustly shared between patients at the multi-cancer, cancer-specific and cancer sub-type specific levels. Interestingly, the majority of the alterations do not directly involve significantly mutated genes; nevertheless, they strongly correlate with patient survival. The findings (available at EdgeExplorer: http://webclu.bio.wzw.tum.de/EdgeExplorer) are a new source of potential biomarkers for classifying cancer types, and the proteins we identified are potential anti-cancer therapy targets.
Affiliation(s)
- Evans Kataka
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Jan Zaucha
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Goar Frishman
- Institute of Experimental Genetics (IEG), Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
- Andreas Ruepp
- Institute of Experimental Genetics (IEG), Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany
- Dmitrij Frishman
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Laboratory of Bioinformatics, RASA Research Center, St Petersburg State Polytechnic University, St Petersburg, 195251, Russia
|
17
|
Miura S, Vu T, Deng J, Buturla T, Oladeinde O, Choi J, Kumar S. Power and pitfalls of computational methods for inferring clone phylogenies and mutation orders from bulk sequencing data. Sci Rep 2020; 10:3498. [PMID: 32103044 PMCID: PMC7044161 DOI: 10.1038/s41598-020-59006-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/03/2020] [Accepted: 01/23/2020] [Indexed: 12/13/2022]
Abstract
Tumors harbor extensive genetic heterogeneity in the form of distinct clone genotypes that arise over time and across different tissues and regions in cancer. Many computational methods produce clone phylogenies from population bulk sequencing data collected from multiple tumor samples from a patient. These clone phylogenies are used to infer mutation order and clone origins during tumor progression, rendering the selection of the appropriate clonal deconvolution method critical. Surprisingly, the absolute and relative accuracies of these methods in correctly inferring clone phylogenies are yet to be consistently assessed. Therefore, we evaluated the performance of seven computational methods. The accuracy of the reconstructed mutation order and inferred clone groupings varied extensively among methods. All the tested methods showed limited ability to correctly identify ancestral clone sequences present in tumor samples. The presence of copy number alterations, the occurrence of multiple seeding events among tumor sites during metastatic tumor evolution, and extensive intermixture of cancer cells among tumors hindered the detection of clones and the inference of clone phylogenies for all methods tested. Overall, CloneFinder, MACHINA, and LICHeE showed the highest accuracy, but none of the methods performed well for all simulated datasets. We therefore present guidelines for selecting methods for data analysis.
Affiliation(s)
- Sayaka Miura
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Tracy Vu
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Jiamin Deng
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Tiffany Buturla
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Olumide Oladeinde
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Jiyeong Choi
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, 19122, USA
- Department of Biology, Temple University, Philadelphia, PA, 19122, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
18
|
Trevino V. HotSpotAnnotations-a database for hotspot mutations and annotations in cancer. Database (Oxford) 2020; 2020:baaa025. [PMID: 32386297 PMCID: PMC7211031 DOI: 10.1093/database/baaa025] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/06/2019] [Revised: 02/20/2020] [Accepted: 03/11/2020] [Indexed: 12/21/2022]
Abstract
Hotspots, recurrently mutated DNA positions in cancer, are thought to be oncogenic drivers because their recurrence is unlikely to arise by random chance and because of clear examples of oncogenic hotspots in genes like BRAF, IDH1, KRAS and NRAS, among many others. Hotspots are attractive because they provide opportunities for biomedical research and novel treatments. Nevertheless, recent evidence, such as DNA hairpins for APOBEC3A, suggests that a considerable fraction of hotspots seem to be passengers rather than drivers. To document hotspots, the database HotSpotsAnnotations is proposed. For this, a statistical model was implemented to detect putative hotspots, which was applied to TCGA cancer datasets covering 33 cancer types, 10 182 patients and 3 175 929 mutations. Then, genes and hotspots were annotated by two published methods (APOBEC3A hairpins and dN/dS ratio) that may inform and warn researchers about possible false functional hotspots. Moreover, manual annotation from users can be added and shared. Of the 23 198 sites detected as possible hotspots, 4435 were selected after false discovery rate correction and applying a minimum mutation count. Of these, 305 were annotated as likely for APOBEC3A whereas 442 were annotated as unlikely. To date, this is the first database dedicated to annotating hotspots for possible false functional hotspots.
Affiliation(s)
- Victor Trevino
- Tecnologico de Monterrey, Escuela de Medicina, Cátedra de Bioinformática, Morones Prieto No. 3000, Colonia Los Doctores, Monterrey, Nuevo León 64710, Mexico
|
19
|
Pham VVH, Liu L, Bracken CP, Goodall GJ, Long Q, Li J, Le TD. CBNA: A control theory based method for identifying coding and non-coding cancer drivers. PLoS Comput Biol 2019; 15:e1007538. [PMID: 31790386 PMCID: PMC6907873 DOI: 10.1371/journal.pcbi.1007538] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/25/2019] [Revised: 12/12/2019] [Accepted: 11/12/2019] [Indexed: 02/06/2023]
Abstract
A key task in cancer genomics research is to identify cancer driver genes. As these genes initiate and drive the progression of cancer, understanding them is critical in designing effective cancer interventions. Although several methods have been developed to discover cancer drivers, most of them only identify coding drivers. However, non-coding RNAs can regulate driver mutations to develop cancer. Hence, novel methods are required to reveal both coding and non-coding cancer drivers. In this paper, we develop a novel framework named Controllability based Biological Network Analysis (CBNA) to uncover coding and non-coding cancer drivers (i.e. miRNA cancer drivers). CBNA integrates different genomic data types, including gene expression, gene network and mutation data, and consists of a two-stage process: (1) building a network for a condition (e.g. cancer condition) and (2) identifying drivers. The application of CBNA to the BRCA dataset demonstrates that it is more effective than the existing methods in detecting coding cancer drivers. In addition, CBNA also predicts 17 miRNA drivers for breast cancer. Some of these predicted miRNA drivers have been validated by literature and the rest can be good candidates for wet-lab validation. We further use CBNA to detect subtype-specific cancer drivers, and several predicted drivers have been confirmed to be related to breast cancer subtypes. Another application of CBNA is to discover epithelial-mesenchymal transition (EMT) drivers. Of the predicted EMT drivers, 7 coding and 6 miRNA drivers are in the known EMT gene lists.
Cancer is a disease of cells in the human body and it causes a high rate of deaths worldwide. There has been evidence that coding and non-coding RNAs are key players in the initiation and progression of cancer. These coding and non-coding RNAs are considered as cancer drivers. To design better diagnostic and therapeutic plans for cancer patients, we need to know the roles of cancer drivers in cancer development as well as their regulatory mechanisms in the human body. In this study, we propose a novel framework to identify coding and non-coding cancer drivers (i.e. miRNA cancer drivers). The proposed framework is applied to the breast cancer dataset for identifying drivers of breast cancer. Compared with existing methods in predicting coding cancer drivers, our method shows a better performance. Several miRNA cancer drivers predicted by our method have already been validated by literature. The cancer drivers predicted by our method could be a potential source for further wet-lab experiments to discover the causes of cancer. In addition, the proposed method can be used to detect drivers of cancer subtypes and drivers of the epithelial-mesenchymal transition in cancer.
Affiliation(s)
- Vu V. H. Pham
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- Lin Liu
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- Cameron P. Bracken
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, Australia
- Department of Medicine, The University of Adelaide, Adelaide, Australia
- Gregory J. Goodall
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, Australia
- Department of Medicine, The University of Adelaide, Adelaide, Australia
- Qi Long
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Jiuyong Li
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- Thuc D. Le
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
|
20
|
Miura S, Gomez K, Murillo O, Huuki LA, Vu T, Buturla T, Kumar S. Predicting clone genotypes from tumor bulk sequencing of multiple samples. Bioinformatics 2019; 34:4017-4026. [PMID: 29931046 DOI: 10.1093/bioinformatics/bty469] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/03/2017] [Accepted: 06/12/2018] [Indexed: 12/25/2022]
Abstract
Motivation: Analyses of data generated from bulk sequencing of tumors have revealed extensive genomic heterogeneity within patients. Many computational methods have been developed to enable the inference of genotypes of tumor cell populations (clones) from bulk sequencing data. However, the relative and absolute accuracy of available computational methods in estimating clone counts and clone genotypes is not yet known. Results: We have assessed the performance of nine methods, including eight previously-published and one new method (CloneFinder), by analyzing computer simulated datasets. CloneFinder, LICHeE, CITUP and cloneHD inferred clone genotypes with low error (<5% per clone) for a majority of datasets in which the tumor samples contained evolutionarily-related clones. Computational methods did not perform well for datasets in which tumor samples contained mixtures of clones from different clonal lineages. Generally, the number of clones was underestimated by cloneHD and overestimated by PhyloWGS, and BayClone2, Canopy and Clomial required prior information regarding the number of clones. AncesTree and Canopy did not produce results for a large number of datasets. Overall, the deconvolution of clone genotypes from single nucleotide variant (SNV) frequency differences among tumor samples remains challenging, so there is a need to develop more accurate computational methods and robust software for clone genotype inference. Availability and implementation: CloneFinder is implemented in Python and is available from https://github.com/gstecher/CloneFinderAPI. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sayaka Miura
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- Karen Gomez
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Oscar Murillo
- Institute for Genomics and Evolutionary Medicine
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Louise A Huuki
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- Tracy Vu
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- Tiffany Buturla
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine
- Department of Biology, Temple University, Philadelphia, PA, USA
- Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
|
21
|
Sarto Basso R, Hochbaum DS, Vandin F. Efficient algorithms to discover alterations with complementary functional association in cancer. PLoS Comput Biol 2019; 15:e1006802. [PMID: 31120875 PMCID: PMC6550413 DOI: 10.1371/journal.pcbi.1006802] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/28/2018] [Revised: 06/05/2019] [Accepted: 01/17/2019] [Indexed: 12/20/2022]
Abstract
Recent large cancer studies have measured somatic alterations in an unprecedented number of tumours. These large datasets allow the identification of cancer-related sets of genetic alterations by identifying relevant combinatorial patterns. Among such patterns, mutual exclusivity has been employed by several recent methods that have shown its effectiveness in characterizing gene sets associated with cancer. Mutual exclusivity arises because of the complementarity, at the functional level, of alterations in genes which are part of a group (e.g., a pathway) performing a given function. The availability of quantitative target profiles, from genetic perturbations or from clinical phenotypes, provides additional information that can be leveraged to improve the identification of cancer-related gene sets by discovering groups with complementary functional associations with such targets. In this work we study the problem of finding groups of mutually exclusive alterations associated with a quantitative (functional) target. We propose a combinatorial formulation for the problem, and prove that the associated problem is computationally hard. We design two algorithms to solve the problem and implement them in our tool UNCOVER. We provide analytic evidence of the effectiveness of UNCOVER in finding high-quality solutions and show experimentally that UNCOVER finds sets of alterations significantly associated with functional targets in a variety of scenarios. In particular, we show that our algorithms find sets which are better than the ones obtained by the state-of-the-art method, even when sets are evaluated using the statistical score employed by the latter. In addition, our algorithms are much faster than the state-of-the-art, allowing the analysis of large datasets of thousands of target profiles from cancer cell lines. We show that on two such datasets, one from project Achilles and one from the Genomics of Drug Sensitivity in Cancer project, UNCOVER identifies several significant gene sets with complementary functional associations with targets. Software available at: https://github.com/VandinLab/UNCOVER.
Affiliation(s)
- Rebecca Sarto Basso
- Department of Industrial Engineering and Operations Research, University of California at Berkeley, Berkeley, CA, USA
- Dorit S. Hochbaum
- Department of Industrial Engineering and Operations Research, University of California at Berkeley, Berkeley, CA, USA
- Fabio Vandin
- Department of Information Engineering, University of Padova, Padova, Italy
- Department of Computer Science, Brown University, Providence, RI, USA
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
|
22
|
Hajkarim MC, Upfal E, Vandin F. Differentially mutated subnetworks discovery. Algorithms Mol Biol 2019; 14:10. [PMID: 30976291 PMCID: PMC6441493 DOI: 10.1186/s13015-019-0146-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/07/2018] [Accepted: 03/19/2019] [Indexed: 11/30/2022]
Abstract
PROBLEM: We study the problem of identifying differentially mutated subnetworks of a large gene-gene interaction network, that is, subnetworks that display a significant difference in mutation frequency in two sets of cancer samples. We formally define the associated computational problem and show that the problem is NP-hard. ALGORITHM: We propose a novel and efficient algorithm, called DAMOKLE, to identify differentially mutated subnetworks given genome-wide mutation data for two sets of cancer samples. We prove that DAMOKLE identifies subnetworks with statistically significant difference in mutation frequency when the data comes from a reasonable generative model, provided enough samples are available. EXPERIMENTAL RESULTS: We test DAMOKLE on simulated and real data, showing that DAMOKLE does indeed find subnetworks with significant differences in mutation frequency and that it provides novel insights into the molecular mechanisms of the disease not revealed by standard methods.
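The quantity being compared, the difference in mutation frequency of a candidate subnetwork between two sample sets, can be illustrated with a short sketch. It assumes one natural definition: a sample counts as covered if at least one gene of the subnetwork is mutated in it, and the score is the difference between the covered fractions of the two classes. The function names, gene labels and toy data are hypothetical; the search over connected subnetworks and the significance guarantees of DAMOKLE are not reproduced.

```python
def coverage(samples, genes):
    """Fraction of samples with at least one mutated gene in `genes`.

    samples: list of sets, each holding the mutated genes of one sample
    genes  : set of genes forming the candidate subnetwork
    """
    if not samples:
        return 0.0
    covered = sum(1 for mutated in samples if mutated & genes)
    return covered / len(samples)


def differential_coverage(class_c, class_d, genes):
    """Coverage of the subnetwork in class C minus its coverage in class D."""
    return coverage(class_c, genes) - coverage(class_d, genes)


# Toy data with placeholder gene labels G1..G5.
class_c = [{"G1"}, {"G2", "G4"}, {"G3"}, set()]
class_d = [{"G4"}, set(), {"G5"}, {"G5"}]
subnetwork = {"G1", "G2", "G3"}

score = differential_coverage(class_c, class_d, subnetwork)
print(f"differential coverage = {score:.2f}")
```

A full method additionally has to restrict candidates to connected subgraphs of the interaction network and assess whether the observed difference is statistically significant.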
Affiliation(s)
- Eli Upfal
- Department of Computer Science, Brown University, Providence, RI USA
- Fabio Vandin
- Department of Information Engineering, University of Padova, Padova, Italy
|
23
|
Inman GJ, Wang J, Nagano A, Alexandrov LB, Purdie KJ, Taylor RG, Sherwood V, Thomson J, Hogan S, Spender LC, South AP, Stratton M, Chelala C, Harwood CA, Proby CM, Leigh IM. The genomic landscape of cutaneous SCC reveals drivers and a novel azathioprine associated mutational signature. Nat Commun 2018; 9:3667. [PMID: 30202019 PMCID: PMC6131170 DOI: 10.1038/s41467-018-06027-1] [Citation(s) in RCA: 188] [Impact Index Per Article: 31.3] [Received: 07/31/2017] [Accepted: 08/07/2018] [Indexed: 02/07/2023]
Abstract
Cutaneous squamous cell carcinoma (cSCC) has a high tumour mutational burden (50 mutations per megabase DNA pair). Here, we combine whole-exome analyses from 40 primary cSCC tumours, comprising 20 well-differentiated and 20 moderately/poorly differentiated tumours, with accompanying clinical data from a longitudinal study of immunosuppressed and immunocompetent patients and integrate this analysis with independent gene expression studies. We identify commonly mutated genes, copy number changes and altered pathways and processes. Comparisons with tumour differentiation status suggest events which may drive disease progression. Mutational signature analysis reveals the presence of a novel signature (signature 32), whose incidence correlates with chronic exposure to the immunosuppressive drug azathioprine. Characterisation of a panel of 15 cSCC tumour-derived cell lines reveals that they accurately reflect the mutational signatures and genomic alterations of primary tumours and provide a valuable resource for the validation of tumour drivers and therapeutic targets.
Affiliation(s)
- Gareth J Inman
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Jun Wang
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, EC1M 6BQ, UK
- Ai Nagano
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, EC1M 6BQ, UK
- Ludmil B Alexandrov
- Department of Cellular and Molecular Medicine and Department of Bioengineering and Moores Cancer Center, University of California, San Diego, La Jolla, CA, 92093, USA
- Karin J Purdie
- Centre for Cell Biology and Cutaneous Research, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Richard G Taylor
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Victoria Sherwood
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Jason Thomson
- Centre for Cell Biology and Cutaneous Research, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Sarah Hogan
- Centre for Cell Biology and Cutaneous Research, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Lindsay C Spender
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Andrew P South
- Department of Dermatology and Cutaneous Biology, Thomas Jefferson University, Philadelphia, PA, 19107, USA
- Michael Stratton
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton, CB10 1SA, UK
- Claude Chelala
- Centre for Molecular Oncology, Barts Cancer Institute, Queen Mary University of London, London, EC1M 6BQ, UK
- Catherine A Harwood
- Centre for Cell Biology and Cutaneous Research, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Charlotte M Proby
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
- Irene M Leigh
- Division of Cancer Research, Jacqui Wood Cancer Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, UK
|