1
Kacar Z, Slud E, Levy D, Candia J, Budhu A, Forgues M, Wu X, Raziuddin A, Tran B, Shetty J, Pomyen Y, Chaisaingmongkol J, Rabibhadana S, Pupacdi B, Bhudhisawasdi V, Lertprasertsuke N, Auewarakul C, Sangrajrang S, Mahidol C, Ruchirawat M, Wang XW. Characterization of tumor evolution by functional clonality and phylogenetics in hepatocellular carcinoma. Commun Biol 2024; 7:383. [PMID: 38553628 PMCID: PMC11245610 DOI: 10.1038/s42003-024-06040-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 03/11/2024] [Indexed: 04/02/2024] [Open Access]
Abstract
Hepatocellular carcinoma (HCC) is a molecularly heterogeneous solid malignancy, and its fitness may be shaped by how its tumor cells evolve. However, the ability to monitor tumor cell evolution is hampered by the presence of numerous passenger mutations that have no biological consequences. Here we develop a strategy to determine the tumor clonality of three independent HCC cohorts of 524 patients with diverse etiologies and race/ethnicity by utilizing somatic mutations in cancer driver genes. We identify two main types of tumor evolution, linear and non-linear, where the non-linear type can be further divided into two classes, which we call shallow branching and deep branching. We find that linearly evolving HCC is less aggressive than the other types. GTF2IRD2B mutations are enriched in HCC with linear evolution, while TP53 mutations are the most frequent genetic alterations in HCC with non-linear models. Furthermore, we observe significant B cell enrichment in linear trees compared to non-linear trees, suggesting the need for further research to uncover potential variations in immune cell types within genomically determined phylogeny types. These results hint at the possibility that tumor cells and their microenvironment may collectively influence the tumor evolution process.
Affiliation(s)
- Zeynep Kacar
  - Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
  - Department of Mathematics, University of Maryland, College Park, MD, 20742, USA
- Eric Slud
  - Department of Mathematics, University of Maryland, College Park, MD, 20742, USA
- Doron Levy
  - Department of Mathematics, University of Maryland, College Park, MD, 20742, USA
- Julián Candia
  - Longitudinal Studies Section, Translational Gerontology Branch, National Institute on Aging, Baltimore, MD, 21224, USA
- Anuradha Budhu
  - Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
  - Liver Cancer Program, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Marshonna Forgues
  - Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
- Xiaolin Wu
  - Cancer Research Technology Program, Frederick, MD, 21702, USA
- Arati Raziuddin
  - Cancer Research Technology Program, Frederick, MD, 21702, USA
- Bao Tran
  - Cancer Research Technology Program, Frederick, MD, 21702, USA
- Jyoti Shetty
  - Cancer Research Technology Program, Frederick, MD, 21702, USA
- Yotsawat Pomyen
  - Laboratory of Chemical Carcinogenesis, Chulabhorn Research Institute, Bangkok, 10210, Thailand
- Siritida Rabibhadana
  - Laboratory of Chemical Carcinogenesis, Chulabhorn Research Institute, Bangkok, 10210, Thailand
- Benjarath Pupacdi
  - Laboratory of Chemical Carcinogenesis, Chulabhorn Research Institute, Bangkok, 10210, Thailand
- Chirayu Auewarakul
  - Princess Srisavangavadhana College of Medicine, Chulabhorn Royal Academy, Bangkok, 10210, Thailand
- Chulabhorn Mahidol
  - Laboratory of Chemical Carcinogenesis, Chulabhorn Research Institute, Bangkok, 10210, Thailand
- Mathuros Ruchirawat
  - Laboratory of Chemical Carcinogenesis, Chulabhorn Research Institute, Bangkok, 10210, Thailand
  - Center of Excellence on Environmental Health and Toxicology (EHT), OPS, MHESI, Bangkok, Thailand
- Xin Wei Wang
  - Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
  - Liver Cancer Program, Center for Cancer Research, National Cancer Institute, Bethesda, MD, 20892, USA
2
Little P, Hsu L, Sun W. Associating somatic mutation with clinical outcomes through kernel regression and optimal transport. Biometrics 2023; 79:2705-2718. [PMID: 36217816 PMCID: PMC10455040 DOI: 10.1111/biom.13769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Accepted: 09/16/2022] [Indexed: 11/30/2022]
Abstract
Somatic mutations in cancer patients are inherently sparse and potentially high dimensional. Cancer patients may share the same set of deregulated biological processes perturbed by different sets of somatically mutated genes. Therefore, when assessing the associations between somatic mutations and clinical outcomes, gene-by-gene analysis is often under-powered because it does not capture the complex disease mechanisms shared across cancer patients. Rather than testing genes one by one, an intuitive approach is to aggregate somatic mutation data of multiple genes to assess their joint association with clinical outcomes. The challenge is how to aggregate such information. Building on the optimal transport method, we propose a principled approach to estimate the similarity of somatic mutation profiles of multiple genes between tumor samples, while accounting for gene-gene similarities defined by gene annotations or empirical mutational patterns. Using such similarities, we can assess the associations between somatic mutations and clinical outcomes by kernel regression. We have applied our method to analyze somatic mutation data of 17 cancer types and identified at least five cancer types, where somatic mutations are associated with overall survival, progression-free interval, or cytolytic activity.
Affiliation(s)
- Paul Little
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
- Li Hsu
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
- Wei Sun
  - Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
3
Patterson A, Elbasir A, Tian B, Auslander N. Computational Methods Summarizing Mutational Patterns in Cancer: Promise and Limitations for Clinical Applications. Cancers (Basel) 2023; 15:1958. [PMID: 37046619 PMCID: PMC10093138 DOI: 10.3390/cancers15071958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 02/24/2023] [Accepted: 03/09/2023] [Indexed: 03/29/2023] [Open Access]
Abstract
Since the rise of next-generation sequencing technologies, the catalogue of mutations in cancer has been continuously expanding. To address the complexity of the cancer-genomic landscape and extract meaningful insights, numerous computational approaches have been developed over the last two decades. In this review, we survey the current leading computational methods to derive intricate mutational patterns in the context of clinical relevance. We begin with mutation signatures, explaining first how mutation signatures were developed and then examining the utility of studies using mutation signatures to correlate environmental effects on the cancer genome. Next, we examine current clinical research that employs mutation signatures and discuss the potential use cases and challenges of mutation signatures in clinical decision-making. We then examine computational studies developing tools to investigate complex patterns of mutations beyond the context of mutational signatures. We survey methods to identify cancer-driver genes, from single-driver studies to pathway and network analyses. In addition, we review methods inferring complex combinations of mutations for clinical tasks and using mutations integrated with multi-omics data to better predict cancer phenotypes. We examine the use of these tools for either discovery or prediction, including prediction of tumor origin, treatment outcomes, prognosis, and cancer typing. We further discuss the main limitations preventing widespread clinical integration of computational tools for the diagnosis and treatment of cancer. We end by proposing solutions to address these challenges using recent advances in machine learning.
Affiliation(s)
- Andrew Patterson
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - The Wistar Institute, Philadelphia, PA 19104, USA
- Bin Tian
  - The Wistar Institute, Philadelphia, PA 19104, USA
- Noam Auslander
  - The Wistar Institute, Philadelphia, PA 19104, USA
  - Department of Cancer Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
4
Intra-tumor heterogeneity, turnover rate and karyotype space shape susceptibility to missegregation-induced extinction. PLoS Comput Biol 2023; 19:e1010815. [PMID: 36689467 PMCID: PMC9917311 DOI: 10.1371/journal.pcbi.1010815] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/28/2022] [Revised: 02/10/2023] [Accepted: 12/12/2022] [Indexed: 01/24/2023] [Open Access]
Abstract
The phenotypic efficacy of somatic copy number alterations (SCNAs) stems from their incidence per base pair of the genome, which is orders of magnitude greater than that of point mutations. One mitotic event stands out in its potential to significantly change a cell's SCNA burden: a chromosome missegregation. A stochastic model of chromosome missegregations has been previously developed to describe the evolution of SCNAs of a single chromosome type. Building upon this work, we derive a general deterministic framework for modeling missegregations of multiple chromosome types. The framework offers flexibility to model intra-tumor heterogeneity in the SCNAs of all chromosomes, as well as in missegregation and turnover rates. The model can be used to test how selection acts upon coexisting karyotypes over hundreds of generations. We use the model to calculate missegregation-induced population extinction (MIE) curves that separate viable from non-viable populations as a function of their turnover and missegregation rates. Turnover and missegregation rates estimated from scRNA-seq data are then compared to theoretical predictions. We find convergence of theoretical and empirical results in both the location of MIE curves and the necessary conditions for MIE. When a dependency of missegregation rate on karyotype is introduced, karyotypes associated with low missegregation rates act as a stabilizing refuge, rendering MIE impossible unless turnover rates are exceedingly high. Intra-tumor heterogeneity, including heterogeneity in missegregation rates, increases as tumors progress, rendering MIE unlikely.
5
EMeth: An EM algorithm for cell type decomposition based on DNA methylation data. Sci Rep 2021; 11:5717. [PMID: 33707472 PMCID: PMC7952399 DOI: 10.1038/s41598-021-84864-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/15/2020] [Accepted: 02/22/2021] [Indexed: 12/31/2022] [Open Access]
Abstract
We introduce a new computational method named EMeth to estimate cell type proportions using DNA methylation data. EMeth is a reference-based method that requires cell type-specific DNA methylation data from relevant cell types. EMeth improves on the existing reference-based methods by detecting the CpGs whose DNA methylation is inconsistent with the deconvolution model and reducing their contributions to cell type decomposition. Another novel feature of EMeth is that it allows a cell type with known proportions but unknown reference and estimates its methylation. This is motivated by the case of studying methylation in tumor cells, where bulk tumor samples include tumor cells as well as other cell types such as infiltrating immune cells, and the tumor cell proportion can be estimated from copy number data. We demonstrate that EMeth delivers more accurate estimates of cell type proportions than several other methods using simulated data and in silico mixtures. Applications in cancer studies show that the proportions of T regulatory cells estimated by DNA methylation have expected associations with mutation load and survival time, while the estimates from gene expression miss such associations.
6
Zheng X, Amos CI, Frost HR. Cancer prognosis prediction using somatic point mutation and copy number variation data: a comparison of gene-level and pathway-based models. BMC Bioinformatics 2020; 21:467. [PMID: 33081688 PMCID: PMC7574407 DOI: 10.1186/s12859-020-03791-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/15/2020] [Accepted: 09/30/2020] [Indexed: 01/11/2023] [Open Access]
Abstract
BACKGROUND: Genomic profiling of solid human tumors by projects such as The Cancer Genome Atlas (TCGA) has provided important information regarding the somatic alterations that drive cancer progression and patient survival. Although researchers have successfully leveraged TCGA data to build prognostic models, most efforts have focused on specific cancer types and a targeted set of gene-level predictors. Less is known about the prognostic ability of pathway-level variables in a pan-cancer setting. To address these limitations, we systematically evaluated and compared the prognostic ability of somatic point mutation (SPM) and copy number variation (CNV) data, gene-level and pathway-level models for a diverse set of TCGA cancer types and predictive modeling approaches.
RESULTS: We evaluated gene-level and pathway-level penalized Cox proportional hazards models using SPM and CNV data for 29 different TCGA cohorts. We measured predictive accuracy as the concordance index for predicting survival outcomes. Our comprehensive analysis suggests that the use of pathway-level predictors did not offer superior predictive power relative to gene-level models for all cancer types but had the advantages of robustness and parsimony. We identified a set of cohorts for which somatic alterations could not predict prognosis, and a unique cohort, LGG, for which SPM data were more predictive than CNV data and predictive accuracy was good for all model types. We found that the pathway-level predictors provide superior interpretative value and that there is often a serious collinearity issue for the gene-level models, while pathway-level models avoided this issue.
CONCLUSION: Our comprehensive analysis suggests that when using somatic alteration data for cancer prognosis prediction, pathway-level models are more interpretable, stable and parsimonious compared to gene-level models. Pathway-level models also avoid the issue of collinearity, which can be serious for gene-level somatic alterations. The prognostic power of somatic alterations is highly variable across different cancer types and we have identified a set of cohorts for which somatic alterations could not predict prognosis. In general, CNV data predict prognosis better than SPM data, with the exception of the LGG cohort.
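The concordance index named in this abstract as the accuracy measure can be illustrated with a minimal sketch. It computes Harrell's C for right-censored data in a simplified form (pairs with tied observed times are skipped). The function name and the toy data are illustrative assumptions and do not come from the paper; the risk scores stand in for the linear predictor of a fitted Cox model.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times       -- observed follow-up times
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier time is an observed event.
        # Tied times are skipped here for simplicity (an assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier failure had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: four patients, the third one censored at time 15.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_scores)
```

In the toy data, five pairs are comparable and four of them are ranked correctly, so `toy_c` is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.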
Affiliation(s)
- Xingyu Zheng
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
- Christopher I Amos
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
  - Department of Medicine, Institute for Clinical and Translational Research, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX, 77030, USA
- H Robert Frost
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, 03755, USA
7
Sun W, Jin C, Gelfond JA, Chen MH, Ibrahim JG. Joint analysis of single-cell and bulk tissue sequencing data to infer intratumor heterogeneity. Biometrics 2019; 76:983-994. [PMID: 31813161 DOI: 10.1111/biom.13198] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/09/2018] [Revised: 10/23/2019] [Accepted: 11/25/2019] [Indexed: 11/28/2022]
Abstract
Many computational methods have been developed to discern intratumor heterogeneity (ITH) using DNA sequence data from bulk tumor samples. These methods share an assumption that two mutations arise from the same subclone if they have similar mutant allele-frequencies (MAFs), and thus it is difficult or impossible to distinguish two subclones with similar MAFs. Single-cell DNA sequencing (scDNA-seq) data can be very informative for ITH inference. However, due to the difficulty of DNA amplification, scDNA-seq data are often very noisy. A promising new study design is to collect both bulk and single-cell DNA-seq data and jointly analyze them to mitigate the limitations of each data type. To address the analytic challenges of this new study design, we propose a computational method named BaSiC (Bulk tumor and Single Cell), to discern ITH by jointly analyzing DNA-seq data from bulk tumor and single cells. We demonstrate that BaSiC has comparable or better performance than the methods using either data type. We further evaluate BaSiC using bulk tumor and single-cell DNA-seq data from a breast cancer patient and several leukemia patients.
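The limitation described in this abstract, that two subclones with similar MAFs cannot be told apart from bulk data, can be illustrated with a small sketch. It assumes a heterozygous mutation at a copy-neutral diploid locus, so the expected MAF is purity times the fraction of tumor cells carrying the mutation, divided by two. The function name and the numbers are illustrative assumptions and are not part of BaSiC.

```python
def expected_maf(purity: float, cell_fraction: float) -> float:
    """Expected mutant allele frequency in a bulk tumor sample.

    Assumes a heterozygous mutation at a copy-neutral diploid locus:
    each carrying tumor cell contributes 1 mutant allele out of 2,
    and normal cells contribute 0 mutant alleles out of 2.
    """
    if not (0.0 <= purity <= 1.0 and 0.0 <= cell_fraction <= 1.0):
        raise ValueError("purity and cell_fraction must lie in [0, 1]")
    return purity * cell_fraction / 2.0


purity = 0.8                               # hypothetical tumor purity
maf_founder = expected_maf(purity, 1.0)    # mutation present in all tumor cells
maf_sub_a = expected_maf(purity, 0.30)     # subclone A: 30% of tumor cells
maf_sub_b = expected_maf(purity, 0.30)     # sibling subclone B: also 30%
```

Here `maf_sub_a` and `maf_sub_b` are identical even though A and B may be distinct sibling subclones, which is why MAF clustering on bulk data alone would merge them and why single-cell data can help.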
Affiliation(s)
- Wei Sun
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Chong Jin
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Jonathan A Gelfond
  - Department of Epidemiology and Biostatistics, UT Health Science Center, San Antonio, Texas
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, Storrs, Connecticut
- Joseph G Ibrahim
  - Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina