1. Dhanushkumar T, M E S, Selvam PK, Rambabu M, Dasegowda KR, Vasudevan K, George Priya Doss C. Advancements and hurdles in the development of a vaccine for triple-negative breast cancer: A comprehensive review of multi-omics and immunomics strategies. Life Sci 2024; 337:122360. [PMID: 38135117 DOI: 10.1016/j.lfs.2023.122360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 12/15/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023]
Abstract
Triple-Negative Breast Cancer (TNBC) presents a significant challenge in oncology due to its aggressive behavior and limited therapeutic options. This review explores the potential of immunotherapy, particularly vaccine-based approaches, in addressing TNBC. It delves into the role of immunoinformatics in creating effective vaccines against TNBC. The review first underscores the distinct attributes of TNBC and the importance of tumor antigens in vaccine development. It then elaborates on antigen detection techniques such as exome sequencing, HLA typing, and RNA sequencing, which are instrumental in identifying TNBC-specific antigens and selecting vaccine candidates. The discussion then shifts to the in-silico vaccine development process, encompassing antigen selection, epitope prediction, and rational vaccine design. This process merges computational simulations with immunological insights. The role of Artificial Intelligence (AI) in expediting the prediction of antigens and epitopes is also emphasized. The review concludes by encapsulating how Immunoinformatics can augment the design of TNBC vaccines, integrating tumor antigens, advanced detection methods, in-silico strategies, and AI-driven insights to advance TNBC immunotherapy. This could potentially pave the way for more targeted and efficacious treatments.
Affiliation(s)
- T Dhanushkumar
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Santhosh M E
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Prasanna Kumar Selvam
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Majji Rambabu
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- K R Dasegowda
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Karthick Vasudevan
- Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India.
- C George Priya Doss
- Laboratory of Integrative Genomics, Department of Integrative Biology, School of BioSciences and Technology, Vellore Institute of Technology (VIT), Vellore, India.
2. Van Loon K, Mmbaga EJ, Mushi BP, Selekwa M, Mwanga A, Akoko LO, Mwaiselage J, Mosha I, Ng DL, Wu W, Silverstein J, Mulima G, Kaimila B, Gopal S, Snell JM, Benz SC, Vaske C, Sanborn Z, Sedgewick AJ, Radenbaugh A, Newton Y, Collisson EA. A Genomic Analysis of Esophageal Squamous Cell Carcinoma in Eastern Africa. Cancer Epidemiol Biomarkers Prev 2023; 32:1411-1420. [PMID: 37505926 DOI: 10.1158/1055-9965.epi-22-0775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 04/19/2023] [Accepted: 07/26/2023] [Indexed: 07/30/2023] Open
Abstract
BACKGROUND: Esophageal squamous cell carcinoma (ESCC) comprises 90% of all esophageal cancer cases globally and is the most common histology in low-resource settings. Eastern Africa has a disproportionately high incidence of ESCC. METHODS: We describe the genomic profiles of 61 ESCC cases from Tanzania and compare them to profiles from an existing cohort of ESCC cases from Malawi. We also provide a comparison to ESCC tumors in The Cancer Genome Atlas (TCGA). RESULTS: We observed substantial transcriptional overlap with other squamous histologies via comparison with the TCGA PanCan dataset. DNA analysis revealed known mutational patterns, both genome-wide and in genes known to be commonly mutated in ESCC. TP53 mutations were the most common somatic mutation in tumors from both Tanzania and Malawi but were detected at lower frequencies than previously reported in ESCC cases from other settings. In a combined analysis, two unique transcriptional clusters were identified: a proliferative/epithelial cluster and an invasive/migrative/mesenchymal cluster. Mutational signature analysis of the Tanzanian cohort revealed common signatures associated with aging and cytidine deaminase activity (APOBEC) and an absence of signature 29, which was previously reported in the Malawi cohort. CONCLUSIONS: This study defines the molecular characteristics of ESCC in Tanzania and enriches the Eastern African dataset, with findings of overall similarities but also some heterogeneity across two unique sites. IMPACT: Despite a high burden of ESCC in Eastern Africa, investigations into the genomics in this region are nascent. This represents the largest comprehensive genomic analysis of ESCC from sub-Saharan Africa to date.
Affiliation(s)
- Katherine Van Loon
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, California
- Elia J Mmbaga
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Beatrice P Mushi
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Msiba Selekwa
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Ally Mwanga
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Larry O Akoko
- Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Dianna L Ng
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, California
- Wei Wu
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, California
- Jordyn Silverstein
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, California
- Satish Gopal
- UNC Project-Malawi, Lilongwe, Malawi
- University of North Carolina, Chapel Hill, North Carolina
- Jeff M Snell
- University of North Carolina, Chapel Hill, North Carolina
- Zack Sanborn
- NantOmics/NantHealth, Inc., El Segundo, California
- Yulia Newton
- NantOmics/NantHealth, Inc., El Segundo, California
- Eric A Collisson
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, California
3. Zhang B, Bassani-Sternberg M. Current perspectives on mass spectrometry-based immunopeptidomics: the computational angle to tumor antigen discovery. J Immunother Cancer 2023; 11:e007073. [PMID: 37899131 PMCID: PMC10619091 DOI: 10.1136/jitc-2023-007073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/21/2023] [Indexed: 10/31/2023] Open
Abstract
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate DNA, RNA, and ribosomal sequencing data analysis, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advancements in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Affiliation(s)
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
4. Yang X, Xu X, Breuss MW, Antaki D, Ball LL, Chung C, Shen J, Li C, George RD, Wang Y, Bae T, Cheng Y, Abyzov A, Wei L, Alexandrov LB, Sebat JL, Gleeson JG. Control-independent mosaic single nucleotide variant detection with DeepMosaic. Nat Biotechnol 2023; 41:870-877. [PMID: 36593400 PMCID: PMC10314968 DOI: 10.1038/s41587-022-01559-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/13/2020] [Accepted: 10/10/2022] [Indexed: 01/04/2023]
Abstract
Mosaic variants (MVs) reflect mutagenic processes during embryonic development and environmental exposure, accumulate with aging and underlie diseases such as cancer and autism. The detection of noncancer MVs has been computationally challenging due to the sparse representation of nonclonally expanded MVs. Here we present DeepMosaic, combining an image-based visualization module for single nucleotide MVs and a convolutional neural network-based classification module for control-independent MV detection. DeepMosaic was trained on 180,000 simulated or experimentally assessed MVs, and was benchmarked on 619,740 simulated MVs and 530 independent biologically tested MVs from 16 genomes and 181 exomes. DeepMosaic achieved higher accuracy compared with existing methods on biological data, with a sensitivity of 0.78, specificity of 0.83 and positive predictive value of 0.96 on noncancer whole-genome sequencing data, as well as doubling the validation rate over previous best-practice methods on noncancer whole-exome sequencing data (0.43 versus 0.18). DeepMosaic represents an accurate MV classifier for noncancer samples that can be implemented as an alternative or complement to existing methods.
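The sensitivity, specificity, and positive predictive value quoted in this abstract are standard confusion-matrix ratios. The sketch below shows how such figures are computed from validation counts. The counts are hypothetical and chosen only for illustration; they are not taken from the DeepMosaic benchmark, and the function names are not part of any DeepMosaic interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Calls compared against a validated truth set (hypothetical values)."""
    tp: int  # true mosaic variants that were called
    fp: int  # calls that failed validation
    tn: int  # non-variant candidates correctly rejected
    fn: int  # true mosaic variants that were missed


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, refusing to report a metric that is undefined."""
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true variants that were detected."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-variants that were correctly rejected."""
    return _ratio(c.tn, c.tn + c.fp)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of positive calls that are true variants."""
    return _ratio(c.tp, c.tp + c.fp)


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(sensitivity(example), specificity(example), positive_predictive_value(example))
```

Positive predictive value depends on how common true variants are among the candidates, so it can be high even when sensitivity and specificity are moderate.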
Affiliation(s)
- Xiaoxu Yang
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA.
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA.
- Xin Xu
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Martin W Breuss
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Department of Pediatrics, Section of Genetics and Metabolism, University of Colorado School of Medicine, Aurora, CO, USA
- Danny Antaki
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Laurel L Ball
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Changuk Chung
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Jiawei Shen
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Chen Li
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Renee D George
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Yifan Wang
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
- Taejeong Bae
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
- Yuhe Cheng
- Department of Cellular and Molecular Medicine, UC San Diego, La Jolla, CA, USA
- Department of Bioengineering, UC San Diego, La Jolla, CA, USA
- Moores Cancer Center, UC San Diego, La Jolla, CA, USA
- Alexej Abyzov
- Department of Quantitative Health Sciences, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA
- Liping Wei
- Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Ludmil B Alexandrov
- Department of Cellular and Molecular Medicine, UC San Diego, La Jolla, CA, USA
- Department of Bioengineering, UC San Diego, La Jolla, CA, USA
- Moores Cancer Center, UC San Diego, La Jolla, CA, USA
- Jonathan L Sebat
- Beyster Center for Genomics of Psychiatric Diseases, University of California, San Diego, La Jolla, CA, USA
- Department of Psychiatry, University of California, San Diego, La Jolla, CA, USA
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- Joseph G Gleeson
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA.
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA.
5. Vaisband M, Schubert M, Gassner FJ, Geisberger R, Greil R, Zaborsky N, Hasenauer J. Validation of genetic variants from NGS data using deep convolutional neural networks. BMC Bioinformatics 2023; 24:158. [PMID: 37081386 PMCID: PMC10116675 DOI: 10.1186/s12859-023-05255-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023] Open
Abstract
Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach resting on a Convolutional Neural Network, trained using existing human annotation. In contrast to existing approaches, we introduce a way in which contextual data from sequencing tracks can be included in the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.
Affiliation(s)
- Marc Vaisband
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria.
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany.
- Maria Schubert
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Franz Josef Gassner
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Roland Geisberger
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Richard Greil
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Nadja Zaborsky
- Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria
- Jan Hasenauer
- Life and Medical Sciences Institute, University of Bonn, Bonn, Germany
6. Azim R, Wang S, Dipu SA, Islam N, Ala Muid MR, Elahe MF. A patient-specific functional module and path identification technique from RNA-seq data. Comput Biol Med 2023; 158:106871. [PMID: 37030265 DOI: 10.1016/j.compbiomed.2023.106871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 02/12/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
With the advancement of new technologies, a huge amount of high-dimensional data is being generated, opening new opportunities and challenges for the study of cancer and other diseases. In particular, it is necessary to distinguish the patient-specific key components and modules that drive tumorigenesis. A complex disease generally does not originate from the dysregulation of a single component; rather, it results from the dysfunction of a group of components and networks that differs from patient to patient. A patient-specific network is therefore required to understand the disease and its molecular mechanism. We address this requirement by constructing a patient-specific network using sample-specific network theory, integrating cancer-specific differentially expressed genes and elite genes. By elucidating patient-specific networks, the method can identify regulatory modules, driver genes, and personalized disease networks, which can lead to personalized drug design. This method can provide insight into how genes are associated with each other and can characterize patient-specific disease subtypes. The results show that this method can be beneficial for detecting patient-specific differential modules and interactions between genes. Extensive analysis using existing literature, gene enrichment, and survival analysis for three cancer types (STAD, PAAD, and LUAD) shows the effectiveness of this method over other existing methods. In addition, this method can be useful for personalized therapeutics and drug design. This methodology is implemented in the R language and is available at https://github.com/riasatazim/PatientSpecificRNANetwork.
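The abstract names sample-specific network theory without giving a formula. As a rough illustration, the sketch below uses one commonly cited form of that idea, which is an assumption here and not taken from the source: for a gene pair, compute the Pearson correlation over n reference samples, recompute it after adding the single patient sample, and scale the change by (1 - r^2)/(n - 1). The function names and the expression values are hypothetical, and the authors' R implementation may differ.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def edge_z_score(ref_a: Sequence[float], ref_b: Sequence[float],
                 patient_a: float, patient_b: float) -> float:
    """Assumed sample-specific edge statistic for one gene pair.

    z = (r_perturbed - r_reference) / ((1 - r_reference**2) / (n - 1))
    where n is the number of reference samples.
    """
    n = len(ref_a)
    r_ref = pearson(ref_a, ref_b)
    r_pert = pearson(list(ref_a) + [patient_a], list(ref_b) + [patient_b])
    scale = (1 - r_ref ** 2) / (n - 1)
    if scale == 0:
        raise ValueError("reference correlation is exactly +/-1; statistic undefined")
    return (r_pert - r_ref) / scale


# Hypothetical expression values for two genes across six reference samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

z_on = edge_z_score(gene_a, gene_b, 3.5, 3.5)   # patient follows the reference trend
z_off = edge_z_score(gene_a, gene_b, 1.0, 6.0)  # patient breaks the reference trend
print(z_on, z_off)
```

A patient whose expression breaks the reference trend produces a much larger statistic in magnitude, so thresholding it would keep that edge in the patient-specific network.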
7. The Somatic Mutation Landscape of UDP-Glycosyltransferase (UGT) Genes in Human Cancers. Cancers (Basel) 2022; 14:cancers14225708. [PMID: 36428799 PMCID: PMC9688768 DOI: 10.3390/cancers14225708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 11/16/2022] [Accepted: 11/18/2022] [Indexed: 11/23/2022] Open
Abstract
The human UDP-glycosyltransferase (UGT) superfamily has a critical role in the metabolism of anticancer drugs and numerous pro/anti-cancer molecules (e.g., steroids, lipids, fatty acids, bile acids and carcinogens). Recent studies have shown wide and abundant expression of UGT genes in human cancers. However, the extent to which UGT genes acquire somatic mutations within tumors remains to be systematically investigated. In the present study, our comprehensive analysis of the somatic mutation profiles of 10,069 tumors from 33 different TCGA cancer types identified 3427 somatic mutations in UGT genes. Overall, nearly 18% (1802/10,069) of the assessed tumors had mutations in UGT genes, with wide variation in mutation frequency across cancer types, ranging from over 25% in five cancers (COAD, LUAD, LUSC, SKCM and UCSC) to less than 5% in eight cancers (LAML, MESO, PCPG, PAAD, PRAD, TGCT, THYM and UVM). All 22 UGT genes showed somatic mutations in tumors, with UGT2B4, UGT3A1 and UGT3A2 showing the largest number of mutations (289, 307 and 255 mutations, respectively). Nearly 65% (2260/3427) of the mutations were missense, frame-shift and nonsense mutations that are predicted to encode variant UGT proteins. Furthermore, about 10% (362/3427) of the mutations occurred in non-coding regions (5' UTR, 3' UTR and splice sites) and may alter the efficiency of translation initiation, miRNA regulation or the splicing of UGT transcripts. In conclusion, our data show widespread somatic mutations of UGT genes in human cancers that may affect the capacity of cancer cells to metabolize anticancer drugs and endobiotics that control pro/anti-cancer signaling pathways. This highlights their potential utility as biomarkers for predicting therapeutic efficacy and clinical outcomes.
8. Zhang X, Zhou Y, Shi Z, Liu Z, Chen H, Wang X, Cheng Y, Xi L, Li X, Zhang C, Bao L, Xuan C. Integrated analysis of genes encoding ATP-dependent chromatin remodellers identifies CHD7 as a potential target for colorectal cancer therapy. Clin Transl Med 2022; 12:e953. [PMID: 35789070 PMCID: PMC9254903 DOI: 10.1002/ctm2.953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 06/09/2022] [Accepted: 06/15/2022] [Indexed: 11/08/2022] Open
Abstract
BACKGROUND: Genes participating in chromatin organization and regulation are frequently mutated or dysregulated in cancers. ATP-dependent chromatin remodelers (ATPCRs) play a key role in organizing genomic DNA within chromatin, thereby regulating gene expression. The oncogenic role of ATPCRs and the mechanisms involved remain unclear. METHODS: We analyzed the genomic and transcriptional aberrations of the genes encoding ATPCRs in The Cancer Genome Atlas (TCGA) cohort. A series of cellular experiments and tumor-bearing mouse experiments were conducted to reveal the regulatory function of CHD7 in the growth of colorectal cancer cells. RNA-seq and ATAC-seq approaches together with ChIP assays were performed to elucidate the downstream targets and the molecular mechanisms. RESULTS: Our data showed that many ATPCRs exhibited a high frequency of somatic copy number alterations, widespread somatic mutations, remarkable expression abnormalities, and significant correlation with overall survival, suggesting several somatic driver candidates, including chromodomain helicase DNA-binding protein 7 (CHD7), in colorectal cancer. We experimentally demonstrated that CHD7 promotes the growth of colorectal cancer cells in vitro and in vivo. CHD7 can bind to the promoters of target genes to maintain chromatin accessibility and facilitate transcription. We found that CHD7 knockdown downregulates AK4 expression and activates AMPK phosphorylation, thereby promoting the phosphorylation and stability of p53 and leading to inhibition of colorectal cancer growth. Our multi-omics analyses of ATPCRs across large-scale cancer specimens identified potential therapeutic targets, and our experimental studies revealed a novel CHD7-AK4-AMPK-p53 axis that plays an oncogenic role in colorectal cancer.
Affiliation(s)
- Xingyan Zhang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Yaoyao Zhou
- Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Key Laboratory of Breast Cancer Prevention and Therapy, Tianjin Medical University, Ministry of Education, Tianjin, China
- Zhenyu Shi
- Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Key Laboratory of Breast Cancer Prevention and Therapy, Tianjin Medical University, Ministry of Education, Tianjin, China
- Zhenfeng Liu
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Hao Chen
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Xiaochen Wang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Yiming Cheng
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Lishan Xi
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Xuanyuan Li
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
- Chunze Zhang
- Tianjin Institute of Coloproctology, Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
- Li Bao
- Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Key Laboratory of Breast Cancer Prevention and Therapy, Tianjin Medical University, Ministry of Education, Tianjin, China
- Chenghao Xuan
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Department of Biochemistry and Molecular Biology, Tianjin Medical University, Tianjin, China
9. Park B, Lee W, Han K. A New Approach to Deriving Prognostic Gene Pairs From Cancer Patient-Specific Gene Correlation Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1267-1276. [PMID: 32809942 DOI: 10.1109/tcbb.2020.3017209] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/11/2023]
Abstract
Many of the known prognostic gene signatures for cancer are individual genes or combinations of genes found by the analysis of microarray data. However, many of the known cancer signatures are less predictive than random gene expression signatures, and such random signatures are significantly associated with proliferation genes. With the availability of RNA-seq gene expression data for thousands of human cancer patients, we have analyzed RNA-seq and clinical data of cancer patients and constructed gene correlation networks specific to individual cancer patients. From the patient-specific gene correlation networks, we derived prognostic gene pairs for three types of cancer. In this paper, we propose a new method for inferring prognostic gene pairs from patient-specific gene correlation networks. Our method differs from previous ones in three main ways: (1) it focuses on finding prognostic gene pairs rather than prognostic genes, (2) it can identify prognostic gene pairs from RNA-seq data even when no significant prognostic genes exist, and (3) its prognostic gene pairs can serve as robust prognostic biomarkers, in the sense that most of them show little association with proliferation genes, the major boosting factor of the predictive power of random gene signatures. Evaluation of our method with extensive data of three types of cancer (liver cancer, pancreatic cancer, and stomach cancer) showed that our approach is general and that gene pairs can serve as more reliable prognostic signatures for cancer than genes. Analysis of patient-specific gene networks suggests that the prognosis of individual cancer patients is affected by the existence of prognostic gene pairs in the patient-specific network and by the size of the patient-specific network. Although preliminary, our approach will be useful for finding gene pairs to predict survival time of patients and to tailor treatments to individual characteristics.
The program for dynamically constructing patient-specific gene networks and for finding prognostic gene pairs is available at http://bclab.inha.ac.kr/LPS.
10. Parvandeh S, Donehower LA, Katsonis P, Hsu TK, Asmussen J, Lee K, Lichtarge O. EPIMUTESTR: a nearest neighbor machine learning approach to predict cancer driver genes from the evolutionary action of coding variants. Nucleic Acids Res 2022; 50:e70. [PMID: 35412634 PMCID: PMC9262594 DOI: 10.1093/nar/gkac215] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/28/2021] [Revised: 03/17/2022] [Accepted: 03/21/2022] [Indexed: 02/01/2023] Open
Abstract
Discovering rare cancer driver genes is difficult because their mutational frequency is too low for statistical detection by computational methods. EPIMUTESTR is an integrative nearest-neighbor machine learning algorithm that identifies such marginal genes by modeling the fitness of their mutations with the phylogenetic Evolutionary Action (EA) score. Over cohorts of sequenced patients from The Cancer Genome Atlas representing 33 tumor types, EPIMUTESTR detected 214 previously inferred cancer driver genes and 137 new candidates never before identified computationally, of which seven are supported in the COSMIC Cancer Gene Census. EPIMUTESTR achieved better robustness and specificity than existing methods across a number of benchmark methods and datasets.
Affiliation(s)
- Saeid Parvandeh
  - To whom correspondence should be addressed. Tel: +1 713 798 7677
- Lawrence A Donehower
  - Department of Molecular Virology and Microbiology, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Teng-Kuei Hsu
  - Department of Biochemistry & Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Jennifer K Asmussen
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Kwanghyuk Lee
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Olivier Lichtarge
  - Correspondence may also be addressed to Olivier Lichtarge. Tel: +1 713 798 5646
11
Lang F, Schrörs B, Löwer M, Türeci Ö, Sahin U. Identification of neoantigens for individualized therapeutic cancer vaccines. Nat Rev Drug Discov 2022; 21:261-282. [PMID: 35105974 PMCID: PMC7612664 DOI: 10.1038/s41573-021-00387-y] [Citation(s) in RCA: 163] [Impact Index Per Article: 81.5] [Accepted: 12/13/2021] [Indexed: 02/07/2023]
Abstract
Somatic mutations in cancer cells can generate tumour-specific neoepitopes, which are recognized by autologous T cells in the host. As neoepitopes are not subject to central immune tolerance and are not expressed in healthy tissues, they are attractive targets for therapeutic cancer vaccines. Because the vast majority of cancer mutations are unique to the individual patient, harnessing the full potential of this rich source of targets requires individualized treatment approaches. Many computational algorithms and machine-learning tools have been developed to identify mutations in sequence data, to prioritize those that are more likely to be recognized by T cells and to design tailored vaccines for every patient. In this Review, we fill the gaps between the understanding of basic mechanisms of T cell recognition of neoantigens and the computational approaches for discovery of somatic mutations and neoantigen prediction for cancer immunotherapy. We present a new classification of neoantigens, distinguishing between guarding, restrained and ignored neoantigens, based on how they confer proficient antitumour immunity in a given clinical context. Such context-based differentiation will contribute to a framework that connects neoantigen biology to the clinical setting and medical peculiarities of cancer, and will enable future neoantigen-based therapies to provide greater clinical benefit.
Affiliation(s)
- Franziska Lang
  - TRON Translational Oncology, Mainz, Germany
  - Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
- Ugur Sahin
  - BioNTech, Mainz, Germany
  - University Medical Center, Johannes Gutenberg University, Mainz, Germany
12
Systematic illumination of druggable genes in cancer genomes. Cell Rep 2022; 38:110400. [PMID: 35196490 PMCID: PMC8919705 DOI: 10.1016/j.celrep.2022.110400] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/24/2021] [Revised: 09/12/2021] [Accepted: 01/26/2022] [Indexed: 01/15/2023]
Abstract
By combining 6 druggable genome resources, we identify 6,083 genes as potential druggable genes (PDGs). We characterize their expression, recurrent genomic alterations, cancer dependencies, and therapeutic potentials by integrating genome, functionome, and druggome profiles across cancers. 81.5% of PDGs are reliably expressed in major adult cancers, 46.9% show selective expression patterns, and 39.1% exhibit at least one recurrent genomic alteration. We annotate a total of 784 PDGs as dependent genes for cancer cell growth. We further quantify 16 cancer-related features and estimate a PDG cancer drug target score (PCDT score). PDGs with higher PCDT scores are significantly enriched for genes encoding kinases and histone modification enzymes. Importantly, we find that a considerable portion of high PCDT score PDGs are understudied genes, providing unexplored opportunities for drug development in oncology. By integrating the druggable genome and the cancer genome, our study thus generates a comprehensive blueprint of potential druggable genes across cancers. Jiang et al. generate a comprehensive blueprint of potential druggable genes (PDGs) across cancers by a systematic integration of the druggable genome and the cancer genome. This resource is publicly available to the cancer research community in The Cancer Druggable Gene Atlas (TCDA) through the Functional Cancer Genome data portal.
13
Wang TY, Yang R. Detecting Medium and Large Insertions and Deletions with transIndel. Methods Mol Biol 2022; 2493:67-75. [PMID: 35751809 DOI: 10.1007/978-1-0716-2293-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Insertions and deletions (indels) are primarily detected from DNA sequencing (DNA-seq) data, but their transcriptional consequences remain unexplored due to challenges in distinguishing medium- and large-sized indels from RNA splicing events in RNA-seq data. We introduce transIndel, a splice-aware algorithm that parses the chimeric alignments predicted by a short read aligner and reconstructs the mid-sized insertions and large deletions based on the linear alignments of split reads from DNA-seq or RNA-seq data. Here, we describe the method and provide a tutorial on the installation and application of transIndel.
Affiliation(s)
- Ting-You Wang
  - The Hormel Institute, University of Minnesota, Austin, MN, USA
- Rendong Yang
  - The Hormel Institute, University of Minnesota, Austin, MN, USA
14
Chang TC, Xu K, Cheng Z, Wu G. Somatic and Germline Variant Calling from Next-Generation Sequencing Data. Advances in Experimental Medicine and Biology 2022; 1361:37-54. [DOI: 10.1007/978-3-030-91836-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
15
Magne N, Rousseau V, Duarte K, Poëa-Guyon S, Gleize V, Mutel A, Schmitt C, Castel H, Idbaih A, Huillard E, Sanson M, Barnier JV. PAK3 is a key signature gene of the glioma proneural subtype and affects its proliferation, differentiation and growth. Cell Oncol (Dordr) 2021; 44:1257-1271. [PMID: 34550532 DOI: 10.1007/s13402-021-00635-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 09/01/2021] [Indexed: 02/06/2023]
Abstract
PURPOSE: Gliomas are the most lethal adult primary brain cancers. Recent advances in their molecular characterization have contributed to a better understanding of their pathophysiology, but there is still a need to identify key genes controlling glioma cell proliferation and differentiation. The p21-activated kinases PAK1 and PAK2 play essential roles in cell division and brain development and are well-known oncogenes. In contrast, the role of PAK3 in cancer is poorly understood. It is known, however, that this gene is involved in brain ontogenesis and has been identified as a gene of the proneural subtype signature in glioblastomas.
METHODS: To better understand the role of PAK kinases in the pathophysiology of gliomas, we conducted expression analyses by querying multiple gene expression databases and analyzing primary human glioma samples. We next studied PAK3 expression upon differentiation in patient-derived cell lines (PDCLs) and the effects of PAK3 inhibition by lentiviral-mediated shRNA on glioma cell proliferation, differentiation and tumor growth.
RESULTS: We show that contrary to PAK1 and PAK2, high PAK3 expression positively correlates with a longer survival of glioma patients. We also found that PAK3 displays differential expression patterns between glioma sub-groups with a higher expression in 1p/19q-codeleted oligodendrogliomas, and is highly expressed in tumors and PDCLs of the proneural subtype. In PDCLs, high PAK3 expression negatively correlated with proliferation and positively correlated with neuronal differentiation. Inhibition of PAK3 expression increased PDCL proliferation and glioma tumor growth in nude mice.
CONCLUSIONS: Our results indicate that PAK3 plays a unique role among PAKs in glioma development and may represent a potential therapeutic target.
Affiliation(s)
- Nathalie Magne
  - Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, 91190, Gif-sur-Yvette, France
- Véronique Rousseau
  - Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, 91190, Gif-sur-Yvette, France
- Kévin Duarte
  - Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, 91190, Gif-sur-Yvette, France
- Sandrine Poëa-Guyon
  - Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, 91190, Gif-sur-Yvette, France
- Vincent Gleize
  - Sorbonne Université, Inserm, CNRS, UMR S 1127, Institut du Cerveau, ICM, AP-HP, Hôpitaux Universitaires La Pitié Salpêtrière - Charles Foix, Service de Neurologie 2-Mazarin, 75013, Paris, France
- Alexandre Mutel
  - Normandie Univ, UNIROUEN, INSERM, U1239, Laboratoire Différenciation Et Communication Neuronale Et Neuroendocrine, Institut de Recherche Et D'Innovation Biomédicale de Normandie, 76000, Rouen, France
- Charlotte Schmitt
  - Sorbonne Université, Inserm, CNRS, UMR S 1127, Institut du Cerveau, ICM, AP-HP, Hôpitaux Universitaires La Pitié Salpêtrière - Charles Foix, Service de Neurologie 2-Mazarin, 75013, Paris, France
- Hélène Castel
  - Normandie Univ, UNIROUEN, INSERM, U1239, Laboratoire Différenciation Et Communication Neuronale Et Neuroendocrine, Institut de Recherche Et D'Innovation Biomédicale de Normandie, 76000, Rouen, France
- Ahmed Idbaih
  - Sorbonne Université, Inserm, CNRS, UMR S 1127, Institut du Cerveau, ICM, AP-HP, Hôpitaux Universitaires La Pitié Salpêtrière - Charles Foix, Service de Neurologie 2-Mazarin, 75013, Paris, France
- Emmanuelle Huillard
  - Sorbonne Université, Inserm, CNRS, UMR S 1127, Institut du Cerveau, ICM, AP-HP, Hôpitaux Universitaires La Pitié Salpêtrière - Charles Foix, Service de Neurologie 2-Mazarin, 75013, Paris, France
- Marc Sanson
  - Sorbonne Université, Inserm, CNRS, UMR S 1127, Institut du Cerveau, ICM, AP-HP, Hôpitaux Universitaires La Pitié Salpêtrière - Charles Foix, Service de Neurologie 2-Mazarin, 75013, Paris, France
- Jean-Vianney Barnier
  - Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, 91190, Gif-sur-Yvette, France
16
Yuan X, Ma C, Zhao H, Yang L, Wang S, Xi J. STIC: Predicting Single Nucleotide Variants and Tumor Purity in Cancer Genome. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2692-2701. [PMID: 32086221 DOI: 10.1109/tcbb.2020.2975181] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
Single nucleotide variant (SNV) plays an important role in cellular proliferation and tumorigenesis in various types of human cancer. Next-generation sequencing (NGS) has provided high-throughput data at an unprecedented resolution to predict SNVs. Currently, there exist many computational methods for either germline or somatic SNV discovery from NGS data, but very few of them are versatile enough to adapt to any situations. In the absence of matched normal samples, the prediction of somatic SNVs from single-tumor samples becomes considerably challenging, especially when the tumor purity is unknown. Here, we propose a new approach, STIC, to predict somatic SNVs and estimate tumor purity from NGS data without matched normal samples. The main features of STIC include: (1) extracting a set of SNV-relevant features on each site and training the BP neural network algorithm on the features to predict SNVs; (2) creating an iterative process to distinguish somatic SNVs from germline ones by disturbing allele frequency; and (3) establishing a reasonable relationship between tumor purity and allele frequencies of somatic SNVs to accurately estimate the purity. We quantitatively evaluate the performance of STIC on both simulation and real sequencing datasets, the results of which indicate that STIC outperforms competing methods.
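As a rough illustration of the third feature (the link between tumor purity and the allele frequencies of somatic SNVs), the sketch below uses the standard mixture model for a clonal variant. It is not STIC's estimator, which the abstract does not specify. The function names and the default copy numbers (diploid tumor and normal cells, one mutant copy) are assumptions made for this example.

```python
def expected_vaf(purity, mutant_copies=1, tumor_cn=2, normal_cn=2):
    """Expected variant allele frequency of a clonal somatic SNV.

    Assumption: the sample is a simple mixture of tumor cells (fraction
    `purity`) and normal cells, and only tumor cells carry the variant.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    tumor_alleles = purity * tumor_cn             # alleles contributed by tumor cells
    normal_alleles = (1.0 - purity) * normal_cn   # alleles contributed by normal cells
    return purity * mutant_copies / (tumor_alleles + normal_alleles)


def purity_from_vaf(vaf):
    """Invert the model for a heterozygous SNV in a diploid region (VAF = purity / 2)."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be in [0, 1]")
    return min(1.0, 2.0 * vaf)


if __name__ == "__main__":
    print(expected_vaf(0.6))      # VAF expected at 60% purity under the assumptions above
    print(purity_from_vaf(0.3))   # purity implied by an observed VAF of 0.3
```

Under this simplified model a heterozygous diploid variant has a VAF of half the purity, which is why allele frequencies carry information about purity even without a matched normal sample.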
17
Park B, Lee W, Han K. GeneCoNet: A web application server for constructing cancer patient-specific gene correlation networks with prognostic gene pairs. Computer Methods and Programs in Biomedicine 2021; 212:106465. [PMID: 34715518 DOI: 10.1016/j.cmpb.2021.106465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2020] [Accepted: 10/06/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE: Most prognostic gene signatures that have been known for cancer are either individual genes or combinations of genes. Neither individual genes nor combinations of genes provide information on gene-gene relations, and both often have less prognostic significance than random genes associated with cell proliferation. Several methods for generating sample-specific gene networks have been proposed, but programs implementing the methods are not publicly available.
METHODS: We have developed a method that builds gene correlation networks specific to individual cancer patients and derives prognostic gene correlations from the networks. A gene correlation network specific to a patient is constructed by identifying gene-gene relations that are significantly different from normal samples. Prognostic gene pairs are obtained by carrying out the Cox proportional hazards regression and the log-rank test for every gene pair.
RESULTS: We built a web application server called GeneCoNet with thousands of tumor samples in TCGA. Given a tumor sample ID of TCGA, GeneCoNet dynamically constructs a gene correlation network specific to the sample as output. As an additional output, it provides information on prognostic gene correlations in the network. GeneCoNet found several prognostic gene correlations for six types of cancer, but there were no prognostic gene pairs common to multiple cancer types.
CONCLUSION: Extensive analysis of patient-specific gene correlation networks suggests that patients with a larger subnetwork of prognostic gene pairs have shorter survival time than the others and that patients with a subnetwork that contains more genes participating in prognostic gene pairs have shorter survival time than the others. GeneCoNet can be used as a valuable resource for generating gene correlation networks specific to individual patients and for identifying prognostic gene correlations. It is freely accessible at http://geneconet.inha.ac.kr.
Affiliation(s)
- Byungkyu Park
  - Department of Computer Engineering, Inha University, Incheon, 22212, South Korea
- Wook Lee
  - Department of Computer Engineering, Inha University, Incheon, 22212, South Korea
- Kyungsook Han
  - Department of Computer Engineering, Inha University, Incheon, 22212, South Korea. http://biocomputing.inha.ac.kr
18
Identification of cancer-related mutations in human pluripotent stem cells using RNA-seq analysis. Nat Protoc 2021; 16:4522-4537. [PMID: 34363070 DOI: 10.1038/s41596-021-00591-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/16/2020] [Accepted: 06/16/2021] [Indexed: 01/10/2023]
Abstract
Human pluripotent stem cells (hPSCs) are known to acquire genetic aberrations during in vitro propagation. In addition to recurrent chromosomal aberrations, it has recently been shown that these cells also gain point mutations in cancer-related genes, predominantly in TP53. The need for routine quality control of hPSCs is critical for both basic research and clinical applications. Here we discuss the relevance of detecting mutations for various hPSCs applications, and present a detailed protocol to identify cancer-related point mutations using data from RNA sequencing, an assay commonly performed during the growth and differentiation of hPSCs. In this protocol, we describe how to process and align the sequencing data, analyze it and conservatively interpret the results in order to generate an accurate estimation of mutations in tumor-related genes. This pipeline is designed to work in high throughput and is available as a software container at https://github.com/elyadlezmi/RNA2CM . The protocol requires minimal command-line skills and can be carried out in 1-2 d.
19
Thind AS, Monga I, Thakur PK, Kumari P, Dindhoria K, Krzak M, Ranson M, Ashford B. Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology. Brief Bioinform 2021; 22:6330938. [PMID: 34329375 DOI: 10.1093/bib/bbab259] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 03/28/2021] [Revised: 06/14/2021] [Accepted: 06/18/2021] [Indexed: 12/13/2022]
Abstract
Significant innovations in next-generation sequencing techniques and bioinformatics tools have impacted our appreciation and understanding of RNA. Practical RNA sequencing (RNA-Seq) applications have evolved in conjunction with sequence technology and bioinformatic tools advances. In most projects, bulk RNA-Seq data is used to measure gene expression patterns, isoform expression, alternative splicing and single-nucleotide polymorphisms. However, RNA-Seq holds far more hidden biological information including details of copy number alteration, microbial contamination, transposable elements, cell type (deconvolution) and the presence of neoantigens. Recent novel and advanced bioinformatic algorithms developed the capacity to retrieve this information from bulk RNA-Seq data, thus broadening its scope. The focus of this review is to comprehend the emerging bulk RNA-Seq-based analyses, emphasizing less familiar and underused applications. In doing so, we highlight the power of bulk RNA-Seq in providing biological insights.
Affiliation(s)
- Amarinder Singh Thind
  - University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
- Isha Monga
  - Columbia University, New York City, NY, USA
- Pallawi Kumari
  - Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Kiran Dindhoria
  - Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Marie Ranson
  - University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
- Bruce Ashford
  - University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
20
Shan W, Yuan J, Hu Z, Jiang J, Wang Y, Loo N, Fan L, Tang Z, Zhang T, Xu M, Pan Y, Lu J, Long M, Tanyi JL, Montone KT, Fan Y, Hu X, Zhang Y, Zhang L. Systematic Characterization of Recurrent Genomic Alterations in Cyclin-Dependent Kinases Reveals Potential Therapeutic Strategies for Cancer Treatment. Cell Rep 2021; 32:107884. [PMID: 32668240 DOI: 10.1016/j.celrep.2020.107884] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/07/2019] [Revised: 03/21/2020] [Accepted: 06/17/2020] [Indexed: 12/13/2022]
Abstract
Recurrent copy-number alterations, mutations, and transcript fusions of the genes encoding CDKs/cyclins are characterized in >10,000 tumors. Genomic alterations of CDKs/cyclins are dominantly driven by copy number aberrations. In contrast to cell-cycle-related CDKs/cyclins, which are globally amplified, transcriptional CDKs/cyclins recurrently lose copy numbers across cancers. Although mutations and transcript fusions are relatively rare events, CDK12 exhibits recurrent mutations in multiple cancers. Among the transcriptional CDKs, CDK7 and CDK12 show the most significant copy number loss and mutation, respectively. Their genomic alterations are correlated with increased sensitivities to DNA-damaging drugs. Inhibition of CDK7 preferentially represses the expression of genes in the DNA-damage-repair pathways and impairs the activity of homologous recombination. Low-dose CDK7 inhibitor treatment sensitizes cancer cells to PARP inhibitor-induced DNA damage and cell death. Our analysis provides genomic information for identification and prioritization of drug targets for CDKs and reveals rationales for treatment strategies.
Affiliation(s)
- Weiwei Shan
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jiao Yuan
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Zhongyi Hu
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Junjie Jiang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yueying Wang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Nicki Loo
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lingling Fan
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Zhaoqing Tang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Tianli Zhang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Mu Xu
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yutian Pan
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jiaqi Lu
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA
- Meixiao Long
  - Department of Internal Medicine, Division of Hematology, Ohio State University, Columbus, OH 43210, USA
- Janos L Tanyi
  - Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Kathleen T Montone
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Yi Fan
  - Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Xiaowen Hu
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Youyou Zhang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lin Zhang
  - Center for Research on Reproduction & Women's Health, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Obstetrics and Gynecology, University of Pennsylvania, Philadelphia, PA 19104, USA
21
Zhou T, Sengupta S, Müller P, Ji Y. RNDClone: Tumor subclone reconstruction based on integrating DNA and RNA sequence data. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
22
Quaglieri A, Flensburg C, Speed TP, Majewski IJ. Finding a suitable library size to call variants in RNA-Seq. BMC Bioinformatics 2020; 21:553. [PMID: 33261552 PMCID: PMC7708150 DOI: 10.1186/s12859-020-03860-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/19/2020] [Accepted: 11/03/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND: RNA sequencing allows the study of both gene expression changes and transcribed mutations, providing a highly effective way to gain insight into cancer biology. When planning the sequencing of a large cohort of samples, library size is a fundamental factor affecting both the overall cost and the quality of the results. Here we specifically address how overall library size influences the detection of somatic mutations in RNA-seq data in two acute myeloid leukaemia datasets.
RESULTS: We simulated shallower sequencing depths by downsampling 45 acute myeloid leukaemia samples (100 bp PE) that are part of the Leucegene project, which were originally sequenced at high depth. We compared the sensitivity of six methods of recovering validated mutations on the same samples. The methods compared are a combination of three popular callers (MuTect, VarScan, and VarDict) and two filtering strategies. We observed an incremental loss in sensitivity when simulating libraries of 80M, 50M, 40M, 30M and 20M fragments, with the largest loss detected with less than 30M fragments (below 90%, average loss of 7%). The sensitivity in recovering insertions and deletions varied markedly between callers, with VarDict showing the highest sensitivity (60%). Single nucleotide variant sensitivity is relatively consistent across methods, apart from MuTect, whose default filters need adjustment when using RNA-Seq. We also analysed 136 RNA-Seq samples from the TCGA-LAML cohort (50 bp PE) and assessed the change in sensitivity between the initial libraries (average 59M fragments) and after downsampling to 40M fragments. When considering single nucleotide variants in recurrently mutated myeloid genes we found a comparable performance, with a 6% average loss in sensitivity using 40M fragments.
CONCLUSIONS: Between 30M and 40M 100 bp PE reads are needed to recover 90-95% of the initial variants on recurrently mutated myeloid genes.
To extend this result to another cancer type, an exploration of the characteristics of its mutations and gene expression patterns is suggested.
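The downsampling experiment described in this abstract can be pictured with a small sketch. It is an illustrative reconstruction, not the authors' pipeline: the function names, the use of a seeded random subset, and the toy fragment identifiers are all assumptions, and real data would be subsampled from alignment files rather than Python lists.

```python
import random


def keep_fraction(original_fragments, target_fragments):
    """Fraction of fragments to keep to reach the target library size."""
    if target_fragments > original_fragments:
        raise ValueError("target library is larger than the original")
    return target_fragments / original_fragments


def downsample(fragment_ids, target_fragments, seed=0):
    """Draw a reproducible random subset of fragments without replacement."""
    if target_fragments > len(fragment_ids):
        raise ValueError("cannot draw more fragments than are available")
    rng = random.Random(seed)  # fixed seed so the subset is repeatable
    return rng.sample(fragment_ids, target_fragments)


def sensitivity(called, validated):
    """Share of validated mutations that a caller recovered."""
    validated = set(validated)
    if not validated:
        raise ValueError("need at least one validated mutation")
    return len(validated & set(called)) / len(validated)


if __name__ == "__main__":
    # Library sizes quoted in the abstract: 59M fragments on average, reduced to 40M.
    print(keep_fraction(59_000_000, 40_000_000))
    toy_library = list(range(1000))            # hypothetical fragment identifiers
    print(len(downsample(toy_library, 400)))   # size of the simulated shallower library
```

Comparing `sensitivity` on the full and the downsampled libraries gives the kind of loss figure the abstract reports for each target depth.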
Affiliation(s)
- Anna Quaglieri
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia; Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia
- Christoffer Flensburg
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia
- Terence P Speed
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia; Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia; Department of Mathematics and Statistics, The University of Melbourne, 813 Swanston Street, Melbourne, 3010, Australia
- Ian J Majewski
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia; Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia
23
Rao AA, Madejska AA, Pfeil J, Paten B, Salama SR, Haussler D. ProTECT-Prediction of T-Cell Epitopes for Cancer Therapy. Front Immunol 2020; 11:483296. [PMID: 33244314 PMCID: PMC7683782 DOI: 10.3389/fimmu.2020.483296] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/06/2019] [Accepted: 10/13/2020] [Indexed: 12/21/2022]
Abstract
Somatic mutations in cancers affecting protein coding genes can give rise to potentially therapeutic neoepitopes. These neoepitopes can guide Adoptive Cell Therapies and Peptide- and RNA-based Neoepitope Vaccines to selectively target tumor cells using autologous patient cytotoxic T-cells. Currently, researchers have to independently align their data, call somatic mutations and haplotype the patient’s HLA to use existing neoepitope prediction tools. We present ProTECT, a fully automated, reproducible, scalable, and efficient end-to-end analysis pipeline to identify and rank therapeutically relevant tumor neoepitopes in terms of potential immunogenicity starting directly from raw patient sequencing data, or from pre-processed data. The ProTECT pipeline encompasses alignment, HLA haplotyping, mutation calling (single nucleotide variants, short insertions and deletions, and gene fusions), peptide:MHC binding prediction, and ranking of final candidates. We demonstrate the scalability, efficiency, and utility of ProTECT on 326 samples from the TCGA Prostate Adenocarcinoma cohort, identifying recurrent potential neoepitopes from TMPRSS2-ERG fusions, and from SNVs in SPOP. We also compare ProTECT with results from published tools. ProTECT can be run on a standalone computer, a local cluster, or on a compute cloud using a Mesos backend. ProTECT is highly scalable and can process TCGA data in under 30 min per sample (on average) when run in large batches. ProTECT is freely available at https://www.github.com/BD2KGenomics/protect.
Collapse
Affiliation(s)
- Arjun A Rao
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States; Computational Genomics Lab, University of California, Santa Cruz, Santa Cruz, CA, United States; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, United States
- Ada A Madejska
- Computational Genomics Lab, University of California, Santa Cruz, Santa Cruz, CA, United States; Department of Molecular, Cell, and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, United States
- Jacob Pfeil
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States; Computational Genomics Lab, University of California, Santa Cruz, Santa Cruz, CA, United States; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, United States
- Benedict Paten
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States; Computational Genomics Lab, University of California, Santa Cruz, Santa Cruz, CA, United States; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, United States
- Sofie R Salama
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, United States; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, United States
- David Haussler
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, United States; Computational Genomics Lab, University of California, Santa Cruz, Santa Cruz, CA, United States; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, United States; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, United States
24
Large scale, robust, and accurate whole transcriptome profiling from clinical formalin-fixed paraffin-embedded samples. Sci Rep 2020; 10:17597. [PMID: 33077815 PMCID: PMC7572424 DOI: 10.1038/s41598-020-74483-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/08/2020] [Accepted: 09/30/2020] [Indexed: 01/25/2023]
Abstract
Transcriptome profiling can provide information of great value in clinical decision-making, yet RNA from readily available formalin-fixed paraffin-embedded (FFPE) tissue is often too degraded for quality sequencing. To assess the clinical utility of FFPE-derived RNA, we performed ribo-deplete RNA extractions on > 3200 FFPE slide samples; 25 of these had direct FFPE vs. fresh frozen (FF) replicates, 57 were sequenced in 2 different labs, 87 underwent multiple library analyses, and 16 had direct microdissected vs. macrodissected replicates. Poly-A versus ribo-depletion RNA extraction methods were compared using transcriptomes of TCGA cohort and 3116 FFPE samples. Compared to FF, FFPE transcripts coding for nuclear/cytoplasmic proteins involved in DNA packaging, replication, and protein synthesis were detected at lower rates and zinc finger family transcripts were of poorer quality. The greatest difference in extraction methods was in histone transcripts which typically lack poly-A tails. Encouragingly, the overall sequencing success rate was 81%. Exome coverage was highly concordant in direct FFPE and FF replicates, with 98% agreement in coding exon coverage and a median correlation of whole transcriptome profiles of 0.95. We provide strong rationale for clinical use of FFPE-derived RNA based on the robustness, reproducibility, and consistency of whole transcriptome profiling.
25
Bailey MH, Meyerson WU, Dursi LJ, Wang LB, Dong G, Liang WW, Weerasinghe A, Li S, Li Y, Kelso S, Saksena G, Ellrott K, Wendl MC, Wheeler DA, Getz G, Simpson JT, Gerstein MB, Ding L. Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples. Nat Commun 2020; 11:4748. [PMID: 32958763 PMCID: PMC7505971 DOI: 10.1038/s41467-020-18151-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/22/2019] [Accepted: 07/28/2020] [Indexed: 02/03/2023]
Abstract
The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
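The comparison described in this abstract (shared versus private calls, and how many private calls sit below a variant allele fraction of 15%) can be sketched in a few lines. The `Call` record, the function name `compare_calls`, and the choice to measure overlap as shared calls divided by the union of both call sets are assumptions for illustration; the abstract does not state the exact denominator the authors used.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """A somatic variant call (hypothetical minimal record)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    vaf: float  # variant allele fraction, 0.0 to 1.0


def compare_calls(wes: list[Call], wgs: list[Call], low_vaf: float = 0.15) -> dict[str, float]:
    """Summarize agreement between two call sets from the same sample."""
    def key(c: Call) -> tuple[str, int, str, str]:
        return (c.chrom, c.pos, c.ref, c.alt)

    wes_by_key = {key(c): c for c in wes}
    wgs_by_key = {key(c): c for c in wgs}
    shared = wes_by_key.keys() & wgs_by_key.keys()
    union = wes_by_key.keys() | wgs_by_key.keys()
    private_wes = [c for k, c in wes_by_key.items() if k not in shared]
    private_wgs = [c for k, c in wgs_by_key.items() if k not in shared]

    def low_fraction(calls: list[Call]) -> float:
        # Share of calls below the low-VAF threshold.
        return sum(c.vaf < low_vaf for c in calls) / len(calls) if calls else 0.0

    return {
        "overlap": len(shared) / len(union) if union else 0.0,
        "private_wes_low_vaf": low_fraction(private_wes),
        "private_wgs_low_vaf": low_fraction(private_wgs),
    }
```

In practice both call sets would first be restricted to the exonic regions covered by both assays, as the abstract specifies; that filtering is omitted here.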
Affiliation(s)
- Matthew H Bailey
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, 63108, USA
- William U Meyerson
- Yale School of Medicine, Yale University, New Haven, CT, 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA
- Lewis Jonathan Dursi
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, M5G 0A3, Canada
- The Hospital for Sick Children, Toronto, ON, M5G 1X8, Canada
- Liang-Bo Wang
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Guanlan Dong
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Wen-Wei Liang
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Amila Weerasinghe
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Shantao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA
- Yize Li
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Sean Kelso
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA
- Gordon Saksena
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Kyle Ellrott
- Biomedical Engineering, Oregon Health and Science University, Portland, OR, 97239, USA
- Michael C Wendl
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA
- Department of Mathematics, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Harvard Medical School, Boston, MA, 02115, USA
- Center for Cancer Research, Massachusetts General Hospital, Boston, MA, 02114, USA
- Department of Pathology, Massachusetts General Hospital, Boston, MA, 02114, USA
- Jared T Simpson
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, M5G 0A3, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, M5S, Canada
- Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.
- Department of Computer Science, Yale University, New Haven, CT, 06520, USA.
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA.
- Li Ding
- The McDonnell Genome Institute at Washington University, St. Louis, MO, 63108, USA.
- Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, 63108, USA.
- Department of Medicine and Department of Genetics, Washington University School of Medicine, St. Louis, MO, 63110, USA.
26
Wood MA, Nguyen A, Struck AJ, Ellrott K, Nellore A, Thompson RF. neoepiscope improves neoepitope prediction with multivariant phasing. Bioinformatics 2020; 36:713-720. [PMID: 31424527 DOI: 10.1093/bioinformatics/btz653] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/15/2019] [Revised: 07/22/2019] [Accepted: 08/16/2019] [Indexed: 12/30/2022]
Abstract
MOTIVATION The vast majority of tools for neoepitope prediction from DNA sequencing of complementary tumor and normal patient samples do not consider germline context or the potential for the co-occurrence of two or more somatic variants on the same mRNA transcript. Without consideration of these phenomena, existing approaches are likely to produce both false-positive and false-negative results, resulting in an inaccurate and incomplete picture of the cancer neoepitope landscape. We developed neoepiscope chiefly to address this issue for single nucleotide variants (SNVs) and insertions/deletions (indels). RESULTS Herein, we illustrate how germline and somatic variant phasing affects neoepitope prediction across multiple datasets. We estimate that up to ∼5% of neoepitopes arising from SNVs and indels may require variant phasing for their accurate assessment. neoepiscope is performant, flexible and supports several major histocompatibility complex binding affinity prediction tools. AVAILABILITY AND IMPLEMENTATION neoepiscope is available on GitHub at https://github.com/pdxgx/neoepiscope under the MIT license. Scripts for reproducing results described in the text are available at https://github.com/pdxgx/neoepiscope-paper under the MIT license. Additional data from this study, including summaries of variant phasing incidence and benchmarking wallclock times, are available in Supplementary Files 1, 2 and 3. Supplementary File 1 contains Supplementary Table 1, Supplementary Figures 1 and 2, and descriptions of Supplementary Tables 2-8. Supplementary File 2 contains Supplementary Tables 2-6 and 8. Supplementary File 3 contains Supplementary Table 7. 
Raw sequencing data used for the analyses in this manuscript are available from the Sequence Read Archive under accessions PRJNA278450, PRJNA312948, PRJNA307199, PRJNA343789, PRJNA357321, PRJNA293912, PRJNA369259, PRJNA305077, PRJNA306070, PRJNA82745 and PRJNA324705; from the European Genome-phenome Archive under accessions EGAD00001004352 and EGAD00001002731; and by direct request to the authors. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
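Why phasing matters can be shown with a toy protein-level example: when two substitutions lie on the same haplotype and within one peptide length of each other, treating them independently yields peptides that do not exist in the tumor and misses the ones that do. The sketch below is not neoepiscope's implementation (which works from phased DNA variants and transcripts); the helper names and the protein-level simplification are assumptions for illustration.

```python
def apply_substitutions(protein: str, subs: dict[int, str]) -> str:
    """Apply amino-acid substitutions given as {0-based position: new residue}."""
    residues = list(protein)
    for pos, alt in subs.items():
        residues[pos] = alt
    return "".join(residues)


def windows(seq: str, k: int, positions) -> set[str]:
    """Return the k-mers of seq that overlap at least one of the given positions."""
    found = set()
    for i in range(len(seq) - k + 1):
        if any(i <= p < i + k for p in positions):
            found.add(seq[i:i + k])
    return found


def phased_vs_unphased(protein: str, subs: dict[int, str], k: int = 9) -> tuple[set[str], set[str]]:
    """Compare peptides from jointly applied variants with per-variant peptides."""
    # Phased: all variants sit on the same haplotype, so apply them together.
    phased = windows(apply_substitutions(protein, subs), k, subs)
    # Unphased: each variant is evaluated against the reference in isolation.
    unphased = set()
    for pos, alt in subs.items():
        unphased |= windows(apply_substitutions(protein, {pos: alt}), k, [pos])
    return phased, unphased
```

Peptides in `phased - unphased` are the ones an unphased tool would miss, and peptides in `unphased - phased` are the ones it would report although they carry a reference residue next to a variant, which corresponds to the false negatives and false positives the abstract describes.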
Affiliation(s)
- Mary A Wood
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Portland VA Research Foundation, Portland, OR 97239, USA
- Austin Nguyen
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Adam J Struck
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Kyle Ellrott
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Department of Biomedical Engineering, OR 97239, USA
- Abhinav Nellore
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Department of Biomedical Engineering, OR 97239, USA
- Department of Surgery, OR 97239, USA
- Reid F Thompson
- Computational Biology Program, Oregon Health & Science University, Portland, OR 97201, USA
- Portland VA Research Foundation, Portland, OR 97239, USA
- Department of Radiation Medicine, OR 97239, USA
- Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR 97239, USA
- Division of Hospital and Specialty Medicine, VA Portland Healthcare System, Portland, OR 97239, USA
27
Brueffer C, Gladchuk S, Winter C, Vallon-Christersson J, Hegardt C, Häkkinen J, George AM, Chen Y, Ehinger A, Larsson C, Loman N, Malmberg M, Rydén L, Borg Å, Saal LH. The mutational landscape of the SCAN-B real-world primary breast cancer transcriptome. EMBO Mol Med 2020; 12:e12118. [PMID: 32926574 PMCID: PMC7539222 DOI: 10.15252/emmm.202012118] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 02/02/2020] [Revised: 08/08/2020] [Accepted: 08/13/2020] [Indexed: 12/12/2022]
Abstract
Breast cancer is a disease of genomic alterations, of which the panorama of somatic mutations and how these relate to subtypes and therapy response is incompletely understood. Within SCAN‐B (ClinicalTrials.gov: NCT02306096), a prospective study elucidating the transcriptomic profiles for thousands of breast cancers, we developed an RNA‐seq pipeline for detection of SNVs/indels and profiled a real‐world cohort of 3,217 breast tumors. We describe the mutational landscape of primary breast cancer viewed through the transcriptome of a large population‐based cohort and relate it to patient survival. We demonstrate that RNA‐seq can be used to call mutations in genes such as PIK3CA, TP53, and ERBB2, as well as the status of molecular pathways and mutational burden, and identify potentially druggable mutations in 86.8% of tumors. To make this rich dataset available for the research community, we developed an open source web application, the SCAN‐B MutationExplorer (http://oncogenomics.bmc.lu.se/MutationExplorer). These results add another dimension to the use of RNA‐seq as a clinical tool, where both gene expression‐ and mutation‐based biomarkers can be interrogated in real‐time within 1 week of tumor sampling.
Affiliation(s)
- Christian Brueffer
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Sergii Gladchuk
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Christof Winter
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Johan Vallon-Christersson
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Cecilia Hegardt
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Jari Häkkinen
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Anthony M George
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Yilun Chen
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden
- Anna Ehinger
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; Department of Pathology, Skåne University Hospital, Lund, Sweden
- Christer Larsson
- Lund University Cancer Center, Lund, Sweden; Division of Molecular Pathology, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Niklas Loman
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; Department of Oncology, Skåne University Hospital, Lund, Sweden
- Martin Malmberg
- Department of Oncology, Skåne University Hospital, Lund, Sweden
- Lisa Rydén
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; Department of Surgery, Skåne University Hospital, Lund, Sweden
- Åke Borg
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
- Lao H Saal
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden; Lund University Cancer Center, Lund, Sweden; CREATE Health Strategic Center for Translational Cancer Research, Lund University, Lund, Sweden
28
Gao GF, Parker JS, Reynolds SM, Silva TC, Wang LB, Zhou W, Akbani R, Bailey M, Balu S, Berman BP, Brooks D, Chen H, Cherniack AD, Demchok JA, Ding L, Felau I, Gaheen S, Gerhard DS, Heiman DI, Hernandez KM, Hoadley KA, Jayasinghe R, Kemal A, Knijnenburg TA, Laird PW, Mensah MKA, Mungall AJ, Robertson AG, Shen H, Tarnuzzer R, Wang Z, Wyczalkowski M, Yang L, Zenklusen JC, Zhang Z, Liang H, Noble MS. Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons' Data. Cell Syst 2019; 9:24-34.e10. [PMID: 31344359 DOI: 10.1016/j.cels.2019.06.006] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 01/19/2019] [Revised: 03/18/2019] [Accepted: 06/13/2019] [Indexed: 01/09/2023]
Abstract
We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA) (mRNA and miRNA expression, single nucleotide variants, DNA methylation, and copy number alterations), comprehensive sample-, gene-, and probe-level studies were performed, towards quantifying the degree of similarity between the 'legacy' GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as 'harmonized' by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve.
Affiliation(s)
- Galen F Gao
- Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA; The University of Texas Southwestern Medical School, Dallas, TX 75390, USA
- Joel S Parker
- Department of Genetics, Lineberger Comprehensive Cancer Center, the University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Tiago C Silva
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Department of Genetics, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, SP 14.040-905, Brazil
- Liang-Bo Wang
- Department of Medicine, Washington University in St Louis, Saint Louis, MO 63108, USA; McDonnell Genome Institute, Washington University in St Louis, Saint Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St Louis, Saint Louis, MO 63108, USA
- Wanding Zhou
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
- Rehan Akbani
- Department of Bioinformatics and Computational Biology, the University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Matthew Bailey
- Department of Medicine, Washington University in St Louis, Saint Louis, MO 63108, USA; McDonnell Genome Institute, Washington University in St Louis, Saint Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St Louis, Saint Louis, MO 63108, USA
- Saianand Balu
- Lineberger Comprehensive Cancer Center, Bioinformatics Core, the University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Benjamin P Berman
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Faculty of Medicine, Department of Developmental Biology and Cancer Research, the Hebrew University of Jerusalem, Jerusalem 91120, Israel
- Denise Brooks
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
- Hu Chen
- Department of Bioinformatics and Computational Biology, the University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA
- Andrew D Cherniack
- Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Li Ding
- Department of Medicine, Washington University in St Louis, Saint Louis, MO 63108, USA; McDonnell Genome Institute, Washington University in St Louis, Saint Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St Louis, Saint Louis, MO 63108, USA
- Ina Felau
- National Cancer Institute, Bethesda, MD 20892, USA
- Sharon Gaheen
- Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, MD 21702, USA
- David I Heiman
- Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
- Kyle M Hernandez
- Department of Pediatrics, the University of Chicago, Chicago, IL 60637, USA; Center for Research Informatics, the University of Chicago, Chicago, IL 60637, USA
- Katherine A Hoadley
- Department of Genetics, Lineberger Comprehensive Cancer Center, the University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Reyka Jayasinghe
- Department of Medicine, Washington University in St Louis, Saint Louis, MO 63108, USA
- Anab Kemal
- National Cancer Institute, Bethesda, MD 20892, USA
- Peter W Laird
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
- Andrew J Mungall
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
- A Gordon Robertson
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
- Hui Shen
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
- Zhining Wang
- National Cancer Institute, Bethesda, MD 20892, USA
- Matthew Wyczalkowski
- Department of Medicine, Washington University in St Louis, Saint Louis, MO 63108, USA; McDonnell Genome Institute, Washington University in St Louis, Saint Louis, MO 63108, USA; Siteman Cancer Center, Washington University in St Louis, Saint Louis, MO 63108, USA
- Liming Yang
- National Cancer Institute, Bethesda, MD 20892, USA
- Zhenyu Zhang
- Center for Translational Data Science, the University of Chicago, Chicago, IL 60615, USA
- Han Liang
- Department of Bioinformatics and Computational Biology, the University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA; Department of Systems Biology, the University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
- Michael S Noble
- Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.
29
Adashek JJ, Kato S, Parulkar R, Szeto CW, Sanborn JZ, Vaske CJ, Benz SC, Reddy SK, Kurzrock R. Transcriptomic silencing as a potential mechanism of treatment resistance. JCI Insight 2020; 5:134824. [PMID: 32493840 DOI: 10.1172/jci.insight.134824] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/06/2019] [Accepted: 04/29/2020] [Indexed: 12/14/2022]
Abstract
Next-generation sequencing (NGS) has not revealed all the mechanisms underlying resistance to genomically matched drugs. Here, we performed whole-exome tumor (somatic)/normal (germline) NGS and whole-transcriptome sequencing in 1417 tumors, the latter focusing on a clinically oriented 50-gene panel in order to examine transcriptomic silencing of putative driver alterations. In this large-scale study, approximately 13% of the somatic single nucleotide variants (SNVs) were unexpectedly not expressed as RNA; 23% of patients had ≥1 nonexpressed SNV. SNV-bearing genes consistently transcribed were TP53, PIK3CA, and KRAS; those with lower transcription rates were ALK, CSF1R, ERBB4, FLT3, GNAS, HNF1A, KDR, PDGFRA, RET, and SMO. We also determined the frequency of tumor mutations being germline, rather than somatic, in these and an additional 462 tumors with tumor/normal exomes; 33.8% of germline SNVs within the gene panel were rare (not found after filtering through variant information domains) and at risk of being falsely reported as somatic. Both the frequency of silenced variant transcription and the risk of falsely identifying germline mutations as somatic/tumor related are important phenomena. Therefore, transcriptomics is a critical adjunct to genomics when interrogating patient tumors for actionable alterations, because, without expression of the target aberrations, there will likely be therapeutic resistance.
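The two headline figures in this abstract are different ratios: one is computed per variant (share of somatic SNVs with no RNA support) and one per patient (share of patients with at least one such SNV). A minimal sketch of that bookkeeping follows. The input layout, the function name `expression_summary`, and the rule that an SNV counts as expressed when it has at least one RNA read carrying the variant allele are assumptions; the abstract does not give the authors' expression criterion.

```python
def expression_summary(patients: dict[str, list[int]], min_alt_reads: int = 1) -> dict[str, float]:
    """Summarize RNA support for somatic SNVs.

    `patients` maps a patient id to the RNA read counts supporting the
    variant allele, one count per somatic SNV found in that patient's DNA.
    An SNV is treated as not expressed when its count is below
    `min_alt_reads` (an assumed, illustrative threshold).
    """
    total_snvs = 0
    silent_snvs = 0
    patients_affected = 0
    for counts in patients.values():
        n_silent = sum(c < min_alt_reads for c in counts)
        total_snvs += len(counts)
        silent_snvs += n_silent
        patients_affected += n_silent > 0
    return {
        # Per-variant ratio.
        "silent_snv_fraction": silent_snvs / total_snvs if total_snvs else 0.0,
        # Per-patient ratio.
        "patients_with_silent_snv": patients_affected / len(patients) if patients else 0.0,
    }
```

Because one patient can carry several silent variants, the per-patient share can be higher or lower than the per-variant share, which is why the abstract reports both.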
Affiliation(s)
- Jacob J Adashek
- Department of Internal Medicine, University of South Florida, H. Lee Moffitt Cancer Center & Research Institute, Tampa, Florida, USA
- Shumei Kato
- Center for Personalized Cancer Therapy and Division of Hematology and Oncology, Department of Medicine, University of California, San Diego, Moores Cancer Center, La Jolla, California, USA
- Razelle Kurzrock
- Center for Personalized Cancer Therapy and Division of Hematology and Oncology, Department of Medicine, University of California, San Diego, Moores Cancer Center, La Jolla, California, USA
30
Muyas F, Zapata L, Guigó R, Ossowski S. The rate and spectrum of mosaic mutations during embryogenesis revealed by RNA sequencing of 49 tissues. Genome Med 2020; 12:49. [PMID: 32460841 PMCID: PMC7254727 DOI: 10.1186/s13073-020-00746-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 02/03/2020] [Accepted: 05/08/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND Mosaic mutations acquired during early embryogenesis can lead to severe early-onset genetic disorders and cancer predisposition, but are often undetectable in blood samples. The rate and mutational spectrum of embryonic mosaic mutations (EMMs) have only been studied in a few tissues, and their contribution to genetic disorders is unknown. Therefore, we investigated how frequently mosaic mutations occur during embryogenesis across all germ layers and tissues. METHODS Mosaic mutation detection in 49 normal tissues from 570 individuals (Genotype-Tissue Expression (GTEx) cohort) was performed using a newly developed multi-tissue, multi-individual variant calling approach for RNA-seq data. Our method allows for reliable identification of EMMs and the developmental stage during which they appeared. RESULTS The analysis of EMMs in 570 individuals revealed that newborns on average harbor 0.5-1 EMMs in the exome affecting multiple organs (1.3230 × 10⁻⁸ per nucleotide per individual), a similar frequency as reported for germline de novo mutations. Our multi-tissue, multi-individual study design allowed us to distinguish mosaic mutations acquired during different stages of embryogenesis and adult life, as well as to provide insights into the rate and spectrum of mosaic mutations. We observed that EMMs are dominated by a mutational signature associated with spontaneous deamination of methylated cytosines and the number of cell divisions. After birth, cells continue to accumulate somatic mutations, which can lead to the development of cancer. Investigation of the mutational spectrum of the gastrointestinal tract revealed a mutational pattern associated with the food-borne carcinogen aflatoxin, a signature that has so far only been reported in liver cancer. CONCLUSIONS In summary, our multi-tissue, multi-individual study reveals a surprisingly high number of embryonic mosaic mutations in coding regions, implying novel hypotheses and diagnostic procedures for investigating genetic causes of disease and cancer predisposition.
Affiliation(s)
- Francesc Muyas
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany.
- Center for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
- Luis Zapata
- Center for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Roderic Guigó
- Center for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Stephan Ossowski
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany.
- Center for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
31
Wood MA, Weeder BR, David JK, Nellore A, Thompson RF. Burden of tumor mutations, neoepitopes, and other variants are weak predictors of cancer immunotherapy response and overall survival. Genome Med 2020; 12:33. [PMID: 32228719 PMCID: PMC7106909 DOI: 10.1186/s13073-020-00729-2] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 01/27/2020] [Accepted: 03/10/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Tumor mutational burden (TMB; the quantity of aberrant nucleotide sequences a given tumor may harbor) has been associated with response to immune checkpoint inhibitor therapy and is gaining broad acceptance as a result. However, TMB harbors intrinsic variability across cancer types, and its assessment and interpretation are poorly standardized. METHODS Using a standardized approach, we quantify the robustness of TMB as a metric and its potential as a predictor of immunotherapy response and survival among a diverse cohort of cancer patients. We also explore the additive predictive potential of RNA-derived variants and neoepitope burden, incorporating several novel metrics of immunogenic potential. RESULTS We find that TMB is a partial predictor of immunotherapy response in melanoma and non-small cell lung cancer, but not renal cell carcinoma. We find that TMB is predictive of overall survival in melanoma patients receiving immunotherapy, but not in an immunotherapy-naive population. We also find that it is an unstable metric with potentially problematic repercussions for clinical cohort classification. We finally note minimal additional predictive benefit to assessing neoepitope burden or its bulk derivatives, including RNA-derived sources of neoepitopes. CONCLUSIONS We find sufficient cause to suggest that the predictive clinical value of TMB should not be overstated or oversimplified. While it is readily quantified, TMB is at best a limited surrogate biomarker of immunotherapy response. The data do not support isolated use of TMB in renal cell carcinoma.
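The instability the abstract attributes to TMB-based cohort classification is easy to see numerically: TMB is commonly expressed as mutations per megabase of sequenced territory, so a few extra or missing calls can move a tumor across a fixed cutoff. The sketch below assumes that per-megabase definition and an illustrative cutoff of 10 mutations/Mb; neither value is taken from this abstract, and the function names are hypothetical.

```python
def tmb_per_mb(n_mutations: int, covered_bases: int) -> float:
    """Mutations per megabase of covered sequence (assumed TMB definition)."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1_000_000)


def classify(tmb: float, cutoff: float = 10.0) -> str:
    """Label a tumor 'high' or 'low' against an illustrative cutoff."""
    return "high" if tmb >= cutoff else "low"


if __name__ == "__main__":
    covered = 30_000_000  # assumed 30 Mb of covered exome
    for n in (297, 300):
        tmb = tmb_per_mb(n, covered)
        print(n, "mutations:", tmb, "per Mb,", classify(tmb))
```

With these assumed numbers, a difference of three called mutations out of roughly 300 changes the label, which is the kind of sensitivity to calling pipeline and cutoff that makes a single threshold hard to standardize.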
Affiliation(s)
- Mary A Wood
- Computational Biology Program, Oregon Health & Science University, Portland, USA
- Portland VA Research Foundation, Portland, USA
| | - Benjamin R Weeder
- Computational Biology Program, Oregon Health & Science University, Portland, USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, USA
| | - Julianne K David
- Computational Biology Program, Oregon Health & Science University, Portland, USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, USA
| | - Abhinav Nellore
- Computational Biology Program, Oregon Health & Science University, Portland, USA
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, USA
- Department of Surgery, Oregon Health & Science University, Portland, USA
| | - Reid F Thompson
- Computational Biology Program, Oregon Health & Science University, Portland, USA.
- Portland VA Research Foundation, Portland, USA.
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, USA.
- Department of Radiation Medicine, Oregon Health & Science University, Portland, USA.
- Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, USA.
- VA Portland Healthcare System, Division of Hospital and Specialty Medicine, Portland, USA.
| |
|
32
|
Yizhak K, Aguet F, Kim J, Hess JM, Kübler K, Grimsby J, Frazer R, Zhang H, Haradhvala NJ, Rosebrock D, Livitz D, Li X, Arich-Landkof E, Shoresh N, Stewart C, Segrè AV, Branton PA, Polak P, Ardlie KG, Getz G. RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues. Science 2019; 364:eaaw0726. [PMID: 31171663 DOI: 10.1126/science.aaw0726] [Citation(s) in RCA: 312] [Impact Index Per Article: 62.4] [Received: 11/15/2018] [Accepted: 05/02/2019] [Indexed: 02/06/2023]
Abstract
How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA sequencing data from ~6700 samples across 29 normal tissues revealed multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We found that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, which suggests that environmental factors can promote somatic mosaicism. Mutation burden was associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over both time and number of cell divisions. Finally, normal tissues were found to harbor mutations in known cancer genes and hotspots. This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as a foundation for associating clonal expansion with environmental factors, aging, and risk of disease.
Affiliation(s)
- Keren Yizhak
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Jaegil Kim
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Julian M Hess
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kirsten Kübler
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA.,Harvard Medical School, Boston, MA, USA
| | - Jonna Grimsby
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Hailei Zhang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Nicholas J Haradhvala
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
| | | | | | - Xiao Li
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eila Arich-Landkof
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
| | - Noam Shoresh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Chip Stewart
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ayellet V Segrè
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Harvard Medical School, Boston, MA, USA.,Ocular Genomics Institute, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
| | - Philip A Branton
- Biorepositories and Biospecimen Research Branch, Cancer Diagnosis Program, National Cancer Institute, Bethesda, MD, USA
| | - Paz Polak
- Oncological Sciences, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, USA
| | | | - Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA.,Harvard Medical School, Boston, MA, USA.,Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
| |
|
33
|
Abstract
Tumor cells acquire distinct genetic characteristics as a means to survive and proliferate indefinitely. Changes in the genetic code can also translate into changes at the protein level, therefore creating a distinguishable signature unique to tumor cells and absent in normal tissues. The presence of discernible moieties in tumors is particularly attractive because it represents a therapeutic opportunity to target tumor cells with specificity, while sparing non-transformed cells. In this sense neoantigens, short peptides containing a mutated sequence, are seen as attractive therapeutic targets because of their confinement within tumor cells. Neoantigens can be recognized with high affinity and specificity by tumor-targeting T cells, which consequently can initiate a potent anti-tumor immune response. While this is feasible and has been tested in numerous cancer types including melanoma, colon and lung cancer, to mention a few, there are technical challenges in identifying immunogenic neoantigens. In this manuscript we address the topic of neoantigen identification from tumor samples, offering a technical overview of the bioinformatic methods utilized to profile the neoantigenic load of tumor samples obtained from clinical specimens. This is meant to guide readers through the steps of neoantigen identification using genomic data, by suggesting tools and methods that can provide, with a high degree of confidence, reliable results for downstream in vitro and in vivo applications.
Affiliation(s)
- Sebastiano Battaglia
- Center For Immunotherapy, Department of Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States.
| |
|
34
|
Bartha Á, Győrffy B. Comprehensive Outline of Whole Exome Sequencing Data Analysis Tools Available in Clinical Oncology. Cancers (Basel) 2019; 11:E1725. [PMID: 31690036 PMCID: PMC6895801 DOI: 10.3390/cancers11111725] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/30/2019] [Revised: 10/31/2019] [Accepted: 11/01/2019] [Indexed: 12/17/2022]
Abstract
Whole exome sequencing (WES) enables the analysis of all protein coding sequences in the human genome. This technology enables the investigation of cancer-related genetic aberrations that are predominantly located in the exonic regions. WES delivers high-throughput results at a reasonable price. Here, we review analysis tools enabling utilization of WES data in clinical and research settings. Technically, WES initially allows the detection of single nucleotide variants (SNVs) and copy number variations (CNVs), and data obtained through these methods can be combined and further utilized. Variant calling algorithms for SNVs range from standalone tools to machine learning-based combined pipelines. Tools for CNV detection compare the number of reads aligned to a dedicated segment. Both SNVs and CNVs help to identify mutations resulting in pharmacologically druggable alterations. The identification of homologous recombination deficiency enables the use of PARP inhibitors. Determining microsatellite instability and tumor mutation burden helps to select patients eligible for immunotherapy. To pave the way for clinical applications, we have to recognize some limitations of WES, including its restricted ability to detect CNVs, low coverage compared to targeted sequencing, and the missing consensus regarding references and minimal application requirements. Recently, Galaxy became the leading platform in non-command line-based WES data processing. The maturation of next-generation sequencing is reinforced by Food and Drug Administration (FDA)-approved methods for cancer screening, detection, and follow-up. WES is on the verge of becoming an affordable and sufficiently evolved technology for everyday clinical use.
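The read-count comparison behind CNV detection, as summarized in the abstract above, can be illustrated with a minimal sketch. It assumes a matched tumor/normal design and a plain library-size normalization; the function name `log2_copy_ratio` and its parameters are hypothetical and are not taken from any tool reviewed in the article.

```python
import math


def log2_copy_ratio(tumor_reads: int, normal_reads: int,
                    tumor_total: int, normal_total: int) -> float:
    """Log2 ratio of library-size-normalized read counts for one segment.

    Values near 0 suggest no copy number change, positive values a gain,
    and negative values a loss. This is an illustrative simplification;
    real callers also correct for GC content, capture bias, and purity.
    """
    if min(tumor_reads, normal_reads, tumor_total, normal_total) <= 0:
        raise ValueError("all read counts must be positive")
    # (tumor_reads / tumor_total) / (normal_reads / normal_total),
    # rearranged so the integer products are computed before dividing.
    ratio = (tumor_reads * normal_total) / (normal_reads * tumor_total)
    return math.log2(ratio)


# Hypothetical counts: twice as many tumor reads in the segment
# at equal sequencing depth.
gain = log2_copy_ratio(200, 100, 1_000_000, 1_000_000)
```

A segment with equal normalized coverage in both samples yields 0, which is why the log scale is convenient for thresholding gains and losses symmetrically.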
Affiliation(s)
- Áron Bartha
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary.
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary.
| | - Balázs Győrffy
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, H-1094 Budapest, Hungary.
- TTK Cancer Biomarker Research Group, Institute of Enzymology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary.
| |
|
35
|
Sepulveda JL. Using R and Bioconductor in Clinical Genomics and Transcriptomics. J Mol Diagn 2019; 22:3-20. [PMID: 31605800 DOI: 10.1016/j.jmoldx.2019.08.006] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 12/18/2017] [Revised: 05/02/2019] [Accepted: 08/08/2019] [Indexed: 02/08/2023]
Abstract
Bioinformatics pipelines are essential in the analysis of genomic and transcriptomic data generated by next-generation sequencing (NGS). Recent guidelines emphasize the need for rigorous validation and assessment of robustness, reproducibility, and quality of NGS analytic pipelines intended for clinical use. Software tools written in the R statistical language and, in particular, the set of tools available in the Bioconductor repository are widely used in research bioinformatics; and these frameworks offer several advantages for use in clinical bioinformatics, including the breadth of available tools, modular nature of software packages, ease of installation, enforcement of interoperability, version control, and short learning curve. This review provides an introduction to R and Bioconductor software, its advantages and limitations for clinical bioinformatics, and illustrative examples of tools that can be used in various steps of NGS analysis.
Affiliation(s)
- Jorge L Sepulveda
- Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York, New York; Informatics Subdivision Leadership, Association for Molecular Pathology, Bethesda, Maryland.
| |
|
36
|
Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures. Proc Natl Acad Sci U S A 2019; 116:18962-18970. [PMID: 31462496 PMCID: PMC6754584 DOI: 10.1073/pnas.1901156116] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/12/2022]
Abstract
Large-scale exome sequencing of tumors has enabled the identification of cancer drivers using recurrence-based approaches. Some of these methods also employ 3D protein structures to identify mutational hotspots in cancer-associated genes. In determining such mutational clusters in structures, existing approaches overlook protein dynamics, despite its essential role in protein function. We present a framework to identify cancer driver genes using a dynamics-based search of mutational hotspot communities. Mutations are mapped to protein structures, which are partitioned into distinct residue communities. These communities are identified in a framework where residue-residue contact edges are weighted by correlated motions (as inferred by dynamics-based models). We then search for signals of positive selection among these residue communities to identify putative driver genes, while applying our method to the TCGA (The Cancer Genome Atlas) PanCancer Atlas missense mutation catalog. Overall, we predict 1 or more mutational hotspots within the resolved structures of proteins encoded by 434 genes. These genes were enriched among biological processes associated with tumor progression. Additionally, a comparison between our approach and existing cancer hotspot detection methods using structural data suggests that including protein dynamics significantly increases the sensitivity of driver detection.
|
37
|
Zhang J, Caruso FP, Sa JK, Justesen S, Nam DH, Sims P, Ceccarelli M, Lasorella A, Iavarone A. The combination of neoantigen quality and T lymphocyte infiltrates identifies glioblastomas with the longest survival. Commun Biol 2019; 2:135. [PMID: 31044160 PMCID: PMC6478916 DOI: 10.1038/s42003-019-0369-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 10/29/2018] [Accepted: 03/06/2019] [Indexed: 12/16/2022]
Abstract
Glioblastoma (GBM) is resistant to multimodality therapeutic approaches. A high burden of tumor-specific mutant peptides (neoantigens) correlates with better survival and response to immunotherapies in selected solid tumors but how neoantigens impact clinical outcome in GBM remains unclear. Here, we exploit the similarity between tumor neoantigens and infectious disease-derived immune epitopes and apply a neoantigen fitness model for identifying high-quality neoantigens in a human pan-glioma dataset. We find that the neoantigen quality fitness model stratifies GBM patients with more favorable clinical outcome and, together with CD8+ T lymphocytes tumor infiltration, identifies a GBM subgroup with the longest survival, which displays distinct genomic and transcriptomic features. Conversely, neither tumor neoantigen burden from a quantitative model nor the isolated enrichment of CD8+ T lymphocytes were able to predict survival of GBM patients. This approach may guide optimal stratification of GBM patients for maximum response to immunotherapy.
Affiliation(s)
- Jing Zhang
- Institute for Cancer Genetics, Columbia University Medical Center, New York, NY 10032 USA
| | - Francesca P. Caruso
- Department of Science and Technology, Universita’ degli Studi del Sannio, 82100 Benevento, Italy
- BIOGEM Istituto di Ricerche Genetiche ‘G. Salvatore’, Campo Reale, 83031 Ariano Irpino, Italy
| | - Jason K. Sa
- Institute for Refractory Cancer Research, Samsung Medical Center, Seoul, Republic of Korea
| | - Sune Justesen
- Immunitrack Aps, Rønnegade 4, 2100 Copenhagen East, Denmark
| | - Do-Hyun Nam
- Institute for Refractory Cancer Research, Samsung Medical Center, Seoul, Republic of Korea
- Department of Health Sciences and Technology, SAIHST, Sungkyunkwan University, Seoul, Republic of Korea
- Department of Neurosurgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Peter Sims
- Department of Systems Biology, Columbia University Medical Center, New York, NY 10032 USA
| | - Michele Ceccarelli
- Department of Science and Technology, Universita’ degli Studi del Sannio, 82100 Benevento, Italy
- ABBVIE, Redwood City (CA), Redwood City, CA 94063 USA
| | - Anna Lasorella
- Institute for Cancer Genetics, Columbia University Medical Center, New York, NY 10032 USA
- Department of Pediatrics, Columbia University Medical Center, New York, NY 10032 USA
- Department of Pathology and Cell Biology, Columbia University Medical Center, New York, NY 10032 USA
| | - Antonio Iavarone
- Institute for Cancer Genetics, Columbia University Medical Center, New York, NY 10032 USA
- Department of Pathology and Cell Biology, Columbia University Medical Center, New York, NY 10032 USA
- Department of Neurology, Columbia University Medical Center, New York, NY 10032 USA
| |
|
38
|
Calling Variants in the Clinic: Informed Variant Calling Decisions Based on Biological, Clinical, and Laboratory Variables. Comput Struct Biotechnol J 2019; 17:561-569. [PMID: 31049166 PMCID: PMC6482431 DOI: 10.1016/j.csbj.2019.04.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/21/2018] [Revised: 03/12/2019] [Accepted: 04/03/2019] [Indexed: 01/10/2023]
Abstract
Deep sequencing genomic analysis is becoming increasingly common in clinical research and practice, enabling accurate identification of diagnostic, prognostic, and predictive determinants. Variant calling, distinguishing between true mutations and experimental errors, is a central task of genomic analysis and often requires sophisticated statistical, computational, and/or heuristic techniques. Although variant callers seek to overcome noise inherent in biological experiments, variant calling can be significantly affected by outside factors including those used to prepare, store, and analyze samples. The goal of this review is to discuss known experimental features, such as sample preparation, library preparation, and sequencing, alongside diverse biological and clinical variables, and evaluate their effect on variant caller selection and optimization.
|
39
|
Genomic characterization of genes encoding histone acetylation modulator proteins identifies therapeutic targets for cancer treatment. Nat Commun 2019; 10:733. [PMID: 30760718 PMCID: PMC6374416 DOI: 10.1038/s41467-019-08554-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 09/18/2018] [Accepted: 01/18/2019] [Indexed: 02/07/2023]
Abstract
A growing emphasis in anticancer drug discovery efforts has been on targeting histone acetylation modulators. Here we comprehensively analyze the genomic alterations of the genes encoding histone acetylation modulator proteins (HAMPs) in the Cancer Genome Atlas cohort and observe that HAMPs have a high frequency of focal copy number alterations and recurrent mutations, whereas transcript fusions of HAMPs are relatively rare genomic events in common adult cancers. Collectively, 86.3% (63/73) of HAMPs have recurrent alterations in at least 1 cancer type and 16 HAMPs, including 9 understudied HAMPs, are identified as putative therapeutic targets across multiple cancer types. For example, the recurrent focal amplification of BRD9 is observed in 9 cancer types and genetic depletion of BRD9 inhibits tumor growth. Our systematic genomic analysis of HAMPs across a large-scale cancer specimen cohort may facilitate the identification and prioritization of potential drug targets and selection of suitable patients for precision treatment. Targeting histone acetylation modulators (HAMPs) is a promising avenue of drug discovery in cancer research. Here, the authors integrate multi-dimensional genomic profiles to systematically investigate recurrent genomic alterations in HAMPs, identifying potential therapeutic targets for precision epigenetic treatment.
|
40
|
Mosen-Ansorena D. Identification of Mutated Cancer Driver Genes in Unpaired RNA-Seq Samples. Methods Mol Biol 2019; 1878:95-108. [PMID: 30378071 DOI: 10.1007/978-1-4939-8868-6_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The identification of cancer driver genes through the analysis of mutations detected with high-throughput sequencing is a useful tool and a key challenge in cancer genomics. The workflow presented here relies on unpaired RNA-seq tumoral samples, thus leveraging already available RNA-seq data and providing the intrinsic benefits of directly targeting the transcriptome. Based on well-established methods for variant detection, this workflow also involves thorough data cleaning and extensive annotation, which enable the selection of somatic mutations with functional impact and the prioritization of genes relevant to the carcinogenic processes in the input samples.
Affiliation(s)
- David Mosen-Ansorena
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| |
|
41
|
Neums L, Suenaga S, Beyerlein P, Anders S, Koestler D, Mariani A, Chien J. VaDiR: an integrated approach to Variant Detection in RNA. Gigascience 2018; 7:4757064. [PMID: 29267927 PMCID: PMC5827345 DOI: 10.1093/gigascience/gix122] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/07/2016] [Accepted: 11/30/2017] [Indexed: 12/22/2022]
Abstract
Background Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. Results We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Conclusions Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
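The tiering described in the abstract above can be sketched with set operations. This is one reading of the abstract, not the VaDiR implementation: Tier 1 is assumed to be the intersection of all three call sets, and the higher-recall set is assumed to be Tier 1 combined with variants called by both MuTect2 and SNPiR. The `Variant` type and the function `tier_variants` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Minimal variant key (hypothetical; real pipelines parse VCF records)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tier_variants(snpir: Set[Variant], rvboost: Set[Variant],
                  mutect2: Set[Variant]) -> Tuple[Set[Variant], Set[Variant]]:
    """Return (tier1, extended) call sets under the assumptions stated above."""
    # Assumed Tier 1: variants reported by all three callers.
    tier1 = snpir & rvboost & mutect2
    # Assumed higher-recall set: Tier 1 plus calls shared by MuTect2 and SNPiR.
    # Tier 1 is already a subset of that pair; the union mirrors the wording.
    extended = tier1 | (mutect2 & snpir)
    return tier1, extended


# Made-up call sets for illustration only.
a = Variant("chr1", 100, "A", "T")
b = Variant("chr2", 200, "G", "C")
c = Variant("chr3", 300, "C", "G")
tier1, extended = tier_variants({a, b}, {a, c}, {a, b, c})
```

Keying variants on chromosome, position, and alleles keeps the comparison independent of caller-specific annotation fields.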
Affiliation(s)
- Lisa Neums
- Department of Cancer Biology, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA.,Department of Bioinformatics and Biosystems Technology, University of Applied Sciences Wildau, Hochschulring 1, 15745 Wildau, Germany
| | - Seiji Suenaga
- Department of Cancer Biology, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA
| | - Peter Beyerlein
- Department of Bioinformatics and Biosystems Technology, University of Applied Sciences Wildau, Hochschulring 1, 15745 Wildau, Germany
| | - Sara Anders
- Department of Bioinformatics and Biosystems Technology, University of Applied Sciences Wildau, Hochschulring 1, 15745 Wildau, Germany
| | - Devin Koestler
- Department of Biostatistics, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA
| | - Andrea Mariani
- Obstetrics and Gynecology, Cancer Center, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
| | - Jeremy Chien
- Department of Internal Medicine, University of New Mexico Health Sciences Center, 2325 Camino de Salud NE, Albuquerque, NM 87131, USA
| |
|
42
|
Xiang Y, Ye Y, Zhang Z, Han L. Maximizing the Utility of Cancer Transcriptomic Data. Trends Cancer 2018; 4:823-837. [PMID: 30470304 DOI: 10.1016/j.trecan.2018.09.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/23/2018] [Revised: 09/23/2018] [Accepted: 09/24/2018] [Indexed: 12/13/2022]
Abstract
Transcriptomic profiling has been applied to large numbers of cancer samples, by large-scale consortia, including The Cancer Genome Atlas, International Cancer Genome Consortium, and Cancer Cell Line Encyclopedia. Advances in mining cancer transcriptomic data enable us to understand the endless complexity of the cancer transcriptome and thereby to discover new biomarkers and therapeutic targets. In this paper, we review computational resources for deep mining of transcriptomic data to identify, quantify, and determine the functional effects and clinical utility of transcriptomic events, including noncoding RNAs, post-transcriptional regulation, exogenous RNAs, and transcribed genetic variants. These approaches can be applied to other complex diseases, thereby greatly leveraging the impact of this work.
Affiliation(s)
- Yu Xiang
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; These authors contributed equally
| | - Youqiong Ye
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; These authors contributed equally
| | - Zhao Zhang
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Leng Han
- Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
| |
|
43
|
Korkut A, Zaidi S, Kanchi RS, Rao S, Gough NR, Schultz A, Li X, Lorenzi PL, Berger AC, Robertson G, Kwong LN, Datto M, Roszik J, Ling S, Ravikumar V, Manyam G, Rao A, Shelley S, Liu Y, Ju Z, Hansel D, de Velasco G, Pennathur A, Andersen JB, O'Rourke CJ, Ohshiro K, Jogunoori W, Nguyen BN, Li S, Osmanbeyoglu HU, Ajani JA, Mani SA, Houseman A, Wiznerowicz M, Chen J, Gu S, Ma W, Zhang J, Tong P, Cherniack AD, Deng C, Resar L, Weinstein JN, Mishra L, Akbani R. A Pan-Cancer Analysis Reveals High-Frequency Genetic Alterations in Mediators of Signaling by the TGF-β Superfamily. Cell Syst 2018; 7:422-437.e7. [PMID: 30268436 DOI: 10.1016/j.cels.2018.08.010] [Citation(s) in RCA: 124] [Impact Index Per Article: 20.7] [Received: 03/01/2018] [Revised: 05/29/2018] [Accepted: 08/21/2018] [Indexed: 02/07/2023]
Abstract
We present an integromic analysis of gene alterations that modulate transforming growth factor β (TGF-β)-Smad-mediated signaling in 9,125 tumor samples across 33 cancer types in The Cancer Genome Atlas (TCGA). Focusing on genes that encode mediators and regulators of TGF-β signaling, we found at least one genomic alteration (mutation, homozygous deletion, or amplification) in 39% of samples, with highest frequencies in gastrointestinal cancers. We identified mutation hotspots in genes that encode TGF-β ligands (BMP5), receptors (TGFBR2, ACVR2A, and BMPR2), and Smads (SMAD2 and SMAD4). Alterations in the TGF-β superfamily correlated positively with expression of metastasis-associated genes and with decreased survival. Correlation analyses showed the contributions of mutation, amplification, deletion, DNA methylation, and miRNA expression to transcriptional activity of TGF-β signaling in each cancer type. This study provides a broad molecular perspective relevant for future functional and therapeutic studies of the diverse cancer pathways mediated by the TGF-β superfamily.
Affiliation(s)
- Anil Korkut
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Sobia Zaidi
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Rupa S Kanchi
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Shuyun Rao
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Nancy R Gough
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Andre Schultz
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Xubin Li
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Philip L Lorenzi
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ashton C Berger
- Cancer Program, The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Gordon Robertson
- Canada's Michael Smith Genome Sciences Center, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
| | - Lawrence N Kwong
- Department of Translational Molecular Pathology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Mike Datto
- Department of Pathology, Duke School of Medicine Durham, Durham, NC 27710, USA
| | - Jason Roszik
- Department of Melanoma Medical Oncology and Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Shiyun Ling
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Visweswaran Ravikumar
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ganiraju Manyam
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Arvind Rao
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Simon Shelley
- Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI 53726, USA
| | - Yuexin Liu
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Zhenlin Ju
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Donna Hansel
- Department of Pathology, University of California, San Diego, La Jolla, CA 92093, USA
| | - Guillermo de Velasco
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Medical Oncology, University Hospital 12 de Octubre, Madrid 28041, Spain
| | - Arjun Pennathur
- Department of Cardiothoracic Surgery, University of Pittsburgh School of Medicine and University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
| | - Jesper B Andersen
- Department of Health and Medical Sciences, Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen 2200, Denmark
| | - Colm J O'Rourke
- Department of Health and Medical Sciences, Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, Copenhagen 2200, Denmark
| | - Kazufumi Ohshiro
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Wilma Jogunoori
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA; Veterans Affairs Medical Center, Institute of Clinical Research, Washington, DC 20422, USA
| | - Bao-Ngoc Nguyen
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Shulin Li
- Department of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Hatice U Osmanbeyoglu
- Memorial Sloan Kettering Cancer Center, Computational & Systems Biology Program, New York, NY 10065, USA
| | - Jaffer A Ajani
- Department of GI Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Sendurai A Mani
- Department of Translational Molecular Pathology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Andres Houseman
- College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 9733, USA
| | - Maciej Wiznerowicz
- Poznań University of Medical Sciences, Poznań 61701, Poland; Greater Poland Cancer Center, Poznań 61866, Poland; International Institute for Molecular Oncology, Poznań 60203, Poland
| | - Jian Chen
- Department of Gastroenterology, Hepatology & Nutrition, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Shoujun Gu
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA
| | - Wencai Ma
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Jiexin Zhang
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Pan Tong
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Andrew D Cherniack
- Cancer Program, The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Chuxia Deng
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA; Faculty of Health Sciences, University of Macau, Macau, Macau SAR, China
| | - Linda Resar
- Departments of Medicine, Division of Hematology, Oncology and Pathology, The Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | | | - John N Weinstein
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Systems Biology, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Lopa Mishra
- Center for Translational Medicine, Department of Surgery, George Washington University, Washington, DC 20037, USA; Veterans Affairs Medical Center, Institute of Clinical Research, Washington, DC 20422, USA.
| | - Rehan Akbani
- Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center, Houston, TX 77030, USA.
44
Yang R, Van Etten JL, Dehm SM. Indel detection from DNA and RNA sequencing data with transIndel. BMC Genomics 2018; 19:270. [PMID: 29673323 PMCID: PMC5909256 DOI: 10.1186/s12864-018-4671-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/18/2017] [Accepted: 04/13/2018] [Indexed: 12/18/2022] Open
Abstract
Background: Insertions and deletions (indels) are a major class of genomic variation associated with human disease. Indels are primarily detected from DNA sequencing (DNA-seq) data, but their transcriptional consequences remain unexplored due to challenges in discriminating medium-sized and large indels from splicing events in RNA-seq data. Results: Here, we developed transIndel, a splice-aware algorithm that parses the chimeric alignments predicted by a short read aligner and reconstructs the mid-sized insertions and large deletions based on the linear alignments of split reads from DNA-seq or RNA-seq data. TransIndel exhibits competitive or superior performance over eight state-of-the-art indel detection tools on benchmarks using both synthetic and real DNA-seq data. Additionally, we applied transIndel to DNA-seq and RNA-seq datasets from 333 primary prostate cancer patients from The Cancer Genome Atlas (TCGA) and 59 metastatic prostate cancer patients from AACR-PCF Stand-Up-To-Cancer (SU2C) studies. TransIndel enhanced the taxonomy of DNA- and RNA-level alterations in prostate cancer by identifying recurrent FOXA1 indels as well as exitron splicing in genes implicated in disease progression. Conclusions: Our study demonstrates that transIndel is a robust tool for elucidation of medium- and large-sized indels from DNA-seq and RNA-seq data. Including RNA-seq in indel discovery efforts leads to significant improvements in sensitivity for identification of medium-sized and large indels missed by DNA-seq, and reveals non-canonical RNA-splicing events in genes associated with disease pathology.
Affiliation(s)
- Rendong Yang
- The Hormel Institute, University of Minnesota, 801 16th AVE NE, Austin, MN, 55912, USA; Masonic Cancer Center, University of Minnesota, 420 Delaware St SE, Minneapolis, MN, 55455, USA.
| | - Jamie L Van Etten
- Masonic Cancer Center, University of Minnesota, 420 Delaware St SE, Minneapolis, MN, 55455, USA
| | - Scott M Dehm
- Masonic Cancer Center, University of Minnesota, 420 Delaware St SE, Minneapolis, MN, 55455, USA; Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, 55455, USA.
45
Berger AC, Korkut A, Kanchi RS, Hegde AM, Lenoir W, Liu W, Liu Y, Fan H, Shen H, Ravikumar V, Rao A, Schultz A, Li X, Sumazin P, Williams C, Mestdagh P, Gunaratne PH, Yau C, Bowlby R, Robertson AG, Tiezzi DG, Wang C, Cherniack AD, Godwin AK, Kuderer NM, Rader JS, Zuna RE, Sood AK, Lazar AJ, Ojesina AI, Adebamowo C, Adebamowo SN, Baggerly KA, Chen TW, Chiu HS, Lefever S, Liu L, MacKenzie K, Orsulic S, Roszik J, Shelley CS, Song Q, Vellano CP, Wentzensen N, Weinstein JN, Mills GB, Levine DA, Akbani R. A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers. Cancer Cell 2018; 33:690-705.e9. [PMID: 29622464 PMCID: PMC5959730 DOI: 10.1016/j.ccell.2018.03.014] [Citation(s) in RCA: 370] [Impact Index Per Article: 61.7] [Received: 10/26/2017] [Revised: 02/22/2018] [Accepted: 03/12/2018] [Indexed: 02/07/2023]
Abstract
We analyzed molecular data on 2,579 tumors from The Cancer Genome Atlas (TCGA) of four gynecological types plus breast. Our aims were to identify shared and unique molecular features, clinically significant subtypes, and potential therapeutic targets. We found 61 somatic copy-number alterations (SCNAs) and 46 significantly mutated genes (SMGs). Eleven SCNAs and 11 SMGs had not been identified in previous TCGA studies of the individual tumor types. We found functionally significant estrogen receptor-regulated long non-coding RNAs (lncRNAs) and gene/lncRNA interaction networks. Pathway analysis identified subtypes with high leukocyte infiltration, raising potential implications for immunotherapy. Using 16 key molecular features, we identified five prognostic subtypes and developed a decision tree that classified patients into the subtypes based on just six features that are assessable in clinical laboratories.
Affiliation(s)
- Ashton C Berger
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Anil Korkut
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Rupa S Kanchi
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Apurva M Hegde
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Walter Lenoir
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Wenbin Liu
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Yuexin Liu
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Huihui Fan
- Center for Epigenetics, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, MI 49503, USA
| | - Hui Shen
- Center for Epigenetics, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, MI 49503, USA
| | - Visweswaran Ravikumar
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Arvind Rao
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Andre Schultz
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Xubin Li
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Pavel Sumazin
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Cecilia Williams
- Department of Protein Sciences, CBH, KTH - Royal Institute of Technology, Science for Life Laboratory, Tomtebodavägen 23, 171 21 Solna, Sweden
| | - Pieter Mestdagh
- Department of Pediatrics and Medical Genetics, Ghent University, Ghent, Belgium
| | - Preethi H Gunaratne
- Department of Biology & Biochemistry, UH-Sequencing Core, University of Houston, Houston, TX 77204, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Christina Yau
- Buck Institute of Research on Aging, Novato, CA 94945, USA; Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Reanne Bowlby
- BC Cancer Agency, Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - A Gordon Robertson
- BC Cancer Agency, Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Daniel G Tiezzi
- Breast Disease and Gynecologic Oncology Division - Department of Gynecology and Obstetrics, Ribeirão Preto Medical School, University of São Paulo, 3900 Bandeirantes Avenue, Ribeirão Preto, SP 14048-900, Brazil
| | - Chen Wang
- Department of Health Sciences Research, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA; Department of Obstetrics and Gynecology, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
| | - Andrew D Cherniack
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana Farber Cancer Institute, Boston, MA 02215, USA
| | - Andrew K Godwin
- Department of Pathology and Laboratory Medicine, The University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, KS 66160, USA
| | - Nicole M Kuderer
- Advanced Cancer Research Group, Seattle, Washington, and Center for Cancer Innovation, Department of Medicine, University of Washington, WA 98195, USA
| | - Janet S Rader
- Department of Obstetrics and Gynecology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Rosemary E Zuna
- Pathology Department, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73104, USA
| | - Anil K Sood
- Department of Gynecologic Oncology and Reproductive Medicine, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Alexander J Lazar
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Akinyemi I Ojesina
- Department of Epidemiology and Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL 35294, USA
| | - Clement Adebamowo
- Department of Epidemiology and Public Health, Institute of Human Virology and Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Institute of Human Virology, Abuja, Nigeria
| | - Sally N Adebamowo
- Department of Epidemiology and Public Health, Institute of Human Virology and Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| | - Keith A Baggerly
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ting-Wen Chen
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA; Bioinformatics Center, Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan
| | - Hua-Sheng Chiu
- Texas Children's Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Steve Lefever
- Department of Pediatrics and Medical Genetics, Ghent University, Ghent, Belgium
| | - Liang Liu
- Department of Cancer Biology, Wake Forest Baptist Health Center, Winston Salem, NC 27157, USA
| | - Karen MacKenzie
- School of Women's and Children's Health, University of New South Wales, Sydney, Australia
| | - Sandra Orsulic
- Women's Cancer Program, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Jason Roszik
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Melanoma Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | | | - Qianqian Song
- Department of Cancer Biology, Wake Forest Baptist Health Center, Winston Salem, NC 27157, USA
| | - Christopher P Vellano
- Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Nicolas Wentzensen
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - John N Weinstein
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
| | - Gordon B Mills
- Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
| | - Douglas A Levine
- Gynecologic Oncology, Perlmutter Cancer Center, New York University Langone Health, New York, NY 10016, USA.
| | - Rehan Akbani
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
46
Ding L, Bailey MH, Porta-Pardo E, Thorsson V, Colaprico A, Bertrand D, Gibbs DL, Weerasinghe A, Huang KL, Tokheim C, Cortés-Ciriano I, Jayasinghe R, Chen F, Yu L, Sun S, Olsen C, Kim J, Taylor AM, Cherniack AD, Akbani R, Suphavilai C, Nagarajan N, Stuart JM, Mills GB, Wyczalkowski MA, Vincent BG, Hutter CM, Zenklusen JC, Hoadley KA, Wendl MC, Shmulevich L, Lazar AJ, Wheeler DA, Getz G. Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics. Cell 2018; 173:305-320.e10. [PMID: 29625049 PMCID: PMC5916814 DOI: 10.1016/j.cell.2018.03.033] [Citation(s) in RCA: 210] [Impact Index Per Article: 35.0] [Received: 11/17/2017] [Revised: 02/20/2018] [Accepted: 03/13/2018] [Indexed: 12/21/2022]
Abstract
The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing.
Affiliation(s)
- Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA.
| | - Matthew H Bailey
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Eduard Porta-Pardo
- Barcelona Supercomputing Centre, 08034 Barcelona, Spain; Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
| | | | - Antonio Colaprico
- Machine Learning Group (MLG), Département d'Informatique, Université Libre de Bruxelles, 1050 Brussels, Belgium; Department of Human Genetics, University of Miami, Miami, FL 33136, USA
| | - Denis Bertrand
- Computational and Systems Biology, Genome Institute of Singapore, Singapore, 13862
| | - David L Gibbs
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Amila Weerasinghe
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Kuan-Lin Huang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Collin Tokheim
- Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD 21218, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Isidro Cortés-Ciriano
- Harvard Medical School, Boston, MA 02115, USA; Ludwig Center at Harvard, Boston, MA 02115, USA; Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Reyka Jayasinghe
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Feng Chen
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63110, USA
| | - Lihua Yu
- H3 Biomedicine Inc., Cambridge, MA 02139, USA
| | - Sam Sun
- Department of Radiation Oncology, Baylor College of Medicine, Houston, TX 77030, USA
| | - Catharina Olsen
- Machine Learning Group (MLG), Département d'Informatique, Université Libre de Bruxelles, 1050 Brussels, Belgium
| | - Jaegil Kim
- Broad Institute, Cambridge, MA 02142, USA
| | - Alison M Taylor
- Broad Institute, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Andrew D Cherniack
- Broad Institute, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA
| | - Rehan Akbani
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77498, USA
| | - Chayaporn Suphavilai
- Computational and Systems Biology, Genome Institute of Singapore, Singapore, 13862
| | - Niranjan Nagarajan
- Computational and Systems Biology, Genome Institute of Singapore, Singapore, 13862
| | - Joshua M Stuart
- Baskin School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Gordon B Mills
- Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77498, USA
| | - Matthew A Wyczalkowski
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Benjamin G Vincent
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA; Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Carolyn M Hutter
- National Human Genome Research Institute, Bethesda, MD 20892, USA
| | | | - Katherine A Hoadley
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Michael C Wendl
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63110, USA
| | | | - Alexander J Lazar
- Departments of Pathology, Genomic Medicine, and Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77498, USA
| | - David A Wheeler
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA; Dan L Duncan Cancer Center, Baylor College of Medicine, Houston, TX 77030, USA.
| | - Gad Getz
- Harvard Medical School, Boston, MA 02115, USA; Broad Institute, Cambridge, MA 02142, USA; Massachusetts General Hospital, Boston, MA 02114, USA.
47
Ricketts CJ, De Cubas AA, Fan H, Smith CC, Lang M, Reznik E, Bowlby R, Gibb EA, Akbani R, Beroukhim R, Bottaro DP, Choueiri TK, Gibbs RA, Godwin AK, Haake S, Hakimi AA, Henske EP, Hsieh JJ, Ho TH, Kanchi RS, Krishnan B, Kwiatkowski DJ, Liu W, Merino MJ, Mills GB, Myers J, Nickerson ML, Reuter VE, Schmidt LS, Shelley CS, Shen H, Shuch B, Signoretti S, Srinivasan R, Tamboli P, Thomas G, Vincent BG, Vocke CD, Wheeler DA, Yang L, Kim WY, Robertson AG, Spellman PT, Rathmell WK, Linehan WM. The Cancer Genome Atlas Comprehensive Molecular Characterization of Renal Cell Carcinoma. Cell Rep 2018; 23:313-326.e5. [PMID: 29617669 PMCID: PMC6075733 DOI: 10.1016/j.celrep.2018.03.075] [Citation(s) in RCA: 458] [Impact Index Per Article: 76.3] [Received: 09/26/2017] [Revised: 03/09/2018] [Accepted: 03/19/2018] [Indexed: 01/05/2023] Open
Abstract
Renal cell carcinoma (RCC) is not a single disease, but several histologically defined cancers with different genetic drivers, clinical courses, and therapeutic responses. The current study evaluated 843 RCC from the three major histologic subtypes, including 488 clear cell RCC, 274 papillary RCC, and 81 chromophobe RCC. Comprehensive genomic and phenotypic analysis of the RCC subtypes reveals distinctive features of each subtype that provide the foundation for the development of subtype-specific therapeutic and management strategies for patients affected with these cancers. Somatic alteration of BAP1, PBRM1, and PTEN and altered metabolic pathways correlated with subtype-specific decreased survival, while CDKN2A alteration, increased DNA hypermethylation, and increases in the immune-related Th2 gene expression signature correlated with decreased survival within all major histologic subtypes. CIMP-RCC demonstrated an increased immune signature, and a uniform and distinct metabolic expression pattern identified a subset of metabolically divergent (MD) ChRCC that associated with extremely poor survival.
Affiliation(s)
- Christopher J Ricketts
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA
| | | | - Huihui Fan
- Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - Christof C Smith
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Martin Lang
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA
| | - Ed Reznik
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Reanne Bowlby
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Ewan A Gibb
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | - Rehan Akbani
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Rameen Beroukhim
- The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Donald P Bottaro
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA
| | | | | | - Andrew K Godwin
- University of Kansas Medical Center, Kansas City, KS 66206, USA
| | - Scott Haake
- Vanderbilt University School of Medicine, Nashville, TN 37232, USA
| | - A Ari Hakimi
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | | | - James J Hsieh
- Washington University School of Medicine, St. Louis, MO 63110, USA
| | - Thai H Ho
- Mayo Clinic Arizona, Phoenix, AZ 85054, USA
| | - Rupa S Kanchi
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Bhavani Krishnan
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - Wenbin Liu
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Maria J Merino
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA
| | - Gordon B Mills
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | | | - Michael L Nickerson
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA
| | - Victor E Reuter
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Laura S Schmidt
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA; Basic Science Program, Leidos Biomedical Research, Inc. Frederick National Laboratory of Cancer Research, Frederick, MD 21702, USA
| | | | - Hui Shen
- Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | | | | | - Ramaprasad Srinivasan
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA
| | - Pheroze Tamboli
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - George Thomas
- Oregon Health & Science University, Portland, OR 97239, USA
| | - Benjamin G Vincent
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Cathy D Vocke
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA
| | | | - Lixing Yang
- Harvard Medical School, Boston, MA 02115, USA
| | - William Y Kim
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - A Gordon Robertson
- Canada's Michael Smith Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada
| | | | | | - W Marston Linehan
- Urologic Oncology Branch, National Cancer Institute, Center for Cancer Research, Bethesda, MD 20892, USA.
48
Ellrott K, Bailey MH, Saksena G, Covington KR, Kandoth C, Stewart C, Hess J, Ma S, Chiotti KE, McLellan M, Sofia HJ, Hutter C, Getz G, Wheeler D, Ding L. Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines. Cell Syst 2018; 6:271-281.e7. [PMID: 29596782 PMCID: PMC6075717 DOI: 10.1016/j.cels.2018.03.002] [Citation(s) in RCA: 455] [Impact Index Per Article: 75.8] [Received: 08/14/2017] [Revised: 01/21/2018] [Accepted: 03/01/2018] [Indexed: 12/12/2022]
Abstract
The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects.
Affiliation(s)
- Kyle Ellrott
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
- Matthew H Bailey
- Department of Medicine, McDonnell Genome Institute, Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
- Gordon Saksena
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
- Kyle R Covington
- Department of Molecular and Human Genetics, Baylor College of Medicine Human Genome Sequencing Center, 1 Baylor Plaza, Houston, TX 77030, USA
- Cyriac Kandoth
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10021, USA
- Chip Stewart
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
- Julian Hess
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
- Singer Ma
- DNAnexus, 1975 W EL Camino Real, Suite 204, Mountain View, CA 94040, USA
- Kami E Chiotti
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239, USA
- Michael McLellan
- Department of Medicine, McDonnell Genome Institute, Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
- Heidi J Sofia
- National Human Genome Research Institute (NHGRI), NIH, Bethesda, MD 20892, USA
- Carolyn Hutter
- National Human Genome Research Institute (NHGRI), NIH, Bethesda, MD 20892, USA
- Gad Getz
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA; Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA 02129, USA; Harvard Medical School, Boston, MA 02115, USA
- David Wheeler
- Department of Molecular and Human Genetics, Baylor College of Medicine Human Genome Sequencing Center, 1 Baylor Plaza, Houston, TX 77030, USA
- Li Ding
- Department of Medicine, McDonnell Genome Institute, Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
|
49
|
Albuquerque MA, Grande BM, Ritch EJ, Pararajalingam P, Jessa S, Krzywinski M, Grewal JK, Shah SP, Boutros PC, Morin RD. Enhancing knowledge discovery from cancer genomics data with Galaxy. Gigascience 2018; 6:1-13. [PMID: 28327945 PMCID: PMC5437943 DOI: 10.1093/gigascience/gix015] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/04/2016] [Accepted: 03/06/2017] [Indexed: 01/15/2023]
Abstract
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker.
Affiliation(s)
- Marco A Albuquerque
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Bruno M Grande
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Elie J Ritch
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Prasath Pararajalingam
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Selin Jessa
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Martin Krzywinski
- Canada's Michael Smith Genome Sciences Center, BC Cancer Agency, Vancouver, BC, Canada
- Jasleen K Grewal
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Sohrab P Shah
- Department of Pathology, University of British Columbia, Vancouver, BC, Canada
- Paul C Boutros
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Ryan D Morin
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada; Canada's Michael Smith Genome Sciences Center, BC Cancer Agency, Vancouver, BC, Canada
|
50
|
Radovich M, Pickering CR, Felau I, Ha G, Zhang H, Jo H, Hoadley KA, Anur P, Zhang J, McLellan M, Bowlby R, Matthew T, Danilova L, Hegde AM, Kim J, Leiserson MDM, Sethi G, Lu C, Ryan M, Su X, Cherniack AD, Robertson G, Akbani R, Spellman P, Weinstein JN, Hayes DN, Raphael B, Lichtenberg T, Leraas K, Zenklusen JC, Fujimoto J, Scapulatempo-Neto C, Moreira AL, Hwang D, Huang J, Marino M, Korst R, Giaccone G, Gokmen-Polar Y, Badve S, Rajan A, Ströbel P, Girard N, Tsao MS, Marx A, Tsao AS, Loehrer PJ. The Integrated Genomic Landscape of Thymic Epithelial Tumors. Cancer Cell 2018; 33:244-258.e10. [PMID: 29438696 PMCID: PMC5994906 DOI: 10.1016/j.ccell.2018.01.003] [Citation(s) in RCA: 222] [Impact Index Per Article: 37.0] [Received: 06/26/2017] [Revised: 10/15/2017] [Accepted: 01/09/2018] [Indexed: 12/31/2022]
Abstract
Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most predominant, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we define four subtypes of these tumors defined by genomic hallmarks and an association with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects on multi-platform analysis. We further observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens, and increased aneuploidy.
Affiliation(s)
- Milan Radovich
- Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN 46202, USA
- Ina Felau
- National Cancer Institute, Bethesda, MD 20892, USA
- Gavin Ha
- Broad Institute, Cambridge, MA 02142, USA
- Heejoon Jo
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Katherine A Hoadley
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pavana Anur
- Oregon Health & Science University, Portland, OR 97239, USA
- Jiexin Zhang
- MD Anderson Cancer Center, Houston, TX 77030, USA
- Mike McLellan
- McDonnell Genome Institute at Washington University, St. Louis, MO 63108, USA
- Reanne Bowlby
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Thomas Matthew
- University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Jaegil Kim
- Broad Institute, Cambridge, MA 02142, USA
- Mark D M Leiserson
- Department of Computer Science & Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Geetika Sethi
- Institute for Systems Biology, Seattle, WA 98109, USA
- Charles Lu
- McDonnell Genome Institute at Washington University, St. Louis, MO 63108, USA
- Michael Ryan
- MD Anderson Cancer Center, Houston, TX 77030, USA
- Xiaoping Su
- MD Anderson Cancer Center, Houston, TX 77030, USA
- Gordon Robertson
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC V5Z 4S6, Canada
- Rehan Akbani
- MD Anderson Cancer Center, Houston, TX 77030, USA
- Paul Spellman
- Oregon Health & Science University, Portland, OR 97239, USA
- D Neil Hayes
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Ben Raphael
- Department of Computer Science & Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- David Hwang
- University Health Network, Toronto, ON M5G 2C4, Canada
- James Huang
- Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Mirella Marino
- Department of Pathology, Regina Elena National Cancer Institute, Rome 00144, Italy
- Yesim Gokmen-Polar
- Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN 46202, USA
- Sunil Badve
- Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN 46202, USA
- Arun Rajan
- National Cancer Institute, Bethesda, MD 20892, USA
- Nicolas Girard
- Institute of Oncology, Cardiobiotec, Hospices Civils de Lyon, Lyon 69002, France
- Ming S Tsao
- Princess Margaret Cancer Centre, Toronto, ON M5G 2M9, Canada
- Alexander Marx
- University Medical Centre Mannheim, University of Heidelberg, Mannheim 68167, Germany
- Anne S Tsao
- MD Anderson Cancer Center, Houston, TX 77030, USA
- Patrick J Loehrer
- Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN 46202, USA
|