1
Meirson T, Bomze D, Schueler-Furman O, Stemmer SM, Markel G. Systemic structural analysis of alterations reveals a common structural basis of driver mutations in cancer. NAR Cancer 2023; 5:zcac040. [PMID: 36683915 PMCID: PMC9846427 DOI: 10.1093/narcan/zcac040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 10/17/2022] [Accepted: 12/04/2022] [Indexed: 01/19/2023]
Abstract
A major effort in cancer research is to organize the complexities of the disease into fundamental traits. Despite conceptual progress in the last decades and the synthesis of hallmark features, no organizing principles governing cancer beyond cellular features exist. We analyzed experimentally determined structures harboring the most significant and prevalent driver missense mutations in human cancer, covering 73% (n = 168178) of the tumor samples in the Catalogue of Somatic Mutations in Cancer (COSMIC). The results reveal that a single structural element, the κ-helix (polyproline II helix), lies at the core of driver point mutations, with significant enrichment in all major anatomical sites, suggesting that a small number of molecular traits are shared by most and perhaps all types of cancer. Thus, we uncovered the lowest possible level of organization at which carcinogenesis takes place at the protein level. This framework provides an initial scheme for a mechanistic understanding of tumor development and pinpoints key vulnerabilities.
Affiliation(s)
- Tomer Meirson
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, 49100, Israel
- David Bomze
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, 6997801, Israel
- Ora Schueler-Furman
  - Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, 9112001, Israel
- Salomon M Stemmer
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, 49100, Israel
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, 6997801, Israel
- Gal Markel
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, 49100, Israel
  - Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, Tel-Aviv, 6997801, Israel
2
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022]
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype, but obtaining it is inefficient. Because comprehensive databases and resources presenting clinical and experimental evidence on genotype-phenotype relationships are lacking, and because variants found by NGS keep accumulating, many computational tools that predict the impact of variants on phenotype have been developed to bridge the gap. In this review, we present a brief introduction to and discussion of computational approaches for variant impact prediction. Using a novel organization, we focus mainly on approaches for predicting the impact of non-synonymous variants (nsSNVs) and categorize them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised by comparative studies, are discussed. We also present how the predictive approaches are employed in different research. Although diverse constraints exist, computational predictive approaches are indispensable in exploring genotype-phenotype relationships.
Affiliation(s)
- Ye Liu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- William S. B. Yeung
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Philip C. N. Chiu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Dandan Cao
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
3
Anilkumar Sithara A, Maripuri D, Moorthy K, Amirtha Ganesh S, Philip P, Banerjee S, Sudhakar M, Raman K. iCOMIC: a graphical interface-driven bioinformatics pipeline for analyzing cancer omics data. NAR Genom Bioinform 2022; 4:lqac053. [PMID: 35899080 PMCID: PMC9310080 DOI: 10.1093/nargab/lqac053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Revised: 06/17/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022]
Abstract
Despite the tremendous increase in omics data generated by modern sequencing technologies, their analysis can be tricky and often requires substantial expertise in bioinformatics. To address this concern, we have developed a user-friendly pipeline to analyze (cancer) genomic data that takes raw sequencing data (FASTQ format) as input and outputs insightful statistics. Our iCOMIC toolkit pipeline, which features many independent workflows, is embedded in the popular Snakemake workflow management system. It can analyze whole-genome and transcriptome data and is characterized by a user-friendly GUI that offers several advantages, including minimal execution steps and no need for complex command-line arguments. Notably, we have integrated algorithms developed in-house to predict pathogenicity among cancer-causing mutations and to differentiate between tumor suppressor genes and oncogenes from somatic mutation data. We benchmarked our tool against the Genome in a Bottle benchmark dataset (NA12878) and obtained the highest F1 scores of 0.971 and 0.988 for indels and SNPs, respectively, using the BWA MEM-GATK HC DNA-Seq pipeline. Similarly, we achieved a correlation coefficient of r = 0.85 using the HISAT2-StringTie-ballgown and STAR-StringTie-ballgown RNA-Seq pipelines on the human monocyte dataset (SRP082682). Overall, our tool enables easy analyses of omics datasets, significantly simplifying otherwise complex data analysis pipelines.
Affiliation(s)
- Anjana Anilkumar Sithara
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Devi Priyanka Maripuri
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Keerthika Moorthy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Sai Sruthi Amirtha Ganesh
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Philge Philip
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Shayantan Banerjee
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Malvika Sudhakar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
- Karthik Raman
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai 600036, India
  - Centre for Integrative Biology and Systems mEdicine, IIT Madras, India
  - Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, India
4
Sudhakar M, Rengaswamy R, Raman K. Multi-Omic Data Improve Prediction of Personalized Tumor Suppressors and Oncogenes. Front Genet 2022; 13:854190. [PMID: 35620468 PMCID: PMC9127508 DOI: 10.3389/fgene.2022.854190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Accepted: 04/04/2022] [Indexed: 12/12/2022]
Abstract
The progression of tumorigenesis starts with a few mutational and structural driver events in the cell. Various cohort-based computational tools exist to identify driver genes but require multiple samples to identify less frequently mutated driver genes. Many studies use different methods to distinguish driver mutations/genes from mutations that have no impact on tumor progression; however, a small fraction of patients show no mutational events in any known driver genes. Current unsupervised methods map somatic and expression data onto a network to identify personalized driver genes based on changes in expression. Our method is the first machine learning model to classify genes as tumor suppressor gene (TSG), oncogene (OG), or neutral, thus assigning the functional impact of the gene in the patient. In this study, we develop a multi-omic approach, PIVOT (Personalized Identification of driVer OGs and TSGs), to train on experimentally or computationally validated mutational and structural driver events. Given the lack of any gold standards for the identification of personalized driver genes, we label the data using four strategies and, based on classification metrics, show that gene-based labeling strategies perform best. We build different models using SNV, RNA, and multi-omic features, to be used depending on the data available. Our models trained on multi-omic data improved predictions compared with mutation and expression data, achieving an accuracy ≥0.99 for BRCA, LUAD, and COAD datasets. We show that network and expression-based features contribute the most to PIVOT. Our predictions on BRCA, COAD, and LUAD cancer types reveal commonly altered genes such as TP53 and PIK3CA, which are predicted drivers for multiple cancer types. Along with known driver genes, our models also identify new driver genes such as PRKCA, SOX9, and PSMD4. Our multi-omic model labels both CNV and mutations, with CNV alterations contributing more. While predicting labels for genes mutated in multiple samples, we also label rare driver events occurring in as few as one sample. We also identify genes with dual roles within the same cancer type. Overall, PIVOT labels personalized driver genes as TSGs and OGs and also identifies rare driver genes.
Affiliation(s)
- Malvika Sudhakar
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, India
- Raghunathan Rengaswamy
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Chemical Engineering, IIT Madras, Chennai, India
- Karthik Raman
  - Centre for Integrative Biology and Systems mEdicine (IBSE), Indian Institute of Technology (IIT) Madras, Chennai, India
  - Robert Bosch Center for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai, India
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, India