1. Chillón-Pino D, Badonyi M, Semple CA, Marsh JA. Protein structural context of cancer mutations reveals molecular mechanisms and candidate driver genes. Cell Rep 2024; 43:114905. [PMID: 39441719; DOI: 10.1016/j.celrep.2024.114905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 08/23/2024] [Accepted: 10/08/2024] [Indexed: 10/25/2024] [Open access]
Abstract
Advances in protein structure determination and modeling allow us to study the structural context of human genetic variants on an unprecedented scale. Here, we analyze millions of cancer-associated missense mutations based on their structural locations and predicted perturbative effects. By considering the collective properties of mutations at the level of individual proteins, we identify distinct patterns associated with tumor suppressors and oncogenes. Tumor suppressors are enriched in structurally damaging mutations, consistent with loss-of-function mechanisms, while oncogene mutations tend to be structurally mild, reflecting selection for gain-of-function driver mutations and against loss-of-function mutations. Although oncogenes are difficult to distinguish from genes with no role in cancer using only structural damage, we find that the three-dimensional clustering of mutations is highly predictive. These observations allow us to identify candidate driver genes and speculate about their molecular roles, which we expect will have general utility in the analysis of cancer sequencing data.
Affiliation(s)
- Diego Chillón-Pino
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Mihaly Badonyi
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Colin A Semple
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

2. Dörig C, Marulli C, Peskett T, Volkmar N, Pantolini L, Studer G, Paleari C, Frommelt F, Schwede T, de Souza N, Barral Y, Picotti P. Global profiling of protein complex dynamics with an experimental library of protein interaction markers. Nat Biotechnol 2024:10.1038/s41587-024-02432-8. [PMID: 39415059; DOI: 10.1038/s41587-024-02432-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 09/16/2024] [Indexed: 10/18/2024]
Abstract
Methods to systematically monitor protein complex dynamics are needed. We introduce serial ultrafiltration combined with limited proteolysis-coupled mass spectrometry (FLiP-MS), a structural proteomics workflow that generates a library of peptide markers specific to changes in protein-protein interactions (PPIs) by probing differences in protease susceptibility between complex-bound and monomeric forms of proteins. The library includes markers mapping to protein-binding interfaces and markers reporting on structural changes that accompany PPI changes. Integrating the marker library with LiP-MS data allows for global profiling of PPIs from unfractionated lysates. We apply FLiP-MS to Saccharomyces cerevisiae and probe changes in protein complex dynamics after DNA replication stress, identifying links between Spt-Ada-Gcn5 acetyltransferase activity and the assembly state of several complexes. FLiP-MS enables protein complex dynamics to be probed upon any perturbation, proteome-wide, at high throughput, and with peptide-level structural resolution, while also informing on the occupancy of binding interfaces, thus providing both global and molecular views of a system under study.
Affiliation(s)
- Christian Dörig
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Cathy Marulli
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Thomas Peskett
  - Institute of Biochemistry, Department of Biology, ETH Zurich, Zurich, Switzerland
- Norbert Volkmar
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Lorenzo Pantolini
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gabriel Studer
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Camilla Paleari
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Fabian Frommelt
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Natalie de Souza
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Yves Barral
  - Institute of Biochemistry, Department of Biology, ETH Zurich, Zurich, Switzerland
- Paola Picotti
  - Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland

3. Newaz K, Schaefers C, Weisel K, Baumbach J, Frishman D. Prognostic importance of splicing-triggered aberrations of protein complex interfaces in cancer. NAR Genom Bioinform 2024; 6:lqae133. [PMID: 39328266; PMCID: PMC11426328; DOI: 10.1093/nargab/lqae133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2024] [Revised: 08/30/2024] [Accepted: 09/13/2024] [Indexed: 09/28/2024] [Open access]
Abstract
Aberrant alternative splicing (AS) is a prominent hallmark of cancer. AS can perturb protein-protein interactions (PPIs) by adding or removing interface regions encoded by individual exons. Identifying prognostic exon-exon interactions (EEIs) from PPI interfaces can help discover AS-affected cancer-driving PPIs that can serve as potential drug targets. Here, we assessed the prognostic significance of EEIs across 15 cancer types by integrating RNA-seq data with three-dimensional (3D) structures of protein complexes. By analyzing the resulting EEI network we identified patient-specific perturbed EEIs (i.e., EEIs present in healthy samples but absent from the paired cancer samples or vice versa) that were significantly associated with survival. We provide the first evidence that EEIs can be used as prognostic biomarkers for cancer patient survival. Our findings provide mechanistic insights into AS-affected PPI interfaces. Given the ongoing expansion of available RNA-seq data and the number of 3D structurally-resolved (or confidently predicted) protein complexes, our computational framework will help accelerate the discovery of clinically important cancer-promoting AS events.
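The notion of a patient-specific perturbed EEI described above (present in the healthy sample but absent from the paired cancer sample, or vice versa) amounts to a set difference in each direction. The following is a minimal sketch under that reading only; the function name, the exon identifiers, and the representation of an EEI as an unordered pair are illustrative assumptions, not the authors' framework.

```python
def perturbed_eeis(healthy, tumor):
    """Return (lost, gained) EEIs for one patient.

    Each EEI is treated as an unordered pair of exon identifiers,
    so ("a", "b") and ("b", "a") count as the same interaction.
    """
    healthy_set = {frozenset(pair) for pair in healthy}
    tumor_set = {frozenset(pair) for pair in tumor}
    lost = healthy_set - tumor_set    # present in healthy, absent in tumor
    gained = tumor_set - healthy_set  # absent in healthy, present in tumor
    return lost, gained


# Hypothetical exon identifiers, for illustration only.
healthy_sample = [("exonA1", "exonB3"), ("exonA2", "exonC1")]
tumor_sample = [("exonB3", "exonA1"), ("exonD4", "exonC1")]
lost, gained = perturbed_eeis(healthy_sample, tumor_sample)
```

Here the first interaction is shared by both samples despite the reversed order, so only the second healthy pair is reported as lost and the new tumor pair as gained.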
Affiliation(s)
- Khalique Newaz
  - Institute for Computational Systems Biology and Center for Data and Computing in Natural Sciences, Universität Hamburg, 22761 Hamburg, Germany
- Christoph Schaefers
  - Department of Oncology, Hematology and Bone Marrow Transplantation with Division of Pneumology, Universitätsklinikum Hamburg-Eppendorf, 20251 Hamburg, Germany
- Katja Weisel
  - Department of Oncology, Hematology and Bone Marrow Transplantation with Division of Pneumology, Universitätsklinikum Hamburg-Eppendorf, 20251 Hamburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology and Center for Data and Computing in Natural Sciences, Universität Hamburg, 22761 Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Dmitrij Frishman
  - Department of Bioinformatics, School of Life Sciences, Technical University of Munich, 85354 Freising, Germany

4. Zhang Y, Leung AK, Kang JJ, Sun Y, Wu G, Li L, Sun J, Cheng L, Qiu T, Zhang J, Wierbowski S, Gupta S, Booth J, Yu H. A multiscale functional map of somatic mutations in cancer integrating protein structure and network topology. bioRxiv 2024:2023.03.06.531441. [PMID: 36945530; PMCID: PMC10028849; DOI: 10.1101/2023.03.06.531441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Abstract
A major goal of cancer biology is to understand the mechanisms underlying tumorigenesis driven by somatically acquired mutations. Two distinct types of computational methodologies have emerged: one focuses on analyzing clustering of mutations within protein sequences and 3D structures, while the other characterizes mutations by leveraging the topology of the protein-protein interaction network. Their insights are largely non-overlapping, offering complementary strengths. Here, we established a unified, end-to-end 3D structurally-informed protein interaction network propagation framework, NetFlow3D, that systematically maps the multiscale mechanistic effects of somatic mutations in cancer. The establishment of NetFlow3D hinges upon the Human Protein Structurome, a comprehensive repository we compiled that incorporates the 3D structures of every single protein as well as the binding interfaces of all known protein interactions in humans. NetFlow3D leverages the Structurome to integrate information across atomic, residue, protein and network levels. It conducts 3D clustering of mutations across atomic and residue levels on protein structures to identify potential driver mutations. It then anisotropically propagates their impacts across the protein interaction network, with propagation guided by the specific 3D structural interfaces involved, to identify significantly interconnected network "modules", thereby uncovering key biological processes underlying disease etiology. Applied to 1,038,899 somatic protein-altering mutations in 9,946 TCGA tumors across 33 cancer types, NetFlow3D identified 1,4444 significant 3D clusters throughout the Human Protein Structurome, of which ~55% would not have been found if using only experimentally-determined structures. It then identified 26 significantly interconnected modules that encompass ~8-fold more proteins than applying standard network analyses. NetFlow3D and our pan-cancer results can be accessed from http://netflow3d.yulab.org.
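To make the idea of propagating mutation impact across an interaction network concrete, here is a minimal sketch of a generic random walk with restart on an undirected graph. This is a swapped-in, isotropic technique chosen for illustration; it is not NetFlow3D's interface-guided anisotropic propagation, and the function name, the restart value, and the toy three-node network are all assumptions.

```python
def propagate(adjacency, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Random walk with restart on a graph given as a square adjacency matrix.

    `seeds` holds the initial per-node signal (e.g. a mutation-derived score).
    Returns the steady-state score of each node.
    """
    n = len(adjacency)
    # Column sums: total edge weight leaving each node.
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    total = sum(seeds)
    p0 = [s / total for s in seeds]  # normalized restart distribution
    p = p0[:]
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            # Signal flowing into node i from its neighbors.
            inflow = sum(
                adjacency[i][j] * p[j] / degree[j]
                for j in range(n)
                if degree[j] > 0
            )
            p_next.append((1 - restart) * inflow + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Toy path network 0 - 1 - 2 with all signal seeded on node 0.
toy_network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = propagate(toy_network, seeds=[1.0, 0.0, 0.0])
```

In this toy case the score decays with distance from the seeded node, which is the behavior that lets propagation highlight network neighborhoods around mutated proteins.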
Affiliation(s)
- Yingying Zhang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
  - Department of Molecular Biology and Genetics, Cornell University; Ithaca, 14853, USA
- Alden K. Leung
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Jin Joo Kang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Yu Sun
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Guanxi Wu
  - College of Agriculture and Life Sciences, Cornell University; Ithaca, 14853, USA
- Le Li
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Jiayang Sun
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
- Lily Cheng
  - Department of Science and Technology Studies, Cornell University; Ithaca, 14853, USA
- Tian Qiu
  - School of Electrical and Computer Engineering, Cornell University; Ithaca, 14853, USA
- Junke Zhang
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Shayne Wierbowski
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- Shagun Gupta
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA
- James Booth
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Department of Statistics and Data Science, Cornell University; Ithaca, 14853, USA
- Haiyuan Yu
  - Department of Computational Biology, Cornell University; Ithaca, 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University; Ithaca, 14853, USA

5. Ozturk K, Panwala R, Sheen J, Ford K, Jayne N, Portell A, Zhang DE, Hutter S, Haferlach T, Ideker T, Mali P, Carter H. Interface-guided phenotyping of coding variants in the transcription factor RUNX1. Cell Rep 2024; 43:114436. [PMID: 38968069; PMCID: PMC11345852; DOI: 10.1016/j.celrep.2024.114436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 05/15/2024] [Accepted: 06/19/2024] [Indexed: 07/07/2024] [Open access]
Abstract
Single-gene missense mutations remain challenging to interpret. Here, we deploy scalable functional screening by sequencing (SEUSS), a Perturb-seq method, to generate mutations at protein interfaces of RUNX1 and quantify their effect on activities of downstream cellular programs. We evaluate single-cell RNA profiles of 115 mutations in myelogenous leukemia cells and categorize them into three functionally distinct groups, wild-type (WT)-like, loss-of-function (LoF)-like, and hypomorphic, that we validate in orthogonal assays. LoF-like variants dominate the DNA-binding site and are recurrent in cancer; however, recurrence alone does not predict functional impact. Hypomorphic variants share characteristics with LoF-like but favor protein interactions, promoting gene expression indicative of nerve growth factor (NGF) response and cytokine recruitment of neutrophils. Accessible DNA near differentially expressed genes frequently contains RUNX1-binding motifs. Finally, we reclassify 16 variants of uncertain significance and train a classifier to predict 103 more. Our work demonstrates the potential of targeting protein interactions to better define the landscape of phenotypes reachable by missense mutations.
Affiliation(s)
- Kivilcim Ozturk
  - Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Rebecca Panwala
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Jeanna Sheen
  - School of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Kyle Ford
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Nathan Jayne
  - School of Biological Sciences, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Andrew Portell
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Dong-Er Zhang
  - Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Stephan Hutter
  - MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany
- Torsten Haferlach
  - MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany
- Trey Ideker
  - Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA
- Prashant Mali
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Hannah Carter
  - Division of Medical Genetics, Department of Medicine, University of California, San Diego, La Jolla, CA, USA; Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA; Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA

6. Deng C, Li HD, Zhang LS, Liu Y, Li Y, Wang J. Identifying new cancer genes based on the integration of annotated gene sets via hypergraph neural networks. Bioinformatics 2024; 40:i511-i520. [PMID: 38940121; PMCID: PMC11211849; DOI: 10.1093/bioinformatics/btae257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] [Open access]
Abstract
MOTIVATION: Identifying cancer genes remains a significant challenge in cancer genomics research. Annotated gene sets encode functional associations among multiple genes, and cancer genes have been shown to cluster in hallmark signaling pathways and biological processes. The knowledge of annotated gene sets is critical for discovering cancer genes but remains to be fully exploited.

RESULTS: Here, we present the DIsease-Specific Hypergraph neural network (DISHyper), a hypergraph-based computational method that integrates the knowledge from multiple types of annotated gene sets to predict cancer genes. First, our benchmark results demonstrate that DISHyper outperforms the existing state-of-the-art methods and highlight the advantages of employing hypergraphs for representing annotated gene sets. Second, we validate the accuracy of DISHyper-predicted cancer genes using functional validation results and multiple independent functional genomics data. Third, our model predicts 44 novel cancer genes, and subsequent analysis shows their significant associations with multiple types of cancers. Overall, our study provides a new perspective for discovering cancer genes and reveals previously undiscovered cancer genes.

AVAILABILITY AND IMPLEMENTATION: DISHyper is freely available for download at https://github.com/genemine/DISHyper.
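Representing annotated gene sets as a hypergraph means each gene set becomes one hyperedge that can connect any number of genes at once. A minimal sketch of the usual incidence-matrix encoding follows; it is not taken from the DISHyper code base, and the function name, gene labels, and set names are placeholders invented for illustration.

```python
def incidence_matrix(gene_sets):
    """Build a gene-by-hyperedge incidence matrix from named gene sets.

    Rows are genes, columns are gene sets; entry [g][s] is 1 when
    gene g belongs to gene set s and 0 otherwise.
    """
    genes = sorted({gene for members in gene_sets.values() for gene in members})
    set_names = sorted(gene_sets)
    matrix = [
        [1 if gene in gene_sets[name] else 0 for name in set_names]
        for gene in genes
    ]
    return genes, set_names, matrix


# Placeholder gene sets; the membership is made up for this example.
toy_sets = {
    "set_A": {"gene1", "gene2"},
    "set_B": {"gene1", "gene3", "gene4"},
}
genes, set_names, H = incidence_matrix(toy_sets)
```

Unlike an ordinary graph, where "set_B" would have to be expanded into three pairwise edges, the single column for "set_B" keeps the three-gene grouping intact.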
Affiliation(s)
- Chao Deng
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Li-Shen Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Yiwei Liu
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China
- Yaohang Li
  - Department of Computer Science, Old Dominion University, Norfolk, VA 23529-0001, United States
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China

7. Nourbakhsh M, Degn K, Saksager A, Tiberti M, Papaleo E. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338; PMCID: PMC10805075; DOI: 10.1093/bib/bbad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] [Open access]
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks.
Affiliation(s)
- Mona Nourbakhsh
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Kristine Degn
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Astrid Saksager
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Matteo Tiberti
  - Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
- Elena Papaleo
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
  - Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark

8. Pei J, Zhang J, Cong Q. Computational analysis of protein-protein interactions of cancer drivers in renal cell carcinoma. FEBS Open Bio 2024; 14:112-126. [PMID: 37964489; PMCID: PMC10761929; DOI: 10.1002/2211-5463.13732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 10/30/2023] [Accepted: 11/06/2023] [Indexed: 11/16/2023] [Open access]
Abstract
Renal cell carcinoma (RCC) is the most common type of kidney cancer with rising cases in recent years. Extensive research has identified various cancer driver proteins associated with different subtypes of RCC. Most RCC drivers are encoded by tumor suppressor genes and exhibit enrichment in functional categories such as protein degradation, chromatin remodeling, and transcription. To further our understanding of RCC, we utilized powerful deep-learning methods based on AlphaFold to predict protein-protein interactions (PPIs) involving RCC drivers. We predicted high-confidence complexes formed by various RCC drivers, including TCEB1, KMT2C/D and KDM6A of the COMPASS-related complexes, TSC1 of the MTOR pathway, and TRRAP. These predictions provide valuable structural insights into the interaction interfaces, some of which are promising targets for cancer drug design, such as the NRF2-MAFK interface. Cancer somatic missense mutations from large datasets of genome sequencing of RCCs were mapped to the interfaces of predicted and experimental structures of PPIs involving RCC drivers, and their effects on the binding affinity were evaluated. We observed more than 100 cancer somatic mutations affecting the binding affinity of complexes formed by key RCC drivers such as VHL and TCEB1. These findings emphasize the importance of these mutations in RCC pathogenesis and potentially offer new avenues for targeted therapies.
Affiliation(s)
- Jimin Pei
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Jing Zhang
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Qian Cong
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA

9. Zhang M, Lang X, Chen X, Lv Y. Prospective Identification of Prognostic Hot-Spot Mutant Gene Signatures for Leukemia: A Computational Study Based on Integrative Analysis of TCGA and cBioPortal Data. Mol Biotechnol 2023; 65:1898-1912. [PMID: 36879146; DOI: 10.1007/s12033-023-00704-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 02/14/2023] [Indexed: 03/08/2023]
Abstract
The growing volume of bioinformatics data on leukemias prompted us to explore hot-spot mutation profiles and investigate the implications of those hot-spot mutations for patient survival. We retrieved somatic mutations and their distribution in protein domains through data analysis of The Cancer Genome Atlas and cBioPortal databases. After determining differentially expressed mutant genes related to leukemia, we further conducted principal component analysis and single-factor Cox regression analyses. Moreover, survival analysis was performed for the obtained candidate genes, followed by a multi-factor Cox proportional hazards model to assess the impact of the candidate genes on the survival and prognosis of patients with leukemia. Finally, the signaling pathways involved in leukemia were investigated by gene set enrichment analysis. We identified 223 leukemia-related somatic missense mutation hot-spots, distributed across 41 genes. Differential expression in leukemia was observed for 39 genes. We found a close correlation between seven genes and the prognosis of leukemia patients, among which three genes could significantly influence the survival rate. Among these three genes, CD74 and P2RY8 were highlighted because of their close association with the survival of leukemia patients. The data also suggested that B cell receptor, Hedgehog, and TGF-beta signaling pathways were enriched in low-hazard patients. In conclusion, these data underline the involvement of hot-spot mutations of the CD74 and P2RY8 genes in the survival status of leukemia patients, highlighting their potential as novel therapeutic targets or prognostic indicators for leukemia patients.

Summary of Graphical Abstract: We identified 223 leukemia-associated somatic missense mutation hotspots concentrated in 41 different genes from 2297 leukemia patients in the TCGA database. Differential analysis of leukemic and normal samples from the TCGA and GTEx databases revealed that 39 of these 41 genes showed significant differential expression in leukemia. These 39 genes were subjected to PCA, univariate Cox analysis, survival analysis, multivariate Cox regression analysis, and GSEA pathway enrichment analysis, and their associations with leukemia survival prognosis and related pathways were then investigated.
Affiliation(s)
- Min Zhang
  - Department of Hematology, The First People's Hospital of Yongkang, Affiliated to Hangzhou Medical College, No. 599, Jinshan West Road, Yongkang, Jinhua City, Zhejiang Province, 321300, People's Republic of China
- Xianghua Lang
  - Department of Hematology, The First People's Hospital of Yongkang, Affiliated to Hangzhou Medical College, No. 599, Jinshan West Road, Yongkang, Jinhua City, Zhejiang Province, 321300, People's Republic of China
- Xinyi Chen
  - Department of Hematology, The First People's Hospital of Yongkang, Affiliated to Hangzhou Medical College, No. 599, Jinshan West Road, Yongkang, Jinhua City, Zhejiang Province, 321300, People's Republic of China
- Yuke Lv
  - Department of Hematology, The First People's Hospital of Yongkang, Affiliated to Hangzhou Medical College, No. 599, Jinshan West Road, Yongkang, Jinhua City, Zhejiang Province, 321300, People's Republic of China

10. Yue Y, Li S, Wang L, Liu H, Tong HHY, He S. MpbPPI: a multi-task pre-training-based equivariant approach for the prediction of the effect of amino acid mutations on protein-protein interactions. Brief Bioinform 2023; 24:bbad310. [PMID: 37651610; PMCID: PMC10516393; DOI: 10.1093/bib/bbad310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 07/12/2023] [Accepted: 08/04/2023] [Indexed: 09/02/2023] [Open access]
Abstract
The accurate prediction of the effect of amino acid mutations for protein-protein interactions (PPI $\Delta \Delta G$) is a crucial task in protein engineering, as it provides insight into the relevant biological processes underpinning protein binding and provides a basis for further drug discovery. In this study, we propose MpbPPI, a novel multi-task pre-training-based geometric equivariance-preserving framework to predict PPI $\Delta \Delta G$. Pre-training on a strictly screened pre-training dataset is employed to address the scarcity of protein-protein complex structures annotated with PPI $\Delta \Delta G$ values. MpbPPI employs a multi-task pre-training technique, forcing the framework to learn comprehensive backbone and side chain geometric regulations of protein-protein complexes at different scales. After pre-training, MpbPPI can generate high-quality representations capturing the effective geometric characteristics of labeled protein-protein complexes for downstream $\Delta \Delta G$ predictions. MpbPPI serves as a scalable framework supporting different sources of mutant-type (MT) protein-protein complexes for flexible application. Experimental results on four benchmark datasets demonstrate that MpbPPI is a state-of-the-art framework for PPI $\Delta \Delta G$ predictions. The data and source code are available at https://github.com/arantir123/MpbPPI.
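The abstract uses $\Delta \Delta G$ without stating its definition. Under one common convention, assumed here rather than taken from the paper, it is the change in binding free energy of the complex caused by the mutation, where MT and WT denote the mutant-type and wild-type complexes:

```latex
\Delta \Delta G = \Delta G_{\mathrm{bind}}^{\mathrm{MT}} - \Delta G_{\mathrm{bind}}^{\mathrm{WT}}
```

With this sign convention a positive value means the mutation weakens binding, since a more negative binding free energy corresponds to a tighter complex. Some datasets use the opposite sign, so the convention should be checked before comparing values across sources.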
Affiliation(s)
- Yang Yue
- School of Computer Science from the University of Birmingham, UK
| | - Shu Li
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Lingling Wang
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Huanxiang Liu
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Henry H Y Tong
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Shan He
- School of Computer Science, the University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
11
Li Y, Porta-Pardo E, Tokheim C, Bailey MH, Yaron TM, Stathias V, Geffen Y, Imbach KJ, Cao S, Anand S, Akiyama Y, Liu W, Wyczalkowski MA, Song Y, Storrs EP, Wendl MC, Zhang W, Sibai M, Ruiz-Serra V, Liang WW, Terekhanova NV, Rodrigues FM, Clauser KR, Heiman DI, Zhang Q, Aguet F, Calinawan AP, Dhanasekaran SM, Birger C, Satpathy S, Zhou DC, Wang LB, Baral J, Johnson JL, Huntsman EM, Pugliese P, Colaprico A, Iavarone A, Chheda MG, Ricketts CJ, Fenyö D, Payne SH, Rodriguez H, Robles AI, Gillette MA, Kumar-Sinha C, Lazar AJ, Cantley LC, Getz G, Ding L. Pan-cancer proteogenomics connects oncogenic drivers to functional states. Cell 2023; 186:3921-3944.e25. [PMID: 37582357 DOI: 10.1016/j.cell.2023.07.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/05/2022] [Revised: 12/30/2022] [Accepted: 07/10/2023] [Indexed: 08/17/2023]
Abstract
Cancer driver events refer to key genetic aberrations that drive oncogenesis; however, their exact molecular mechanisms remain insufficiently understood. Here, our multi-omics pan-cancer analysis uncovers insights into the impacts of cancer drivers by identifying their significant cis-effects and distal trans-effects quantified at the RNA, protein, and phosphoprotein levels. Salient observations include the association of point mutations and copy-number alterations with the rewiring of protein interaction networks, and notably, most cancer genes converge toward similar molecular states denoted by sequence-based kinase activity profiles. A correlation between predicted neoantigen burden and measured T cell infiltration suggests potential vulnerabilities for immunotherapies. Patterns of cancer hallmarks vary by polygenic protein abundance ranging from uniform to heterogeneous. Overall, our work demonstrates the value of comprehensive proteogenomics in understanding the functional states of oncogenic drivers and their links to cancer development, surpassing the limitations of studying individual cancer types.
Affiliation(s)
- Yize Li
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Eduard Porta-Pardo
- Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain; Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
| | - Collin Tokheim
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Matthew H Bailey
- Department of Biology and Simmons Center for Cancer Research, Brigham Young University, Provo, UT 84602, USA
| | - Tomer M Yaron
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10021, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Englander Institute for Precision Medicine, Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Vasileios Stathias
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA; Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Yifat Geffen
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA 02115, USA
| | - Kathleen J Imbach
- Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain; Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
| | - Song Cao
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Shankara Anand
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Yo Akiyama
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Wenke Liu
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Matthew A Wyczalkowski
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Yizhe Song
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Erik P Storrs
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Michael C Wendl
- McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Mathematics, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Wubing Zhang
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
| | - Mustafa Sibai
- Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain; Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
| | - Victoria Ruiz-Serra
- Josep Carreras Leukaemia Research Institute (IJC), Badalona 08916, Spain; Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
| | - Wen-Wei Liang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Nadezhda V Terekhanova
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Fernanda Martins Rodrigues
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Karl R Clauser
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - David I Heiman
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Qing Zhang
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Francois Aguet
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Anna P Calinawan
- Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Saravana M Dhanasekaran
- Michigan Center for Translational Pathology, Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Chet Birger
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Shankha Satpathy
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA
| | - Daniel Cui Zhou
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Liang-Bo Wang
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Jessika Baral
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA
| | - Jared L Johnson
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10021, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Emily M Huntsman
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10021, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Pietro Pugliese
- Department of Science and Technology, University of Sannio, 82100 Benevento, Italy
| | - Antonio Colaprico
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA; Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Antonio Iavarone
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA; Department of Neurological Surgery, Department of Biochemistry and Molecular Biology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Milan G Chheda
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Neurology, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Christopher J Ricketts
- Urologic Oncology Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - David Fenyö
- Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY 10016, USA; Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY 10016, USA
| | - Samuel H Payne
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Henry Rodriguez
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA
| | - Ana I Robles
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, Rockville, MD 20850, USA
| | - Michael A Gillette
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Harvard Medical School, Boston, MA 02115, USA
| | - Chandan Kumar-Sinha
- Michigan Center for Translational Pathology, Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Alexander J Lazar
- Departments of Pathology & Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Lewis C Cantley
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10021, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Gad Getz
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02142, USA; Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
| | - Li Ding
- Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University in St. Louis, St. Louis, MO 63108, USA; Department of Genetics, Washington University in St. Louis, St. Louis, MO 63130, USA; Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO 63130, USA
12
Ozturk K, Panwala R, Sheen J, Ford K, Payne N, Zhang DE, Hutter S, Haferlach T, Ideker T, Mali P, Carter H. Interface-guided phenotyping of coding variants in the transcription factor RUNX1 with SEUSS. bioRxiv 2023:2023.08.03.551876. [PMID: 37577681 PMCID: PMC10418284 DOI: 10.1101/2023.08.03.551876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Understanding the consequences of single amino acid substitutions in cancer driver genes remains an unmet need. Perturb-seq provides a tool to investigate the effects of individual mutations on cellular programs. Here we deploy SEUSS, a Perturb-seq like approach, to generate and assay mutations at physical interfaces of the RUNX1 Runt domain. We measured the impact of 115 mutations on RNA profiles in single myelogenous leukemia cells and used the profiles to categorize mutations into three functionally distinct groups: wild-type (WT)-like, loss-of-function (LOF)-like and hypomorphic. Notably, the largest concentration of functional mutations (non-WT-like) clustered at the DNA binding site and contained many of the more frequently observed mutations in human cancers. Hypomorphic variants shared characteristics with loss of function variants but had gene expression profiles indicative of response to neural growth factor and cytokine recruitment of neutrophils. Additionally, DNA accessibility changes upon perturbations were enriched for RUNX1 binding motifs, particularly near differentially expressed genes. Overall, our work demonstrates the potential of targeting protein interaction interfaces to better define the landscape of prospective phenotypes reachable by amino acid substitutions.
13
Towards a structurally resolved human protein interaction network. Nat Struct Mol Biol 2023; 30:216-225. [PMID: 36690744 PMCID: PMC9935395 DOI: 10.1038/s41594-022-00910-8] [Citation(s) in RCA: 70] [Impact Index Per Article: 70.0] [Received: 02/11/2022] [Accepted: 12/14/2022] [Indexed: 01/25/2023]
Abstract
Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.
14
Kim D, Ha D, Lee K, Lee H, Kim I, Kim S. An evolution-based machine learning to identify cancer type-specific driver mutations. Brief Bioinform 2023; 24:6961611. [PMID: 36575568 DOI: 10.1093/bib/bbac593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 11/18/2022] [Accepted: 12/03/2022] [Indexed: 12/29/2022] Open
Abstract
Identifying cancer type-specific driver mutations is crucial for illuminating distinct pathologic mechanisms across various tumors and providing opportunities for patient-specific treatment. However, although many computational methods have been developed to predict driver mutations in a type-specific manner, the methods still have room to improve. Here, we devise a novel feature based on sequence co-evolution analysis to identify cancer type-specific driver mutations and construct a machine learning (ML) model with state-of-the-art performance. Specifically, relying on 28 000 tumor samples across 66 cancer types, our ML framework outperformed current leading methods of detecting cancer driver mutations. Interestingly, the cancer mutations identified by the sequence co-evolution feature are frequently observed in interfaces mediating tissue-specific protein-protein interactions that are known to be associated with shaping tissue-specific oncogenesis. Moreover, we provide pre-calculated potential oncogenicity on available human proteins with prediction scores of all possible residue alterations through a user-friendly website (http://sbi.postech.ac.kr/w/cancerCE). This work will facilitate the identification of cancer type-specific driver mutations in newly sequenced tumor samples.
Affiliation(s)
- Inhae Kim
- ImmunoBiome Inc., Pohang, South Korea
| | - Sanguk Kim
- Department of Life Sciences; Artificial Intelligence Graduate Program, Pohang University of Science and Technology, Pohang 790-784, South Korea; Institute of Convergence Research and Education in Advanced Technology, Yonsei University, Seoul 120-149, South Korea
15
Zhang J, Pei J, Durham J, Bos T, Cong Q. Computed cancer interactome explains the effects of somatic mutations in cancers. Protein Sci 2022; 31:e4479. [PMID: 36261849 PMCID: PMC9667826 DOI: 10.1002/pro.4479] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/14/2022] [Revised: 09/28/2022] [Accepted: 10/13/2022] [Indexed: 12/13/2022]
Abstract
Protein-protein interactions (PPIs) are involved in almost all essential cellular processes. Perturbation of PPI networks plays critical roles in tumorigenesis, cancer progression, and metastasis. While numerous high-throughput experiments have produced a vast amount of data for PPIs, these data sets suffer from high false positive rates and exhibit a high degree of discrepancy. Coevolution of amino acid positions between protein pairs has proven to be useful in identifying interacting proteins and providing structural details of the interaction interfaces with the help of deep learning methods like AlphaFold (AF). In this study, we applied AF to investigate the cancer protein-protein interactome. We predicted 1,798 PPIs for cancer driver proteins involved in diverse cellular processes such as transcription regulation, signal transduction, DNA repair, and cell cycle. We modeled the spatial structures for the predicted binary protein complexes, 1,087 of which lacked previous 3D structure information. Our predictions offer novel structural insight into many cancer-related processes such as the MAP kinase cascade and Fanconi anemia pathway. We further investigated the cancer mutation landscape by mapping somatic missense mutations (SMMs) in cancer to the predicted PPI interfaces and performing enrichment and depletion analyses. Interfaces enriched or depleted with SMMs exhibit different preferences for functional categories. Interfaces enriched in mutations tend to function in pathways that are deregulated in cancers and they may help explain the molecular mechanisms of cancers in patients; interfaces lacking mutations appear to be essential for the survival of cancer cells and thus may be future targets for PPI modulating drugs.
Affiliation(s)
- Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Jimin Pei
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Tasia Bos
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
16
Ozturk K, Carter H. Predicting functional consequences of mutations using molecular interaction network features. Hum Genet 2022; 141:1195-1210. [PMID: 34432150 PMCID: PMC8873243 DOI: 10.1007/s00439-021-02329-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/20/2021] [Accepted: 07/31/2021] [Indexed: 12/13/2022]
Abstract
Variant interpretation remains a central challenge for precision medicine. Missense variants are particularly difficult to understand as they change only a single amino acid in a protein sequence yet can have large and varied effects on protein activity. Numerous tools have been developed to identify missense variants with putative disease consequences from protein sequence and structure. However, biological function arises through higher order interactions among proteins and molecules within cells. We therefore sought to capture information about the potential of missense mutations to perturb protein interaction networks by integrating protein structure and interaction data. We developed 16 network-based annotations for missense mutations that provide orthogonal information to features classically used to prioritize variants. We then evaluated them in the context of a proven machine-learning framework for variant effect prediction across multiple benchmark datasets to demonstrate their potential to improve variant classification. Interestingly, network features resulted in larger performance gains for classifying somatic mutations than for germline variants, possibly due to different constraints on what mutations are tolerated at the cellular versus organismal level. Our results suggest that modeling variant potential to perturb context-specific interactome networks is a fruitful strategy to advance in silico variant effect prediction.
Affiliation(s)
- Kivilcim Ozturk
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
| | - Hannah Carter
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Moores Cancer Center, University of California San Diego, La Jolla, CA, USA
17
Comprehensive profiling of 1015 patients' exomes reveals genomic-clinical associations in colorectal cancer. Nat Commun 2022; 13:2342. [PMID: 35487942 PMCID: PMC9055073 DOI: 10.1038/s41467-022-30062-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 04/27/2021] [Accepted: 04/14/2022] [Indexed: 01/12/2023] Open
Abstract
The genetic basis of colorectal cancer (CRC) and its clinical associations remain poorly understood due to limited samples or targeted genes in current studies. Here, we perform ultradeep whole-exome sequencing on 1015 patients with CRC as part of the ChangKang Project. We identify 46 high-confidence significantly mutated genes, 8 of which mutate in 14.9% of patients: LYST, DAPK1, CR2, KIF16B, NPIPB15, SYTL2, ZNF91, and KIAA0586. With an unsupervised clustering algorithm, we propose a subtyping strategy that classifies CRC patients into four genomic subtypes with distinct clinical characteristics, including hypermutated, chromosome instability with high risk, chromosome instability with low risk, and genome stability. Analysis of immunogenicity uncovers the association of immunogenicity reduction with genomic subtypes and poor prognosis in CRC. Moreover, we find that mitochondrial DNA copy number is an independent factor for predicting the survival outcome of CRCs. Overall, our results provide CRC-related molecular features for clinical practice and a valuable resource for translational research.
18
The properties of human disease mutations at protein interfaces. PLoS Comput Biol 2022; 18:e1009858. [PMID: 35120134 PMCID: PMC8849535 DOI: 10.1371/journal.pcbi.1009858] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/23/2021] [Revised: 02/16/2022] [Accepted: 01/24/2022] [Indexed: 12/27/2022] Open
Abstract
The assembly of proteins into complexes and their interactions with other biomolecules are often vital for their biological function. While it is known that mutations at protein interfaces have a high potential to be damaging and cause human genetic disease, there has been relatively little consideration for how this varies between different types of interfaces. Here we investigate the properties of human pathogenic and putatively benign missense variants at homomeric (isologous and heterologous), heteromeric, DNA, RNA and other ligand interfaces, and at different regions in proteins with respect to those interfaces. We find that different types of interfaces vary greatly in their propensity to be associated with pathogenic mutations, with homomeric heterologous and DNA interfaces being particularly enriched in disease. We also find that residues that do not directly participate in an interface, but are close in three-dimensional space, show a significant disease enrichment. Finally, we observe that mutations at different types of interfaces tend to have distinct property changes when undergoing amino acid substitutions associated with disease, and that this is linked to substantial variability in their identification by computational variant effect predictors.

Nearly all proteins interact with other molecules as part of their biological function. For example, proteins can interact with other copies of the same type of protein, with different proteins, with DNA, or with small ligand molecules. Many mutations at protein interfaces, the regions of proteins that interact with other molecules, are known to cause human genetic disease. In this study, we first investigate how different types of protein interfaces have different tendencies to be associated with disease. We also show that the closer a mutation is to an interface, the more likely it is to cause disease. Finally, we study how mutations at different types of interfaces tend to be associated with different changes in amino acid properties, which appears to influence our ability to computationally predict the effects of mutations. Ultimately, we hope that consideration of protein interface properties will eventually improve our ability to identify new disease-causing mutations.
19
Riera-Mestre A, Cerdà P, Iriarte A, Graupera M, Viñals F. Translational medicine in hereditary hemorrhagic telangiectasia. Eur J Intern Med 2022; 95:32-37. [PMID: 34538686 DOI: 10.1016/j.ejim.2021.09.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/25/2021] [Accepted: 09/04/2021] [Indexed: 12/18/2022]
Abstract
The scientific community has gained many new insights into the genetic and biochemical background of different conditions, rare diseases included, laying the basis for preclinical models that are helping to identify new biomarkers and therapeutic targets. Translational Medicine (TM) is an interdisciplinary area of biomedicine with an essential role in enhancing the bench-to-bedside transition, generating a circular flow of knowledge transfer between the research environment and the clinical setting, always centered on patient needs. Here, we present different tools used in TM and an overview of what is being done in relation to hereditary hemorrhagic telangiectasia (HHT), as a disease model. This work focuses on how this combination of basic and clinical research affects the daily clinical management of HHT patients, and also looks to the future. Further randomized clinical trials with HHT patients should assess the findings of this bench-to-bedside transition. The benefits of this combination of basic and clinical research may be important not only for HHT patients but also for patients with other vascular diseases that share angiogenic disturbances.
Affiliation(s)
- A Riera-Mestre
- HHT Unit. Internal Medicine Department. Hospital Universitari Bellvitge, C/ Feixa Llarga s/n., L'Hospitalet de Llobregat, Barcelona 08907, Spain; Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain; Faculty of Medicine and Health Sciences. Universitat de Barcelona, Barcelona, Spain
| | - P Cerdà
- HHT Unit. Internal Medicine Department. Hospital Universitari Bellvitge, C/ Feixa Llarga s/n., L'Hospitalet de Llobregat, Barcelona 08907, Spain; Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain
| | - A Iriarte
- HHT Unit. Internal Medicine Department. Hospital Universitari Bellvitge, C/ Feixa Llarga s/n., L'Hospitalet de Llobregat, Barcelona 08907, Spain; Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain
| | - M Graupera
- Endothelial Pathobiology and Microenvironment, Josep Carreras Leukaemia Research Institute, Barcelona 08916, Spain; CIBERONC, Instituto de Salud Carlos III, Madrid, Spain
| | - F Viñals
- Physiological Sciences Department. Faculty of Medicine and Health Sciences. Universitat de Barcelona, Barcelona, Spain; Program Against Cancer Therapeutic Resistance, Hospital Duran i Reynals, Institut Catala d'Oncologia, Barcelona, Spain; Oncobell Program, Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Spain
20
Porta-Pardo E, Ruiz-Serra V, Valentini S, Valencia A. The structural coverage of the human proteome before and after AlphaFold. PLoS Comput Biol 2022; 18:e1009818. [PMID: 35073311 PMCID: PMC8812986 DOI: 10.1371/journal.pcbi.1009818] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 08/20/2021] [Revised: 02/03/2022] [Accepted: 01/07/2022] [Indexed: 12/12/2022] Open
Abstract
The protein structure field is experiencing a revolution, from the increased throughput of techniques to determine experimental structures, to developments such as cryo-EM that allow us to find the structures of large protein complexes and, more recently, the development of artificial intelligence tools, such as AlphaFold, that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 48%, considering experimentally-derived or template-based homology models, rises to 76% when including AlphaFold predictions. At the same time the fraction of dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before AlphaFold release (69% of Clinvar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show how the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than that of the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.

Protein structures are key to understanding many biological phenomena at the molecular scale: from the effects of genetic variation to how different proteins interact with each other to create molecular pathways that, together, have a biological function. Obtaining experimental structures, however, is extremely costly in terms of both time and resources. For this and other reasons, scientists have long worked to develop computational approaches that predict the structure of a protein using only its sequence as input. Recently, a group of scientists at DeepMind have developed AlphaFold2, a computational tool that is extremely accurate at this task. Moreover, they have used this tool to predict the structures of all human proteins. In this manuscript we provide an overview of the structural coverage of the human proteome before AlphaFold models were released and how much we have gained thanks to these models. We also show how the gain affects our understanding of human pathogenic variants, both germline and somatic. Finally, we provide evidence suggesting that the gain in non-human organisms is larger than for the human proteome, particularly in the case of bacteria.
Affiliation(s)
- Eduard Porta-Pardo
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
- * E-mail: (EP-P); (AV)
| | - Victoria Ruiz-Serra
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
| | - Samuel Valentini
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Institució Catalana de Recerca Avançada (ICREA), Barcelona, Spain
- * E-mail: (EP-P); (AV)
|
21
|
Chen S, Liu Y, Zhang Y, Wierbowski SD, Lipkin SM, Wei X, Yu H. A full-proteome, interaction-specific characterization of mutational hotspots across human cancers. Genome Res 2022; 32:135-149. [PMID: 34963661 PMCID: PMC8744679 DOI: 10.1101/gr.275437.121] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/24/2021] [Accepted: 11/22/2021] [Indexed: 11/24/2022]
Abstract
Rapid accumulation of cancer genomic data has led to the identification of an increasing number of mutational hotspots with uncharacterized significance. Here we present a biologically informed computational framework that characterizes the functional relevance of all 1107 published mutational hotspots identified in approximately 25,000 tumor samples across 41 cancer types in the context of a human 3D interactome network, in which the interface of each interaction is mapped at residue resolution. Hotspots reside in network hub proteins and are enriched on protein interaction interfaces, suggesting that alteration of specific protein-protein interactions is critical for the oncogenicity of many hotspot mutations. Our framework enables, for the first time, systematic identification of specific protein interactions affected by hotspot mutations at the full proteome scale. Furthermore, by constructing a hotspot-affected network that connects all hotspot-affected interactions throughout the whole-human interactome, we uncover genome-wide relationships among hotspots and implicate novel cancer proteins that do not harbor hotspot mutations themselves. Moreover, applying our network-based framework to specific cancer types identifies clinically significant hotspots that can be used for prognosis and therapy targets. Overall, we show that our framework bridges the gap between the statistical significance of mutational hotspots and their biological and clinical significance in human cancers.
Affiliation(s)
- Siwei Chen
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Yuan Liu
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
| | - Yingying Zhang
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
| | - Shayne D Wierbowski
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
| | - Steven M Lipkin
- Department of Medicine, Weill Cornell Medicine, New York, New York 10021, USA
| | - Xiaomu Wei
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Department of Medicine, Weill Cornell Medicine, New York, New York 10021, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
|
22
|
Abstract
The biological significance of proteins has drawn the scientific community to explore their characteristics. These studies shed light on the interaction patterns and functions of proteins in a living body. The practical difficulties of reliable experimental techniques paved the way for computational methods of interaction prediction. Automated methods have reduced these difficulties but cannot yet replace experimental studies, as the field is still evolving. Because interaction prediction is critical, it requires highly accurate results, but none of the existing methods yet offers reliable performance comparable to experimental results. This article assesses the existing computational docking algorithms, their challenges, and future scope. Blind docking techniques are helpful when no information other than the individual structures is available. As more and more complex structures are added to different databases, information-driven approaches can be a good alternative. Artificial intelligence, which now dominates many major fields, is expected to take over this domain very shortly.
|
23
|
Raimondi F, Burkhart JG, Betts MJ, Russell RB, Wu G. Leveraging biochemical reactions to unravel functional impacts of cancer somatic variants affecting protein interaction interfaces. F1000Res 2021; 10:1111. [PMID: 36569594 PMCID: PMC9755755 DOI: 10.12688/f1000research.74395.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 10/27/2021] [Indexed: 07/26/2023] Open
Abstract
Background: Considering protein mutations in their biological context is essential for understanding their functional impact, interpretation of high-dimensional datasets and development of effective targeted therapies in personalized medicine. Methods: We combined the curated knowledge of biochemical reactions from Reactome with the analysis of interaction-mediating 3D interfaces from Mechismo. In addition, we provided a software tool for users to explore and browse the analysis results in a multi-scale perspective starting from pathways and reactions to protein-protein interactions and protein 3D structures. Results: We analyzed somatic mutations from TCGA, revealing several significantly impacted reactions and pathways in specific cancer types. We found examples of genes not yet listed as oncodrivers, whose rare mutations were predicted to affect cancer processes similarly to known oncodrivers. Some identified processes lack any known oncodrivers, which suggests potentially new cancer-related processes (e.g. complement cascade reactions). Furthermore, we found that mutations perturbing certain processes are significantly associated with distinct phenotypes (i.e. survival time) in specific cancer types (e.g. PIK3CA centered pathways in LGG and UCEC cancer types), suggesting the translational potential of our approach for patient stratification. Our analysis also uncovered several druggable processes (e.g. GPCR signalling pathways) containing enriched reactions, providing support for new off-label therapeutic options. Conclusions: In summary, we have established a multi-scale approach to study genetic variants based on protein-protein interaction 3D structures. Our approach is different from previously published studies in its focus on biochemical reactions and can be applied to other data types (e.g. post-translational modifications) collected for many types of disease.
Affiliation(s)
| | - Joshua G. Burkhart
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
| | - Matthew J. Betts
- Heidelberg University Biochemistry Center, University of Heidelberg, Heidelberg, Germany
- BioQuant, University of Heidelberg, Heidelberg, Germany
| | - Robert B. Russell
- Heidelberg University Biochemistry Center, University of Heidelberg, Heidelberg, Germany
- BioQuant, University of Heidelberg, Heidelberg, Germany
| | - Guanming Wu
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
|
24
|
Raimondi F, Burkhart JG, Betts MJ, Russell RB, Wu G. Leveraging biochemical reactions to unravel functional impacts of cancer somatic variants affecting protein interaction interfaces. F1000Res 2021; 10:1111. [PMID: 36569594 PMCID: PMC9755755 DOI: 10.12688/f1000research.74395.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/01/2022] [Indexed: 12/14/2022] Open
|
25
|
Raimondi F, Burkhart JG, Betts MJ, Russell RB, Wu G. Leveraging biochemical reactions to unravel functional impacts of cancer somatic variants affecting protein interaction interfaces. F1000Res 2021; 10:1111. [PMID: 36569594 PMCID: PMC9755755 DOI: 10.12688/f1000research.74395.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/25/2022] [Indexed: 07/26/2023] Open
|
26
|
Ershov P, Kaluzhskiy L, Mezentsev Y, Yablokov E, Gnedenko O, Ivanov A. Enzymes in the Cholesterol Synthesis Pathway: Interactomics in the Cancer Context. Biomedicines 2021; 9:biomedicines9080895. [PMID: 34440098 PMCID: PMC8389681 DOI: 10.3390/biomedicines9080895] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 07/01/2021] [Revised: 07/20/2021] [Accepted: 07/22/2021] [Indexed: 02/06/2023] Open
Abstract
A global protein interactome ensures the maintenance of regulatory, signaling and structural processes in cells, but at the same time, aberrations in the repertoire of protein-protein interactions usually lead to disease onset. Many metabolic enzymes catalyze the multistage transformation of cholesterol precursors in the cholesterol biosynthesis pathway. Cancer-associated deregulation of these enzymes through various molecular mechanisms results in pathological accumulation of cholesterol or its precursors, which can be a disease risk factor. This work is aimed at the systematization and bioinformatic analysis of the available interactomics data on seventeen enzymes in the cholesterol pathway, encoded by the HMGCR, MVK, PMVK, MVD, FDPS, FDFT1, SQLE, LSS, DHCR24, CYP51A1, TM7SF2, MSMO1, NSDHL, HSD17B7, EBP, SC5D and DHCR7 genes. A spectrum of 165 unique and 21 common protein partners that physically interact with the target enzymes was selected from several interactomic resources. Among them were 47 modifying proteins from different protein kinase/phosphatase and ubiquitin-protein ligase/deubiquitinase families. A literature search, enrichment and gene co-expression analysis showed that about a quarter of the identified protein partners were associated with cancer hallmarks and over-represented in cancer pathways. Our results update the current fundamental view of protein-protein interactions and regulatory aspects of the cholesterol synthesis enzymes and annotate their sub-interactomes in terms of possible involvement in cancers, which will contribute to the prioritization of protein targets for future drug development.
|
27
|
Shivakumar M, Miller JE, Dasari VR, Zhang Y, Lee MTM, Carey DJ, Gogoi R, Kim D. Genetic Analysis of Functional Rare Germline Variants across Nine Cancer Types from an Electronic Health Record Linked Biobank. Cancer Epidemiol Biomarkers Prev 2021; 30:1681-1688. [PMID: 34244158 DOI: 10.1158/1055-9965.epi-21-0082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2021] [Revised: 02/15/2021] [Accepted: 06/17/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND: Rare variants play an essential role in the etiology of cancer. In this study, we aim to characterize rare germline variants that impact the risk of cancer. METHODS: We performed a genome-wide rare variant analysis using germline whole exome sequencing (WES) data derived from the Geisinger MyCode initiative to discover cancer predisposition variants. The case-control association analysis was conducted by binning variants in 5,538 patients with cancer and 7,286 matched controls in a discovery set and 1,991 patients with cancer and 2,504 matched controls in a validation set across nine cancer types. Further, The Cancer Genome Atlas (TCGA) germline data were used to replicate the findings. RESULTS: We identified 133 significant pathway-cancer pairs (85 replicated) and 90 significant gene-cancer pairs (12 replicated). In addition, we identified 18 genes and 3 pathways that were associated with survival outcome across cancers (Bonferroni P < 0.05). CONCLUSIONS: In this study, we identified potential predisposition genes and pathways based on rare variants in nine cancers. IMPACT: This work adds to the knowledge base and progress being made in precision medicine.
Affiliation(s)
- Manu Shivakumar
- Biomedical & Translational Informatics Institute, Geisinger, Danville, Pennsylvania
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Jason E Miller
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
| | | | - Yanfei Zhang
- Genomic Medicine Institute, Geisinger, Danville, Pennsylvania
| | | | - David J Carey
- Department of Molecular and Functional Genomics, Geisinger, Danville, Pennsylvania
| | - Radhika Gogoi
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania.
|
28
|
The Role of Fibroblast Growth Factor 19 in Hepatocellular Carcinoma. Am J Pathol 2021; 191:1180-1192. [PMID: 34000282 DOI: 10.1016/j.ajpath.2021.04.014] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 11/09/2020] [Revised: 04/09/2021] [Accepted: 04/22/2021] [Indexed: 12/12/2022]
Abstract
Hepatocellular carcinoma (HCC) is the fifth most common type of cancer and the third leading cause of cancer-related deaths worldwide. Liver resection or liver transplantation is the most effective therapy for HCC because drugs approved by the US Food and Drug Administration to treat patients with unresectable HCC have an unfavorable overall survival rate. Therefore, the development of biomarkers for early diagnosis and effective therapy strategies are still necessary to improve patient outcomes. Fibroblast growth factor (FGF) 19 was amplified in patients with HCC from various studies, including patients from The Cancer Genome Atlas. FGF19 plays a syngeneic function with other signaling pathways in primary liver cancer development, such as epidermal growth factor receptor, Wnt/β-catenin, the endoplasmic reticulum-related signaling pathway, STAT3/IL-6, RAS, and extracellular signal-regulated protein kinase, among others. The current review presents a comprehensive description of the FGF19 signaling pathway involved in liver cancer development. The use of big data and bioinformatic analysis can provide useful clues for further studies of the FGF19 pathway in HCC, including its application as a biomarker, targeted therapy, and combination therapy strategies.
|
29
|
Pathogenic missense protein variants affect different functional pathways and proteomic features than healthy population variants. PLoS Biol 2021; 19:e3001207. [PMID: 33909605 PMCID: PMC8110273 DOI: 10.1371/journal.pbio.3001207] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/31/2020] [Revised: 05/10/2021] [Accepted: 03/26/2021] [Indexed: 12/27/2022] Open
Abstract
Missense variants are present amongst the healthy population, but some of them are causative of human diseases. A classification of variants associated with “healthy” or “diseased” states is therefore not always straightforward. A deeper understanding of the nature of missense variants in health and disease, the cellular processes they may affect, and the general molecular principles which underlie these differences is essential to offer mechanistic explanations of the true impact of pathogenic variants. Here, we have formalised a statistical framework which enables robust probabilistic quantification of variant enrichment across full-length proteins, their domains, and 3D structure-defined regions. Using this framework, we validate and extend previously reported trends of variant enrichment in different protein structural regions (surface/core/interface). By examining the association of variant enrichment with available functional pathways and transcriptomic and proteomic (protein half-life, thermal stability, abundance) data, we have mined a rich set of molecular features which distinguish between pathogenic and population variants: pathogenic variants mainly affect proteins involved in cell proliferation and nucleotide processing and are enriched in more abundant proteins. Additionally, rare population variants display features closer to common than pathogenic variants. We validate the association between these molecular features and variant pathogenicity by comparing against existing in silico variant impact annotations. This study provides molecular details into how different proteins exhibit resilience and/or sensitivity towards missense variants and provides the rationale to prioritise variant-enriched proteins and protein domains for therapeutic targeting and development. The ZoomVar database, which we created for this study, is available at fraternalilab.kcl.ac.uk/ZoomVar. It allows users to programmatically annotate missense variants with protein structural information and to calculate variant enrichment in different protein structural regions.
How can one improve the classification of genetic variants as harmful or harmless? This study uses a robust statistical analysis to exploit the interplay between protein structure, proteomic measurements and functional pathways to enable better discrimination between missense variants in health and disease.
|
30
|
Huang KL, Scott AD, Zhou DC, Wang LB, Weerasinghe A, Elmas A, Liu R, Wu Y, Wendl MC, Wyczalkowski MA, Baral J, Sengupta S, Lai CW, Ruggles K, Payne SH, Raphael B, Fenyö D, Chen K, Mills G, Ding L. Spatially interacting phosphorylation sites and mutations in cancer. Nat Commun 2021; 12:2313. [PMID: 33875650 PMCID: PMC8055881 DOI: 10.1038/s41467-021-22481-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/16/2020] [Accepted: 02/17/2021] [Indexed: 11/18/2022] Open
Abstract
Advances in mass-spectrometry have generated increasingly large-scale proteomics datasets containing tens of thousands of phosphorylation sites (phosphosites) that require prioritization. We develop a bioinformatics tool called HotPho and systematically discover 3D co-clustering of phosphosites and cancer mutations on protein structures. HotPho identifies 474 such hybrid clusters containing 1255 co-clustering phosphosites, including RET p.S904/Y928, the conserved HRAS/KRAS p.Y96, and IDH1 p.Y139/IDH2 p.Y179 that are adjacent to recurrent mutations on protein structures not found by linear proximity approaches. Hybrid clusters, enriched in histone and kinase domains, frequently include expression-associated mutations experimentally shown as activating and conferring genetic dependency. Approximately 300 co-clustering phosphosites are verified in patient samples of 5 cancer types or previously implicated in cancer, including CTNNB1 p.S29/Y30, EGFR p.S720, MAPK1 p.S142, and PTPN12 p.S275. In summary, systematic 3D clustering analysis highlights nearly 3,000 likely functional mutations and over 1000 cancer phosphosites for downstream investigation and evaluation of potential clinical relevance.
Affiliation(s)
- Kuan-Lin Huang
- Department of Genetics and Genomics, Tisch Cancer Institute, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Adam D Scott
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Daniel Cui Zhou
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Liang-Bo Wang
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Amila Weerasinghe
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Abdulkadir Elmas
- Department of Genetics and Genomics, Tisch Cancer Institute, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ruiyang Liu
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Yige Wu
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Michael C Wendl
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Matthew A Wyczalkowski
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Jessika Baral
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Sohini Sengupta
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
| | - Chin-Wen Lai
- Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO, USA
| | - Kelly Ruggles
- Center for Health Informatics and Bioinformatics, New York University School of Medicine, New York, NY, USA
| | - Samuel H Payne
- Department of Biology, Brigham Young University, Provo, UT, USA
| | - Benjamin Raphael
- Lewis-Sigler Institute, Princeton University, Princeton, NJ, USA
| | - David Fenyö
- Center for Health Informatics and Bioinformatics, New York University School of Medicine, New York, NY, USA
| | - Ken Chen
- Departments of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Gordon Mills
- Knight Cancer Institute, Oregon Health & Science University, Portland, OR, USA
| | - Li Ding
- Department of Medicine, McDonnell Genome Institute, Department of Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA.
|
31
|
Krassowski M, Pellegrina D, Mee MW, Fradet-Turcotte A, Bhat M, Reimand J. ActiveDriverDB: Interpreting Genetic Variation in Human and Cancer Genomes Using Post-translational Modification Sites and Signaling Networks (2021 Update). Front Cell Dev Biol 2021; 9:626821. [PMID: 33834021 PMCID: PMC8021862 DOI: 10.3389/fcell.2021.626821] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/06/2020] [Accepted: 02/08/2021] [Indexed: 12/14/2022] Open
Abstract
Deciphering the functional impact of genetic variation is required to understand phenotypic diversity and the molecular mechanisms of inherited disease and cancer. While millions of genetic variants are now mapped in genome sequencing projects, distinguishing functional variants remains a major challenge. Protein-coding variation can be interpreted using post-translational modification (PTM) sites that are core components of cellular signaling networks controlling molecular processes and pathways. ActiveDriverDB is an interactive proteo-genomics database that uses more than 260,000 experimentally detected PTM sites to predict the functional impact of genetic variation in disease, cancer and the human population. Using machine learning tools, we prioritize proteins and pathways with enriched PTM-specific amino acid substitutions that potentially rewire signaling networks via induced or disrupted short linear motifs of kinase binding. We then map these effects to site-specific protein interaction networks and drug targets. In the 2021 update, we increased the PTM datasets by nearly 50%, included glycosylation, sumoylation and succinylation as new types of PTMs, and updated the workflows to interpret inherited disease mutations. We added a recent phosphoproteomics dataset reflecting the cellular response to SARS-CoV-2 to predict the impact of human genetic variation on COVID-19 infection and disease course. Overall, we estimate that 16-21% of known amino acid substitutions affect PTM sites among pathogenic disease mutations, somatic mutations in cancer genomes and germline variants in the human population. These data underline the potential of interpreting genetic variation through the lens of PTMs and signaling networks. The open-source database is freely available at www.ActiveDriverDB.org.
Affiliation(s)
- Michal Krassowski
  - Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
- Diogo Pellegrina
  - Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Miles W. Mee
  - Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Amelie Fradet-Turcotte
  - Department of Molecular Biology, Medical Biochemistry and Pathology, Universite Laval, Quebec, QC, Canada
  - Oncology Division, Centre Hospitalier Universitaire (CHU) de Quebec-Universite Laval Research Center, Quebec, QC, Canada
- Mamatha Bhat
  - Multiorgan Transplant Program, University Health Network, Toronto, ON, Canada
  - Division of Gastroenterology & Hepatology, Department of Medicine, University of Toronto, Toronto, ON, Canada
- Jüri Reimand
  - Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada

32
Murray D, Petrey D, Honig B. Integrating 3D structural information into systems biology. J Biol Chem 2021; 296:100562. [PMID: 33744294 PMCID: PMC8095114 DOI: 10.1016/j.jbc.2021.100562] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/31/2020] [Revised: 02/18/2021] [Accepted: 03/17/2021] [Indexed: 12/12/2022]
Abstract
Systems biology is a data-heavy field that focuses on systems-wide depictions of biological phenomena necessarily sacrificing a detailed characterization of individual components. As an example, genome-wide protein interaction networks are widely used in systems biology and continuously extended and refined as new sources of evidence become available. Despite the vast amount of information about individual protein structures and protein complexes that has accumulated in the past 50 years in the Protein Data Bank, the data, computational tools, and language of structural biology are not an integral part of systems biology. However, increasing effort has been devoted to this integration, and the related literature is reviewed here. Relationships between proteins that are detected via structural similarity offer a rich source of information not available from sequence similarity, and homology modeling can be used to leverage Protein Data Bank structures to produce 3D models for a significant fraction of many proteomes. A number of structure-informed genomic and cross-species (i.e., virus–host) interactomes will be described, and the unique information they provide will be illustrated with a number of examples. Tissue- and tumor-specific interactomes have also been developed through computational strategies that exploit patient information and through genetic interactions available from increasingly sensitive screens. Strategies to integrate structural information with these alternate data sources will be described. Finally, efforts to link protein structure space with chemical compound space offer novel sources of information in drug design, off-target identification, and the identification of targets for compounds found to be effective in phenotypic screens.
Affiliation(s)
- Diana Murray
  - Department of Systems Biology, Columbia University, New York, New York, USA
- Donald Petrey
  - Department of Systems Biology, Columbia University, New York, New York, USA
- Barry Honig
  - Department of Systems Biology, Department of Biochemistry and Molecular Biophysics, Department of Medicine, Zuckerman Mind Brain and Behavior Institute, Columbia University, New York, New York, USA

33
Guo Z, Fu Y, Huang C, Zheng C, Wu Z, Chen X, Gao S, Ma Y, Shahen M, Li Y, Tu P, Zhu J, Wang Z, Xiao W, Wang Y. NOGEA: A Network-oriented Gene Entropy Approach for Dissecting Disease Comorbidity and Drug Repositioning. Genomics Proteomics Bioinformatics 2021; 19:549-564. [PMID: 33744433 PMCID: PMC9040018 DOI: 10.1016/j.gpb.2020.06.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2019] [Revised: 04/04/2020] [Accepted: 09/24/2020] [Indexed: 10/31/2022]
Abstract
Rapid development of high-throughput technologies has permitted the identification of an increasing number of disease-associated genes (DAGs), which are important for understanding disease initiation and developing precision therapeutics. However, DAGs often contain large amounts of redundant or false positive information, leading to difficulties in quantifying and prioritizing potential relationships between these DAGs and human diseases. In this study, a network-oriented gene entropy approach (NOGEA) is proposed for accurately inferring master genes that contribute to specific diseases by quantitatively calculating their perturbation abilities on directed disease-specific gene networks. In addition, we confirmed that the master genes identified by NOGEA have a high reliability for predicting disease-specific initiation events and progression risk. Master genes may also be used to extract the underlying information of different diseases, thus revealing mechanisms of disease comorbidity. More importantly, approved therapeutic targets are topologically localized in a small neighborhood of master genes on the interactome network, which provides a new way for predicting drug-disease associations. Through this method, 11 old drugs were newly identified and predicted to be effective for treating pancreatic cancer and then validated by in vitro experiments. Collectively, the NOGEA was useful for identifying master genes that control disease initiation and co-occurrence, thus providing a valuable strategy for drug efficacy screening and repositioning. NOGEA codes are publicly available at https://github.com/guozihuaa/NOGEA.
Affiliation(s)
- Zihu Guo
  - College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China
- Yingxue Fu
  - College of Life Science, Northwest A & F University, Yangling 712100, China
- Chao Huang
  - College of Life Science, Northwest A & F University, Yangling 712100, China
- Chunli Zheng
  - College of Life Science, Northwest University, Xi'an 710069, China
- Ziyin Wu
  - College of Life Science, Northwest A & F University, Yangling 712100, China
- Xuetong Chen
  - College of Life Science, Northwest A & F University, Yangling 712100, China
- Shuo Gao
  - College of Life Science, Northwest A & F University, Yangling 712100, China
- Yaohua Ma
  - College of Life Science, Northwest University, Xi'an 710069, China
- Mohamed Shahen
  - Zoology Department, Faculty of Science, Tanta University, Tanta 31527, Egypt
- Yan Li
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Faculty of Chemical, Environmental and Biological Science and Technology, Dalian University of Technology, Dalian 116024, China
- Pengfei Tu
  - State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Jingbo Zhu
  - School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China
- Zhenzhong Wang
  - State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
- Wei Xiao
  - State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
- Yonghua Wang
  - College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China; State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China

34
Identification of Breast Cancer Subtype-Specific Biomarkers by Integrating Copy Number Alterations and Gene Expression Profiles. Medicina (Kaunas) 2021; 57:261. [PMID: 33809336 PMCID: PMC7998437 DOI: 10.3390/medicina57030261] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/07/2021] [Revised: 03/01/2021] [Accepted: 03/09/2021] [Indexed: 12/20/2022]
Abstract
Background and Objectives: Breast cancer is a heterogeneous disease categorized into four subtypes. Previous studies have shown that copy number alterations of several genes are implicated with the development and progression of many cancers. This study evaluates the effects of DNA copy number alterations on gene expression levels in different breast cancer subtypes. Materials and Methods: We performed a computational analysis integrating copy number alterations and gene expression profiles in 1024 breast cancer samples grouped into four molecular subtypes: luminal A, luminal B, HER2, and basal. Results: Our analyses identified several genes correlated in all subtypes such as KIAA1967 and MCPH1. In addition, several subtype-specific genes that showed a significant correlation between copy number and gene expression profiles were detected: SMARCB1, AZIN1, MTDH in luminal A, PPP2R5E, APEX1, GCN5 in luminal B, TNFAIP1, PCYT2, DIABLO in HER2, and FAM175B, SENP5, SCAF1 in basal subtype. Conclusions: This study showed that computational analyses integrating copy number and gene expression can contribute to unveil the molecular mechanisms of cancer and identify new subtype-specific biomarkers.

35
Mészáros B, Hajdu-Soltész B, Zeke A, Dosztányi Z. Mutations of Intrinsically Disordered Protein Regions Can Drive Cancer but Lack Therapeutic Strategies. Biomolecules 2021; 11:381. [PMID: 33806614 PMCID: PMC8000335 DOI: 10.3390/biom11030381] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/25/2021] [Revised: 02/22/2021] [Accepted: 02/24/2021] [Indexed: 12/22/2022]
Abstract
Many proteins contain intrinsically disordered regions (IDRs) which carry out important functions without relying on a single well-defined conformation. IDRs are increasingly recognized as critical elements of regulatory networks and have been also associated with cancer. However, it is unknown whether mutations targeting IDRs represent a distinct class of driver events associated with specific molecular and system-level properties, cancer types and treatment options. Here, we used an integrative computational approach to explore the direct role of intrinsically disordered protein regions driving cancer. We showed that around 20% of cancer drivers are primarily targeted through a disordered region. These IDRs can function in multiple ways which are distinct from the functional mechanisms of ordered drivers. Disordered drivers play a central role in context-dependent interaction networks and are enriched in specific biological processes such as transcription, gene expression regulation and protein degradation. Furthermore, their modulation represents an alternative mechanism for the emergence of all known cancer hallmarks. Importantly, in certain cancer patients, mutations of disordered drivers represent key driving events. However, treatment options for such patients are currently severely limited. The presented study highlights a largely overlooked class of cancer drivers associated with specific cancer types that need novel therapeutic options.
Affiliation(s)
- Bálint Mészáros
  - Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
  - EMBL Heidelberg, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Borbála Hajdu-Soltész
  - Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
- András Zeke
  - Institute of Enzymology, RCNS, P.O. Box 7, H-1518 Budapest, Hungary
- Zsuzsanna Dosztányi
  - Department of Biochemistry, ELTE Eötvös Loránd University, H-1117 Budapest, Hungary
  - Correspondence: Tel.: +36-1-372 2500/8537

36
Comprehensive characterization of protein-protein interactions perturbed by disease mutations. Nat Genet 2021; 53:342-353. [PMID: 33558758 DOI: 10.1038/s41588-020-00774-y] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Received: 02/27/2020] [Accepted: 12/22/2020] [Indexed: 02/07/2023]
Abstract
Technological and computational advances in genomics and interactomics have made it possible to identify how disease mutations perturb protein-protein interaction (PPI) networks within human cells. Here, we show that disease-associated germline variants are significantly enriched in sequences encoding PPI interfaces compared to variants identified in healthy participants from the projects 1000 Genomes and ExAC. Somatic missense mutations are also significantly enriched in PPI interfaces compared to noninterfaces in 10,861 tumor exomes. We computationally identified 470 putative oncoPPIs in a pan-cancer analysis and demonstrate that oncoPPIs are highly correlated with patient survival and drug resistance/sensitivity. We experimentally validate the network effects of 13 oncoPPIs using a systematic binary interaction assay, and also demonstrate the functional consequences of two of these on tumor cell growth. In summary, this human interactome network framework provides a powerful tool for prioritization of alleles with PPI-perturbing mutations to inform pathobiological mechanism- and genotype-based therapeutic discovery.

37
How wide is the application of genetic big data in biomedicine. Biomed Pharmacother 2020; 133:111074. [PMID: 33378973 DOI: 10.1016/j.biopha.2020.111074] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/22/2020] [Revised: 11/16/2020] [Accepted: 11/27/2020] [Indexed: 12/17/2022]
Abstract
In the era of big data, massive genetic data, as a new industry, has quickly swept almost all industries, especially the pharmaceutical industry. As countries around the world start to build their own gene banks, scientists study the data to explore the origins and migration of humans. Moreover, big data encourages the development of cancer therapy and brings good news to cancer patients. Big data has been involved in the study of many diseases, and it has been found that analyzing diseases at the gene level can lead to more beneficial treatment options than ordinary treatments. This review introduces the development of extensive data in medical research from the perspective of big data and tumors, neurological and psychiatric diseases, cardiovascular diseases, other applications, and the development direction of big data in medicine.

38
Porta‐Pardo E, Valencia A, Godzik A. Understanding oncogenicity of cancer driver genes and mutations in the cancer genomics era. FEBS Lett 2020; 594:4233-4246. [PMID: 32239503 PMCID: PMC7529711 DOI: 10.1002/1873-3468.13781] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/03/2019] [Revised: 01/23/2020] [Accepted: 02/09/2020] [Indexed: 12/12/2022]
Abstract
One of the key challenges of cancer biology is to catalogue and understand the somatic genomic alterations leading to cancer. Although alternative definitions and search methods have been developed to identify cancer driver genes and mutations, analyses of thousands of cancer genomes return a remarkably similar catalogue of around 300 genes that are mutated in at least one cancer type. Yet, many features of these genes and their role in cancer remain unclear, first and foremost when a somatic mutation is truly oncogenic. In this review, we first summarize some of the recent efforts in completing the catalogue of cancer driver genes. Then, we give an overview of different aspects that influence the oncogenicity of somatic mutations in the core cancer driver genes, including their interactions with the germline genome, other cancer driver mutations, the immune system, or their potential role in healthy tissues. In the coming years, this research holds promise to illuminate how, when, and why cancer driver genes and mutations are really drivers, and thereby move personalized cancer medicine and targeted therapies forward.
Affiliation(s)
- Eduard Porta‐Pardo
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Institucio Catalana de Recerca I Estudis Avançats (ICREA), Barcelona, Spain
- Adam Godzik
  - Division of Biomedical Sciences, University of California Riverside School of Medicine, Riverside, CA, USA

39
Lyu J, Li JJ, Su J, Peng F, Chen YE, Ge X, Li W. DORGE: Discovery of Oncogenes and tumoR suppressor genes using Genetic and Epigenetic features. Sci Adv 2020; 6:eaba6784. [PMID: 33177077 PMCID: PMC7673741 DOI: 10.1126/sciadv.aba6784] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 12/23/2019] [Accepted: 09/29/2020] [Indexed: 05/09/2023]
Abstract
Data-driven discovery of cancer driver genes, including tumor suppressor genes (TSGs) and oncogenes (OGs), is imperative for cancer prevention, diagnosis, and treatment. Although epigenetic alterations are important for tumor initiation and progression, most known driver genes were identified based on genetic alterations alone. Here, we developed an algorithm, DORGE (Discovery of Oncogenes and tumor suppressoR genes using Genetic and Epigenetic features), to identify TSGs and OGs by integrating comprehensive genetic and epigenetic data. DORGE identified histone modifications as strong predictors for TSGs, and it found missense mutations, super enhancers, and methylation differences as strong predictors for OGs. We extensively validated DORGE-predicted cancer driver genes using independent functional genomics data. We also found that DORGE-predicted dual-functional genes (both TSGs and OGs) are enriched at hubs in protein-protein interaction and drug-gene networks. Overall, our study has deepened the understanding of epigenetic mechanisms in tumorigenesis and revealed previously undetected cancer driver genes.
Affiliation(s)
- Jie Lyu
  - Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA 92697, USA
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Jianzhong Su
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Fanglue Peng
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Yiling Elaine Chen
  - Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Xinzhou Ge
  - Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Wei Li
  - Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA 92697, USA

40
Khalighi S, Singh S, Varadan V. Untangling a complex web: Computational analyses of tumor molecular profiles to decode driver mechanisms. J Genet Genomics 2020; 47:595-609. [PMID: 33423960 PMCID: PMC7902422 DOI: 10.1016/j.jgg.2020.11.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2020] [Revised: 11/04/2020] [Accepted: 11/14/2020] [Indexed: 12/19/2022]
Abstract
Genome-scale studies focusing on molecular profiling of cancers across tissue types have revealed a plethora of aberrations across the genomic, transcriptomic, and epigenomic scales. The significant molecular heterogeneity across individual tumors even within the same tissue context complicates decoding the key etiologic mechanisms of this disease. Furthermore, it is increasingly likely that biologic mechanisms underlying the pathobiology of cancer involve multiple molecular entities interacting across functional scales. This has motivated the development of computational approaches that integrate molecular measurements with prior biological knowledge in increasingly intricate ways to enable the discovery of driver genomic aberrations across cancers. Here, we review diverse methodological approaches that have powered significant advances in our understanding of the genomic underpinnings of cancer at the cohort and at the individual tumor scales. We outline the key advances and challenges in the computational discovery of cancer mechanisms while motivating the development of systems biology approaches to comprehensively decode the biologic drivers of this complex disease.
Affiliation(s)
- Sirvan Khalighi
  - Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Salendra Singh
  - Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Vinay Varadan
  - Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA

41
Rehmat N, Farooq H, Kumar S, Ul Hussain S, Naveed H. Predicting the pathogenicity of protein coding mutations using Natural Language Processing. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020:5842-5846. [PMID: 33019302 DOI: 10.1109/embc44109.2020.9175781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
DNA-Sequencing of tumor cells has revealed thousands of genetic mutations. However, cancer is caused by only some of them. Identifying mutations that contribute to tumor growth from neutral ones is extremely challenging and is currently carried out manually. This manual annotation is very cumbersome and expensive in terms of time and money. In this study, we introduce a novel method "NLP-SNPPred" to read scientific literature and learn the implicit features that cause certain variations to be pathogenic. Precisely, our method ingests the bio-medical literature and produces its vector representation via exploiting state of the art NLP methods like sent2vec, word2vec and tf-idf. These representations are then fed to machine learning predictors to identify the pathogenic versus neutral variations. Our best model (NLPSNPPred) trained on OncoKB and evaluated on several publicly available benchmark datasets, outperformed state of the art function prediction methods. Our results show that NLP can be used effectively in predicting functional impact of protein coding variations with minimal complementary biological features. Moreover, encoding biological knowledge into the right representations, combined with machine learning methods can help in automating manual efforts. A free to use web-server is available at http://www.nlp-snppred.cbrlab.org.

42
Sarkar A, Atay Y, Erickson AL, Arisi I, Saltini C, Kahveci T. An Efficient Algorithm for Identifying Mutated Subnetworks Associated with Survival in Cancer. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1582-1594. [PMID: 30990435 DOI: 10.1109/tcbb.2019.2911069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
A protein-protein interaction (PPI) network models interconnections between protein-encoding genes. Groups of proteins that perform similar functions are often connected to each other in the PPI network. The corresponding genes form pathways or functional modules. Mutations in protein-encoding genes affect the behavior of pathways, which results in the initiation, progression, and severity of diseases that propagate through pathways. In this work, we integrate mutation data, patient survival information, and the PPI network to identify connected subnetworks associated with survival. We define the computational problem using a fitness function called the log-rank statistic to score subnetworks. The log-rank statistic compares survival between two populations. We propose a novel method, Survival Associated Mutated Subnetwork (SAMS), that adopts a genetic algorithm strategy to find the connected subnetwork within the PPI network whose mutation yields the highest log-rank statistic. We test on real cancer and synthetic datasets. SAMS generates solutions in negligible time, while the state-of-the-art method in the literature takes exponential time. The log-rank statistics of SAMS-selected mutated subnetworks are comparable to those of that method. Our resulting gene sets show significant overlap with well-known cancer driver genes derived from curated datasets and studies in the literature, display high text-mining scores in terms of the number of citations combined with disease-specific keywords in PubMed, and identify pathways having high biological relevance.

43
Zhang N, Lu H, Chen Y, Zhu Z, Yang Q, Wang S, Li M. PremPRI: Predicting the Effects of Missense Mutations on Protein-RNA Interactions. Int J Mol Sci 2020; 21:5560. [PMID: 32756481 PMCID: PMC7432928 DOI: 10.3390/ijms21155560] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/29/2020] [Revised: 07/28/2020] [Accepted: 07/30/2020] [Indexed: 12/23/2022]
Abstract
Protein–RNA interactions are crucial for many cellular processes, such as protein synthesis and regulation of gene expression. Missense mutations that alter protein–RNA interaction may contribute to the pathogenesis of many diseases. Here, we introduce a new computational method PremPRI, which predicts the effects of single mutations occurring in RNA binding proteins on the protein–RNA interactions by calculating the binding affinity changes quantitatively. The multiple linear regression scoring function of PremPRI is composed of three sequence- and eight structure-based features, and is parameterized on 248 mutations from 50 protein–RNA complexes. Our model shows a good agreement between calculated and experimental values of binding affinity changes with a Pearson correlation coefficient of 0.72 and the corresponding root-mean-square error of 0.76 kcal·mol⁻¹, outperforming three other available methods. PremPRI can be used for finding functionally important variants, understanding the molecular mechanisms, and designing new protein–RNA interaction inhibitors.

44
Kobren SN, Chazelle B, Singh M. PertInInt: An Integrative, Analytical Approach to Rapidly Uncover Cancer Driver Genes with Perturbed Interactions and Functionalities. Cell Syst 2020; 11:63-74.e7. [PMID: 32711844 PMCID: PMC7493809 DOI: 10.1016/j.cels.2020.06.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/02/2019] [Revised: 02/23/2020] [Accepted: 06/05/2020] [Indexed: 12/12/2022]
Abstract
A major challenge in cancer genomics is to identify genes with functional roles in cancer and uncover their mechanisms of action. We introduce an integrative framework that identifies cancer-relevant genes by pinpointing those whose interaction or other functional sites are enriched in somatic mutations across tumors. We derive analytical calculations that enable us to avoid time-prohibitive permutation-based significance tests, making it computationally feasible to simultaneously consider multiple measures of protein site functionality. Our accompanying software, PertInInt, combines knowledge about sites participating in interactions with DNA, RNA, peptides, ions, or small molecules with domain, evolutionary conservation, and gene-level mutation data. When applied to 10,037 tumor samples, PertInInt uncovers both known and newly predicted cancer genes, while additionally revealing what types of interactions or other functionalities are disrupted. PertInInt’s analysis demonstrates that somatic mutations are frequently enriched in interaction sites and domains and implicates interaction perturbation as a pervasive cancer-driving event. A fast, analytical framework called PertInInt enables efficient integration of multiple measures of protein site functionality—including interaction, domain, and evolutionary conservation—with gene-level mutation data in order to rapidly detect cancer driver genes along with their disrupted functionalities.
Affiliation(s)
- Shilpa Nadimpalli Kobren
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Bernard Chazelle
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA

45
Sedova M, Iyer M, Li Z, Jaroszewski L, Post KW, Hrabe T, Porta-Pardo E, Godzik A. Cancer3D 2.0: interactive analysis of 3D patterns of cancer mutations in cancer subsets. Nucleic Acids Res 2019; 47:D895-D899. [PMID: 30407596 PMCID: PMC6324060 DOI: 10.1093/nar/gky1098] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/15/2018] [Accepted: 11/07/2018] [Indexed: 01/17/2023]
Abstract
Our knowledge of cancer genomics has exploded in the last several years, providing us with detailed knowledge of genetic alterations in almost all cancer types. Analysis of these data gave us new insights into molecular aspects of cancer, the most important being the amazing diversity of molecular abnormalities in individual cancers. The most important question in cancer research today is how to classify this diversity to identify subtypes that are most relevant for treatment and outcome prediction for individual patients. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of the structures of the proteins they are found in and in relation to patients’ clinical data. This approach allows users to find novel candidate driver regions for specific subgroups that often cannot be found when similar analyses are done at the whole-gene level and for large, diverse cohorts. An interactive interface allows users to visualize the distribution of mutations in subgroups defined by cancer type and stage, gender and age brackets, or patient ethnicity or, vice versa, to find the dominant cancer type, gender or age groups for specific three-dimensional mutation patterns.
Collapse
Affiliation(s)
- Mayya Sedova
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Mallika Iyer
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Graduate School of Biomedical Sciences, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Zhanwen Li
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Lukasz Jaroszewski
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Kai W Post
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Thomas Hrabe
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Adam Godzik
- Bioinformatics and Systems Biology Program, Sanford Burnham Prebys Medical Discovery Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- To whom correspondence should be addressed. Tel: +1 858 646 3100; Fax: +858 646 3199;
46
Bürtin F, Mullins CS, Linnebacher M. Mouse models of colorectal cancer: Past, present and future perspectives. World J Gastroenterol 2020; 26:1394-1426. [PMID: 32308343 PMCID: PMC7152519 DOI: 10.3748/wjg.v26.i13.1394] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 12/18/2019] [Revised: 03/05/2020] [Accepted: 03/09/2020] [Indexed: 02/06/2023] Open
Abstract
Colorectal cancer (CRC) is the third most commonly diagnosed malignancy among both sexes in the United States as well as in the European Union. While the incidence and mortality rates in western, highly developed countries are declining, reflecting the success of screening programs and improved treatment regimens, a rise of the overall global CRC burden can be observed due to lifestyle changes paralleling an increasing human development index. Despite a growing insight into the biology of CRC and many therapeutic improvements in recent decades, preclinical in vivo models are still indispensable for the development of new treatment approaches. Since the development of carcinogen-induced rodent models for CRC more than 80 years ago, a plethora of animal models has been established to study colon cancer biology. Despite tenuous invasiveness and metastatic behavior, these models are useful for chemoprevention studies and for evaluating colitis-related carcinogenesis. Genetically engineered mouse models (GEMM) mirror the pathogenesis of sporadic as well as inherited CRC depending on the specific molecular pathways activated or inhibited. Although the vast majority of CRC GEMM lack invasiveness, metastasis and tumor heterogeneity, they have still proven useful for examination of the tumor microenvironment as well as systemic immune responses, thus supporting the development of new therapeutic avenues. Induction of metastatic disease by orthotopic injection of CRC cell lines is possible, but the models so generated lack genetic diversity and the number of suitable cell lines is very limited. Patient-derived xenografts, in contrast, maintain the pathological and molecular characteristics of the individual patient’s CRC after subcutaneous implantation into immunodeficient mice and are therefore most reliable for preclinical drug development, even in comparison to GEMM or cell line-based analyses. However, subcutaneous patient-derived xenograft models are less suitable for studying most aspects of the tumor microenvironment and anti-tumoral immune responses. The authors review the distinct mouse models of CRC with an emphasis on their clinical relevance and shed light on the latest developments in the field of preclinical CRC models.
Affiliation(s)
- Florian Bürtin
- Department of General, Visceral, Vascular and Transplantation Surgery, University Medical Center Rostock, University of Rostock, Rostock 18057, Germany
- Christina S Mullins
- Department of Thoracic Surgery, University Medical Center Rostock, University of Rostock, Rostock 18057, Germany
- Michael Linnebacher
- Molecular Oncology and Immunotherapy, Department of General, Visceral, Vascular and Transplantation Surgery, University Medical Center Rostock, Rostock 18057, Germany
47
Dinstag G, Shamir R. PRODIGY: personalized prioritization of driver genes. Bioinformatics 2020; 36:1831-1839. [PMID: 31681944 PMCID: PMC7703777 DOI: 10.1093/bioinformatics/btz815] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/11/2019] [Revised: 09/03/2019] [Accepted: 10/30/2019] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION: Evolution of cancer is driven by few somatic mutations that disrupt cellular processes, causing abnormal proliferation and tumor development, whereas most somatic mutations have no impact on progression. Distinguishing those mutated genes that drive tumorigenesis in a patient is a primary goal in cancer therapy: knowledge of these genes and the pathways on which they operate can illuminate disease mechanisms and indicate potential therapies and drug targets. Current research focuses mainly on cohort-level driver gene identification, but patient-specific driver gene identification remains a challenge.
METHODS: We developed a new algorithm for patient-specific ranking of driver genes. The algorithm, called PRODIGY, analyzes the expression and mutation profiles of the patient along with data on known pathways and protein-protein interactions. PRODIGY quantifies the impact of each mutated gene on every deregulated pathway using the prize-collecting Steiner tree model. Mutated genes are ranked by their aggregated impact on all deregulated pathways.
RESULTS: In testing on five TCGA cancer cohorts spanning >2500 patients and comparison to validated driver genes, PRODIGY outperformed extant methods and ranking based on network centrality measures. Our results pinpoint the pleiotropic effect of driver genes and show that PRODIGY is capable of identifying even very rare drivers. Hence, PRODIGY takes a step further toward personalized medicine and treatment.
AVAILABILITY AND IMPLEMENTATION: The PRODIGY R package is available at: https://github.com/Shamir-Lab/PRODIGY.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gal Dinstag
- Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 6997801, Israel
- Ron Shamir
- Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 6997801, Israel
48
Bin Y, Wang X, Zhao L, Wen P, Xia J. An analysis of mutational signatures of synonymous mutations across 15 cancer types. BMC Med Genet 2019; 20:190. [PMID: 31815613 PMCID: PMC6900878 DOI: 10.1186/s12881-019-0926-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/19/2022]
Abstract
Background: Synonymous mutations have been identified to play important roles in cancer development, although they do not modify the protein sequences. However, relatively little research has specifically delineated the functionality of synonymous mutations in cancer.
Results: We investigated the nucleotide-based and amino acid-based features of synonymous mutations across 15 cancer types from The Cancer Genome Atlas (TCGA), and revealed novel driver candidates by identifying hotspot mutations. Firstly, synonymous mutations were compared between TCGA and the 1000 Genomes Project at the nucleotide and amino acid levels. We found that C:G → T:A transitions were the most frequent single-base substitutions, and leucine underwent the largest number of synonymous mutations in TCGA due to the prevalent C → T transition, which induced the transformation between optimal and non-optimal codons. Next, 97 synonymous hotspot mutations in 86 genes were nominated as candidate drivers with potential cancer risk by considering the mutational rates across different sequence contexts. We observed that the non-CpG-island GC transition sequence context was positively selected across most cancer types, and that the different sequence contexts under which hotspot mutations occur could be significant for genetic differences and functional features. We also found that the hotspots were more conserved than neutral mutations of hotspot-mutation-containing genes and frequently occurred at leucine. In addition, we mapped hotspots, neutral and non-hotspot mutations of hotspot-mutation-containing genes to their respective protein domains and found that the ion transport domain was the most frequent one, which could mediate cell interaction and has relevant implications for tumor therapy. The signatures of synonymous hotspots were qualitatively similar to those of harmful missense variants.
Conclusions: We illustrated the preferences of cancer-associated synonymous mutations, especially hotspots, and laid the groundwork for understanding how synonymous mutations act as drivers in cancer.
Affiliation(s)
- Yannan Bin
- Institutes of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China
- Xiaojuan Wang
- Institutes of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China
- Le Zhao
- Institutes of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China
- Pengbo Wen
- Institutes of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China
- Junfeng Xia
- Institutes of Physical Science and Information Technology, School of Computer Science and Technology, Anhui University, Hefei, 230601, Anhui, China.
49
Nussinov R, Tsai CJ, Jang H. Why Are Some Driver Mutations Rare? Trends Pharmacol Sci 2019; 40:919-929. [PMID: 31699406 DOI: 10.1016/j.tips.2019.10.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 07/16/2019] [Revised: 10/09/2019] [Accepted: 10/10/2019] [Indexed: 12/13/2022]
Abstract
Understanding why driver mutations that promote cancer are sometimes rare is important for precision medicine since it would help in their identification. Driver mutations are largely discovered through their frequencies. Thus, rare mutations often escape detection. Unlike high-frequency drivers, low-frequency drivers can be tissue specific; rare drivers have extremely low frequencies. Here, we discuss rare drivers and strategies to discover them. We suggest that allosteric driver mutations shift the protein ensemble from the inactive to the active state. Rare allosteric drivers are statistically rare since, to switch the protein functional state, they cooperate with additional mutations, and these are not considered in the patient cancer-specific protein sequence analysis. A complete landscape of mutations that drive cancer will reveal tumor-specific therapeutic vulnerabilities.
Affiliation(s)
- Ruth Nussinov
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA; Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
- Chung-Jung Tsai
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
- Hyunbum Jang
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
50
Lin CY, Vennam S, Purington N, Lin E, Varma S, Han S, Desa M, Seto T, Wang NJ, Stehr H, Troxell ML, Kurian AW, West RB. Genomic landscape of ductal carcinoma in situ and association with progression. Breast Cancer Res Treat 2019; 178:307-316. [PMID: 31420779 PMCID: PMC6800639 DOI: 10.1007/s10549-019-05401-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/03/2019] [Accepted: 08/07/2019] [Indexed: 01/07/2023]
Abstract
PURPOSE: The detection rate of breast ductal carcinoma in situ (DCIS) has increased significantly, raising the concern that DCIS is overdiagnosed and overtreated. Therefore, there is an unmet clinical need to better predict the risk of progression among DCIS patients. Our hypothesis is that by combining molecular signatures with clinicopathologic features, we can elucidate the biology of breast cancer progression, and risk-stratify patients with DCIS.
METHODS: Targeted exon sequencing with a custom panel of 223 genes/regions was performed for 125 DCIS cases. Among them, 60 were from cases having concurrent or subsequent invasive breast cancer (IBC) (DCIS + IBC group), and 65 from cases with no IBC development over a median follow-up of 13 years (DCIS-only group). Copy number alterations in chromosome 1q32, 8q24, and 11q13 were analyzed using fluorescence in situ hybridization (FISH). Multivariable logistic regression models were fit to the outcome of DCIS progression to IBC as functions of demographic and clinical features.
RESULTS: We observed recurrent variants of known IBC-related mutations, and the most commonly mutated genes in DCIS were PIK3CA (34.4%) and TP53 (18.4%). There was an inverse association between PIK3CA kinase domain mutations and progression (Odds Ratio [OR] 10.2, p < 0.05). Copy number variations in 1q32 and 8q24 were associated with progression (OR 9.3 and 46, respectively; both p < 0.05).
CONCLUSIONS: PIK3CA kinase domain mutations and the absence of copy number gains in DCIS are protective against progression to IBC. These results may guide efforts to distinguish low-risk from high-risk DCIS.
MESH Headings
- Aged
- Aged, 80 and over
- Carcinoma, Ductal, Breast/genetics
- Carcinoma, Ductal, Breast/pathology
- Carcinoma, Ductal, Breast/therapy
- Carcinoma, Intraductal, Noninfiltrating/genetics
- Carcinoma, Intraductal, Noninfiltrating/pathology
- DNA Copy Number Variations
- Female
- Genetic Predisposition to Disease
- Genome-Wide Association Study/methods
- Genomics/methods
- Humans
- In Situ Hybridization, Fluorescence
- Middle Aged
- Neoplasm Metastasis
- Neoplasm Staging
- Tumor Burden
Affiliation(s)
- Chieh-Yu Lin
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Pathology and Immunology, School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
- Sujay Vennam
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Natasha Purington
- Department of Medicine, Quantitative Sciences Unit, Stanford University, Stanford, CA, USA
- Eric Lin
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Sushama Varma
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Summer Han
- Department of Medicine, Quantitative Sciences Unit, Stanford University, Stanford, CA, USA
- Manisha Desa
- Department of Medicine and of Biomedical Data Science, Quantitative Sciences Unit, Stanford University, Stanford, CA, USA
- Tina Seto
- Research Information Technology, Stanford University School of Medicine, Stanford, CA, USA
- Nicholas J Wang
- Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA
- Henning Stehr
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Megan L Troxell
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Allison W Kurian
- Departments of Medicine and of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA
- Robert B West
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA.