1
Liu J, Chen Y, Huang K, Guan X. Enhancing Missense Variant Pathogenicity Prediction with MissenseNet: Integrating Structural Insights and ShuffleNet-Based Deep Learning Techniques. Biomolecules 2024; 14:1105. [PMID: 39334871 PMCID: PMC11429773 DOI: 10.3390/biom14091105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Revised: 07/17/2024] [Accepted: 07/22/2024] [Indexed: 09/30/2024] Open
Abstract
The classification of missense variant pathogenicity continues to pose significant challenges in human genetics, necessitating precise predictions of functional impacts for effective disease diagnosis and personalized treatment strategies. Traditional methods, often compromised by suboptimal feature selection and limited generalizability, are outpaced by the enhanced classification model, MissenseNet (Missense Classification Network). This model, advancing beyond standard predictive features, incorporates structural insights from AlphaFold2 protein predictions, thus optimizing structural data utilization. MissenseNet, built on the ShuffleNet architecture, incorporates an encoder-decoder framework and a Squeeze-and-Excitation (SE) module designed to adaptively adjust channel weights and enhance feature fusion and interaction. The model's efficacy in classifying pathogenicity has been validated through superior accuracy compared to conventional methods and by achieving the highest areas under the Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves (Area Under the Curve and Area Under the Precision-Recall Curve) in an independent test set, thus underscoring its superiority.
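The Squeeze-and-Excitation (SE) step named in this abstract can be illustrated with a minimal pure-Python sketch. It is not MissenseNet's implementation: the function name `se_block`, the weight matrices `w1` and `w2`, and the omission of bias terms are assumptions made only for illustration.

```python
import math


def se_block(x, w1, w2):
    """Apply a Squeeze-and-Excitation style gate to a feature map.

    x  : list of channels, each a list of float activations (C x L)
    w1 : hypothetical reduction weights, shape (C // r) x C
    w2 : hypothetical expansion weights, shape C x (C // r)
    """
    # Squeeze: global average over positions gives one summary value per channel.
    squeezed = [sum(channel) / len(channel) for channel in x]
    # Excitation, part 1: bottleneck layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: expand back to C channels and squash with a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every channel by its gate in (0, 1).
    return [[gate * value for value in channel] for gate, channel in zip(gates, x)]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]   # two channels, two positions
    reduce_w = [[0.5, -0.25]]             # made-up weights, reduction ratio r = 2
    expand_w = [[1.0], [-1.0]]            # made-up weights
    print(se_block(features, reduce_w, expand_w))
```

The gate values depend only on per-channel averages, which is what lets the module adjust channel weights adaptively without changing the shape of the feature map.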
Affiliation(s)
- Jing Liu
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Yingying Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Kai Huang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  - National Grain Industry (Urban Grain and Oil Security) Technology Innovation Center, Shanghai 200093, China
- Xiao Guan
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  - National Grain Industry (Urban Grain and Oil Security) Technology Innovation Center, Shanghai 200093, China
2
Mustafa AHM, Krämer OH. Pharmacological Modulation of the Crosstalk between Aberrant Janus Kinase Signaling and Epigenetic Modifiers of the Histone Deacetylase Family to Treat Cancer. Pharmacol Rev 2023; 75:35-61. [PMID: 36752816 DOI: 10.1124/pharmrev.122.000612] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/11/2022] [Revised: 07/08/2022] [Accepted: 08/15/2022] [Indexed: 12/13/2022] Open
Abstract
Hyperactivated Janus kinase (JAK) signaling is an appreciated drug target in human cancers. Numerous mutant JAK molecules as well as inherent and acquired drug resistance mechanisms limit the efficacy of JAK inhibitors (JAKi). There is accumulating evidence that epigenetic mechanisms control JAK-dependent signaling cascades. Like JAKs, epigenetic modifiers of the histone deacetylase (HDAC) family regulate the growth and development of cells and are often dysregulated in cancer cells. The notion that inhibitors of histone deacetylases (HDACi) abrogate oncogenic JAK-dependent signaling cascades illustrates an intricate crosstalk between JAKs and HDACs. Here, we summarize how structurally divergent, broad-acting as well as isoenzyme-specific HDACi, hybrid fusion pharmacophores containing JAKi and HDACi, and proteolysis targeting chimeras for JAKs inactivate the four JAK proteins JAK1, JAK2, JAK3, and tyrosine kinase-2. These agents suppress aberrant JAK activity through specific transcription-dependent processes and mechanisms that alter the phosphorylation and stability of JAKs. Pharmacological inhibition of HDACs abrogates allosteric activation of JAKs, overcomes limitations of ATP-competitive type 1 and type 2 JAKi, and interacts favorably with JAKi. Since such findings were collected in cultured cells, experimental animals, and cancer patients, we condense preclinical and translational relevance. We also discuss how future research on acetylation-dependent mechanisms that regulate JAKs might allow the rational design of improved treatments for cancer patients.
SIGNIFICANCE STATEMENT: Reversible lysine-ɛ-N acetylation and deacetylation cycles control phosphorylation-dependent Janus kinase-signal transducer and activator of transcription signaling. The intricate crosstalk between these fundamental molecular mechanisms provides opportunities for pharmacological intervention strategies with modern small molecule inhibitors. This could help patients suffering from cancer.
Affiliation(s)
- Al-Hassan M Mustafa
  - Department of Toxicology, University Medical Center, Mainz, Germany (A.-H.M.M., O.H.K.) and Department of Zoology, Faculty of Science, Aswan University, Aswan, Egypt (A.-H.M.M.)
- Oliver H Krämer
  - Department of Toxicology, University Medical Center, Mainz, Germany (A.-H.M.M., O.H.K.) and Department of Zoology, Faculty of Science, Aswan University, Aswan, Egypt (A.-H.M.M.)
3
Moritsch S, Mödl B, Scharf I, Janker L, Zwolanek D, Timelthaler G, Casanova E, Sibilia M, Mohr T, Kenner L, Herndler-Brandstetter D, Gerner C, Müller M, Strobl B, Eferl R. Tyk2 is a tumor suppressor in colorectal cancer. Oncoimmunology 2022; 11:2127271. [PMID: 36185806 PMCID: PMC9519006 DOI: 10.1080/2162402x.2022.2127271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/09/2022] [Revised: 09/15/2022] [Accepted: 09/15/2022] [Indexed: 12/04/2022] Open
Abstract
Janus kinase Tyk2 is implicated in cancer immune surveillance, but its role in solid tumors is not well defined. We used Tyk2 knockout mice (Tyk2Δ/Δ) and mice with conditional deletion of Tyk2 in hematopoietic (Tyk2ΔHem) or intestinal epithelial cells (Tyk2ΔIEC) to assess their cell type-specific functions in chemically induced colorectal cancer. All Tyk2-deficient mouse models showed a higher tumor burden after AOM-DSS treatment compared to their corresponding wild-type controls (Tyk2+/+ and Tyk2fl/fl), demonstrating tumor-suppressive functions of Tyk2 in immune cells and epithelial cancer cells. However, specific deletion of Tyk2 in hematopoietic cells or in intestinal epithelial cells was insufficient to accelerate tumor progression, while deletion in both compartments promoted carcinoma formation. RNA-seq and proteomics revealed that tumors of Tyk2Δ/Δ and Tyk2ΔIEC mice were immunoedited in different ways with downregulated and upregulated IFNγ signatures, respectively. Accordingly, the IFNγ-regulated immune checkpoint Ido1 was downregulated in Tyk2Δ/Δ and upregulated in Tyk2ΔIEC tumors, although both showed reduced CD8+ T cell infiltration. These data suggest that Tyk2Δ/Δ tumors are Ido1-independent and poorly immunoedited while Tyk2ΔIEC tumors require Ido1 for immune evasion. Our study shows that Tyk2 prevents Ido1 expression in CRC cells and promotes CRC immune surveillance in the tumor stroma. Both of these Tyk2-dependent mechanisms must work together to prevent CRC progression.
Affiliation(s)
- Stefan Moritsch
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Bernadette Mödl
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Irene Scharf
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Lukas Janker
  - Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, Austria
  - Joint Metabolomics Facility, University and Medical University of Vienna, Vienna, Austria
- Daniela Zwolanek
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Gerald Timelthaler
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Emilio Casanova
  - Department of Pharmacology, Center of Physiology and Pharmacology & Comprehensive Cancer Center, Medical University of Vienna, Vienna, Austria
- Maria Sibilia
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Thomas Mohr
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
- Lukas Kenner
  - Institute of Clinical Pathology, Medical University of Vienna, Vienna, Austria
- Christopher Gerner
  - Department of Analytical Chemistry, Faculty of Chemistry, University of Vienna, Vienna, Austria
  - Joint Metabolomics Facility, University and Medical University of Vienna, Vienna, Austria
- Mathias Müller
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Birgit Strobl
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Robert Eferl
  - Center for Cancer Research, Medical University of Vienna & Comprehensive Cancer Center, Vienna, Austria
4
Yang Y, Shao A, Vihinen M. PON-All: Amino Acid Substitution Tolerance Predictor for All Organisms. Front Mol Biosci 2022; 9:867572. [PMID: 35782867 PMCID: PMC9245922 DOI: 10.3389/fmolb.2022.867572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/01/2022] [Accepted: 05/02/2022] [Indexed: 01/08/2023] Open
Abstract
Genetic variations are investigated in human and many other organisms for many purposes (e.g., to aid in clinical diagnosis). Interpretation of the identified variations can be challenging. Although some dedicated prediction methods have been developed and some tools for human variants can also be used for other organisms, the performance and species range have been limited. We developed a novel variant pathogenicity/tolerance predictor for amino acid substitutions in any organism. The method, PON-All, is a machine learning tool trained on human, animal, and plant variants. Two versions are provided, one with Gene Ontology (GO) annotations and another without these details. GO annotations are not available or are partial for many organisms of interest. The methods provide predictions for three classes: pathogenic, benign, and variants of unknown significance. On the blind test, when using GO annotations, accuracy was 0.913 and MCC 0.827. When GO features were not used, accuracy was 0.856 and MCC 0.712. The performance is the best for human and plant variants and somewhat lower for animal variants because the number of known disease-causing variants in animals is rather small. The method was compared to several other tools and was found to have superior performance. PON-All is freely available at http://structure.bmc.lu.se/PON-All and http://8.133.174.28:8999/.
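The two metrics quoted in this abstract, accuracy and the Matthews correlation coefficient (MCC), can be computed from a confusion matrix as in the sketch below. It covers the binary (pathogenic vs. benign) case only, and the counts in the usage example are hypothetical; they are not taken from the PON-All blind test.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    tp, tn, fp, fn = 90, 85, 15, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

MCC uses all four cells of the confusion matrix, so it is less flattering than accuracy when the classes are imbalanced, which is one reason the two figures are usually reported together.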
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, China
- Aibin Shao
  - School of Computer Science and Technology, Soochow University, Suzhou, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
  - *Correspondence: Mauno Vihinen,
5
Borcherding DC, He K, Amin NV, Hirbe AC. TYK2 in Cancer Metastases: Genomic and Proteomic Discovery. Cancers (Basel) 2021; 13:4171. [PMID: 34439323 PMCID: PMC8393599 DOI: 10.3390/cancers13164171] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/30/2021] [Revised: 08/07/2021] [Accepted: 08/12/2021] [Indexed: 12/12/2022] Open
Abstract
Advances in genomic analysis and proteomic tools have rapidly expanded identification of biomarkers and molecular targets important to cancer development and metastasis. On an individual basis, personalized medicine approaches allow better characterization of tumors and patient prognosis, leading to more targeted treatments by detection of specific gene mutations, overexpression, or activity. Genomic and proteomic screens by our lab and others have revealed tyrosine kinase 2 (TYK2) as an oncogene promoting progression and metastases of many types of carcinomas, sarcomas, and hematologic cancers. TYK2 is a Janus kinase (JAK) that acts as an intermediary between cytokine receptors and STAT transcription factors. TYK2 signals to stimulate proliferation and metastasis while inhibiting apoptosis of cancer cells. This review focuses on the growing evidence from genomic and proteomic screens, as well as molecular studies that link TYK2 to cancer prevalence, prognosis, and metastasis. In addition, pharmacological inhibition of TYK2 is currently used clinically for autoimmune diseases, and now provides promising treatment modalities as effective therapeutic agents against multiple types of cancer.
Affiliation(s)
- Dana C. Borcherding
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA (D.C.B., K.H., N.V.A.)
- Kevin He
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA (D.C.B., K.H., N.V.A.)
- Neha V. Amin
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA (D.C.B., K.H., N.V.A.)
- Angela C. Hirbe
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA (D.C.B., K.H., N.V.A.)
  - Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
6
Computational studies of anaplastic lymphoma kinase mutations reveal common mechanisms of oncogenic activation. Proc Natl Acad Sci U S A 2021; 118:2019132118. [PMID: 33674381 PMCID: PMC7958353 DOI: 10.1073/pnas.2019132118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022] Open
Abstract
High-risk tumors are genomically heterogeneous, harboring gene amplifications and mutations. The activation status of mutated proteins in cancer can profoundly impact disease progression, patient response, and drug sensitivity. Yet, outside of a few hotspot mutations, functional studies of clinically observed mutations are not commonly pursued. We report a combined experimental profiling and computational analysis of the effects of clinically observed and “test” mutations in the kinase domain of anaplastic lymphoma kinase (ALK), a known oncogenic driver in pediatric neuroblastoma. We find that the activation status of the mutated protein is a good indicator of the transforming ability in NIH 3T3 cells. We also report biophysical as well as data-driven models with predictive power to profile these mutant kinases in silico.
Kinases play important roles in diverse cellular processes, including signaling, differentiation, proliferation, and metabolism. They are frequently mutated in cancer and are the targets of a large number of specific inhibitors. Surveys of cancer genome atlases reveal that kinase domains, which consist of 300 amino acids, can harbor numerous (150 to 200) single-point mutations across different patients in the same disease. This preponderance of mutations (some activating, some silent) in a known target protein makes clinical decisions for enrolling patients in drug trials challenging, since the relevance of the target and its drug sensitivity often depend on the mutational status in a given patient. We show through computational studies using molecular dynamics (MD) as well as enhanced sampling simulations that the experimentally determined activation status of a mutated kinase can be predicted effectively by identifying a hydrogen bonding fingerprint in the activation loop and the αC-helix regions, despite the fact that mutations in cancer patients occur throughout the kinase domain. In our study, we find that the predictive power of MD is superior to a purely data-driven machine learning model involving biochemical features that we implemented, even though MD utilized far fewer features (in fact, just one) in an unsupervised setting. Moreover, the MD results provide key insights into convergent mechanisms of activation, primarily involving differential stabilization of a hydrogen bond network that engages residues of the activation loop and αC-helix in the active-like conformation (in >70% of the mutations studied, regardless of the location of the mutation).
7
Grillo E, Ravelli C, Corsini M, Zammataro L, Mitola S. Protein domain-based approaches for the identification and prioritization of therapeutically actionable cancer variants. Biochim Biophys Acta Rev Cancer 2021; 1876:188614. [PMID: 34403770 DOI: 10.1016/j.bbcan.2021.188614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2021] [Revised: 08/11/2021] [Accepted: 08/11/2021] [Indexed: 01/04/2023]
Abstract
The tremendous number of cancer variants that can be detected by NGS analyses has required the development of computational approaches to prioritize mutations on the basis of their biological and clinical significance. Standard strategies take a gene-centric approach to the problem, allowing the identification of only highly frequent variants. In contrast, protein domain (PD)-based approaches allow the identification of functionally relevant low-frequency variants by searching for mutations that recur on analogous residues across homologous proteins (i.e., proteins containing the same PD). Such approaches enable the transfer of information about effects and druggability from one known mutation to unknown ones. Here we describe how PD-based strategies work, and discuss how they could be exploited for mutation prioritization. The principle that mutations clustered on specific residues of PDs have the same functional consequences and are therapeutically actionable in a similar manner could guide the choice of patient-specific targeted drugs, eventually improving the management of cancer patients.
Affiliation(s)
- Elisabetta Grillo
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Cosetta Ravelli
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Michela Corsini
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Luca Zammataro
  - Division of Artificial Intelligence Systems for Immunoinformatics, Kiromic BioPharma, Inc., Houston, USA
- Stefania Mitola
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
8
Raimondi D, Passemiers A, Fariselli P, Moreau Y. Current cancer driver variant predictors learn to recognize driver genes instead of functional variants. BMC Biol 2021; 19:3. [PMID: 33441128 PMCID: PMC7807764 DOI: 10.1186/s12915-020-00930-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/08/2020] [Accepted: 11/19/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Identifying variants that drive tumor progression (driver variants) and distinguishing these from variants that are a byproduct of the uncontrolled cell growth in cancer (passenger variants) is a crucial step for understanding tumorigenesis and precision oncology. Various bioinformatics methods have attempted to solve this complex task. RESULTS: In this study, we investigate the assumptions on which these methods are based, showing that the different definitions of driver and passenger variants influence the difficulty of the prediction task. More importantly, we prove that the data sets have a construction bias which prevents machine learning (ML) methods from actually learning variant-level functional effects, despite their excellent performance. This effect results from the fact that in these data sets, the driver variants map to a few driver genes, while the passenger variants spread across thousands of genes, and thus just learning to recognize driver genes provides almost perfect predictions. CONCLUSIONS: To mitigate this issue, we propose a novel data set that minimizes this bias by ensuring that all genes covered by the data contain both driver and passenger variants. As a result, we show that the tested predictors experience a significant drop in performance, which should not be considered as poorer modeling, but rather as correcting unwarranted optimism. Finally, we propose a weighting procedure to completely eliminate the gene effects on such predictions, thus precisely evaluating the ability of predictors to model the functional effects of single variants, and we show that indeed this task is still open.
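The construction bias described in this abstract can be reproduced on toy data. The sketch below is an illustration under stated assumptions, not the authors' data set or weighting procedure: the gene names, the variant labels, and the "gene majority" predictor are all hypothetical. The predictor never looks at the variant itself, only at the gene it falls in.

```python
from collections import Counter, defaultdict


def fit_gene_majority(train):
    """Learn the most frequent label per gene; the variant itself is ignored."""
    votes = defaultdict(Counter)
    for gene, _variant, label in train:
        votes[gene][label] += 1
    return {gene: counts.most_common(1)[0][0] for gene, counts in votes.items()}


def accuracy(model, data, default="passenger"):
    """Score the gene-only model; unseen genes fall back to the default label."""
    hits = sum(model.get(gene, default) == label for gene, _variant, label in data)
    return hits / len(data)


def make_biased():
    """Drivers sit in two genes; passengers are spread over twenty other genes."""
    drivers = [(g, f"d{i}", "driver") for g in ("GENE_A", "GENE_B") for i in range(10)]
    passengers = [(f"GENE_P{i}", "p0", "passenger") for i in range(20)]
    return drivers + passengers


def make_balanced():
    """Every gene carries both driver and passenger variants."""
    genes = ("GENE_A", "GENE_B")
    drivers = [(g, f"d{i}", "driver") for g in genes for i in range(10)]
    passengers = [(g, f"p{i}", "passenger") for g in genes for i in range(10)]
    return drivers + passengers


def held_out_accuracy(data):
    """Train on even-indexed records and test on odd-indexed ones."""
    train, test = data[::2], data[1::2]
    return accuracy(fit_gene_majority(train), test)


if __name__ == "__main__":
    print("biased data set:  ", held_out_accuracy(make_biased()))
    print("balanced data set:", held_out_accuracy(make_balanced()))
```

On the biased toy set the gene-only predictor scores perfectly on held-out variants, while on the balanced set it drops to chance level, which mirrors the performance drop the abstract attributes to removing the gene-level shortcut.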
Affiliation(s)
- Yves Moreau
  - ESAT-STADIUS, KU Leuven, Leuven, 3001, Belgium
9
TYK2 Variants in B-Acute Lymphoblastic Leukaemia. Genes (Basel) 2020; 11:1434. [PMID: 33260630 PMCID: PMC7761059 DOI: 10.3390/genes11121434] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/05/2020] [Revised: 11/20/2020] [Accepted: 11/25/2020] [Indexed: 12/31/2022] Open
Abstract
B-cell precursor acute lymphoblastic leukaemia (B-ALL) is a malignancy of lymphoid progenitor cells with altered genes including the Janus kinase (JAK) gene family. Among them, tyrosine kinase 2 (TYK2) is involved in signal transduction of cytokines such as interferon (IFN) α/β through IFN-α/β receptor alpha chain (IFNAR1). To search for disease-associated TYK2 variants, bone marrow samples from 62 B-ALL patients at diagnosis were analysed by next-generation sequencing. TYK2 variants were found in 16 patients (25.8%): one patient had a novel mutation at the four-point-one, ezrin, radixin, moesin (FERM) domain (S431G) and two patients had the rare variants rs150601734 or rs55882956 (R425H or R832W). To functionally characterise them, they were generated by direct mutagenesis, cloned in expression vectors, and transfected in TYK2-deficient cells. Under high-IFNα doses, the three variants were competent to phosphorylate STAT1/2. While R425H and R832W induced STAT1/2-target genes measured by qPCR, S431G behaved as the kinase-dead form of the protein. None of these variants phosphorylated STAT3 in in vitro kinase assays. Molecular dynamics simulation showed that TYK2/IFNAR1 interaction is not affected by these variants. Finally, qPCR analysis revealed diminished expression of TYK2 in B-ALL patients at diagnosis compared to that in healthy donors, further stressing the tumour immune surveillance role of TYK2.
10
Wöss K, Simonović N, Strobl B, Macho-Maschler S, Müller M. TYK2: An Upstream Kinase of STATs in Cancer. Cancers (Basel) 2019; 11:E1728. [PMID: 31694222 PMCID: PMC6896190 DOI: 10.3390/cancers11111728] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 09/30/2019] [Revised: 10/28/2019] [Accepted: 11/02/2019] [Indexed: 02/07/2023] Open
Abstract
In this review we concentrate on the recent findings describing the oncogenic potential of the protein tyrosine kinase 2 (TYK2). The overview on the current understanding of TYK2 functions in cytokine responses and carcinogenesis focusses on the activation of the signal transducers and activators of transcription (STAT) 3 and 5. Insight gained from loss-of-function (LOF) gene-modified mice and human patients homozygous for Tyk2/TYK2-mutated alleles established the central role in immunological and inflammatory responses. For the description of physiological TYK2 structure/function relationships in cytokine signaling and of overarching molecular and pathologic properties in carcinogenesis, we mainly refer to the most recent reviews. Dysregulated TYK2 activation, aberrant TYK2 protein levels, and gain-of-function (GOF) TYK2 mutations are found in various cancers. We discuss the molecular consequences thereof and briefly describe the molecular means to counteract TYK2 activity under (patho-)physiological conditions by cellular effectors and by pharmacological intervention. For the role of TYK2 in tumor immune-surveillance we refer to the recent Special Issue of Cancers "JAK-STAT Signaling Pathway in Cancer".
Affiliation(s)
- Mathias Müller
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, A-1210 Vienna, Austria (K.W., N.S., B.S., S.M.-M.)
11
Qin W, Godec A, Zhang X, Zhu C, Shao J, Tao Y, Bu X, Hirbe AC. TYK2 promotes malignant peripheral nerve sheath tumor progression through inhibition of cell death. Cancer Med 2019; 8:5232-5241. [PMID: 31278855 PMCID: PMC6718590 DOI: 10.1002/cam4.2386] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/08/2019] [Revised: 06/05/2019] [Accepted: 06/18/2019] [Indexed: 01/01/2023] Open
Abstract
Background: Malignant peripheral nerve sheath tumors (MPNSTs) are aggressive sarcomas that arise most commonly in the setting of the Neurofibromatosis Type 1 (NF1) cancer predisposition syndrome. Despite aggressive multimodality therapy, outcomes are dismal and most patients die within 5 years of diagnosis. Prior genomic studies in our laboratory identified tyrosine kinase 2 (TYK2) as a frequently mutated gene in MPNST. Herein, we explored the function of TYK2 in MPNST pathogenesis. Methods: Immunohistochemistry was utilized to examine expression of TYK2 in MPNSTs and other sarcomas. To establish a role for TYK2 in MPNST pathogenesis, murine and human TYK2 knockdown and knockout cells were established using shRNA and CRISPR/Cas9 systems, respectively. Results: We have demonstrated that TYK2 was highly expressed in the majority of human MPNSTs examined. Additionally, we demonstrated that knockdown of Tyk2/TYK2 in murine and human MPNST cells significantly increased cell death in vitro. These effects were accompanied by a decrease in the levels of activated Stats and Bcl-2 as well as an increase in the levels of Cleaved Caspase-3. In addition, Tyk2-KD cells demonstrated impaired growth in subcutaneous and metastasis models in vivo. Conclusion: Taken together, these data illustrate the importance of TYK2 in MPNST pathogenesis and suggest that the TYK2 pathway may be a potential therapeutic target for these deadly cancers.
Affiliation(s)
- Wenjing Qin
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
  - School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, China
- Abigail Godec
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
- Xiaochun Zhang
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
- Cuige Zhu
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
- Jieya Shao
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
  - Siteman Cancer Center, Washington University School of Medicine, Saint Louis, Missouri
- Yu Tao
  - Cancer Center Biostatistics Shared Resource, Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine, St. Louis, Missouri
- Xianzhang Bu
  - School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, China
- Angela C Hirbe
  - Division of Oncology, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri
  - Siteman Cancer Center, Washington University School of Medicine, Saint Louis, Missouri
12
Jordan EJ, Patil K, Suresh K, Park JH, Mosse YP, Lemmon MA, Radhakrishnan R. Computational algorithms for in silico profiling of activating mutations in cancer. Cell Mol Life Sci 2019; 76:2663-2679. [PMID: 30982079 PMCID: PMC6589134 DOI: 10.1007/s00018-019-03097-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 01/10/2019] [Revised: 04/01/2019] [Accepted: 04/08/2019] [Indexed: 12/17/2022]
Abstract
Methods to catalog and computationally assess the mutational landscape of proteins in human cancers are desirable. One approach is to adapt evolutionary or data-driven methods developed for predicting whether a single-nucleotide polymorphism (SNP) is deleterious to protein structure and function. In cases where understanding the mechanism of protein activation and regulation is desired, an alternative approach is to employ structure-based computational approaches to predict the effects of point mutations. Through a case study of mutations in kinase domains of three proteins, namely, the anaplastic lymphoma kinase (ALK) in pediatric neuroblastoma patients, serine/threonine-protein kinase B-Raf (BRAF) in melanoma patients, and erythroblastic oncogene B 2 (ErbB2 or HER2) in breast cancer patients, we compare the two approaches above. We find that the structure-based method is most appropriate for developing a binary classification of several different mutations, especially infrequently occurring ones, concerning the activation status of the given target protein. This approach is especially useful if the effects of mutations on the interactions of inhibitors with the target proteins are being sought. However, many patients will present with mutations spread across different target proteins, making structure-based models computationally demanding to implement and execute. In this situation, data-driven methods (including those based on machine learning techniques and evolutionary methods) are most appropriate for recognizing and illuminating mutational patterns. We show, however, that, in the present state of the field, the two methods have very different accuracies and confidence values, and hence, the optimal choice of their deployment is context-dependent.
Affiliation(s)
- E Joseph Jordan
  - Graduate Group in Biochemistry and Molecular Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Keshav Patil
  - Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, PA, USA
- Krishna Suresh
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
- Jin H Park
  - Department of Pharmacology, Yale University, New Haven, CT, USA
  - Cancer Biology Institute, Yale University, West Haven, CT, USA
- Yael P Mosse
  - Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Mark A Lemmon
  - Department of Pharmacology, Yale University, New Haven, CT, USA
  - Cancer Biology Institute, Yale University, West Haven, CT, USA
- Ravi Radhakrishnan
  - Graduate Group in Biochemistry and Molecular Biophysics, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
|
13
|
Abstract
Human cancers often harbor large numbers of somatic mutations. However, only a small proportion of these mutations are expected to contribute to tumor growth and progression. Therefore, determining causal driver mutations and the genes they target is becoming an important challenge in cancer genomics. Here we describe an approach for mapping somatic mutations onto 3D structures of human proteins in complex to identify "driver interfaces." Our strategy relies on identifying protein-interaction interfaces that are unexpectedly biased toward nonsynonymous mutations, which suggests that these interfaces are subject to positive selection during tumorigenesis, implicating the interacting proteins as candidate drivers.
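One simple way to picture the interface-bias idea is a one-sided binomial test: if nonsynonymous mutations fell uniformly along a protein, the number landing on interface residues would follow a binomial distribution whose success probability is the interface fraction of the sequence. The sketch below is an illustrative assumption, not the authors' statistical model; the function names and the example counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def interface_enrichment_p(interface_residues, protein_length,
                           interface_mutations, total_mutations):
    # Null model (assumption): mutations land uniformly along the protein,
    # so each one hits the interface with probability interface/length.
    expected_fraction = interface_residues / protein_length
    return binomial_upper_tail(interface_mutations, total_mutations,
                               expected_fraction)


# Hypothetical protein: 40 of 400 residues lie at an interface,
# and 9 of 30 observed nonsynonymous mutations fall on those residues.
p_value = interface_enrichment_p(40, 400, 9, 30)
print(f"one-sided enrichment p-value: {p_value:.4g}")
```

A small p-value would flag the interface as more heavily mutated than the uniform null predicts. A realistic analysis would also need to account for background mutation rates and for testing many interfaces at once.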
Affiliation(s)
- Kivilcim Ozturk
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics Program, University of California San Diego, La Jolla, CA, USA
| | - Hannah Carter
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA.
- Bioinformatics Program, University of California San Diego, La Jolla, CA, USA.
- Moores Cancer Center, University of California San Diego, La Jolla, CA, USA.
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA.
|
14
|
Hassan MS, Shaalan AA, Dessouky MI, Abdelnaiem AE, ElHefnawi M. A review study: Computational techniques for expecting the impact of non-synonymous single nucleotide variants in human diseases. Gene 2018; 680:20-33. [PMID: 30240882 DOI: 10.1016/j.gene.2018.09.028] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 08/17/2018] [Accepted: 09/14/2018] [Indexed: 01/18/2023]
Abstract
Non-synonymous single-nucleotide variants (nsSNVs) and mutations can have diverse effects on proteins, changing genotype and phenotype and disrupting protein stability. Alterations in protein stability may cause diseases such as cancer. Discovering nsSNVs and mutations can therefore be a useful tool for diagnosing disease at an early stage. Many studies have introduced individual and consensus prediction tools based on different machine learning techniques (MLTs) using diverse datasets. We therefore present a comprehensive review of the most popular and recent individual tools that predict pathogenic variations, and of the meta-tools that merge some of them to enhance their predictive power. We also survey the state-of-the-art computational techniques and methods for predicting the effects of both coding and noncoding variants. We then describe protein stability predictors. We give details of the most common benchmark databases for variations, including the main predictive features used by the different methods. Finally, we address the most common criteria for assessing the performance of predictive tools. This review is aimed at bioinformaticians interested in the characterization of regulatory variants, geneticists and molecular biologists interested in understanding more about the nature and role of such variants from a functional point of view, and clinicians who may hope to learn about human variants associated with a specific disease and find out what to do next to uncover how they affect the underlying mechanisms.
Affiliation(s)
- Marwa S Hassan
- Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Patent Office of Scientific Research Academy, Egypt.
| | - A A Shaalan
- Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
| | - M I Dessouky
- Electronics and Electrical Communications Department, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt
| | - Abdelaziz E Abdelnaiem
- Electronics and Communication Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt
| | - Mahmoud ElHefnawi
- Systems and Information Department and Biomedical Informatics Group, Engineering Research Division, National Research Center, Giza, Egypt; Center for Informatics, Nile University, Giza, Egypt
|
15
|
Peterson TA, Gauran IIM, Park J, Park D, Kann MG. Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples. PLoS Comput Biol 2017; 13:e1005428. [PMID: 28426665 PMCID: PMC5398485 DOI: 10.1371/journal.pcbi.1005428] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 06/29/2016] [Accepted: 02/28/2017] [Indexed: 12/28/2022] Open
Abstract
The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are 'gene-centric' in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new 'domain-centric' method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families altering pathways known to be involved with cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots' unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods.
Affiliation(s)
- Thomas A. Peterson
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America
- University of California, San Francisco, Institute for Computational Health Science, San Francisco, California, United States of America
| | - Iris Ivy M. Gauran
- Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America
| | - Junyong Park
- Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America
| | - DoHwan Park
- Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America
| | - Maricel G. Kann
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland, United States of America
|
16
|
Gallion J, Wilkins AD, Lichtarge O. Human kinases display mutational hotspots at cognate positions within cancer. Pacific Symposium on Biocomputing 2016; 22:414-425. [PMID: 27896994 DOI: 10.1142/9789813207813_0039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
The discovery of driver genes is a major pursuit of cancer genomics, usually based on observing the same mutation in different patients. But the heterogeneity of cancer pathways plus the high background mutational frequency of tumor cells often cloud the distinction between less frequent drivers and innocent passenger mutations. Here, to overcome these disadvantages, we grouped together mutations from close kinase paralogs under the hypothesis that cognate mutations may functionally favor cancer cells in similar ways. Indeed, we find that kinase paralogs often bear mutations to the same substituted amino acid at the same aligned positions and with a large predicted Evolutionary Action. Functionally, these high Evolutionary Action, non-random mutations affect known kinase motifs, but strikingly, they do so differently among different kinase types and cancers, consistent with differences in selective pressures. Taken together, these results suggest that cancer pathways may flexibly distribute a dependence on a given functional mutation among multiple close kinase paralogs. The recognition of this "mutational delocalization" of cancer drivers among groups of paralogs is a new phenomenon that may help better identify relevant mechanisms and therefore eventually guide personalized therapy.
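The grouping step described above, pooling mutations from paralogs by aligned position and substituted amino acid, can be pictured with a short sketch. It assumes a precomputed mapping from each (gene, residue position) to a column of a multiple sequence alignment; the function name, the gene names, and the data are hypothetical, and the Evolutionary Action scoring used in the paper is not modeled here.

```python
def cognate_hotspots(mutations, alignment_column, min_genes=2):
    """Find (alignment column, new amino acid) pairs mutated in several paralogs.

    mutations: iterable of (gene, residue_position, new_amino_acid)
    alignment_column: dict mapping (gene, residue_position) -> alignment column
    """
    genes_by_site = {}
    for gene, position, new_aa in mutations:
        column = alignment_column.get((gene, position))
        if column is None:
            continue  # residue is not covered by the alignment
        genes_by_site.setdefault((column, new_aa), set()).add(gene)
    # Keep only sites hit in at least `min_genes` distinct paralogs
    return {site: sorted(genes)
            for site, genes in genes_by_site.items()
            if len(genes) >= min_genes}


# Hypothetical paralogs whose residues 600 and 312 share alignment column 17
columns = {("KIN_A", 600): 17, ("KIN_B", 312): 17, ("KIN_B", 50): 3}
observed = [("KIN_A", 600, "E"), ("KIN_B", 312, "E"), ("KIN_B", 50, "K")]
hotspots = cognate_hotspots(observed, columns)
print(hotspots)
```

Counting distinct genes rather than raw mutation counts keeps a single heavily mutated gene from looking like a family-wide signal.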
Affiliation(s)
- Jonathan Gallion
- Structural Computational Biology and Molecular Biophysics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
† The authors gratefully acknowledge support from the National Institutes of Health (GM066099 and GM079656), from the National Science Foundation (DBI-1356569), and from DARPA (N66001-15-C-4042).
|
17
|
Hirbe AC, Kaushal M, Sharma MK, Dahiya S, Pekmezci M, Perry A, Gutmann DH. Clinical genomic profiling identifies TYK2 mutation and overexpression in patients with neurofibromatosis type 1-associated malignant peripheral nerve sheath tumors. Cancer 2016; 123:1194-1201. [PMID: 27875628 DOI: 10.1002/cncr.30455] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 06/09/2016] [Revised: 08/22/2016] [Accepted: 10/24/2016] [Indexed: 12/20/2022]
Abstract
BACKGROUND Malignant peripheral nerve sheath tumors (MPNSTs) are aggressive sarcomas that arise at an estimated frequency of 8% to 13% in individuals with neurofibromatosis type 1 (NF1). Compared with their sporadic counterparts, NF1-associated MPNSTs (NF1-MPNSTs) develop in young adults, frequently recur (approximately 50% of cases), and carry a dismal prognosis. As such, most individuals affected with NF1-MPNSTs die within 5 years of diagnosis, despite surgical resection combined with radiotherapy and chemotherapy. METHODS Clinical genomic profiling was performed using 1000 ng of DNA from 7 cases of NF1-MPNST, and bioinformatic analyses were conducted to identify genes with actionable mutations. RESULTS A total of 3 women and 4 men with NF1-MPNST were identified (median age, 38 years). Nonsynonymous mutations were discovered in 4 genes (neurofibromatosis type 1 [NF1], ROS proto-oncogene 1 [ROS1], tumor protein p53 [TP53], and tyrosine kinase 2 [TYK2]), which in addition were mutated in other MPNST cases in this sample set. Consistent with their occurrence in individuals with NF1, all tumors had at least 1 mutation in the NF1 gene. Whereas TP53 gene mutations are frequently observed in other cancers, ROS1 mutations are common in melanoma (15%-35%), another neural crest-derived malignancy. In contrast, TYK2 mutations are uncommon in other malignancies (<7%). In the current series, recurrent TYK2 mutations were identified in 2 cases of NF1-MPNST (30% of cases), whereas TYK2 protein overexpression was observed in 60% of MPNST cases using an independently generated tissue microarray, regardless of NF1 status. CONCLUSIONS Clinical genomic analysis of the current series of NF1-MPNST cases found that TYK2 is a new gene mutated in MPNST. Future work will focus on examining the utility of TYK2 expression as a biomarker and therapeutic target for these cancers. Cancer 2017;123:1194-1201. © 2016 American Cancer Society.
Affiliation(s)
- Angela C Hirbe
- Division of Medical Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, Missouri
| | - Madhurima Kaushal
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
| | - Mukesh Kumar Sharma
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
| | - Sonika Dahiya
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
| | - Melike Pekmezci
- Department of Pathology, University of California at San Francisco School of Medicine, San Francisco, California
| | - Arie Perry
- Department of Pathology, University of California at San Francisco School of Medicine, San Francisco, California
- Department of Neurological Surgery, University of California at San Francisco School of Medicine, San Francisco, California
| | - David H Gutmann
- Department of Neurology, Washington University, St. Louis, Missouri
|
18
|
Liu L, Wang H, Wen J, Tseng CE, Zu Y, Chang CC, Zhou X. Mutated genes and driver pathways involved in myelodysplastic syndromes: a transcriptome sequencing based approach. Molecular BioSystems 2016; 11:2158-66. [PMID: 26010722 DOI: 10.1039/c4mb00663a] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/18/2022]
Abstract
Myelodysplastic syndromes (MDS) are a heterogeneous group of clonal disorders of hematopoietic progenitors and have the potential to progress to acute myelogenous leukemia. Development of effective treatments has been impeded by limited insight into pathogenic pathways. In this study, we applied RNA-seq technology to study the transcriptome of 20 MDS patients and 5 age-matched controls, and developed a pipeline for analyzing these data. After analysis, we identified 38 mutated genes contributing to MDS pathogenesis. Of these 38 genes, 37 have not been reported previously, suggesting our pipeline is critical for identifying novel mutated genes in MDS. The most recurrent mutation occurred in the gene IFRD1 and was found in 30% of patient samples. Biological relationships among these mutated genes were mined using Ingenuity Pathway Analysis, and the results demonstrated that the two top-scoring networks were highly associated with cancer and hematological diseases, indicating that the mutated genes identified by our method were highly relevant to MDS. We then integrated the pathways in the KEGG database and the identified mutated genes using our novel rule-based mutated driver pathway scoring approach for detecting mutated driver pathways. The results indicated that two mutated driver pathways are important for the pathogenesis of MDS: pathways in cancer and regulation of actin cytoskeleton. The latter, which likely contributes to the hallmark morphologic dysplasia observed in MDS, has not been reported, to the best of our knowledge. These results provide new insights into the pathogenesis of MDS, which, in turn, may lead to novel therapeutics for this disease.
Affiliation(s)
- Liang Liu
- Center for Bioinformatics and Systems Biology, Division of Radiologic Sciences, Wake Forest University Baptist Medical Center, Winston-Salem, NC 27157, USA.
|
19
|
Leitner NR, Witalisz-Siepracka A, Strobl B, Müller M. Tyrosine kinase 2 - Surveillant of tumours and bona fide oncogene. Cytokine 2015; 89:209-218. [PMID: 26631911 DOI: 10.1016/j.cyto.2015.10.015] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 10/21/2015] [Accepted: 10/29/2015] [Indexed: 12/16/2022]
Abstract
Tyrosine kinase 2 (TYK2) is a member of the Janus kinase (JAK) family, which transduces cytokine and growth factor signalling. Analysis of TYK2 loss-of-function revealed its important role in immunity to infection, (auto-) immunity and (auto-) inflammation. TYK2-deficient patients unravelled high similarity between mice and men with respect to cellular signalling functions and basic immunology. Genome-wide association studies link TYK2 to several autoimmune and inflammatory diseases as well as carcinogenesis. Due to its cytokine signalling functions TYK2 was found to be essential in tumour surveillance. Lately TYK2 activating mutants and fusion proteins were detected in patients diagnosed with leukaemic diseases suggesting that TYK2 is a potent oncogene. Here we review the cell intrinsic and extrinsic functions of TYK2 in the characteristics preventing and enabling carcinogenesis. In addition we describe an unexpected function of kinase-inactive TYK2 in tumour rejection.
Affiliation(s)
- Nicole R Leitner
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Veterinärplatz 1, 1210 Vienna, Austria
| | - Agnieszka Witalisz-Siepracka
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Veterinärplatz 1, 1210 Vienna, Austria
| | - Birgit Strobl
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Veterinärplatz 1, 1210 Vienna, Austria
| | - Mathias Müller
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Veterinärplatz 1, 1210 Vienna, Austria.
|
20
|
Li J, Drubay D, Michiels S, Gautheret D. Mining the coding and non-coding genome for cancer drivers. Cancer Lett 2015; 369:307-15. [PMID: 26433158 DOI: 10.1016/j.canlet.2015.09.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/03/2015] [Revised: 09/24/2015] [Accepted: 09/24/2015] [Indexed: 12/20/2022]
Abstract
Progress in next-generation sequencing provides unprecedented opportunities to fully characterize the spectrum of somatic mutations of cancer genomes. Given the large number of somatic mutations identified by such technologies, the prioritization of cancer-driving events is a consistent bottleneck. Most bioinformatics tools concentrate on driver mutations in the coding fraction of the genome, those causing changes in protein products. As more non-coding pathogenic variants are identified and characterized, the development of computational approaches to effectively prioritize cancer-driving variants within the non-coding fraction of human genome is becoming critical. After a short summary of methods for coding variant prioritization, we here review the highly diverse non-coding elements that may act as cancer drivers and describe recent methods that attempt to evaluate the deleteriousness of sequence variation in these elements. With such tools, the prioritization and identification of cancer-implicated regulatory elements and non-coding RNAs is becoming a reality.
Affiliation(s)
- Jia Li
- Institute for Integrative Biology of the Cell (I2BC), CNRS, CEA, Université Paris-Sud, Université Paris-Saclay, 91198 Gif sur Yvette, France
| | - Damien Drubay
- Service de Biostatistique et d'Epidemiologie, Gustave Roussy, Villejuif, France; INSERM U1018, CESP, Université Paris-Sud, Université Paris-Saclay, Villejuif, France
| | - Stefan Michiels
- Service de Biostatistique et d'Epidemiologie, Gustave Roussy, Villejuif, France; INSERM U1018, CESP, Université Paris-Sud, Université Paris-Saclay, Villejuif, France
| | - Daniel Gautheret
- Institute for Integrative Biology of the Cell (I2BC), CNRS, CEA, Université Paris-Sud, Université Paris-Saclay, 91198 Gif sur Yvette, France.
|
21
|
Rane SU, Mirza H, Grigoriadis A, Pinder SE. Selection and evolution in the genomic landscape of copy number alterations in ductal carcinoma in situ (DCIS) and its progression to invasive carcinoma of ductal/no special type: a meta-analysis. Breast Cancer Res Treat 2015; 153:101-21. [PMID: 26255059 DOI: 10.1007/s10549-015-3509-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 04/14/2015] [Accepted: 07/18/2015] [Indexed: 12/18/2022]
Abstract
Ductal carcinoma in situ (DCIS) is a pre-invasive malignancy detected with an increasing frequency through screening mammography. One of the primary aims of therapy is to prevent local recurrence, as in situ or as invasive carcinoma, the latter arising in half of the recurrent cases. Reliable biomarkers predictive of its association with recurrence, particularly as invasive disease, are however lacking. In this study, we perform a meta-analysis of 26 studies which report somatic copy number aberrations (SCNAs) in 288 cases of 'pure' DCIS and 328 of DCIS associated with invasive carcinoma, along with additional unmatched cases of 145 invasive carcinoma of ductal/no special type (IDC) and 50 of atypical ductal hyperplasia (ADH). SCNA frequencies across the genome were calculated at cytoband resolution (UCSC genome build 19) to maximally utilize the available information in published literature. Fisher's exact test was used to identify significant differences in the gain-loss distribution in each cytoband in different group comparisons. We found synchronous DCIS to be at a more advanced stage of genetic aberrations than pure DCIS and to be very similar to IDC. Differences in gains and losses in each disease process (i.e. invasive or in situ) at each cytoband were used to infer evidence of selection and conservation for each cytoband and to define an evolutionary conservation scale (ECS) as a tool to identify and distinguish driver SCNA from the passenger SCNA. Using ECS, we have identified aberrations that show evidence of selection from the early stages of neoplasia (i.e. in ADH and pure DCIS) and persist in IDC; we postulate these to be driver aberrations and that their presence may predict progression to invasive disease.
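The per-cytoband comparison rests on Fisher's exact test applied to a 2x2 table of gains and losses in two groups. A minimal standard-library sketch of the two-sided test follows; the function name and the cytoband counts are hypothetical, and the layout of the table (groups as rows, gain and loss as columns) is an assumption about how such a comparison might be arranged rather than a detail taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical cytoband: rows are two case groups, columns are gain and loss
p_value = fisher_exact_two_sided(12, 3, 5, 10)
print(f"two-sided p-value: {p_value:.4g}")
```

Because one such test is run per cytoband, a real analysis would normally follow it with a multiple-testing correction.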
Affiliation(s)
- Swapnil Ulhas Rane
- Department of Research Oncology, King's Health Partners AHSC, King's College London, London, UK
|
22
|
Ferguson BD, Carol Tan YH, Kanteti RS, Liu R, Gayed MJ, Vokes EE, Ferguson MK, John Iafrate A, Gill PS, Salgia R. Novel EPHB4 Receptor Tyrosine Kinase Mutations and Kinomic Pathway Analysis in Lung Cancer. Sci Rep 2015; 5:10641. [PMID: 26073592 PMCID: PMC4466581 DOI: 10.1038/srep10641] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 05/08/2014] [Accepted: 04/28/2015] [Indexed: 12/11/2022] Open
Abstract
Lung cancer outcomes remain poor despite the identification of several potential therapeutic targets. The EPHB4 receptor tyrosine kinase (RTK) has recently emerged as an oncogenic factor in many cancers, including lung cancer. Mutations of EPHB4 in lung cancers have previously been identified, though their significance remains unknown. Here, we report the identification of novel EPHB4 mutations that lead to putative structural alterations as well as increased cellular proliferation and motility. We also conducted a bioinformatic analysis of these mutations to demonstrate that they are mutually exclusive from other common RTK variants in lung cancer, that they correspond to analogous sites of other RTKs’ variations in cancers, and that they are predicted to be oncogenic based on biochemical, evolutionary, and domain-function constraints. Finally, we show that EPHB4 mutations can induce broad changes in the kinome signature of lung cancer cells. Taken together, these data illuminate the role of EPHB4 in lung cancer and further identify EPHB4 as a potentially important therapeutic target.
Affiliation(s)
- Benjamin D Ferguson
- Department of Surgery, University of Chicago, Chicago, Illinois, United States of America
| | - Yi-Hung Carol Tan
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, Illinois, United States of America
| | - Rajani S Kanteti
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, Illinois, United States of America
| | - Ren Liu
- Department of Medicine, Division of Medical Oncology, University of Southern California, Los Angeles, California, United States of America
| | - Matthew J Gayed
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, Illinois, United States of America
| | - Everett E Vokes
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, Illinois, United States of America
| | - Mark K Ferguson
- Department of Surgery, University of Chicago, Chicago, Illinois, United States of America
- Comprehensive Cancer Center, University of Chicago, Chicago, Illinois, United States of America
| | - A John Iafrate
- Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
| | - Parkash S Gill
- Department of Medicine, Division of Medical Oncology, University of Southern California, Los Angeles, California, United States of America
| | - Ravi Salgia
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, Illinois, United States of America
- Comprehensive Cancer Center, University of Chicago, Chicago, Illinois, United States of America
|
23
|
PON-P2: prediction method for fast and reliable identification of harmful variants. PLoS One 2015; 10:e0117380. [PMID: 25647319 PMCID: PMC4315405 DOI: 10.1371/journal.pone.0117380] [Citation(s) in RCA: 159] [Impact Index Per Article: 15.9] [Received: 09/24/2014] [Accepted: 12/17/2014] [Indexed: 01/04/2023] Open
Abstract
More reliable and faster prediction methods are needed to interpret enormous amounts of data generated by sequencing and genome projects. We have developed a new computational tool, PON-P2, for classification of amino acid substitutions in human proteins. The method is a machine learning-based classifier and groups the variants into pathogenic, neutral and unknown classes, on the basis of random forest probability score. PON-P2 is trained using pathogenic and neutral variants obtained from VariBench, a database for benchmark variation datasets. PON-P2 utilizes information about evolutionary conservation of sequences, physical and biochemical properties of amino acids, GO annotations and if available, functional annotations of variation sites. Extensive feature selection was performed to identify 8 informative features among altogether 622 features. PON-P2 consistently showed superior performance in comparison to existing state-of-the-art tools. In 10-fold cross-validation test, its accuracy and MCC are 0.90 and 0.80, respectively, and in the independent test, they are 0.86 and 0.71, respectively. The coverage of PON-P2 is 61.7% in the 10-fold cross-validation and 62.1% in the test dataset. PON-P2 is a powerful tool for screening harmful variants and for ranking and prioritizing experimental characterization. It is very fast making it capable of analyzing large variant datasets. PON-P2 is freely available at http://structure.bmc.lu.se/PON-P2/.
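The reported figures (accuracy, MCC, and coverage) fit together in a specific way when a classifier is allowed to answer "unknown": accuracy and MCC are computed only over the variants that received a call, and coverage is the fraction of variants that received one. The sketch below illustrates that bookkeeping. The fixed probability thresholds are a simplifying assumption made for this example and are not PON-P2's actual reliability rule; all names and data are hypothetical.

```python
from math import sqrt


def classify(prob_pathogenic, lower=0.3, upper=0.7):
    # Assumed rule: confident scores get a label, the middle band is "unknown"
    if prob_pathogenic >= upper:
        return "pathogenic"
    if prob_pathogenic <= lower:
        return "neutral"
    return "unknown"


def accuracy_mcc_coverage(is_pathogenic, predictions):
    tp = tn = fp = fn = 0
    for truth, predicted in zip(is_pathogenic, predictions):
        if predicted == "unknown":
            continue  # unclassified variants only reduce coverage
        if predicted == "pathogenic":
            tp, fp = (tp + 1, fp) if truth else (tp, fp + 1)
        else:
            tn, fn = (tn, fn + 1) if truth else (tn + 1, fn)
    classified = tp + tn + fp + fn
    accuracy = (tp + tn) / classified
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    coverage = classified / len(is_pathogenic)
    return accuracy, mcc, coverage


# Hypothetical labels and random-forest probability scores
truth = [True, True, True, False, False, False, True, False]
scores = [0.9, 0.8, 0.2, 0.1, 0.4, 0.75, 0.5, 0.05]
accuracy, mcc, coverage = accuracy_mcc_coverage(
    truth, [classify(s) for s in scores])
print(accuracy, mcc, coverage)
```

Widening the "unknown" band typically trades coverage for accuracy, which is why the two numbers are best read together.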
|
24
|
Tian R, Basu MK, Capriotti E. ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples. Bioinformatics 2015; 30:i572-8. [PMID: 25161249 PMCID: PMC4147919 DOI: 10.1093/bioinformatics/btu466] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 01/15/2023]
Abstract
Motivation: The recent advance in high-throughput sequencing technologies is generating a huge amount of data that are becoming an important resource for deciphering the genotype underlying a given phenotype. Genome sequencing has been extensively applied to the study of the cancer genomes. Although a few methods have been already proposed for the detection of cancer-related genes, their automatic identification is still a challenging task. Using the genomic data made available by The Cancer Genome Atlas Consortium (TCGA), we propose a new prioritization approach based on the analysis of the distribution of putative deleterious variants in a large cohort of cancer samples. Results: In this paper, we present ContrastRank, a new method for the prioritization of putative impaired genes in cancer. The method is based on the comparison of the putative defective rate of each gene in tumor versus normal and 1000 genome samples. We show that the method is able to provide a ranked list of putative impaired genes for colon, lung and prostate adenocarcinomas. The list significantly overlaps with the list of known cancer driver genes previously published. More importantly, by using our scoring approach, we can successfully discriminate between TCGA normal and tumor samples. A binary classifier based on ContrastRank score reaches an overall accuracy >90% and the area under the curve (AUC) of receiver operating characteristics (ROC) >0.95 for all the three types of adenocarcinoma analyzed in this paper. In addition, using ContrastRank score, we are able to discriminate the three tumor types with a minimum overall accuracy of 77% and AUC of 0.83. Conclusions: We describe ContrastRank, a method for prioritizing putative impaired genes in cancer. The method is based on the comparison of exome sequencing data from different cohorts and can detect putative cancer driver genes. ContrastRank can also be used to estimate a global score for an individual genome about the risk of adenocarcinoma based on the genetic variants information from a whole-exome VCF (Variant Calling Format) file. We believe that the application of ContrastRank can be an important step in genomic medicine to enable genome-based diagnosis. Availability and implementation: The lists of ContrastRank scores of all genes in each tumor type are available as supplementary materials. A webserver for evaluating the risk of the three studied adenocarcinomas starting from whole-exome VCF file is under development. Contact: emidio@uab.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rui Tian
- Division of Informatics, Department of Pathology, Department of Clinical and Diagnostic Sciences and Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL 35249, USA
| | - Malay K Basu
- Division of Informatics, Department of Pathology, Department of Clinical and Diagnostic Sciences and Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL 35249, USA
| | - Emidio Capriotti
- Division of Informatics, Department of Pathology, Department of Clinical and Diagnostic Sciences and Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL 35249, USA
|
25
|
Katsonis P, Koire A, Wilson SJ, Hsu TK, Lua RC, Wilkins AD, Lichtarge O. Single nucleotide variations: biological impact and theoretical interpretation. Protein Sci 2014; 23:1650-66. [PMID: 25234433 PMCID: PMC4253807 DOI: 10.1002/pro.2552] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2014] [Revised: 09/12/2014] [Accepted: 09/15/2014] [Indexed: 12/27/2022]
Abstract
Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Amanda Koire
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
| | - Stephen Joseph Wilson
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Teng-Kuei Hsu
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
| | - Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Angela Dawn Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
|
26
|
Chen J, Sun M, Shen B. Deciphering oncogenic drivers: from single genes to integrated pathways. Brief Bioinform 2014; 16:413-28. [PMID: 25378434 DOI: 10.1093/bib/bbu039] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2014] [Accepted: 09/18/2014] [Indexed: 12/12/2022] Open
Abstract
Technological advances in next-generation sequencing have uncovered a wide spectrum of aberrations in cancer genomes. The extreme diversity in cancer mutations necessitates computational approaches to differentiate between the 'drivers' with vital function in cancer progression and those nonfunctional 'passengers'. Although individual driver mutations are routinely identified, mutational profiles of different tumors are highly heterogeneous. There is growing consensus that pathways rather than single genes are the primary target of mutations. Here we review extant bioinformatics approaches to identifying oncogenic drivers at different mutational levels, highlighting the strategies for discovering driver pathways and networks from cancer mutation data. These approaches will help reduce the mutation complexity, thus providing a simplified picture of cancer.
|
27
|
Ubel C, Mousset S, Trufa D, Sirbu H, Finotto S. Establishing the role of tyrosine kinase 2 in cancer. Oncoimmunology 2014; 2:e22840. [PMID: 23482926 PMCID: PMC3583936 DOI: 10.4161/onci.22840] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
Tyrosine kinase 2 (TYK2) is a member of the Janus family of non-receptor tyrosine kinases involved in cytokine signaling. TYK2 deficiency is associated with increased susceptibility to mycobacterial and viral infections, hyper IgE syndrome as well as with allergic asthma. However the precise role of TYK2 in oncogenesis and tumor progression is not clear yet. Tyk2-deficient mice are prone to develop tumors because they lack efficient cytotoxic CD8+ T-cell antitumor responses as a result of deficient Type I interferon signaling. However, as TYK2 functions downstream of growth factor receptors that are often hyperactivated in cancer, inhibiting TYK2 might also have beneficial effects for cancer treatment.
Affiliation(s)
- Caroline Ubel
- Laboratory of Cellular and Molecular Lung Immunology; Institute of Molecular Pneumology; University of Erlangen-Nürnberg, Erlangen, Germany
|
28
|
Shihab HA, Gough J, Mort M, Cooper DN, Day INM, Gaunt TR. Ranking non-synonymous single nucleotide polymorphisms based on disease concepts. Hum Genomics 2014; 8:11. [PMID: 24980617 PMCID: PMC4083756 DOI: 10.1186/1479-7364-8-11] [Citation(s) in RCA: 146] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2014] [Accepted: 06/21/2014] [Indexed: 11/10/2022] Open
Abstract
As the number of non-synonymous single nucleotide polymorphisms (nsSNPs) identified through whole-exome/whole-genome sequencing programs increases, researchers and clinicians are becoming increasingly reliant upon computational prediction algorithms designed to prioritize potential functional variants for further study. A large proportion of existing prediction algorithms are 'disease agnostic' but are nevertheless quite capable of predicting when a mutation is likely to be deleterious. However, most clinical and research applications of these algorithms relate to specific diseases and would therefore benefit from an approach that discriminates between functional variants specifically related to that disease from those which are not. In a whole-exome/whole-genome sequencing context, such an approach could substantially reduce the number of false positive candidate mutations. Here, we test this postulate by incorporating a disease-specific weighting scheme into the Functional Analysis through Hidden Markov Models (FATHMM) algorithm. When compared to traditional prediction algorithms, we observed an overall reduction in the number of false positives identified using a disease-specific approach to functional prediction across 17 distinct disease concepts/categories. Our results illustrate the potential benefits of making disease-specific predictions when prioritizing candidate variants in relation to specific diseases. A web-based implementation of our algorithm is available at http://fathmm.biocompute.org.uk.
Affiliation(s)
- Tom R Gaunt
- Bristol Centre for Systems Biomedicine and MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK.
|
29
|
Application of Massively Parallel Sequencing in the Clinical Diagnostic Testing of Inherited Cardiac Conditions. Med Sci (Basel) 2014. [DOI: 10.3390/medsci2020098] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open
|
30
|
Cheong PL, Caramins M. Approaches for classifying DNA variants found by Sanger sequencing in a medical genetics laboratory. Methods Mol Biol 2014; 1168:227-50. [PMID: 24870139 DOI: 10.1007/978-1-4939-0847-9_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/21/2023]
Abstract
Diagnostic applications of DNA sequencing technologies present a powerful tool for the clinical management of patients. Applications range from better diagnostic classification to identification of therapeutic options, prediction of drug response and toxicity, and carrier testing. Although the advent of massively parallel sequencing technologies has increased the complexity of clinical interpretation of sequence variants by an order of magnitude, the annotation and interpretation of the clinical effects of identified genomic variants remain a challenge regardless of the sequencing technologies used to identify them. Here, we survey methodologies which assist in the diagnostic classification of DNA variants and propose a practical decision analytic protocol to assist in the classification of sequencing variants in a clinical setting. The methods include database queries, software tools for protein consequence, evolutionary conservation and pathogenicity prediction, familial segregation, case-control studies, and literature review. These methods are deliberately pragmatic as diagnostic constraints of clinically useful turnaround times generally preclude obtaining evidence from in vivo or in vitro functional experiments for variant assessment. Clinical considerations require that variant classification is stringent and rigorous, as misinterpretation may lead to inappropriate clinical consequences; thus, multiple parameters and lines of evidence are considered to determine potential biological significance.
Affiliation(s)
- Pak Leng Cheong
- Department of Medical Genomics, Royal Prince Alfred Hospital, Camperdown, NSW, Australia
|
31
|
Prediction and prioritization of rare oncogenic mutations in the cancer Kinome using novel features and multiple classifiers. PLoS Comput Biol 2014; 10:e1003545. [PMID: 24743239 PMCID: PMC3990476 DOI: 10.1371/journal.pcbi.1003545] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2013] [Accepted: 02/18/2014] [Indexed: 01/18/2023] Open
Abstract
Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays. Cancer progresses by accumulation of mutations in a subset of genes that confer growth advantage. The 518 protein kinase genes encoded in the human genome, collectively called the kinome, represent one of the largest families of oncogenes. Targeted sequencing studies of many different cancers have shown that the mutational landscape comprises both cancer-causing “driver” mutations and harmless “passenger” mutations. 
While the frequent recurrence of some driver mutations in human cancers helps distinguish them from the large number of passenger mutations, a significant challenge is to identify the rare “driver” mutations that are less frequently observed in patient samples and yet are causative. Here we combine computational and experimental approaches to identify rare cancer-associated mutations in Epidermal Growth Factor receptor kinase (EGFR), a signaling protein frequently mutated in cancers. Specifically, we evaluate a novel multiple-classifier approach and features specific to the protein kinase super-family in distinguishing known cancer-associated mutations from benign mutations. We then apply the multiple classifier to identify and test the functional impact of rare cancer-associated mutations in EGFR. We report, for the first time, that the EGFR mutations T725M and L861R, which are infrequently observed in cancers, constitutively activate EGFR in a manner analogous to the frequently observed driver mutations.
|
32
|
Structure-functional prediction and analysis of cancer mutation effects in protein kinases. Comput Math Methods Med 2014; 2014:653487. [PMID: 24817905 PMCID: PMC4000980 DOI: 10.1155/2014/653487] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/13/2013] [Revised: 12/31/2013] [Accepted: 02/28/2014] [Indexed: 12/17/2022]
Abstract
A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal “low” activity state to the “active” state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes.
|
33
|
Abstract
Moving from a traditional medical model of treating pathologies to an individualized predictive and preventive model of personalized medicine promises to reduce the healthcare cost on an overburdened and overwhelmed system. Next-generation sequencing (NGS) has the potential to accelerate the early detection of disorders and the identification of pharmacogenetics markers to customize treatments. This review explains the historical facts that led to the development of NGS along with the strengths and weaknesses of NGS, with a special emphasis on the analytical aspects used to process NGS data. There are solutions to all the steps necessary for performing NGS in the clinical context where the majority of them are very efficient, but there are some crucial steps in the process that need immediate attention.
Affiliation(s)
- Manuel L. Gonzalez-Garay
- Center for Molecular Imaging, Division of Genomics & Bioinformatics, The Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
|
34
|
Ghersi D, Singh M. Interaction-based discovery of functionally important genes in cancers. Nucleic Acids Res 2013; 42:e18. [PMID: 24362839 PMCID: PMC3919581 DOI: 10.1093/nar/gkt1305] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Abstract
A major challenge in cancer genomics is uncovering genes with an active role in tumorigenesis from a potentially large pool of mutated genes across patient samples. Here we focus on the interactions that proteins make with nucleic acids, small molecules, ions and peptides, and show that residues within proteins that are involved in these interactions are more frequently affected by mutations observed in large-scale cancer genomic data than are other residues. We leverage this observation to predict genes that play a functionally important role in cancers by introducing a computational pipeline (http://canbind.princeton.edu) for mapping large-scale cancer exome data across patients onto protein structures, and automatically extracting proteins with an enriched number of mutations affecting their nucleic acid, small molecule, ion or peptide binding sites. Using this computational approach, we show that many previously known genes implicated in cancers are enriched in mutations within the binding sites of their encoded proteins. By focusing on functionally relevant portions of proteins--specifically those known to be involved in molecular interactions--our approach is particularly well suited to detect infrequent mutations that may nonetheless be important in cancer, and should aid in expanding our functional understanding of the genomic landscape of cancer.
Affiliation(s)
- Dario Ghersi
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA and Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
|
35
|
Izarzugaza JMG, Vazquez M, del Pozo A, Valencia A. wKinMut: an integrated tool for the analysis and interpretation of mutations in human protein kinases. BMC Bioinformatics 2013; 14:345. [PMID: 24289158 PMCID: PMC3879071 DOI: 10.1186/1471-2105-14-345] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2012] [Accepted: 05/30/2013] [Indexed: 11/13/2022] Open
Abstract
Background: Protein kinases are involved in relevant physiological functions and a broad number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, the exploration of the consequences on the phenotypes of each individual mutation remains a considerable challenge. Results: The wKinMut web-server offers direct prediction of the potential pathogenicity of the mutations from a number of methods, including our recently developed prediction method based on the combination of information from a range of diverse sources, including physicochemical properties and functional annotations from FireDB and Swissprot and kinase-specific characteristics such as the membership to specific kinase groups, the annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains, and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system that extracts mentions of mutations directly from the literature, and therefore increases the possibilities of finding interesting functional information associated to the studied mutations.
Conclusions: wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernandez Almagro, 3, E-28029 Madrid, Spain.
|
36
|
Fawdar S, Trotter EW, Li Y, Stephenson NL, Hanke F, Marusiak AA, Edwards ZC, Ientile S, Waszkowycz B, Miller CJ, Brognard J. Targeted genetic dependency screen facilitates identification of actionable mutations in FGFR4, MAP3K9, and PAK5 in lung cancer. Proc Natl Acad Sci U S A 2013; 110:12426-31. [PMID: 23836671 PMCID: PMC3725071 DOI: 10.1073/pnas.1305207110] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
Approximately 70% of patients with non-small-cell lung cancer present with late-stage disease and have limited treatment options, so there is a pressing need to develop efficacious targeted therapies for these patients. This remains a major challenge as the underlying genetic causes of ~50% of non-small-cell lung cancers remain unknown. Here we demonstrate that a targeted genetic dependency screen is an efficient approach to identify somatic cancer alterations that are functionally important. By using this approach, we have identified three kinases with gain-of-function mutations in lung cancer, namely FGFR4, MAP3K9, and PAK5. Mutations in these kinases are activating toward the ERK pathway, and targeted depletion of the mutated kinases inhibits proliferation, suppresses constitutive activation of downstream signaling pathways, and results in specific killing of the lung cancer cells. Genomic profiling of patients with lung cancer is ushering in an era of personalized medicine; however, lack of actionable mutations presents a significant hurdle. Our study indicates that targeted genetic dependency screens will be an effective strategy to elucidate somatic variants that are essential for lung cancer cell viability.
Affiliation(s)
- Yaoyong Li
- Applied Computational Biology and Bioinformatics Group, and
| | - Bohdan Waszkowycz
- Drug Discovery Unit, Cancer Research UK, Paterson Institute for Cancer Research, University of Manchester, Manchester M20 4BX, United Kingdom
|
37
|
Guo Y, Wei X, Das J, Grimson A, Lipkin S, Clark A, Yu H. Dissecting disease inheritance modes in a three-dimensional protein network challenges the "guilt-by-association" principle. Am J Hum Genet 2013; 93:78-89. [PMID: 23791107 DOI: 10.1016/j.ajhg.2013.05.022] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2012] [Revised: 05/02/2013] [Accepted: 05/23/2013] [Indexed: 10/26/2022] Open
Abstract
To better understand different molecular mechanisms by which mutations lead to various human diseases, we classified 82,833 disease-associated mutations according to their inheritance modes (recessive versus dominant) and molecular types (in-frame [missense point mutations and in-frame indels] versus truncating [nonsense mutations and frameshift indels]) and systematically examined the effects of different classes of disease mutations in a three-dimensional protein interactome network with the atomic-resolution interface resolved for each interaction. We found that although recessive mutations affecting the interaction interface of two interacting proteins tend to cause the same disease, this widely accepted "guilt-by-association" principle does not apply to dominant mutations. Furthermore, recessive truncating mutations in regions encoding the same interface are much more likely to cause the same disease, even for interfaces close to the N terminus of the protein. Conversely, dominant truncating mutations tend to be enriched in regions encoding areas between interfaces. These results suggest that a significant fraction of truncating mutations can generate functional protein products. For example, TRIM27, a known cancer-associated protein, interacts with three proteins (MID2, TRIM42, and SIRPA) through two different interfaces. A dominant truncating mutation (c.1024delT [p.Tyr342Thrfs*30]) associated with ovarian carcinoma is located between the regions encoding the two interfaces; the altered protein retains its interaction with MID2 and TRIM42 through the first interface but loses its interaction with SIRPA through the second interface. Our findings will help clarify the molecular mechanisms of thousands of disease-associated genes and their tens of thousands of mutations, especially for those carrying truncating mutations, often erroneously considered "knockout" alleles.
|
38
|
Identifying driver mutations from sequencing data of heterogeneous tumors in the era of personalized genome sequencing. Brief Bioinform 2013; 15:244-55. [DOI: 10.1093/bib/bbt042] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023] Open
|
39
|
Assessment of computational methods for predicting the effects of missense mutations in human cancers. BMC Genomics 2013; 14 Suppl 3:S7. [PMID: 23819521 PMCID: PMC3665581 DOI: 10.1186/1471-2164-14-s3-s7] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Recent advances in sequencing technologies have greatly increased the identification of mutations in cancer genomes. However, it remains a significant challenge to identify cancer-driving mutations, since most observed missense changes are neutral passenger mutations. Various computational methods have been developed to predict the effects of amino acid substitutions on protein function and classify mutations as deleterious or benign. These include approaches that rely on evolutionary conservation, structural constraints, or physicochemical attributes of amino acid substitutions. Here we review existing methods and further examine eight tools: SIFT, PolyPhen2, Condel, CHASM, mCluster, logRE, SNAP, and MutationAssessor, with respect to their coverage, accuracy, availability and dependence on other tools. RESULTS Single nucleotide polymorphisms with high minor allele frequencies were used as a negative (neutral) set for testing, and recurrent mutations from the COSMIC database as well as novel recurrent somatic mutations identified in very recent cancer studies were used as positive (non-neutral) sets. Conservation-based methods generally had moderately high accuracy in distinguishing neutral from deleterious mutations, whereas the performance of machine learning based predictors with comprehensive feature spaces varied between assessments using different positive sets. MutationAssessor consistently provided the highest accuracies. For certain combinations metapredictors slightly improved the performance of included individual methods, but did not outperform MutationAssessor as stand-alone tool. CONCLUSIONS Our independent assessment of existing tools reveals various performance disparities. Cancer-trained methods did not improve upon more general predictors. 
No method or combination of methods exceeds 81% accuracy, indicating there is still significant room for improvement for driver mutation prediction, and perhaps more sophisticated feature integration is needed to develop a more robust tool.
|
40
|
Shihab HA, Gough J, Cooper DN, Day INM, Gaunt TR. Predicting the functional consequences of cancer-associated amino acid substitutions. Bioinformatics 2013; 29:1504-10. [PMID: 23620363 PMCID: PMC3673218 DOI: 10.1093/bioinformatics/btt182] [Citation(s) in RCA: 180] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Motivation: The number of missense mutations being identified in cancer genomes has greatly increased as a consequence of technological advances and the reduced cost of whole-genome/whole-exome sequencing methods. However, a high proportion of the amino acid substitutions detected in cancer genomes have little or no effect on tumour progression (passenger mutations). Therefore, accurate automated methods capable of discriminating between driver (cancer-promoting) and passenger mutations are becoming increasingly important. In our previous work, we developed the Functional Analysis through Hidden Markov Models (FATHMM) software and, using a model weighted for inherited disease mutations, observed improved performances over alternative computational prediction algorithms. Here, we describe an adaptation of our original algorithm that incorporates a cancer-specific model to potentiate the functional analysis of driver mutations. Results: The performance of our algorithm was evaluated using two separate benchmarks. In our analysis, we observed improved performances when distinguishing between driver mutations and other germ line variants (both disease-causing and putatively neutral mutations). In addition, when discriminating between somatic driver and passenger mutations, we observed performances comparable with the leading computational prediction algorithms: SPF-Cancer and TransFIC. Availability and implementation: A web-based implementation of our cancer-specific model, including a downloadable stand-alone package, is available at http://fathmm.biocompute.org.uk. Contact: fathmm@biocompute.org.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hashem A Shihab
- Bristol Centre for Systems Biomedicine and MRC CAiTE Centre, School of Social and Community Medicine, University of Bristol, Bristol BS8 2BN, UK
|
41
|
Castellana S, Mazza T. Congruency in the prediction of pathogenic missense mutations: state-of-the-art web-based tools. Brief Bioinform 2013; 14:448-59. [PMID: 23505257 DOI: 10.1093/bib/bbt013] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Indexed: 12/22/2022] Open
Abstract
A remarkable degree of genetic variation has been found in the protein-encoding regions of DNA through deep sequencing of samples obtained from thousands of subjects from several populations. Approximately half of the 20 000 single nucleotide polymorphisms present, even in normal healthy subjects, are nonsynonymous amino acid substitutions that could potentially affect protein function. The greatest challenges currently facing investigators are data interpretation and the development of strategies to identify the few gene-coding variants that actually cause or confer susceptibility to disease. A confusing array of options is available to address this problem. Unfortunately, the overall accuracy of these tools at ultraconserved positions is low, and predictions generated by current computational tools may mislead researchers involved in downstream experimental and clinical studies. First, we have presented an updated review of these tools and their primary functionalities, focusing on those that are naturally prone to analyze massive variant sets, to infer some interesting similarities among their results. Additionally, we have evaluated the prediction congruency for real whole-exome sequencing data in a proof-of-concept study on some of these web-based tools.
|
42
|
Li Z, Gakovic M, Ragimbeau J, Eloranta ML, Rönnblom L, Michel F, Pellegrini S. Two rare disease-associated Tyk2 variants are catalytically impaired but signaling competent. J Immunol 2013; 190:2335-44. [PMID: 23359498 DOI: 10.4049/jimmunol.1203118] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Indexed: 01/12/2023]
Abstract
Tyk2 belongs to the Janus protein tyrosine kinase family and is involved in signaling of immunoregulatory cytokines (type I and III IFNs, IL-6, IL-10, and IL-12 families) via its interaction with shared receptor subunits. Depending on the receptor complex, Tyk2 is coactivated with either Jak1 or Jak2, but a detailed molecular characterization of the interplay between the two enzymes is missing. In human populations, the Tyk2 gene presents high levels of genetic diversity with >100 nonsynonymous variants being detected. In this study, we characterized two rare Tyk2 variants, I684S and P1104A, which have been associated with susceptibility to autoimmune disease. Specifically, we measured their in vitro catalytic activity and their ability to mediate Stat activation in fibroblasts and genotyped B cell lines. Both variants were found to be catalytically impaired but rescued signaling in response to IFN-α/β, IL-6, and IL-10. These data, coupled with functional study of an engineered Jak1 P1084A, support a model of nonhierarchical activation of Janus kinases in which one catalytically competent Jak is sufficient for signaling provided that its partner behaves as proper scaffold, even if inactive. Through the analysis of IFN-α and IFN-γ signaling in cells with different Jak1 P1084A levels, we also illustrate a context in which a hypomorphic Jak can hamper signaling in a cytokine-specific manner. Given the multitude of Tyk2-activating cytokines, the cell context-dependent requirement for Tyk2 and the catalytic defect of the two disease-associated variants studied in this paper, we predict that these alleles are functionally significant in complex immune disorders.
Affiliation(s)
- Zhi Li
- Unit of Cytokine Signaling, Institut Pasteur, Paris 75724, France
|
43
|
Lei JB, Yin JB, Shen HB. GFO: A data driven approach for optimizing the Gaussian function based similarity metric in computational biology. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.07.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
44
|
Jin P, Cai R, Zhou X, Li-Ling J, Ma F. Features of missense/nonsense mutations in exonic splicing enhancer sequences from cancer-related human genes. Mutat Res 2012; 740:6-12. [PMID: 23123687 DOI: 10.1016/j.mrfmmm.2012.10.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/16/2011] [Revised: 08/31/2012] [Accepted: 10/19/2012] [Indexed: 11/18/2022]
Affiliation(s)
- Ping Jin
- College of Life Science, Nanjing Normal University, Nanjing, China
|
45
|
Hashimoto K, Rogozin IB, Panchenko AR. Oncogenic potential is related to activating effect of cancer single and double somatic mutations in receptor tyrosine kinases. Hum Mutat 2012; 33:1566-75. [PMID: 22753356 PMCID: PMC3465464 DOI: 10.1002/humu.22145] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 01/19/2012] [Accepted: 05/29/2012] [Indexed: 01/16/2023]
Abstract
Aberrant activation of receptor tyrosine kinases (RTKs) is a common feature of many cancer cells. It was previously suggested that the mechanisms of kinase activation in cancer might be linked to transitions between active and inactive states. Here, we estimate the effects of single and double cancer mutations on the stability of active and inactive states of the kinase domains from different RTKs. We show that singleton cancer mutations destabilize active and inactive states; however, inactive states are destabilized more than the active ones, leading to kinase activation. We show that there exists a relationship between the estimate of oncogenic potential of cancer mutation and kinase activation. Namely, more frequent mutations have a higher activating effect, which might allow us to predict the activating effect of the mutations from the mutation spectra. Independent evolutionary analysis of mutation spectra complements this observation and finds the same frequency threshold defining mutation hotspots. We analyze double mutations and report a positive epistasis and additional advantage of doublets with respect to cancer cell fitness. The activation mechanisms of double mutations differ from those of single mutations and double mutation spectrum is found to be dissimilar to the mutation spectrum of singletons.
Affiliation(s)
- Igor B. Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Anna R. Panchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
46
|
Tan H, Bao J, Zhou X. A novel missense-mutation-related feature extraction scheme for 'driver' mutation identification. Bioinformatics 2012; 28:2948-55. [PMID: 23044540 DOI: 10.1093/bioinformatics/bts558] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 01/07/2023]
Abstract
MOTIVATION: It becomes widely accepted that human cancer is a disease involving dynamic changes in the genome and that the missense mutations constitute the bulk of human genetic variations. A multitude of computational algorithms, especially the machine learning-based ones, has consequently been proposed to distinguish missense changes that contribute to the cancer progression ('driver' mutation) from those that do not ('passenger' mutation). However, the existing methods have multifaceted shortcomings, in the sense that they either adopt incomplete feature space or depend on protein structural databases which are usually far from integrated. RESULTS: In this article, we investigated multiple aspects of a missense mutation and identified a novel feature space that well distinguishes cancer-associated driver mutations from passenger ones. An index (DX score) was proposed to evaluate the discriminating capability of each feature, and a subset of these features which ranks top was selected to build the SVM classifier. Cross-validation showed that the classifier trained on our selected features significantly outperforms the existing ones both in precision and robustness. We applied our method to several datasets of missense mutations culled from published database and literature and obtained more reasonable results than previous studies. AVAILABILITY: The software is available online at http://www.methodisthealth.com/software and https://sites.google.com/site/drivermutationidentification/. CONTACT: xzhou@tmhs.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hua Tan
- School of Mathematical Sciences, Beijing Normal University, Laboratory of Mathematics and Complex Systems, Ministry of Education, Beijing 100875, P.R. China
|
47
|
Izarzugaza JMG, Krallinger M, Valencia A. Interpretation of the consequences of mutations in protein kinases: combined use of bioinformatics and text mining. Front Physiol 2012; 3:323. [PMID: 23055974 PMCID: PMC3449330 DOI: 10.3389/fphys.2012.00323] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/23/2012] [Accepted: 07/23/2012] [Indexed: 11/30/2022] Open
Abstract
Protein kinases play a crucial role in a plethora of significant physiological functions and a number of mutations in this superfamily have been reported in the literature to disrupt protein structure and/or function. Computational and experimental research aims to discover the mechanistic connection between mutations in protein kinases and disease with the final aim of predicting the consequences of mutations on protein function and the subsequent phenotypic alterations. In this article, we will review the possibilities and limitations of current computational methods for the prediction of the pathogenicity of mutations in the protein kinase superfamily. In particular we will focus on the problem of benchmarking the predictions with independent gold standard datasets. We will propose a pipeline for the curation of mutations automatically extracted from the literature. Since many of these mutations are not included in the databases that are commonly used to train the computational methods to predict the pathogenicity of protein kinase mutations we propose them to build a valuable gold standard dataset in the benchmarking of a number of these predictors. Finally, we will discuss how text mining approaches constitute a powerful tool for the interpretation of the consequences of mutations in the context of disease genome analysis with particular focus on cancer.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre Madrid, Spain
|
48
|
A Bayesian ensemble approach with a disease gene network predicts damaging effects of missense variants of human cancers. Hum Genet 2012; 132:15-27. [DOI: 10.1007/s00439-012-1218-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/15/2012] [Accepted: 08/05/2012] [Indexed: 02/04/2023]
|
49
|
|
50
|
Izarzugaza JMG, del Pozo A, Vazquez M, Valencia A. Prioritization of pathogenic mutations in the protein kinase superfamily. BMC Genomics 2012; 13 Suppl 4:S3. [PMID: 22759651 PMCID: PMC3303724 DOI: 10.1186/1471-2164-13-s4-s3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND: Most of the many mutations described in human protein kinases are tolerated without significant disruption of the corresponding structures or molecular functions, while some of them have been associated with a variety of human diseases, including cancer. In the last decade, a plethora of computational methods to predict the effect of missense single-nucleotide variants (SNVs) have been developed. Still, current high-throughput sequencing efforts and the concomitant need for massive interpretation of protein sequence variants will demand more efficient and/or accurate computational methods in the forthcoming years. RESULTS: We present KinMut, a support vector machine (SVM) approach, to identify pathogenic mutations in the protein kinase superfamily. KinMut relies on a combination of sequence-derived features that describe mutations at different levels: (1) Gene level: membership to a specific group in Kinbase and the annotation with GO terms; (2) Domain level: annotated PFAM domains; and (3) Residue level: physicochemical features of amino acids, specificity determining positions, and functional annotations from SwissProt and FireDB. The system has been trained with the set of 3492 human kinase mutations in UniProt for which experimental validation of their pathogenic or neutral character exists. In addition, we discuss the relative importance of these independent properties and their combination for the development of a kinase-specific predictor. Finally, we compare KinMut with other state-of-the-art prediction methods. CONCLUSIONS: Family-specific features appear among the most discriminative information sources, which allow us to produce accurate results in a reliable and very simple way with minimal supervision. Our study aims to broaden the knowledge on the mechanisms by which mutations in the human kinome contribute to disease with a particular focus on cancer.
The classifier as well as further documentation is available at http://kinmut.bioinfo.cnio.es/.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
|