1
Ng JK, Chen Y, Akinwe TM, Heins HB, Mehinovic E, Chang Y, Payne ZL, Manuel JG, Karchin R, Turner TN. Proteome-Wide Assessment of Clustering of Missense Variants in Neurodevelopmental Disorders Versus Cancer. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.02.02.24302238. [PMID: 38352539 PMCID: PMC10863034 DOI: 10.1101/2024.02.02.24302238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]
Abstract
Missense de novo variants (DNVs) and missense somatic variants contribute to neurodevelopmental disorders (NDDs) and cancer, respectively. Proteins with statistical enrichment based on analyses of these variants exhibit convergence in the differing NDD and cancer phenotypes. Herein, the question of why some of the same proteins are identified in both phenotypes is examined through investigation of clustering of missense variation at the protein level. Our hypothesis is that missense variation is present in different protein locations in the two phenotypes leading to the distinct phenotypic outcomes. We tested this hypothesis in 1D protein space using our software CLUMP. Furthermore, we newly developed 3D-CLUMP that uses 3D protein structures to spatially test clustering of missense variation for proteome-wide significance. We examined missense DNVs in 39,883 parent-child sequenced trios with NDDs and missense somatic variants from 10,543 sequenced tumors covering five TCGA cancer types and two COSMIC pan-cancer aggregates of tissue types. There were 57 proteins with proteome-wide significant missense variation clustering in NDDs when compared to cancers and 79 proteins with proteome-wide significant missense clustering in cancers compared to NDDs. While our main objective was to identify differences in patterns of missense variation, we also identified a novel NDD protein BLTP2. Overall, our study is innovative, provides new insights into differential missense variation in NDDs and cancer at the protein-level, and contributes necessary information toward building a framework for thinking about prognostic and therapeutic aspects of these proteins.
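As a rough illustration of what testing for clustering "in 1D protein space" can look like, the sketch below runs a simple permutation test on residue positions. It is not the CLUMP or 3D-CLUMP statistic; the function names, the mean-pairwise-distance score, the uniform null model, and the example positions are all assumptions made for illustration.

```python
import random
from itertools import combinations


def mean_pairwise_distance(positions):
    """Mean absolute distance between all pairs of residue positions."""
    pairs = list(combinations(positions, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def clustering_p_value(positions, protein_length, n_perm=2000, seed=0):
    """Empirical p-value under a uniform null (hypothetical null model).

    Counts how often the same number of variants, placed uniformly at
    random along the protein, are at least as tightly packed as observed.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(positions)
    hits = 0
    for _ in range(n_perm):
        simulated = [rng.randint(1, protein_length) for _ in positions]
        if mean_pairwise_distance(simulated) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up example positions on a hypothetical 800-residue protein.
clustered = [100, 102, 103, 105, 107, 110]
dispersed = [12, 150, 310, 480, 640, 790]
p_clustered = clustering_p_value(clustered, protein_length=800)
p_dispersed = clustering_p_value(dispersed, protein_length=800)
```

A comparison between two phenotypes, as in the study, would additionally need a statistic that contrasts the two variant sets and a correction for testing many proteins; neither is shown here.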
Affiliation(s)
- Jeffrey K. Ng
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Yilin Chen
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Titilope M. Akinwe
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Molecular Genetics & Genomics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Hillary B. Heins
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Elvisa Mehinovic
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Yoonhoo Chang
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Human & Statistical Genetics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Zachary L. Payne
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Molecular Genetics & Genomics Graduate Program, Washington University School of Medicine, St. Louis, MO 63110, USA
- Juana G. Manuel
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Rachel Karchin
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - The Sidney Kimmel Comprehensive Cancer Center, School of Medicine, Johns Hopkins University, Baltimore, MD, USA
  - Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, USA
- Tychele N. Turner
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Intellectual and Developmental Disabilities Research Center, Washington University School of Medicine, St. Louis, MO, USA
2
Roothans N, Gabriëls M, Abeel T, Pabst M, van Loosdrecht MCM, Laureni M. Aerobic denitrification as an N2O source from microbial communities. THE ISME JOURNAL 2024; 18:wrae116. [PMID: 38913498 PMCID: PMC11272060 DOI: 10.1093/ismejo/wrae116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 04/26/2024] [Accepted: 06/21/2024] [Indexed: 06/26/2024]
Abstract
Nitrous oxide (N2O) is a potent greenhouse gas of primarily microbial origin. Oxic and anoxic emissions are commonly ascribed to autotrophic nitrification and heterotrophic denitrification, respectively. Beyond this established dichotomy, we quantitatively show that heterotrophic denitrification can significantly contribute to aerobic nitrogen turnover and N2O emissions in complex microbiomes exposed to frequent oxic/anoxic transitions. Two planktonic, nitrification-inhibited enrichment cultures were established under continuous organic carbon and nitrate feeding, and cyclic oxygen availability. Over a third of the influent organic substrate was respired with nitrate as electron acceptor at high oxygen concentrations (>6.5 mg/L). N2O accounted for up to one-quarter of the nitrate reduced under oxic conditions. The enriched microorganisms maintained a constitutive abundance of denitrifying enzymes due to the oxic/anoxic frequencies exceeding their protein turnover, a common scenario in natural and engineered ecosystems. The aerobic denitrification rates are ascribed primarily to the residual activity of anaerobically synthesised enzymes. From an ecological perspective, the selection of organisms capable of sustaining significant denitrifying activity during aeration shows their competitive advantage over other heterotrophs under varying oxygen availabilities. Ultimately, we propose that the contribution of heterotrophic denitrification to aerobic nitrogen turnover and N2O emissions is currently underestimated in dynamic environments.
Affiliation(s)
- Nina Roothans
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, the Netherlands
- Minke Gabriëls
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, the Netherlands
- Thomas Abeel
  - Delft Bioinformatics Lab, Delft University of Technology, van Mourik Broekmanweg 6, Delft 2628 XE, the Netherlands
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, United States
- Martin Pabst
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, the Netherlands
- Mark C M van Loosdrecht
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, the Netherlands
- Michele Laureni
  - Department of Biotechnology, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, the Netherlands
  - Department of Water Management, Delft University of Technology, Stevinweg 1, 2628 CN Delft, the Netherlands
3
Shorthouse D, Zhuang L, Rahrmann EP, Kosmidou C, Wickham Rahrmann K, Hall M, Greenwood B, Devonshire G, Gilbertson RJ, Fitzgerald RC, Hall BA. KCNQ potassium channels modulate Wnt activity in gastro-oesophageal adenocarcinomas. Life Sci Alliance 2023; 6:e202302124. [PMID: 37748809 PMCID: PMC10520261 DOI: 10.26508/lsa.202302124] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/28/2023] [Revised: 09/11/2023] [Accepted: 09/11/2023] [Indexed: 09/27/2023]
Abstract
Voltage-sensitive potassium channels play an important role in controlling membrane potential and ionic homeostasis in the gut and have been implicated in gastrointestinal (GI) cancers. Through large-scale analysis of 897 patients with gastro-oesophageal adenocarcinomas (GOAs) coupled with in vitro models, we find KCNQ family genes are mutated in ∼30% of patients, and play therapeutically targetable roles in GOA cancer growth. KCNQ1 and KCNQ3 mediate the WNT pathway and MYC to increase proliferation through resultant effects on cadherin junctions. This also highlights novel roles of KCNQ3 in non-excitable tissues. We also discover that activity of KCNQ3 sensitises cancer cells to existing potassium channel inhibitors and that inhibition of KCNQ activity reduces proliferation of GOA cancer cells. These findings reveal a novel and exploitable role of potassium channels in the advancement of human cancer, and highlight that supplemental treatments for GOAs may exist through KCNQ inhibitors.
Affiliation(s)
- David Shorthouse
  - https://ror.org/02jx3x895 Department of Medical Physics and Biomedical Engineering, Malet Place Engineering Building, University College London, London, UK
- Lizhe Zhuang
  - Institute for Early Detection, CRUK Cambridge Centre, Cambridge, UK
- Eric P Rahrmann
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Michael Hall
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Benedict Greenwood
  - https://ror.org/02jx3x895 Department of Medical Physics and Biomedical Engineering, Malet Place Engineering Building, University College London, London, UK
- Ginny Devonshire
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Richard J Gilbertson
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Benjamin A Hall
  - https://ror.org/02jx3x895 Department of Medical Physics and Biomedical Engineering, Malet Place Engineering Building, University College London, London, UK
4
Iqbal S, Brünger T, Pérez-Palma E, Macnee M, Brunklaus A, Daly MJ, Campbell AJ, Hoksza D, May P, Lal D. Delineation of functionally essential protein regions for 242 neurodevelopmental genes. Brain 2023; 146:519-533. [PMID: 36256779 PMCID: PMC9924913 DOI: 10.1093/brain/awac381] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/29/2022] [Revised: 08/12/2022] [Accepted: 09/04/2022] [Indexed: 01/25/2023]
Abstract
Neurodevelopmental disorders (NDDs), including severe paediatric epilepsy, autism and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will be tested in this or the coming years in clinical trials. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a genetic test-informed diagnosis before a patient can be enrolled in a clinical trial. However, even in 2022, most identified genetic variants in NDD genes are 'variants of uncertain significance'. To safely enrol patients in precision medicine clinical trials, it is important to increase our knowledge about which regions in NDD-associated proteins can 'tolerate' missense variants and which ones are 'essential' and will cause an NDD when mutated. In addition, knowledge about functionally indispensable regions in the 3D structure context of proteins can also provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the currently largest expert-curated protein structure set for 242 NDDs and identified 14 377 Essential3D sites across 189 gene-disorder-associated proteins. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites and revealed key inter-molecule interactions in protein complexes that were otherwise not annotated.
Using the currently largest autism, developmental disorders, and epilepsies exome sequencing studies including >360 000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 NDDs and identified 14 377 Essential3D sites in these. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection to enhance variant interpretation and development of mechanistic hypotheses for 242 NDDs genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and encoded proteins.
Affiliation(s)
- Sumaiya Iqbal
  - The Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Tobias Brünger
  - Cologne Center for Genomics, University of Cologne, 50923 Köln, Germany
- Eduardo Pérez-Palma
  - Universidad del Desarrollo, Centro de Genética y Genómica, Facultad de Medicina Clínica Alemana, 7610658 Las Condes, Santiago de Chile, Chile
- Marie Macnee
  - Cologne Center for Genomics, University of Cologne, 50923 Köln, Germany
- Andreas Brunklaus
  - The Paediatric Neurosciences Research Group, Royal Hospital for Children, Glasgow G12 8QQ, UK
  - School of Health and Wellbeing, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Mark J Daly
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
  - Institute for Molecular Medicine Finland (FIMM), Centre of Excellence in Complex Disease Genetics, University of Helsinki, 00100 Helsinki, Finland
- Arthur J Campbell
  - The Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- David Hoksza
  - Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, 110 00 Staré Město, Czechia, Czech Republic
- Patrick May
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg
- Dennis Lal
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Cologne Center for Genomics, University of Cologne, 50923 Köln, Germany
  - Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Genomic Medicine Institute, Lerner Research Institute Cleveland Clinic, Cleveland, OH 44106, USA
5
Unni P, Friend J, Weinberg J, Okur V, Hochscherf J, Dominguez I. Predictive functional, statistical and structural analysis of CSNK2A1 and CSNK2B variants linked to neurodevelopmental diseases. Front Mol Biosci 2022; 9:851547. [PMID: 36310603 PMCID: PMC9608649 DOI: 10.3389/fmolb.2022.851547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2022] [Accepted: 06/29/2022] [Indexed: 12/02/2022]
Abstract
Okur-Chung Neurodevelopmental Syndrome (OCNDS) and Poirier-Bienvenu Neurodevelopmental Syndrome (POBINDS) were recently identified as rare neurodevelopmental disorders. OCNDS and POBINDS are associated with heterozygous mutations in the CSNK2A1 and CSNK2B genes which encode CK2α, a serine/threonine protein kinase, and CK2β, a regulatory protein, respectively, which together can form a tetrameric enzyme called protein kinase CK2. A challenge in OCNDS and POBINDS is to understand the genetic basis of these diseases and the effect of the various CK2α and CK2β mutations. In this study we have collected all variants available to date in CSNK2A1 and CSNK2B, and identified hotspots. We have investigated CK2α and CK2β missense mutations through prediction programs which consider the evolutionary conservation, functionality and structure of these two proteins, compared these results with published experimental data on CK2α and CK2β mutants, and suggested prediction programs that could help predict changes in functionality of CK2α mutants. We also investigated the potential effect of CK2α and CK2β mutations on the 3D structure of the proteins and on their binding to each other. These results indicate that there are functional and structural consequences of mutation of CK2α and CK2β, and provide a rationale for further study of OCNDS and POBINDS-associated mutations. These data contribute to understanding the genetic and functional basis of these diseases, which is needed to identify their underlying mechanisms.
Affiliation(s)
- Prasida Unni
  - Department of Medicine, Boston University School of Medicine and Boston Medical Center, Boston University, Boston, MA, United States
- Jack Friend
  - Department of Medicine, Boston University School of Medicine and Boston Medical Center, Boston University, Boston, MA, United States
- Janice Weinberg
  - Department of Biostatistics, Boston University School of Public Health, Boston University, Boston, MA, United States
- Volkan Okur
  - New York Genome Center, New York, NY, United States
- Jennifer Hochscherf
  - Department of Chemistry, Institute of Biochemistry, University of Cologne, Cologne, Germany
- Isabel Dominguez
  - Department of Medicine, Boston University School of Medicine and Boston Medical Center, Boston University, Boston, MA, United States
  - *Correspondence: Isabel Dominguez,
6
Nussinov R, Zhang M, Maloney R, Liu Y, Tsai CJ, Jang H. Allostery: Allosteric Cancer Drivers and Innovative Allosteric Drugs. J Mol Biol 2022; 434:167569. [PMID: 35378118 PMCID: PMC9398924 DOI: 10.1016/j.jmb.2022.167569] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 02/03/2022] [Revised: 03/11/2022] [Accepted: 03/25/2022] [Indexed: 01/12/2023]
Abstract
Here, we discuss the principles of allosteric activating mutations, propagation downstream of the signals that they prompt, and allosteric drugs, with examples from the Ras signaling network. We focus on Abl kinase where mutations shift the landscape toward the active, imatinib binding-incompetent conformation, likely resulting in the high affinity ATP outcompeting drug binding. Recent pharmacological innovation extends to allosteric inhibitor (GNF-5)-linked PROTAC, targeting Bcr-Abl1 myristoylation site, and broadly, allosteric heterobifunctional degraders that destroy targets, rather than inhibiting them. Designed chemical linkers in bifunctional degraders can connect the allosteric ligand that binds the target protein and the E3 ubiquitin ligase warhead anchor. The physical properties and favored conformational state of the engineered linker can precisely coordinate the distance and orientation between the target and the recruited E3. Allosteric PROTACs, noncompetitive molecular glues, and bitopic ligands, with covalent links of allosteric ligands and orthosteric warheads, increase the effective local concentration of productively oriented and placed ligands. Through covalent chemical or peptide linkers, allosteric drugs can collaborate with competitive drugs, degrader anchors, or other molecules of choice, driving innovative drug discovery.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA; Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
- Mingzhen Zhang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA
- Ryan Maloney
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA
- Yonglan Liu
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA
- Chung-Jung Tsai
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA
- Hyunbum Jang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD 21702, USA
7
Cisneros LH, Vaske C, Bussey KJ. Identification of a signature of evolutionarily conserved stress-induced mutagenesis in cancer. Front Genet 2022; 13:932763. [PMID: 36147501 PMCID: PMC9488704 DOI: 10.3389/fgene.2022.932763] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 08/05/2022] [Indexed: 11/13/2022]
Abstract
The clustering of mutations observed in cancer cells is reminiscent of the stress-induced mutagenesis (SIM) response in bacteria. Bacteria deploy SIM when faced with DNA double-strand breaks in the presence of conditions that elicit an SOS response. SIM employs DinB, the evolutionary precursor to human trans-lesion synthesis (TLS) error-prone polymerases, and results in mutations concentrated around DNA double-strand breaks with an abundance that decays with distance. We performed a quantitative study on single nucleotide variant calls for whole-genome sequencing data from 1950 tumors, non-inherited mutations from 129 normal samples, and acquired mutations in 3 cell line models of stress-induced adaptive mutation. We introduce statistical methods to identify mutational clusters, quantify their shapes and tease out the potential mechanism that produced them. Our results show that mutations in both normal and cancer samples are indeed clustered and have shapes indicative of SIM. Clusters in normal samples occur more often in the same genomic location across samples than in cancer suggesting loss of regulation over the mutational process during carcinogenesis. Additionally, the signatures of TLS contribute the most to mutational cluster formation in both patient samples as well as experimental models of SIM. Furthermore, a measure of cluster shape heterogeneity was associated with cancer patient survival with a hazard ratio of 5.744 (Cox Proportional Hazard Regression, 95% CI: 1.824-18.09). Our results support the conclusion that the ancient and evolutionary-conserved adaptive mutation response found in bacteria is a source of genomic instability in cancer. Biological adaptation through SIM might explain the ability of tumors to evolve in the face of strong selective pressures such as treatment and suggests that the conventional 'hit it hard' approaches to therapy could prove themselves counterproductive.
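The reported hazard ratio and its confidence interval can be sanity-checked with a few lines of arithmetic. The sketch below assumes the interval is a standard Wald interval that is symmetric on the log scale (the abstract does not say how it was computed), so the derived standard error, z-score and p-value are approximations rather than values from the paper.

```python
import math

# Values quoted in the abstract.
hr, ci_low, ci_high = 5.744, 1.824, 18.09

# Assumption: 95% Wald interval, i.e. log(HR) +/- 1.96 * SE.
log_se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Under that assumption the point estimate is the geometric mean of the bounds.
implied_hr = math.sqrt(ci_low * ci_high)

# Approximate Wald z-score and two-sided p-value from the normal distribution.
z = math.log(hr) / log_se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"SE(log HR) ~ {log_se:.3f}, implied HR ~ {implied_hr:.3f}, "
      f"z ~ {z:.2f}, p ~ {p_two_sided:.4f}")
```

The geometric mean of the bounds matches the quoted hazard ratio closely, which is consistent with a log-scale interval; the width of the interval also shows that the estimate, while significant under this approximation, is imprecise.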
Affiliation(s)
- Luis H. Cisneros
  - NantOmics, LLC, Santa Cruz, CA, United States
  - The Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, AZ, United States
- Kimberly J. Bussey
  - NantOmics, LLC, Santa Cruz, CA, United States
  - The Beyond Center for Fundamental Concepts in Science, Arizona State University, Tempe, AZ, United States
  - Precision Medicine, Midwestern University, Glendale, AZ, United States
8
Jiang C, Sun XR, Feng J, Zhu SF, Shui W. Metagenomic analysis reveals the different characteristics of microbial communities inside and outside the karst tiankeng. BMC Microbiol 2022; 22:115. [PMID: 35473500 PMCID: PMC9040234 DOI: 10.1186/s12866-022-02513-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/25/2021] [Accepted: 03/28/2022] [Indexed: 01/10/2023]
Abstract
Background: Karst tiankengs serve as a reservoir of biodiversity in the degraded karst landscape areas. However, the microbial diversity of karst tiankengs is poorly understood. Here, we investigated the composition and function of the microbial community in a karst tiankeng. Results: We found that habitat differences inside and outside the karst tiankeng changed the composition and structure of the soil microbial communities, and the dominant phyla were Proteobacteria, Actinobacteria, Chloroflexi and Acidobacteria. The Shannon–Wiener diversity of microbial communities inside and outside the tiankeng was significantly different, and it was higher inside the tiankeng (IT). Venn and LEfSe analysis found that the soil microbial communities inside the tiankeng had 640 more endemic species and 39 more biomarker microbial clades than those identified outside of the tiankeng (OT). Functional prediction indicated that soil microorganisms outside the tiankeng had a high potential for carbohydrate metabolism, translation and amino acid metabolism. There were biomarker pathways associated with several human diseases at both IT and OT sites. Except for auxiliary activities (AA), other CAZy classes had higher abundance at IT sites, which can readily convert litter and fix carbon and nitrogen, thereby supporting the development of underground forests. The differences in microbial communities were mainly related to the soil water content and soil total nitrogen. Conclusions: Our results provide a metagenomic overview of the karst tiankeng system and provide new insights into habitat conservation and biodiversity restoration in the area. Supplementary Information: The online version contains supplementary material available at 10.1186/s12866-022-02513-1.
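For readers unfamiliar with the Shannon–Wiener diversity index mentioned in the abstract, it is H = -Σ p_i ln p_i, where p_i is the proportion of the community belonging to taxon i. The sketch below computes it for two made-up count vectors; the function name and the counts are illustrative assumptions, not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical communities with four taxa each.
even_community = [25, 25, 25, 25]    # evenly distributed abundances
skewed_community = [97, 1, 1, 1]     # dominated by a single taxon

h_even = shannon_index(even_community)
h_skewed = shannon_index(skewed_community)
```

For a perfectly even community the index equals the natural log of the number of taxa, and it falls as one taxon comes to dominate, which is why a higher value is read as higher diversity.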
Affiliation(s)
- Cong Jiang
  - College of Urban and Environmental Sciences, Peking University, Beijing, 100871, China
- Xiao-Rui Sun
  - College of Environment Safety Engineering, Fuzhou University, Fuzhou, 350116, China
- Jie Feng
  - College of Environment Safety Engineering, Fuzhou University, Fuzhou, 350116, China
- Su-Feng Zhu
  - Chinese Research Academy of Environmental Sciences, Beijing, 100020, China
- Wei Shui
  - College of Environment Safety Engineering, Fuzhou University, Fuzhou, 350116, China.
9
Kan Y, Jiang L, Tang J, Guo Y, Guo F. A systematic view of computational methods for identifying driver genes based on somatic mutation data. Brief Funct Genomics 2021; 20:333-343. [PMID: 34312663 DOI: 10.1093/bfgp/elab032] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/01/2021] [Revised: 06/16/2021] [Accepted: 06/22/2021] [Indexed: 11/13/2022]
Abstract
Abnormal changes in driver genes have serious consequences for human health and are important for biomedical research. Accurately identifying driver genes among the enormous number of mutated genes promotes accurate diagnosis and treatment of cancer. Many methods for uncovering driver genes have been developed over the past decades. By analyzing previous works, we find that computational methods are more efficient than traditional biological experiments when distinguishing driver genes from massive data. In this study, we summarize eight common computational algorithms that use only somatic mutation data. We first group these methods into three categories according to the mutation features they apply. Then, we outline a general process for nominating candidate cancer driver genes. Finally, we evaluate three representative methods on 10 kinds of cancer derived from The Cancer Genome Atlas Program and five Chinese projects from the International Cancer Genome Consortium. In addition, we compare results of methods with various parameters. Evaluation is performed from four perspectives: CGC, OG/TSG, Q-value and Q-Q (Quantile-Quantile) plot. To sum up, we present algorithms using somatic mutation data in order to offer a systematic view of various mutation features and lay the foundation for methods based on integration of mutation information and other types of data.
Affiliation(s)
- Yingxin Kan
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Limin Jiang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jijun Tang
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  - School of Computational Science and Engineering, University of South Carolina, Columbia, U.S
- Yan Guo
  - Comprehensive cancer center, Department of Internal Medicine, University of New Mexico, Albuquerque, U.S
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, China
10
Shorthouse D, Hall MWJ, Hall BA. Computational Saturation Screen Reveals the Landscape of Mutations in Human Fumarate Hydratase. J Chem Inf Model 2021; 61:1970-1980. [DOI: 10.1021/acs.jcim.1c00063] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Affiliation(s)
- David Shorthouse
  - Department of Medical Physics and Biomedical Engineering, UCL, London WC1E 6BT, U.K
- Michael W. J. Hall
  - MRC Cancer Unit, University of Cambridge, Cambridge CB2 0XZ, U.K
  - Wellcome Trust Sanger Institute, Hinxton CB10 1SA, U.K
- Benjamin A. Hall
  - Department of Medical Physics and Biomedical Engineering, UCL, London WC1E 6BT, U.K
11
Martinez-Ledesma E, Flores D, Trevino V. Computational methods for detecting cancer hotspots. Comput Struct Biotechnol J 2020; 18:3567-3576. [PMID: 33304455 PMCID: PMC7711189 DOI: 10.1016/j.csbj.2020.11.020] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/13/2020] [Revised: 11/12/2020] [Accepted: 11/13/2020] [Indexed: 12/14/2022]
Abstract
Cancer mutations that are recurrently observed among patients are known as hotspots. Hotspots are highly relevant because they are presumably functional. Known hotspots in BRAF, PIK3CA, TP53, KRAS and IDH1 support this idea. However, hundreds of hotspots have never been validated experimentally. The detection of hotspots is nevertheless challenging because background mutations obscure their statistical and computational identification. Although several algorithms have been applied to identify hotspots, they have not been reviewed before. Thus, in this mini-review, we summarize more than 40 computational methods applied to detect cancer hotspots in coding and non-coding DNA. We first organize the methods into cluster-based, 3D, position-specific, and miscellaneous categories to provide a general overview. Then, we describe their embedded procedures, implementations, variations, and differences. Finally, we discuss some advantages, provide some ideas for future developments, and mention opportunities such as application to viral integrations, translocations, and epigenetics.
Affiliation(s)
- Emmanuel Martinez-Ledesma
  - Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
- David Flores
  - Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
  - Universidad del Caribe, Departamento de Ciencias Básicas e Ingenierías, Cancún, Quintana Roo, Mexico
- Victor Trevino
  - Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
12
Kobren SN, Chazelle B, Singh M. PertInInt: An Integrative, Analytical Approach to Rapidly Uncover Cancer Driver Genes with Perturbed Interactions and Functionalities. Cell Syst 2020; 11:63-74.e7. [PMID: 32711844 PMCID: PMC7493809 DOI: 10.1016/j.cels.2020.06.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/02/2019] [Revised: 02/23/2020] [Accepted: 06/05/2020] [Indexed: 12/12/2022]
Abstract
A major challenge in cancer genomics is to identify genes with functional roles in cancer and uncover their mechanisms of action. We introduce an integrative framework that identifies cancer-relevant genes by pinpointing those whose interaction or other functional sites are enriched in somatic mutations across tumors. We derive analytical calculations that enable us to avoid time-prohibitive permutation-based significance tests, making it computationally feasible to simultaneously consider multiple measures of protein site functionality. Our accompanying software, PertInInt, combines knowledge about sites participating in interactions with DNA, RNA, peptides, ions, or small molecules with domain, evolutionary conservation, and gene-level mutation data. When applied to 10,037 tumor samples, PertInInt uncovers both known and newly predicted cancer genes, while additionally revealing what types of interactions or other functionalities are disrupted. PertInInt’s analysis demonstrates that somatic mutations are frequently enriched in interaction sites and domains and implicates interaction perturbation as a pervasive cancer-driving event. A fast, analytical framework called PertInInt enables efficient integration of multiple measures of protein site functionality—including interaction, domain, and evolutionary conservation—with gene-level mutation data in order to rapidly detect cancer driver genes along with their disrupted functionalities.
Affiliation(s)
- Shilpa Nadimpalli Kobren
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Bernard Chazelle
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
13
Zhang J, Kim EC, Chen C, Procko E, Pant S, Lam K, Patel J, Choi R, Hong M, Joshi D, Bolton E, Tajkhorshid E, Chung HJ. Identifying mutation hotspots reveals pathogenetic mechanisms of KCNQ2 epileptic encephalopathy. Sci Rep 2020; 10:4756. [PMID: 32179837 PMCID: PMC7075958 DOI: 10.1038/s41598-020-61697-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/01/2019] [Accepted: 03/02/2020] [Indexed: 11/08/2022]
Abstract
Kv7 channels are enriched at the axonal plasma membrane where their voltage-dependent potassium currents suppress neuronal excitability. Mutations in Kv7.2 and Kv7.3 subunits cause epileptic encephalopathy (EE), yet the underlying pathogenetic mechanism is unclear. Here, we used novel statistical algorithms and structural modeling to identify EE mutation hotspots in key functional domains of Kv7.2 including voltage sensing S4, the pore loop and S6 in the pore domain, and intracellular calmodulin-binding helix B and helix B-C linker. Characterization of selected EE mutations from these hotspots revealed that L203P at S4 induces a large depolarizing shift in voltage dependence of Kv7.2 channels and L268F at the pore decreases their current densities. While L268F severely reduces expression of heteromeric channels in hippocampal neurons without affecting internalization, K552T and R553L mutations at distal helix B decrease calmodulin-binding and axonal enrichment. Importantly, L268F, K552T, and R553L mutations disrupt current potentiation by increasing phosphatidylinositol 4,5-bisphosphate (PIP2), and our molecular dynamics simulation suggests PIP2 interaction with these residues. Together, these findings demonstrate that each EE variant causes a unique combination of defects in Kv7 channel function and neuronal expression, and suggest a critical need for both prediction algorithms and experimental interrogations to understand pathophysiology of Kv7-associated EE.
Affiliation(s)
- Jiaren Zhang
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Eung Chang Kim
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Congcong Chen
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Erik Procko
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Shashank Pant
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Kin Lam
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Jaimin Patel
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Rebecca Choi
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Mary Hong
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Dhruv Joshi
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Eric Bolton
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Emad Tajkhorshid
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Hee Jung Chung
  - Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
  - Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
14
Nussinov R, Tsai C, Jang H. Autoinhibition can identify rare driver mutations and advise pharmacology. FASEB J 2019; 34:16-29. [DOI: 10.1096/fj.201901341r] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/28/2019] [Revised: 09/18/2019] [Accepted: 10/09/2019] [Indexed: 12/16/2022]
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung‐Jung Tsai
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Hyunbum Jang
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
15
Lu X, Qian X, Li X, Miao Q, Peng S. DMCM: a Data-adaptive Mutation Clustering Method to identify cancer-related mutation clusters. Bioinformatics 2019; 35:389-397. [PMID: 30010784 DOI: 10.1093/bioinformatics/bty624] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/07/2018] [Accepted: 07/12/2018] [Indexed: 12/11/2022]
Abstract
Motivation: Functional somatic mutations within coding amino acid sequences confer a growth advantage in the pathogenic process. Most existing methods for identifying cancer-related mutations focus on the single amino acid or the entire gene level. However, gain-of-function mutations often cluster in specific protein regions rather than occurring independently along the amino acid sequence. Some approaches that identify mutation clusters from the mutation density along the amino acid chain have been proposed recently, but their performance in identifying mutation clusters remains to be improved. Results: Here we present a Data-adaptive Mutation Clustering Method (DMCM), in which a kernel density estimate (KDE) with a data-adaptive bandwidth is applied to estimate the mutation density and to find variable clusters with different lengths on amino acid sequences. We apply this approach to the mutation data of 571 genes in over twenty cancer types from The Cancer Genome Atlas (TCGA). We compare DMCM with M2C, OncodriveCLUST and Pfam Domain and find that DMCM tends to identify more significant clusters. The cross-validation analysis shows that DMCM is robust, and the cluster cancer type enrichment analysis shows that specific cancer types are enriched for specific mutation clusters. Availability and implementation: DMCM is written in Python and the analysis methods of DMCM are written in R. They are all released online, available through https://github.com/XinguoLu/DMCM. Supplementary information: Supplementary data are available at Bioinformatics online.
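To make the idea of a kernel density estimate with a data-adaptive bandwidth concrete, the following is a minimal sketch in plain Python. It is not the DMCM implementation: it assumes an Abramson-style rule (each mutation gets a bandwidth that shrinks where a fixed-bandwidth pilot density is high), and the function names, the pilot bandwidth `h0`, the density threshold, and the toy positions are all hypothetical choices made for illustration.

```python
import math


def gaussian(u):
    """Standard normal kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def adaptive_kde(positions, grid, h0=3.0):
    """Adaptive-bandwidth KDE over residue positions (Abramson-style sketch).

    h0 is an assumed pilot bandwidth in residues, not a DMCM parameter.
    """
    n = len(positions)
    # Pilot density at each observed mutation, using the fixed bandwidth h0.
    pilot = [sum(gaussian((x - y) / h0) for y in positions) / (n * h0)
             for x in positions]
    # Geometric mean of the pilot densities normalises the local bandwidths.
    g = math.exp(sum(math.log(p) for p in pilot) / n)
    # Dense regions get narrower kernels, sparse regions get wider ones.
    local_h = [h0 * math.sqrt(g / p) for p in pilot]
    return [sum(gaussian((x - xi) / hi) / hi
                for xi, hi in zip(positions, local_h)) / n
            for x in grid]


def clusters_above(grid, density, threshold):
    """Return (start, end) runs of grid positions with density >= threshold."""
    runs, start, prev = [], None, None
    for pos, d in zip(grid, density):
        if d >= threshold and start is None:
            start = pos
        elif d < threshold and start is not None:
            runs.append((start, prev))
            start = None
        prev = pos
    if start is not None:
        runs.append((start, prev))
    return runs


# Toy example: two tight groups of mutations and one isolated mutation.
positions = [12, 13, 13, 14, 15, 80, 200, 201, 202, 202, 203]
grid = list(range(1, 251))
density = adaptive_kde(positions, grid)
runs = clusters_above(grid, density, threshold=0.02)
```

In this toy input the isolated mutation at position 80 receives a wider kernel and a lower peak, so only the two dense groups exceed the threshold. A real analysis would also need a significance test for each candidate cluster, which this sketch omits.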
Affiliation(s)
- Xinguo Lu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xin Qian
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xing Li
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Qiumai Miao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Shaoliang Peng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; School of Computer Science, National University of Defense Technology, Changsha, China
16
Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures. Proc Natl Acad Sci U S A 2019; 116:18962-18970. [PMID: 31462496 PMCID: PMC6754584 DOI: 10.1073/pnas.1901156116] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 12/12/2022]
Abstract
Large-scale exome sequencing of tumors has enabled the identification of cancer drivers using recurrence-based approaches. Some of these methods also employ 3D protein structures to identify mutational hotspots in cancer-associated genes. In determining such mutational clusters in structures, existing approaches overlook protein dynamics, despite its essential role in protein function. We present a framework to identify cancer driver genes using a dynamics-based search of mutational hotspot communities. Mutations are mapped to protein structures, which are partitioned into distinct residue communities. These communities are identified in a framework where residue-residue contact edges are weighted by correlated motions (as inferred by dynamics-based models). We then search for signals of positive selection among these residue communities to identify putative driver genes, while applying our method to the TCGA (The Cancer Genome Atlas) PanCancer Atlas missense mutation catalog. Overall, we predict 1 or more mutational hotspots within the resolved structures of proteins encoded by 434 genes. These genes were enriched among biological processes associated with tumor progression. Additionally, a comparison between our approach and existing cancer hotspot detection methods using structural data suggests that including protein dynamics significantly increases the sensitivity of driver detection.
17
Majuta SN, Li C, Jayasundara K, Kiani Karanji A, Attanayake K, Ranganathan N, Li P, Valentine SJ. Rapid Solution-Phase Hydrogen/Deuterium Exchange for Metabolite Compound Identification. J Am Soc Mass Spectrom 2019; 30:1102-1114. [PMID: 30980382 DOI: 10.1007/s13361-019-02163-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/09/2019] [Revised: 02/15/2019] [Accepted: 02/16/2019] [Indexed: 05/25/2023]
Abstract
Rapid, solution-phase hydrogen/deuterium exchange (HDX) coupled with mass spectrometry (MS) is demonstrated as a means for distinguishing small-molecule metabolites. HDX is achieved using capillary vibrating sharp-edge spray ionization (cVSSI) to allow sufficient time for reagent mixing and exchange in micrometer-sized droplets. Different compounds are observed to incorporate deuterium with varying efficiencies resulting in unique isotopic patterns as revealed in the MS spectra. Using linear regression techniques, parameters representing contribution to exchange by different hydrogen types can be computed. In this proof-of-concept study, the exchange parameters are shown to be useful in the retrodiction of the amount of deuterium incorporated within different compounds. On average, the exchange parameters retrodict the exchange level with ~ 2.2-fold greater accuracy than treating all exchangeable hydrogens equally. The parameters can be used to produce hypothetical isotopic distributions that agree (± 16% RMSD) with experimental measurements. These initial studies are discussed in light of their potential value for identifying challenging metabolite species.
Affiliation(s)
- Sandra N Majuta
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Chong Li
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Kinkini Jayasundara
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Ahmad Kiani Karanji
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Kushani Attanayake
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Nandhini Ranganathan
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Peng Li
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
- Stephen J Valentine
  - C. Eugene Bennett Department of Chemistry, West Virginia University, Morgantown, WV, 26506, USA
18
Review: Precision medicine and driver mutations: Computational methods, functional assays and conformational principles for interpreting cancer drivers. PLoS Comput Biol 2019; 15:e1006658. [PMID: 30921324 PMCID: PMC6438456 DOI: 10.1371/journal.pcbi.1006658] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Indexed: 12/19/2022]
Abstract
At the root of the so-called precision medicine or precision oncology, which is our focus here, is the hypothesis that cancer treatment would be considerably better if therapies were guided by a tumor’s genomic alterations. This hypothesis has sparked major initiatives focusing on whole-genome and/or exome sequencing, creation of large databases, and developing tools for their statistical analyses—all aspiring to identify actionable alterations, and thus molecular targets, in a patient. At the center of the massive amount of collected sequence data is their interpretations that largely rest on statistical analysis and phenotypic observations. Statistics is vital, because it guides identification of cancer-driving alterations. However, statistics of mutations do not identify a change in protein conformation; therefore, it may not define sufficiently accurate actionable mutations, neglecting those that are rare. Among the many thematic overviews of precision oncology, this review innovates by further comprehensively including precision pharmacology, and within this framework, articulating its protein structural landscape and consequences to cellular signaling pathways. It provides the underlying physicochemical basis, thereby also opening the door to a broader community.
19
Luo B, Edge AK, Tolg C, Turley EA, Dean CB, Hill KA, Kulperger RJ. Spatial statistical tools for genome-wide mutation cluster detection under a microarray probe sampling system. PLoS One 2018; 13:e0204156. [PMID: 30252889 PMCID: PMC6155535 DOI: 10.1371/journal.pone.0204156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2018] [Accepted: 09/04/2018] [Indexed: 11/30/2022]
Abstract
Mutation cluster analysis is critical for understanding certain mutational mechanisms relevant to genetic disease, diversity, and evolution. Yet, whole genome sequencing for detection of mutation clusters is prohibitive with high cost for most organisms and population surveys. Single nucleotide polymorphism (SNP) genotyping arrays, like the Mouse Diversity Genotyping Array, offer an alternative, low-cost screening for mutations at hundreds of thousands of loci across the genome using experimental designs that permit capture of de novo mutations in any tissue. Formal statistical tools for genome-wide detection of mutation clusters under a microarray probe sampling system are yet to be established. A challenge in the development of statistical methods is that microarray detection of mutation clusters is constrained to select SNP loci captured by probes on the array. This paper develops a Monte Carlo framework for cluster testing and assesses test statistics for capturing potential deviations from spatial randomness which are motivated by, and incorporate, the array design. While null distributions of the test statistics are established under spatial randomness via the homogeneous Poisson process, power performance of the test statistics is evaluated under postulated types of Neyman-Scott clustering processes through Monte Carlo simulation. A new statistic is developed and recommended as a screening tool for mutation cluster detection. The statistic is demonstrated to be excellent in terms of its robustness and power performance, and useful for cluster analysis in settings of missing data. The test statistic can also be generalized to any one-dimensional system where every site is observed, such as DNA sequencing data. The paper illustrates how the informal graphical tools for detecting clusters may be misleading. The statistic is used for finding clusters of putative SNP differences in a mixture of different mouse genetic backgrounds and clusters of de novo SNP differences arising between tissues with development and carcinogenesis.
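The general shape of a Monte Carlo test against spatial randomness can be sketched as follows. This is an illustration only and does not reproduce the paper's statistic or its treatment of the array design: it assumes a fully observed one-dimensional interval, uses the mean nearest-neighbour distance as a stand-in test statistic, and relies on the standard fact that, conditional on the number of events, a homogeneous Poisson process places them independently and uniformly. All names and parameter values are hypothetical.

```python
import random


def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour on a line."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    nearest = []
    for i in range(len(pts)):
        left = gaps[i - 1] if i > 0 else float("inf")
        right = gaps[i] if i < len(gaps) else float("inf")
        nearest.append(min(left, right))
    return sum(nearest) / len(nearest)


def monte_carlo_p(observed, length, n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value for clustering (small distances).

    Null model: the same number of events placed uniformly on [0, length],
    i.e. a homogeneous Poisson process conditioned on the event count.
    Requires at least two observed events.
    """
    rng = random.Random(seed)
    n = len(observed)
    t_obs = mean_nn_distance(observed)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.uniform(0, length) for _ in range(n)]
        if mean_nn_distance(simulated) <= t_obs:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_sim + 1)


# Toy inputs: two tight clumps versus evenly spaced events on [0, 1000].
clustered = [100 + 0.5 * i for i in range(10)] + [700 + 0.5 * i for i in range(10)]
even = [25 + 50 * i for i in range(20)]
p_clustered = monte_carlo_p(clustered, length=1000)
p_even = monte_carlo_p(even, length=1000)
```

The clumped input should give a small p-value and the evenly spaced input a large one. Handling probe-restricted sampling, as the abstract describes, would require simulating the null only over the loci that the array can observe.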
Affiliation(s)
- Bin Luo
  - Department of Statistical and Actuarial Sciences, Western University, London, Ontario, Canada
- Alanna K. Edge
  - Department of Biology, Western University, London, Ontario, Canada
- Cornelia Tolg
  - London Regional Cancer Program, Lawson Health Research Institute, London, Ontario, Canada
- Eva A. Turley
  - London Regional Cancer Program, Lawson Health Research Institute, London, Ontario, Canada
- C. B. Dean
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Kathleen A. Hill
  - Department of Biology, Western University, London, Ontario, Canada
- R. J. Kulperger
  - Department of Statistical and Actuarial Sciences, Western University, London, Ontario, Canada
20
No major role for rare plectin variants in arrhythmogenic right ventricular cardiomyopathy. PLoS One 2018; 13:e0203078. [PMID: 30161220 PMCID: PMC6117038 DOI: 10.1371/journal.pone.0203078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/15/2018] [Accepted: 08/14/2018] [Indexed: 11/19/2022]
Abstract
Aims: Likely pathogenic/pathogenic variants in genes encoding desmosomal proteins play an important role in the pathophysiology of arrhythmogenic right ventricular cardiomyopathy (ARVC). However, for a substantial proportion of ARVC patients, the genetic substrate remains unknown. We hypothesized that plectin, a cytolinker protein encoded by the PLEC gene, could play a role in ARVC because it has been proposed to link the desmosomal protein desmoplakin to the cytoskeleton and therefore has a potential function in the desmosomal structure. Methods: We screened PLEC in 359 ARVC patients and compared the frequency of rare coding PLEC variants (minor allele frequency [MAF] <0.001) between patients and controls. To assess the frequency of rare variants in the control population, we evaluated the rare coding variants (MAF <0.001) found in the European cohort of the Exome Aggregation Database. We further evaluated plectin localization by immunofluorescence in a subset of patients with and without a PLEC variant. Results: Forty ARVC patients carried one or more rare PLEC variants (11%, 40/359). However, rare variants also seem to occur frequently in the control population (18%, 4754/26197 individuals). Nor did we find a difference in the prevalence of rare PLEC variants in ARVC patients with or without a desmosomal likely pathogenic/pathogenic variant (14% versus 8%, respectively). However, immunofluorescence analysis did show decreased plectin junctional localization in myocardial tissue from 5 ARVC patients with PLEC variants. Conclusions: Although PLEC has been hypothesized as a promising candidate gene for ARVC, our current study did not show an enrichment of rare PLEC variants in ARVC patients compared to controls and therefore does not support a major role for PLEC in this disorder. Although rare PLEC variants were associated with abnormal localization in cardiac tissue, the confluence of data does not support a role for plectin abnormalities in ARVC development.
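The carrier frequencies quoted in the Results can be re-derived from the reported counts. The short sketch below does that arithmetic and, as an added illustration, computes a pooled two-proportion z statistic with a one-sided p-value for enrichment in patients. The z-test is an assumption made here for illustration; the abstract does not state which test the authors used, and the variable names are hypothetical.

```python
import math

# Counts as reported in the abstract.
patients_with_variant, patients_total = 40, 359
controls_with_variant, controls_total = 4754, 26197

p_patients = patients_with_variant / patients_total
p_controls = controls_with_variant / controls_total

# Pooled two-proportion z statistic (illustrative choice of test).
pooled = (patients_with_variant + controls_with_variant) / (patients_total + controls_total)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / patients_total + 1 / controls_total))
z = (p_patients - p_controls) / std_err

# One-sided p-value for the alternative "patients carry rare variants more often".
p_enrichment = 0.5 * math.erfc(z / math.sqrt(2))

print(f"patients: {p_patients:.1%}, controls: {p_controls:.1%}")
print(f"z = {z:.2f}, one-sided enrichment p = {p_enrichment:.3f}")
```

The proportions round to the 11% and 18% given in the abstract, and the negative z statistic reflects that rare variants are, if anything, less frequent in patients than in controls, which is consistent with the stated conclusion of no enrichment.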
21
Lammel DR, Barth G, Ovaskainen O, Cruz LM, Zanatta JA, Ryo M, de Souza EM, Pedrosa FO. Direct and indirect effects of a pH gradient bring insights into the mechanisms driving prokaryotic community structures. Microbiome 2018; 6:106. [PMID: 29891000 PMCID: PMC5996553 DOI: 10.1186/s40168-018-0482-8] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 12/14/2017] [Accepted: 05/14/2018] [Indexed: 05/13/2023]
Abstract
BACKGROUND: pH is frequently reported as the main driver for prokaryotic community structure in soils. However, pH changes are also linked to "spillover effects" on other chemical parameters (e.g., availability of Al, Fe, Mn, Zn, and Cu) and plant growth, but these indirect effects on the microbial communities are rarely investigated. Usually, pH also co-varies with some confounding factors, such as land use, soil management (e.g., tillage and chemical inputs), plant cover, and/or edapho-climatic conditions. So, a more comprehensive analysis of the direct and indirect effects of pH brings a better understanding of the mechanisms driving prokaryotic (archaeal and bacterial) community structures. RESULTS: We evaluated an agricultural soil pH gradient (from 4 to 6, the typical range for tropical farms), in a liming gradient with confounding factors minimized, investigating relationships between prokaryotic communities (16S rRNA) and physical-chemical parameters (indirect effects). Correlations, hierarchical modeling of species communities (HMSC), and random forest (RF) modeling indicated that both direct and indirect effects of the pH gradient affected the prokaryotic communities. Some OTUs were more affected by the pH changes (e.g., some Actinobacteria), while others were more affected by the indirect pH effects (e.g., some Proteobacteria). HMSC detected a phylogenetic signal related to the effects. Both HMSC and RF indicated that the main indirect effect was the pH changes on the availability of some elements (e.g., Al, Fe, and Cu), and secondarily, effects on plant growth and nutrient cycling also affected the OTUs. Additionally, we found that some of the OTUs that responded to pH also correlated with CO2, CH4, and N2O greenhouse gas fluxes. CONCLUSIONS: Our results indicate that there are two distinct pH-related mechanisms driving prokaryotic community structures, the direct effect and "spillover effects" of pH (indirect effects). Moreover, the indirect effects are highly relevant for some OTUs and consequently for the community structure; therefore, it is a mechanism that should be further investigated in microbial ecology.
Affiliation(s)
- Daniel R Lammel
  - Department of Biochemistry and Molecular Biology, Universidade Federal do Paraná (UFPR), Curitiba, Brazil
  - Department of Soils and Agricultural Engineer, UFPR, Curitiba, Brazil
  - Freie Universität Berlin and Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Otso Ovaskainen
  - Department of Biosciences, University of Helsinki, PO Box 65, 00014, Helsinki, Finland
  - Department of Biology, Centre for Biodiversity Dynamics, Norwegian University of Science and Technology, 7491, Trondheim, Norway
- Leonardo M Cruz
  - Department of Biochemistry and Molecular Biology, Universidade Federal do Paraná (UFPR), Curitiba, Brazil
- Masahiro Ryo
  - Freie Universität Berlin and Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Emanuel M de Souza
  - Department of Biochemistry and Molecular Biology, Universidade Federal do Paraná (UFPR), Curitiba, Brazil
- Fábio O Pedrosa
  - Department of Biochemistry and Molecular Biology, Universidade Federal do Paraná (UFPR), Curitiba, Brazil
22
Gagunashvili AN, Andrésson ÓS. Distinctive characters of Nostoc genomes in cyanolichens. BMC Genomics 2018; 19:434. [PMID: 29866043 PMCID: PMC5987646 DOI: 10.1186/s12864-018-4743-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/12/2017] [Accepted: 04/30/2018] [Indexed: 11/25/2022]
Abstract
BACKGROUND: Cyanobacteria of the genus Nostoc are capable of forming symbioses with a wide range of organisms, including a diverse assemblage of cyanolichens. Only certain lineages of Nostoc appear to be able to form a close, stable symbiosis, raising the question whether symbiotic competence is determined by specific sets of genes and functionalities. RESULTS: We present the complete genome sequencing, annotation and analysis of two lichen Nostoc strains. Comparison with other Nostoc genomes allowed identification of genes potentially involved in symbioses with a broad range of partners including lichen mycobionts. The presence of additional genes necessary for symbiotic competence is likely reflected in larger genome sizes of symbiotic Nostoc strains. Some of the identified genes are presumably involved in the initial recognition and establishment of the symbiotic association, while others may confer advantage to cyanobionts during cohabitation with a mycobiont in the lichen symbiosis. CONCLUSIONS: Our study presents the first genome sequencing and genome-scale analysis of lichen-associated Nostoc strains. These data provide insight into the molecular nature of the cyanolichen symbiosis and pinpoint candidate genes for further studies aimed at deciphering the genetic mechanisms behind the symbiotic competence of Nostoc. Since many phylogenetic studies have shown that Nostoc is a polyphyletic group that includes several lineages, this work also provides an improved molecular basis for demarcation of a Nostoc clade with symbiotic competence.
Affiliation(s)
- Andrey N. Gagunashvili
  - Faculty of Life and Environmental Sciences, University of Iceland, Sturlugata 7, Reykjavík, 101 Iceland
- Ólafur S. Andrésson
  - Faculty of Life and Environmental Sciences, University of Iceland, Sturlugata 7, Reykjavík, 101 Iceland
23
K T, N KV, S S. Distribution based Fuzzy Estimate Spectral Clustering for Cancer Detection with Protein Sequence and Structural Motifs. Asian Pac J Cancer Prev 2018; 19:1935-1940. [PMID: 30051675 PMCID: PMC6165630 DOI: 10.22034/apjcp.2018.19.7.1935] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
Objective: In biological data analysis, protein sequence and structural motifs are widespread amino-acid sequence patterns that are used as tools for detecting cancer at an early stage. To improve cancer detection with minimum space and time complexity, the Distribution based Fuzzy Estimate Spectral Clustering (DFESC) technique is developed. Methods: Initially, the protein sequence motifs are taken from the dataset to form clusters. Distribution based spectral clustering is applied to group the protein sequences by measuring the generalized Jaccard similarity between protein sequences. To improve clustering accuracy, a soft computing technique, namely fuzzy logic, is applied to calculate the membership value of each sequence motif. Results: The presented DFESC technique effectively identifies cancer in terms of clustering accuracy, false positive rate, and cancer detection time and space complexity. Conclusion: Based on these observations, the DFESC technique provides improved results for early detection of cancer using protein sequence and structural motifs.
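The similarity measure named in the Methods can be illustrated with a minimal sketch. The abstract does not say how sequences are encoded, so the k-mer count representation and the function names `kmer_counts` and `generalized_jaccard` below are assumptions made for illustration, not the authors' implementation. The generalized Jaccard similarity of two non-negative count vectors is the sum of element-wise minima divided by the sum of element-wise maxima.

```python
from collections import Counter


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a sequence (assumed encoding, not from the paper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def generalized_jaccard(a, b):
    """Generalized Jaccard similarity: sum of minima over sum of maxima."""
    keys = set(a) | set(b)
    num = sum(min(a[key], b[key]) for key in keys)  # Counter gives 0 for missing keys
    den = sum(max(a[key], b[key]) for key in keys)
    return num / den if den else 1.0  # two empty profiles are treated as identical


if __name__ == "__main__":
    # Hypothetical toy motifs, chosen only to exercise the function.
    print(generalized_jaccard(kmer_counts("ACDEFG"), kmer_counts("ACDEFH")))
```

A pairwise matrix of these similarities could then serve as the affinity input to a spectral clustering step; that step and the fuzzy membership calculation are not sketched here.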
Affiliation(s)
- Thenmozhi K
- Department of Computer Applications, Selvam College of Technology, Namakkal, TamilNadu, India,For Correspondence:
| | | | - Shanthi S
- Department of Computer Applications, Kongu Engineering College, Erode, TamilNadu, India
| |
|
24
|
Thomas LE, Hurley JJ, Meuser E, Jose S, Ashelford KE, Mort M, Idziaszczyk S, Maynard J, Brito HL, Harry M, Walters A, Raja M, Walton SJ, Dolwani S, Williams GT, Morgan M, Moorghen M, Clark SK, Sampson JR. Burden and Profile of Somatic Mutation in Duodenal Adenomas from Patients with Familial Adenomatous- and MUTYH-associated Polyposis. Clin Cancer Res 2017; 23:6721-6732. [PMID: 28790112 DOI: 10.1158/1078-0432.ccr-17-1269] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 05/02/2017] [Revised: 06/21/2017] [Accepted: 07/25/2017] [Indexed: 11/16/2022]
Abstract
Purpose: Duodenal polyposis and cancer are important causes of morbidity and mortality in familial adenomatous polyposis (FAP) and MUTYH-associated polyposis (MAP). This study aimed to comprehensively characterize somatic genetic changes in FAP and MAP duodenal adenomas to better understand duodenal tumorigenesis in these disorders. Experimental Design: Sixty-nine adenomas were biopsied during endoscopy in 16 FAP and 10 MAP patients with duodenal polyposis. Ten FAP and 10 MAP adenomas and matched blood DNA samples were exome sequenced, 42 further adenomas underwent targeted sequencing, and 47 were studied by array comparative genomic hybridization. Findings in FAP and MAP duodenal adenomas were compared with each other and to the reported mutational landscape in FAP and MAP colorectal adenomas. Results: MAP duodenal adenomas had significantly more protein-changing somatic mutations (P = 0.018), truncating mutations (P = 0.006), and copy number variants (P = 0.005) than FAP duodenal adenomas, even though MAP patients had lower Spigelman stage duodenal polyposis. Fifteen genes were significantly recurrently mutated. Targeted sequencing of APC, KRAS, PTCHD2, and PLCL1 identified further mutations in each of these genes in additional duodenal adenomas. In contrast to MAP and FAP colorectal adenomas, neither exome nor targeted sequencing identified WTX mutations (P = 0.0017). Conclusions: The mutational landscapes in FAP and MAP duodenal adenomas overlapped with, but had significant differences to those reported in colorectal adenomas. The significantly higher burden of somatic mutations in MAP than FAP duodenal adenomas despite lower Spigelman stage disease could increase cancer risk in the context of apparently less severe benign disease. Clin Cancer Res; 23(21); 6721-32. ©2017 AACR.
Affiliation(s)
- Laura E Thomas
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Joanna J Hurley
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom.,Department of Gastroenterology, Prince Charles Hospital, Merthyr Tydfil, United Kingdom
| | - Elena Meuser
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Sian Jose
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Kevin E Ashelford
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Matthew Mort
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Shelley Idziaszczyk
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Julie Maynard
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Helena Leon Brito
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Manon Harry
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Angharad Walters
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Meera Raja
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | | | - Sunil Dolwani
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom.,Division of Population Medicine, Cardiff University School of Medicine, Cardiff, United Kingdom
| | - Geraint T Williams
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom
| | - Meleri Morgan
- Department of Pathology, University Hospital for Wales, Cardiff, United Kingdom
| | - Morgan Moorghen
- The Polyposis Registry, St. Marks Hospital, Harrow, United Kingdom.,Department of Pathology, St. Marks Hospital, Harrow, United Kingdom
| | - Susan K Clark
- The Polyposis Registry, St. Marks Hospital, Harrow, United Kingdom.,Department of Surgery and Cancer, Faculty of Medicine, Imperial College, London, United Kingdom
| | - Julian R Sampson
- Institute of Medical Genetics, Division of Cancer and Genetics, Cardiff University, School of Medicine, Cardiff, United Kingdom.
| |
|
25
|
Comparison of algorithms for the detection of cancer drivers at subgene resolution. Nat Methods 2017; 14:782-788. [PMID: 28714987 DOI: 10.1038/nmeth.4364] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 01/18/2017] [Accepted: 06/16/2017] [Indexed: 12/19/2022]
Abstract
Understanding genetic events that lead to cancer initiation and progression remains one of the biggest challenges in cancer biology. Traditionally, most algorithms for cancer-driver identification look for genes that have more mutations than expected from the average background mutation rate. However, there is now a wide variety of methods that look for nonrandom distribution of mutations within proteins as a signal for the driving role of mutations in cancer. Here we classify and review such subgene-resolution algorithms, compare their findings on four distinct cancer data sets from The Cancer Genome Atlas and discuss how predictions from these algorithms can be interpreted in the emerging paradigms that challenge the simple dichotomy between driver and passenger genes.
|
26
|
Reassessment of Mendelian gene pathogenicity using 7,855 cardiomyopathy cases and 60,706 reference samples. Genet Med 2016; 19:192-203. [PMID: 27532257 PMCID: PMC5116235 DOI: 10.1038/gim.2016.90] [Citation(s) in RCA: 505] [Impact Index Per Article: 63.1] [Received: 03/22/2016] [Accepted: 05/10/2016] [Indexed: 11/26/2022]
Abstract
Purpose: The accurate interpretation of variation in Mendelian disease genes has lagged behind data generation as sequencing has become increasingly accessible. Ongoing large sequencing efforts present huge interpretive challenges, but they also provide an invaluable opportunity to characterize the spectrum and importance of rare variation. Methods: We analyzed sequence data from 7,855 clinical cardiomyopathy cases and 60,706 Exome Aggregation Consortium (ExAC) reference samples to obtain a better understanding of genetic variation in a representative autosomal dominant disorder. Results: We found that in some genes previously reported as important causes of a given cardiomyopathy, rare variation is not clinically informative because there is an unacceptably high likelihood of false-positive interpretation. By contrast, in other genes, we find that diagnostic laboratories may be overly conservative when assessing variant pathogenicity. Conclusions: We outline improved analytical approaches that evaluate which genes and variant classes are interpretable and propose that these will increase the clinical utility of testing across a range of Mendelian diseases. Genet Med 19(2), 192–203.
|
27
|
Zhu Z, Ihle NT, Rejto PA, Zarrinkar PP. Outlier analysis of functional genomic profiles enriches for oncology targets and enables precision medicine. BMC Genomics 2016; 17:455. [PMID: 27296290 PMCID: PMC4907009 DOI: 10.1186/s12864-016-2807-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/06/2015] [Accepted: 05/27/2016] [Indexed: 01/22/2023]
Abstract
Background Genome-scale functional genomic screens across large cell line panels provide a rich resource for discovering tumor vulnerabilities that can lead to the next generation of targeted therapies. Their data analysis typically has focused on identifying genes whose knockdown enhances response in various pre-defined genetic contexts, which are limited by biological complexities as well as the incompleteness of our knowledge. We thus introduce a complementary data mining strategy to identify genes with exceptional sensitivity in subsets, or outlier groups, of cell lines, allowing an unbiased analysis without any a priori assumption about the underlying biology of dependency. Results Genes with outlier features are strongly and specifically enriched with those known to be associated with cancer and relevant biological processes, despite no a priori knowledge being used to drive the analysis. Identification of exceptional responders (outliers) may lead not only to new candidates for therapeutic intervention, but also to tumor indications and response biomarkers for companion precision medicine strategies. Several tumor suppressors have an outlier sensitivity pattern, supporting and generalizing the notion that tumor suppressors can play context-dependent oncogenic roles. Conclusions The novel application of outlier analysis described here demonstrates a systematic and data-driven analytical strategy to decipher large-scale functional genomic data for oncology target and precision medicine discoveries.
Affiliation(s)
- Zhou Zhu
- Oncology Research Unit, Pfizer Worldwide Research & Development, La Jolla Laboratories, 10777 Science Center Drive, San Diego, CA, 92121, USA.
| | - Nathan T Ihle
- Oncology Research Unit, Pfizer Worldwide Research & Development, La Jolla Laboratories, 10777 Science Center Drive, San Diego, CA, 92121, USA
| | - Paul A Rejto
- Oncology Research Unit, Pfizer Worldwide Research & Development, La Jolla Laboratories, 10777 Science Center Drive, San Diego, CA, 92121, USA
| | - Patrick P Zarrinkar
- Oncology Research Unit, Pfizer Worldwide Research & Development, La Jolla Laboratories, 10777 Science Center Drive, San Diego, CA, 92121, USA.
| |
|
28
|
Inzelberg R, Samuels Y, Azizi E, Qutob N, Inzelberg L, Domany E, Schechtman E, Friedman E. Parkinson disease (PARK) genes are somatically mutated in cutaneous melanoma. Neurol Genet 2016; 2:e70. [PMID: 27123489 PMCID: PMC4832432 DOI: 10.1212/nxg.0000000000000070] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/27/2016] [Accepted: 03/01/2016] [Indexed: 01/04/2023]
Abstract
OBJECTIVE To assess whether Parkinson disease (PD) genes are somatically mutated in cutaneous melanoma (CM) tissue, because CM occurs in patients with PD at higher rates than in the general population and PD is more common than expected in CM cohorts. METHODS We cross-referenced somatic mutations in metastatic CM detected by whole-exome sequencing with the 15 known PD (PARK) genes. We computed the empirical distribution of the sum of mutations in each gene (SMut) and of the number of tissue samples in which a given gene was mutated at least once (SSampl) for each of the analyzable genes, determined the 90th and 95th percentiles of the empirical distributions of these sums, and verified the location of PARK genes in these distributions. Identical analyses were applied to adenocarcinoma of lung (ADENOCA-LUNG) and squamous cell carcinoma of lung (SQUAMCA-LUNG). We also analyzed the distribution of the number of mutated PARK genes in CM samples vs the 2 lung cancers. RESULTS Somatic CM mutation analysis (n = 246) detected 315,914 mutations in 18,758 genes. Somatic CM mutations were found in 14 of 15 PARK genes. Forty-eight percent of CM samples carried ≥1 PARK mutation and 25% carried multiple PARK mutations. PARK8 mutations occurred above the 95th percentile of the empirical distribution for SMut and SSampl. Significantly more CM samples harbored multiple PARK gene mutations compared with SQUAMCA-LUNG (p = 0.0026) and with ADENOCA-LUNG (p < 0.0001). CONCLUSIONS The overrepresentation of somatic PARK mutations in CM suggests shared dysregulated pathways for CM and PD.
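The per-gene sums described in the METHODS can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input format (one `(sample_id, gene)` pair per mutation), the function names `gene_sums` and `percentile_rank`, and the toy records are all hypothetical. `gene_sums` returns the total mutations per gene (SMut) and the number of distinct samples with at least one mutation in that gene (SSampl); `percentile_rank` locates one gene's value in the empirical distribution over all genes.

```python
from collections import defaultdict


def gene_sums(mutations):
    """Return (SMut, SSampl) dicts from an iterable of (sample_id, gene) pairs."""
    smut = defaultdict(int)      # total mutations per gene
    samples = defaultdict(set)   # distinct samples mutated per gene
    for sample_id, gene in mutations:
        smut[gene] += 1
        samples[gene].add(sample_id)
    ssampl = {gene: len(ids) for gene, ids in samples.items()}
    return dict(smut), ssampl


def percentile_rank(values, x):
    """Percentage of values that are less than or equal to x."""
    values = list(values)
    return 100.0 * sum(v <= x for v in values) / len(values)


if __name__ == "__main__":
    # Hypothetical toy records, not data from the study.
    records = [("s1", "G1"), ("s1", "G1"), ("s2", "G1"), ("s2", "G2"), ("s3", "G3")]
    smut, ssampl = gene_sums(records)
    print(percentile_rank(smut.values(), smut["G1"]))
```

A gene would be flagged in the sense of the abstract when its percentile rank exceeds a chosen cutoff such as 90 or 95.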
Affiliation(s)
- Rivka Inzelberg
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Yardena Samuels
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Esther Azizi
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Nouar Qutob
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Lilah Inzelberg
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Eytan Domany
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Edna Schechtman
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| | - Eitan Friedman
- Department of Neurology (R.I.), Department of Dermatology (E.A.), Sackler Faculty of Medicine, Tel Aviv University; Center of Advanced Technologies in Rehabilitation (R.I.), Sheba Medical Center, Tel Hashomer; Department of Molecular Cell Biology (Y.S., N.Q.), Weizmann Institute of Science, Rehovot; The Sagol School of Neuroscience (L.I.), Tel Aviv University; Department of Physics of Complex Systems (E.D.), Weizmann Institute of Science, Rehovot; Department of Industrial Engineering and Management (E.S.), Ben Gurion University of the Negev, Beer Sheva; The Susanne Levy Gertner Oncogenetics Unit (E.F.), Institute of Human Genetics, Sheba Medical Center, Tel-Hashomer; and the Sackler Faculty of Medicine (E.F.), Tel Aviv University, Israel
| |
|
29
|
Tokheim C, Bhattacharya R, Niknafs N, Gygax DM, Kim R, Ryan M, Masica DL, Karchin R. Exome-Scale Discovery of Hotspot Mutation Regions in Human Cancer Using 3D Protein Structure. Cancer Res 2016; 76:3719-31. [PMID: 27197156 DOI: 10.1158/0008-5472.can-15-3190] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 11/30/2015] [Accepted: 04/01/2016] [Indexed: 12/12/2022]
Abstract
The impact of somatic missense mutation on cancer etiology and progression is often difficult to interpret. One common approach for assessing the contribution of missense mutations in carcinogenesis is to identify genes mutated with statistically nonrandom frequencies. Even given the large number of sequenced cancer samples currently available, this approach remains underpowered to detect drivers, particularly in less studied cancer types. Alternative statistical and bioinformatic approaches are needed. One approach to increase power is to focus on localized regions of increased missense mutation density or hotspot regions, rather than a whole gene or protein domain. Detecting missense mutation hotspot regions in three-dimensional (3D) protein structure may also be beneficial because linear sequence alone does not fully describe the biologically relevant organization of codons. Here, we present a novel and statistically rigorous algorithm for detecting missense mutation hotspot regions in 3D protein structures. We analyzed approximately 3 × 10^5 mutations from The Cancer Genome Atlas (TCGA) and identified 216 tumor-type-specific hotspot regions. In addition to experimentally determined protein structures, we considered high-quality structural models, which increase genomic coverage from approximately 5,000 to more than 15,000 genes. We provide new evidence that 3D mutation analysis has unique advantages. It enables discovery of hotspot regions in many more genes than previously shown and increases sensitivity to hotspot regions in tumor suppressor genes (TSG). Although hotspot regions have long been known to exist in both TSGs and oncogenes, we provide the first report that they have different characteristic properties in the two types of driver genes. We show how cancer researchers can use our results to link 3D protein structure and the biologic functions of missense mutations in cancer, and to generate testable hypotheses about driver mechanisms. Our results are included in a new interactive website for visualizing protein structures with TCGA mutations and associated hotspot regions. Users can submit new sequence data, facilitating the visualization of mutations in a biologically relevant context. Cancer Res; 76(13); 3719-31. ©2016 AACR.
Affiliation(s)
- Collin Tokheim
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Rohit Bhattacharya
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Noushin Niknafs
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | | | - Rick Kim
- In Silico Solutions, Fairfax, Virginia
| | | | - David L Masica
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland. Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland.
| |
|
30
|
Ryslik GA, Cheng Y, Modis Y, Zhao H. Leveraging protein quaternary structure to identify oncogenic driver mutations. BMC Bioinformatics 2016; 17:137. [PMID: 27001666 PMCID: PMC4802602 DOI: 10.1186/s12859-016-0963-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/14/2015] [Accepted: 02/18/2016] [Indexed: 11/23/2022]
Abstract
BACKGROUND Identifying key "driver" mutations which are responsible for tumorigenesis is critical in the development of new oncology drugs. Due to multiple pharmacological successes in treating cancers that are caused by such driver mutations, a large body of methods have been developed to differentiate these mutations from the benign "passenger" mutations which occur in the tumor but do not further progress the disease. Under the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of algorithms that identify these clusters has become a critical area of research. RESULTS We have developed a novel methodology, QuartPAC (Quaternary Protein Amino acid Clustering), that identifies non-random mutational clustering while utilizing the protein quaternary structure in 3D space. By integrating the spatial information in the Protein Data Bank (PDB) and the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), QuartPAC is able to identify clusters which are otherwise missed in a variety of proteins. The R package is available on Bioconductor at: http://bioconductor.jp/packages/3.1/bioc/html/QuartPAC.html . CONCLUSION QuartPAC provides a unique tool to identify mutational clustering while accounting for the complete folded protein quaternary structure.
Affiliation(s)
- Gregory A. Ryslik
- Department of Biostatistics, Yale School of Public Health, New Haven, CT USA
| | - Yuwei Cheng
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT USA
| | - Yorgo Modis
- Department of Medicine, University of Cambridge, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH UK
| | - Hongyu Zhao
- Department of Biostatistics, Yale School of Public Health, New Haven, CT USA
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT USA
| |
|
31
|
Oleksiewicz U, Tomczak K, Woropaj J, Markowska M, Stępniak P, Shah PK. Computational characterisation of cancer molecular profiles derived using next generation sequencing. Contemp Oncol (Pozn) 2015; 19:A78-91. [PMID: 25691827 PMCID: PMC4322529 DOI: 10.5114/wo.2014.47137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023]
Abstract
Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to the malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated using the sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse it. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.
Affiliation(s)
- Urszula Oleksiewicz
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland ; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland ; These authors contributed equally to this paper
| | - Katarzyna Tomczak
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland ; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland ; Postgraduate School of Molecular Medicine, Medical University of Warsaw, Warsaw ; These authors contributed equally to this paper
| | - Jakub Woropaj
- Poznan University of Economics, Poznań, Poland ; These authors contributed equally to this paper
| | | | | | - Parantu K Shah
- Institute for Applied Cancer Science, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| |
|
32
|
Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 2014; 47:106-14. [PMID: 25501392 PMCID: PMC4444046 DOI: 10.1038/ng.3168] [Citation(s) in RCA: 592] [Impact Index Per Article: 59.2] [Received: 03/20/2014] [Accepted: 11/20/2014] [Indexed: 12/13/2022]
Abstract
Cancers exhibit extensive mutational heterogeneity and the resulting long tail phenomenon complicates the discovery of the genes and pathways that are significantly mutated in cancer. We perform a Pan-Cancer analysis of mutated networks in 3281 samples from 12 cancer types from The Cancer Genome Atlas (TCGA) using HotNet2, a novel algorithm to find mutated subnetworks that overcomes limitations of existing single gene and pathway/network approaches. We identify 14 significantly mutated subnetworks that include well-known cancer signaling pathways as well as subnetworks with less characterized roles in cancer including cohesin, condensin, and others. Many of these subnetworks exhibit co-occurring mutations across samples. These subnetworks contain dozens of genes with rare somatic mutations across multiple cancers; many of these genes have additional evidence supporting a role in cancer. By illuminating these rare combinations of mutations, Pan-Cancer network analyses provide a roadmap to investigate new diagnostic and therapeutic opportunities across cancer types.
|
33
|
Van Neste L, Van Criekinge W. We are all individuals... bioinformatics in the personalized medicine era. Cell Oncol (Dordr) 2014; 38:29-37. [PMID: 25204962 DOI: 10.1007/s13402-014-0195-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Accepted: 08/26/2014] [Indexed: 12/16/2022]
Abstract
The medical landscape is evolving at a rapid pace, creating the opportunity for more personalized patient treatment and shifting the way healthcare is approached and thought about. With the availability of (epi)genome-wide, transcriptomic and proteogenomic profiling techniques detailed characterization of a disease at the level of the individual is now possible, offering the opportunity for truly tailored approaches for treatment and patient care. While improvements are still expected, the techniques and the basic analytical tools have reached a state that these can be efficiently deployed in both routine research and clinical practice. Still, some major challenges remain. Notably, holistic approaches, integrating data from several sources, e.g. genomic and epigenomic, will increase the understanding of the underlying biological concepts and provide insight into the causes, effects and effective solutions. However, creating and validating such a knowledge base, potentially for different levels of expertise, and integrating several data points into meaningful information is not trivial.
Affiliation(s)
- Leander Van Neste
- Department of Pathology, School for Oncology and Developmental Biology, Maastricht University Medical Center, Maastricht, The Netherlands
|
34
|
Abstract
High-throughput DNA sequencing has revolutionized the study of cancer genomics with numerous discoveries that are relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations, including single-nucleotide variants, insertions and deletions, copy-number aberrations, structural variants and gene fusions. Additional computational techniques have proved useful for defining the mutations, genes and molecular networks that drive diverse cancer phenotypes and that determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic and epigenomic alterations in cancer, and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application.
|
35
|
A spatial simulation approach to account for protein structure when identifying non-random somatic mutations. BMC Bioinformatics 2014; 15:231. [PMID: 24990767 PMCID: PMC4227039 DOI: 10.1186/1471-2105-15-231] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 11/15/2013] [Accepted: 05/27/2014] [Indexed: 02/08/2023] Open
Abstract
Background Current research suggests that a small set of “driver” mutations are responsible for tumorigenesis while a larger body of “passenger” mutations occur in the tumor but do not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Based on the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of cluster identification algorithms has become critical. Results We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html. Conclusion SpacePAC adds a valuable tool to the identification of mutational clusters while considering protein tertiary structure.
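The idea of testing for mutational clustering directly in 3D space can be illustrated with a minimal sketch. This is not SpacePAC's published algorithm: the statistic (the largest mutation count inside any sphere centred on a residue), the uniform-placement null model, and the names `max_sphere_count` and `sphere_cluster_pvalue` are assumptions made for illustration, and the coordinates are toy values rather than PDB data.

```python
import math
import random


def max_sphere_count(coords, counts, radius):
    """Largest number of mutations inside any sphere centred on a residue."""
    best = 0
    for centre in coords:
        total = sum(
            c for xyz, c in zip(coords, counts)
            if math.dist(centre, xyz) <= radius
        )
        best = max(best, total)
    return best


def sphere_cluster_pvalue(coords, counts, radius, n_sim=1000, seed=0):
    """Empirical p-value for the observed statistic.

    Null model (an assumption of this sketch): every mutation lands on a
    uniformly chosen residue, independently of the others.
    """
    rng = random.Random(seed)
    observed = max_sphere_count(coords, counts, radius)
    n_mut = sum(counts)
    n_res = len(coords)
    hits = 0
    for _ in range(n_sim):
        sim = [0] * n_res
        for _ in range(n_mut):
            sim[rng.randrange(n_res)] += 1
        if max_sphere_count(coords, sim, radius) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_sim + 1)


# Toy structure: 10 residues on a straight line, 10 angstroms apart.
toy_coords = [(10.0 * i, 0.0, 0.0) for i in range(10)]
toy_counts = [0, 0, 5, 1, 0, 0, 0, 0, 0, 0]
observed, p_value = sphere_cluster_pvalue(toy_coords, toy_counts, radius=12.0)
```

On a real protein the coordinates would typically be one representative atom per residue taken from a structure file, and the radius would need to be chosen or scanned over several values.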
|
36
|
Ryslik GA, Cheng Y, Cheung KH, Modis Y, Zhao H. A graph theoretic approach to utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics 2014; 15:86. [PMID: 24669769 PMCID: PMC4024121 DOI: 10.1186/1471-2105-15-86] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 07/09/2013] [Accepted: 03/11/2014] [Indexed: 02/23/2023] Open
Abstract
Background It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. Results We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html. Conclusion GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.
Affiliation(s)
- Gregory A Ryslik
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
|
37
|
Raphael BJ, Dobson JR, Oesper L, Vandin F. Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Genome Med 2014; 6:5. [PMID: 24479672 PMCID: PMC3978567 DOI: 10.1186/gm524] [Citation(s) in RCA: 134] [Impact Index Per Article: 13.4] [Indexed: 02/07/2023] Open
Abstract
High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer.
Affiliation(s)
- Benjamin J Raphael
- Department of Computer Science, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Center for Computational Molecular Biology, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Jason R Dobson
- Department of Computer Science, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Center for Computational Molecular Biology, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, 185 Meeting Street, Providence, RI 02912, USA
- Layla Oesper
- Department of Computer Science, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Fabio Vandin
- Department of Computer Science, Brown University, 115 Waterman Street, Providence, RI 02912, USA
- Center for Computational Molecular Biology, Brown University, 115 Waterman Street, Providence, RI 02912, USA
|
38
|
Tamborero D, Gonzalez-Perez A, Lopez-Bigas N. OncodriveCLUST: exploiting the positional clustering of somatic mutations to identify cancer genes. Bioinformatics 2013; 29:2238-44. [PMID: 23884480 DOI: 10.1093/bioinformatics/btt395] [Citation(s) in RCA: 312] [Impact Index Per Article: 28.4] [Indexed: 12/30/2022]
Abstract
MOTIVATION Gain-of-function mutations often cluster in specific protein regions, a signal that those mutations provide an adaptive advantage to cancer cells and consequently are positively selected during clonal evolution of tumours. We sought to determine the overall extent of this feature in cancer and the possibility to use this feature to identify drivers. RESULTS We have developed OncodriveCLUST, a method to identify genes with a significant bias towards mutation clustering within the protein sequence. This method constructs the background model by assessing coding-silent mutations, which are assumed not to be under positive selection and thus may reflect the baseline tendency of somatic mutations to be clustered. OncodriveCLUST analysis of the Catalogue of Somatic Mutations in Cancer retrieved a list of genes enriched by the Cancer Gene Census, prioritizing those with dominant phenotypes but also highlighting some recessive cancer genes, which showed wider but still delimited mutation clusters. Assessment of datasets from The Cancer Genome Atlas demonstrated that OncodriveCLUST selected cancer genes that were nevertheless missed by methods based on frequency and functional impact criteria. This stressed the benefit of combining approaches based on complementary principles to identify driver mutations. We propose OncodriveCLUST as an effective tool for that purpose. AVAILABILITY OncodriveCLUST has been implemented as a Python script and is freely available from http://bg.upf.edu/oncodriveclust CONTACT nuria.lopez@upf.edu or abel.gonzalez@upf.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
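The comparison of protein-affecting mutations against a coding-silent background can be sketched in a few lines. This is a deliberately simplified stand-in, not OncodriveCLUST's scoring scheme: the statistic (fraction of mutations in the densest window), the window width, the function name `window_clustering_score`, and the toy positions are all assumptions for illustration.

```python
from collections import Counter


def window_clustering_score(positions, protein_length, window=5):
    """Fraction of mutations that fall in the densest `window`-residue stretch.

    `positions` are 1-based residue indices, one entry per observed mutation.
    """
    if not positions:
        return 0.0
    counts = Counter(positions)
    best = 0
    for start in range(1, protein_length + 1):
        # Counter returns 0 for positions that carry no mutation.
        in_window = sum(counts[p] for p in range(start, start + window))
        best = max(best, in_window)
    return best / len(positions)


# Toy data (hypothetical positions, not taken from any database).
protein_length = 189
missense_positions = [12, 12, 13, 12, 61, 13, 12, 146]
silent_positions = [5, 40, 77, 102, 130, 151, 18, 90]

missense_score = window_clustering_score(missense_positions, protein_length)
silent_score = window_clustering_score(silent_positions, protein_length)
# Positive values mean missense mutations cluster more than the silent background.
excess_clustering = missense_score - silent_score
```

In practice the background would be estimated across many genes rather than from a single gene's silent mutations, since per-gene silent counts are usually small.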
Affiliation(s)
- David Tamborero
- Research Unit on Biomedical Informatics, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona and Institució Catalana de Recerca i Estudis Avançats ICREA, Passeig Lluis Companys, 23, 08010 Barcelona, Spain
|
39
|
İlhan İ, Tezel G. How to Select Tag SNPs in Genetic Association Studies? The CLONTagger Method with Parameter Optimization. OMICS: A Journal of Integrative Biology 2013; 17:368-83. [DOI: 10.1089/omi.2012.0100] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
Affiliation(s)
- İlhan İlhan
- Akören Vocational School, Selçuk University, Konya, Turkey
- Gülay Tezel
- Department of Computer Engineering Faculty of Engineering and Architecture, Selçuk University, Konya, Turkey
|
40
|
Utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics 2013; 14:190. [PMID: 23758891 PMCID: PMC3691676 DOI: 10.1186/1471-2105-14-190] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 03/08/2013] [Accepted: 05/28/2013] [Indexed: 02/07/2023] Open
Abstract
Background Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key “driver” mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver mutations, several methods that rely on mutational clustering have been developed to identify them. However, these methods consider proteins as a single strand without taking their spatial structures into account. We propose an extension to current methodology that incorporates protein tertiary structure in order to increase our power when identifying mutation clustering. Results We have developed iPAC (identification of Protein Amino acid Clustering), an algorithm that identifies non-random somatic mutations in proteins while taking into account the three dimensional protein structure. By using the tertiary information, we are able to detect both novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of clustering based on existing methods. For example, by combining the data in the Protein Data Bank (PDB) and the Catalogue of Somatic Mutations in Cancer, our algorithm identifies new mutational clusters in well known cancer proteins such as KRAS and PI3KC α. Further, by utilizing the tertiary structure, our algorithm also identifies clusters in EGFR, EIF2AK2, and other proteins that are not identified by current methodology. The R package is available at: http://www.bioconductor.org/packages/2.12/bioc/html/iPAC.html. Conclusion Our algorithm extends the current methodology to identify oncogenic activating driver mutations by utilizing tertiary protein structure when identifying nonrandom somatic residue mutation clusters.
|
41
|
Peterson TA, Nehrt NL, Park D, Kann MG. Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer. J Am Med Inform Assoc 2012; 19:275-83. [PMID: 22319177 DOI: 10.1136/amiajnl-2011-000655] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/04/2022] Open
Abstract
BACKGROUND AND OBJECTIVE With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants. METHODS Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions. RESULTS Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, and hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision. CONCLUSION By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. 
The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies.
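Aggregating mutations from several proteins onto shared domain positions, as described in this abstract, can be sketched as follows. This is not the published DS-score: the ungapped mapping from residue to domain position, the uniform binomial null, the Bonferroni correction, and the names `binom_tail` and `domain_hotspots` are assumptions made for illustration, and the input data are toy values.

```python
import math
from collections import Counter


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def domain_hotspots(mutations, domain_spans, domain_length, alpha=0.05):
    """Return {domain_position: (count, adjusted_p)} for significant positions.

    mutations: iterable of (protein, residue) pairs, residues 1-based.
    domain_spans: protein -> (start, end) of the domain instance; the sketch
    assumes an ungapped alignment, so position = residue - start + 1.
    """
    counts = Counter()
    for protein, residue in mutations:
        start, end = domain_spans[protein]
        if start <= residue <= end:
            counts[residue - start + 1] += 1
    n = sum(counts.values())
    hotspots = {}
    for pos, k in counts.items():
        # Null: each mutation is equally likely at any domain position.
        p = binom_tail(k, n, 1.0 / domain_length)
        p_adj = min(1.0, p * domain_length)  # Bonferroni over positions
        if p_adj < alpha:
            hotspots[pos] = (k, p_adj)
    return hotspots


# Toy data: two hypothetical proteins sharing a 50-residue domain.
toy_spans = {"A": (10, 59), "B": (100, 149)}
toy_mutations = [
    ("A", 21), ("B", 111), ("A", 21), ("B", 111),  # all map to position 12
    ("A", 30), ("B", 140), ("A", 5),               # last one is outside the domain
]
toy_hotspots = domain_hotspots(toy_mutations, toy_spans, domain_length=50)
```

Real domain families are aligned with gaps, so a production version would map residues through the alignment rather than by simple offset.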
Affiliation(s)
- Thomas A Peterson
- University of Maryland, Baltimore County, Baltimore, Maryland 21250, USA
|
42
|
Stehr H, Jang SHJ, Duarte JM, Wierling C, Lehrach H, Lappe M, Lange BMH. The structural impact of cancer-associated missense mutations in oncogenes and tumor suppressors. Mol Cancer 2011; 10:54. [PMID: 21575214 PMCID: PMC3123651 DOI: 10.1186/1476-4598-10-54] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 01/17/2011] [Accepted: 05/16/2011] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Current large-scale cancer sequencing projects have identified large numbers of somatic mutations covering an increasing number of different cancer tissues and patients. However, the characterization of these mutations at the structural and functional level remains a challenge. RESULTS We present results from an analysis of the structural impact of frequent missense cancer mutations using an automated method. We find that inactivation of tumor suppressors in cancer correlates frequently with destabilizing mutations preferably in the core of the protein, while enhanced activity of oncogenes is often linked to specific mutations at functional sites. Furthermore, our results show that this alteration of oncogenic activity is often associated with mutations at ATP or GTP binding sites. CONCLUSIONS With our findings we can confirm and statistically validate the hypotheses for the gain-of-function and loss-of-function mechanisms of oncogenes and tumor suppressors, respectively. We show that the distinct mutational patterns can potentially be used to pre-classify newly identified cancer-associated genes with yet unknown function.
Affiliation(s)
- Henning Stehr
- Max-Planck Institute for Molecular Genetics, Structural Proteomics/Bioinformatics Group, Otto-Warburg Laboratory, Boltzmannstrasse 12, 14195 Berlin, Germany
|