151
Wu MY, Zhang XF, Dai DQ, Ou-Yang L, Zhu Y, Yan H. Regularized logistic regression with network-based pairwise interaction for biomarker identification in breast cancer. BMC Bioinformatics 2016; 17:108. [PMID: 26921029 PMCID: PMC4769543 DOI: 10.1186/s12859-016-0951-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/06/2015] [Accepted: 01/28/2016] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: To facilitate advances in personalized medicine, it is important to detect predictive, stable and interpretable biomarkers related to different clinical characteristics. These clinical characteristics may be heterogeneous with respect to underlying interactions between genes. Traditional methods usually focus on detecting differentially expressed genes without taking the interactions between genes into account. Moreover, because the selected biomarkers typically have low reproducibility, it is difficult to give a clear biological interpretation for a specific disease. Therefore, it is necessary to design a robust biomarker identification method that can predict disease-associated interactions with high reproducibility. RESULTS: In this article, we propose a regularized logistic regression model. Unlike previous methods, which focus on individual genes or modules, our model takes into account gene pairs that are connected in a protein-protein interaction network. A line graph is constructed to represent the adjacencies between pairwise interactions. Based on this line graph, we incorporate the degree information in the model via an adaptive elastic net, which makes our model less dependent on the expression data. Experimental results on six publicly available breast cancer datasets show that our method not only achieves competitive classification performance but also retains high stability in variable selection. Therefore, our model is able to identify diagnostic and prognostic biomarkers in a more robust way. Moreover, most of the biomarkers discovered by our model have been verified in biochemical or biomedical studies. CONCLUSIONS: The proposed method shows promise in the diagnosis of disease pathogenesis with different clinical characteristics. These advances lead to more accurate and stable biomarker discovery, which can monitor the functional changes that are perturbed by diseases. Based on these predictions, researchers may be able to provide suggestions for new therapeutic approaches.
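The line-graph step in this abstract can be illustrated with a minimal sketch. It assumes that two interactions are adjacent in the line graph when they share a protein, and it uses a hypothetical inverse-degree penalty weight; the abstract does not give the actual weighting formula, so `adaptive_weights` and its `gamma` parameter are illustrative only.

```python
def line_graph_degrees(edges):
    """Return the line-graph degree of every interaction (edge).

    Two interactions are treated as adjacent when they share a protein.
    """
    edges = [tuple(sorted(e)) for e in edges]
    incident = {}  # protein -> set of interactions that involve it
    for e in edges:
        for protein in e:
            incident.setdefault(protein, set()).add(e)
    degrees = {}
    for e in edges:
        neighbours = set()
        for protein in e:
            neighbours |= incident[protein]
        neighbours.discard(e)  # an interaction is not its own neighbour
        degrees[e] = len(neighbours)
    return degrees


def adaptive_weights(degrees, gamma=1.0):
    # Hypothetical choice: well-connected interactions get a smaller penalty weight.
    return {e: 1.0 / (1.0 + d) ** gamma for e, d in degrees.items()}


# Toy protein-protein interaction network (made-up protein names).
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D")]
deg = line_graph_degrees(ppi)
weights = adaptive_weights(deg)
```

Weights of this kind could then scale the per-feature L1 term of an elastic-net penalty, so that selection depends partly on network topology rather than on expression data alone.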
Affiliation(s)
- Meng-Yun Wu
- School of Statistics and Management, Shanghai University of Finance and Economics, Guoding Road, Shanghai, 200433, China; Key Laboratory of Mathematical Economics SUFE, Ministry of Education, Guoding Road, Shanghai, 200433, China.
- Xiao-Fei Zhang
- School of Mathematics and Statistics & Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Luoyu Road, Wuhan, 430079, China.
- Dao-Qing Dai
- Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang West Road, Guangzhou, 510275, China.
- Le Ou-Yang
- College of Information Engineering, Shenzhen University, Nanhai Avenue, Shenzhen, 518060, China.
- Yuan Zhu
- School of Automation, China University of Geosciences, Lumo Road, Wuhan, 430074, China.
- Hong Yan
- Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Hong Kong, 999077, China.
152
Ames RM, Talavera D, Williams SG, Robertson DL, Lovell SC. Binding interface change and cryptic variation in the evolution of protein-protein interactions. BMC Evol Biol 2016; 16:40. [PMID: 26892785 PMCID: PMC4758157 DOI: 10.1186/s12862-016-0608-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/25/2015] [Accepted: 02/02/2016] [Indexed: 12/03/2022] Open
Abstract
Background: Physical interactions between proteins are essential for almost all biological functions and systems. To understand the evolution of function it is therefore important to understand the evolution of molecular interactions. Of key importance is the evolution of binding specificity, the set of interactions made by a protein, since change in specificity can lead to “rewiring” of interaction networks. Unfortunately, the interfaces through which proteins interact are complex, typically containing many amino-acid residues that collectively must contribute to binding specificity as well as binding affinity, structural integrity of the interface and solubility in the unbound state. Results: In order to study the relationship between interface composition and binding specificity, we make use of paralogous pairs of yeast proteins. Immediately after duplication these paralogues will have identical sequences and protein products that make an identical set of interactions. As the sequences diverge, we can correlate amino-acid change in the interface with any change in the specificity of binding. We show that change in interface regions correlates only weakly with change in specificity, and many variants in interfaces are functionally equivalent. We show that many of the residue replacements within interfaces are silent with respect to their contribution to binding specificity. Conclusions: We conclude that such functionally-equivalent change has the potential to contribute to evolutionary plasticity in interfaces by creating cryptic variation, which in turn may provide the raw material for functional innovation and coevolution.
Affiliation(s)
- Ryan M Ames
- Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK; Current address: Wellcome Trust Centre for Biomedical Modelling and Analysis, University of Exeter, RILD Level 3, Exeter, EX2 5DW, UK.
- David Talavera
- Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK; Current address: Institute of Cardiovascular Sciences, Faculty of Medical and Human Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK.
- Simon G Williams
- Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK; Current address: Institute of Cardiovascular Sciences, Faculty of Medical and Human Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK.
- David L Robertson
- Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK.
- Simon C Lovell
- Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK.
153
Meyer MJ, Lapcevic R, Romero AE, Yoon M, Das J, Beltrán JF, Mort M, Stenson PD, Cooper DN, Paccanaro A, Yu H. mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome. Hum Mutat 2016; 37:447-56. [PMID: 26841357 DOI: 10.1002/humu.22963] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 10/27/2015] [Accepted: 01/14/2016] [Indexed: 12/20/2022]
Abstract
A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.
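The idea of clustering amino acid substitutions in tertiary structure can be sketched as follows. This is not the mutation3D algorithm: the abstract does not describe its clustering procedure, so the sketch swaps in simple single-linkage grouping at a distance cutoff. The function name, the coordinates and the 6 Å cutoff are all made up for illustration.

```python
import math


def cluster_residues(coords, cutoff=10.0):
    """Group mutated residues lying within `cutoff` angstroms of each other.

    Single-linkage grouping implemented with a small union-find structure.
    `coords` maps a residue number to an (x, y, z) tuple.
    """
    ids = list(coords)
    parent = {r: r for r in ids}

    def find(r):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(coords[a], coords[b]) <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for r in ids:
        clusters.setdefault(find(r), []).append(r)
    # Report only groups with at least two mutated residues.
    return sorted(sorted(c) for c in clusters.values() if len(c) > 1)


# Hypothetical residue positions and coordinates.
mutated = {
    12: (0.0, 0.0, 0.0),
    15: (3.0, 4.0, 0.0),
    88: (6.0, 8.0, 0.0),
    240: (50.0, 0.0, 0.0),
}
clusters = cluster_residues(mutated, cutoff=6.0)
```

Residues 12 and 88 are far apart in sequence but end up in one group because both lie close to residue 15 in space, which is the kind of signal a purely sequence-based hotspot search would miss.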
Affiliation(s)
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853; Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York, 10065
- Ryan Lapcevic
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Alfonso E Romero
- Department of Computer Science and Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Mark Yoon
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Juan Felipe Beltrán
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
- Matthew Mort
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Alberto Paccanaro
- Department of Computer Science and Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, 14853; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, 14853
154
Kuang X, Dhroso A, Han JG, Shyu CR, Korkin D. DOMMINO 2.0: integrating structurally resolved protein-, RNA-, and DNA-mediated macromolecular interactions. Database: The Journal of Biological Databases and Curation 2016; 2016:bav114. [PMID: 26827237 PMCID: PMC4733329 DOI: 10.1093/database/bav114] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 03/01/2015] [Accepted: 11/16/2015] [Indexed: 11/14/2022]
Abstract
Macromolecular interactions are formed between proteins, DNA and RNA molecules. As a principal building block of macromolecular assemblies and pathways, these interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of a macromolecular interaction allows one to understand the interaction's mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, until now the interactions mediated by different types of macromolecules, e.g. protein-protein interactions or protein-DNA interactions, have been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions that are of different types but occur in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1,040,000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43,000 RNA-mediated interactions, and ∼12,000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N-termini, and peptides are identified. Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org.
Affiliation(s)
- Xingyan Kuang
- Informatics Institute, University of Missouri, Columbia, MO, USA
- Andi Dhroso
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
- Jing Ginger Han
- Informatics Institute, University of Missouri, Columbia, MO, USA
- Chi-Ren Shyu
- Informatics Institute, University of Missouri, Columbia, MO, USA; Department of Electrical and Computer Engineering, Department of Computer Science, University of Missouri, Columbia, MO, USA
- Dmitry Korkin
- Department of Computer Science and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
155
Lee J, Lee D. Association analysis of the perturbation of interactions in biological pathways and anticancer drug activity. Biochem Biophys Res Commun 2016; 470:137-143. [PMID: 26772881 DOI: 10.1016/j.bbrc.2016.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2015] [Accepted: 01/03/2016] [Indexed: 11/25/2022]
Abstract
Understanding how different genomic mutational landscapes in patients with cancer lead to different responses to anticancer drugs is an important challenge for realizing precision medicine for cancer. Many studies have analyzed the comprehensive anticancer drug-response profiles and genomic profiles of cancer cell lines to identify the relationship between the anticancer drug response and genomic alterations. However, few studies have focused on interpreting these profiles from a network perspective. In this work, we analyzed genomic alterations in cancer cell lines by considering which interactions in the signaling pathway were perturbed by mutations. With our interaction-centric approach, we identified novel interaction/drug response associations for two drugs (afatinib and ixabepilone) for which no gene-centric association could be found. When we compared the performance of classifiers for predicting the responses to 164 drugs, the classifiers trained with interaction-centric features outperformed the classifiers trained with gene-centric features, despite the smaller number of features (p-value = 2.0 × 10⁻³). By incorporating the interaction information from signaling pathways, we revealed associations between genomic alterations and drug responses that could be missed when using a gene-centric approach.
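The contrast between gene-centric and interaction-centric features can be sketched in a few lines. The rule used here, that an interaction counts as perturbed when either partner gene is mutated, is an assumption for illustration; the abstract does not state how perturbation was defined, and the gene names and pathway edges below are only examples.

```python
def gene_features(mutated_genes, genes):
    """Gene-centric encoding: 1 if the gene is mutated in the cell line."""
    mutated = set(mutated_genes)
    return [int(g in mutated) for g in genes]


def interaction_features(mutated_genes, pathway_edges):
    """Interaction-centric encoding (assumed rule): an interaction is
    perturbed when at least one of its two partner genes is mutated."""
    mutated = set(mutated_genes)
    return [int(a in mutated or b in mutated) for a, b in pathway_edges]


# Illustrative pathway fragment and one cell line's mutated genes.
edges = [("EGFR", "GRB2"), ("GRB2", "SOS1"), ("KRAS", "RAF1")]
cell_line_mutations = {"EGFR"}

x_interaction = interaction_features(cell_line_mutations, edges)
x_gene = gene_features(cell_line_mutations, ["EGFR", "GRB2", "SOS1", "KRAS", "RAF1"])
```

Either vector could then be fed to any standard classifier; the point of the sketch is only that the unit of analysis changes from a gene to an edge of the pathway.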
Affiliation(s)
- Junehawk Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea; Department of Convergence Technology Research, Korea Institute of Science and Technology Information, Daejeon, Republic of Korea
- Doheon Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea; Bio-Synergy Research Center, Daejeon, Republic of Korea.
156
Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol 2015; 11:848. [PMID: 26681426 PMCID: PMC4704491 DOI: 10.15252/msb.20156351] [Citation(s) in RCA: 192] [Impact Index Per Article: 19.2] [Indexed: 12/13/2022] Open
Abstract
Studying protein interaction networks of all proteins in an organism (“interactomes”) remains one of the major challenges in modern biomedicine. Such information is crucial to understanding cellular pathways and developing effective therapies for the treatment of human diseases. Over the past two decades, diverse biochemical, genetic, and cell biological methods have been developed to map interactomes. In this review, we highlight basic principles of interactome mapping. Specifically, we discuss the strengths and weaknesses of individual assays, how to select a method appropriate for the problem being studied, and provide general guidelines for carrying out the necessary follow‐up analyses. In addition, we discuss computational methods to predict, map, and visualize interactomes, and provide a summary of some of the most important interactome resources. We hope that this review serves as both a useful overview of the field and a guide to help more scientists actively employ these powerful approaches in their research.
Affiliation(s)
- Jamie Snider
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Max Kotlyar
- Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Punit Saraon
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Zhong Yao
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Igor Jurisica
- Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Igor Stagljar
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
157
Kumar A, Butler BM, Kumar S, Ozkan SB. Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine. Curr Opin Struct Biol 2015; 35:135-42. [PMID: 26684487 PMCID: PMC4856467 DOI: 10.1016/j.sbi.2015.11.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 08/18/2015] [Revised: 11/03/2015] [Accepted: 11/05/2015] [Indexed: 01/08/2023]
Abstract
Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict whether they are benign. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light on disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine.
Affiliation(s)
- Avishek Kumar
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85281, United States
- Brandon M Butler
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85281, United States
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, United States; Department of Biology, Temple University, Philadelphia, PA 19122, United States; Center for Genomic Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
- S Banu Ozkan
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85281, United States.
158
Sethi A, Clarke D, Chen J, Kumar S, Galeev TR, Regan L, Gerstein M. Reads meet rotamers: structural biology in the age of deep sequencing. Curr Opin Struct Biol 2015; 35:125-34. [PMID: 26658741 DOI: 10.1016/j.sbi.2015.11.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/21/2015] [Revised: 11/04/2015] [Accepted: 11/05/2015] [Indexed: 01/07/2023]
Abstract
Structure has traditionally been interrelated with sequence, usually in the framework of comparing sequences across species sharing a common fold. However, the nature of information within the sequence and structure databases is evolving, changing the type of comparisons possible. In particular, we now have a vast number of personal genome sequences from human populations, and a greater fraction of new structures contain interacting proteins within large complexes. Consequently, we have to recast our conception of sequence conservation and its relation to structure, for example by focusing more on selection within the human population. Moreover, within structural biology there is less emphasis on the discovery of novel folds and more on relating structures to networks of protein interactions. We cover this changing mindset here.
Affiliation(s)
- Anurag Sethi
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Declan Clarke
- Department of Chemistry, Yale University, New Haven, CT, United States
- Jieming Chen
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
- Sushant Kumar
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Timur R Galeev
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
- Lynne Regan
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States; Department of Chemistry, Yale University, New Haven, CT, United States
- Mark Gerstein
- Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States; Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States.
159
Das J, Meyer MJ, Yu H. Studying Autism in Context. Cell Syst 2015; 1:312-3. [PMID: 27136240 DOI: 10.1016/j.cels.2015.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Studying autism genes in the context of the protein complexes to which they belong illustrates the potential of network-centric approaches for understanding complex genetic disease.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Michael J Meyer
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY 10065, USA
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA.
160
Tsuji T, Yoda T, Shirai T. Deciphering Supramolecular Structures with Protein-Protein Interaction Network Modeling. Sci Rep 2015; 5:16341. [PMID: 26549015 PMCID: PMC4637837 DOI: 10.1038/srep16341] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/02/2015] [Accepted: 10/09/2015] [Indexed: 11/30/2022] Open
Abstract
Many biological molecules are assembled into supramolecules that are essential to perform complicated functions in the cell. However, experimental information about the structures of supramolecules is not sufficient at this point. We developed a method of predicting and modeling the structures of supramolecules in a biological network by combining structural data of the Protein Data Bank (PDB) and interaction data in IntAct databases. Templates for binary complexes in IntAct were extracted from PDB. Modeling was attempted by assembling binary complexes with superposed shared subunits. A total of 3,197 models were constructed, and 1,306 (41% of the total) contained at least one subunit absent from experimental structures. The models also suggested 970 (25% of the total) experimentally undetected subunit interfaces, and 41 human disease-related amino acid variants were mapped onto these model-suggested interfaces. The models demonstrated that protein-protein interaction network modeling is useful to fill the information gap between biological networks and structures.
Affiliation(s)
- Toshiyuki Tsuji
- Nagahama Institute of Bio-Science and Technology, and Japan Science and Technology Agency, Bioinformatics Research Division, Nagahama, Shiga 526-0829, Japan
- Takao Yoda
- Nagahama Institute of Bio-Science and Technology, and Japan Science and Technology Agency, Bioinformatics Research Division, Nagahama, Shiga 526-0829, Japan
- Tsuyoshi Shirai
- Nagahama Institute of Bio-Science and Technology, and Japan Science and Technology Agency, Bioinformatics Research Division, Nagahama, Shiga 526-0829, Japan
161
Jiang P, Wang H, Li W, Zang C, Li B, Wong YJ, Meyer C, Liu JS, Aster JC, Liu XS. Network analysis of gene essentiality in functional genomics experiments. Genome Biol 2015; 16:239. [PMID: 26518695 PMCID: PMC4627418 DOI: 10.1186/s13059-015-0808-9] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 08/13/2015] [Accepted: 10/20/2015] [Indexed: 12/18/2022] Open
Abstract
Many genomic techniques, such as CRISPR and shRNA screens, have been developed to study gene essentiality genome-wide. Our analyses of public CRISPR screens suggest that protein interaction networks, when integrated with gene expression or histone marks, are highly predictive of gene essentiality. Moreover, the quality of CRISPR and shRNA screen results can be significantly enhanced through network neighbor information. We also found network neighbor information to be very informative for prioritizing ChIP-seq target genes and survival indicator genes from tumor profiling. Thus, our study provides a general method for gene essentiality analysis in functional genomic experiments (http://nest.dfci.harvard.edu).
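One simple way to picture "network neighbor information" is to blend each gene's screen score with the mean score of its interaction partners. The sketch below is a hypothetical illustration and not the method behind the published tool: the blending weight `alpha`, the function name and the toy scores are all assumptions.

```python
def neighbor_smoothed_scores(scores, edges, alpha=0.5):
    """Blend each gene's score with the mean score of its network neighbors.

    `scores` maps gene -> screen score; `edges` is a list of gene pairs.
    Genes without scored neighbors keep their original score.
    """
    neighbors = {g: set() for g in scores}
    for a, b in edges:
        if a in scores and b in scores:
            neighbors[a].add(b)
            neighbors[b].add(a)

    smoothed = {}
    for gene, score in scores.items():
        if neighbors[gene]:
            mean_nb = sum(scores[n] for n in neighbors[gene]) / len(neighbors[gene])
            smoothed[gene] = (1 - alpha) * score + alpha * mean_nb
        else:
            smoothed[gene] = score
    return smoothed


# Toy screen scores (more negative = more essential) and a toy network.
screen = {"A": -2.0, "B": -1.0, "C": 0.0, "D": 1.0}
network = [("A", "B"), ("B", "C")]
smoothed = neighbor_smoothed_scores(screen, network, alpha=0.5)
```

In this toy case gene C, which has a neutral score of its own, is pulled toward essentiality because its only neighbor B scores as essential.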
Affiliation(s)
- Peng Jiang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Hongfang Wang
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Wei Li
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Chongzhi Zang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Bo Li
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Yinling J Wong
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Cliff Meyer
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Jun S Liu
- Department of Statistics, Harvard University, Cambridge, MA, 02138, USA
- Jon C Aster
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, 02115, USA
- X Shirley Liu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA; School of Life Science and Technology, Tongji University, Shanghai, 200092, China.
162
Porta-Pardo E, Garcia-Alonso L, Hrabe T, Dopazo J, Godzik A. A Pan-Cancer Catalogue of Cancer Driver Protein Interaction Interfaces. PLoS Comput Biol 2015; 11:e1004518. [PMID: 26485003 PMCID: PMC4616621 DOI: 10.1371/journal.pcbi.1004518] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 07/29/2015] [Accepted: 08/21/2015] [Indexed: 12/19/2022] Open
Abstract
Despite their importance in maintaining the integrity of all cellular pathways, the role of mutations on protein-protein interaction (PPI) interfaces as cancer drivers has not been systematically studied. Here we analyzed the mutation patterns of the PPI interfaces from 10,028 proteins in a pan-cancer cohort of 5,989 tumors from 23 projects of The Cancer Genome Atlas (TCGA) to find interfaces enriched in somatic missense mutations. To that end we use e-Driver, an algorithm to analyze the mutation distribution of specific protein functional regions. We identified 103 PPI interfaces enriched in somatic cancer mutations. 32 of these interfaces are found in proteins coded by known cancer driver genes. The remaining 71 interfaces are found in proteins that have not been previously identified as cancer drivers even that, in most cases, there is an extensive literature suggesting they play an important role in cancer. Finally, we integrate these findings with clinical information to show how tumors apparently driven by the same gene have different behaviors, including patient outcomes, depending on which specific interfaces are mutated. Until now, most efforts in cancer genomics have focused on identifying genes and pathways driving tumor development. Although this has been unquestionably a success, as evidenced by the fact that we now have an extensive catalogue of cancer driver genes and pathways, there is still a poor understanding of why patients with the same affected driver genes may have different disease outcomes or drug responses. This is precisely the aim of this work-to show how by considering proteins as multifunctional factories instead of monolithic black boxes, it is possible to identify novel cancer driver genes and propose molecular hypotheses to explain such heterogeneity. 
To that end we have mapped the mutation profiles of 5,989 cancer patients from TCGA to more than 10,000 protein structures, leading us to identify 103 protein interaction interfaces enriched in somatic mutations. Finally, we have integrated clinical annotations as well as proteomics data to show how tumors apparently driven by the same gene can display different behaviors, including patient outcomes, depending on which specific interfaces are mutated.
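The idea of an interface being "enriched" in mutations can be illustrated with a simple length-based null model. The sketch below is not the e-Driver implementation; it assumes that, by chance, mutations fall in a region in proportion to the region's share of the protein length, and scores the observed count with a one-sided binomial test. The function names and arguments are hypothetical.

```python
from math import comb


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def region_enrichment_pvalue(region_len, protein_len, region_mutations, total_mutations):
    """One-sided p-value that a region carries more mutations than its length predicts.

    Assumption (illustrative only): under the null, each mutation lands in the
    region with probability region_len / protein_len.
    """
    if not 0 < region_len <= protein_len:
        raise ValueError("region_len must be in (0, protein_len]")
    if not 0 <= region_mutations <= total_mutations:
        raise ValueError("region_mutations must be in [0, total_mutations]")
    p_region = region_len / protein_len
    return binomial_sf(region_mutations, total_mutations, p_region)
```

In a screen over many interfaces, the resulting p-values would still need a multiple-testing correction before any interface is called enriched.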
Affiliation(s)
- Eduard Porta-Pardo
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, La Jolla, California, United States of America
| | - Luz Garcia-Alonso
- European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Thomas Hrabe
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, La Jolla, California, United States of America
| | - Joaquin Dopazo
- Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Functional Genomics Node, (INB) at CIPF, Valencia, Spain
- Bioinformatics of Rare Diseases (BIER), CIBER de Enfermedades Raras (CIBERER), Valencia, Spain
- * E-mail: (JD); (AG)
| | - Adam Godzik
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, La Jolla, California, United States of America
- * E-mail: (JD); (AG)
| |
|
163
|
Theofilatos KA, Likothanassis S, Mavroudi S. Quo vadis computational analysis of PPI data or why the future isn't here yet. Front Genet 2015; 6:289. [PMID: 26442107 PMCID: PMC4584938 DOI: 10.3389/fgene.2015.00289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2015] [Accepted: 08/31/2015] [Indexed: 11/13/2022] Open
Affiliation(s)
| | - Spiros Likothanassis
- InSyBio Ltd. London, UK ; Pattern Recognition Laboratory, Department of Computer Engineering and Informatics, University of Patras Patras, Greece
| | - Seferina Mavroudi
- InSyBio Ltd. London, UK ; Pattern Recognition Laboratory, Department of Computer Engineering and Informatics, University of Patras Patras, Greece ; Department of Social Work, School of Sciences of Health and Care, Technological Educational Institute of Western Greece Patras, Greece
| |
|
164
|
Zhang XF, Ou-Yang L, Hu X, Dai DQ. Identifying binary protein-protein interactions from affinity purification mass spectrometry data. BMC Genomics 2015; 16:745. [PMID: 26438428 PMCID: PMC4595009 DOI: 10.1186/s12864-015-1944-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2014] [Accepted: 09/22/2015] [Indexed: 02/04/2023] Open
Abstract
Background The identification of protein-protein interactions contributes greatly to the understanding of functional organization within cells. With the development of affinity purification-mass spectrometry (AP-MS) techniques, several computational scoring methods have been proposed to detect protein interactions from AP-MS data. However, most of the current methods focus on the detection of co-complex interactions and do not discriminate between direct physical interactions and indirect interactions. Consequently, less is known about the precise physical wiring diagram within cells. Results In this paper, we develop a Binary Interaction Network Model (BINM) to computationally identify direct physical interactions from co-complex interactions which can be inferred from purification data using previous scoring methods. This model provides a mathematical framework for capturing topological relationships between direct physical interactions and observed co-complex interactions. It reassigns a confidence score to each observed interaction to indicate its propensity to be a direct physical interaction. Then observed interactions with high confidence scores are predicted as direct physical interactions. We run our model on two yeast co-complex interaction networks which are constructed by two different scoring methods from the same combined AP-MS dataset. The direct physical interactions identified by various methods are comprehensively benchmarked against different reference sets that provide both direct and indirect evidence for physical contacts. Experimental results show that our model is competitive with state-of-the-art methods. Conclusions According to the results obtained in this study, BINM is a powerful scoring method that can use network topology alone to predict direct physical interactions from AP-MS data. This study provides an alternative approach to explore the information inherent in AP-MS data.
The software can be downloaded from https://github.com/Zhangxf-ccnu/BINM.
Affiliation(s)
- Xiao-Fei Zhang
- School of Mathematics and Statistics, Central China Normal University, Luoyu Road, Wuhan, 430079, China.
| | - Le Ou-Yang
- Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang West Road, Guangzhou, 510275, China.
| | - Xiaohua Hu
- School of Computer, Central China Normal University, 774 Luoyu Road, Wuhan, 430079, China; College of Information Science and Technology, Drexel University, Chestnut Street, Philadelphia, 19104, USA.
| | - Dao-Qing Dai
- Intelligent Data Center and Department of Mathematics, Sun Yat-Sen University, Xingang West Road, Guangzhou, 510275, China.
| |
|
166
|
Alanis-Lobato G. Mining protein interactomes to improve their reliability and support the advancement of network medicine. Front Genet 2015; 6:296. [PMID: 26442112 PMCID: PMC4585290 DOI: 10.3389/fgene.2015.00296] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2015] [Accepted: 09/07/2015] [Indexed: 12/12/2022] Open
Abstract
High-throughput detection of protein interactions has had a major impact in our understanding of the intricate molecular machinery underlying the living cell, and has permitted the construction of very large protein interactomes. The protein networks that are currently available are incomplete and a significant percentage of their interactions are false positives. Fortunately, the structural properties observed in good quality social or technological networks are also present in biological systems. This has encouraged the development of tools, to improve the reliability of protein networks and predict new interactions based merely on the topological characteristics of their components. Since diseases are rarely caused by the malfunction of a single protein, having a more complete and reliable interactome is crucial in order to identify groups of inter-related proteins involved in disease etiology. These system components can then be targeted with minimal collateral damage. In this article, an important number of network mining tools is reviewed, together with resources from which reliable protein interactomes can be constructed. In addition to the review, a few representative examples of how molecular and clinical data can be integrated to deepen our understanding of pathogenesis are discussed.
Affiliation(s)
- Gregorio Alanis-Lobato
- Faculty of Biology, Institute of Molecular Biology, Johannes Gutenberg University of Mainz Mainz, Germany ; Integrative Systems Biology Lab, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology Thuwal, Saudi Arabia
| |
|
167
|
Comprehensive assessment of cancer missense mutation clustering in protein structures. Proc Natl Acad Sci U S A 2015; 112:E5486-95. [PMID: 26392535 DOI: 10.1073/pnas.1516373112] [Citation(s) in RCA: 160] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
Large-scale tumor sequencing projects enabled the identification of many new cancer gene candidates through computational approaches. Here, we describe a general method to detect cancer genes based on significant 3D clustering of mutations relative to the structure of the encoded protein products. The approach can also be used to search for proteins with an enrichment of mutations at binding interfaces with a protein, nucleic acid, or small molecule partner. We applied this approach to systematically analyze the PanCancer compendium of somatic mutations from 4,742 tumors relative to all known 3D structures of human proteins in the Protein Data Bank. We detected significant 3D clustering of missense mutations in several previously known oncoproteins including HRAS, EGFR, and PIK3CA. Although clustering of missense mutations is often regarded as a hallmark of oncoproteins, we observed that a number of tumor suppressors, including FBXW7, VHL, and STK11, also showed such clustering. Beside these known cases, we also identified significant 3D clustering of missense mutations in NUF2, which encodes a component of the kinetochore, that could affect chromosome segregation and lead to aneuploidy. Analysis of interaction interfaces revealed enrichment of mutations in the interfaces between FBXW7-CCNE1, HRAS-RASA1, CUL4B-CAND1, OGT-HCFC1, PPP2R1A-PPP2R5C/PPP2R2A, DICER1-Mg2+, MAX-DNA, SRSF2-RNA, and others. Together, our results indicate that systematic consideration of 3D structure can assist in the identification of cancer genes and in the understanding of the functional role of their mutations.
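One way to make "significant 3D clustering" concrete is a permutation test on residue coordinates. The sketch below is an illustration under stated assumptions, not the method of the paper: it assumes one 3D coordinate per residue, uses mean pairwise distance between mutated residues as the statistic, and compares it with random residue sets of the same size. All names are hypothetical.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Average Euclidean distance over all pairs of 3D points."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_pvalue(residue_coords, mutated, n_perm=2000, seed=0):
    """Permutation p-value that mutated residues lie closer together than chance.

    residue_coords: dict mapping residue index -> (x, y, z)
    mutated: list of distinct residue indices carrying mutations (at least 2)
    """
    if len(mutated) < 2:
        raise ValueError("need at least two mutated residues")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mean_pairwise_distance([residue_coords[r] for r in mutated])
    residues = sorted(residue_coords)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(residues, len(mutated))
        if mean_pairwise_distance([residue_coords[r] for r in sample]) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A real analysis would also have to weight residues by recurrence and account for unresolved regions of the structure, which this sketch ignores.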
|
168
|
Solute Carrier Family 26 Member a2 (slc26a2) Regulates Otic Development and Hair Cell Survival in Zebrafish. PLoS One 2015; 10:e0136832. [PMID: 26375458 PMCID: PMC4573323 DOI: 10.1371/journal.pone.0136832] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2014] [Accepted: 08/10/2015] [Indexed: 12/16/2022] Open
Abstract
Hearing loss is one of the most prevalent human birth defects. Genetic factors contribute to the pathogenesis of deafness. It is estimated that one-third of deafness genes have already been identified. The current work is an attempt to find novel genes relevant to hearing loss using guilt-by-profiling and guilt-by-association bioinformatics analyses of approximately 80 known non-syndromic hereditary hearing loss (NSHL) genes. Among the 300 newly identified candidate deafness genes, slc26a2 was selected for functional studies in zebrafish. The slc26a2 gene was knocked down using an antisense morpholino (MO), and significant defects were observed in otolith patterns, semicircular canal morphology, and lateral neuromast distributions in morphants. Loss-of-function defects are caused primarily by apoptosis, and morphants are insensitive to sound stimulation and show imbalanced swimming behaviours. Morphant defects were partially rescued by co-injection of human SLC26A2 mRNA. Together, the results suggest that bioinformatics is capable of predicting new deafness genes and show that slc26a2 is a critical otic gene whose dysfunction may induce hearing impairment.
|
169
|
Abstract
The acquisition of mutations that activate oncogenes or inactivate tumor suppressors is a primary feature of most cancers. Mutations that directly alter protein sequence and structure drive the development of tumors through aberrant expression and modification of proteins, in many cases directly impacting components of signal transduction pathways and cellular architecture. Cancer-associated mutations may have direct or indirect effects on proteins and their interactions and while the effects of mutations on signaling pathways have been widely studied, how mutations alter underlying protein-protein interaction networks is much less well understood. Systematic mapping of oncoprotein protein interactions using proteomics techniques as well as computational network analyses is revealing how oncoprotein mutations perturb protein-protein interaction networks and drive the cancer phenotype.
Affiliation(s)
- Emily Bowler
- Centre for Biological Sciences, University of Southampton, Southampton SO17 1BJ, UK
| | - Zhenghe Wang
- Department of Genetics and Genome Science, Case Western Reserve University, Cleveland, Ohio 44106, USA
| | - Rob M. Ewing
- Centre for Biological Sciences, University of Southampton, Southampton SO17 1BJ, UK
| |
|
170
|
Yeger-Lotem E, Sharan R. Human protein interaction networks across tissues and diseases. Front Genet 2015; 6:257. [PMID: 26347769 PMCID: PMC4541328 DOI: 10.3389/fgene.2015.00257] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2015] [Accepted: 07/17/2015] [Indexed: 11/13/2022] Open
Abstract
Protein interaction networks are an important framework for studying protein function, cellular processes, and genotype-to-phenotype relationships. While our view of the human interaction network is constantly expanding, less is known about networks that form in biologically important contexts such as within distinct tissues or in disease conditions. Here we review efforts to characterize these networks and to harness them to gain insights into the molecular mechanisms underlying human disease.
Affiliation(s)
- Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev Beer-Sheva, Israel
| | - Roded Sharan
- Blavatnik School of Computer Science, Tel Aviv University Tel Aviv, Israel
| |
|
171
|
Lu HC, Chung SS, Fornili A, Fraternali F. Anatomy of protein disorder, flexibility and disease-related mutations. Front Mol Biosci 2015; 2:47. [PMID: 26322316 PMCID: PMC4532925 DOI: 10.3389/fmolb.2015.00047] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2015] [Accepted: 07/29/2015] [Indexed: 01/23/2023] Open
Abstract
Integration of protein structural information with human genetic variation and pathogenic mutations is essential to understand molecular mechanisms associated with the effects of polymorphisms on protein interactions and cellular processes. We investigate occurrences of non-synonymous SNPs in ordered and disordered protein regions by systematic mapping of common variants and disease-related SNPs onto these regions. We show that common variants accumulate in disordered regions; conversely pathogenic variants are significantly depleted in disordered regions. These different occurrences of pathogenic and common SNPs can be attributed to a negative selection on random mutations in structurally highly constrained regions. New approaches in the study of quantitative effects of pathogenic-related mutations should effectively account for all the possible contexts and relative functional constraints in which the sequence variation occurs.
Affiliation(s)
- Hui-Chun Lu
- Randall Division of Cell and Molecular Biophysics, King's College London London, UK
| | - Sun Sook Chung
- Randall Division of Cell and Molecular Biophysics, King's College London London, UK ; Department of Haematological Medicine, King's College London London, UK
| | - Arianna Fornili
- Randall Division of Cell and Molecular Biophysics, King's College London London, UK ; School of Biological and Chemical Sciences, Queen Mary University of London London, UK
| | - Franca Fraternali
- Randall Division of Cell and Molecular Biophysics, King's College London London, UK
| |
|
172
|
Trepte P, Buntru A, Klockmeier K, Willmore L, Arumughan A, Secker C, Zenkner M, Brusendorf L, Rau K, Redel A, Wanker EE. DULIP: A Dual Luminescence-Based Co-Immunoprecipitation Assay for Interactome Mapping in Mammalian Cells. J Mol Biol 2015; 427:3375-88. [PMID: 26264872 DOI: 10.1016/j.jmb.2015.08.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2015] [Revised: 07/31/2015] [Accepted: 08/03/2015] [Indexed: 12/30/2022]
Abstract
Mapping of protein-protein interactions (PPIs) is critical for understanding protein function and complex biological processes. Here, we present DULIP, a dual luminescence-based co-immunoprecipitation assay, for systematic PPI mapping in mammalian cells. DULIP is a second-generation luminescence-based PPI screening method for the systematic and quantitative analysis of co-immunoprecipitations using two different luciferase tags. Benchmarking studies with positive and negative PPI reference sets revealed that DULIP allows the detection of interactions with high sensitivity and specificity. Furthermore, the analysis of a PPI reference set with known binding affinities demonstrated that both low- and high-affinity interactions can be detected with DULIP assays. Finally, using the well-characterized interaction between Syntaxin-1 and Munc18, we found that DULIP is capable of detecting the effects of point mutations on interaction strength. Taken together, our studies demonstrate that DULIP is a sensitive and reliable method of great utility for systematic interactome research. It can be applied for interaction screening and validation of PPIs in mammalian cells. Moreover, DULIP permits the specific analysis of mutation-dependent binding patterns.
Affiliation(s)
- Philipp Trepte
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Alexander Buntru
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Konrad Klockmeier
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Lindsay Willmore
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Anup Arumughan
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Christopher Secker
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Martina Zenkner
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Lydia Brusendorf
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Kirstin Rau
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Alexandra Redel
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany
| | - Erich E Wanker
- Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Robert-Roessle-Straße 10, 13125 Berlin, Germany.
| |
|
173
|
Krogan, PhD NJ, Babu, PhD M. Mapping the Protein-Protein Interactome Networks Using Yeast Two-Hybrid Screens. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2015; 883:187-214. [PMID: 26621469 PMCID: PMC7120425 DOI: 10.1007/978-3-319-23603-2_11] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
The yeast two-hybrid system (Y2H) is a powerful method to identify binary protein-protein interactions in vivo. Here we describe Y2H screening strategies that use defined libraries of open reading frames (ORFs) and cDNA libraries. The array-based Y2H system is well suited for interactome studies of small genomes with existing ORFeome clones, preferably in a recombination-based cloning system. For large genomes, pooled library screening followed by Y2H pairwise retests may be more efficient in terms of time and resources, but multiple sampling is necessary to ensure comprehensive screening. While Y2H false positives can be efficiently reduced by using built-in controls, retesting, and evaluation of background activation, implementing multiple variants of the Y2H vector systems is essential to reduce false negatives and ensure comprehensive coverage of an interactome.
Affiliation(s)
- Nevan J. Krogan, PhD
- Cellular and Molecular Pharmacology, Univ of California, San Francisco, San Francisco, California, USA
| | - Mohan Babu, PhD
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
| |
|
174
|
Jain P, Thukral N, Gahlot LK, Hasija Y. CARDIO-PRED: an in silico tool for predicting cardiovascular-disorder associated proteins. SYSTEMS AND SYNTHETIC BIOLOGY 2015; 9:55-66. [PMID: 25972989 DOI: 10.1007/s11693-015-9164-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/04/2015] [Accepted: 03/06/2015] [Indexed: 10/23/2022]
Abstract
Interactions between proteins largely govern cellular processes, and this has led to numerous efforts culminating in an enormous body of information on proteins, their interactions, and the functions determined by those interactions. The main aim of the present study is to present an interface analysis of cardiovascular-disorder (CVD) related proteins to shed light on the details of their interactions and to emphasize the importance of using structures in network studies. This study combines the network-centred approach with three-dimensional studies to comprehend the fundamentals of biology. Interface properties were used as descriptors to classify CVD-associated and non-CVD-associated proteins. A machine learning algorithm was used to generate a classifier from the training set, which was then used to predict potential CVD-related proteins from a set of polymorphic proteins not known to be involved in any disease. Among the several classification algorithms applied to generate models, the best performance was achieved using Random Forest, with an accuracy of 69.5%. The tool, named CARDIO-PRED and based on this prediction model, is available at http://www.genomeinformatics.dce.edu/CARDIO-PRED/. The predicted CVD-related proteins may not be the causal factor of a particular disease but may be involved in pathways and reactions as yet unknown, thus permitting a more rational analysis of disease mechanism. Study of their interactions with other proteins can significantly improve our understanding of the molecular mechanism of diseases.
Affiliation(s)
- Prerna Jain
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
| | - Nitin Thukral
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
| | - Lokesh Kumar Gahlot
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
| | - Yasha Hasija
- Department of Biotechnology, Delhi Technological University, Shahbad Daulatpur, Main Bawana Road, Delhi, 110042 India
| |
|
175
|
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 161] [Impact Index Per Article: 16.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023] Open
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
| | - Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
| |
|
176
|
Zhang T, Li S, Zuo W. Landscape of protein domain interactome. Protein Cell 2015; 6:610-4. [PMID: 25960191 PMCID: PMC4506283 DOI: 10.1007/s13238-015-0158-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open
Affiliation(s)
- Ting Zhang
- Genome Institute of Singapore, A*STAR, Singapore, 138672 Singapore
| | - Shuang Li
- Tianjin International Joint Academy of Biomedicine, Tianjin, 300457 China
| | - Wei Zuo
- Shanghai Pulmonary Hospital, School of Medicine, Tongji University, Shanghai, 200092 China
| |
|
177
|
Cui H, Dhroso A, Johnson N, Korkin D. The variation game: Cracking complex genetic disorders with NGS and omics data. Methods 2015; 79-80:18-31. [PMID: 25944472 DOI: 10.1016/j.ymeth.2015.04.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2014] [Revised: 03/27/2015] [Accepted: 04/17/2015] [Indexed: 12/14/2022] Open
Abstract
Tremendous advances in Next Generation Sequencing (NGS) and high-throughput omics methods have brought us one step closer towards a mechanistic understanding of complex disease at the molecular level. In this review, we discuss four basic regulatory mechanisms implicated in complex genetic diseases, such as cancer, neurological disorders, heart disease, diabetes, and many others. The mechanisms, including genetic variations, copy-number variations, posttranscriptional variations, and epigenetic variations, can be detected using a variety of NGS methods. We propose that malfunctions detected in these mechanisms are not necessarily independent, since these malfunctions are often found associated with the same disease and targeting the same gene, group of genes, or functional pathway. As an example, we discuss possible rewiring effects of the cancer-associated genetic, structural, and posttranscriptional variations on the protein-protein interaction (PPI) network centered around the P53 protein. The review highlights the multi-layered complexity of common genetic disorders and suggests that integration of NGS and omics data is a critical step in developing new computational methods capable of deciphering this complexity.
Affiliation(s)
- Hongzhu Cui
- Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
| | - Andi Dhroso
- Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
| | - Nathan Johnson
- Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
| | - Dmitry Korkin
- Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States; Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
| |
|
178
|
Kimura H, Tsuboi D, Wang C, Kushima I, Koide T, Ikeda M, Iwayama Y, Toyota T, Yamamoto N, Kunimoto S, Nakamura Y, Yoshimi A, Banno M, Xing J, Takasaki Y, Yoshida M, Aleksic B, Uno Y, Okada T, Iidaka T, Inada T, Suzuki M, Ujike H, Kunugi H, Kato T, Yoshikawa T, Iwata N, Kaibuchi K, Ozaki N. Identification of Rare, Single-Nucleotide Mutations in NDE1 and Their Contributions to Schizophrenia Susceptibility. Schizophr Bull 2015; 41:744-53. [PMID: 25332407 PMCID: PMC4393687 DOI: 10.1093/schbul/sbu147] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
BACKGROUND Nuclear distribution E homolog 1 (NDE1), located within chromosome 16p13.11, plays an essential role in microtubule organization, mitosis, and neuronal migration and has been suggested by several studies of rare copy number variants to be a promising schizophrenia (SCZ) candidate gene. Recently, increasing attention has been paid to rare single-nucleotide variants (SNVs) discovered by deep sequencing of candidate genes, because such SNVs may have large effect sizes and their functional analysis may clarify etiopathology. METHODS AND RESULTS We conducted mutation screening of NDE1 coding exons using 433 SCZ and 145 pervasive developmental disorder samples in order to identify rare single-nucleotide variants with a minor allele frequency ≤5%. We then performed genetic association analysis using a large number of unrelated individuals (3554 SCZ, 1041 bipolar disorder [BD], and 4746 controls). Among the discovered novel rare variants, we detected significant associations between SCZ and S214F (P = .039), and between BD and R234C (P = .032). Furthermore, functional assays showed that S214F affected axonal outgrowth and the interaction between NDE1 and YWHAE (14-3-3 epsilon; a neurodevelopmental regulator). CONCLUSIONS This study strengthens the evidence for association between rare variants within NDE1 and SCZ, and may shed light on the molecular mechanisms underlying this severe psychiatric disorder.
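The abstract reports association P values but does not state which test produced them or how many carriers were observed. As a hedged illustration of how a rare-variant carrier count in cases versus controls is commonly tested, the sketch below implements a one-sided Fisher exact test from the hypergeometric distribution. The function name and the carrier counts in the usage note are hypothetical, not data from the study.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher exact p-value that carriers are over-represented in cases.

    Sums hypergeometric probabilities for every table with at least
    `case_carriers` carriers among the cases, holding the margins fixed.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denom = comb(total, case_total)
    upper = min(carriers, case_total)
    numer = 0
    for k in range(case_carriers, upper + 1):
        # k carriers among cases, the remaining cases drawn from non-carriers
        numer += comb(carriers, k) * comb(total - carriers, case_total - k)
    return numer / denom
```

With the cohort sizes from the abstract and made-up carrier counts, a call might look like `fisher_exact_greater(8, 3554, 2, 4746)`; the value it returns says nothing about the published result, because the real carrier counts are not given here.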
Affiliation(s)
- Hiroki Kimura
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Daisuke Tsuboi
- Department of Cell Pharmacology, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Chenyao Wang
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Itaru Kushima
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Takayoshi Koide
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Masashi Ikeda
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Aichi, Japan
| | - Yoshimi Iwayama
- Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Saitama, Japan
| | - Tomoko Toyota
- Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Saitama, Japan
| | - Noriko Yamamoto
- Department of Mental Disorder Research, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Kodaira, Tokyo, Japan
| | - Shohko Kunimoto
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Yukako Nakamura
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Akira Yoshimi
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Masahiro Banno
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Jingrui Xing
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Yuto Takasaki
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Mami Yoshida
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Branko Aleksic
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Yota Uno
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Takashi Okada
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Tetsuya Iidaka
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Toshiya Inada
- Department of Psychiatry, Seiwa Hospital, Institute of Neuropsychiatry, Shinjuku, Tokyo, Japan
| | - Michio Suzuki
- Department of Neuropsychiatry, Graduate School of Medicine and Pharmaceutical Sciences, University of Toyama, Toyama, Japan
| | - Hiroshi Ujike
- Department of Psychiatry, Ujike Nishiguchi Clinic (HU), Okayama, Japan
| | - Hiroshi Kunugi
- Department of Mental Disorder Research, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Kodaira, Tokyo, Japan
| | - Tadafumi Kato
- Laboratory for Molecular Dynamics of Mental Disorders, RIKEN Brain Science Institute, Wako, Saitama, Japan
| | - Takeo Yoshikawa
- Laboratory for Molecular Psychiatry, RIKEN Brain Science Institute, Wako, Saitama, Japan
| | - Nakao Iwata
- Department of Psychiatry, Fujita Health University School of Medicine, Toyoake, Aichi, Japan
| | - Kozo Kaibuchi
- Department of Cell Pharmacology, Nagoya University Graduate School of Medicine, Nagoya, Japan
| | - Norio Ozaki
- Department of Psychiatry, Nagoya University Graduate School of Medicine, Nagoya, Japan
| |
|
179
|
Brunk E, Rothlisberger U. Mixed Quantum Mechanical/Molecular Mechanical Molecular Dynamics Simulations of Biological Systems in Ground and Electronically Excited States. Chem Rev 2015; 115:6217-63. [PMID: 25880693 DOI: 10.1021/cr500628b] [Citation(s) in RCA: 308] [Impact Index Per Article: 30.8] [Indexed: 02/07/2023]
Affiliation(s)
- Elizabeth Brunk
- †Laboratory of Computational Chemistry and Biochemistry, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; ‡Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, California 94618, United States
| | - Ursula Rothlisberger
- †Laboratory of Computational Chemistry and Biochemistry, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; §National Competence Center of Research (NCCR) MARVEL-Materials' Revolution: Computational Design and Discovery of Novel Materials, 1015 Lausanne, Switzerland
| |
|
180
|
Das J, Gayvert KM, Bunea F, Wegkamp MH, Yu H. ENCAPP: elastic-net-based prognosis prediction and biomarker discovery for human cancers. BMC Genomics 2015; 16:263. [PMID: 25887568 PMCID: PMC4392808 DOI: 10.1186/s12864-015-1465-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 10/03/2014] [Accepted: 03/13/2015] [Indexed: 02/08/2023] Open
Abstract
Background With the explosion of genomic data over the last decade, there has been a tremendous amount of effort to understand the molecular basis of cancer using informatics approaches. However, this has proven to be extremely difficult primarily because of the varied etiology and vast genetic heterogeneity of different cancers and even within the same cancer. One particularly challenging problem is to predict prognostic outcome of the disease for different patients. Results Here, we present ENCAPP, an elastic-net-based approach that combines the reference human protein interactome network with gene expression data to accurately predict prognosis for different human cancers. Our method identifies functional modules that are differentially expressed between patients with good and bad prognosis and uses these to fit a regression model that can be used to predict prognosis for breast, colon, rectal, and ovarian cancers. Using this model, ENCAPP can also identify prognostic biomarkers with a high degree of confidence, which can be used to generate downstream mechanistic and therapeutic insights. Conclusion ENCAPP is a robust method that can accurately predict prognostic outcome and identify biomarkers for different human cancers. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1465-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, 335 Weill Hall, Ithaca, NY, 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
| | - Kaitlyn M Gayvert
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, 10065, USA.
| | - Florentina Bunea
- Department of Statistical Science, Cornell University, Ithaca, NY, 14853, USA.
| | - Marten H Wegkamp
- Department of Statistical Science, Cornell University, Ithaca, NY, 14853, USA.
| | - Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, 335 Weill Hall, Ithaca, NY, 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
| |
|
181
|
Segura J, Marín-López MA, Jones PF, Oliva B, Fernandez-Fuentes N. VORFFIP-driven dock: V-D2OCK, a fast and accurate protein docking strategy. PLoS One 2015; 10:e0118107. [PMID: 25763838 PMCID: PMC4357426 DOI: 10.1371/journal.pone.0118107] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 09/16/2014] [Accepted: 12/27/2014] [Indexed: 12/24/2022] Open
Abstract
The experimental determination of the structure of protein complexes cannot keep pace with the generation of interactomic data, hence resulting in an ever-expanding gap. As the structural details of protein complexes are central to a full understanding of the function and dynamics of the cell machinery, alternative strategies are needed to circumvent the bottleneck in structure determination. Computational protein docking is a valid and valuable approach to model the structure of protein complexes. In this work, we describe a novel computational strategy to predict the structure of protein complexes based on data-driven docking: VORFFIP-driven dock (V-D2OCK). This new approach makes use of our newly described method to predict functional sites in protein structures, VORFFIP, to define the region to be sampled during docking and structural clustering to reduce the number of models to be examined by users. V-D2OCK has been benchmarked using a validated and diverse set of protein complexes and compared to a state-of-the-art docking method. The speed and accuracy compared to contemporary tools justify the potential use of V-D2OCK for high-throughput, genome-wide, protein docking. Finally, we have developed a web interface that allows users to browse and visualize V-D2OCK predictions from the convenience of their web browsers.
Affiliation(s)
- Joan Segura
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
| | - Manuel Alejandro Marín-López
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
| | - Pamela F. Jones
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
| | - Baldo Oliva
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
| | - Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
- * E-mail:
| |
|
182
|
Vázquez M, Valencia A, Pons T. Structure-PPi: a module for the annotation of cancer-related single-nucleotide variants at protein-protein interfaces. Bioinformatics 2015; 31:2397-9. [PMID: 25765346 PMCID: PMC4495296 DOI: 10.1093/bioinformatics/btv142] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 09/02/2014] [Accepted: 03/08/2015] [Indexed: 02/06/2023] Open
Abstract
Motivation: The interpretation of cancer-related single-nucleotide variants (SNVs) considering the protein features they affect, such as known functional sites, protein–protein interfaces, or relation with already annotated mutations, might complement the annotation of genetic variants in the analysis of NGS data. Current tools that annotate mutations fall short on several aspects, including the ability to use protein structure information or the interpretation of mutations in protein complexes. Results: We present the Structure–PPi system for the comprehensive analysis of coding SNVs based on 3D protein structures of protein complexes. The 3D repository used, Interactome3D, includes experimental and modeled structures for proteins and protein–protein complexes. Structure–PPi annotates SNVs with features extracted from UniProt, InterPro, APPRIS, dbNSFP and COSMIC databases. We illustrate the usefulness of Structure–PPi with the interpretation of 1 027 122 non-synonymous SNVs from COSMIC and the 1000G Project that provides a collection of ∼172 700 SNVs mapped onto the protein 3D structure of 8726 human proteins (43.2% of the 20 214 SwissProt-curated proteins in UniProtKB release 2014_06) and protein–protein interfaces with potential functional implications. Availability and implementation: Structure–PPi, along with a user manual and examples, is available at http://structureppi.bioinfo.cnio.es/Structure; the code for local installations is at https://github.com/Rbbt-Workflows. Contact: tpons@cnio.es. Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Miguel Vázquez
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Alfonso Valencia
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Tirso Pons
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| |
|
183
|
Jin J, He K, Tang X, Li Z, Lv L, Zhao Y, Luo J, Gao G. An Arabidopsis Transcriptional Regulatory Map Reveals Distinct Functional and Evolutionary Features of Novel Transcription Factors. Mol Biol Evol 2015; 32:1767-73. [PMID: 25750178 PMCID: PMC4476157 DOI: 10.1093/molbev/msv058] [Citation(s) in RCA: 103] [Impact Index Per Article: 10.3] [Indexed: 12/24/2022] Open
Abstract
Transcription factors (TFs) play key roles in both development and stress responses. By integrating into and rewiring original systems, novel TFs contribute significantly to the evolution of transcriptional regulatory networks. Here, we report a high-confidence transcriptional regulatory map covering 388 TFs from 47 families in Arabidopsis. Systematic analysis of this map revealed the architectural heterogeneity of developmental and stress response subnetworks and identified three types of novel network motifs that are absent from unicellular organisms and essential for multicellular development. Moreover, TFs of novel families that emerged during plant landing present higher binding specificities and are preferentially wired into developmental processes and these novel network motifs. Further unveiled connection between the binding specificity and wiring preference of TFs explains the wiring preferences of novel-family TFs. These results reveal distinct functional and evolutionary features of novel TFs, suggesting a plausible mechanism for their contribution to the evolution of multicellular organisms.
Affiliation(s)
- Jinpu Jin
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Peking University, Beijing, P.R. China
| | - Kun He
- Monsanto Biotechnology R&D Center, Beijing, P.R. China
| | - Xing Tang
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Peking University, Beijing, P.R. China
| | - Zhe Li
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, P.R. China
| | - Le Lv
- Monsanto Biotechnology R&D Center, Beijing, P.R. China
| | - Yi Zhao
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Peking University, Beijing, P.R. China
| | - Jingchu Luo
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Peking University, Beijing, P.R. China
| | - Ge Gao
- State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Center for Bioinformatics, Peking University, Beijing, P.R. China
| |
|
184
|
Mosca R, Tenorio-Laranga J, Olivella R, Alcalde V, Céol A, Soler-López M, Aloy P. dSysMap: exploring the edgetic role of disease mutations. Nat Methods 2015; 12:167-8. [DOI: 10.1038/nmeth.3289] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 11/09/2022]
|
185
|
Reimand J, Wagih O, Bader GD. Evolutionary constraint and disease associations of post-translational modification sites in human genomes. PLoS Genet 2015; 11:e1004919. [PMID: 25611800 PMCID: PMC4303425 DOI: 10.1371/journal.pgen.1004919] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 07/04/2014] [Accepted: 11/24/2014] [Indexed: 12/14/2022] Open
Abstract
Interpreting the impact of human genome variation on phenotype is challenging. The functional effect of protein-coding variants is often predicted using sequence conservation and population frequency data; however, other factors are likely relevant. We hypothesized that variants in protein post-translational modification (PTM) sites contribute to phenotype variation and disease. We analyzed the fraction of rare variants and the non-synonymous to synonymous variant ratio (Ka/Ks) in 7,500 human genomes and found a significant negative selection signal in PTM regions independent of six factors, including conservation, codon usage, and GC-content, that is widely distributed across tissue-specific genes and function classes. PTM regions are also enriched in known disease mutations, suggesting that PTM variation is more likely deleterious. PTM constraint also affects flanking sequence around modified residues and increases around clustered sites, indicating presence of functionally important short linear motifs. Using target site motifs of 124 kinases, we predict at least ∼180,000 motif-breaker amino acid residues that disrupt PTM sites when substituted, and highlight kinase motifs that show specific negative selection and enrichment of disease mutations. We provide this dataset with corresponding hypothesized mechanisms as a community resource. As an example of our integrative approach, we propose that PTPN11 variants in Noonan syndrome aberrantly activate the protein by disrupting an uncharacterized cluster of phosphorylation sites. Further, as PTMs are molecular switches that are modulated by drugs, we study mutated binding sites of PTM enzymes in disease genes and define a drug-disease network containing 413 novel predicted disease-gene links.
Affiliation(s)
- Jüri Reimand
- The Donnelly Centre, University of Toronto, Canada
- * E-mail: (JR); (GDB)
| | - Omar Wagih
- The Donnelly Centre, University of Toronto, Canada
| | - Gary D. Bader
- The Donnelly Centre, University of Toronto, Canada
- * E-mail: (JR); (GDB)
| |
|
186
|
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q. A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 2015; 5:7702. [PMID: 25572661 PMCID: PMC4287731 DOI: 10.1038/srep07702] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 09/24/2014] [Accepted: 12/08/2014] [Indexed: 12/17/2022] Open
Abstract
The comprehensive mapping of protein-protein interactions (PPIs) is highly desired for one to gain deep insights into both fundamental cell biology processes and the pathology of diseases. Finely-set small-scale experiments are not only very expensive but also inefficient to identify numerous interactomes despite their high accuracy. High-throughput screening techniques enable efficient identification of PPIs; yet the desire to further extract useful knowledge from these data leads to the problem of binary interactome mapping. Network topology-based approaches prove to be highly efficient in addressing this problem; however, their performance deteriorates significantly on sparse putative PPI networks. Motivated by the success of collaborative filtering (CF)-based approaches to the problem of personalized-recommendation on large, sparse rating matrices, this work aims at implementing a highly efficient CF-based approach to binary interactome mapping. To achieve this, we first propose a CF framework for it. Under this framework, we model the given data into an interactome weight matrix, where the feature-vectors of involved proteins are extracted. With them, we design the rescaled cosine coefficient to model the inter-neighborhood similarity among involved proteins, for taking the mapping process. Experimental results on three large, sparse datasets demonstrate that the proposed approach outperforms several sophisticated topology-based approaches significantly.
Affiliation(s)
- Xin Luo
- X. Luo, Y. Xia and Q. Zhu are with the College of Computer Science, Chongqing University, Chongqing, 400044 China
- X. Luo, Z. You, S. Li and H. Leung are with the Department of Computing, Hong Kong Polytechnic University, Hong Kong, HK 999077, China
| | - Zhuhong You
- X. Luo, Z. You, S. Li and H. Leung are with the Department of Computing, Hong Kong Polytechnic University, Hong Kong, HK 999077, China
| | - Mengchu Zhou
- M. Zhou is with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA
| | - Shuai Li
- X. Luo, Z. You, S. Li and H. Leung are with the Department of Computing, Hong Kong Polytechnic University, Hong Kong, HK 999077, China
| | - Hareton Leung
- X. Luo, Z. You, S. Li and H. Leung are with the Department of Computing, Hong Kong Polytechnic University, Hong Kong, HK 999077, China
| | - Yunni Xia
- X. Luo, Y. Xia and Q. Zhu are with the College of Computer Science, Chongqing University, Chongqing, 400044 China
| | - Qingsheng Zhu
- X. Luo, Y. Xia and Q. Zhu are with the College of Computer Science, Chongqing University, Chongqing, 400044 China
| |
|
187
|
Pardo EP, Godzik A. Analysis of individual protein regions provides novel insights on cancer pharmacogenomics. PLoS Comput Biol 2015; 11:e1004024. [PMID: 25568936 PMCID: PMC4287345 DOI: 10.1371/journal.pcbi.1004024] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/26/2014] [Accepted: 11/01/2014] [Indexed: 01/22/2023] Open
Abstract
The promise of personalized cancer medicine cannot be fulfilled until we gain better understanding of the connections between the genomic makeup of a patient's tumor and its response to anticancer drugs. Several datasets that include both pharmacologic profiles of cancer cell lines as well as their genomic alterations have been recently developed and extensively analyzed. However, most analyses of these datasets assume that mutations in a gene will have the same consequences regardless of their location. While this assumption might be correct in some cases, such analyses may miss subtler, yet still relevant, effects mediated by mutations in specific protein regions. Here we study such perturbations by separating effects of mutations in different protein functional regions (PFRs), including protein domains and intrinsically disordered regions. Using this approach, we have been able to identify 171 novel associations between mutations in specific PFRs and changes in the activity of 24 drugs that couldn't be recovered by traditional gene-centric analyses. Our results demonstrate how focusing on individual protein regions can provide novel insights into the mechanisms underlying the drug sensitivity of cancer cell lines. Moreover, while these new correlations are identified using only data from cancer cell lines, we have been able to validate some of our predictions using data from actual cancer patients. Our findings highlight how gene-centric experiments (such as systematic knock-out or silencing of individual genes) are missing relevant effects mediated by perturbations of specific protein regions. All the associations described here are available from http://www.cancer3d.org. There is increasing evidence that altering different functional regions within the same protein can lead to dramatically distinct phenotypes. 
Here we show how, by focusing on individual regions instead of whole proteins, we are able to identify novel correlations that predict the activity of anticancer drugs. We have also used proteomic data from both cancer cell lines and actual cancer patients to explore the molecular mechanisms underlying some of these region-drug associations. We finally show how associations found between protein regions and drugs using only data from cancer cell lines can predict the survival of cancer patients.
Affiliation(s)
- Eduard Porta Pardo
- Program on Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, La Jolla, California, United States of America
| | - Adam Godzik
- Program on Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, La Jolla, California, United States of America
- * E-mail:
| |
|
188
|
Conant GC. Structure, Interaction, and Evolution: Reflections on the Natural History of Proteins. Evolutionary Biology: Biodiversification from Genotype to Phenotype 2015:187-201. [DOI: 10.1007/978-3-319-19932-0_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
189
|
Engin HB, Hofree M, Carter H. Identifying mutation specific cancer pathways using a structurally resolved protein interaction network. Pacific Symposium on Biocomputing 2015; 20:84-95. [PMID: 25592571 PMCID: PMC4299875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Here we present a method for extracting candidate cancer pathways from tumor 'omics data while explicitly accounting for diverse consequences of mutations for protein interactions. Disease-causing mutations are frequently observed at either core or interface residues mediating protein interactions. Mutations at core residues frequently destabilize protein structure while mutations at interface residues can specifically affect the binding energies of protein-protein interactions. As a result, mutations in a protein may result in distinct interaction profiles and thus have different phenotypic consequences. We describe a protein structure-guided pipeline for extracting interacting protein sets specific to a particular mutation. Of 59 cancer genes with 3D co-complexed structures in the Protein Data Bank, 43 showed evidence of mutations with different functional consequences. Literature survey reciprocated functional predictions specific to distinct mutations on APC, ATRX, BRCA1, CBL and HRAS. Our analysis suggests that accounting for mutation-specific perturbations to cancer pathways will be essential for personalized cancer therapy.
Affiliation(s)
- H. Billur Engin
- School of Medicine, University of California San Diego, 9500 Gilman Dr. San Diego, CA 92093, USA
| | - Matan Hofree
- Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Dr. San Diego, CA 92093, USA
| | | |
|
190
|
Li J, Shi M, Ma Z, Zhao S, Euskirchen G, Ziskin J, Urban A, Hallmayer J, Snyder M. Integrated systems analysis reveals a molecular network underlying autism spectrum disorders. Mol Syst Biol 2014; 10:774. [PMID: 25549968 PMCID: PMC4300495 DOI: 10.15252/msb.20145487] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Indexed: 12/04/2022] Open
Abstract
Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology.
Affiliation(s)
- Jingjing Li
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Minyi Shi
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Zhihai Ma
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Shuchun Zhao
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Ghia Euskirchen
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Jennifer Ziskin
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Alexander Urban
- Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
| | - Joachim Hallmayer
- Department of Psychiatry & Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
| | - Michael Snyder
- Department of Genetics, Stanford Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA, USA
| |
|
191
|
A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations. PLoS Genet 2014; 10:e1004819. [PMID: 25502805 PMCID: PMC4263371 DOI: 10.1371/journal.pgen.1004819] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 03/28/2014] [Accepted: 10/14/2014] [Indexed: 12/13/2022] Open
Abstract
Understanding the functional relevance of DNA variants is essential for all exome and genome sequencing projects. However, current mutagenesis cloning protocols require Sanger sequencing, and thus are prohibitively costly and labor-intensive. We describe a massively-parallel site-directed mutagenesis approach, "Clone-seq", leveraging next-generation sequencing to rapidly and cost-effectively generate a large number of mutant alleles. Using Clone-seq, we further develop a comparative interactome-scanning pipeline integrating high-throughput GFP, yeast two-hybrid (Y2H), and mass spectrometry assays to systematically evaluate the functional impact of mutations on protein stability and interactions. We use this pipeline to show that disease mutations on protein-protein interaction interfaces are significantly more likely than those away from interfaces to disrupt corresponding interactions. We also find that mutation pairs with similar molecular phenotypes in terms of both protein stability and interactions are significantly more likely to cause the same disease than those with different molecular phenotypes, validating the in vivo biological relevance of our high-throughput GFP and Y2H assays, and indicating that both assays can be used to determine candidate disease mutations in the future. The general scheme of our experimental pipeline can be readily expanded to other types of interactome-mapping methods to comprehensively evaluate the functional relevance of all DNA variants, including those in non-coding regions.
|
192
|
Keith BP, Robertson DL, Hentges KE. Locus heterogeneity disease genes encode proteins with high interconnectivity in the human protein interaction network. Front Genet 2014; 5:434. [PMID: 25538735 PMCID: PMC4260505 DOI: 10.3389/fgene.2014.00434] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/17/2014] [Accepted: 11/24/2014] [Indexed: 01/20/2023] Open
Abstract
Mutations in genes potentially lead to a number of genetic diseases with differing severity. These disease genes have been the focus of research in recent years showing that the disease gene population as a whole is not homogeneous, and can be categorized according to their interactions. Locus heterogeneity describes a single disorder caused by mutations in different genes each acting individually to cause the same disease. Using datasets of experimentally derived human disease genes and protein interactions, we created a protein interaction network to investigate the relationships between the products of genes associated with a disease displaying locus heterogeneity, and use network parameters to suggest properties that distinguish these disease genes from the overall disease gene population. Through the manual curation of known causative genes of 100 diseases displaying locus heterogeneity and 397 single-gene Mendelian disorders, we use network parameters to show that our locus heterogeneity network displays distinct properties from the global disease network and a Mendelian network. Using the global human proteome, through random simulation of the network we show that heterogeneous genes display significant interconnectivity. Further topological analysis of this network revealed clustering of locus heterogeneity genes that cause identical disorders, indicating that these disease genes are involved in similar biological processes. We then use this information to suggest additional genes that may contribute to diseases with locus heterogeneity.
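The random-simulation step described in this abstract can be illustrated with a small permutation test. This is only a sketch under stated assumptions, not the authors' procedure: the function names, the toy edge list, and the choice of uniform random sampling (which ignores node degree) are all hypothetical. It counts interactions among a gene set and compares that count with equally sized random gene sets drawn from the same network.

```python
import random


def count_internal_edges(edges, genes):
    """Count interactions whose two endpoints both lie in `genes`."""
    genes = set(genes)
    return sum(1 for a, b in edges if a in genes and b in genes)


def interconnectivity_p_value(edges, gene_set, n_permutations=1000, seed=0):
    """Empirical p-value for the number of interactions within `gene_set`.

    Assumption: the null model draws node sets uniformly at random from the
    network; a degree-preserving null would be stricter.
    """
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    observed = count_internal_edges(edges, gene_set)
    k = len(set(gene_set))
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(nodes, k)
        if count_internal_edges(edges, sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical toy network: genes A, B and C form a fully connected triangle.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E"),
             ("E", "F"), ("G", "H"), ("C", "D")]
observed, p_value = interconnectivity_p_value(toy_edges, ["A", "B", "C"])
```

A small p-value here would suggest that the gene set is more interconnected than random sets of the same size in this toy network.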
Affiliation(s)
- Benjamin P Keith
- Faculty of Life Sciences, University of Manchester, Manchester, UK
|
193
|
Abstract
The assembly of individual proteins into functional complexes is fundamental to nearly all biological processes. In recent decades, many thousands of homomeric and heteromeric protein complex structures have been determined, greatly improving our understanding of the fundamental principles that control symmetric and asymmetric quaternary structure organization. Furthermore, our conception of protein complexes has moved beyond static representations to include dynamic aspects of quaternary structure, including conformational changes upon binding, multistep ordered assembly pathways, and structural fluctuations occurring within fully assembled complexes. Finally, major advances have been made in our understanding of protein complex evolution, both in reconstructing evolutionary histories of specific complexes and in elucidating general mechanisms that explain how quaternary structure tends to evolve. The evolution of quaternary structure occurs via changes in self-assembly state or through the gain or loss of protein subunits, and these processes can be driven by both adaptive and nonadaptive influences.
Affiliation(s)
- Joseph A Marsh
- Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
|
194
|
Katsonis P, Koire A, Wilson SJ, Hsu TK, Lua RC, Wilkins AD, Lichtarge O. Single nucleotide variations: biological impact and theoretical interpretation. Protein Sci 2014; 23:1650-66. [PMID: 25234433 PMCID: PMC4253807 DOI: 10.1002/pro.2552] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 08/06/2014] [Revised: 09/12/2014] [Accepted: 09/15/2014] [Indexed: 12/27/2022]
Abstract
Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Amanda Koire
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
- Stephen Joseph Wilson
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Teng-Kuei Hsu
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Angela Dawn Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, Texas
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
|
195
|
Sun HY, Hou TJ, Zhang HY. Finding chemical drugs for genetic diseases. Drug Discov Today 2014; 19:1836-40. [DOI: 10.1016/j.drudis.2014.09.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 04/15/2014] [Revised: 07/24/2014] [Accepted: 09/15/2014] [Indexed: 02/03/2023]
|
196
|
Zhang XZ, Quan Y, Tang GY. Medical genetics-based drug repurposing for Alzheimer's disease. Brain Res Bull 2014; 110:26-9. [PMID: 25446738 DOI: 10.1016/j.brainresbull.2014.11.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/08/2014] [Revised: 11/12/2014] [Accepted: 11/13/2014] [Indexed: 12/31/2022]
Abstract
Alzheimer's disease (AD) is a disease that threatens the elderly. No efficient therapeutic method is currently available to combat AD. Drug repurposing has provided a new route for AD drug discovery, and medical genetics has shown potential in target-based drug repurposing. We compared AD-associated genes with approved drug targets and found that three are targeted by 23 approved drugs. Thus, these drugs may be used to treat AD according to the medical genetic information of the targets. In vitro and in vivo experiments revealed that four drugs, all of which are angiotensin-converting enzyme (ACE) inhibitors, had potential to treat AD.
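The comparison of disease-associated genes with approved drug targets described in this abstract amounts to a set intersection. The sketch below assumes a simple mapping from drug name to target genes; the function name, the gene symbols, and the drug names are hypothetical placeholders, not data from the paper.

```python
from collections import defaultdict


def drugs_by_disease_gene(disease_genes, drug_targets):
    """Map each disease-associated gene to the drugs that target it.

    `drug_targets` is assumed to be a dict of drug name -> iterable of
    target gene symbols. Genes with no matching drug are omitted.
    """
    disease = set(disease_genes)
    hits = defaultdict(set)
    for drug, targets in drug_targets.items():
        for gene in disease.intersection(targets):
            hits[gene].add(drug)
    return {gene: sorted(drugs) for gene, drugs in hits.items()}


# Hypothetical toy inputs (placeholder identifiers only).
toy_disease_genes = ["GENE1", "GENE2", "GENE3"]
toy_drug_targets = {
    "drugA": ["GENE1"],
    "drugB": ["GENE1", "GENE9"],
    "drugC": ["GENE8"],
}
candidates = drugs_by_disease_gene(toy_disease_genes, toy_drug_targets)
```

In this toy input only GENE1 is both disease-associated and targeted, so drugA and drugB would be the repurposing candidates to follow up experimentally.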
Affiliation(s)
- Xiu-Zhen Zhang
- School of Life Sciences, Shandong University of Technology, Zibo 255049, PR China.
- Yuan Quan
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China.
- Guang-Yan Tang
- Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan 430070, PR China.
|
197
|
Betts MJ, Lu Q, Jiang Y, Drusko A, Wichmann O, Utz M, Valtierra-Gutiérrez IA, Schlesner M, Jaeger N, Jones DT, Pfister S, Lichter P, Eils R, Siebert R, Bork P, Apic G, Gavin AC, Russell RB. Mechismo: predicting the mechanistic impact of mutations and modifications on molecular interactions. Nucleic Acids Res 2014; 43:e10. [PMID: 25392414 PMCID: PMC4333368 DOI: 10.1093/nar/gku1094] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Indexed: 01/21/2023] Open
Abstract
Systematic interrogation of mutation or protein modification data is important to identify sites with functional consequences and to deduce global consequences from large data sets. Mechismo (mechismo.russellab.org) enables simultaneous consideration of thousands of 3D structures and biomolecular interactions to predict rapidly mechanistic consequences for mutations and modifications. As useful functional information often only comes from homologous proteins, we benchmarked the accuracy of predictions as a function of protein/structure sequence similarity, which permits the use of relatively weak sequence similarities with an appropriate confidence measure. For protein–protein, protein–nucleic acid and a subset of protein–chemical interactions, we also developed and benchmarked a measure of whether modifications are likely to enhance or diminish the interactions, which can assist the detection of modifications with specific effects. Analysis of high-throughput sequencing data shows that the approach can identify interesting differences between cancers, and application to proteomics data finds potential mechanistic insights for how post-translational modifications can alter biomolecular interactions.
Affiliation(s)
- Matthew J Betts
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Qianhao Lu
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- YingYing Jiang
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Armin Drusko
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Oliver Wichmann
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Mathias Utz
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Ilse A Valtierra-Gutiérrez
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Matthias Schlesner
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Natalie Jaeger
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- David T Jones
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Stefan Pfister
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Peter Lichter
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Roland Eils
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Deutsches Krebsforschungszentrum, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Department for Bioinformatics and Functional Genomics, Institute for Pharmacy and Molecular Biotechnology (IPMB), University of Heidelberg, Heidelberg, Germany
- Reiner Siebert
- Institut für Humangenetik, Universitätsklinikum Schleswig-Holstein, Christian-Albrechts-Universität zu Kiel, Arnold Heller Straße 3, 24105 Kiel, Germany
- Peer Bork
- EMBL, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Gordana Apic
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Cambridge Cell Networks Ltd, St John's Innovation Centre, Cowley Road, CB3 0WS, Cambridge, UK
- Robert B Russell
- Cell Networks, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
|
198
|
Porta-Pardo E, Hrabe T, Godzik A. Cancer3D: understanding cancer mutations through protein structures. Nucleic Acids Res 2014; 43:D968-73. [PMID: 25392415 PMCID: PMC4383948 DOI: 10.1093/nar/gku1140] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/04/2022] Open
Abstract
The new era of cancer genomics is providing us with extensive knowledge of mutations and other alterations in cancer. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of structures of proteins in which they are found. The database also helps users analyze the distribution patterns of the mutations as well as their relationship to changes in drug activity through two algorithms: e-Driver and e-Drug. These algorithms use knowledge of modular structure of genes and proteins to separately study each region. This approach allows users to find novel candidate driver regions or drug biomarkers that cannot be found when similar analyses are done on the whole-gene level. The Cancer3D database provides access to the results of such analyses based on data from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE). In addition, it displays mutations from over 14 700 proteins mapped to more than 24 300 structures from PDB. This helps users visualize the distribution of mutations and identify novel three-dimensional patterns in their distribution.
Affiliation(s)
- Eduard Porta-Pardo
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Thomas Hrabe
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
- Adam Godzik
- Bioinformatics and Systems Biology Program, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, CA 92037, USA
|
199
|
Wang H, Zheng H. Organized Modularity in the Interactome: Evidence from the Analysis of Dynamic Organization in the Cell Cycle. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:1264-1270. [PMID: 26357062 DOI: 10.1109/tcbb.2014.2318715] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
The organization of global protein interaction networks (PINs) has been extensively studied and heatedly debated. We revisited this issue in the context of the analysis of dynamic organization of a PIN in the yeast cell cycle. Statistically significant bimodality was observed when analyzing the distribution of the differences in expression peak between periodically expressed partners. A close look at their behavior revealed that date and party hubs derived from this analysis have some distinct features. There are no significant differences between them in terms of protein essentiality, expression correlation and semantic similarity derived from gene ontology (GO) biological process hierarchy. However, date hubs exhibit significantly greater values than party hubs in terms of semantic similarity derived from both GO molecular function and cellular component hierarchies. Relating to three-dimensional structures, we found that both single- and multi-interface proteins could become date hubs coordinating multiple functions performed at different times while party hubs are mainly multi-interface proteins. Furthermore, we constructed and analyzed a PPI network specific to the human cell cycle and highlighted that the dynamic organization in human interactome is far more complex than the dichotomy of hubs observed in the yeast cell cycle.
|
200
|
Cukuroglu E, Engin HB, Gursoy A, Keskin O. Hot spots in protein–protein interfaces: Towards drug discovery. Progress in Biophysics and Molecular Biology 2014; 116:165-73. [DOI: 10.1016/j.pbiomolbio.2014.06.003] [Citation(s) in RCA: 113] [Impact Index Per Article: 10.3] [Received: 03/06/2014] [Revised: 05/30/2014] [Accepted: 06/12/2014] [Indexed: 11/16/2022]
|