101
Quantifying protein interaction dynamics by SWATH mass spectrometry: application to the 14-3-3 system. Nat Methods 2013; 10:1246-53. [PMID: 24162925 DOI: 10.1038/nmeth.2703] [Citation(s) in RCA: 244] [Impact Index Per Article: 22.2] [Received: 02/05/2013] [Accepted: 09/25/2013] [Indexed: 12/22/2022]
Abstract
Protein complexes and protein interaction networks are essential mediators of most biological functions. Complexes supporting transient functions such as signal transduction processes are frequently subject to dynamic remodeling. Currently, the majority of studies on the composition of protein complexes are carried out by affinity purification and mass spectrometry (AP-MS) and present a static view of the system. For a better understanding of inherently dynamic biological processes, methods to reliably quantify temporal changes of protein interaction networks are essential. Here we used affinity purification combined with sequential window acquisition of all theoretical spectra (AP-SWATH) mass spectrometry to study the dynamics of the 14-3-3β scaffold protein interactome after stimulation of the insulin-PI3K-AKT pathway. The consistent and reproducible quantification of 1,967 proteins across all stimulation time points provided insights into the 14-3-3β interactome and its dynamic changes following IGF1 stimulation. We therefore establish AP-SWATH as a tool to quantify dynamic changes in protein-complex interaction networks.
102
Jayaswal V, Schramm SJ, Mann GJ, Wilkins MR, Yang YH. VAN: an R package for identifying biologically perturbed networks via differential variability analysis. BMC Res Notes 2013; 6:430. [PMID: 24156242 PMCID: PMC4015612 DOI: 10.1186/1756-0500-6-430] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 05/21/2013] [Accepted: 10/18/2013] [Indexed: 12/26/2022]
Abstract
BACKGROUND: Large-scale molecular interaction networks are dynamic in nature and are of special interest in the analysis of complex diseases, which are characterized by network-level perturbations rather than changes in individual genes/proteins. The methods developed for the identification of differentially expressed genes or gene sets are not suitable for network-level analyses. Consequently, bioinformatics approaches that enable a joint analysis of high-throughput transcriptomics datasets and large-scale molecular interaction networks for identifying perturbed networks are gaining popularity. Typically, these approaches require the sequential application of multiple bioinformatics techniques: ID mapping, network analysis, and network visualization. Here, we present the Variability Analysis in Networks (VAN) software package: a collection of R functions to streamline this bioinformatics analysis. FINDINGS: VAN determines whether there are network-level perturbations across biological states of interest. It first identifies hubs (densely connected proteins/microRNAs) in a network and then uses them to extract network modules (each comprising a hub and all its interaction partners). The function identifySignificantHubs identifies dysregulated modules (i.e. modules with changes in expression correlation between a hub and its interaction partners) using a single expression and network dataset. The function summarizeHubData identifies dysregulated modules based on a meta-analysis of multiple expression and/or network datasets. VAN also converts protein identifiers present in a MITAB-formatted interaction network to gene identifiers (UniProt identifier to Entrez identifier or gene symbol using the function generatePpiMap) and generates microRNA-gene interaction networks using the TargetScan and Microcosm databases (generateMicroRnaMap). The function obtainCancerInfo is used to identify hubs (corresponding to significantly perturbed modules) that are already causally associated with cancer(s) in the Cancer Gene Census database. Additionally, VAN supports the visualization of changes to network modules in R and Cytoscape (visualizeNetwork and obtainPairSubset, respectively). We demonstrate the utility of VAN using gene expression data from metastatic melanoma and a protein-protein interaction network from the Human Protein Reference Database. CONCLUSIONS: Our package provides a comprehensive and user-friendly platform for the integrative analysis of -omics data to identify disease-associated network modules. This bioinformatics approach, which is essentially focused on the question of explaining phenotype with a 'network type' and in particular, how regulation is changing among different states of interest, is relevant to many questions including those related to network perturbations across developmental timelines.
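To make the idea of "changes in expression correlation between a hub and its interaction partners" concrete, the following Python sketch scores one module by the mean absolute change in hub-partner Pearson correlation between two biological states. It is not VAN's code and does not reproduce VAN's test statistic: the function names `pearson` and `hub_correlation_shift`, the scoring rule, and the toy expression values are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def hub_correlation_shift(expr_a, expr_b, hub, partners):
    """Mean absolute change in hub-partner correlation between two states.

    expr_a / expr_b map a gene name to its expression values across the
    samples of state A / state B (hypothetical input layout).
    """
    shifts = [
        abs(pearson(expr_a[hub], expr_a[p]) - pearson(expr_b[hub], expr_b[p]))
        for p in partners
    ]
    return sum(shifts) / len(shifts)


# Toy data (invented): partner P1 flips from co-expressed to anti-correlated,
# partner P2 stays anti-correlated in both states.
state_a = {"HUB": [1, 2, 3, 4], "P1": [2, 4, 6, 8], "P2": [4, 3, 2, 1]}
state_b = {"HUB": [1, 2, 3, 4], "P1": [8, 6, 4, 2], "P2": [4, 3, 2, 1]}

score = hub_correlation_shift(state_a, state_b, "HUB", ["P1", "P2"])
print(score)
```

A real analysis would also need a significance assessment, for example by permuting sample labels, which this sketch leaves out.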
Affiliation(s)
- Vivek Jayaswal
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Sarah-Jane Schramm
  - Westmead Millennium Institute for Medical Research, Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
  - Melanoma Institute Australia, Sydney, NSW, Australia
- Graham J Mann
  - Westmead Millennium Institute for Medical Research, Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
  - Melanoma Institute Australia, Sydney, NSW, Australia
- Marc R Wilkins
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
  - Systems Biology Initiative, University of New South Wales, Sydney, NSW, Australia
- Yee Hwa Yang
  - School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
  - Melanoma Institute Australia, Sydney, NSW, Australia
103
The functional interactome landscape of the human histone deacetylase family. Mol Syst Biol 2013; 9:672. [PMID: 23752268 PMCID: PMC3964310 DOI: 10.1038/msb.2013.26] [Citation(s) in RCA: 218] [Impact Index Per Article: 19.8] [Received: 01/30/2013] [Accepted: 04/29/2013] [Indexed: 12/22/2022]
Abstract
This study presents the first global protein interaction network for all 11 human HDACs in T cells and an integrative mass spectrometry approach for profiling relative interaction stability within isolated protein complexes.
T-cell lines stably expressing each of the human HDACs (1 - 11), C-terminally tagged with both EGFP and FLAG, were generated using retroviral transduction. Affinity purification coupled to mass spectrometry-based proteomics (AP-MS) was used to build the first global protein interaction network for all eleven human HDACs in T cells. An optimized label free AP-MS and computational workflow was developed for profiling relative interaction stability among isolated protein complexes. HDAC11 is a member of the “survival of motor neuron” protein complex with a functional role in mRNA splicing.
Histone deacetylases (HDACs) are a diverse family of essential transcriptional regulatory enzymes that function through the spatial and temporal recruitment of protein complexes. As the composition and regulation of HDAC complexes are only partially characterized, we built the first global protein interaction network for all 11 human HDACs in T cells. Integrating fluorescence microscopy, immunoaffinity purifications, quantitative mass spectrometry, and bioinformatics, we identified over 200 unreported interactions for both well-characterized and lesser-studied HDACs, a subset of which were validated by orthogonal approaches. We establish HDAC11 as a member of the survival of motor neuron complex and pinpoint a functional role in mRNA splicing. We designed a complementary label-free and metabolic-labeling mass spectrometry-based proteomics strategy for profiling interaction stability among different HDAC classes, revealing that HDAC1 interactions within chromatin-remodeling complexes are largely stable, while transcription factors preferentially exist in rapid equilibrium. Overall, this study represents a valuable resource for investigating HDAC functions in health and disease, encompassing emerging themes of HDAC regulation in cell cycle and RNA processing and a deeper functional understanding of HDAC complex stability.
104
Wang X, Thijssen B, Yu H. Target essentiality and centrality characterize drug side effects. PLoS Comput Biol 2013; 9:e1003119. [PMID: 23874169 PMCID: PMC3708859 DOI: 10.1371/journal.pcbi.1003119] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 12/02/2012] [Accepted: 05/15/2013] [Indexed: 01/19/2023]
Abstract
To investigate factors contributing to drug side effects, we systematically examine relationships between 4,199 side effects associated with 996 drugs and their 647 human protein targets. We find that it is the number of essential targets, not the number of total targets, that determines the side effects of corresponding drugs. Furthermore, within the context of a three-dimensional interaction network with atomic-resolution interaction interfaces, we find that drugs causing more side effects are also characterized by high degree and betweenness of their targets and highly shared interaction interfaces on these targets. Our findings suggest that both essentiality and centrality of a drug target are key factors contributing to side effects and should be taken into consideration in rational drug design.
The ultimate goal of medical research is to develop effective treatments for disease with minimal side effects. Currently, about 20% of drug candidates fail at clinical trial phases II and III due to safety issues. Therefore, understanding the determining factors of drug side effects is of paramount importance to human health and the pharmaceutical industry. Here, we present the first systematic study to uncover key factors leading to drug side effects within the framework of the human protein interactome network. Our results show that it is the number of essential targets, not the number of total targets, of a drug that determines the occurrence of its side effects. Furthermore, we find that the centrality, both degree and betweenness, of the drug targets is also an important determining factor of drug side effects. Our findings will shed light on new factors to be incorporated into the drug development pipeline.
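The two centrality measures named in this abstract, degree and betweenness, can be illustrated with a small standard-library Python sketch. It uses Brandes' algorithm for unweighted, undirected graphs; the toy network and the node names (`T1`, `A`, `B`, `C`, `D`) are invented for illustration and are not taken from the study's interactome.

```python
from collections import deque


def degree(graph):
    """Number of interaction partners of each node."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: b / 2 for v, b in bc.items()}


# Toy network: hypothetical target T1 bridges A, B and C; D hangs off C.
network = {
    "T1": {"A", "B", "C"},
    "A": {"T1"},
    "B": {"T1"},
    "C": {"T1", "D"},
    "D": {"C"},
}

deg = degree(network)
btw = betweenness(network)
print(deg["T1"], btw["T1"], btw["C"])
```

In this toy network `T1` has both the highest degree and the highest betweenness, which is the kind of target the abstract associates with more side effects.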
Affiliation(s)
- Xiujuan Wang
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
- Bram Thijssen
  - Department of Bioinformatics, Maastricht University, Maastricht, The Netherlands
- Haiyuan Yu
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
105
Schramm SJ, Li SS, Jayaswal V, Fung DCY, Campain AE, Pang CNI, Scolyer RA, Yang YH, Mann GJ, Wilkins MR. Disturbed protein-protein interaction networks in metastatic melanoma are associated with worse prognosis and increased functional mutation burden. Pigment Cell Melanoma Res 2013; 26:708-22. [PMID: 23738911 DOI: 10.1111/pcmr.12126] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/29/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022]
Abstract
For disseminated melanoma, new prognostic biomarkers and therapeutic targets are urgently needed. The organization of protein-protein interaction networks was assessed via the transcriptomes of four independent studies of metastatic melanoma and related to clinical outcome and MAP-kinase pathway mutations (BRAF/NRAS). We also examined patient outcome-related differences in a predicted network of microRNAs and their targets. The 32 hub genes with the most reproducible survival-related disturbances in co-expression with their protein partner genes included oncogenes and tumor suppressors, previously known correlates of prognosis, and other proteins not previously associated with melanoma outcome. Notably, this network-based gene set could classify patients according to clinical outcomes with 67-80% accuracy among cohorts. Reproducibly disturbed networks were also more likely to have a higher functional mutation burden than would be expected by chance. The disturbed regions of networks are therefore markers of clinically relevant, selectable tumor evolution in melanoma which may carry driver mutations.
Affiliation(s)
- Sarah-Jane Schramm
  - Sydney Medical School, The University of Sydney at Westmead Millennium Institute for Medical Research, Sydney, NSW, Australia
106
Abstract
Digenic inheritance (DI) is the simplest form of inheritance for genetically complex diseases. By contrast with the thousands of reports that mutations in single genes cause human diseases, there are only dozens of human disease phenotypes with evidence for DI in some pedigrees. The advent of high-throughput sequencing (HTS) has made it simpler to identify monogenic disease causes and could similarly simplify proving DI because one can simultaneously find mutations in two genes in the same sample. However, through 2012, I could find only one example of human DI in which HTS was used; in that example, HTS found only the second of the two genes. To explore the gap between expectation and reality, I tried to collect all examples of human DI with a narrow definition and characterise them according to the types of evidence collected, and whether there has been replication. Two strong trends are that knowledge of candidate genes and knowledge of protein–protein interactions (PPIs) have been helpful in most published examples of human DI. By contrast, the positional method of genetic linkage analysis has been mostly unsuccessful in identifying genes underlying human DI. Based on the empirical data, I suggest that combining HTS with growing networks of established PPIs may expedite future discoveries of human DI and strengthen the evidence for them.
107
Das J, Vo TV, Wei X, Mellor JC, Tong V, Degatano AG, Wang X, Wang L, Cordero NA, Kruer-Zerhusen N, Matsuyama A, Pleiss JA, Lipkin SM, Yoshida M, Roth FP, Yu H. Cross-species protein interactome mapping reveals species-specific wiring of stress response pathways. Sci Signal 2013; 6:ra38. [PMID: 23695164 DOI: 10.1126/scisignal.2003350] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]
Abstract
The fission yeast Schizosaccharomyces pombe has more metazoan-like features than the budding yeast Saccharomyces cerevisiae, yet it has similarly facile genetics. We present a large-scale verified binary protein-protein interactome network, "StressNet," based on high-throughput yeast two-hybrid screens of interacting proteins classified as part of stress response and signal transduction pathways in S. pombe. We performed systematic, cross-species interactome mapping using StressNet and a protein interactome network of orthologous proteins in S. cerevisiae. With cross-species comparative network studies, we detected a previously unidentified component (Snr1) of the S. pombe mitogen-activated protein kinase Sty1 pathway. Coimmunoprecipitation experiments showed that Snr1 interacted with Sty1 and that deletion of snr1 increased the sensitivity of S. pombe cells to stress. Comparison of StressNet with the interactome network of orthologous proteins in S. cerevisiae showed that most of the interactions among these stress response and signaling proteins are not conserved between species but are "rewired"; orthologous proteins have different binding partners in both species. In particular, transient interactions connecting proteins in different functional modules were more likely to be rewired than conserved. By directly testing interactions between proteins in one yeast species and their corresponding binding partners in the other yeast species with yeast two-hybrid assays, we found that about half of the interactions that are traditionally considered "conserved" form modified interaction interfaces that may potentially accommodate novel functions.
Affiliation(s)
- Jishnu Das
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Tommy V Vo
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Xiaomu Wei
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Medicine, Weill Cornell College of Medicine, New York, NY 10021, USA
- Joseph C Mellor
  - Donnelly Centre, University of Toronto, Toronto, ON M5S-3E1, Canada
- Virginia Tong
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Andrew G Degatano
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Xiujuan Wang
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Lihua Wang
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nicolas A Cordero
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nathan Kruer-Zerhusen
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Akihisa Matsuyama
  - Chemical Genetics Laboratory, RIKEN Advanced Science Institute, Wako, Saitama 351-0198, Japan
  - CREST Research Project, JST, Kawaguchi, Saitama 332-0012, Japan
- Jeffrey A Pleiss
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Steven M Lipkin
  - Department of Medicine, Weill Cornell College of Medicine, New York, NY 10021, USA
- Minoru Yoshida
  - Chemical Genetics Laboratory, RIKEN Advanced Science Institute, Wako, Saitama 351-0198, Japan
  - CREST Research Project, JST, Kawaguchi, Saitama 332-0012, Japan
  - Department of Biotechnology, Graduate School of Agriculture and Life Sciences, University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Frederick P Roth
  - Donnelly Centre, University of Toronto, Toronto, ON M5S-3E1, Canada
  - Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S-3E1, Canada
  - Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, MA 02115
  - Harvard Medical School, Boston, MA 02115
  - Samuel Lunenfeld Research Institute, Mt. Sinai Hospital, Toronto, ON M5G-1X5, Canada
  - Genetic Networks Program, Canadian Institute for Advanced Research, Toronto, ON M5G-1Z8, Canada
- Haiyuan Yu
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
108
Tripathi LP, Kambara H, Chen YA, Nishimura Y, Moriishi K, Okamoto T, Morita E, Abe T, Mori Y, Matsuura Y, Mizuguchi K. Understanding the Biological Context of NS5A–Host Interactions in HCV Infection: A Network-Based Approach. J Proteome Res 2013; 12:2537-51. [DOI: 10.1021/pr3011217] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023]
Affiliation(s)
- Lokesh P. Tripathi
  - National Institute of Biomedical Innovation, 7-6-8 Saito Asagi, Ibaraki, Osaka, 567-0085, Japan
- Hiroto Kambara
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Yi-An Chen
  - National Institute of Biomedical Innovation, 7-6-8 Saito Asagi, Ibaraki, Osaka, 567-0085, Japan
- Yorihiro Nishimura
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Kohji Moriishi
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Toru Okamoto
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Eiji Morita
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Takayuki Abe
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Yoshio Mori
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Yoshiharu Matsuura
  - Department of Molecular Virology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamada-Oka, Suita, Osaka, 565-0871, Japan
- Kenji Mizuguchi
  - National Institute of Biomedical Innovation, 7-6-8 Saito Asagi, Ibaraki, Osaka, 567-0085, Japan
  - Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamada-Oka, Suita, Osaka, 565-0871, Japan
109
Varjosalo M, Keskitalo S, Van Drogen A, Nurkkala H, Vichalkovski A, Aebersold R, Gstaiger M. The protein interaction landscape of the human CMGC kinase group. Cell Rep 2013; 3:1306-20. [PMID: 23602568 DOI: 10.1016/j.celrep.2013.03.027] [Citation(s) in RCA: 151] [Impact Index Per Article: 13.7] [Received: 12/14/2012] [Revised: 03/01/2013] [Accepted: 03/18/2013] [Indexed: 12/24/2022]
Abstract
Cellular information processing via reversible protein phosphorylation requires tight control of the localization, activity, and substrate specificity of protein kinases, which to a large extent is accomplished by complex formation with other proteins. Despite their critical role in cellular regulation and pathogenesis, protein interaction information is available for only a subset of the 518 human protein kinases. Here we present a global proteomic analysis of complexes of the human CMGC kinase group. In addition to subgroup-specific functional enrichment and modularity, the identified 652 high-confidence kinase-protein interactions provide a specific biochemical context for many poorly studied CMGC kinases. Furthermore, the analysis revealed a kinase-kinase subnetwork and candidate substrates for CMGC kinases. Finally, the presented interaction proteome uncovered a large set of interactions with proteins genetically linked to a range of human diseases, including cancer, suggesting additional routes for analyzing the role of CMGC kinases in controlling human disease pathways.
Affiliation(s)
- Markku Varjosalo
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
110
Meyer MJ, Das J, Wang X, Yu H. INstruct: a database of high-quality 3D structurally resolved protein interactome networks. Bioinformatics 2013; 29:1577-9. [PMID: 23599502 DOI: 10.1093/bioinformatics/btt181] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Indexed: 12/30/2022]
Abstract
INstruct is a database of high-quality, 3D, structurally resolved protein interactome networks in human and six model organisms. INstruct combines the scale of available high-quality binary protein interaction data with the specificity of atomic-resolution structural information derived from co-crystal evidence using a tested interaction interface inference method. Its web interface is designed to allow for flexible search based on standard and organism-specific protein and gene-naming conventions, visualization of protein architecture highlighting interaction interfaces and viewing and downloading custom 3D structurally resolved interactome datasets. AVAILABILITY: INstruct is freely available on the web at http://instruct.yulab.org with all major browsers supported.
Affiliation(s)
- Michael J Meyer
  - Department of Biological Statistics and Computational Biology and Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
111
Ferreira RM, Rybarczyk-Filho JL, Dalmolin RJS, Castro MAA, Moreira JCF, Brunnet LG, de Almeida RMC. Preferential duplication of intermodular hub genes: an evolutionary signature in eukaryotes genome networks. PLoS One 2013; 8:e56579. [PMID: 23468868 PMCID: PMC3582557 DOI: 10.1371/journal.pone.0056579] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 04/25/2012] [Accepted: 01/14/2013] [Indexed: 12/31/2022]
Abstract
Whole genome protein-protein association networks are not random and their topological properties stem from genome evolution mechanisms. In fact, more connected but less clustered proteins are related to genes that, in general, present more paralogs as compared to other genes, indicating frequent previous gene duplication episodes. On the other hand, genes related to conserved biological functions present few or no paralogs and yield proteins that are highly connected and clustered. These general network characteristics must have an evolutionary explanation. Considering data from the STRING database, we present here experimental evidence that, more than not being scale free, protein degree distributions of organisms present an increased probability for high degree nodes. Furthermore, based on this experimental evidence, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated with a probability that linearly grows with gene degree and decreases with its clustering coefficient. For the first time a model yields results that simultaneously describe different topological distributions. Also, this model correctly predicts that, to produce protein-protein association networks with number of links and number of nodes in the observed range for Eukaryotes, 90% gene duplication and 10% de novo gene acquisition are necessary. This scenario implies a universal mechanism for genome evolution.
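The growth rule described in this abstract can be sketched as a toy simulation in Python. This is an illustration under stated assumptions, not the authors' model: the abstract does not give the functional form, so the duplication weight `k * (1 - C)` (degree times one minus clustering coefficient), the small constant added to avoid all-zero weights, the full copying of the parent's links, the single link for de novo genes, and the three-node seed network are all invented here. Only the 90%/10% split and the qualitative dependence on degree and clustering come from the text.

```python
import random


def clustering(graph, v):
    """Local clustering coefficient of node v in an undirected graph."""
    nbrs = list(graph[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def grow(steps, p_dup=0.9, seed=0):
    """Grow a network by duplication (prob. p_dup) or de novo acquisition."""
    rng = random.Random(seed)
    graph = {0: {1}, 1: {0, 2}, 2: {1}}  # assumed seed network: 3-node path
    for _ in range(steps):
        nodes = list(graph)
        new = len(graph)  # node ids are consecutive integers
        if rng.random() < p_dup:
            # Duplication: parent chosen with weight rising with degree and
            # falling with clustering (assumed form k * (1 - C)).
            weights = [
                len(graph[v]) * (1.0 - clustering(graph, v)) + 1e-9
                for v in nodes
            ]
            parent = rng.choices(nodes, weights=weights)[0]
            targets = set(graph[parent])  # the copy inherits all parent links
        else:
            # De novo gene: one link placed by preferential attachment.
            weights = [len(graph[v]) for v in nodes]
            targets = {rng.choices(nodes, weights=weights)[0]}
        graph[new] = set()
        for t in targets:
            graph[new].add(t)
            graph[t].add(new)
    return graph


net = grow(200)
degrees = sorted((len(nbrs) for nbrs in net.values()), reverse=True)
print(len(net), degrees[:5])
```

Changing `p_dup` and comparing the resulting degree and clustering distributions is the kind of experiment the abstract describes, although matching real STRING networks would require the authors' actual rules.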
Affiliation(s)
- Ricardo M. Ferreira
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rodrigo J. S. Dalmolin
  - Departamento de Bioquímica, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Mauro A. A. Castro
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - National Institute of Science and Technology for Complex Systems, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- José C. F. Moreira
  - Departamento de Bioquímica, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Leonardo G. Brunnet
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rita M. C. de Almeida
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - National Institute of Science and Technology for Complex Systems, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
112
Clancy T, Rødland EA, Nygard S, Hovig E. Predicting physical interactions between protein complexes. Mol Cell Proteomics 2013; 12:1723-34. [PMID: 23438732 DOI: 10.1074/mcp.o112.019828] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
Abstract
Protein complexes enact most biochemical functions in the cell. Dynamic interactions between protein complexes are frequent in many cellular processes. As they are often of a transient nature, they may be difficult to detect using current genome-wide screens. Here, we describe a method to computationally predict physical interactions between protein complexes, applied to both humans and yeast. We integrated manually curated protein complexes and physical protein interaction networks, and we designed a statistical method to identify pairs of protein complexes where the number of protein interactions between a complex pair is due to an actual physical interaction between the complexes. An evaluation against manually curated physical complex-complex interactions in yeast revealed that 50% of these interactions could be predicted in this manner. A community network analysis of the highest scoring pairs revealed a biologically sensible organization of physical complex-complex interactions in the cell. Such analyses of proteomes may serve as a guide to the discovery of novel functional cellular relationships.
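One simple way to ask whether two complexes share more protein interactions than chance would predict is an over-representation test. The Python sketch below uses a hypergeometric tail probability for this purpose. It is a stand-in chosen for illustration, not the statistical method of the paper (which the abstract does not specify); the function names, the assumption that the two complexes share no members, and the five-protein toy network are all hypothetical.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def count_between(complex_a, complex_b, edges):
    """Number of observed interactions with one protein in each complex."""
    return sum(
        1 for u in complex_a for v in complex_b if frozenset((u, v)) in edges
    )


def complex_pair_pvalue(complex_a, complex_b, proteins, edges):
    """Chance of seeing at least the observed number of cross interactions.

    Assumes the complexes are disjoint and treats every protein pair in the
    network as equally likely to interact (a strong simplification).
    """
    n_pairs = comb(len(proteins), 2)            # all possible protein pairs
    n_cross = len(complex_a) * len(complex_b)   # possible cross-complex pairs
    k = count_between(complex_a, complex_b, edges)
    return hypergeom_tail(k, n_pairs, len(edges), n_cross)


# Toy network (invented): 5 proteins, 3 interactions, all between A and B.
proteins = {"p1", "p2", "p3", "p4", "p5"}
edges = {frozenset(e) for e in [("p1", "p3"), ("p1", "p4"), ("p2", "p3")]}
complex_a = {"p1", "p2"}
complex_b = {"p3", "p4"}

p_value = complex_pair_pvalue(complex_a, complex_b, proteins, edges)
print(p_value)
```

A low value suggests the cross-complex interactions are unlikely to be a by-product of overall network density; a real analysis would also correct for testing many complex pairs.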
Affiliation(s)
- Trevor Clancy
  - Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital and Oslo University Hospital, Oslo, Norway
113
Li C, Liakata M, Rebholz-Schuhmann D. Biological network extraction from scientific literature: state of the art and challenges. Brief Bioinform 2013; 15:856-77. [PMID: 23434632 DOI: 10.1093/bib/bbt006] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022]
Abstract
Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to be able to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall, but especially the full integration of relevant information from the scientific literature is still an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full text parsing, to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research.
|
114
|
Blandin G, Marchand S, Charton K, Danièle N, Gicquel E, Boucheteil JB, Bentaib A, Barrault L, Stockholm D, Bartoli M, Richard I. A human skeletal muscle interactome centered on proteins involved in muscular dystrophies: LGMD interactome. Skelet Muscle 2013; 3:3. [PMID: 23414517 PMCID: PMC3610214 DOI: 10.1186/2044-5040-3-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 08/03/2012] [Accepted: 02/07/2013] [Indexed: 02/01/2023] Open
Abstract
Background: The complexity of the skeletal muscle and the identification of numerous human disease-causing mutations in its constitutive proteins make it an interesting tissue for proteomic studies aimed at understanding functional relationships of interacting proteins in both health and diseases. Method: We undertook a large-scale study using two-hybrid screens and a human skeletal-muscle cDNA library to establish a proteome-scale map of protein-protein interactions centered on proteins involved in limb-girdle muscular dystrophies (LGMD). LGMD is a group of more than 20 different neuromuscular disorders that principally affect the proximal pelvic and shoulder girdle muscles. Results and conclusion: The interaction network we unraveled incorporates 1018 proteins connected by 1492 direct binary interactions and includes 1420 novel protein-protein interactions. Computational, experimental and literature-based analyses were performed to assess the overall quality of this network. Interestingly, LGMD proteins were shown to be highly interconnected, in particular indirectly through sarcomeric proteins. In-depth mining of the LGMD-centered interactome identified new candidate genes for orphan LGMDs and other neuromuscular disorders. The data also suggest the existence of functional links between LGMD2B/dysferlin and gene regulation, between LGMD2C/γ-sarcoglycan and energy control and between LGMD2G/telethonin and maintenance of genome integrity. This dataset represents a valuable resource for future functional investigations.
Affiliation(s)
- Gaëlle Blandin
- Généthon CNRS UMR8587, 1, rue de l'Internationale, Evry 91000, France.
|
115
|
A survey of protein interaction data and multigenic inherited disorders. BMC Bioinformatics 2013; 14:47. [PMID: 23398688 PMCID: PMC3598893 DOI: 10.1186/1471-2105-14-47] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/20/2012] [Accepted: 02/05/2013] [Indexed: 11/15/2022] Open
Abstract
Background: Multigenic diseases are often associated with protein complexes or interactions involved in the same pathway. We wanted to estimate to what extent this is true given a consolidated protein interaction data set. The study stresses data integration and data representation issues. Results: We constructed 497 multigenic disease groups from OMIM and tested for overlaps with interaction and pathway data. A total of 159 disease groups had significant overlaps with protein interaction data consolidated by iRefIndex. A further 68 disease overlaps were found only in the KEGG pathway database. No single database contained all significant overlaps thus stressing the importance of data integration. We also found that disease groups overlapped with all three interaction data types: n-ary, spoke-represented complexes and binary data – thus stressing the importance of considering each of these data types separately. Conclusions: Almost half of our multigenic disease groups could potentially be explained by protein complexes and pathways. However, the fact that no database or data type was able to cover all disease groups suggests that no single database has systematically covered all disease groups for potential related complex and pathway data. This survey provides a basis for further curation efforts to confirm and search for overlaps between diseases and interaction data. The accompanying R script can be used to reproduce the work and track progress in this area as databases change. Disease group overlaps can be further explored using the iRefscape plugin for Cytoscape.
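Illustrative note: the abstract above reports "significant overlaps" between disease gene groups and interaction data but does not say which statistical test was used. A hypergeometric tail probability is one common way to score such an overlap, and the sketch below assumes that choice. The function name and the toy numbers are hypothetical and are not taken from the paper or its R script.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric draw.

    population: number of genes in the background set (assumed universe)
    successes:  number of those genes that belong to the complex or pathway
    draws:      number of genes in the disease group
    observed:   number of disease-group genes found in the complex or pathway
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Toy example (made-up numbers): a 4-gene disease group shares 3 genes
# with a 5-protein complex, against a background of 100 genes.
p_value = hypergeom_tail(population=100, successes=5, draws=4, observed=3)
print(f"overlap p-value: {p_value:.6f}")
```

A small p-value here only says the overlap is unlikely under random sampling from the chosen background; the result depends strongly on how that background set is defined, and testing many disease groups would also call for a multiple-testing correction.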
|
116
|
Predicting PDZ domain mediated protein interactions from structure. BMC Bioinformatics 2013; 14:27. [PMID: 23336252 PMCID: PMC3602153 DOI: 10.1186/1471-2105-14-27] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 02/15/2012] [Accepted: 12/19/2012] [Indexed: 12/03/2022] Open
Abstract
Background: PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. Results: We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training–testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. Conclusions: We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on training–testing domain sequence similarity. Using both predictors, we defined a functional map of human PDZ domain biology and predict novel PDZ domain function. Users may access our structure-based and previous sequence-based predictors at http://webservice.baderlab.org/domains/POW.
|
117
|
Naegle KM, White FM, Lauffenburger DA, Yaffe MB. Robust co-regulation of tyrosine phosphorylation sites on proteins reveals novel protein interactions. Mol Biosyst 2013; 8:2771-82. [PMID: 22851037 DOI: 10.1039/c2mb25200g] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 01/08/2023]
Abstract
Cell signaling networks propagate information from extracellular cues via dynamic modulation of protein-protein interactions in a context-dependent manner. Networks based on receptor tyrosine kinases (RTKs), for example, phosphorylate intracellular proteins in response to extracellular ligands, resulting in dynamic protein-protein interactions that drive phenotypic changes. Most commonly used methods for discovering these protein-protein interactions, however, are optimized for detecting stable, longer-lived complexes, rather than the type of transient interactions that are essential components of dynamic signaling networks such as those mediated by RTKs. Substrate phosphorylation downstream of RTK activation modifies substrate activity and induces phospho-specific binding interactions, resulting in the formation of large transient macromolecular signaling complexes. Since protein complex formation should follow the trajectory of events that drive it, we reasoned that mining phosphoproteomic datasets for highly similar dynamic behavior of measured phosphorylation sites on different proteins could be used to predict novel, transient protein-protein interactions that had not been previously identified. We applied this method to explore signaling events downstream of EGFR stimulation. Our computational analysis of robustly co-regulated phosphorylation sites, based on multiple clustering analysis of quantitative time-resolved mass-spectrometry phosphoproteomic data, not only identified known sitewise-specific recruitment of proteins to EGFR, but also predicted novel, a priori interactions. A particularly intriguing prediction of EGFR interaction with the cytoskeleton-associated protein PDLIM1 was verified within cells using co-immunoprecipitation and in situ proximity ligation assays. Our approach thus offers a new way to discover protein-protein interactions in a dynamic context- and phosphorylation site-specific manner.
Affiliation(s)
- Kristen M Naegle
- The David H. Koch Institute for Integrative Cancer Research, Washington University in St. Louis, St. Louis, MO 63130, USA.
|
118
|
Xin X, Gfeller D, Cheng J, Tonikian R, Sun L, Guo A, Lopez L, Pavlenco A, Akintobi A, Zhang Y, Rual JF, Currell B, Seshagiri S, Hao T, Yang X, Shen YA, Salehi-Ashtiani K, Li J, Cheng AT, Bouamalay D, Lugari A, Hill DE, Grimes ML, Drubin DG, Grant BD, Vidal M, Boone C, Sidhu SS, Bader GD. SH3 interactome conserves general function over specific form. Mol Syst Biol 2013; 9:652. [PMID: 23549480 PMCID: PMC3658277 DOI: 10.1038/msb.2013.9] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 09/27/2012] [Accepted: 02/20/2013] [Indexed: 12/20/2022] Open
Abstract
Src homology 3 (SH3) domains bind peptides to mediate protein-protein interactions that assemble and regulate dynamic biological processes. We surveyed the repertoire of SH3 binding specificity using peptide phage display in a metazoan, the worm Caenorhabditis elegans, and discovered that it structurally mirrors that of the budding yeast Saccharomyces cerevisiae. We then mapped the worm SH3 interactome using stringent yeast two-hybrid and compared it with the equivalent map for yeast. We found that the worm SH3 interactome resembles the analogous yeast network because it is significantly enriched for proteins with roles in endocytosis. Nevertheless, orthologous SH3 domain-mediated interactions are highly rewired. Our results suggest a model of network evolution where general function of the SH3 domain network is conserved over its specific form.
Affiliation(s)
- Xiaofeng Xin
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- David Gfeller
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Jackie Cheng
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Raffi Tonikian
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Lin Sun
- Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Ailan Guo
- Cell Signaling Technology, Danvers, MA, USA
- Lianet Lopez
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Alevtina Pavlenco
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Adenrele Akintobi
- Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Yingnan Zhang
- Department of Early Discovery Biochemistry, Genentech, South San Francisco, CA, USA
- Jean-François Rual
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Bridget Currell
- Department of Molecular Biology, Genentech, South San Francisco, CA, USA
- Tong Hao
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Xinping Yang
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Yun A Shen
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Kourosh Salehi-Ashtiani
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Jingjing Li
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Aaron T Cheng
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Dryden Bouamalay
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Adrien Lugari
- IMR Laboratory, UPR 3243, Institut de Microbiologie de la Méditérannée, CNRS and Aix-Marseille Université, Marseille Cedex 20, France
- David E Hill
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Mark L Grimes
- Division of Biological Sciences, Center for Structural and Functional Neuroscience, The University of Montana, Missoula, MT, USA
- David G Drubin
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA
- Barth D Grant
- Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Charles Boone
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Sachdev S Sidhu
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
119
|
Choi H, Liu G, Mellacheruvu D, Tyers M, Gingras AC, Nesvizhskii AI. Analyzing protein-protein interactions from affinity purification-mass spectrometry data with SAINT. Curr Protoc Bioinformatics 2012; Chapter 8:8.15.1-8.15.23. [PMID: 22948729 DOI: 10.1002/0471250953.bi0815s39] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Indexed: 11/09/2022]
Abstract
Significance Analysis of INTeractome (SAINT) is a software package for scoring protein-protein interactions based on label-free quantitative proteomics data (e.g., spectral count or intensity) in affinity purification-mass spectrometry (AP-MS) experiments. SAINT allows bench scientists to select bona fide interactions and remove nonspecific interactions in an unbiased manner. However, there is no 'one-size-fits-all' statistical model for every dataset, since the experimental design varies across studies. Key variables include the number of baits, the number of biological replicates per bait, and control purifications. Here we give a detailed account of input data format, control data, selection of high-confidence interactions, and visualization of filtered data. We explain additional options for customizing the statistical model for optimal filtering in specific datasets. We also discuss a graphical user interface of SAINT in connection to the LIMS system ProHits, which can be installed as a virtual machine on Mac OS X or Windows computers.
Affiliation(s)
- Hyungwon Choi
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore
|
120
|
Abstract
Protein complex identification is an important goal of protein-protein interaction analysis. To date, development of computational methods for detecting protein complexes has been largely motivated by genome-scale interaction data sets from high-throughput assays such as yeast two-hybrid or tandem affinity purification coupled with mass spectrometry (TAP-MS). However, due to the popularity of small to intermediate-scale affinity purification-mass spectrometry (AP-MS) experiments, protein complex detection is increasingly discussed in local network analysis. In such data sets, protein complexes cannot be detected using binary interaction data alone because the data contain interactions with tagged proteins only and, as a result, interactions between all other proteins remain unobserved, limiting the scope of existing algorithms. In this article, we provide a pragmatic review of network graph-based computational algorithms for protein complex analysis in global interactome data, without requiring any computational background. We discuss the practical gap in applying these algorithms to recently surging small to intermediate-scale AP-MS data sets, and review alternative clustering algorithms using quantitative proteomics data and their limitations.
Affiliation(s)
- Hyungwon Choi
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore.
|
121
|
Armean IM, Lilley KS, Trotter MWB. Popular computational methods to assess multiprotein complexes derived from label-free affinity purification and mass spectrometry (AP-MS) experiments. Mol Cell Proteomics 2012; 12:1-13. [PMID: 23071097 DOI: 10.1074/mcp.r112.019554] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022] Open
Abstract
Advances in sensitivity, resolution, mass accuracy, and throughput have considerably increased the number of protein identifications made via mass spectrometry. Despite these advances, state-of-the-art experimental methods for the study of protein-protein interactions yield more candidate interactions than may be expected biologically owing to biases and limitations in the experimental methodology. In silico methods, which distinguish between true and false interactions, have been developed and applied successfully to reduce the number of false positive results yielded by physical interaction assays. Such methods may be grouped according to: (1) the type of data used: methods based on experiment-specific measurements (e.g., spectral counts or identification scores) versus methods that extract knowledge encoded in external annotations (e.g., public interaction and functional categorisation databases); (2) the type of algorithm applied: the statistical description and estimation of physical protein properties versus predictive supervised machine learning or text-mining algorithms; (3) the type of protein relation evaluated: direct (binary) interaction of two proteins in a cocomplex versus probability of any functional relationship between two proteins (e.g., co-occurrence in a pathway, sub cellular compartment); and (4) initial motivation: elucidation of experimental data by evaluation versus prediction of novel protein-protein interaction, to be experimentally validated a posteriori. This work reviews several popular computational scoring methods and software platforms for protein-protein interactions evaluation according to their methodology, comparative strengths and weaknesses, data representation, accessibility, and availability. The scoring methods and platforms described include: CompPASS, SAINT, Decontaminator, MINT, IntAct, STRING, and FunCoup. 
References to related work are provided throughout in order to provide a concise but thorough introduction to a rapidly growing interdisciplinary field of investigation.
Affiliation(s)
- Irina M Armean
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, CB2 1GA, UK
|
122
|
Isokpehi RD, Udensi UK, Anyanwu MN, Mbah AN, Johnson MO, Edusei K, Bauer MA, Hall RA, Awofolu OR. Knowledge building insights on biomarkers of arsenic toxicity to keratinocytes and melanocytes. Biomark Insights 2012; 7:127-41. [PMID: 23115478 PMCID: PMC3480875 DOI: 10.4137/bmi.s7799] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022] Open
Abstract
Exposure to inorganic arsenic induces skin cancer and abnormal pigmentation in susceptible humans. High-throughput gene transcription assays such as DNA microarrays allow for the identification of biological pathways affected by arsenic that lead to initiation and progression of skin cancer and abnormal pigmentation. The overall purpose of the reported research was to determine knowledge building insights on biomarker genes for arsenic toxicity to human epidermal cells by integrating a collection of gene lists annotated with biological information. The information sets included toxicogenomics gene-chemical interaction; enzymes encoded in the human genome; enriched biological information associated with genes; environmentally relevant gene sequence variation; and effects of non-synonymous single nucleotide polymorphisms (SNPs) on protein function. Molecular network construction for arsenic upregulated genes TNFSF18 (tumor necrosis factor [ligand] superfamily member 18) and IL1R2 (interleukin 1 Receptor, type 2) revealed subnetwork interconnections to E2F4, an oncogenic transcription factor, predominantly expressed at the onset of keratinocyte differentiation. Visual analytics integration of gene information sources helped identify RAC1, a GTP binding protein, and TFRC, an iron uptake protein as prioritized arsenic-perturbed protein targets for biological processes leading to skin hyperpigmentation. RAC1 regulates the formation of dendrites that transfer melanin from melanocytes to neighboring keratinocytes. Increased melanocyte dendricity is correlated with hyperpigmentation. TFRC is a key determinant of the amount and location of iron in the epidermis. Aberrant TFRC expression could impair cutaneous iron metabolism leading to abnormal pigmentation seen in some humans exposed to arsenicals. The reported findings contribute to insights on how arsenic could impair the function of genes and biological pathways in epidermal cells. 
Finally, we developed visual analytics resources to facilitate further exploration of the information and knowledge building insights on arsenic toxicity to human epidermal keratinocytes and melanocytes.
Affiliation(s)
- Raphael D Isokpehi
- RCMI Center for Environmental Health, College of Science, Engineering and Technology, Jackson State University, Jackson, MS, USA. ; Center for Bioinformatics & Computational Biology, Department of Biology, Jackson State University, Jackson, MS, USA
|
123
|
New horizons for antiviral drug discovery from virus–host protein interaction networks. Curr Opin Virol 2012; 2:606-13. [DOI: 10.1016/j.coviro.2012.09.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 07/16/2012] [Revised: 09/05/2012] [Accepted: 09/05/2012] [Indexed: 12/21/2022]
|
124
|
Babu M, Vlasblom J, Pu S, Guo X, Graham C, Bean BDM, Burston HE, Vizeacoumar FJ, Snider J, Phanse S, Fong V, Tam YYC, Davey M, Hnatshak O, Bajaj N, Chandran S, Punna T, Christopolous C, Wong V, Yu A, Zhong G, Li J, Stagljar I, Conibear E, Wodak SJ, Emili A, Greenblatt JF. Interaction landscape of membrane-protein complexes in Saccharomyces cerevisiae. Nature 2012; 489:585-9. [PMID: 22940862 DOI: 10.1038/nature11354] [Citation(s) in RCA: 176] [Impact Index Per Article: 14.7] [Received: 05/02/2011] [Accepted: 06/27/2012] [Indexed: 01/03/2023]
Abstract
Macromolecular assemblies involving membrane proteins (MPs) serve vital biological roles and are prime drug targets in a variety of diseases. Large-scale affinity purification studies of soluble-protein complexes have been accomplished for diverse model organisms, but no global characterization of MP-complex membership has been described so far. Here we report a complete survey of 1,590 putative integral, peripheral and lipid-anchored MPs from Saccharomyces cerevisiae, which were affinity purified in the presence of non-denaturing detergents. The identities of the co-purifying proteins were determined by tandem mass spectrometry and subsequently used to derive a high-confidence physical interaction map encompassing 1,726 membrane protein-protein interactions and 501 putative heteromeric complexes associated with the various cellular membrane systems. Our analysis reveals unexpected physical associations underlying the membrane biology of eukaryotes and delineates the global topological landscape of the membrane interactome.
Affiliation(s)
- Mohan Babu
- Banting and Best Department of Medical Research, Donnelly Centre, 160 College Street, University of Toronto, Toronto, Ontario M5S 3E1, Canada
|
125
|
De Las Rivas J, Fontanillo C. Protein-protein interaction networks: unraveling the wiring of molecular machines within the cell. Brief Funct Genomics 2012; 11:489-96. [PMID: 22908212 DOI: 10.1093/bfgp/els036] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Indexed: 01/23/2023] Open
Abstract
Mapping and understanding of the protein interaction networks with their key modules and hubs can provide deeper insights into the molecular machinery underlying complex phenotypes. In this article, we present the basic characteristics and definitions of protein networks, starting with a distinction of the different types of associations between proteins. We focus the review on protein-protein interactions (PPIs), a subset of associations defined as physical contacts between proteins that occur by selective molecular docking in a particular biological context. We present such definition as opposed to other types of protein associations derived from regulatory, genetic, structural or functional relations. To determine PPIs, a variety of binary and co-complex methods exist; however, not all the technologies provide the same information and data quality. A way of increasing confidence in a given protein interaction is to integrate orthogonal experimental evidences. The use of several complementary methods testing each single interaction assesses the accuracy of PPI data and tries to minimize the occurrence of false interactions. Following this approach there have been important efforts to unify primary databases of experimentally proven PPIs into integrated databases. These meta-databases provide a measure of the confidence of interactions based on the number of experimental proofs that report them. As a conclusion, we can state that integrated information allows the building of more reliable interaction networks. Identification of communities, cliques, modules and hubs by analysing the topological parameters and graph properties of the protein networks allows the discovery of central/critical nodes, which are candidates to regulate cellular flux and dynamics.
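Illustrative note: the abstract above mentions two ideas that are easy to show on toy data, namely scoring an interaction by the number of independent experimental methods that report it, and finding hub candidates from node degree. The sketch below is a minimal illustration under assumed inputs; the record layout, protein names and function names are hypothetical and do not reproduce any particular meta-database's scoring scheme.

```python
from collections import defaultdict

# Hypothetical records: (protein_a, protein_b, detection_method).
records = [
    ("P1", "P2", "two-hybrid"),
    ("P2", "P1", "AP-MS"),       # same pair, reversed order, second method
    ("P1", "P3", "two-hybrid"),
    ("P1", "P4", "AP-MS"),
    ("P1", "P4", "AP-MS"),       # repeated method counts only once
]


def evidence_counts(records):
    """Map each unordered protein pair to its number of distinct methods."""
    methods = defaultdict(set)
    for a, b, method in records:
        methods[frozenset((a, b))].add(method)
    return {pair: len(found) for pair, found in methods.items()}


def degrees(pairs):
    """Count how many distinct interaction partners each protein has."""
    degree = defaultdict(int)
    for pair in pairs:
        for protein in pair:
            degree[protein] += 1
    return dict(degree)


counts = evidence_counts(records)
degree = degrees(counts)           # iterating a dict yields its keys (pairs)
hub = max(degree, key=degree.get)  # highest-degree node as a hub candidate
print(counts[frozenset(("P1", "P2"))], hub, degree[hub])
```

Using a `frozenset` as the key treats A-B and B-A as the same interaction, which matches undirected physical interactions; a directed relation (for example, kinase to substrate) would need an ordered tuple instead.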
Affiliation(s)
- Javier De Las Rivas
- Bioinformatics and Functional Genomics Research Group, Cancer Research Center (IBMCC, CSIC/USAL), Salamanca, Spain.
|
126
|
Tejera E, Bernardes J, Rebelo I. Preeclampsia: a bioinformatics approach through protein-protein interaction networks analysis. BMC Syst Biol 2012; 6:97. [PMID: 22873350 PMCID: PMC3483240 DOI: 10.1186/1752-0509-6-97] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 01/05/2012] [Accepted: 07/23/2012] [Indexed: 01/29/2023]
Abstract
Background: In this study we explored preeclampsia through a bioinformatics approach. We created a comprehensive gene/protein dataset by analysing public proteomic data and text mining the public scientific literature. From this dataset the associated protein-protein interaction network was obtained. Several centrality indexes were explored for hub detection, together with statistical enrichment analysis of metabolic pathways and diseases. Results: We confirmed the well-known relationship between preeclampsia and cardiovascular diseases but also identified statistically significant relationships with cancer and aging. Moreover, significant metabolic pathways such as apoptosis, cancer and cytokine-cytokine receptor interaction were also identified by enrichment analysis. The FLT1, VEGFA, FN1, F2 and PGF genes obtained the highest scores in the hub analysis; however, we also found other genes, such as PDIA3, LYN, SH2B2 and NDRG1, with high scores. Conclusions: The applied methodology not only led to the identification of well-known genes related to preeclampsia but also proposed new candidates that are poorly explored or completely unknown in the pathogenesis of preeclampsia, which eventually need to be validated experimentally. Moreover, new possible connections were detected between preeclampsia and other diseases that could open new areas of research. More must be done in this area to identify unknown protein/gene interactions and to better integrate metabolic pathways and diseases.
Affiliation(s)
- Eduardo Tejera
- Department of Biological Sciences, Biochemistry, University of Porto, Portugal/Institute for Molecular and Cell Biology (IBMC), Porto, Portugal
|
127
|
Arnold R, Boonen K, Sun MG, Kim PM. Computational analysis of interactomes: current and future perspectives for bioinformatics approaches to model the host-pathogen interaction space. Methods 2012; 57:508-18. [PMID: 22750305 PMCID: PMC7128575 DOI: 10.1016/j.ymeth.2012.06.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 03/13/2012] [Revised: 06/20/2012] [Accepted: 06/21/2012] [Indexed: 11/05/2022] Open
Abstract
Bacterial and viral pathogens affect their eukaryotic host partly by interacting with proteins of the host cell. Hence, to investigate infection from a systems' perspective we need to construct complete and accurate host-pathogen protein-protein interaction networks. Because of the paucity of available data and the cost associated with experimental approaches, any construction and analysis of such a network in the near future has to rely on computational predictions. Specifically, this challenge consists of a number of sub-problems: First, prediction of possible pathogen interactors (e.g. effector proteins) is necessary for bacteria and protozoa. Second, the prospective host binding partners have to be determined and finally, the impact on the host cell analyzed. This review gives an overview of current bioinformatics approaches to obtain and understand host-pathogen interactions. As an application example of the methods covered, we predict host-pathogen interactions of Salmonella and discuss the value of these predictions as a prospective for further research.
Affiliation(s)
- Roland Arnold
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Kurt Boonen
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Mark G.F. Sun
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Philip M. Kim
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada M5S 3E1
- Department of Computer Science, University of Toronto, Toronto, ON, Canada M5S 3E1
|
128
|
Das J, Yu H. HINT: High-quality protein interactomes and their applications in understanding human disease. BMC Systems Biology 2012; 6:92. [PMID: 22846459 PMCID: PMC3483187 DOI: 10.1186/1752-0509-6-92] [Citation(s) in RCA: 287] [Impact Index Per Article: 23.9] [Received: 07/22/2011] [Accepted: 06/30/2012] [Indexed: 12/22/2022]
Abstract
Background: A global map of protein-protein interactions in cellular systems provides key insights into the workings of an organism. A repository of well-validated high-quality protein-protein interactions can be used in both large- and small-scale studies to generate and validate a wide range of functional hypotheses. Results: We develop HINT (http://hint.yulab.org) - a database of high-quality protein-protein interactomes for human, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Oryza sativa. These were collected from several databases and filtered both systematically and manually to remove low-quality/erroneous interactions. The resulting datasets are classified by type (binary physical interactions vs. co-complex associations) and data source (high-throughput systematic setups vs. literature-curated small-scale experiments). We find strong sociological sampling biases in literature-curated datasets of small-scale interactions. An interactome without such sampling biases was used to understand network properties of human disease-genes - hubs are unlikely to cause disease, but if they do, they usually cause multiple disorders. Conclusions: HINT is of significant interest to researchers in all fields of biology as it addresses the ubiquitous need of having a repository of high-quality protein-protein interactions. These datasets can be utilized to generate specific hypotheses about specific proteins and/or pathways, as well as analyzing global properties of cellular networks. HINT will be regularly updated and all versions will be tracked.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA.
|
129
|
Orchard S, Kerrien S, Abbani S, Aranda B, Bhate J, Bidwell S, Bridge A, Briganti L, Brinkman FSL, Brinkman F, Cesareni G, Chatr-aryamontri A, Chautard E, Chen C, Dumousseau M, Goll J, Hancock REW, Hancock R, Hannick LI, Jurisica I, Khadake J, Lynn DJ, Mahadevan U, Perfetto L, Raghunath A, Ricard-Blum S, Roechert B, Salwinski L, Stümpflen V, Tyers M, Uetz P, Xenarios I, Hermjakob H. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nat Methods 2012; 9:345-50. [PMID: 22453911 DOI: 10.1038/nmeth.1931] [Citation(s) in RCA: 385] [Impact Index Per Article: 32.1] [Indexed: 01/18/2023]
Abstract
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Affiliation(s)
- Sandra Orchard
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK.
|
130
|
Affiliation(s)
- Ian W. Taylor
- Samuel Lunenfeld Research Institute; Mount Sinai Hospital; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
- Jeffrey L. Wrana
- Samuel Lunenfeld Research Institute; Mount Sinai Hospital; Toronto Ontario Canada
- Department of Molecular Genetics; University of Toronto; Toronto Ontario Canada
|
131
|
Tripathi LP, Kambara H, Moriishi K, Morita E, Abe T, Mori Y, Chen YA, Matsuura Y, Mizuguchi K. Proteomic analysis of hepatitis C virus (HCV) core protein transfection and host regulator PA28γ knockout in HCV pathogenesis: a network-based study. J Proteome Res 2012; 11:3664-79. [PMID: 22646850 DOI: 10.1021/pr300121a] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 02/07/2023]
Abstract
Hepatitis C virus (HCV) causes chronic liver disease worldwide. HCV Core protein (Core) forms the viral capsid and is crucial for HCV pathogenesis and HCV-induced hepatocellular carcinoma, through its interaction with the host factor proteasome activator PA28γ. Here, using BD-PowerBlot high-throughput Western array, we attempt to further investigate HCV pathogenesis by comparing the protein levels in liver samples from Core-transgenic mice with or without the knockout of PA28γ expression (abbreviated PA28γ(-/-)CoreTG and CoreTG, respectively) against the wild-type (WT). The differentially expressed proteins integrated into the human interactome were shown to participate in compact and well-connected cellular networks. Functional analysis of the interaction networks using a newly developed data warehouse system highlighted cellular pathways associated with vesicular transport, immune system, cellular adhesion, and cell growth and death among others that were prominently influenced by Core and PA28γ in HCV infection. Follow-up assays with in vitro HCV cell culture systems validated VTI1A, a vesicular transport associated factor, which was upregulated in CoreTG but not in PA28γ(-/-)CoreTG, as a novel regulator of HCV release but not replication. Our analysis provided novel insights into the Core-PA28γ interplay in HCV pathogenesis and identified potential targets for better anti-HCV therapy and potentially novel biomarkers of HCV infection.
Affiliation(s)
- Lokesh P Tripathi
- National Institute of Biomedical Innovation, 7-6-8 Saito Asagi, Ibaraki, Osaka, 567-0085, Japan
|
132
|
Ihara S, Kida H, Arase H, Tripathi LP, Chen YA, Kimura T, Yoshida M, Kashiwa Y, Hirata H, Fukamizu R, Inoue R, Hasegawa K, Goya S, Takahashi R, Minami T, Tsujino K, Suzuki M, Kohmo S, Inoue K, Nagatomo I, Takeda Y, Kijima T, Mizuguchi K, Tachibana I, Kumanogoh A. Inhibitory Roles of Signal Transducer and Activator of Transcription 3 in Antitumor Immunity during Carcinogen-Induced Lung Tumorigenesis. Cancer Res 2012; 72:2990-9. [DOI: 10.1158/0008-5472.can-11-4062] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/16/2022]
|
133
|
Fiume M, Smith EJM, Brook A, Strbenac D, Turner B, Mezlini AM, Robinson MD, Wodak SJ, Brudno M. Savant Genome Browser 2: visualization and analysis for population-scale genomics. Nucleic Acids Res 2012; 40:W615-21. [PMID: 22638571 PMCID: PMC3394255 DOI: 10.1093/nar/gks427] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/26/2022] Open
Abstract
High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.
Affiliation(s)
- Marc Fiume
- Department of Computer Science, University of Toronto, Ontario, Canada M5S 2E4
|
134
|
Kholodenko B, Yaffe MB, Kolch W. Computational approaches for analyzing information flow in biological networks. Sci Signal 2012; 5:re1. [PMID: 22510471 DOI: 10.1126/scisignal.2002961] [Citation(s) in RCA: 126] [Impact Index Per Article: 10.5] [Indexed: 12/13/2022]
Abstract
The advancements in "omics" (proteomics, genomics, lipidomics, and metabolomics) technologies have yielded large inventories of genes, transcripts, proteins, and metabolites. The challenge is to find out how these entities work together to regulate the processes by which cells respond to external and internal signals. Mathematical and computational modeling of signaling networks has a key role in this task, and network analysis provides insights into biological systems and has applications for medicine. Here, we review experimental and theoretical progress and future challenges toward this goal. We focus on how networks are reconstructed from data, how these networks are structured to control the flow of biological information, and how the design features of the networks specify biological decisions.
Affiliation(s)
- Boris Kholodenko
- Systems Biology Ireland, University College Dublin, Belfield, Dublin 4, Ireland
|
135
|
Schaefer MH, Fontaine JF, Vinayagam A, Porras P, Wanker EE, Andrade-Navarro MA. HIPPIE: Integrating protein interaction networks with experiment based quality scores. PLoS One 2012; 7:e31826. [PMID: 22348130 PMCID: PMC3279424 DOI: 10.1371/journal.pone.0031826] [Citation(s) in RCA: 231] [Impact Index Per Article: 19.3] [Received: 10/18/2011] [Accepted: 01/12/2012] [Indexed: 01/03/2023] Open
Abstract
Protein function is often modulated by protein-protein interactions (PPIs) and therefore defining the partners of a protein helps to understand its activity. PPIs can be detected through different experimental approaches and are collected in several expert curated databases. These databases are used by researchers interested in examining detailed information on particular proteins. In many analyses the reliability of the characterization of the interactions becomes important and it might be necessary to select sets of PPIs of different confidence levels. To this goal, we generated HIPPIE (Human Integrated Protein-Protein Interaction rEference), a human PPI dataset with a normalized scoring scheme that integrates multiple experimental PPI datasets. HIPPIE's scoring scheme has been optimized by human experts and a computer algorithm to reflect the amount and quality of evidence for a given PPI and we show that these scores correlate to the quality of the experimental characterization. The HIPPIE web tool (available at http://cbdm.mdc-berlin.de/tools/hippie) allows researchers to do network analyses focused on likely true PPI sets by generating subnetworks around proteins of interest at a specified confidence level.
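As a minimal sketch of the subnetwork idea described in this abstract (not HIPPIE's actual scoring scheme or web tool), the following Python snippet filters a small, made-up list of scored interactions down to those that involve a protein of interest and meet a chosen confidence threshold. The protein names, scores, and the function name `subnetwork` are all hypothetical.

```python
# Hypothetical scored interactions: (protein_a, protein_b, confidence in [0, 1]).
# These values are invented for illustration and are not HIPPIE data.
scored_edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.40),
    ("P2", "P3", 0.80),
    ("P1", "P4", 0.75),
]


def subnetwork(edges, protein, min_score):
    """Return interactions that involve `protein` and score at least `min_score`."""
    return [
        (a, b, score)
        for a, b, score in edges
        if protein in (a, b) and score >= min_score
    ]


# Interactions around P1 at a confidence level of 0.7 or higher.
high_conf = subnetwork(scored_edges, "P1", 0.7)
print(high_conf)
```

Raising the threshold trades coverage for reliability: fewer interactions survive, but each one is backed by stronger evidence.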
Affiliation(s)
- Arunachalam Vinayagam
- Max Delbrück Center for Molecular Medicine, Berlin, Germany
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Pablo Porras
- Max Delbrück Center for Molecular Medicine, Berlin, Germany
- IntAct Scientific Curator, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
136
|
Wang X, Wei X, Thijssen B, Das J, Lipkin SM, Yu H. Three-dimensional reconstruction of protein networks provides insight into human genetic disease. Nat Biotechnol 2012; 30:159-64. [PMID: 22252508 DOI: 10.1038/nbt.2106] [Citation(s) in RCA: 280] [Impact Index Per Article: 23.3] [Received: 10/11/2011] [Accepted: 12/19/2011] [Indexed: 01/13/2023]
Abstract
To better understand the molecular mechanisms and genetic basis of human disease, we systematically examine relationships between 3,949 genes, 62,663 mutations and 3,453 associated disorders by generating a three-dimensional, structurally resolved human interactome. This network consists of 4,222 high-quality binary protein-protein interactions with their atomic-resolution interfaces. We find that in-frame mutations (missense point mutations and in-frame insertions and deletions) are enriched on the interaction interfaces of proteins associated with the corresponding disorders, and that the disease specificity for different mutations of the same gene can be explained by their location within an interface. We also predict 292 candidate genes for 694 unknown disease-to-gene associations with proposed molecular mechanism hypotheses. This work indicates that knowledge of how in-frame disease mutations alter specific interactions is critical to understanding pathogenesis. Structurally resolved interaction networks should be valuable tools for interpreting the wealth of data being generated by large-scale structural genomics and disease association studies.
Affiliation(s)
- Xiujuan Wang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
|
137
|
De Las Rivas J, Prieto C. Protein interactions: mapping interactome networks to support drug target discovery and selection. Methods Mol Biol 2012; 910:279-96. [PMID: 22821600 DOI: 10.1007/978-1-61779-965-5_12] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 01/04/2023]
Abstract
Proteins are biomolecular structures that build the microscopic working machinery of any living system. Proteins within the cells and biological systems do not act alone, but rather team up into macromolecular structures enclosing intricate physicochemical dynamic connections to undertake biological functions. A critical step towards unraveling the complex molecular relationships in living systems is the mapping of protein-to-protein physical "interactions". The complete map of protein interactions that can occur in a living organism is called the "interactome". Achieving an adequate atlas of all the protein interactions within a living system should make it possible to build its interaction network and to identify the "central nodes" that can be critical for the function, the homeostasis, and the movement of such a system. Focusing on human studies, the data about the human interactome are most relevant for current biomedical research, because the location of the proteins in the interactome network makes it possible to evaluate their centrality and to redefine the potential value of each protein as a drug target. This chapter presents our current knowledge on the human protein-protein interactome and explains how such knowledge can help us to select adequate targets for drugs.
Affiliation(s)
- Javier De Las Rivas
- Bioinformatics and Functional Genomics Group, Cancer Research Center (IBMCC, CSIC/USAL), Salamanca, Spain.
|
138
|
Sun MGF, Kim PM. Evolution of biological interaction networks: from models to real data. Genome Biol 2011; 12:235. [PMID: 22204388 PMCID: PMC3334609 DOI: 10.1186/gb-2011-12-12-235] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 08/15/2011] [Accepted: 12/12/2011] [Indexed: 01/19/2023] Open
Abstract
We are beginning to uncover common mechanisms leading to the evolution of biological networks. The driving force behind these advances is the increasing availability of comparative data in several species.
Affiliation(s)
- Mark G F Sun
- Department of Computer Science, University of Toronto, 160 College St, Toronto, Ontario, Canada
|
139
|
Mora A, Donaldson IM. iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database. BMC Bioinformatics 2011; 12:455. [PMID: 22115179 PMCID: PMC3282787 DOI: 10.1186/1471-2105-12-455] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 07/22/2011] [Accepted: 11/24/2011] [Indexed: 11/19/2022] Open
Abstract
Background: The iRefIndex addresses the need to consolidate protein interaction data into a single uniform data resource. iRefR provides the user with access to this data source from an R environment. Results: The iRefR package includes tools for selecting specific subsets of interest from the iRefIndex by criteria such as organism, source database, experimental method, protein accessions and publication identifier. Data may be converted between three representations (MITAB, edgeList and graph) for use with other R packages such as igraph, graph and RBGL. The user may choose between different methods for resolving redundancies in interaction data and how n-ary data is represented. In addition, we describe a function to identify binary interaction records that possibly represent protein complexes. We show that the user choice of data selection, redundancy resolution and n-ary data representation all have an impact on graphical analysis. Conclusions: The package allows the user to control how these issues are dealt with and communicate them via an R-script written using the iRefR package - this will facilitate communication of methods, reproducibility of network analyses and further modification and comparison of methods by researchers.
Affiliation(s)
- Antonio Mora
- Department for Molecular Biosciences, University of Oslo, P.O. Box 1041 Blindern, 0316 Oslo, Norway
|
140
|
Razick S, Mora A, Michalickova K, Boddie P, Donaldson IM. iRefScape. A Cytoscape plug-in for visualization and data mining of protein interaction data from iRefIndex. BMC Bioinformatics 2011; 12:388. [PMID: 21975162 PMCID: PMC3228863 DOI: 10.1186/1471-2105-12-388] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 07/22/2011] [Accepted: 10/05/2011] [Indexed: 11/10/2022] Open
Abstract
Background: The iRefIndex consolidates protein interaction data from ten databases in a rigorous manner using sequence-based hash keys. Working with consolidated interaction data comes with distinct challenges: data are redundant, overlapping, highly interconnected and may be collected and represented using different curation practices. These phenomena were quantified in our previous studies. Results: The iRefScape plug-in for the Cytoscape graphical viewer addresses these challenges. We show how these factors impact on data-mining tasks and how our solutions resolve them in a simple and efficient manner. A uniform accession space is used to limit redundancy and support search expansion and searching on multiple accession types. Multiple node and edge features support data filtering and mining. Node colours and features supply information about search result provenance. Overlapping evidence is presented using a multi-graph and a bi-partite representation is used to distinguish binary and n-ary source data. Searching for interactions between sets of proteins is supported and specifically includes searches on disease-related genes found in OMIM. Finally, a synchronized adjacency-matrix view facilitates visualization of relationships between sets of user defined groups. Conclusions: The iRefScape plug-in will be of interest to advanced users of interaction data. The plug-in provides access to a consolidated data set in a uniform accession space while remaining faithful to the underlying source data. Tools are provided to facilitate a range of tasks from a simple search to knowledge discovery. The plug-in uses a number of strategies that will be of interest to other plug-in developers.
Affiliation(s)
- Sabry Razick
- The Biotechnology Centre of Oslo, University of Oslo, P.O. Box 1125 Blindern, 0317 Oslo, Norway
|
141
|
Lu Z, Kao HY, Wei CH, Huang M, Liu J, Kuo CJ, Hsu CN, Tsai RTH, Dai HJ, Okazaki N, Cho HC, Gerner M, Solt I, Agarwal S, Liu F, Vishnyakova D, Ruch P, Romacker M, Rinaldi F, Bhattacharya S, Srinivasan P, Liu H, Torii M, Matos S, Campos D, Verspoor K, Livingston KM, Wilbur WJ. The gene normalization task in BioCreative III. BMC Bioinformatics 2011; 12 Suppl 8:S2. [PMID: 22151901 PMCID: PMC3269937 DOI: 10.1186/1471-2105-12-s8-s2] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
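The improvement percentages quoted in this abstract can be reproduced from the reported TAP-k scores. The short Python check below is only an arithmetic sketch: the scores are copied from the abstract, while the variable and function names are hypothetical.

```python
# TAP-k scores on the gold standard, keyed by k, as reported in the abstract.
best_team_tap = {5: 0.3297, 10: 0.3538, 20: 0.3535}
composite_tap = {5: 0.3707, 10: 0.4311, 20: 0.4477}


def relative_improvement(new, old):
    """Percentage gain of `new` over `old`."""
    return (new - old) / old * 100.0


improvements = {
    k: round(relative_improvement(composite_tap[k], best_team_tap[k]), 1)
    for k in best_team_tap
}
print(improvements)  # gains for k=5, 10, 20: 12.4, 21.8, 26.6 (percent)
```

The gains are relative to the best single-team score at each k, which is why the largest absolute score (k=20) also shows the largest percentage gain.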
Affiliation(s)
- Zhiyong Lu
- National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland 20894, USA
- Hung-Yu Kao
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C
- Chih-Hsuan Wei
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C
- Minlie Huang
- Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
- Jingchen Liu
- Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
- Cheng-Ju Kuo
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
- Chun-Nan Hsu
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
- Information Science Institute, University of Southern California, Marina del Rey, California, USA
- Richard Tzong-Han Tsai
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan, R.O.C
- Hong-Jie Dai
- Department of Computer Science, National Tsing-Hua University, Hsinchu, Taiwan, R.O.C
- Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C
- Naoaki Okazaki
- Interfaculty Initiative in Information Studies, University of Tokyo, Japan
- Han-Cheol Cho
- Graduate School of Information Science and Technology, University of Tokyo, Japan
- Martin Gerner
- Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT, UK
- Illes Solt
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, 1117 Budapest, Hungary
- Shashank Agarwal
- Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Feifan Liu
- Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
- Dina Vishnyakova
- BiTeM Group, Division of Medical Information Sciences, University of Geneva, Switzerland
- Patrick Ruch
- BiTeM Group, Information Science Department, University of Applied Science, Geneva, Switzerland
- Fabio Rinaldi
- Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Padmini Srinivasan
- Department of Computer Science, The University of Iowa, Iowa City, Iowa 52242, USA
- Hongfang Liu
- Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN 55905 USA
- Manabu Torii
- Lab of Text Intelligence in Biomedicine, Georgetown University Medical Center, 4000 Reservoir Rd., NW, Washington, DC 20057 USA
- Sergio Matos
- DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
- David Campos
- DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
- Karin Verspoor
- Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado, USA
- Kevin M Livingston
- Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado, USA
- W John Wilbur
- National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, Maryland 20894, USA
|
142
|
|
143
|
Stojmirović A, Yu YK. ppiTrim: constructing non-redundant and up-to-date interactomes. Database: The Journal of Biological Databases and Curation 2011; 2011:bar036. [PMID: 21873645 PMCID: PMC3162744 DOI: 10.1093/database/bar036] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/30/2022]
Abstract
Robust advances in interactome analysis demand comprehensive, non-redundant and consistently annotated data sets. By non-redundant, we mean that the accounting of evidence for every interaction should be faithful: each independent experimental support is counted exactly once, no more, no less. While many interactions are shared among public repositories, none of them contains the complete known interactome for any model organism. In addition, the annotations of the same experimental result by different repositories often disagree. This brings up the issue of which annotation to keep while consolidating evidences that are the same. The iRefIndex database, including interactions from most popular repositories with a standardized protein nomenclature, represents a significant advance in all aspects, especially in comprehensiveness. However, iRefIndex aims to maintain all information/annotation from original sources and requires users to perform additional processing to fully achieve the aforementioned goals. Another issue has to do with protein complexes. Some databases represent experimentally observed complexes as interactions with more than two participants, while others expand them into binary interactions using spoke or matrix model. To avoid untested interaction information buildup, it is preferable to replace the expanded protein complexes, either from spoke or matrix models, with a flat list of complex members. To address these issues and to achieve our goals, we have developed ppiTrim, a script that processes iRefIndex to produce non-redundant, consistently annotated data sets of physical interactions. Our script proceeds in three stages: mapping all interactants to gene identifiers and removing all undesired raw interactions, deflating potentially expanded complexes, and reconciling for each interaction the annotation labels among different source databases. 
As an illustration, we have processed the three largest organismal data sets: yeast, human and fruitfly. While ppiTrim can resolve most apparent conflicts between different labelings, we also discovered some unresolvable disagreements mostly resulting from different annotation policies among repositories. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/ppiTrim.html
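The difference between the spoke and matrix expansions mentioned in this abstract can be sketched as follows. This illustrates the two general models only and is not ppiTrim's code; the complex members and the function names are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with each co-purified prey."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Matrix model: pair every complex member with every other member."""
    return list(combinations(sorted(members), 2))


# A hypothetical four-member complex purified with bait "A".
complex_members = ["A", "B", "C", "D"]

spoke = spoke_expansion("A", ["B", "C", "D"])
matrix = matrix_expansion(complex_members)

print(len(spoke), spoke)    # 3 binary interactions, all involving the bait
print(len(matrix), matrix)  # 6 binary interactions, one per unordered pair
```

Pairs such as ("B", "C") appear only under the matrix model and were never tested directly, which is the kind of untested interaction buildup that storing the complex as a flat member list avoids.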
Affiliation(s)
- Aleksandar Stojmirović
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
144
|
Fernández‐Recio J. Prediction of protein binding sites and hot spots. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.45] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 11/10/2022]
|
145
|
Azuaje FJ, Wang H, Zheng H, Léonard F, Rolland-Turner M, Zhang L, Devaux Y, Wagner DR. Predictive integration of gene functional similarity and co-expression defines treatment response of endothelial progenitor cells. BMC Systems Biology 2011; 5:46. [PMID: 21447198 PMCID: PMC3080295 DOI: 10.1186/1752-0509-5-46] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/17/2010] [Accepted: 03/30/2011] [Indexed: 01/04/2023]
Abstract
Background: Endothelial progenitor cells (EPCs) have been implicated in different processes crucial to vasculature repair, which may offer the basis for new therapeutic strategies in cardiovascular disease. Despite advances facilitated by functional genomics, there is a lack of systems-level understanding of treatment response mechanisms of EPCs. In this research we aimed to characterize the EPC response to adenosine (Ado), a cardioprotective factor, based on the systems-level integration of gene expression data and prior functional knowledge. Specifically, we set out to identify novel biosignatures of Ado-treatment response in EPCs. Results: The predictive integration of gene expression data and standardized functional similarity information enabled us to identify new treatment response biosignatures. Gene expression data originated from Ado-treated and -untreated EPC samples, and functional similarity was estimated with Gene Ontology (GO)-based similarity information. These information sources enabled us to implement and evaluate an integrated prediction approach based on the concept of k-nearest neighbours learning (kNN). The method can be executed by expert- and data-driven input queries to guide the search for biologically meaningful biosignatures. The resulting integrated kNN system identified new candidate EPC biosignatures that can offer high classification performance (areas under the receiver operating characteristic curve > 0.8). We also showed that the proposed models can outperform those discovered by standard gene expression analysis. Furthermore, we report an initial independent in vitro experimental follow-up, which provides additional evidence of the potential validity of the top biosignature. Conclusion: Response to Ado treatment in EPCs can be accurately characterized with a new method based on the combination of gene co-expression data and GO-based similarity information. It also exploits the incorporation of human expert-driven queries as a strategy to guide the automated search for candidate biosignatures. The proposed biosignature improves the systems-level characterization of EPCs. The new integrative predictive modeling approach can also be applied to other phenotype characterization or biomarker discovery problems.
Affiliation(s)
- Francisco J Azuaje
- Laboratory of Cardiovascular Research, Centre de Recherche Public-Santé, L-1150, Luxembourg.
|
146
|
Hao Y, Merkoulovitch A, Vlasblom J, Pu S, Turinsky AL, Roudeva D, Turner B, Greenblatt J, Wodak SJ. OrthoNets: simultaneous visual analysis of orthologs and their interaction neighborhoods across different organisms. Bioinformatics 2011; 27:883-4. [PMID: 21257609 PMCID: PMC3051336 DOI: 10.1093/bioinformatics/btr035] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
Motivation: Protein interaction networks contain a wealth of biological information, but their large size often hinders cross-organism comparisons. We present OrthoNets, a Cytoscape plugin that displays protein–protein interaction (PPI) networks from two organisms simultaneously, highlighting orthology relationships and aggregating several types of biomedical annotations. OrthoNets also allows PPI networks derived from experiments to be overlaid on networks extracted from public databases, supporting the identification and verification of new interactors. Any newly identified PPIs can be validated by checking whether their orthologs interact in another organism. Availability: OrthoNets is freely available at http://wodaklab.org/orthonets/. Contact: jim.vlasblom@utoronto.ca
Affiliation(s)
- Yanqi Hao
- Molecular Structure & Function program, Hospital for Sick Children, Toronto, ON, Canada
|
147
|
Choi H, Larsen B, Lin ZY, Breitkreutz A, Mellacheruvu D, Fermin D, Qin ZS, Tyers M, Gingras AC, Nesvizhskii AI. SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nat Methods 2011; 8:70-3. [PMID: 21131968 PMCID: PMC3064265 DOI: 10.1038/nmeth.1541] [Citation(s) in RCA: 531] [Impact Index Per Article: 40.8] [Received: 06/28/2010] [Accepted: 11/09/2010] [Indexed: 01/12/2023]
Abstract
We present 'significance analysis of interactome' (SAINT), a computational tool that assigns confidence scores to protein-protein interaction data generated using affinity purification-mass spectrometry (AP-MS). The method uses label-free quantitative data and constructs separate distributions for true and false interactions to derive the probability of a bona fide protein-protein interaction. We show that SAINT is applicable to data of different scales and protein connectivity and allows transparent analysis of AP-MS data.
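The general idea of deriving an interaction probability from separate distributions for true and false interactions can be sketched with Bayes' rule over a two-component mixture. The sketch below is not SAINT's implementation: the Poisson form, the rates, the prior and the function names are all assumptions made for illustration, and SAINT's actual model and parameter estimation are described in the paper itself.

```python
import math


def poisson_pmf(count, rate):
    """Probability of observing `count` events under a Poisson with mean `rate`."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def posterior_true(count, rate_true, rate_false, prior_true):
    """Posterior probability that a prey with this spectral count is a true interactor.

    Assumed toy model: counts from true interactions follow Poisson(rate_true),
    counts from false interactions follow Poisson(rate_false), and a fraction
    `prior_true` of all bait-prey pairs are true.
    """
    weight_true = prior_true * poisson_pmf(count, rate_true)
    weight_false = (1.0 - prior_true) * poisson_pmf(count, rate_false)
    return weight_true / (weight_true + weight_false)


# Hypothetical rates: true interactors average 10 spectra, contaminants 1.
low = posterior_true(0, rate_true=10.0, rate_false=1.0, prior_true=0.5)
high = posterior_true(10, rate_true=10.0, rate_false=1.0, prior_true=0.5)
print(f"count=0  -> {low:.6f}")
print(f"count=10 -> {high:.6f}")
```

When the two distributions are identical the data carry no information and the posterior equals the prior, which is a quick sanity check on any scoring of this kind.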
Affiliation(s)
- Hyungwon Choi
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109-0602, USA
- Brett Larsen
- Centre for Systems Biology, Samuel Lunenfeld Research Institute, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Zhen-Yuan Lin
- Centre for Systems Biology, Samuel Lunenfeld Research Institute, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Ashton Breitkreutz
- Centre for Systems Biology, Samuel Lunenfeld Research Institute, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Damian Fermin
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109-0602, USA
- Zhaohui S. Qin
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Mike Tyers
- Centre for Systems Biology, Samuel Lunenfeld Research Institute, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Department of Molecular Genetics, University of Toronto, 1 Kings College Circle, Toronto, Ontario, M5S 1A8, Canada
- Wellcome Trust Centre for Cell Biology and Centre for Systems Biology, School of Biological Sciences, University of Edinburgh, Mayfield Road, Edinburgh, EH9 3JR, Scotland, UK
- Anne-Claude Gingras
- Centre for Systems Biology, Samuel Lunenfeld Research Institute, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Department of Molecular Genetics, University of Toronto, 1 Kings College Circle, Toronto, Ontario, M5S 1A8, Canada
- Alexey I. Nesvizhskii
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109-0602, USA
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-0602, USA
|
148
|
Turinsky AL, Razick S, Turner B, Donaldson IM, Wodak SJ. Literature curation of protein interactions: measuring agreement across major public databases. Database: The Journal of Biological Databases and Curation 2010; 2010:baq026. [PMID: 21183497 PMCID: PMC3011985 DOI: 10.1093/database/baq026] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 01/20/2023]
Abstract
Literature curation of protein interaction data faces a number of challenges. Although curators increasingly adhere to standard data representations, the data that various databases actually record from the same published information may differ significantly. Some of the reasons underlying these differences are well known, but their global impact on the interactions collectively curated by major public databases has not been evaluated. Here we quantify the agreement between curated interactions from 15 471 publications shared across nine major public databases. Results show that on average, two databases fully agree on 42% of the interactions and 62% of the proteins curated from the same publication. Furthermore, a sizable fraction of the measured differences can be attributed to divergent assignments of organism or splice isoforms, different organism focus and alternative representations of multi-protein complexes. Our findings highlight the impact of divergent curation policies across databases, and should be relevant to both curators and data consumers interested in analyzing protein-interaction data generated by the scientific community. Database URL: http://wodaklab.org/iRefWeb
Affiliation(s)
- Andrei L Turinsky
- Molecular Structure and Function Program, Hospital for Sick Children, 555 University Avenue, Toronto, Ontario, Canada
|