1
Scott MA, Woolums AR, Swiderski CE, Finley A, Perkins AD, Nanduri B, Karisch BB. Hematological and gene co-expression network analyses of high-risk beef cattle defines immunological mechanisms and biological complexes involved in bovine respiratory disease and weight gain. PLoS One 2022; 17:e0277033. [PMID: 36327246 PMCID: PMC9632787 DOI: 10.1371/journal.pone.0277033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 10/18/2022] [Indexed: 11/06/2022] Open
Abstract
Bovine respiratory disease (BRD), the leading disease complex in beef cattle production systems, remains highly elusive regarding diagnostics and disease prediction. Previous research has employed cellular and molecular techniques to describe hematological and gene expression variation that coincides with BRD development. Here, we utilized weighted gene co-expression network analysis (WGCNA) to leverage total gene expression patterns from cattle at arrival and generate hematological and clinical trait associations to describe mechanisms that may predict BRD development. Gene expression counts of previously published RNA-Seq data from 23 cattle (2017; n = 11 Healthy, n = 12 BRD) were used to construct gene co-expression modules and correlation patterns with complete blood count (CBC) and clinical datasets. Modules were further evaluated for cross-populational preservation of expression with RNA-Seq data from 24 cattle in an independent population (2019; n = 12 Healthy, n = 12 BRD). Genes within well-preserved modules were subject to functional enrichment analysis for significant Gene Ontology terms and pathways. Genes which possessed high module membership and association with BRD development, regardless of module preservation (“hub genes”), were utilized for protein-protein physical interaction network and clustering analyses. Five well-preserved modules of co-expressed genes were identified. One module (“steelblue”), involved in alpha-beta T-cell complexes and Th2-type immunity, possessed significant correlation with increased erythrocytes, platelets, and BRD development. One module (“purple”), involved in mitochondrial metabolism and rRNA maturation, possessed significant correlation with increased eosinophils, fecal egg count per gram, and weight gain over time. Fifty-two interacting hub genes, stratified into 11 clusters, may possess transient function involved in BRD development not previously described in literature. 
This study identifies co-expressed genes and coordinated mechanisms associated with BRD, which necessitates further investigation in BRD-prediction research.
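The co-expression step that WGCNA builds on can be illustrated with a minimal sketch. It assumes the commonly used unsigned adjacency, where the absolute Pearson correlation between two genes is raised to a soft-thresholding power; the power of 6, the gene names, and the expression values are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned co-expression adjacency |cor|^beta for every gene pair.

    `expression` maps a gene name to its profile across samples.
    `beta` is the soft-thresholding power (6 is an assumed default here).
    """
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical toy data: three genes measured in four samples.
toy_expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [1, 3, 2, 4],
}
adjacency = soft_threshold_adjacency(toy_expression)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why perfectly correlated genes stay tightly connected and moderately correlated genes are pushed toward zero.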
Affiliation(s)
- Matthew A. Scott
  - Veterinary Education, Research, and Outreach Center, Texas A&M University and West Texas A&M University, Canyon, TX, United States of America
- Amelia R. Woolums
  - Department of Pathobiology and Population Medicine, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Cyprianna E. Swiderski
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, Arizona, United States of America
- Abigail Finley
  - Veterinary Education, Research, and Outreach Center, Texas A&M University and West Texas A&M University, Canyon, TX, United States of America
- Andy D. Perkins
  - Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, United States of America
- Bindu Nanduri
  - Department of Comparative Biomedical Sciences, College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Brandi B. Karisch
  - Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
2
Singh V, Singh G, Singh V. TulsiPIN: An Interologous Protein Interactome of Ocimum tenuiflorum. J Proteome Res 2020; 19:884-899. [PMID: 31789043 DOI: 10.1021/acs.jproteome.9b00683] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/02/2023]
Abstract
Ocimum tenuiflorum, commonly known as holy basil or tulsi, is globally recognized for its multitude of medicinal properties. However, a comprehensive study revealing the complex interplay among its constituent proteins at the subcellular level is still lacking. To bridge this gap, in this work, a genome-scale interologous protein-protein interaction (PPI) network, TulsiPIN, is developed using 36 template plants, which consists of 13 660 nodes and 327 409 binary interactions. A high-confidence network, hc-TulsiPIN, consisting of 7719 nodes having 95 532 interactions is inferred using domain-domain interaction information along with interolog-based statistics, and its reliability is assessed using pathway enrichment, functional homogeneity, and protein colocalization of PPIs. Examination of topological features revealed that hc-TulsiPIN possesses conventional properties, like small-world, scale-free, and modular architecture. A total of 1625 vital proteins are predicted by statistically evaluating hc-TulsiPIN with two ensembles of corresponding random networks, each consisting of 10 000 realizations of Erdős-Rényi and Barabási-Albert models. Also, numerous regulatory proteins like transcription factors, transcription regulators, and protein kinases are profiled. Using 36 guide genes participating in 9 secondary metabolite biosynthetic pathways, a subnetwork consisting of 171 proteins and 612 interactions was constructed, and 127 of these proteins could be successfully characterized. Detailed information of TulsiPIN is available at https://cuhpcbbtulsipin.shinyapps.io/tulsipin_v0/ .
Affiliation(s)
- Vikram Singh
  - Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala 176206, India
- Gagandeep Singh
  - Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala 176206, India
- Vikram Singh
  - Centre for Computational Biology and Bioinformatics, Central University of Himachal Pradesh, Dharamshala 176206, India
3
Zhou G, Xia J. Using OmicsNet for Network Integration and 3D Visualization. Curr Protoc Bioinformatics 2018; 65:e69. [DOI: 10.1002/cpbi.69] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 12/18/2022]
Affiliation(s)
- Guangyan Zhou
  - Institute of Parasitology, McGill University, Sainte Anne de Bellevue, Quebec, Canada
- Jianguo Xia
  - Institute of Parasitology, McGill University, Sainte Anne de Bellevue, Quebec, Canada
  - Department of Animal Sciences, McGill University, Sainte Anne de Bellevue, Quebec, Canada
  - Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada
4
Manikandan P, Ramyachitra D, Banupriya D. Detection of overlapping protein complexes in gene expression, phenotype and pathways of Saccharomyces cerevisiae using Prorank based Fuzzy algorithm. Gene 2016; 580:144-158. [PMID: 26809099 DOI: 10.1016/j.gene.2016.01.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/02/2015] [Revised: 12/07/2015] [Accepted: 01/11/2016] [Indexed: 02/02/2023]
Abstract
Proteins carry out their functions by interacting with other proteins to form protein complexes, which play an important role in cellular organization and function. Detecting overlap among complexes is an important step towards understanding higher-order protein organization and unveiling the functional and evolutionary mechanisms behind biological networks. Most clustering algorithms do not consider weighted or overlapping complexes. In this research, a Prorank-based Fuzzy algorithm is proposed to find overlapping protein complexes. The Fuzzy detection algorithm is incorporated into the Prorank algorithm after the ranking step to find the overlapping community. The proposed algorithm executes iteratively to compute the probability of robust clusters. The proposed and existing algorithms were tested on PPI datasets (PPI-D1, PPI-D2, Collins, DIP, Krogan Core and Krogan-Extended), gene expression datasets (GSE7645, GSE22269, GSE26923), pathways (Meiosis, MAPK, Cell Cycle), and phenotype datasets (Yeast Heterogeneous and Yeast Homogeneous). The experimental results show that the proposed algorithm predicts protein complexes with better accuracy than other state-of-the-art algorithms.
Affiliation(s)
- P Manikandan
  - Department of Computer Science, Bharathiar University, Coimbatore 641 046, India
- D Ramyachitra
  - Department of Computer Science, Bharathiar University, Coimbatore 641 046, India
- D Banupriya
  - Department of Computer Science, Bharathiar University, Coimbatore 641 046, India
5
Tényi Á, de Atauri P, Gomez-Cabrero D, Cano I, Clarke K, Falciani F, Cascante M, Roca J, Maier D. ChainRank, a chain prioritisation method for contextualisation of biological networks. BMC Bioinformatics 2016; 17:17. [PMID: 26729273 PMCID: PMC4700624 DOI: 10.1186/s12859-015-0864-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/18/2015] [Accepted: 12/17/2015] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Advances in high throughput technologies and growth of biomedical knowledge have contributed to an exponential increase in associative data. These data can be represented in the form of complex networks of biological associations, which are suitable for systems analyses. However, these networks usually lack both context specificity in time and space and the distinctive borders that are usually assigned in the classical pathway view of molecular events (e.g. signal transduction). This complexity and high interconnectedness call for automated techniques that can identify smaller targeted subnetworks specific to a given research context (e.g. a disease scenario). RESULTS Our method, named ChainRank, finds relevant subnetworks by identifying and scoring chains of interactions that link specific network components. Scores can be generated by integrating multiple general and context-specific measures (e.g. experimental molecular data from expression to proteomics and metabolomics, literature evidence, network topology). The performance of the novel ChainRank method was evaluated on recreating selected signalling pathways from a human protein interaction network. Specifically, we recreated skeletal muscle specific signalling networks in healthy and chronic obstructive pulmonary disease (COPD) contexts. The analysis showed that ChainRank can identify main mediators of context-specific molecular signalling. An improvement of up to a factor of 2.5 was shown in the precision of finding proteins of the recreated pathways compared to random simulation. CONCLUSIONS ChainRank provides a framework that can integrate several user-defined scores and evaluate their combined effect on ranking interaction chains linking input data sets. It can be used to contextualise networks, identify signalling and regulatory paths amongst targeted genes, or analyse synthetic lethality in the context of anticancer therapy. ChainRank is implemented in the R programming language and is freely available at https://github.com/atenyi/ChainRank.
Affiliation(s)
- Ákos Tényi
  - Hospital Clínic-Institut d'Investigacions Biomediques August Pi i Sunyer (IDIBAPS), Research Institute, Universitat de Barcelona, C/Villarroel 170, 08036, Barcelona, Spain
  - Centro de Investigación en Red de Enfermedades Respiratorias (CibeRes), 07110, Palma de Mallorca, Spain
- Pedro de Atauri
  - Hospital Clínic-Institut d'Investigacions Biomediques August Pi i Sunyer (IDIBAPS), Research Institute, Universitat de Barcelona, C/Villarroel 170, 08036, Barcelona, Spain
  - Departament de Bioquimica i Biologia Molecular, Facultat de Biologia-IBUB, Universitat de Barcelona, 08028, Barcelona, Spain
- David Gomez-Cabrero
  - Unit of computational Medicine, Center for Molecular Medicine, Department of Medicine, Karolinska Institute and Karolinska University Hospital, SE-171 76, Stockholm, Sweden
- Isaac Cano
  - Hospital Clínic-Institut d'Investigacions Biomediques August Pi i Sunyer (IDIBAPS), Research Institute, Universitat de Barcelona, C/Villarroel 170, 08036, Barcelona, Spain
  - Centro de Investigación en Red de Enfermedades Respiratorias (CibeRes), 07110, Palma de Mallorca, Spain
- Kim Clarke
  - Integrative Systems Biology, University of Liverpool, L69 3BX, Liverpool, UK
- Francesco Falciani
  - Integrative Systems Biology, University of Liverpool, L69 3BX, Liverpool, UK
- Marta Cascante
  - Hospital Clínic-Institut d'Investigacions Biomediques August Pi i Sunyer (IDIBAPS), Research Institute, Universitat de Barcelona, C/Villarroel 170, 08036, Barcelona, Spain
  - Departament de Bioquimica i Biologia Molecular, Facultat de Biologia-IBUB, Universitat de Barcelona, 08028, Barcelona, Spain
- Josep Roca
  - Hospital Clínic-Institut d'Investigacions Biomediques August Pi i Sunyer (IDIBAPS), Research Institute, Universitat de Barcelona, C/Villarroel 170, 08036, Barcelona, Spain
  - Centro de Investigación en Red de Enfermedades Respiratorias (CibeRes), 07110, Palma de Mallorca, Spain
- Dieter Maier
  - Biomax Informatics AG, D-82152, Planegg, Germany
6
Abstract
As functional units, network motifs have been detected to reveal evolutionary mechanisms of complex systems, such as biological networks, food webs, engineering networks and social networks. However, identifying the emergence of motifs in growing networks may be problematic because subgraph frequency fluctuates strongly in the initial stage. This paper presents a method that can identify the emergence of motifs in growing networks. Based on the Erdős-Rényi (E-R) random null model, the variation rate of the expected subgraph frequency at adjacent time points was used to define a suitable detection range for motif identification. Upper and lower boundaries of the range were obtained in analytical form according to a chosen risk level. Then, the statistical metric Z-score was extended to a new metric, which effectively reveals the statistical significance of a subgraph over a continuous period of time. In this paper, a novel research framework of motif identification was proposed, defining critical boundaries for the evolutionary process of networks and a significance metric on the time scale. Finally, an industrial ecosystem at Kalundborg was adopted as a case study to illustrate the effectiveness and convenience of the proposed methodology.
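The classical Z-score that this abstract starts from can be sketched as follows. This is a minimal illustration only: it uses triangles as the subgraph, samples the Erdős-Rényi null model with a fixed edge count (the G(n, m) variant, an assumption), and does not reproduce the paper's extended time-aware metric or its analytical boundaries. All function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def count_triangles(n, edges):
    """Count triangles in an undirected simple graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Each triangle is seen once per edge, i.e. three times in total.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def random_edges(n, m, rng):
    """Sample m distinct undirected edges uniformly (assumed G(n, m) null)."""
    return rng.sample(list(combinations(range(n), 2)), m)


def motif_z_score(n, edges, trials=200, seed=0):
    """Z-score of the observed triangle count against the random null model."""
    rng = random.Random(seed)
    observed = count_triangles(n, edges)
    null_counts = [
        count_triangles(n, random_edges(n, len(edges), rng))
        for _ in range(trials)
    ]
    spread = pstdev(null_counts)
    if spread == 0:
        return float("nan")  # the null model shows no variation
    return (observed - mean(null_counts)) / spread


# Toy graph: two 4-cliques (nodes 0-3 and 4-7) joined by one edge.
clique_a = list(combinations(range(0, 4), 2))
clique_b = list(combinations(range(4, 8), 2))
toy_edges = clique_a + clique_b + [(3, 4)]
z = motif_z_score(8, toy_edges)
```

A positive score means the subgraph occurs more often than in random graphs of the same size and density, which is the usual criterion for calling it a motif.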
Affiliation(s)
- Haijia Shi
  - State Key Joint-Laboratory of Environmental Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
- Lei Shi
  - State Key Joint-Laboratory of Environmental Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
7
Pache RA, Aloy P. Increasing the precision of orthology-based complex prediction through network alignment. PeerJ 2014; 2:e413. [PMID: 24918034 PMCID: PMC4045337 DOI: 10.7717/peerj.413] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/13/2014] [Accepted: 05/13/2014] [Indexed: 12/01/2022] Open
Abstract
Macromolecular assemblies play an important role in almost all cellular processes. However, despite several large-scale studies, our current knowledge about protein complexes is still quite limited, thus advocating the use of in silico predictions to gather information on complex composition in model organisms. Since protein–protein interactions present certain constraints on the functional divergence of macromolecular assemblies during evolution, it is possible to predict complexes based on orthology data. Here, we show that incorporating interaction information through network alignment significantly increases the precision of orthology-based complex prediction. Moreover, we performed a large-scale in silico screen for protein complexes in human, yeast and fly, through the alignment of hundreds of known complexes to whole organism interactomes. Systematic comparison of the resulting network alignments to all complexes currently known in those species revealed many conserved complexes, as well as several novel complex components. In addition to validating our predictions using orthogonal data, we were able to assign specific functional roles to the predicted complexes. In several cases, the incorporation of interaction data through network alignment allowed us to distinguish real complex components from other orthologous proteins. Our analyses indicate that current knowledge of yeast protein complexes exceeds that in other organisms and that predicting complexes in fly based on human and yeast data is complementary rather than redundant. Lastly, assessing the conservation of protein complexes of the human pathogen Mycoplasma pneumoniae, we discovered that its repertoire of complexes is different from that of eukaryotes, suggesting new points of therapeutic intervention, whereas targeting the pathogen’s Restriction enzyme complex might lead to adverse effects due to its similarity to ATP-dependent metalloproteases in the human host.
Affiliation(s)
- Roland A Pache
  - Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain
- Patrick Aloy
  - Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
8
Wang Y, Qian X. Joint clustering of protein interaction networks through Markov random walk. BMC Syst Biol 2014; 8 Suppl 1:S9. [PMID: 24565376 PMCID: PMC4080334 DOI: 10.1186/1752-0509-8-s1-s9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/02/2022]
Abstract
Biological networks obtained by high-throughput profiling or human curation are typically noisy. For functional module identification, single network clustering algorithms may not yield accurate and robust results. In order to borrow information across multiple sources to alleviate such problems due to data quality, we propose a new joint network clustering algorithm ASModel in this paper. We construct an integrated network to combine network topological information based on protein-protein interaction (PPI) datasets and homological information introduced by constituent similarity between proteins across networks. A novel random walk strategy on the integrated network is developed for joint network clustering and an optimization problem is formulated by searching for low conductance sets defined on the derived transition matrix of the random walk, which fuses both topology and homology information. The optimization problem of joint clustering is solved by a derived spectral clustering algorithm. Network clustering using several state-of-the-art algorithms has been implemented to both PPI networks within the same species (two yeast PPI networks and two human PPI networks) and those from different species (a yeast PPI network and a human PPI network). Experimental results demonstrate that ASModel outperforms the existing single network clustering algorithms as well as another recent joint clustering algorithm in terms of complex prediction and Gene Ontology (GO) enrichment analysis.
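The notion of a low-conductance set mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition for an unweighted, undirected graph (edges leaving the set divided by the smaller of the two volumes), not the paper's formulation on the transition matrix of the integrated network; the function name and the toy graph are hypothetical.

```python
def conductance(edges, subset):
    """Conductance of `subset` in an undirected, unweighted graph.

    cut(S) / min(vol(S), vol(V \\ S)), where vol is the sum of node degrees.
    """
    subset = set(subset)
    cut = 0
    volume_inside = 0
    volume_outside = 0
    for u, v in edges:
        u_in, v_in = u in subset, v in subset
        if u_in != v_in:
            cut += 1  # the edge crosses the boundary of the set
        volume_inside += u_in + v_in
        volume_outside += (not u_in) + (not v_in)
    smaller_volume = min(volume_inside, volume_outside)
    return cut / smaller_volume if smaller_volume else float("inf")


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
good_module = conductance(toy_edges, {0, 1, 2})
poor_module = conductance(toy_edges, {0, 1})
```

A tightly knit module such as one full triangle has few outgoing edges relative to its internal connectivity and therefore scores lower than an arbitrary fragment of it.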
9
Evolutionary systems biology: historical and philosophical perspectives on an emerging synthesis. Adv Exp Med Biol 2012; 751:1-28. [PMID: 22821451 DOI: 10.1007/978-1-4614-3567-9_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/12/2022]
Abstract
Systems biology (SB) is at least a decade old now and maturing rapidly. A more recent field, evolutionary systems biology (ESB), is in the process of further developing system-level approaches through the expansion of their explanatory and potentially predictive scope. This chapter will outline the varieties of ESB existing today by tracing the diverse roots and fusions that make up this integrative project. My approach is philosophical and historical. As well as examining the recent origins of ESB, I will reflect on its central features and the different clusters of research it comprises. In its broadest interpretation, ESB consists of five overlapping approaches: comparative and correlational ESB; network architecture ESB; network property ESB; population genetics ESB; and finally, standard evolutionary questions answered with SB methods. After outlining each approach with examples, I will examine some strong general claims about ESB, particularly that it can be viewed as the next step toward a fuller modern synthesis of evolutionary biology (EB), and that it is also the way forward for evolutionary and systems medicine. I will conclude with a discussion of whether the emerging field of ESB has the capacity to combine an even broader scope of research aims and efforts than it presently does.
10
Abstract
BACKGROUND Identifying biologically relevant protein complexes from a large protein-protein interaction (PPI) network is essential to understand the organization of biological systems. However, high-throughput experimental techniques that can produce a large amount of PPIs are known to yield non-negligible rates of false positives and false negatives, making the protein complexes difficult to identify. RESULTS We propose a binary matrix factorization (BMF) algorithm under Bayesian Ying-Yang (BYY) harmony learning to detect protein complexes by clustering the proteins that share similar interactions through factorizing the binary adjacency matrix of a PPI network. The proposed BYY-BMF algorithm automatically determines the cluster number, whereas this number must be given in advance for most existing BMF algorithms. Also, BYY-BMF's clustering results do not depend on any parameters or thresholds, unlike the Markov Cluster Algorithm (MCL), which relies on a so-called inflation parameter. On synthetic PPI networks, the predictions evaluated against the known annotated complexes indicate that BYY-BMF is more robust than MCL in most cases. On real PPI networks from the MIPS and DIP databases, BYY-BMF obtains better-balanced prediction accuracies than MCL and a spectral analysis method, while MCL has its own advantages, e.g., good separation values.
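The inflation parameter that the abstract attributes to MCL can be seen in a minimal sketch of the Markov Cluster Algorithm. This illustrates the baseline method only, not BYY-BMF; the added self-loops, the fixed iteration count, the cutoff used to read off clusters, and the toy network are all assumptions made for the example.

```python
def normalize_columns(matrix):
    """Rescale every column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def multiply(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def markov_cluster(adjacency, inflation=2.0, iterations=50):
    """Cluster a graph by alternating expansion and inflation steps."""
    n = len(adjacency)
    # Self-loops are added before normalizing (a common convention).
    flow = normalize_columns(
        [[float(adjacency[i][j]) + (i == j) for j in range(n)] for i in range(n)]
    )
    for _ in range(iterations):
        flow = multiply(flow, flow)  # expansion: two-step random walks
        # Inflation: raise entries to a power, then renormalize columns.
        flow = normalize_columns([[x ** inflation for x in row] for row in flow])
    # Each row that still carries flow lists the members of one cluster.
    clusters = {
        frozenset(j for j in range(n) if flow[i][j] > 1e-6) for i in range(n)
    }
    return sorted(sorted(c) for c in clusters if c)


# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy_adjacency = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy_adjacency[u][v] = toy_adjacency[v][u] = 1
toy_clusters = markov_cluster(toy_adjacency)
```

Larger inflation values sharpen the flow faster and tend to produce more, smaller clusters, which is the dependence on a tuning parameter that the abstract contrasts with BYY-BMF.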
Affiliation(s)
- Shikui Tu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
- Runsheng Chen
  - Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101
- Lei Xu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
11
Song J, Singh M. How and when should interactome-derived clusters be used to predict functional modules and protein function? Bioinformatics 2009; 25:3143-50. [PMID: 19770263 PMCID: PMC3167697 DOI: 10.1093/bioinformatics/btp551] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022]
Abstract
Motivation: Clustering of protein–protein interaction networks is one of the most common approaches for predicting functional modules, protein complexes and protein functions. But how well does clustering perform at these tasks? Results: We develop a general framework to assess how well computationally derived clusters in physical interactomes overlap functional modules derived via the Gene Ontology (GO). Using this framework, we evaluate six diverse network clustering algorithms using Saccharomyces cerevisiae and show that (i) the performances of these algorithms can differ substantially when run on the same network and (ii) their relative performances change depending upon the topological characteristics of the network under consideration. For the specific task of function prediction in S. cerevisiae, we demonstrate that, surprisingly, a simple non-clustering guilt-by-association approach outperforms widely used clustering-based approaches that annotate a protein with the overrepresented biological process and cellular component terms in its cluster; this is true over the range of clustering algorithms considered. Further analysis parameterizes performance based on the number of annotated proteins, and suggests when clustering approaches should be used for interactome functional analyses. Overall, our results suggest a re-examination of when and how clustering approaches should be applied to physical interactomes, and establish guidelines by which novel clustering approaches for biological networks should be justified and evaluated with respect to functional analysis. Contact: msingh@cs.princeton.edu Supplementary information: Supplementary data are available at Bioinformatics online.
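The guilt-by-association idea named in this abstract can be sketched as a simple neighbor vote. This is a minimal illustration under the assumption that a protein receives the annotation terms most frequent among its direct interaction partners; it is not the authors' exact scoring scheme, and the protein names, terms, and function name are hypothetical.

```python
from collections import Counter


def predict_by_neighbors(protein, interactions, annotations):
    """Return the terms most common among a protein's direct partners.

    `interactions` is a list of undirected protein pairs.
    `annotations` maps a protein to a set of known terms.
    Ties are returned together, sorted alphabetically.
    """
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    votes = Counter()
    for neighbor in neighbors:
        votes.update(annotations.get(neighbor, ()))
    if not votes:
        return []  # no annotated neighbors, so no prediction
    best = max(votes.values())
    return sorted(term for term, count in votes.items() if count == best)


# Hypothetical toy interactome with one unannotated protein, "P1".
toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P4", "P1"), ("P5", "P6")]
toy_annotations = {
    "P2": {"ribosome biogenesis"},
    "P3": {"ribosome biogenesis", "translation"},
    "P4": {"DNA repair"},
}
prediction = predict_by_neighbors("P1", toy_interactions, toy_annotations)
```

Because the vote uses only direct partners, it needs no clustering step at all, which is what makes the comparison reported in the abstract notable.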
Affiliation(s)
- Jimin Song
  - Department of Computer Science & Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
12
Andreopoulos B, Winter C, Labudde D, Schroeder M. Triangle network motifs predict complexes by complementing high-error interactomes with structural information. BMC Bioinformatics 2009; 10:196. [PMID: 19558694 PMCID: PMC2714575 DOI: 10.1186/1471-2105-10-196] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/14/2009] [Accepted: 06/27/2009] [Indexed: 11/30/2022] Open
Abstract
Background A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. 
We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
Affiliation(s)
- Bill Andreopoulos
  - Biotechnology Center (BIOTEC), Technische Universität Dresden, 01307 Dresden, Germany
13
Chiang T, Scholtens D, Sarkar D, Gentleman R, Huber W. Coverage and error models of protein-protein interaction data by directed graph analysis. Genome Biol 2008; 8:R186. [PMID: 17845715 PMCID: PMC2375024 DOI: 10.1186/gb-2007-8-9-r186] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 03/12/2007] [Revised: 05/26/2007] [Accepted: 09/10/2007] [Indexed: 01/10/2023] Open
Abstract
Using a directed graph model for bait to prey systems and a multinomial error model, we assessed the error statistics in all published large-scale datasets for Saccharomyces cerevisiae and characterized them by three traits: the set of tested interactions, artifacts that lead to false-positive or false-negative observations, and estimates of the stochastic error rates that affect the data. These traits provide a prerequisite for the estimation of the protein interactome and its modules.
Affiliation(s)
- Tony Chiang
  - EMBL, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Fred Hutchinson Cancer Research Center, Computational Biology Group, Fairview Avenue North, Seattle, WA 98109-1024, USA
- Denise Scholtens
  - Northwestern University, Department of Preventive Medicine, N Lake Shore Drive, Chicago, IL 60611-4402, USA
- Deepayan Sarkar
  - Fred Hutchinson Cancer Research Center, Computational Biology Group, Fairview Avenue North, Seattle, WA 98109-1024, USA
- Robert Gentleman
  - Fred Hutchinson Cancer Research Center, Computational Biology Group, Fairview Avenue North, Seattle, WA 98109-1024, USA
- Wolfgang Huber
  - EMBL, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
14
Baudot A, Angelelli JB, Guénoche A, Jacq B, Brun C. Defining a modular signalling network from the fly interactome. BMC Syst Biol 2008; 2:45. [PMID: 18489752 PMCID: PMC2405789 DOI: 10.1186/1752-0509-2-45] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 04/25/2008] [Accepted: 05/19/2008] [Indexed: 01/22/2023]
Abstract
Background Signalling pathways relay information by transmitting signals from cell surface receptors to intracellular effectors that eventually activate the transcription of target genes. Since signalling pathways involve several types of molecular interactions including protein-protein interactions, we postulated that investigating their organization in the context of the global protein-protein interaction network could provide a new integrated view of signalling mechanisms. Results Using a graph-theory based method to analyse the fly protein-protein interaction network, we found that each signalling pathway is organized in two to three different signalling modules. These modules contain canonical proteins of the signalling pathways, known regulators, and other proteins thereby predicted to participate in the signalling mechanisms. Connections between the signalling modules are prominent compared with those of the network's other modules, and interactions within and between signalling modules are among the more central routes of the interaction network. Conclusion Altogether, these modules form an interactome sub-network devoted to signalling with particular topological properties: modularity, density and centrality. This finding reflects the integration of the signalling system into cell functioning and its important role connecting and coordinating different biological processes at the level of the interactome.
Affiliation(s)
- Anaïs Baudot
- Institut de Biologie du Développement de Marseille-Luminy, UMR6216, CNRS/Université de Méditerranée, Marseille, France.
15
Abstract
We review the estimation of coverage and error rate in high-throughput protein-protein interaction datasets and argue that reports of the low quality of such data are to a substantial extent based on misinterpretations. Probabilistic statistical models and methods can be used to estimate properties of interest and to make the best use of the available data.
16
Zhang B, Park BH, Karpinets T, Samatova NF. From pull-down data to protein interaction networks and complexes with biological relevance. Bioinformatics 2008; 24:979-86. [PMID: 18304937 DOI: 10.1093/bioinformatics/btn036] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Indexed: 11/15/2022] Open
Abstract
Motivation: Recent improvements in high-throughput Mass Spectrometry (MS) technology have expedited genome-wide discovery of protein-protein interactions by providing a capability of detecting protein complexes in a physiological setting. Computational inference of protein interaction networks and protein complexes from MS data is challenging. Advances are required in developing robust and seamlessly integrated procedures for assessment of protein-protein interaction affinities, mathematical representation of protein interaction networks, discovery of protein complexes and evaluation of their biological relevance. Results: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data is introduced. It assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data. It constructs a protein interaction network by adopting a knowledge-guided threshold selection method. Based on the network, it identifies protein complexes and infers their core components using a graph-theoretical approach. It deploys a statistical evaluation procedure to assess the biological relevance of each complex found. On Saccharomyces cerevisiae pull-down data, the framework outperformed other more complicated schemes by at least 10% in F(1)-measure and identified 610 protein complexes with high functional homogeneity based on the enrichment in Gene Ontology (GO) annotation. Manual examination of the complexes suggested hypotheses about the causes of false identifications. Namely, co-purification of different protein complexes mediated by a common non-protein molecule, such as DNA, might be a source of false positives. Protein identification bias in pull-down technology, such as the hydrophilic bias, could result in false negatives.
Affiliation(s)
- Bing Zhang
- Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
17
18
19
Wang Z, Zhang J. In search of the biological significance of modular structures in protein networks. PLoS Comput Biol 2007; 3:e107. [PMID: 17542644 PMCID: PMC1885274 DOI: 10.1371/journal.pcbi.0030107] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.4] [Received: 09/15/2006] [Accepted: 04/26/2007] [Indexed: 12/02/2022] Open
Abstract
Many complex networks such as computer and social networks exhibit modular structures, where links between nodes are much denser within modules than between modules. It is widely believed that cellular networks are also modular, reflecting the relative independence and coherence of different functional units in a cell. While many authors have claimed that observations from the yeast protein–protein interaction (PPI) network support the above hypothesis, the observed structural modularity may be an artifact because the current PPI data include interactions inferred from protein complexes through approaches that create modules (e.g., assigning pairwise interactions among all proteins in a complex). Here we analyze the yeast PPI network including protein complexes (PIC network) and excluding complexes (PEC network). We find that both PIC and PEC networks show a significantly greater structural modularity than that of randomly rewired networks. Nonetheless, there is little evidence that the structural modules correspond to functional units, particularly in the PEC network. More disturbingly, there is no evolutionary conservation among yeast, fly, and nematode modules at either the whole-module or protein-pair level. Neither is there a correlation between the evolutionary or phylogenetic conservation of a protein and the extent of its participation in various modules. Using computer simulation, we demonstrate that a higher-than-expected modularity can arise during network growth through a simple model of gene duplication, without natural selection for modularity. Taken together, our results suggest the intriguing possibility that the structural modules in the PPI network originated as an evolutionary byproduct without biological significance.
Many complex networks are naturally divided into communities or modules, where links within modules are much denser than those across modules. For example, human individuals belonging to the same ethnic groups interact more than those from different ethnic groups. Cellular functions are also organized in a highly modular manner, where each module is a discrete object composed of a group of tightly linked components and performs a relatively independent task. It is interesting to ask whether this modularity in cellular function arises from modularity in molecular interaction networks such as the transcriptional regulatory network and protein–protein interaction (PPI) network. We analyze the yeast PPI network and show that it is indeed significantly more modular than randomly rewired networks. However, we find little evidence that the structural modules correspond to functional units. We also fail to observe any evolutionary conservation among yeast, fly, and nematode PPI modules. We then show by computer simulation that modular structures can arise during network growth via a simple model of gene duplication, without natural selection for modularity. Thus, it appears that the structural modules in the PPI network may have originated as an evolutionary byproduct without much biological significance.
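The comparison against "randomly rewired networks" mentioned above can be illustrated with a degree-preserving randomization. The following is a minimal sketch, not the authors' procedure: it assumes the null model is built by repeated double-edge swaps on a simple undirected graph, and the names `rewire` and `degrees` are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Return a Counter mapping each node to its degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization by double-edge swaps.

    Assumes a simple undirected graph (no self-loops, no duplicate
    edges) with at least two edges. Each accepted swap turns edges
    a-b and c-d into a-d and c-b, so every node keeps its degree.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    # Cap the attempts so small or dense graphs cannot loop forever.
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # try the other pairing half of the time
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that create self-loops or duplicate edges.
        if a == d or c == b or new1 == new2:
            continue
        if new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list
```

A modularity score computed on the observed network would then be compared with the distribution of scores over many such rewired copies; the modularity measure itself is left out of this sketch.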
Affiliation(s)
- Zhi Wang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- * To whom correspondence should be addressed. E-mail:
20
Brohée S, van Helden J. Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics 2006; 7:488. [PMID: 17087821 PMCID: PMC1637120 DOI: 10.1186/1471-2105-7-488] [Citation(s) in RCA: 464] [Impact Index Per Article: 25.8] [Received: 03/15/2006] [Accepted: 11/06/2006] [Indexed: 11/26/2022] Open
Abstract
Background: Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism). In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies). High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions. The interactome can be represented as a graph where nodes correspond with proteins and edges with pairwise interactions. In recent years clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE). Results: A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions. Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes. We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values. We also evaluated their robustness to alterations of the test graph. We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes.
Conclusion: This analysis shows that MCL is remarkably robust to graph alterations. In the tests of robustness, RNSC is more sensitive to edge deletion but less sensitive to the use of suboptimal parameter values. The other two algorithms are clearly weaker under most conditions. The analysis of high-throughput data supports the superiority of MCL for the extraction of complexes from interaction networks.
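The robustness test described above relies on altered graphs with edges randomly removed and added. A minimal sketch of one way to derive such a graph follows; the function name `alter_graph`, the parameters `remove_frac` and `add_frac`, and the choice to express both as fractions of the original edge count are assumptions for illustration, not details taken from the paper.

```python
import random
from itertools import combinations


def alter_graph(nodes, edges, remove_frac=0.0, add_frac=0.0, seed=0):
    """Return a perturbed copy of an undirected graph as a set of edges.

    remove_frac: fraction of original edges deleted (simulated false negatives).
    add_frac: number of spurious edges added, as a fraction of the
              original edge count (simulated false positives).
    """
    rng = random.Random(seed)
    original = {frozenset(e) for e in edges}
    n_remove = round(remove_frac * len(original))
    n_add = round(add_frac * len(original))

    # Keep a random subset of the true edges (sorted first for reproducibility).
    ordered = sorted(original, key=sorted)
    kept = set(rng.sample(ordered, len(ordered) - n_remove))

    # Draw spurious edges from node pairs that were not connected originally.
    candidates = [frozenset(p) for p in combinations(sorted(nodes), 2)
                  if frozenset(p) not in original]
    added = set(rng.sample(candidates, min(n_add, len(candidates))))
    return kept | added
```

Running a clustering algorithm on a grid of `remove_frac` and `add_frac` values, and scoring the clusters against the annotated complexes each time, would reproduce the general shape of the evaluation.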
Affiliation(s)
- Sylvain Brohée
- Service de Conformation des Macromolécules Biologiques et de Bioinformatique. Université Libre de Bruxelles, CP 263, Campus Plaine, Bd. du Triomphe, B-1050 Bruxelles, Belgium
- Jacques van Helden
- Service de Conformation des Macromolécules Biologiques et de Bioinformatique. Université Libre de Bruxelles, CP 263, Campus Plaine, Bd. du Triomphe, B-1050 Bruxelles, Belgium
21
Lin YS, Hwang JK, Li WH. Protein complexity, gene duplicability and gene dispensability in the yeast genome. Gene 2006; 387:109-17. [PMID: 17049186 PMCID: PMC2707112 DOI: 10.1016/j.gene.2006.08.022] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 05/09/2006] [Revised: 08/14/2006] [Accepted: 08/21/2006] [Indexed: 11/23/2022]
Abstract
Using functional genomic and protein structural data, we studied the effects of protein complexity (here defined as the number of subunit types in a protein) on gene dispensability and gene duplicability. We found that in terms of gene duplicability the major distinction in protein complexity is between hetero-complexes, each of which includes at least two different types of subunits (polypeptides), and homo-complexes, which include monomers and complexes that consist of only subunits of one polypeptide type. However, gene dispensability decreases only gradually as the number of subunit types in a protein complex increases. These observations suggest that the dosage balance hypothesis explains the gene duplicability of complex proteins well, but cannot completely explain the difference in dispensabilities between hetero-complex subunits. It is likely that knocking out a gene coding for a hetero-complex subunit would disrupt the function of the whole complex, so that the deletion effect on fitness would increase with protein complexity. We also found that multi-domain polypeptide genes are less dispensable but more duplicable than single-domain polypeptide genes. Duplicate genes derived from the whole genome duplication event in yeast are more dispensable (except for ribosomal protein genes) than other duplicate genes. Further, we found that subunits of the same protein complex tend to have similar expression levels and similar effects of gene deletion on fitness. Finally, we estimated that in yeast the contribution of duplicate genes to genetic robustness against null mutation is approximately 9%, smaller than previously estimated. In yeast, protein complexity may serve as a better indicator of gene dispensability than do duplicate genes.
Affiliation(s)
- Yeong-Shin Lin
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan 300
- Department of Ecology and Evolution, University of Chicago, 1101 East 57 Street, Chicago, Illinois 60637, USA
- Jenn-Kang Hwang
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan 300
- Wen-Hsiung Li
- Department of Ecology and Evolution, University of Chicago, 1101 East 57 Street, Chicago, Illinois 60637, USA
- Address for Correspondence: Wen-Hsiung Li, Department of Ecology and Evolution, University of Chicago, 1101 East 57 Street, Chicago, Illinois 60637, USA, E-MAIL ; TEL (773) 702-3104; FAX (773) 702-9740
22
Lubovac Z, Gamalielsson J, Olsson B. Combining functional and topological properties to identify core modules in protein interaction networks. Proteins 2006; 64:948-59. [PMID: 16794996 DOI: 10.1002/prot.21071] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Indexed: 11/11/2022]
Abstract
Advances in large-scale technologies in proteomics, such as yeast two-hybrid screening and mass spectrometry, have made it possible to generate large Protein Interaction Networks (PINs). Recent methods for identifying dense sub-graphs in such networks have been based solely on graph theoretic properties. Therefore, there is a need for an approach that will allow us to combine domain-specific knowledge with topological properties to generate functionally relevant sub-graphs from large networks. This article describes two alternative network measures for analysis of PINs, which combine functional information with topological properties of the networks. These measures, called weighted clustering coefficient and weighted average nearest-neighbors degree, use weights representing the strengths of interactions between the proteins, calculated according to their semantic similarity, which is based on the Gene Ontology terms of the proteins. We perform a global analysis of the yeast PIN by systematically comparing the weighted measures with their topological counterparts. To show the usefulness of the weighted measures, we develop an algorithm for identification of functional modules, called SWEMODE (Semantic WEights for MODule Elucidation), that identifies dense sub-graphs containing functionally similar proteins. The proposed method is based on the ranking of nodes, i.e., proteins, according to their weighted neighborhood cohesiveness. The highest ranked nodes are considered as seeds for candidate modules. The algorithm then iterates through the neighborhood of each seed protein, to identify densely connected proteins with high functional similarity, according to the chosen parameters. Using a yeast two-hybrid data set of experimentally determined protein-protein interactions, we demonstrate that SWEMODE is able to identify dense clusters containing proteins that are functionally similar. 
Many of the identified modules correspond to known complexes or subunits of these complexes.
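The abstract names a weighted clustering coefficient but does not give its formula. As a minimal sketch, the code below assumes the commonly used Barrat-style definition, in which each closed triangle around a node contributes the mean weight of its two incident edges, normalised by the node's strength times (degree minus one). The weights stand in for GO-based semantic similarities and are invented for illustration; `weighted_clustering` is a hypothetical name.

```python
def weighted_clustering(adj, i):
    """Barrat-style weighted clustering coefficient of node i.

    adj is a symmetric dict of dicts: adj[u][v] is the weight of edge u-v.
    Assumption: this formulation may differ from the one used in the paper.
    """
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbours; use 0 by convention
    strength = sum(adj[i].values())
    total = 0.0
    # Sum over ordered neighbour pairs (j, h) that are themselves connected.
    for j in nbrs:
        for h in nbrs:
            if j != h and h in adj[j]:
                total += (adj[i][j] + adj[i][h]) / 2
    return total / (strength * (k - 1))


# Toy network: triangle a-b-c plus a pendant node d attached to a.
adj = {
    "a": {"b": 0.9, "c": 0.5, "d": 0.6},
    "b": {"a": 0.9, "c": 0.2},
    "c": {"a": 0.5, "b": 0.2},
    "d": {"a": 0.6},
}
```

With unit weights the expression reduces to the ordinary (topological) clustering coefficient, which is the comparison the abstract describes.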
Affiliation(s)
- Zelmina Lubovac
- School of Humanities and Informatics, University of Skövde, Skövde, Sweden.
23
Uhrig JF. Protein interaction networks in plants. Planta 2006; 224:771-81. [PMID: 16575597 DOI: 10.1007/s00425-006-0260-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 11/10/2005] [Accepted: 03/03/2006] [Indexed: 05/08/2023]
Abstract
Protein-protein interactions are fundamental to virtually every aspect of cellular function. With the development of high-throughput technologies of both the yeast two-hybrid system and tandem mass spectrometry, genome-wide protein-linkage mapping has become a major objective in post-genomic research. While at least partial "interactome" networks of several model organisms are already available, in the plant field, progress in this respect is slow. However, even with comprehensive protein interaction data still missing, substantial recent advances in the graph-theoretical functional interpretation of complex network architectures might pave the way for novel approaches in plant research. This article reviews current progress and discussions in network biology. Emphasis is put on the question of what can be learned about protein functions and cellular processes by studying the topology of complex protein interaction networks and the evolutionary mechanisms underlying their development. Particularly the intermediate and local levels of network organization--the modules, motifs and cliques--are increasingly recognized as the operational units of biological functions. As demonstrated by some recent results from systematic analyses of plant protein families, protein interaction networks promise to be a valuable tool for a molecular understanding of functional specificities and for identifying novel regulatory components and pathways.
Affiliation(s)
- Joachim F Uhrig
- Botanisches Institut III, Universität zu Köln, Gyrhof Strasse 15, 50931 Köln, Germany.
24
Poyatos JF, Hurst LD. Is optimal gene order impossible? Trends Genet 2006; 22:420-3. [PMID: 16806566 DOI: 10.1016/j.tig.2006.06.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 10/05/2005] [Revised: 03/23/2006] [Accepted: 06/06/2006] [Indexed: 11/27/2022]
Abstract
Recent evidence suggests that yeast genes encoding proteins that are present in the same protein complex tend to be linked and to be co-expressed. More generally, we found that genes that are close to each other in the protein interaction network tend to be linked more often than expected and are often co-expressed. Unexpectedly, we found that linked genes in network proximity have unusually high recombination rates. Because high recombination rates are associated with high rates of genome re-organization, our findings might explain why the clustering of genes in proximity in the network is such a weak effect: there could be a co-evolutionary cycle of physical linkage for co-expression, upwards modification of the recombination rate and concomitant break-up of a cluster. Under such a model an "optimal" gene order is never stable.
Affiliation(s)
- Juan F Poyatos
- Evolutionary Systems Biology Initiative, Structural and Computational Biology Programme, Spanish National Cancer Centre (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain.
25
Lee WP, Jeng BC, Pai TW, Tsai CP, Yu CY, Tzou WS. Differential evolutionary conservation of motif modes in the yeast protein interaction network. BMC Genomics 2006; 7:89. [PMID: 16638125 PMCID: PMC1501022 DOI: 10.1186/1471-2164-7-89] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 01/10/2006] [Accepted: 04/25/2006] [Indexed: 01/03/2023] Open
Abstract
Background: The importance of a network motif (a recurring interconnected pattern of special topology which is over-represented in a biological network) lies in its position in the hierarchy between the protein molecule and the module in a protein-protein interaction network. Until now, however, the methods available have greatly restricted the scope of research. While they have focused on the analysis in the resolution of a motif topology, they have not been able to distinguish particular motifs of the same topology in a protein-protein interaction network. Results: We have been able to assign the molecular function annotations of Gene Ontology to each protein in the protein-protein interactions of Saccharomyces cerevisiae. For various motif topologies, we have developed an algorithm, enabling us to unveil one million "motif modes", each of which features a unique topological combination of molecular functions. To our surprise, the conservation ratio, i.e., the extent of the evolutionary constraints upon the motif modes of the same motif topology, varies significantly, clearly indicative of distinct differences in the evolutionary constraints upon motifs of the same motif topology. Equally important, for all motif modes, we have found a power-law distribution of the motif counts on each motif mode. We postulate that motif modes may very well represent the evolutionary-conserved topological units of a protein interaction network. Conclusion: For the first time, the motifs of a protein interaction network have been investigated beyond the scope of motif topology. The motif modes determined in this study have not only enabled us to differentiate among different evolutionary constraints on motifs of the same topology but have also opened up new avenues through which protein interaction networks can be analyzed.
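The notion of a "motif mode" (one motif topology combined with a particular assignment of molecular functions) can be made concrete for the simplest case. The sketch below is an illustration only and not the authors' algorithm: it handles the triangle motif alone, where all three positions are equivalent, so a mode can be represented by the sorted tuple of the members' annotations. The function name and the one-label-per-protein annotation are assumptions.

```python
from collections import Counter
from itertools import combinations


def triangle_motif_modes(edges, annotation):
    """Count triangle motifs grouped by their combination of annotations.

    edges: iterable of undirected (u, v) pairs.
    annotation: dict mapping each protein to a single function label
                (a simplification; real proteins may carry several GO terms).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    modes = Counter()
    # Brute-force enumeration of node triples; fine for small examples only.
    for u, v, w in combinations(sorted(adj), 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:
            mode = tuple(sorted(annotation[x] for x in (u, v, w)))
            modes[mode] += 1
    return modes


# Toy network: two triangles sharing protein C.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("C", "E")]
annotation = {"A": "kinase", "B": "kinase", "C": "binding",
              "D": "kinase", "E": "kinase"}
modes = triangle_motif_modes(edges, annotation)
```

For topologies whose positions are not interchangeable (for example, a three-node chain), the mode would need to record which label sits at which position rather than a sorted tuple.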
Affiliation(s)
- Wei-Po Lee
- Department of Information Management, National University of Kaohsiung, Taiwan
- Bing-Chiang Jeng
- Department of Information Management, National Sun Yat-sen University, Taiwan
- Tun-Wen Pai
- Department of Computer Science, National Taiwan Ocean University, Taiwan
- Chin-Pei Tsai
- Department of Applied Mathematics, Providence University, Taiwan
- Chang-Yung Yu
- Department of Applied Mathematics, Providence University, Taiwan
- Wen-Shyong Tzou
- Institute of Bioscience and Biotechnology, National Taiwan Ocean University, Taiwan
- Center for Marine Bioscience and Biotechnology, National Taiwan Ocean University, Taiwan
26
Farutin V, Robison K, Lightcap E, Dancik V, Ruttenberg A, Letovsky S, Pradines J. Edge-count probabilities for the identification of local protein communities and their organization. Proteins 2005; 62:800-18. [PMID: 16372355 DOI: 10.1002/prot.20799] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
We present a computational approach based on a local search strategy that discovers sets of proteins that preferentially interact with each other. Such sets are referred to as protein communities and are likely to represent functional modules. Preferential interaction between module members is quantified via an analytical framework based on a network null model known as the random graph with given expected degrees. Based on this framework, the concept of local protein community is generalized to that of community of communities. Protein communities and higher-level structures are extracted from two yeast protein interaction data sets and a network of published interactions between human proteins. The high level structures obtained with the human network correspond to broad biological concepts such as signal transduction, regulation of gene expression, and intercellular communication. Many of the obtained human communities are enriched, in a statistically significant way, for proteins having no clear orthologs in lower organisms. This indicates that the extracted modules are quite coherent in terms of function.
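To illustrate how preferential interaction can be measured against the "random graph with given expected degrees" null model, the sketch below compares the observed number of edges inside a candidate community with the number expected when each pair (u, v) is linked with probability d_u d_v / 2m. This is a simplified stand-in, assuming the standard form of that null model; the paper's actual edge-count probability calculation is not given in the abstract, and `internal_edge_counts` is a hypothetical name.

```python
from itertools import combinations


def internal_edge_counts(edges, community):
    """Return (observed, expected) internal edge counts for a node set.

    Expected counts use P(u-v) = d_u * d_v / (2m), capped at 1,
    where d is the observed degree and m the number of edges.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    two_m = sum(degree.values())

    members = sorted(community)
    expected = sum(
        min(1.0, degree.get(u, 0) * degree.get(v, 0) / two_m)
        for u, v in combinations(members, 2)
    )
    inside = set(community)
    observed = sum(1 for u, v in edges if u in inside and v in inside)
    return observed, expected


# Toy network: triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
observed, expected = internal_edge_counts(edges, {"a", "b", "c"})
```

An observed count well above the expected one suggests a set of proteins that interact preferentially with each other; turning that gap into a probability requires the distributional treatment developed in the paper.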
Affiliation(s)
- Victor Farutin
- Computational Sciences, Informatics, Millennium Pharmaceuticals Inc., Cambridge, Massachusetts 02139, USA.
27
Han JDJ, Dupuy D, Bertin N, Cusick ME, Vidal M. Effect of sampling on topology predictions of protein-protein interaction networks. Nat Biotechnol 2005; 23:839-44. [PMID: 16003372 DOI: 10.1038/nbt1116] [Citation(s) in RCA: 181] [Impact Index Per Article: 9.5] [Indexed: 12/21/2022]
Abstract
Currently available protein-protein interaction (PPI) network or 'interactome' maps, obtained with the yeast two-hybrid (Y2H) assay or by co-affinity purification followed by mass spectrometry (co-AP/MS), only cover a fraction of the complete PPI networks. These partial networks display scale-free topologies--most proteins participate in only a few interactions whereas a few proteins have many interaction partners. Here we analyze whether the scale-free topologies of the partial networks obtained from Y2H assays can be used to accurately infer the topology of complete interactomes. We generated four theoretical interaction networks of different topologies (random, exponential, power law, truncated normal). Partial sampling of these networks resulted in sub-networks with topological characteristics that were virtually indistinguishable from those of currently available Y2H-derived partial interactome maps. We conclude that given the current limited coverage levels, the observed scale-free topology of existing interactome maps cannot be confidently extrapolated to complete interactomes.
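The sampling experiment can be sketched as follows: build a theoretical network, keep only interactions that involve a sampled "bait" protein and that are detected with some probability, then compare degree distributions before and after. This is a rough illustration under stated assumptions; the random (Erdős-Rényi) generator, the parameters `bait_frac` and `detect_prob`, and all function names are hypothetical and not taken from the paper.

```python
import random
from collections import Counter
from itertools import combinations


def random_network(n, p, seed=0):
    """Random graph on nodes 0..n-1: each pair is linked with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def sample_network(edges, n, bait_frac, detect_prob, seed=0):
    """Keep edges touching a sampled bait, each detected with detect_prob."""
    rng = random.Random(seed)
    baits = set(rng.sample(range(n), round(bait_frac * n)))
    return [(u, v) for u, v in edges
            if (u in baits or v in baits) and rng.random() < detect_prob]


def degree_histogram(edges):
    """Map each degree value to the number of nodes having that degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(deg.values())


full = random_network(200, 0.05, seed=1)
partial = sample_network(full, 200, bait_frac=0.2, detect_prob=0.3, seed=2)
hist_full = degree_histogram(full)
hist_partial = degree_histogram(partial)
```

Plotting `hist_partial` on log-log axes for networks generated from different underlying topologies is the kind of comparison the abstract describes.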
Affiliation(s)
- Jing-Dong J Han
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
28
Cusick ME, Klitgord N, Vidal M, Hill DE. Interactome: gateway into systems biology. Hum Mol Genet 2005; 14 Spec No. 2:R171-81. [PMID: 16162640 DOI: 10.1093/hmg/ddi335] [Citation(s) in RCA: 263] [Impact Index Per Article: 13.8] [Indexed: 01/24/2023] Open
Abstract
Protein-protein interactions are fundamental to all biological processes, and a comprehensive determination of all protein-protein interactions that can take place in an organism provides a framework for understanding biology as an integrated system. The availability of genome-scale sets of cloned open reading frames has facilitated systematic efforts at creating proteome-scale data sets of protein-protein interactions, which are represented as complex networks or 'interactome' maps. Protein-protein interaction mapping projects that follow stringent criteria, coupled with experimental validation in orthogonal systems, provide high-confidence data sets eminently useful for interrogating developmental and disease mechanisms at a system level as well as elucidating individual protein function and interactome network topology. Although far from complete, currently available maps provide insight into how biochemical properties of proteins and protein complexes are integrated into biological systems. Such maps are also a useful resource to predict the function(s) of thousands of genes.
Affiliation(s)
- Michael E Cusick
- Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, 44 Binney Street, Boston, MA 02115, USA.
29
Uetz P, Finley RL. From protein networks to biological systems. FEBS Lett 2005; 579:1821-7. [PMID: 15763558 DOI: 10.1016/j.febslet.2005.02.001] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Received: 01/15/2005] [Accepted: 01/31/2005] [Indexed: 11/21/2022]
Abstract
A system-level understanding of any biological process requires a map of the relationships among the various molecules involved. Technologies to detect and predict protein interactions have begun to produce very large maps of protein interactions, some including most of an organism's proteins. These maps can be used to study how proteins work together to form molecular machines and regulatory pathways. They also provide a framework for constructing predictive models of how information and energy flow through biological networks. In many respects, protein interaction maps are an entrée into systems biology.
Affiliation(s)
- Peter Uetz
- Research Center Karlsruhe, Institute of Genetics, P.O. Box 3640, D-76021 Karlsruhe, Germany.
30
Huynen MA, Gabaldón T, Snel B. Variation and evolution of biomolecular systems: Searching for functional relevance. FEBS Lett 2005; 579:1839-45. [PMID: 15763561 DOI: 10.1016/j.febslet.2005.02.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 01/18/2005] [Revised: 01/18/2005] [Accepted: 02/01/2005] [Indexed: 11/29/2022]
Abstract
The availability of genome sequences and functional genomics data from multiple species enables us to compare the composition of biomolecular systems like biochemical pathways and protein complexes between species. Here, we review small- and large-scale, "genomics-based" approaches to biomolecular systems variation. In general, caution is required when comparing the results of bioinformatics analyses of genomes or of functional genomics data between species. Limitations to the sensitivity of sequence analysis tools and the noisy nature of genomics data tend to lead to systematic overestimates of the amount of variation. Nevertheless, the results from detailed manual analyses, and of large-scale analyses that filter out systematic biases, point to a large amount of variation in the composition of biomolecular systems. Such observations challenge our understanding of the function of the systems and their individual components and can potentially facilitate the identification and functional characterization of sub-systems within a system. Mapping the inter-species variation of complex biomolecular systems on a phylogenetic species tree allows one to reconstruct their evolution.
Affiliation(s)
- Martijn A Huynen
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.