1. Diepeveen ET, Gehrmann T, Pourquié V, Abeel T, Laan L. Patterns of Conservation and Diversification in the Fungal Polarization Network. Genome Biol Evol 2018; 10:1765-1782. [PMID: 29931311 PMCID: PMC6054225 DOI: 10.1093/gbe/evy121] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Accepted: 06/18/2018] [Indexed: 12/12/2022]
Abstract
The combined actions of proteins in networks underlie all fundamental cellular functions. Deeper insights into the dynamics of network composition across species and their functional consequences are crucial to fully understand protein network evolution. Large-scale comparative studies with high phylogenetic resolution are now feasible through the recent rise in available genomic data sets of both model and nonmodel species. Here, we focus on the polarity network, which is universally essential for cell proliferation and studied in great detail in the model organism, Saccharomyces cerevisiae. We examine 42 proteins, directly related to cell polarization, across 298 fungal strains/species to determine the composition of the network and patterns of conservation and diversification. We observe strong protein conservation for a group of 23 core proteins: >95% of all examined strains/species possess at least 14 of these core proteins, albeit in varying compositions, and none of the individual core proteins is 100% conserved. We find high levels of variation in prevalence and sequence identity in the remaining 19 proteins, resulting in distinct lineage-specific compositions of the network in the majority of strains/species. We show that the observed diversification in network composition correlates with lineage, lifestyle, and genetic distance. Yeast, filamentous, and basal unicellular fungi form distinctive groups based on these analyses, with substantial differences in their polarization networks. Our study shows that the fungal polarization network is highly dynamic, even between closely related species, and that functional conservation appears to be achieved by varying the specific components of the fungal polarization repertoire.
Affiliation(s)
- Eveline T Diepeveen
  - Department of Bionanoscience, Faculty of Applied Sciences, Kavli Institute of NanoScience, Delft University of Technology, The Netherlands
- Thies Gehrmann
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Intelligent Systems, Delft University of Technology, The Netherlands
  - Department of Molecular Epidemiology, Leiden Computational Biology Center, Leiden University Medical Centre, The Netherlands
- Valérie Pourquié
  - Department of Bionanoscience, Faculty of Applied Sciences, Kavli Institute of NanoScience, Delft University of Technology, The Netherlands
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Intelligent Systems, Delft University of Technology, The Netherlands
- Thomas Abeel
  - Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Intelligent Systems, Delft University of Technology, The Netherlands
  - Genome Sequencing and Analysis Program, Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts
- Liedewij Laan
  - Department of Bionanoscience, Faculty of Applied Sciences, Kavli Institute of NanoScience, Delft University of Technology, The Netherlands
2. Roux J, Liu J, Robinson-Rechavi M. Selective Constraints on Coding Sequences of Nervous System Genes Are a Major Determinant of Duplicate Gene Retention in Vertebrates. Mol Biol Evol 2018; 34:2773-2791. [PMID: 28981708 PMCID: PMC5850798 DOI: 10.1093/molbev/msx199] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/12/2022]
Abstract
The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation.
Affiliation(s)
- Julien Roux
  - Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Jialin Liu
  - Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Département d'Ecologie et d'Evolution, Université de Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
3. Evolution and tinkering: what do a protein kinase, a transcriptional regulator and chromosome segregation/cell division proteins have in common? Curr Genet 2015; 62:67-70. [PMID: 26286503 DOI: 10.1007/s00294-015-0513-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/06/2015] [Revised: 08/08/2015] [Accepted: 08/10/2015] [Indexed: 01/14/2023]
Abstract
In this study, we focus on functional interactions among multi-domain proteins which share a common evolutionary origin. The examples we develop are four Bacillus subtilis proteins, which all possess an ATP-binding Walker motif: the bacterial tyrosine kinase (BY-kinase) PtkA, the chromosome segregation protein Soj (ParA), the cell division protein MinD and a transcription regulator SalA. These proteins have arisen via duplication of the ancestral ATP-binding domain, which has undergone fusions with other functional domains in the process of divergent evolution. We point out that these four proteins, despite having very different physiological roles, engage in an unusually high number of binary functional interactions. Namely, MinD attracts Soj and PtkA to the cell pole, and in addition, activates the kinase function of PtkA. SalA also activates the kinase function of PtkA, and it gets phosphorylated by PtkA as well. The consequence of this phosphorylation is the activation of SalA as a transcriptional repressor. We hypothesize that these functional interactions remain preserved during divergent evolution and represent a constraint on the process of evolutionary "tinkering", brought about by fusions of different functional domains.
4. Feng S, Ollivier JF, Swain PS, Soyer OS. BioJazz: in silico evolution of cellular networks with unbounded complexity using rule-based modeling. Nucleic Acids Res 2015; 43:e123. [PMID: 26101250 PMCID: PMC4627059 DOI: 10.1093/nar/gkv595] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/22/2014] [Accepted: 05/26/2015] [Indexed: 11/13/2022]
Abstract
Systems biologists aim to decipher the structure and dynamics of signaling and regulatory networks underpinning cellular responses; synthetic biologists can use this insight to alter existing networks or engineer de novo ones. Both tasks will benefit from an understanding of which structural and dynamic features of networks can emerge from evolutionary processes, through which intermediary steps these arise, and whether they embody general design principles. As natural evolution at the level of network dynamics is difficult to study, in silico evolution of network models can provide important insights. However, current tools used for in silico evolution of network dynamics are limited to ad hoc computer simulations and models. Here we introduce BioJazz, an extendable, user-friendly tool for simulating the evolution of dynamic biochemical networks. Unlike previous tools for in silico evolution, BioJazz allows for the evolution of cellular networks with unbounded complexity by combining rule-based modeling with an encoding of networks that is akin to a genome. We show that BioJazz can be used to implement biologically realistic selective pressures and allows exploration of the space of network architectures and dynamics that implement prescribed physiological functions. BioJazz is provided as an open-source tool to facilitate its further development and use. Source code and user manuals are available at: http://oss-lab.github.io/biojazz and http://osslab.lifesci.warwick.ac.uk/BioJazz.aspx.
Affiliation(s)
- Song Feng
  - School of Life Sciences, University of Warwick, Coventry, United Kingdom
- Peter S Swain
  - SynthSys, The University of Edinburgh, Edinburgh, United Kingdom
- Orkun S Soyer
  - School of Life Sciences, University of Warwick, Coventry, United Kingdom
5. Affeldt S, Singh PP, Cascone I, Selimoglu R, Camonis J, Isambert H. [Evolution and cancer: expansion of dangerous gene repertoire by whole genome duplications]. Med Sci (Paris) 2013; 29:358-61. [PMID: 23621930 DOI: 10.1051/medsci/2013294008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
6. Jin Y, Turaev D, Weinmaier T, Rattei T, Makse HA. The evolutionary dynamics of protein-protein interaction networks inferred from the reconstruction of ancient networks. PLoS One 2013; 8:e58134. [PMID: 23526967 PMCID: PMC3603955 DOI: 10.1371/journal.pone.0058134] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 10/11/2012] [Accepted: 01/30/2013] [Indexed: 11/18/2022]
Abstract
Cellular functions are based on the complex interplay of proteins; therefore, the structure and dynamics of these protein-protein interaction (PPI) networks are the key to the functional understanding of cells. In recent years, large-scale PPI networks of several model organisms were investigated. A number of theoretical models have been developed to explain both the network formation and the current structure. Favored are models based on duplication and divergence of genes, as they most closely represent the biological foundation of network evolution. However, studies are often based on simulated instead of empirical data or they cover only single organisms. Methodological improvements now allow the analysis of PPI networks of multiple organisms simultaneously as well as the direct modeling of ancestral networks. This provides the opportunity to challenge existing assumptions on network evolution. We utilized present-day PPI networks from integrated datasets of seven model organisms and developed a theoretical and bioinformatic framework for studying the evolutionary dynamics of PPI networks. A novel filtering approach using percolation analysis was developed to remove low confidence interactions based on topological constraints. We then reconstructed the ancient PPI networks of different ancestors, for which the ancestral proteomes, as well as the ancestral interactions, were inferred. Ancestral proteins were reconstructed using orthologous groups on different evolutionary levels. A stochastic approach, using the duplication-divergence model, was developed for estimating the probabilities of ancient interactions from today's PPI networks. The growth rates for nodes, edges, sizes and modularities of the networks indicate multiplicative growth and are consistent with the results from independent static analysis.
Our results support the duplication-divergence model of evolution and indicate fractality and multiplicative growth as general properties of the PPI network structure and dynamics.
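The duplication-divergence growth rule that this abstract builds on can be illustrated with a minimal sketch. This is not the authors' reconstruction method: the single-edge seed network, the uniform choice of the gene to duplicate, and the names `duplication_divergence` and `p_retain` are assumptions made purely for illustration.

```python
import random

def duplication_divergence(n_nodes, p_retain, seed=0):
    """Grow an undirected graph by node duplication followed by edge loss.

    Illustrative assumptions (not from the paper): start from one edge,
    copy a uniformly chosen node at each step, and keep each inherited
    edge independently with probability p_retain.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed network: a single interaction
    while len(adj) < n_nodes:
        parent = rng.choice(list(adj))
        child = len(adj)
        # Divergence: the copy keeps each parental interaction with prob. p_retain.
        kept = {nb for nb in sorted(adj[parent]) if rng.random() < p_retain}
        adj[child] = kept
        for nb in kept:
            adj[nb].add(child)
    return adj

def degree_histogram(adj):
    """Map each degree value to the number of nodes having that degree."""
    hist = {}
    for nbrs in adj.values():
        hist[len(nbrs)] = hist.get(len(nbrs), 0) + 1
    return hist

network = duplication_divergence(n_nodes=200, p_retain=0.4, seed=1)
histogram = degree_histogram(network)
```

In this toy version a copied node can lose all of its inherited interactions and end up isolated; published variants of the model often add rules for such cases, which are omitted here.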
Affiliation(s)
- Yuliang Jin
  - Levich Institute and Physics Department, City College of New York, New York, New York, United States of America
- Dmitrij Turaev
  - Department of Computational Systems Biology, University of Vienna, Vienna, Austria
- Thomas Weinmaier
  - Department of Computational Systems Biology, University of Vienna, Vienna, Austria
- Thomas Rattei
  - Department of Computational Systems Biology, University of Vienna, Vienna, Austria
- Hernán A. Makse
  - Levich Institute and Physics Department, City College of New York, New York, New York, United States of America
7. A kinetic model of the evolution of a protein interaction network. BMC Genomics 2013; 14:172. [PMID: 23497092 PMCID: PMC3751699 DOI: 10.1186/1471-2164-14-172] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/22/2012] [Accepted: 03/08/2013] [Indexed: 11/10/2022]
Abstract
Background: Known protein interaction networks have very particular properties. Old proteins tend to have more interactions than new ones. One of the best statistical representatives of this property is the node degree distribution (distribution of proteins having a given number of interactions). It has previously been shown that this distribution is very close to the sum of two distinct exponential components. In this paper, we asked: What are the possible mechanisms of evolution for such types of networks? To answer this question, we tested a kinetic model for simplified evolution of a protein interactome. Our proposed model considers the emergence of new genes and interactions and the loss of old ones. We assumed that there are generally two coexisting classes of proteins. Proteins constituting the first class are essential only for ecological adaptations and are easily lost when ecological conditions change. Proteins of the second class are essential for basic life processes and, hence, are always effectively protected against deletion. All proteins can transit between the above classes in both directions. We also assumed that the phenomenon of gene duplication is always related to ecological adaptation and that a new copy of a duplicated gene is not essential. According to this model, all proteins gain new interactions with a rate that preferentially increases with the number of interactions (the rich get richer). Proteins can also gain interactions because of duplication. Proteins lose their interactions both with and without the loss of partner genes.
Results: The proposed model reproduces the main properties of protein-protein interaction networks very well. The connectivity of the oldest part of the interaction network is densest, and the node degree distribution follows the sum of two shifted power-law functions, which is a theoretical generalization of the previous finding. The above distribution covers the wide range of values of node degrees very well, much better than a power law or generalized power law supplemented with an exponential cut-off. The presented model also relates the total number of interactome links to the total number of interacting proteins. The theoretical results were for the interactomes of A. thaliana, B. taurus, C. elegans, D. melanogaster, E. coli, H. pylori, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae.
Conclusions: Using these approaches, the kinetic parameters could be estimated. Finally, the model revealed the evolutionary kinetics of proteome formation, the phenomenon of protein differentiation and the process of gaining new interactions.
8. Ferreira RM, Rybarczyk-Filho JL, Dalmolin RJS, Castro MAA, Moreira JCF, Brunnet LG, de Almeida RMC. Preferential duplication of intermodular hub genes: an evolutionary signature in eukaryotes genome networks. PLoS One 2013; 8:e56579. [PMID: 23468868 PMCID: PMC3582557 DOI: 10.1371/journal.pone.0056579] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/25/2012] [Accepted: 01/14/2013] [Indexed: 12/31/2022]
Abstract
Whole genome protein-protein association networks are not random and their topological properties stem from genome evolution mechanisms. In fact, more connected, but less clustered proteins are related to genes that, in general, present more paralogs as compared to other genes, indicating frequent previous gene duplication episodes. On the other hand, genes related to conserved biological functions present few or no paralogs and yield proteins that are highly connected and clustered. These general network characteristics must have an evolutionary explanation. Considering data from the STRING database, we present here experimental evidence that, beyond not being scale free, protein degree distributions of organisms present an increased probability for high degree nodes. Furthermore, based on this experimental evidence, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated with a probability that linearly grows with gene degree and decreases with its clustering coefficient. For the first time a model yields results that simultaneously describe different topological distributions. Also, this model correctly predicts that producing protein-protein association networks with numbers of links and nodes in the range observed for eukaryotes requires 90% gene duplication and 10% de novo gene acquisition. This scenario implies a universal mechanism for genome evolution.
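The two growth moves described in this abstract (de novo acquisition by preferential attachment, and duplication biased toward high-degree, low-clustering genes) can be sketched as follows. This is a toy illustration, not the authors' simulation: the duplication weight `degree * (1 - clustering)`, the triangle seed network, the parameter `m`, the function names, and the absence of any edge loss after duplication are all assumptions.

```python
import random

def clustering(adj, v):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def grow(n_nodes, p_dup=0.9, m=2, seed=0):
    """Grow a network by duplication (prob. p_dup) or de novo acquisition."""
    rng = random.Random(seed)
    # Seed network: a triangle, so every node starts with degree 2.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    while len(adj) < n_nodes:
        nodes = list(adj)
        degrees = [len(adj[v]) for v in nodes]
        new = len(adj)
        if rng.random() < p_dup:
            # Duplication: weight rises with degree and falls with clustering.
            # The exact form and the 1e-9 floor are assumptions of this sketch.
            weights = [d * (1.0 - clustering(adj, v)) + 1e-9
                       for v, d in zip(nodes, degrees)]
            parent = rng.choices(nodes, weights=weights, k=1)[0]
            targets = set(adj[parent])  # the copy inherits all parental links
        else:
            # De novo acquisition: preferential attachment to m distinct nodes.
            targets = set()
            while len(targets) < min(m, len(nodes)):
                targets.add(rng.choices(nodes, weights=degrees, k=1)[0])
        adj[new] = targets
        for t in targets:
            adj[t].add(new)
    return adj

net = grow(n_nodes=60, p_dup=0.9, m=2, seed=1)
```

Setting `p_dup=0.9` mirrors the 90%/10% split quoted in the abstract; other values let one compare how the mix of the two moves changes the resulting degree distribution.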
Affiliation(s)
- Ricardo M. Ferreira
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rodrigo J. S. Dalmolin
  - Departamento de Bioquímica, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Mauro A. A. Castro
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - National Institute of Science and Technology for Complex Systems, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- José C. F. Moreira
  - Departamento de Bioquímica, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Leonardo G. Brunnet
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Rita M. C. de Almeida
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
  - National Institute of Science and Technology for Complex Systems, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
9. Kitchen JL, Allaby RG. Systems Modeling at Multiple Levels of Regulation: Linking Systems and Genetic Networks to Spatially Explicit Plant Populations. Plants (Basel) 2013; 2:16-49. [PMID: 27137364 PMCID: PMC4844292 DOI: 10.3390/plants2010016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/01/2012] [Revised: 12/21/2012] [Accepted: 01/16/2013] [Indexed: 11/16/2022]
Abstract
Selection and adaptation of individuals to their underlying environments are highly dynamical processes, encompassing interactions between the individual and its seasonally changing environment, synergistic or antagonistic interactions between individuals and interactions amongst the regulatory genes within the individual. Plants are useful organisms to study within systems modeling because their sedentary nature simplifies interactions between individuals and the environment, and many important plant processes such as germination or flowering are dependent on annual cycles which can be disrupted by climate behavior. Sedentism makes plants relevant candidates for spatially explicit modeling that is tied in with dynamical environments. We propose that in order to fully understand the complexities behind plant adaptation, a system that couples aspects from systems biology with population and landscape genetics is required. A suitable system could be represented by spatially explicit individual-based models where the virtual individuals are located within time-variable heterogeneous environments and contain mutable regulatory gene networks. These networks could directly interact with the environment, and should provide a useful approach to studying plant adaptation.
Affiliation(s)
- James L Kitchen
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
- Robin G Allaby
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
10. Singh PP, Affeldt S, Cascone I, Selimoglu R, Camonis J, Isambert H. On the expansion of "dangerous" gene repertoires by whole-genome duplications in early vertebrates. Cell Rep 2012; 2:1387-98. [PMID: 23168259 DOI: 10.1016/j.celrep.2012.09.034] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 04/12/2012] [Revised: 09/17/2012] [Accepted: 09/27/2012] [Indexed: 10/27/2022]
Abstract
The emergence and evolutionary expansion of gene families implicated in cancers and other severe genetic diseases is an evolutionary oddity from a natural selection perspective. Here, we show that gene families prone to deleterious mutations in the human genome have been preferentially expanded by the retention of "ohnolog" genes from two rounds of whole-genome duplication (WGD) dating back from the onset of jawed vertebrates. We further demonstrate that the retention of many ohnologs suspected to be dosage balanced is in fact indirectly mediated by their susceptibility to deleterious mutations. This enhanced retention of "dangerous" ohnologs, defined as prone to autosomal-dominant deleterious mutations, is shown to be a consequence of WGD-induced speciation and the ensuing purifying selection in post-WGD species. These findings highlight the importance of WGD-induced nonadaptive selection for the emergence of vertebrate complexity, while rationalizing, from an evolutionary perspective, the expansion of gene families frequently implicated in genetic disorders and cancers.
Affiliation(s)
- Param Priya Singh
  - CNRS UMR168, UPMC, Institut Curie, Research Center, 26, rue d'Ulm, 75248 Paris, France
11. Bottinelli A, Bassetti B, Lagomarsino MC, Gherardi M. Influence of homology and node age on the growth of protein-protein interaction networks. Phys Rev E Stat Nonlin Soft Matter Phys 2012; 86:041919. [PMID: 23214627 DOI: 10.1103/physreve.86.041919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2012] [Indexed: 06/01/2023]
Abstract
Proteins participating in a protein-protein interaction network can be grouped into homology classes following their common ancestry. Proteins added to the network correspond to genes added to the classes, so the dynamics of the two objects are intrinsically linked. Here we first introduce a statistical model describing the joint growth of the network and the partitioning of nodes into classes, which is studied through a combined mean-field and simulation approach. We then employ this unified framework to address the specific issue of the age dependence of protein interactions through the definition of three different node wiring or divergence schemes. A comparison with empirical data indicates that an age-dependent divergence move is necessary in order to reproduce the basic topological observables together with the age correlation between interacting nodes visible in empirical data. We also discuss the possibility of nontrivial joint partition and topology observables.
12. DeDeo S, Krakauer DC. Dynamics and processing in finite self-similar networks. J R Soc Interface 2012; 9:2131-44. [PMID: 22378750 PMCID: PMC3405736 DOI: 10.1098/rsif.2011.0840] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/01/2011] [Accepted: 02/07/2012] [Indexed: 11/12/2022]
Abstract
A common feature of biological networks is the geometrical property of self-similarity. Molecular regulatory networks through to circulatory systems, nervous systems, social systems and ecological trophic networks show self-similar connectivity at multiple scales. We analyse the relationship between topology and signalling in contrasting classes of such topologies. We find that networks differ in their ability to contain or propagate signals between arbitrary nodes in a network depending on whether they possess branching or loop-like features. Networks also differ in how they respond to noise, such that one allows for greater integration at high noise, and this performance is reversed at low noise. Surprisingly, small-world topologies, with diameters logarithmic in system size, have slower dynamical time scales, and may be less integrated (more modular) than networks with longer path lengths. All of these phenomena are essentially mesoscopic, vanishing in the infinite limit but producing strong effects at sizes and time scales relevant to biology.
Affiliation(s)
- Simon DeDeo
  - Santa Fe Institute, Santa Fe, NM 87501, USA
13. Fokkens L, Hogeweg P, Snel B. Gene duplications contribute to the overrepresentation of interactions between proteins of a similar age. BMC Evol Biol 2012; 12:99. [PMID: 22732003 PMCID: PMC3457867 DOI: 10.1186/1471-2148-12-99] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/11/2012] [Accepted: 06/07/2012] [Indexed: 11/10/2022]
Abstract
Background: The study of biological networks and how they have evolved is fundamental to our understanding of the cell. By investigating how proteins of different ages are connected in the protein interaction network, one can infer how that network has expanded in evolution, without the need for explicit reconstruction of ancestral networks. Studies that implement this approach show that proteins are often connected to proteins of a similar age, suggesting a simultaneous emergence of interacting proteins. There are several theories explaining this phenomenon, but despite the importance of gene duplication in genome evolution, none consider protein family dynamics as a contributing factor.
Results: In an S. cerevisiae protein interaction network we investigate to what extent edges that arise from duplication events contribute to the observed tendency to interact with proteins of a similar age. We find that part of this tendency is explained by interactions between paralogs. Age is usually defined on the level of protein families, rather than individual proteins, hence paralogs have the same age. The major contribution however, is from interaction partners that are shared between paralogs. These interactions have most likely been conserved after a duplication event. To investigate to what extent a nearly neutral process of network growth can explain these results, we adjust a well-studied network growth model to incorporate protein families. Our model shows that the number of edges between paralogs can be amplified by subsequent duplication events, thus explaining the overrepresentation of interparalog edges in the data. The fact that interaction partners shared by paralogs are often of the same age as the paralogs does not arise naturally from our model and needs further investigation.
Conclusion: We amend previous theories that explain why proteins of a similar age prefer to interact by demonstrating that this observation can be partially explained by gene duplication events. There is an ongoing debate on whether the protein interaction network is predominantly shaped by duplication and subfunctionalization or whether network rewiring is most important. Our analyses of S. cerevisiae protein interaction networks demonstrate that duplications have influenced at least one property of the protein interaction network: how proteins of different ages are connected.
Affiliation(s)
- Like Fokkens
  - Theoretical Biology and Bioinformatics, Department of Biology, Faculty of Science, Utrecht University, Padualaan 8, 3584CH, Utrecht, The Netherlands
- Paulien Hogeweg
  - Theoretical Biology and Bioinformatics, Department of Biology, Faculty of Science, Utrecht University, Padualaan 8, 3584CH, Utrecht, The Netherlands
- Berend Snel
  - Theoretical Biology and Bioinformatics, Department of Biology, Faculty of Science, Utrecht University, Padualaan 8, 3584CH, Utrecht, The Netherlands
  - Netherlands Consortium for Systems Biology (NCSB), c/o NISB Bureau, University of Amsterdam, Science Park 904, 1098XH, Amsterdam, The Netherlands
14. de Souza SJ. Domain shuffling and the increasing complexity of biological networks. Bioessays 2012; 34:655-7. [DOI: 10.1002/bies.201200006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
15. Emmert-Streib F. Limitations of gene duplication models: evolution of modules in protein interaction networks. PLoS One 2012; 7:e35531. [PMID: 22530042 PMCID: PMC3329483 DOI: 10.1371/journal.pone.0035531] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/02/2011] [Accepted: 03/18/2012] [Indexed: 01/05/2023]
Abstract
It has been generally acknowledged that the module structure of protein interaction networks plays a crucial role with respect to the functional understanding of these networks. In this paper, we study evolutionary aspects of the module structure of protein interaction networks, which forms a mesoscopic level of description with respect to the architectural principles of networks. The purpose of this paper is to investigate limitations of well known gene duplication models by showing that these models are lacking crucial structural features present in protein interaction networks on a mesoscopic scale. This observation reveals our incomplete understanding of the structural evolution of protein networks on the module level.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, United Kingdom.
|
16
|
Stein RR, Isambert H. Logistic map analysis of biomolecular network evolution. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:051904. [PMID: 22181441 DOI: 10.1103/physreve.84.051904] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/03/2011] [Revised: 08/22/2011] [Indexed: 05/31/2023]
Abstract
We study the expansion of biomolecular networks from the view point of first evolutionary principles based on the duplication and divergence of ancestral genes. The expansion of gene families and subnetworks is analyzed in terms of logistic map compositions, which capture the varying functional constraints of individual genes in the course of evolution. Using a mean-field approach, we then demonstrate the existence of spontaneous growth-rate variations between gene families and discuss the relevance of such heterogeneous expansions for the emergent properties of actual biomolecular networks.
Affiliation(s)
- R R Stein
- Institut Curie, CNRS-UMR168, UPMC, Paris, France
|
17
|
Ali W, Deane C, Reinert G. Protein Interaction Networks and Their Statistical Analysis. HANDBOOK OF STATISTICAL SYSTEMS BIOLOGY 2011:200-234. [DOI: 10.1002/9781119970606.ch10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2025]
|
18
|
Cancherini DV, França GS, de Souza SJ. The role of exon shuffling in shaping protein-protein interaction networks. BMC Genomics 2010; 11 Suppl 5:S11. [PMID: 21210967 PMCID: PMC3045794 DOI: 10.1186/1471-2164-11-s5-s11] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Physical protein-protein interaction (PPI) is a critical phenomenon for the function of most proteins in living organisms and a significant fraction of PPIs are the result of domain-domain interactions. Exon shuffling, intron-mediated recombination of exons from existing genes, is known to have been a major mechanism of domain shuffling in metazoans. Thus, we hypothesized that exon shuffling could have a significant influence in shaping the topology of PPI networks. RESULTS: We tested our hypothesis by compiling exon shuffling and PPI data from six eukaryotic species: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Cryptococcus neoformans and Arabidopsis thaliana. For all four metazoan species, genes enriched in exon shuffling events presented on average higher vertex degree (number of interacting partners) in PPI networks. Furthermore, we verified that a set of protein domains that are simultaneously promiscuous (known to interact to multiple types of other domains), self-interacting (able to interact with another copy of themselves) and abundant in the genomes presents a stronger signal for exon shuffling. CONCLUSIONS: Exon shuffling appears to have been a recurrent mechanism for the emergence of new PPIs along metazoan evolution. In metazoan genomes, exon shuffling also promoted the expansion of some protein domains. We speculate that their promiscuous and self-interacting properties may have been decisive for that expansion.
|
19
|
Ordered structure of the transcription network inherited from the yeast whole-genome duplication. BMC SYSTEMS BIOLOGY 2010; 4:77. [PMID: 20525287 PMCID: PMC2900227 DOI: 10.1186/1752-0509-4-77] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 01/12/2010] [Accepted: 06/03/2010] [Indexed: 01/07/2023]
Abstract
Background: Gene duplication, a major evolutionary path to genomic innovation, can occur at the scale of an entire genome. One such "whole-genome duplication" (WGD) event among the Ascomycota fungi gave rise to genes with distinct biological properties compared to small-scale duplications. Results: We studied the evolution of transcriptional interactions of whole-genome duplicates, to understand how they are wired into the yeast regulatory system. Our work combines network analysis and modeling of the large-scale structure of the interactions stemming from the WGD. Conclusions: The results uncover the WGD as a major source for the evolution of a complex interconnected block of transcriptional pathways. The inheritance of interactions among WGD duplicates follows elementary "duplication subgraphs", relating ancestral interactions with newly formed ones. Duplication subgraphs are correlated with their neighbours and give rise to higher order circuits with two elementary properties: newly formed transcriptional pathways remain connected (paths are not broken), and are preferentially cross-connected with ancestral ones. The result is a coherent and connected "WGD-network", where duplication subgraphs are arranged in an astonishingly ordered configuration.
|
20
|
Ratmann O, Wiuf C, Pinney JW. From evidence to inference: probing the evolution of protein interaction networks. HFSP JOURNAL 2009; 3:290-306. [PMID: 20357887 DOI: 10.2976/1.3167215] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 03/16/2009] [Revised: 05/30/2009] [Indexed: 01/06/2023]
Abstract
The evolutionary mechanisms by which protein interaction networks grow and change are beginning to be appreciated as a major factor shaping their present-day structures and properties. Starting with a consideration of the biases and errors inherent in our current views of these networks, we discuss the dangers of constructing evolutionary arguments from naïve analyses of network topology. We argue that progress in understanding the processes of network evolution is only possible when hypotheses are formulated as plausible evolutionary models and compared against the observed data within the framework of probabilistic modeling. The value of such models is expected to be greatly enhanced as they incorporate more of the details of the biophysical properties of interacting proteins, gene phylogeny, and measurement error and as more advanced methodologies emerge for model comparison and the inference of ancestral network states.
|
21
|
Lima-Mendez G, van Helden J. The powerful law of the power law and other myths in network biology. MOLECULAR BIOSYSTEMS 2009; 5:1482-93. [PMID: 20023717 DOI: 10.1039/b908681a] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Indexed: 11/21/2022]
Abstract
For almost 10 years, topological analysis of different large-scale biological networks (metabolic reactions, protein interactions, transcriptional regulation) has been highlighting some recurrent properties: power law distribution of degree, scale-freeness, small world, which have been proposed to confer functional advantages such as robustness to environmental changes and tolerance to random mutations. Stochastic generative models inspired different scenarios to explain the growth of interaction networks during evolution. The power law and the associated properties appeared so ubiquitous in complex networks that they were qualified as "universal laws". However, these properties are no longer observed when the data are subjected to statistical tests: in most cases, the data do not fit the expected theoretical models, and the cases of good fitting merely result from sampling artefacts or improper data representation. The field of network biology seems to be founded on a series of myths, i.e. widely believed but false ideas. The weaknesses of these foundations should however not be considered as a failure for the entire domain. Network analysis provides a powerful frame for understanding the function and evolution of biological processes, provided it is brought to an appropriate level of description, by focussing on smaller functional modules and establishing the link between their topological properties and their dynamical behaviour.
Affiliation(s)
- Gipsi Lima-Mendez
- Bioinformatique des Génomes et des Réseaux-BiGRe, Université Libre de Bruxelles, Campus Plaine, CP 263, Boulevard du Triomphe, B-1050 Bruxelles, Belgium.
|
22
|
Abstract
Our understanding of how evolution acts on biological networks remains patchy, as is our knowledge of how that action is best identified, modelled and understood. Starting with network structure and the evolution of protein-protein interaction networks, we briefly survey the ways in which network evolution is being addressed in the fields of systems biology, development and ecology. The approaches highlighted demonstrate a movement away from a focus on network topology towards a more integrated view, placing biological properties centre-stage. We argue that there remains great potential in a closer synergy between evolutionary biology and biological network analysis, although that may require the development of novel approaches and even different analogies for biological networks themselves.
Affiliation(s)
- Christopher G Knight
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Manchester, UK.
|
23
|
Lewis ACF, Saeed R, Deane CM. Predicting protein-protein interactions in the context of protein evolution. MOLECULAR BIOSYSTEMS 2009; 6:55-64. [PMID: 20024067 DOI: 10.1039/b916371a] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Abstract
Here we review the methods for the prediction of protein interactions and the ideas in protein evolution that relate to them. The evolutionary assumptions implicit in many of the protein interaction prediction methods are elucidated. We draw attention to the caution needed in deploying certain evolutionary assumptions, in particular cross-organism transfer of interactions by sequence homology, and discuss the known issues in deriving interaction predictions from evidence of co-evolution. We also conject that there is evolutionary knowledge yet to be exploited in the prediction of interactions, in particular the heterogeneity of interactions, the increasing availability of interaction data from multiple species, and the models of protein interaction network growth.
Affiliation(s)
- Anna C F Lewis
- Department of Statistics and Systems Biology DTC, University of Oxford, UK
|
24
|
Isambert H, Stein RR. On the need for widespread horizontal gene transfers under genome size constraint. Biol Direct 2009; 4:28. [PMID: 19703318 PMCID: PMC2740843 DOI: 10.1186/1745-6150-4-28] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 08/19/2009] [Accepted: 08/25/2009] [Indexed: 11/20/2022] Open
Abstract
Background: While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But, the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery. Hypothesis: We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss. Implications: This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes. Reviewers: This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.
Affiliation(s)
- Hervé Isambert
- Institut Curie, CNRS UMR168, 11 rue P. & M. Curie, 75005 Paris, France.
|
25
|
Huang Y, Zheng Y, Su Z, Gu X. Differences in duplication age distributions between human GPCRs and their downstream genes from a network prospective. BMC Genomics 2009; 10 Suppl 1:S14. [PMID: 19594873 PMCID: PMC2709257 DOI: 10.1186/1471-2164-10-s1-s14] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: How gene duplication has influenced the evolution of gene networks is one of the core problems in evolution. Current duplication-divergence theories generally suggested that genes on the periphery of the networks were preferentially retained after gene duplication. However, previous studies were mostly based on gene networks in invertebrate species, and they had the inherent shortcoming of not being able to provide information on how the duplication-divergence process proceeded along the time axis during major speciation events. RESULTS: In this study, we constructed a model system consisting of human G protein-coupled receptors (GPCRs) and their downstream genes in the GPCR pathways. These two groups of genes offered a natural partition of genes in the peripheral and the backbone layers of the network. Analysis of the age distributions of the duplication events in human GPCRs and "downstream genes" gene families indicated that they both experienced an explosive expansion at the time of early vertebrate emergence. However, we found only GPCR families saw a continued expansion after early vertebrates, mostly prominently in several small subfamilies of GPCRs involved in immune responses and sensory responses. CONCLUSION: In general, in the human GPCR model system, we found that the position of a gene in the gene networks has significant influences on the likelihood of fixation of its duplicates. However, for a super gene family, the influence was not uniform among subfamilies. For super families, such as GPCRs, whose gene basis of expression diversity was well established at early vertebrates, continued expansions were mostly prominent in particular small subfamilies mainly involved in lineage-specific functions.
Affiliation(s)
- Yong Huang
- Department of Genetics, Development, and Cell Biology, Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA.
|
26
|
Frech C, Kommenda M, Dorfer V, Kern T, Hintner H, Bauer JW, Onder K. Improved homology-driven computational validation of protein-protein interactions motivated by the evolutionary gene duplication and divergence hypothesis. BMC Bioinformatics 2009; 10:21. [PMID: 19152684 PMCID: PMC2637843 DOI: 10.1186/1471-2105-10-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 06/09/2008] [Accepted: 01/19/2009] [Indexed: 11/10/2022] Open
Abstract
Background: Protein-protein interaction (PPI) data sets generated by high-throughput experiments are contaminated by large numbers of erroneous PPIs. Therefore, computational methods for PPI validation are necessary to improve the quality of such data sets. Against the background of the theory that most extant PPIs arose as a consequence of gene duplication, the sensitive search for homologous PPIs, i.e. for PPIs descending from a common ancestral PPI, should be a successful strategy for PPI validation. Results: To validate an experimentally observed PPI, we combine FASTA and PSI-BLAST to perform a sensitive sequence-based search for pairs of interacting homologous proteins within a large, integrated PPI database. A novel scoring scheme that incorporates both quality and quantity of all observed matches allows us (1) to consider also tentative paralogs and orthologs in this analysis and (2) to combine search results from more than one homology detection method. ROC curves illustrate the high efficacy of this approach and its improvement over other homology-based validation methods. Conclusion: New PPIs are primarily derived from preexisting PPIs and not invented de novo. Thus, the hallmark of true PPIs is the existence of homologous PPIs. The sensitive search for homologous PPIs within a large body of known PPIs is an efficient strategy to separate biologically relevant PPIs from the many spurious PPIs reported by high-throughput experiments.
Affiliation(s)
- Christian Frech
- Upper Austria University of Applied Sciences, Hagenberg, Austria.
|
27
|
Sellerio A, Bassetti B, Isambert H, Cosentino Lagomarsino M. A comparative evolutionary study of transcription networks. The global role of feedback and hierarchical structures. MOLECULAR BIOSYSTEMS 2009; 5:170-9. [DOI: 10.1039/b815339f] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/22/2022]
|
28
|
Evlampiev K, Isambert H. Conservation and topology of protein interaction networks under duplication-divergence evolution. Proc Natl Acad Sci U S A 2008; 105:9863-8. [PMID: 18632555 PMCID: PMC2481380 DOI: 10.1073/pnas.0804119105] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 03/22/2007] [Indexed: 11/18/2022] Open
Abstract
Genomic duplication-divergence processes are the primary source of new protein functions and thereby contribute to the evolutionary expansion of functional molecular networks. Yet, it is still unclear to what extent such duplication-divergence processes also restrict by construction the emerging properties of molecular networks, regardless of any specific cellular functions. We address this question, here, focusing on the evolution of protein-protein interaction (PPI) networks. We solve a general duplication-divergence model, based on the statistically necessary deletions of protein-protein interactions arising from stochastic duplications at various genomic scales, from single-gene to whole-genome duplications. Major evolutionary scenarios are shown to depend on two global parameters only: (i) a protein conservation index (M), which controls the evolutionary history of PPI networks, and (ii) a distinct topology index (M') controlling their resulting structure. We then demonstrate that conserved, nondense networks, which are of prime biological relevance, are also necessarily scale-free by construction, irrespective of any evolutionary variations or fluctuations of the model parameters. It is shown to result from a fundamental linkage between individual protein conservation and network topology under general duplication-divergence evolution. By contrast, we find that conservation of network motifs with two or more proteins cannot be indefinitely preserved under general duplication-divergence evolution (independently from any network rewiring dynamics), in broad agreement with empirical evidence between phylogenetically distant species. All in all, these evolutionary constraints, inherent to duplication-divergence processes, appear to have largely controlled the overall topology and scale-dependent conservation of PPI networks, regardless of any specific biological function.
Affiliation(s)
- Kirill Evlampiev
- Physico-chimie Curie, Centre National de la Recherche Scientifique Unité Mixte de Recherche 168, Institut Curie, Section de Recherche, 11 rue P. & M. Curie, 75005 Paris, France
- Hervé Isambert
- Physico-chimie Curie, Centre National de la Recherche Scientifique Unité Mixte de Recherche 168, Institut Curie, Section de Recherche, 11 rue P. & M. Curie, 75005 Paris, France
|