51
Katju V. To the beat of a different drum: determinants implicated in the asymmetric sequence divergence of Caenorhabditis elegans paralogs. BMC Evol Biol 2013; 13:73. [PMID: 23530733 PMCID: PMC3637608 DOI: 10.1186/1471-2148-13-73] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/22/2012] [Accepted: 03/20/2013] [Indexed: 12/18/2022]
Abstract
Background: Gene duplicates often exhibit asymmetric rates of molecular evolution in their early evolutionary existence. This asymmetry in rates is thought to signify the maintenance of the ancestral function by one copy and the removal of functional constraint on the other copy, enabling it to embark on a novel evolutionary trajectory. Here I focused on a large population of evolutionarily young gene duplicates (KS ≤ 0.14) in the Caenorhabditis elegans genome in order to conduct the first combined analysis of four predictors (evolutionary age, chromosomal location, structural resemblance between duplicates, and duplication span) which may be implicated in the asymmetric sequence divergence of paralogs at the nucleotide and amino acid level. In addition, I investigate if either paralog is equally likely to embark on a trajectory of accelerated sequence evolution or whether the derived paralog is more likely to exhibit faster sequence evolution.
Results: Three predictors (evolutionary age of duplicates, chromosomal location and duplication span) serve as major determinants of sequence asymmetry between C. elegans paralogs. Paralogs diverge asymmetrically in sequence with increasing evolutionary age, the relocation of one copy to a different chromosome and attenuated duplication spans that likely fail to capture the entire ancestral repertoire of coding sequence and regulatory elements. Furthermore, for paralogs residing on the same chromosome, opposite transcriptional orientation and increased genomic distance do not increase sequence asymmetry between paralogs. For a subset of duplicate pairs wherein the ancestral versus derived paralog could be distinguished, the derived paralogs are more likely to evolve at accelerated rates.
Conclusions: This genome-wide study of evolutionarily young duplicates stemming primarily from DNA-mediated small-scale duplication events demonstrates that genomic relocation to a new chromosome has important consequences for asymmetric divergence of paralogs, akin to paralogs arising from RNA-mediated duplication events. Additionally, the duplication span is negatively correlated with sequence rate asymmetry among paralogs, suggesting that attenuated duplication spans stemming from incomplete duplication of the ORF and/or ancestral regulatory elements further accelerate sequence divergence between paralogs. Cumulatively, derived copies exhibit accelerated rates of sequence evolution suggesting that they are primed for a divergent evolutionary trajectory by changes in structure and genomic context at inception.
Affiliation(s)
- Vaishali Katju
- Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA.
52
53
Where do phosphosites come from and where do they go after gene duplication? INTERNATIONAL JOURNAL OF EVOLUTIONARY BIOLOGY 2012; 2012:843167. [PMID: 22779031 PMCID: PMC3388353 DOI: 10.1155/2012/843167] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/21/2012] [Accepted: 05/03/2012] [Indexed: 01/09/2023]
Abstract
Gene duplication followed by divergence is an important mechanism that leads to molecular innovation. Divergence of paralogous genes can be achieved at functional and regulatory levels. Whereas regulatory divergence at the transcriptional level is well documented, little is known about divergence of posttranslational modifications (PTMs). Protein phosphorylation, one of the most important PTMs, has recently been shown to be an important determinant of the retention of paralogous genes. Here we test whether gains and losses of phosphorylated amino acids after gene duplication may specifically modify the regulation of these duplicated proteins. We show that when phosphosites are lost in one paralog, transitions from phosphorylated serines and threonines are significantly biased toward negatively charged amino acids, which can mimic their phosphorylated status in a constitutive manner. Our analyses support the hypothesis that divergence between paralogs can be generated by a loss of the posttranslational regulatory control on a function rather than by the complete loss of the function itself. Surprisingly, these favoured transitions cannot be reached by single mutational steps, which suggests that the function of a phosphosite needs to be completely abolished before it is restored through substitution by these phosphomimetic residues. We conclude by discussing how gene duplication could facilitate the transitions between phosphorylated and phosphomimetic amino acids.
54
Arodź T, Płonka PM. Effects of point mutations on protein structure are nonexponentially distributed. Proteins 2012; 80:1780-90. [DOI: 10.1002/prot.24073] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 10/24/2011] [Revised: 02/02/2012] [Accepted: 03/12/2012] [Indexed: 11/07/2022]
55
Wong ES, Belov K. Venom evolution through gene duplications. Gene 2012; 496:1-7. [DOI: 10.1016/j.gene.2012.01.009] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.7] [Received: 09/01/2011] [Revised: 01/10/2012] [Accepted: 01/10/2012] [Indexed: 12/30/2022]
56
Townsley BT, Sinha NR. A new development: evolving concepts in leaf ontogeny. ANNUAL REVIEW OF PLANT BIOLOGY 2012; 63:535-62. [PMID: 22404465 DOI: 10.1146/annurev-arplant-042811-105524] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 05/07/2023]
Abstract
Elucidation of gene regulatory networks (GRNs) underlying aspects of leaf development in multiple model species has uncovered surprisingly plastic regulatory architecture. The meticulously mapped network interactions in one model species cannot now be assumed to map directly onto a different species. Despite these overall differences, however, many modules do appear to be almost universal. Extrapolating findings across different model systems will demand great care but promises to reveal a rich tapestry of themes in GRN architecture and regulation. The purpose of this review is to approach the field of leaf development from the perspectives of the evolution of developmental systems that orchestrate leaf development.
Affiliation(s)
- Brad T Townsley
- Department of Plant Biology, University of California-Davis, CA 95616, USA
57
Bu L, Bergthorsson U, Katju V. Local synteny and codon usage contribute to asymmetric sequence divergence of Saccharomyces cerevisiae gene duplicates. BMC Evol Biol 2011; 11:279. [PMID: 21955875 PMCID: PMC3190396 DOI: 10.1186/1471-2148-11-279] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/06/2011] [Accepted: 09/28/2011] [Indexed: 11/10/2022]
Abstract
Background: Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution.
Results: For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage, as measured by the CAI, are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression.
Conclusions: Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.
Affiliation(s)
- Lijing Bu
- Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
58
Montanari F, Shields DC, Khaldi N. Differences in the number of intrinsically disordered regions between yeast duplicated proteins, and their relationship with functional divergence. PLoS One 2011; 6:e24989. [PMID: 21949823 PMCID: PMC3174238 DOI: 10.1371/journal.pone.0024989] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 05/24/2011] [Accepted: 08/22/2011] [Indexed: 11/19/2022]
Abstract
Background: Intrinsically disordered regions are enriched in short interaction motifs that play a critical role in many protein-protein interactions. Since new short interaction motifs may easily evolve, they have the potential to rapidly change protein interactions and cellular signaling. In this work we examined the dynamics of gain and loss of intrinsically disordered regions in duplicated proteins to inspect if changes after genome duplication can create functional divergence. For this purpose we used Saccharomyces cerevisiae and the outgroup species Lachancea kluyveri.
Principal Findings: We find that genes duplicated as part of a genome duplication (ohnologs) are significantly more intrinsically disordered than singletons (p<2.2e-16, Wilcoxon), reflecting a preference for retaining intrinsically disordered proteins in duplicate. In addition, there have been marked changes in the extent of intrinsic disorder following duplication. A large number of duplicated genes have more intrinsic disorder than their L. kluyveri ortholog (29% for duplicates versus 25% for singletons) and an even greater number have less intrinsic disorder than the L. kluyveri ortholog (37% for duplicates versus 25% for singletons). Finally, we show that the number of physical interactions is significantly greater in the more intrinsically disordered ohnolog of a pair (p = 0.003, Wilcoxon).
Conclusion: This work shows that intrinsic disorder gain and loss in a protein is a mechanism by which a genome can also diverge and innovate. The higher number of interactors for proteins that have gained intrinsic disorder compared with their duplicates may reflect the acquisition of new interaction partners or new functional roles.
Affiliation(s)
- Floriane Montanari
- UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine and Medical Sciences, and UCD Complex and Adaptive Systems Laboratory, University College Dublin, Dublin, Republic of Ireland
- Denis C. Shields
- UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine and Medical Sciences, and UCD Complex and Adaptive Systems Laboratory, University College Dublin, Dublin, Republic of Ireland
- Nora Khaldi
- UCD Conway Institute of Biomolecular and Biomedical Research, School of Medicine and Medical Sciences, and UCD Complex and Adaptive Systems Laboratory, University College Dublin, Dublin, Republic of Ireland
59
The duplicated deacetylases Sir2 and Hst1 subfunctionalized by acquiring complementary inactivating mutations. Mol Cell Biol 2011; 31:3351-65. [PMID: 21690292 DOI: 10.1128/mcb.05175-11] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/20/2022]
Abstract
Protein families are generated by successive rounds of gene duplication and subsequent diversification. However, the paths by which duplicated genes acquire distinct functions are not well characterized. We focused on a pair of duplicated deacetylases from Saccharomyces cerevisiae, Sir2 and Hst1, that subfunctionalized after duplication. As a proxy for the ancestral, nonduplicated deacetylase, we studied Sir2 from another yeast, Kluyveromyces lactis. We compared the interaction domains of these deacetylases for the Sir transcriptional silencing complex, which acts with ScSir2, and the Sum1 repressor, which acts with ScHst1, and found that these interaction domains have been retained over the course of evolution and can be disrupted by simple amino acid substitutions. Therefore, Sir2 and Hst1 subfunctionalized by acquiring complementary inactivating mutations in these interaction domains.
60
Gonçalves P, Valério E, Correia C, de Almeida JMGCF, Sampaio JP. Evidence for divergent evolution of growth temperature preference in sympatric Saccharomyces species. PLoS One 2011; 6:e20739. [PMID: 21674061 PMCID: PMC3107239 DOI: 10.1371/journal.pone.0020739] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 01/20/2011] [Accepted: 05/09/2011] [Indexed: 01/17/2023]
Abstract
The genus Saccharomyces currently includes eight species in addition to the model yeast Saccharomyces cerevisiae, most of which can be consistently isolated from tree bark and soil. We recently found sympatric pairs of Saccharomyces species, composed of one cryotolerant and one thermotolerant species, in oak bark samples of various geographic origins. To help explain the sympatric occurrence of Saccharomyces species, we screened Saccharomyces genomic data for protein divergence that might be correlated with the distinct growth temperature preferences of the species, using the dN/dS ratio as a measure of protein evolution rates and pair-wise species comparisons. In addition to proteins previously implicated in growth at suboptimal temperatures, we found that glycolytic enzymes were among the proteins exhibiting higher than expected divergence when one cryotolerant and one thermotolerant species are compared. By measuring glycolytic fluxes and glycolytic enzymatic activities in different species and at different temperatures, we subsequently show that the unusual divergence of glycolytic genes may be related to divergent evolution of the glycolytic pathway, aligning its performance to the growth temperature profiles of the different species. In general, our results support the view that growth temperature preference is a trait that may have undergone divergent selection in the course of ecological speciation in Saccharomyces.
Affiliation(s)
- Paula Gonçalves
- Departamento de Ciências da Vida, Centro de Recursos Microbiológicos (CREM), Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal.
61
The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus. G3-GENES GENOMES GENETICS 2011; 1:11-25. [PMID: 22384314 PMCID: PMC3276118 DOI: 10.1534/g3.111.000273] [Citation(s) in RCA: 225] [Impact Index Per Article: 17.3] [Received: 04/18/2011] [Accepted: 05/01/2011] [Indexed: 01/05/2023]
Abstract
High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org.
62
Genetic interactions reveal the evolutionary trajectories of duplicate genes. Mol Syst Biol 2011; 6:429. [PMID: 21081923 PMCID: PMC3010121 DOI: 10.1038/msb.2010.82] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 05/04/2010] [Accepted: 09/27/2010] [Indexed: 11/15/2022]
Abstract
Duplicate genes show significantly fewer interactions than singleton genes, and functionally similar duplicates can exhibit dissimilar profiles because common interactions are ‘hidden’ due to buffering. Genetic interaction profiles provide insights into evolutionary mechanisms of duplicate retention by distinguishing duplicates under dosage selection from those retained because of some divergence in function. The genetic interactions of duplicate genes evolve in an extremely asymmetric way and the directionality of this asymmetry correlates well with other evolutionary properties of duplicate genes. Genetic interaction profiles can be used to elucidate the divergent function of specific duplicate pairs.
Gene duplication and divergence serves as a primary source for new genes and new functions, and as such has broad implications on the evolutionary process. Duplicate genes within S. cerevisiae have been shown to retain a high degree of similarity with regard to many of their functional properties (Papp et al, 2004; Guan et al, 2007; Wapinski et al, 2007; Musso et al, 2008), and perturbation of duplicate genes has been shown to result in smaller fitness defects than singleton genes (Gu et al, 2003; DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Individual genetic interactions between pairs of genes and profiles of such interactions across the entire genome provide a new context in which to examine the properties of duplicate compensation. In this study we use the most recent and comprehensive set of genetic interactions in yeast produced to date (Costanzo et al, 2010) to address questions of duplicate retention and redundancy. We show that the ability for duplicate genes to buffer the deletion of a partner has three main consequences. First it agrees with previous work demonstrating that a high proportion of duplicate pairs are synthetic lethal, a classic indication of the ability to buffer one another functionally (DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Second, it reduces the number of genetic interactions observed between duplicate genes and the rest of the genome by masking interactions relating to common function from experimental detection. Third, this buffering of common interactions serves to reduce profile similarity in spite of common function (Figure 1). The compensatory ability of functionally similar duplicates buffers genetic interactions related to their common function (reducing the number of genetic interactions overall), while allowing the measurement of interactions related to any divergent function. Thus, even functionally similar duplicates may have dissimilar genetic interaction profiles. 
As previously surmised (Ihmels et al, 2007), duplicate genes under selection for dosage amplification have differing profile characteristics. We show that dosage-mediated duplicates have much higher genetic interaction profile similarity than do other duplicate pairs. Furthermore, we show in a comparison with local neighbors on a protein–protein interaction (PPI) network, that although dosage-mediated duplicates more often have higher similarity to each other than they do to their neighbors, the reverse is true for duplicates in general. That is, slightly divergent duplicate genes more often exhibit a higher similarity with a common neighbor on the PPI network than they do with each other, and that observation is consistent with the idea that common interactions are buffered while interactions corresponding to divergent functions are observed. We then asked whether duplicates' genetic interactions that are not buffered appear in a symmetric or an asymmetric fashion. Previous work has established asymmetric patterns with regard to PPI degree (Wagner, 2002; He and Zhang, 2005), sequence divergence (Conant and Wagner, 2003; Zhang et al, 2003; Kellis et al, 2004; Scannell and Wolfe, 2008) and expression patterns (Gu et al, 2002b; Tirosh and Barkai, 2007). Although genetic interactions are further removed from mechanism than protein–protein interactions, for example, they do offer a more direct measurement of functional consequence and, thus, may give a better indication of the functional differences between a duplicate pair. We found that duplicates exhibit a strikingly asymmetric pattern of genetic interactions, with the ratio of interactions between sisters commonly exceeding 7:1 (Figure 4A). The observations differ significantly from random simulations in which genetic interactions were redistributed between sisters with equal probability (Figure 4A). 
Moreover, the directionality of this interaction asymmetry agrees with other physiological properties of duplicate pairs. For example, the sister with more genetic interactions also tends to have more protein–protein interactions and also tends to evolve at a slower rate (Figure 4B). Genetic interaction degree and profiles can be used to understand the functional divergence of particular duplicate pairs. As a case example, we consider the whole-genome-duplication pair CIK1–VIK1. Each of these genes encodes a protein that forms a distinct heterodimeric complex with the microtubule motor protein Kar3 (Manning et al, 1999). Although each of these proteins depends on a direct physical interaction with Kar3, Cik1 has a much higher profile similarity to Kar3 than does Vik1 (r=0.5 and r=0.3, respectively). Consistent with its higher similarity, Δcik1 and Δkar3 exhibit several similar phenotypes, including abnormally short spindles, chromosome loss and delayed cell cycle progression (Page et al, 1994; Manning et al, 1999). In contrast, a Δvik1 mutant strain exhibits no overt phenotype (Manning et al, 1999). The characterization of functional redundancy and divergence between duplicate genes is an important step in understanding the evolution of genetic systems. Large-scale genetic network analysis in Saccharomyces cerevisiae provides a powerful perspective for addressing these questions through quantitative measurements of genetic interactions between pairs of duplicated genes, and more generally, through the study of genome-wide genetic interaction profiles associated with duplicated genes. We show that duplicate genes exhibit fewer genetic interactions than other genes because they tend to buffer one another functionally, whereas observed interactions are non-overlapping and reflect their divergent roles.
We also show that duplicate gene pairs are highly imbalanced in their number of genetic interactions with other genes, a pattern that appears to result from asymmetric evolution, such that one duplicate evolves or degrades faster than the other and often becomes functionally or conditionally specialized. The differences in genetic interactions are predictive of differences in several other evolutionary and physiological properties of duplicate pairs.
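The null comparison described in this abstract, in which each pair's genetic interactions are redistributed between the two sisters with equal probability, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the interaction counts, the function names, and the choice of "fraction of pairs at or above a 7:1 ratio" as the summary statistic are all hypothetical choices made for the example.

```python
import random


def asymmetry_ratio(a, b):
    """Ratio of the larger to the smaller interaction count of a sister pair."""
    hi, lo = max(a, b), min(a, b)
    if hi == 0:
        return 1.0  # no interactions at all: treat the pair as symmetric
    return float("inf") if lo == 0 else hi / lo


def redistribute(total, rng):
    """Assign each of `total` interactions to one of two sisters with p = 0.5."""
    first = sum(1 for _ in range(total) if rng.random() < 0.5)
    return first, total - first


def fraction_exceeding(pairs, threshold):
    """Fraction of pairs whose asymmetry ratio is at least `threshold`."""
    return sum(asymmetry_ratio(a, b) >= threshold for a, b in pairs) / len(pairs)


def null_fraction(pairs, threshold, n_sim=1000, seed=0):
    """Mean fraction of pairs exceeding `threshold` under random redistribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        simulated = [redistribute(a + b, rng) for a, b in pairs]
        total += fraction_exceeding(simulated, threshold)
    return total / n_sim


# Hypothetical interaction counts for five sister pairs (illustrative only,
# not data from the study).
observed_pairs = [(70, 8), (45, 5), (30, 28), (64, 6), (12, 10)]

observed_fraction = fraction_exceeding(observed_pairs, threshold=7.0)
null_mean = null_fraction(observed_pairs, threshold=7.0)

print(f"observed fraction of pairs at or above 7:1: {observed_fraction:.2f}")
print(f"mean fraction under equal-probability redistribution: {null_mean:.4f}")
```

Each simulated pair keeps its observed total number of interactions, so the comparison isolates how those interactions are split between sisters rather than how many there are. An observed fraction far above the simulated mean would correspond to the kind of excess asymmetry the abstract reports.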
63
Panchin AY, Gelfand MS, Ramensky VE, Artamonova II. Asymmetric and non-uniform evolution of recently duplicated human genes. Biol Direct 2010; 5:54. [PMID: 20825637 PMCID: PMC2942815 DOI: 10.1186/1745-6150-5-54] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 08/27/2010] [Accepted: 09/08/2010] [Indexed: 12/26/2022]
Abstract
Background: Gene duplications are a source of new genes and protein functions. The innovative role of duplication events makes families of paralogous genes an interesting target for studies in evolutionary biology. Here we study global trends in the evolution of human genes that resulted from recent duplications.
Results: The pressure of negative selection is weaker during a short time immediately after a duplication event. Roughly one fifth of genes in paralogous gene families are evolving asymmetrically: one of the proteins encoded by two closest paralogs accumulates amino acid substitutions significantly faster than its partner. This asymmetry cannot be explained by differences in gene expression levels. In asymmetric gene pairs the number of deleterious mutations is increased in one copy, while decreased in the other copy as compared to genes constituting non-asymmetrically evolving pairs. The asymmetry in the rate of synonymous substitutions is much weaker and not significant.
Conclusions: The increase of negative selection pressure over time after a duplication event seems to be a major trend in the evolution of human paralogous gene families. The observed asymmetry in the evolution of paralogous genes shows that in many cases one of two gene copies remains practically unchanged, while the other accumulates functional mutations. This supports the hypothesis that slowly evolving gene copies preserve their original functions, while fast evolving copies obtain new specificities or functions.
Affiliation(s)
- Alexander Y Panchin
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- Mikhail S Gelfand
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- Vasily E Ramensky
- V.A. Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, Moscow, 119991, Russia
- Irena I Artamonova
- M.V. Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Bolshoi Karetny 19, Moscow, 127994, Russia
- N.I. Vavilov Institute of General Genetics, Russian Academy of Science, Gubkina 3, Moscow, 119991, Russia
64
Tong P, Prendergast JGD, Lohan AJ, Farrington SM, Cronin S, Friel N, Bradley DG, Hardiman O, Evans A, Wilson JF, Loftus B. Sequencing and analysis of an Irish human genome. Genome Biol 2010; 11:R91. [PMID: 20822512 PMCID: PMC2965383 DOI: 10.1186/gb-2010-11-9-r91] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 04/10/2010] [Revised: 07/13/2010] [Accepted: 09/07/2010] [Indexed: 11/10/2022]
Abstract
Background: Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.
Results: Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage.
Conclusions: Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge.
Affiliation(s)
- Pin Tong
- Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
65
Grassi L, Fusco D, Sellerio A, Corà D, Bassetti B, Caselle M, Lagomarsino MC. Identity and divergence of protein domain architectures after the yeast whole-genome duplication event. MOLECULAR BIOSYSTEMS 2010; 6:2305-15. [PMID: 20820472 DOI: 10.1039/c003507f] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022]
Abstract
Gene duplication is a key mechanism in evolution for generating new functionality, and it is known to have produced a large proportion of genes. Duplication mechanisms include small-scale, or "local", events such as unequal crossing over and retroposition, together with global events, such as chromosomal or whole genome duplication (WGD). In particular, different studies confirmed that the yeast S. cerevisiae arose from a 100-150 million-year old whole-genome duplication. Detection and study of duplications are usually based on sequence alignment, synteny and phylogenetic techniques, but protein domains are also useful in assessing protein homology. We develop a simple and computationally efficient protein domain architecture comparison method based on the domain assignments available from public databases. We test the accuracy and the reliability of this method in detecting instances of gene duplication in the yeast S. cerevisiae. In particular, we analyze the evolution of WGD and non-WGD paralogs from the domain viewpoint, in comparison with a more standard functional analysis of the genes. A large number of domains is shared by genes that underwent local and global duplications, indicating the existence of a common set of "duplicable" domains. On the other hand, WGD and non-WGD paralogs tend to have different functions. We find evidence that this comes from functional migration within similar domain superfamilies, but also from the existence of small sets of WGD and non-WGD specific domain superfamilies with largely different functions. This observation gives a novel perspective on the finding that WGD paralogs tend to be functionally different from small-scale paralogs. WGD and non-WGD superfamilies carry distinct functions. Finally, the Gene Ontology similarity of paralogs tends to decrease with duplication age, while this tendency is weaker or not observable by the comparison of the domain architectures of paralogs. 
This suggests that the set of domains composing a protein tends to be maintained, while its function, cellular process or localization diversifies. Overall, the gathered evidence gives a different viewpoint on the biological specificity of the WGD and at the same time points out the validity of domain architecture comparison as a tool for detecting homology.
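As an illustration of the kind of domain architecture comparison this abstract describes, the following minimal sketch scores two proteins by the Jaccard index of their domain sets. The abstract does not specify the authors' scoring scheme, so the Jaccard index, the function name `architecture_similarity`, and the Pfam-style identifiers are all assumptions chosen for illustration only.

```python
def architecture_similarity(domains_a, domains_b):
    """Jaccard index of two domain sets (an assumed measure, not the paper's)."""
    set_a, set_b = set(domains_a), set(domains_b)
    if not set_a and not set_b:
        # No domain assignments on either side: treat as no evidence of homology.
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical domain assignments for two candidate paralogs.
protein_1 = ["PF00069", "PF00433"]
protein_2 = ["PF00069", "PF07714"]
similarity = architecture_similarity(protein_1, protein_2)
```

A set-based score ignores domain order and repeats; an order-aware comparison would need a sequence alignment over domain strings instead.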
Affiliation(s)
- Luigi Grassi
- Università degli Studi di Torino, Dip. Fisica Teorica-Via Giuria 1, 10125 Torino, Italy
|
66
|
Abstract
The divergence of new genes and proteins occurs through mutations that modulate protein function. However, mutations are pleiotropic and can have different effects on organismal fitness depending on the environment, as well as opposite effects on protein function and dosage. We review the pleiotropic effects of mutations. We discuss how they affect the evolution of gene and protein function, and how these complex mutational effects dictate the likelihood and mechanism of gene duplication and divergence. We propose several factors that can affect the divergence of new protein functions, including mutational trade-offs and hidden, or apparently neutral, variation.
|
67
|
Toll-Riera M, Laurie S, Albà MM. Lineage-specific variation in intensity of natural selection in mammals. Mol Biol Evol 2010; 28:383-98. [PMID: 20688808 DOI: 10.1093/molbev/msq206] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022] Open
Abstract
The molecular clock hypothesis states that protein-coding genes evolve at an approximately constant rate. However, this is only expected to be true as long as the function and the tertiary structure of the molecule remain unaltered. An important implication of this statement is that significant deviations in the rate of evolution of a gene with respect to the species clock are likely to reflect functional and/or structural alterations. Here, we present a method to identify such deviations and apply it to a data set of 2,929 high-quality coding sequence alignments corresponding to one-to-one orthologous genes from six mammalian species--human, macaque, mouse, rat, cow, and dog. Deviated branches are defined as those that present significant alterations in both the rate of nonsynonymous substitutions (dN) and the selective pressure (dN/dS). Strikingly, we find that as many as 24.5% of the genes show branch-specific deviations in dN and dN/dS, though this is a relatively well-conserved set of genes. Around half of these genes show branch-specific acceleration of evolutionary rates. Positive selection (PS) tests based on divergence data only identify 17.7% of the accelerated branches. Failure to identify PS in accelerated branches with an excess of radical amino acid replacements suggests that these tests are conservative. Interestingly, genes with accelerated branches are significantly enriched in neural proteins, indicating that this type of protein might play a more important role than previously thought in species diversification, although they are generally not detected by PS tests. We discuss in detail several examples of genes that show lineage-specific evolutionary rate acceleration and are involved in synaptic transmission, chemosensory perception, and ubiquitination.
Affiliation(s)
- Macarena Toll-Riera
- Evolutionary Genomics Group, Research Programme on Biomedical Informatics, Fundació Institut Municipal d'Investigació Mèdica, Barcelona Biomedical Research Park, Barcelona, Spain
|
68
|
Ames RM, Rash BM, Hentges KE, Robertson DL, Delneri D, Lovell SC. Gene duplication and environmental adaptation within yeast populations. Genome Biol Evol 2010; 2:591-601. [PMID: 20660110 PMCID: PMC2997561 DOI: 10.1093/gbe/evq043] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 12/31/2022] Open
Abstract
Population-level differences in the number of copies of genes resulting from gene duplication and loss have recently been recognized as an important source of variation in eukaryotes. However, except for a small number of cases, the phenotypic effects of this variation are unknown. Data from the Saccharomyces Genome Resequencing Project permit the study of duplication in genome sequences from a set of individuals within the same population. These sequences can be correlated with available information on the environments from which these yeast strains were isolated. We find that yeast show an abundance of duplicate genes that are lineage specific, leading to a large degree of variation in gene content between individual strains. There is a detectable bias for specific functions, indicating that selection is acting to preferentially retain certain duplicates. Most strikingly, we find that sets of over- and underrepresented duplicates correlate with the environment from which they were isolated. Together, these observations indicate that gene duplication can give rise to substantial phenotypic differences within populations that in turn can offer a shortcut to evolutionary adaptation.
Affiliation(s)
- Ryan M Ames
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
|
69
|
|
70
|
Abstract
Many, if not most, enzymes can promiscuously catalyze reactions, or act on substrates, other than those for which they evolved. Here, we discuss the structural, mechanistic, and evolutionary implications of this manifestation of infidelity of molecular recognition. We define promiscuity and related phenomena and also address their generality and physiological implications. We discuss the mechanistic enzymology of promiscuity--how enzymes, which generally exert exquisite specificity, catalyze other, and sometimes barely related, reactions. Finally, we address the hypothesis that promiscuous enzymatic activities serve as evolutionary starting points and highlight the unique evolutionary features of promiscuous enzyme functions.
Affiliation(s)
- Olga Khersonsky
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
|
71
|
Brown CA, Murray AW, Verstrepen KJ. Rapid expansion and functional divergence of subtelomeric gene families in yeasts. Curr Biol 2010; 20:895-903. [PMID: 20471265 DOI: 10.1016/j.cub.2010.04.027] [Citation(s) in RCA: 238] [Impact Index Per Article: 17.0] [Received: 02/08/2010] [Revised: 04/13/2010] [Accepted: 04/14/2010] [Indexed: 12/31/2022]
Abstract
BACKGROUND Subtelomeres, regions proximal to telomeres, exhibit characteristics unique to eukaryotic genomes. Genes residing in these loci are subject to epigenetic regulation and elevated rates of both meiotic and mitotic recombination. However, most genome sequences do not contain assembled subtelomeric sequences, and, as a result, subtelomeres are often overlooked in comparative genomics. RESULTS We studied the evolution and functional divergence of subtelomeric gene families in the yeast lineage. Our computational results show that subtelomeric families are evolving and expanding much faster than families that do not contain subtelomeric genes. Focusing on three related subtelomeric MAL gene families involved in disaccharide metabolism that show typical patterns of rapid expansion and evolution, we show experimentally how frequent duplication events followed by functional divergence yield novel alleles that allow the metabolism of different carbohydrates. CONCLUSIONS Taken together, our computational and experimental analyses show that the extraordinary instability of eukaryotic subtelomeres supports rapid adaptation to novel niches by promoting gene recombination and duplication followed by functional divergence of the alleles.
Affiliation(s)
- Chris A Brown
- Faculty of Arts and Sciences Center for Systems Biology, Harvard University, 52 Oxford Street, Cambridge, MA 02138, USA
|
72
|
Sun HZ, Ge S. Molecular evolution of the duplicated TFIIAgamma genes in Oryzeae and its relatives. BMC Evol Biol 2010; 10:128. [PMID: 20438643 PMCID: PMC2887407 DOI: 10.1186/1471-2148-10-128] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/22/2009] [Accepted: 05/04/2010] [Indexed: 11/10/2022] Open
Abstract
Background Gene duplication provides raw genetic materials for evolutionary novelty and adaptation. The evolutionary fate of duplicated transcription factor genes is less studied although transcription factor gene plays important roles in many biological processes. TFIIAγ is a small subunit of TFIIA that is one of general transcription factors required by RNA polymerase II. Previous studies identified two TFIIAγ-like genes in rice genome and found that these genes either conferred resistance to rice bacterial blight or could be induced by pathogen invasion, raising the question as to their functional divergence and evolutionary fates after gene duplication. Results We reconstructed the evolutionary history of the TFIIAγ genes from main lineages of angiosperms and demonstrated that two TFIIAγ genes (TFIIAγ1 and TFIIAγ5) arose from a whole genome duplication that happened in the common ancestor of grasses. Likelihood-based analyses with branch, codon, and branch-site models showed no evidence of positive selection but a signature of relaxed selective constraint after the TFIIAγ duplication. In particular, we found that the nonsynonymous/synonymous rate ratio (ω = dN/dS) of the TFIIAγ1 sequences was two times higher than that of TFIIAγ5 sequences, indicating highly asymmetric rates of protein evolution in rice tribe and its relatives, with an accelerated rate of TFIIAγ1 gene. Our expression data and EST database search further indicated that after whole genome duplication, the expression of TFIIAγ1 gene was significantly reduced while TFIIAγ5 remained constitutively expressed and maintained the ancestral role as a subunit of the TFIIA complex. Conclusion The evolutionary fate of TFIIAγ duplicates is not consistent with the neofunctionalization model that predicts that one of the duplicated genes acquires a new function because of positive Darwinian selection. Instead, we suggest that subfunctionalization might be involved in TFIIAγ evolution in grasses. 
The fact that both TFIIAγ1 and TFIIAγ5 genes were effectively involved in response to biotic or abiotic factors might be explained by either Dykhuizen-Hartl effect or buffering hypothesis.
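The rate-ratio comparison reported in this abstract can be made concrete with a short sketch. It computes ω = dN/dS for each of two paralogs and the fold difference between them. The dN and dS values are hypothetical placeholders, not data from the study, and the function names `omega` and `fold_difference` are invented for illustration.

```python
def omega(dn, ds):
    """Nonsynonymous/synonymous rate ratio; undefined when dS is not positive."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def fold_difference(omega_fast, omega_slow):
    """How many times larger the faster-evolving copy's omega is."""
    return omega_fast / omega_slow


# Hypothetical substitution rates for two paralogs sharing the same dS.
omega_copy_1 = omega(dn=0.04, ds=0.20)
omega_copy_2 = omega(dn=0.02, ds=0.20)
fold = fold_difference(omega_copy_1, omega_copy_2)
```

With both ω values below 1, a twofold difference of this kind would point to relaxed constraint on one copy rather than positive selection, which matches the interpretation given in the abstract.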
Affiliation(s)
- Hong-Zheng Sun
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
|
73
|
Warren AS, Anandakrishnan R, Zhang L. Functional bias in molecular evolution rate of Arabidopsis thaliana. BMC Evol Biol 2010; 10:125. [PMID: 20433764 PMCID: PMC2876160 DOI: 10.1186/1471-2148-10-125] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 09/29/2009] [Accepted: 05/01/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Characteristics derived from mutation and other mechanisms that are advantageous for survival are often preserved during evolution by natural selection. Some genes are conserved in many organisms because they are responsible for fundamental biological function, others are conserved for their unique functional characteristics. Therefore one would expect the rate of molecular evolution for individual genes to be dependent on their biological function. Whether this expectation holds for genes duplicated by whole genome duplication is not known. RESULTS We empirically demonstrate here, using duplicated genes generated from the Arabidopsis thaliana alpha-duplication event, that the rate of molecular evolution of genes duplicated in this event depend on biological function. Using functional clustering based on gene ontology annotation of gene pairs, we show that some duplicated genes, such as defense response genes, are under weaker purifying selection or under stronger diversifying selection than other duplicated genes, such as protein translation genes, as measured by the ratio of nonsynonymous to synonymous divergence (dN/dS). CONCLUSIONS These results provide empirical evidence indicating that molecular evolution rate for genes duplicated in whole genome duplication, as measured by dN/dS, may depend on biological function, which we characterize using gene ontology annotation. Furthermore, the general approach used here provides a framework for comparative analysis of molecular evolution rate for genes based on their biological function.
Affiliation(s)
- Andrew S Warren
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
|
74
|
Hanada K, Kuromori T, Myouga F, Toyoda T, Shinozaki K. Increased expression and protein divergence in duplicate genes is associated with morphological diversification. PLoS Genet 2009; 5:e1000781. [PMID: 20041196 PMCID: PMC2788128 DOI: 10.1371/journal.pgen.1000781] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 07/13/2009] [Accepted: 11/22/2009] [Indexed: 11/18/2022] Open
Abstract
The differentiation of both gene expression and protein function is thought to be important as a mechanism of the functionalization of duplicate genes. However, it has not been addressed whether expression or protein divergence of duplicate genes is greater in those genes that have undergone functionalization compared with those that have not. We examined a total of 492 paralogous gene pairs associated with morphological diversification in a plant model organism (Arabidopsis thaliana). Classifying these paralogous gene pairs into high, low, and no morphological diversification groups, based on knock-out data, we found that the divergence rate of both gene expression and protein sequences were significantly higher in either high or low morphological diversification groups compared with those in the no morphological diversification group. These results strongly suggest that the divergence of both expression and protein sequence are important sources for morphological diversification of duplicate genes. Although both mechanisms are not mutually exclusive, our analysis suggested that changes of expression pattern play the minor role (33%–41%) and that changes of protein sequence play the major role (59%–67%) in morphological diversification. Finally, we examined to what extent duplicate genes are associated with expression or protein divergence exerting morphological diversification at the whole-genome level. Interestingly, duplicate genes randomly chosen from A. thaliana had not experienced expression or protein divergence that resulted in morphological diversification. These results indicate that most duplicate genes have experienced minor functionalization. The relationship between morphological and molecular evolution is a central issue to the understanding of eukaryote evolution. In particular, there is much interest in how duplicate genes have contributed to morphological diversification during evolution. 
As a mechanism of functionalization of duplicate genes, differentiation of both gene expression and protein function are believed to be important. Although it has been reported that both expression and protein divergence tend to increase as a duplication ages, it is unclear whether expression or protein divergence in duplicate genes is greater in those genes that have undergone functionalization compared with those that have not. Here, we studied 492 duplicate gene pairs associated with various degrees of morphological diversification in Arabidopsis thaliana. Using these data, we found that the divergence of both expression and protein sequence were important sources for morphological diversification of duplicate genes. Although both mechanisms are not mutually exclusive, our analysis suggested that expression divergence is the minor contributor and protein divergence is the major contributor to morphological diversification. However, the expression or protein sequence of randomly chosen duplicate genes did not show significant divergence that resulted in morphological diversification. These results indicate that most duplicate genes experienced minor functionalization in the genome.
Affiliation(s)
- Kousuke Hanada
- Gene Discovery Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, Japan.
|
75
|
Farré D, Albà MM. Heterogeneous patterns of gene-expression diversification in mammalian gene duplicates. Mol Biol Evol 2009; 27:325-35. [PMID: 19822635 DOI: 10.1093/molbev/msp242] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 12/29/2022] Open
Abstract
Gene duplication is a major mechanism for molecular evolutionary innovation. Young gene duplicates typically exhibit elevated rates of protein evolution and, according to a number of recent studies, increased expression divergence. However, the nature of these changes is still poorly understood. To gain novel insights into the functional consequences of gene duplication, we have undertaken an in-depth analysis of a large data set of gene families containing primate- and/or rodent-specific gene duplicates. We have found a clear tendency toward an increase in protein, promoter, and expression divergence with increasing number of duplication events undergone by each gene since the human-mouse split. In addition, gene duplication is significantly associated with a reduction in expression breadth and intensity. Interestingly, it is possible to identify three main groups regarding the evolution of gene expression following gene duplication. The first group, which comprises around 25% of the families, shows patterns compatible with tissue-expression partitioning. The second and largest group, comprising 33-53% of the families, shows broad expression of one of the gene copies and reduced, overlapping, expression of the other copy or copies. This can be attributed, in most cases, to loss of expression in several tissues of one or more gene copies. Finally, a substantial number of families, 19-35%, maintain a very high level of tissue-expression overlap (>0.8) after tens of millions of years of evolution. These families may have been subject to selection for increased gene dosage.
|
76
|
Whittle CA, Krochko JE. Transcript profiling provides evidence of functional divergence and expression networks among ribosomal protein gene paralogs in Brassica napus. Plant Cell 2009; 21:2203-19. [PMID: 19706795 PMCID: PMC2751962 DOI: 10.1105/tpc.109.068411] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Received: 04/30/2009] [Revised: 06/14/2009] [Accepted: 07/15/2009] [Indexed: 05/19/2023]
Abstract
The plant ribosome is composed of 80 distinct ribosomal (r)-proteins. In Arabidopsis thaliana, each r-protein is encoded by two or more highly similar paralogous genes, although only one copy of each r-protein is incorporated into the ribosome. Brassica napus is especially suited to the comparative study of r-protein gene paralogs due to its documented history of genome duplication as well as the recent availability of large EST data sets. We have identified 996 putative r-protein genes spanning 79 distinct r-proteins in B. napus using EST data from 16 tissue collections. A total of 23,408 tissue-specific r-protein ESTs are associated with this gene set. Comparative analysis of the transcript levels for these unigenes reveals that a large fraction of r-protein genes are differentially expressed and that the number of paralogs expressed for each r-protein varies extensively with tissue type in B. napus. In addition, in many cases the paralogous genes for a specific r-protein are not transcribed in concert and have highly contrasting expression patterns among tissues. Thus, each tissue examined has a novel r-protein transcript population. Furthermore, hierarchical clustering reveals that particular paralogs for nonhomologous r-protein genes cluster together, suggesting that r-protein paralog combinations are associated with specific tissues in B. napus and, thus, may contribute to tissue differentiation and/or specialization. Altogether, the data suggest that duplicated r-protein genes undergo functional divergence into highly specialized paralogs and coexpression networks and that, similar to recent reports for yeast, these are likely actively involved in differentiation, development, and/or tissue-specific processes.
|
77
|
Glucose sensing network in Candida albicans: a sweet spot for fungal morphogenesis. Eukaryot Cell 2009; 8:1314-20. [PMID: 19617394 DOI: 10.1128/ec.00138-09] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Indexed: 12/26/2022]
|
78
|
Schrimpf SP, Weiss M, Reiter L, Ahrens CH, Jovanovic M, Malmström J, Brunner E, Mohanty S, Lercher MJ, Hunziker PE, Aebersold R, von Mering C, Hengartner MO. Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol 2009; 7:e48. [PMID: 19260763 PMCID: PMC2650730 DOI: 10.1371/journal.pbio.1000048] [Citation(s) in RCA: 185] [Impact Index Per Article: 12.3] [Received: 06/11/2008] [Accepted: 01/13/2009] [Indexed: 12/24/2022] Open
Abstract
The nematode Caenorhabditis elegans is a popular model system in genetics, not least because a majority of human disease genes are conserved in C. elegans. To generate a comprehensive inventory of its expressed proteome, we performed extensive shotgun proteomics and identified more than half of all predicted C. elegans proteins. This allowed us to confirm and extend genome annotations, characterize the role of operons in C. elegans, and semiquantitatively infer abundance levels for thousands of proteins. Furthermore, for the first time to our knowledge, we were able to compare two animal proteomes (C. elegans and Drosophila melanogaster). We found that the abundances of orthologous proteins in metazoans correlate remarkably well, better than protein abundance versus transcript abundance within each organism or transcript abundances across organisms; this suggests that changes in transcript abundance may have been partially offset during evolution by opposing changes in protein abundance. Proteins are the active players that execute the genetic program of a cell, and their levels and interactions are precisely controlled. Routinely monitoring thousands of proteins is difficult, as they can be present at vastly different abundances, come with various sizes, shapes, and charge, and have a more complex alphabet of twenty “letters,” in contrast to the four letters of the genome itself. Here, we used mass spectrometry to extensively characterize the proteins of a popular model organism, the nematode Caenorhabditis elegans. Together with previous data from the fruit fly Drosophila melanogaster, this allows us to compare the protein levels of two animals on a global scale. Surprisingly, we find that individual protein abundance is highly conserved between the two species. So, although worms and flies look very different, they need similar amounts of each conserved, orthologous protein. Because many C. elegans and D. melanogaster proteins also have counterparts in humans, our results suggest that similar rules may apply to our own proteins. A quantitative comparison of two animal proteomes shows a striking correlation of protein abundance levels, a better correlation than transcript levels. Are the latter more variable during evolution?
Affiliation(s)
- Sabine P Schrimpf
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- * To whom correspondence should be addressed. E-mail: (SPS); (CvM); (MOH)
- Manuel Weiss
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- PhD Program in Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Lukas Reiter
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- PhD Program in Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Christian H Ahrens
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- Functional Genomics Center, University of Zurich and Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Marko Jovanovic
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- PhD Program in Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Johan Malmström
- Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Erich Brunner
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- Sonali Mohanty
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Martin J Lercher
- Institute of Informatics, University of Düsseldorf, Düsseldorf, Germany
- Peter E Hunziker
- Functional Genomics Center, University of Zurich and Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Ruedi Aebersold
- Institute of Molecular Systems Biology, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
- Institute for Systems Biology, Seattle, Washington, United States of America
- Christian von Mering
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- * To whom correspondence should be addressed. E-mail: (SPS); (CvM); (MOH)
- Michael O Hengartner
- Institute of Molecular Biology, University of Zurich, Zurich, Switzerland
- Center for Model Organism Proteomes, University of Zurich, Zurich, Switzerland
- * To whom correspondence should be addressed. E-mail: (SPS); (CvM); (MOH)
|
79
|
How confident can we be that orthologs are similar, but paralogs differ? Trends Genet 2009; 25:210-6. [PMID: 19368988 DOI: 10.1016/j.tig.2009.03.004] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.5] [Received: 02/05/2009] [Revised: 03/15/2009] [Accepted: 03/16/2009] [Indexed: 11/24/2022]
Abstract
Homologous genes are classified into orthologs and paralogs, depending on whether they arose by speciation or duplication. It is widely assumed that orthologs share similar functions, whereas paralogs are expected to diverge more from each other. But does this assumption hold up on further examination? We present evidence that orthologs and paralogs are not so different in either their evolutionary rates or their mechanisms of divergence. We emphasize the importance of appropriately designed studies to test models of gene evolution between orthologs and between paralogs. Thus, functional change between orthologs might be as common as between paralogs, and future studies should be designed to test the impact of duplication against this alternative model.
|
80
|
Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proc Natl Acad Sci U S A 2009; 106:5714-9. [PMID: 19299507 DOI: 10.1073/pnas.0806251106] [Citation(s) in RCA: 126] [Impact Index Per Article: 8.4] [Indexed: 01/10/2023] Open
Abstract
We present GSR, a probabilistic model integrating gene duplication, sequence evolution, and a relaxed molecular clock for substitution rates, that enables genomewide analysis of gene families. The gene duplication and loss process is a major cause for incongruence between gene and species tree, and deterministic methods have been developed to explain such differences through tree reconciliations. Although probabilistic methods for phylogenetic inference have been around for decades, probabilistic reconciliation methods are far less established. Based on our model, we have implemented a Bayesian analysis tool, PrIME-GSR, for gene tree inference that takes a known species tree into account. Our implementation is sound and we demonstrate its utility for genomewide gene-family analysis by applying it to recently presented yeast data. We validate PrIME-GSR by comparing with previous analyses of these data that take advantage of gene order information. In a case study we apply our method to the ADH gene family and are able to draw biologically relevant conclusions concerning gene duplications creating key yeast phenotypes. On a higher level this shows the biological relevance of our method. The obtained results demonstrate the value of a relaxed molecular clock. Our good performance will extend to species where gene order conservation is insufficient.
|
81
|
Abstract
Comparative genomics and systems biology offer unprecedented opportunities for testing central tenets of evolutionary biology formulated by Darwin in the Origin of Species in 1859 and expanded in the Modern Synthesis 100 years later. Evolutionary-genomic studies show that natural selection is only one of the forces that shape genome evolution and is not quantitatively dominant, whereas non-adaptive processes are much more prominent than previously suspected. Major contributions of horizontal gene transfer and diverse selfish genetic elements to genome evolution undermine the Tree of Life concept. An adequate depiction of evolution requires the more complex concept of a network or ‘forest’ of life. There is no consistent tendency of evolution towards increased genomic complexity, and when complexity increases, this appears to be a non-adaptive consequence of evolution under weak purifying selection rather than an adaptation. Several universals of genome evolution were discovered including the invariant distributions of evolutionary rates among orthologous genes from diverse genomes and of paralogous gene family sizes, and the negative correlation between gene expression level and sequence evolution rate. Simple, non-adaptive models of evolution explain some of these universals, suggesting that a new synthesis of evolutionary biology might become feasible in a not so remote future.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
82
|
Jaillon O, Aury JM, Wincker P. “Changing by doubling”, the impact of Whole Genome Duplications in the evolution of eukaryotes. C R Biol 2009; 332:241-53. [DOI: 10.1016/j.crvi.2008.07.007] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 07/01/2008] [Accepted: 07/21/2008] [Indexed: 12/17/2022]
|
83
|
Park C, Makova KD. Coding region structural heterogeneity and turnover of transcription start sites contribute to divergence in expression between duplicate genes. Genome Biol 2009; 10:R10. [PMID: 19175934 PMCID: PMC2687787 DOI: 10.1186/gb-2009-10-1-r10] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 10/11/2008] [Revised: 12/24/2008] [Accepted: 01/28/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Gene expression divergence is one manifestation of functional differences between duplicate genes. Although rapid accumulation of expression divergence between duplicate gene copies has been observed, the driving mechanisms behind this phenomenon have not been explored in detail. RESULTS: We examine which factors influence expression divergence between human duplicate genes, utilizing the latest genome-wide data sets. We conclude that the turnover of transcription start sites between duplicate genes occurs rapidly after gene duplication and that gene pairs with shared transcription start sites have significantly higher expression similarity than those without shared transcription start sites. Moreover, we find that most (55%) duplicate gene pairs do not retain the same coding sequence structure between the two duplicate copies and this also contributes to divergence in their expression. Furthermore, the proportion of aligned sequences in cis-regulatory regions between the two copies is positively correlated with expression similarity. Surprisingly, we find no effect of copy-specific transposable element insertions on the divergence of duplicate gene expression. CONCLUSIONS: Our results suggest that turnover of transcription start sites, structural heterogeneity of coding sequences, and divergence of cis-regulatory regions between copies play a pivotal role in determining the expression divergence of duplicate genes.
Affiliation(s)
- Chungoo Park
- Center for Comparative Genomics and Bioinformatics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA.
|
84
|
Dunn B, Sherlock G. Reconstruction of the genome origins and evolution of the hybrid lager yeast Saccharomyces pastorianus. Genome Res 2008; 18:1610-23. [PMID: 18787083 DOI: 10.1101/gr.076075.108] [Citation(s) in RCA: 211] [Impact Index Per Article: 13.2] [Indexed: 11/24/2022]
Abstract
Inter-specific hybridization leading to abrupt speciation is a well-known, common mechanism in angiosperm evolution; only recently, however, have similar hybridization and speciation mechanisms been documented to occur frequently among the closely related group of sensu stricto Saccharomyces yeasts. The economically important lager beer yeast Saccharomyces pastorianus is such a hybrid, formed by the union of Saccharomyces cerevisiae and Saccharomyces bayanus-related yeasts; efforts to understand its complex genome, searching for both biological and brewing-related insights, have been underway since its hybrid nature was first discovered. It had been generally thought that a single hybridization event resulted in a unique S. pastorianus species, but it has been recently postulated that there have been two or more hybridization events. Here, we show that there may have been two independent origins of S. pastorianus strains, and that each independent group (defined by characteristic genome rearrangements, copy number variations, ploidy differences, and DNA sequence polymorphisms) is correlated with specific breweries and/or geographic locations. Finally, by reconstructing common ancestral genomes via array-CGH data analysis and by comparing representative DNA sequences of the S. pastorianus strains with those of many different S. cerevisiae isolates, we have determined that the most likely S. cerevisiae ancestral parent for each of the independent S. pastorianus groups was an ale yeast, with different, but closely related ale strains contributing to each group's parentage.
Affiliation(s)
- Barbara Dunn
- Department of Genetics, Stanford University, Stanford, California 94305-5120, USA
|
85
|
Bershtein S, Tawfik DS. Ohno's Model Revisited: Measuring the Frequency of Potentially Adaptive Mutations under Various Mutational Drifts. Mol Biol Evol 2008; 25:2311-8. [DOI: 10.1093/molbev/msn174] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Indexed: 12/30/2022] Open
|
86
|
Probabilistic cross-species inference of orthologous genomic regions created by whole-genome duplication in yeast. Genetics 2008; 179:1681-92. [PMID: 18562662 DOI: 10.1534/genetics.107.074450] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022] Open
Abstract
Identification of orthologous genes across species becomes challenging in the presence of a whole-genome duplication (WGD). We present a probabilistic method for identifying orthologs that considers all possible orthology/paralogy assignments for a set of genomes with a shared WGD (here five yeast species). This approach allows us to estimate how confident we can be in the orthology assignments in each genomic region. Two inferences produced by this model are indicative of purifying selection acting to prevent duplicate gene loss. First, our model suggests that there are significant differences (up to a factor of seven) in duplicate gene half-life. Second, we observe differences between the genes that the model infers to have been lost soon after WGD and those lost more recently. Gene losses soon after WGD appear uncorrelated with gene expression level and knockout fitness defect. However, later losses are biased toward genes whose paralogs have high expression and large knockout fitness defects, as well as showing biases toward certain functional groups such as ribosomal proteins. We suggest that while duplicate copies of some genes may be lost neutrally after WGD, another set of genes may be initially preserved in duplicate by natural selection for reasons including dosage.
|
87
|
Studer RA, Penel S, Duret L, Robinson-Rechavi M. Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes. Genome Res 2008; 18:1393-402. [PMID: 18562677 DOI: 10.1101/gr.076992.108] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.3] [Indexed: 12/19/2022]
Abstract
A stringent branch-site codon model was used to detect positive selection in vertebrate evolution. We show that the test is robust to the large evolutionary distances involved. Positive selection was detected in 77% of 884 genes studied. Most positive selection concerns a few sites on a single branch of the phylogenetic tree: Between 0.9% and 4.7% of sites are affected by positive selection depending on the branches. No functional category was overrepresented among genes under positive selection. Surprisingly, whole genome duplication had no effect on the prevalence of positive selection, whether the fish-specific genome duplication or the two rounds at the origin of vertebrates. Thus positive selection has not been limited to a few gene classes, or to specific evolutionary events such as duplication, but has been pervasive during vertebrate evolution.
Affiliation(s)
- Romain A Studer
- Department of Ecology and Evolution, Biophore, Lausanne University, CH-1015 Lausanne, Switzerland
|
88
|
Chain FJJ, Ilieva D, Evans BJ. Duplicate gene evolution and expression in the wake of vertebrate allopolyploidization. BMC Evol Biol 2008; 8:43. [PMID: 18261230 PMCID: PMC2275784 DOI: 10.1186/1471-2148-8-43] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 10/31/2007] [Accepted: 02/08/2008] [Indexed: 12/21/2022] Open
Abstract
Background: The mechanism by which duplicate genes originate – whether by duplication of a whole genome or of a genomic segment – influences their genetic fates. To study events that trigger duplicate gene persistence after whole genome duplication in vertebrates, we have analyzed molecular evolution and expression of hundreds of persistent duplicate gene pairs in allopolyploid clawed frogs (Xenopus and Silurana). We collected comparative data that allowed us to tease apart the molecular events that occurred soon after duplication from those that occurred later on. We also quantified expression profile divergence of hundreds of paralogs during development and in different tissues. Results: Our analyses indicate that persistent duplicates generated by allopolyploidization are subjected to strong purifying selection soon after duplication. The level of purifying selection is relaxed compared to a singleton ortholog, but not significantly variable over a period spanning about 40 million years. Despite persistent functional constraints, however, analysis of paralogous expression profiles indicates that quantitative aspects of their expression diverged substantially during this period. Conclusion: These results offer clues into how vertebrate transcriptomes are sculpted in the wake of whole genome duplication (WGD), such as those that occurred in our early ancestors. That functional constraints were relaxed relative to a singleton ortholog but not significantly different in the early compared to the later stage of duplicate gene evolution suggests that the timescale for a return to pre-duplication levels is drawn out over tens of millions of years – beyond the age of these tetraploid species. Quantitative expression divergence can occur soon after WGD and with a magnitude that is not correlated with the rate of protein sequence divergence. On a coarse scale, quantitative expression divergence appears to be more prevalent than spatial and temporal expression divergence, and also faster or more frequent than other processes that operate at the protein level, such as some types of neofunctionalization.
Affiliation(s)
- Frédéric J J Chain
- Center for Environmental Genomics, Department of Biology, Life Sciences Building Room 328 McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4K1, Canada.
|