101
Abstract
Recent investigations of high-throughput genomic and phenomic data have uncovered a variety of significant but relatively weak correlations between a gene's functional and evolutionary characteristics. In particular, essential genes and genes with paralogs have a slight propensity to evolve more slowly than nonessential genes and singletons, respectively. However, given the weakness and multiplicity of these associations, their biological relevance remains uncertain. Here, we show that existence of an essential paralog can be used as a specific and strong gauge of selection. We partition gene families in several genomes into two classes: those that include at least one essential gene (E-families) and those without essential genes (N-families). We find that weaker purifying selection causes N-families to evolve in a more dynamic regime with higher rates both of duplicate fixation and pseudogenization. Because genes in E-families are subject to significantly stronger purifying selection than those in N-families, they survive longer and exhibit greater sequence divergence. Longer average survival time also allows for divergence of upstream regulatory regions, resulting in change of transcriptional context among paralogs in E-families. These findings are compatible with differential division of ancestral functions (subfunctionalization) or emergence of novel functions (neofunctionalization) being the prevalent modes of evolution of paralogs in E-families as opposed to pseudogenization (nonfunctionalization), which is the typical fate of paralogs in N-families. Unlike other characteristics of genes, such as essentiality, existence of paralogs, or expression level, membership in an E-family or an N-family strongly correlates with the level of selection and appears to be a major determinant of a gene's evolutionary fate.
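The partition described in this abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function name `partition_families`, the family identifiers, and the gene names are hypothetical and do not come from the paper, and the authors' actual family-building pipeline is not described here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def partition_families(
    families: Dict[str, Iterable[str]],
    essential_genes: Set[str],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split gene families into E-families and N-families.

    An E-family contains at least one essential gene;
    an N-family contains none.
    """
    e_families: Dict[str, List[str]] = {}
    n_families: Dict[str, List[str]] = {}
    for family_id, members in families.items():
        member_list = list(members)
        if any(gene in essential_genes for gene in member_list):
            e_families[family_id] = member_list
        else:
            n_families[family_id] = member_list
    return e_families, n_families


# Hypothetical toy data, for illustration only.
toy_families = {
    "fam1": ["geneA", "geneB"],
    "fam2": ["geneC", "geneD", "geneE"],
    "fam3": ["geneF", "geneG"],
}
toy_essential = {"geneB", "geneF"}

e_fams, n_fams = partition_families(toy_families, toy_essential)
```

With the toy data above, `fam1` and `fam3` each contain one essential gene and fall into the E class, while `fam2` falls into the N class.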
Affiliation(s)
- Boris E Shakhnovich
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA.
102
Johnson DA, Hill JP, Thomas MA. The monosaccharide transporter gene family in land plants is ancient and shows differential subfamily expression and expansion across lineages. BMC Evol Biol 2006; 6:64. [PMID: 16923188 PMCID: PMC1578591 DOI: 10.1186/1471-2148-6-64] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Received: 05/04/2006] [Accepted: 08/21/2006] [Indexed: 11/24/2022]
Abstract
Background In plants, tandem, segmental and whole-genome duplications are prevalent, resulting in large numbers of duplicate loci. Recent studies suggest that duplicate genes diverge predominantly through the partitioning of expression and that breadth of gene expression is related to the rate of gene duplication and protein sequence evolution. Here, we utilize expressed sequence tag (EST) data to study gene duplication and expression patterns in the monosaccharide transporter (MST) gene family across the land plants. In Arabidopsis, there are 53 MST genes that form seven distinct subfamilies. We created profile hidden Markov models of each subfamily and searched EST databases representing diverse land plant lineages to address the following questions: 1) Are homologs of each Arabidopsis subfamily present in the earliest land plants? 2) Do expression patterns among subfamilies and individual genes within subfamilies differ across lineages? 3) Has gene duplication within each lineage resulted in lineage-specific expansion patterns? We also looked for correlations between relative EST database representation in Arabidopsis and similarity to orthologs in early lineages. Results Homologs of all seven MST subfamilies were present in land plants at least 400 million years ago. Subfamily expression levels vary across lineages with greater relative expression of the STP, ERD6-like, INT and PLT subfamilies in the vascular plants. In the large EST databases of the moss, gymnosperm, monocot and eudicot lineages, EST contig construction reveals that MST subfamilies have experienced lineage-specific expansions. Large subfamily expansions appear to be due to multiple gene duplications arising from single ancestral genes. In Arabidopsis, one or a few genes within most subfamilies have much higher EST database representation than others. Most highly represented (broadly expressed) genes in Arabidopsis have best match orthologs in early divergent lineages. 
Conclusion The seven subfamilies of the Arabidopsis MST gene family are ancient in land plants and show differential subfamily expression and lineage-specific subfamily expansions. Patterns of gene expression in Arabidopsis and correlation of highly represented genes with best match homologs in early lineages suggest that broadly expressed genes are often highly conserved, and that most genes have more limited expression.
Affiliation(s)
- Deborah A Johnson
- Department of Biological Sciences, Idaho State University, Campus Box 8007, Pocatello, ID, USA
- Jeffrey P Hill
- Department of Biological Sciences, Idaho State University, Campus Box 8007, Pocatello, ID, USA
- Michael A Thomas
- Department of Biological Sciences, Idaho State University, Campus Box 8007, Pocatello, ID, USA
103
Brunet FG, Roest Crollius H, Paris M, Aury JM, Gibert P, Jaillon O, Laudet V, Robinson-Rechavi M. Gene loss and evolutionary rates following whole-genome duplication in teleost fishes. Mol Biol Evol 2006; 23:1808-16. [PMID: 16809621 DOI: 10.1093/molbev/msl049] [Citation(s) in RCA: 278] [Impact Index Per Article: 15.4] [Indexed: 11/13/2022]
Abstract
Teleost fishes provide the first unambiguous support for ancient whole-genome duplication in an animal lineage. Studies in yeast or plants have shown that the effects of such duplications can be mediated by a complex pattern of gene retention and changes in evolutionary pressure. To explore such patterns in fishes, we have determined by phylogenetic analysis the evolutionary origin of 675 Tetraodon duplicated genes assigned to chromosomes, using additional data from other species of actinopterygian fishes. The subset of genes, which was retained in double after the genome duplication, is enriched in development, signaling, behavior, and regulation functional categories. The evolutionary rate of duplicate fish genes appears to be determined by 3 forces: 1) fish proteins evolve faster than mammalian orthologs; 2) the genes kept in double after genome duplication represent the subset under strongest purifying selection; and 3) following duplication, there is an asymmetric acceleration of evolutionary rate in one of the paralogs. These results show that similar mechanisms are at work in fishes as in yeast or plants and provide a framework for future investigation of the consequences of duplication in fishes and other animals.
Affiliation(s)
- Frédéric G Brunet
- Laboratoire de Biologie Moléculaire de la Cellule, INRA LA 1237, CNRS UMR5161, IFR 128 BioSciences Lyon-Gerland, Ecole Normale Supérieure de Lyon, Lyon, France
104
Abstract
Why do proteins evolve at different rates? Advances in systems biology and genomics have facilitated a move from studying individual proteins to characterizing global cellular factors. Systematic surveys indicate that protein evolution is not determined exclusively by selection on protein structure and function, but is also affected by the genomic position of the encoding genes, their expression patterns, their position in biological networks and possibly their robustness to mistranslation. Recent work has allowed insights into the relative importance of these factors. We discuss the status of a much-needed coherent view that integrates studies on protein evolution with biochemistry and functional and structural genomics.
Affiliation(s)
- Csaba Pál
- European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69012 Heidelberg, Germany
105
Chain FJJ, Evans BJ. Multiple mechanisms promote the retained expression of gene duplicates in the tetraploid frog Xenopus laevis. PLoS Genet 2006; 2:e56. [PMID: 16683033 PMCID: PMC1449897 DOI: 10.1371/journal.pgen.0020056] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Received: 11/10/2005] [Accepted: 02/28/2006] [Indexed: 01/19/2023]
Abstract
Gene duplication provides a window of opportunity for biological variants to persist under the protection of a co-expressed copy with similar or redundant function. Duplication catalyzes innovation (neofunctionalization), subfunction degeneration (subfunctionalization), and genetic buffering (redundancy), and the genetic survival of each paralog is triggered by mechanisms that add, compromise, or do not alter protein function. We tested the applicability of three types of mechanisms for promoting the retained expression of duplicated genes in 290 expressed paralogs of the tetraploid clawed frog, Xenopus laevis. Tests were based on explicit expectations concerning the ka/ks ratio, and the number and location of nonsynonymous substitutions after duplication. Functional constraints on the majority of paralogs are not significantly different from a singleton ortholog. However, we recover strong support that some of them have an asymmetric rate of nonsynonymous substitution: 6% match predictions of the neofunctionalization hypothesis in that (1) each paralog accumulated nonsynonymous substitutions at a significantly different rate and (2) the one that evolves faster has a higher ka/ks ratio than the other paralog and than a singleton ortholog. Fewer paralogs (3%) exhibit a complementary pattern of substitution at the protein level that is predicted by enhancement or degradation of different functional domains, and the remaining 13% have a higher average ka/ks ratio in both paralogs that is consistent with altered functional constraints, diversifying selection, or activity-reducing mutations after duplication. We estimate that these paralogs have been retained since they originated by genome duplication between 21 and 41 million years ago. Multiple mechanisms operate to promote the retained expression of duplicates in the same genome, in genes in the same functional class, over the same period of time following duplication, and sometimes in the same pair of paralogs. 
None of these paralogs are superfluous; degradation or enhancement of different protein subfunctions and neofunctionalization are plausible hypotheses for the retained expression of some of them. Evolution of most X. laevis paralogs, however, is consistent with retained expression via mechanisms that do not radically alter functional constraints, such as selection to preserve post-duplication stoichiometry or temporal, quantitative, or spatial subfunctionalization. Gene duplication plays a fundamental role in biological innovation but it is not clear how both copies of a duplicated gene manage to circumvent degradation by mutation if neither is unique. This study explores genetic mechanisms that could make each copy of a duplicate gene different, and therefore distinguishable and potentially preserved by natural selection. It is based on DNA sequences of the protein-coding region of 290 expressed duplicated genes in a frog, Xenopus laevis, that underwent complete duplication of its entire genome. Results provide evidence for multiple mechanisms acting within the same genome, within the same functional classes of genes, within the same period of time following duplication, and even on the same set of duplicated genes. Each copy of a duplicate gene may be subject to distinct evolutionary constraints, and this could be associated with degradation or enhancement of function. Functional constraints of most of these duplicates, however, are not substantially different from a single copy gene; their persistence in the first dozens of millions of years after duplication may more frequently be explained by mechanisms acting on their expression rather than their function.
Affiliation(s)
- Frédéric J. J. Chain
- Center for Environmental Genomics, Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Ben J Evans
- Center for Environmental Genomics, Department of Biology, McMaster University, Hamilton, Ontario, Canada
- * To whom correspondence should be addressed. E-mail:
106
Ermakova EO, Nurtdinov RN, Gelfand MS. Fast rate of evolution in alternatively spliced coding regions of mammalian genes. BMC Genomics 2006; 7:84. [PMID: 16620375 PMCID: PMC1459143 DOI: 10.1186/1471-2164-7-84] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 12/21/2005] [Accepted: 04/18/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND At least half of mammalian genes are alternatively spliced. Alternative isoforms are often genome-specific and it has been suggested that alternative splicing is one of the major mechanisms for generating protein diversity in the course of evolution. Another way of looking at alternative splicing is to consider sequence evolution of constitutive and alternative regions of protein-coding genes. Indeed, it turns out that constitutive and alternative regions evolve in different ways. RESULTS A set of 3029 orthologous pairs of human and mouse alternatively spliced genes was considered. The rate of nonsynonymous substitutions (dN), the rate of synonymous substitutions (dS), and their ratio (omega = dN/dS) appear to be significantly higher in alternatively spliced coding regions compared to constitutive regions. When N-terminal, internal and C-terminal alternatives are analysed separately, C-terminal alternatives appear to make the main contribution to the observed difference. The effects become even more pronounced in a subset of fast evolving genes. CONCLUSION These results provide evidence of weaker purifying selection and/or stronger positive selection in alternative regions and thus one more confirmation of accelerated evolution in alternative regions. This study corroborates the theory that alternative splicing serves as a testing ground for molecular evolution.
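The ratio omega = dN/dS used in this abstract can be illustrated with a short sketch. The numbers below are hypothetical placeholders chosen only to show the comparison between constitutive and alternative regions; they are not values reported by the study, and estimating dN and dS from real alignments requires a dedicated substitution model that is not shown here.

```python
def omega(dn: float, ds: float) -> float:
    """Return omega = dN/dS, the ratio of nonsynonymous to synonymous rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


# Hypothetical per-region rate estimates (illustrative values only).
constitutive = {"dN": 0.05, "dS": 0.50}
alternative = {"dN": 0.09, "dS": 0.60}

omega_constitutive = omega(constitutive["dN"], constitutive["dS"])
omega_alternative = omega(alternative["dN"], alternative["dS"])

# A higher omega in alternative regions is the pattern the abstract reports;
# it is compatible with weaker purifying selection and/or stronger positive selection.
alternative_is_faster = omega_alternative > omega_constitutive
```

A value of omega well below 1 is conventionally read as purifying selection, which is why the comparison between regions, rather than either value alone, carries the argument.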
Affiliation(s)
- Ekaterina O Ermakova
- Department of Bioengineering and Bioinformatics, Moscow State University, Vorob'evy gory, 1-73, 119992, Moscow, Russia
- Research and Training Center "Bioinformatics", Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny per. 19, 127994, Moscow, Russia
- Ramil N Nurtdinov
- Department of Bioengineering and Bioinformatics, Moscow State University, Vorob'evy gory, 1-73, 119992, Moscow, Russia
- Mikhail S Gelfand
- Department of Bioengineering and Bioinformatics, Moscow State University, Vorob'evy gory, 1-73, 119992, Moscow, Russia
- Research and Training Center "Bioinformatics", Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoi Karetny per. 19, 127994, Moscow, Russia
107
Abstract
MOTIVATION Amino acid changing mutations in proteins are constrained by purifying selection and accumulate at different rates. We estimate evolutionary rates on multiple alignments of eukaryotic protein families in a maximum likelihood framework and spot sets of slow and fast evolving proteins. RESULTS We find that the evolution of indispensable proteins is constrained by selection and that protein secretion is coupled to an increased evolutionary rate.
Affiliation(s)
- Hannes Luz
- Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany.
108
Kim SH, Yi SV. Correlated asymmetry of sequence and functional divergence between duplicate proteins of Saccharomyces cerevisiae. Mol Biol Evol 2006; 23:1068-75. [PMID: 16510556 DOI: 10.1093/molbev/msj115] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.1] [Indexed: 12/13/2022]
Abstract
The role of sequence divergence in functional divergence of duplicate genes is a topic of great interest. In this study, we compare the numbers of amino acid substitutions in each sequence since two yeast duplicates diverged, using a preduplication ancestral outgroup. Using this strategy, we explored the relationship between sequence divergence and functional divergence between duplicate partners. We show that the degree of relative functional asymmetry between duplicate proteins is proportional to the relative sequence divergence between them. Furthermore, of the two duplicates, the copy closer to their ancestral sequence (fewer amino acid substitutions) interacts with more proteins and affects fitness more severely when deleted. Therefore, asymmetric sequence divergence between duplicates is correlated with asymmetric functional divergence and may underlie the duplicate's role in genetic robustness against mutations. Among the functional traits considered, protein abundance appears to have the strongest correlation with the nonsynonymous divergence between duplicates. Taken together with the results from whole-genome analyses, our results indicate that within-species duplicates are subject to the same evolutionary force that acts on interspecific sequence and functional divergence. In particular, we detect signs of purifying selection on the more slowly evolving duplicate.
Affiliation(s)
- Seong-Ho Kim
- School of Biology, Georgia Institute of Technology, USA
109
Abstract
The last common ancestor between fish and mammals dates back to the very origin of the vertebrate lineage and today, half of modern vertebrates are fish. It is thus not surprising that several fish species have played important roles in recent years to advance our understanding of vertebrate genome evolution, to inform us on the structure of human genes, and, somewhat more unexpectedly, to provide leads to understanding the function of genes involved in human diseases. Genome sequence comparisons between such distantly related organisms are highly informative due to the accumulation of neutral mutations in nonfunctional regions. Yet humans and fishes share many developmental pathways, organ systems, and physiological mechanisms, making conclusions relevant to human biology. The respective advantages of zebrafish, medaka, Tetraodon, or Takifugu have been well exploited so far with bioinformatics analyses and molecular biology techniques. However the full potential of fish genomics is about to be unleashed with the integration of more traditional disciplines such as biochemistry and physiology, with the study of additional species such as carp, trout, or tilapia and a broadening of its applications to environmental genomics or aquaculture.
Affiliation(s)
- Hugues Roest Crollius
- Dyogen Lab, Centre National de la Recherche Scientifique UMR8541, Ecole Normale Supérieure, 75005 Paris, France.
110
Chapman BA, Bowers JE, Feltus FA, Paterson AH. Buffering of crucial functions by paleologous duplicated genes may contribute cyclicality to angiosperm genome duplication. Proc Natl Acad Sci U S A 2006; 103:2730-5. [PMID: 16467140 PMCID: PMC1413778 DOI: 10.1073/pnas.0507782103] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022]
Abstract
Genome duplication followed by massive gene loss has permanently shaped the genomes of many higher eukaryotes, particularly angiosperms. It has long been believed that a primary advantage of genome duplication is the opportunity for the evolution of genes with new functions by modification of duplicated genes. If so, then patterns of genetic diversity among strains within taxa might reveal footprints of selection that are consistent with this advantage. Contrary to classical predictions that duplicated genes may be relatively free to acquire unique functionality, we find among both Arabidopsis ecotypes and Oryza subspecies that SNPs encode less radical amino acid changes in genes for which there exists a duplicated copy at a "paleologous" locus than in "singleton" genes. Preferential retention of duplicated genes encoding long complex proteins and their unexpectedly slow divergence (perhaps because of homogenization) suggest that a primary advantage of retaining duplicated paleologs may be the buffering of crucial functions. Functional buffering and functional divergence may represent extremes in the spectrum of duplicated gene fates. Functional buffering may be especially important during "genomic turmoil" immediately after genome duplication but continues to act approximately 60 million years later, and its gradual deterioration may contribute cyclicality to genome duplication in some lineages.
Affiliation(s)
- Brad A. Chapman
- Plant Genome Mapping Laboratory and Departments of Plant Biology
- Andrew H. Paterson
- Plant Genome Mapping Laboratory and Departments of Plant Biology, Genetics, and Crop and Soil Science, University of Georgia, Athens, GA 30602
- To whom correspondence should be addressed at: Plant Genome Mapping Laboratory, University of Georgia, 111 Riverbend Road, Athens, GA 30602. E-mail:
111
Abstract
Orthologs and paralogs are two fundamentally different types of homologous genes that evolved, respectively, by vertical descent from a single ancestral gene and by duplication. Orthology and paralogy are key concepts of evolutionary genomics. A clear distinction between orthologs and paralogs is critical for the construction of a robust evolutionary classification of genes and reliable functional annotation of newly sequenced genomes. Genome comparisons show that orthologous relationships with genes from taxonomically distant species can be established for the majority of the genes from each sequenced genome. This review examines in depth the definitions and subtypes of orthologs and paralogs, outlines the principal methodological approaches employed for identification of orthology and paralogy, and considers evolutionary and functional implications of these concepts.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
112
Landry CR, Oh J, Hartl DL, Cavalieri D. Genome-wide scan reveals that genetic variation for transcriptional plasticity in yeast is biased towards multi-copy and dispensable genes. Gene 2006; 366:343-51. [PMID: 16427747 DOI: 10.1016/j.gene.2005.10.042] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Received: 05/26/2005] [Revised: 08/14/2005] [Accepted: 10/14/2005] [Indexed: 11/26/2022]
Abstract
One of the most important aspects of the evolution of development and physiology is the interplay between gene expression and the environment, by which traits become altered in response to environmental triggers. This feature is known as phenotypic plasticity. When different genotypes show different levels of plasticity for a trait, then they show genotype-by-environment interaction, or GEI. It is now clear that gene expression plays an important role in organismic-level phenotypic plasticity, but we know very little about whether gene expression itself is subject to genetic variation for phenotypic plasticity (GEI). Given that gene regulation is likely to have evolved to respond to environmental changes, it is of central importance to understand how environmental and genetic variation interact to produce variation in gene expression. Here we investigate genetic variation for phenotypic plasticity in the yeast transcriptome for the whole genome. Six strains of Saccharomyces cerevisiae were grown in four different environments representing a continuum of rich and poor natural conditions. Using DNA-microarray data and an ANOVA analysis with a stringent criterion of significance, we found significant genetic variation for transcriptional plasticity (GEI) among strains for approximately 5% of the genes in the genome. There are about twice as many genes that show genetic variation for phenotypic plasticity as show genetic variation in transcription level independent of the environment. We also found that genes with genetic variation for plasticity were less likely to be essential and were significantly biased towards genes that have paralogs.
Affiliation(s)
- Christian R Landry
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge MA, USA
113
Abstract
Gene duplication is key to molecular evolution in all three domains of life and may be the first step in the emergence of new gene function. It is a well-recognized feature in large DNA viruses but has not been studied extensively in the largest known virus to date, the recently discovered Acanthamoeba polyphaga Mimivirus. Here, I present a systematic analysis of gene and genome duplication events in the mimivirus genome. I found that one-third of the mimivirus genes are related to at least one other gene in the mimivirus genome, either through a large segmental genome duplication event that occurred in the more remote past or through more recent gene duplication events, which often occur in tandem. This shows that gene and genome duplication played a major role in shaping the mimivirus genome. Using multiple alignments, together with remote-homology detection methods based on Hidden Markov Model comparison, I assign putative functions to some of the paralogous gene families. I suggest that a large part of the duplicated mimivirus gene families are likely to interfere with important host cell processes, such as transcription control, protein degradation, and cell regulatory processes. My findings support the view that large DNA viruses are complex evolving organisms, possibly deeply rooted within the tree of life, and oppose the paradigm that viral evolution is dominated by lateral gene acquisition, at least in regard to large DNA viruses.
Affiliation(s)
- Karsten Suhre
- Information Génomique et Structurale, UPR CNRS 2589, 31 Chemin Joseph-Aiguier, 13402 Marseille Cedex 20, France.
114
Iovchev M, Boutanaev A, Ivanov I, Wolstenholme A, Nurminsky D, Semenov E. Phylogenetic shadowing of a histamine-gated chloride channel involved in insect vision. Insect Biochem Mol Biol 2006; 36:10-7. [PMID: 16360945 DOI: 10.1016/j.ibmb.2005.09.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 08/04/2005] [Revised: 09/27/2005] [Accepted: 09/27/2005] [Indexed: 05/05/2023]
Abstract
A recently identified gene, hclA (synonym: ort), codes for an ionotropic histamine receptor subunit in Drosophila melanogaster, and known hclA mutations lead to defects in the visual system, neurologic disorders and changed responsiveness to neurotoxins. To investigate whether this novel class of receptors is common across the Insecta, we analysed the genomes of 15 other insect species (Diptera, Hymenoptera, Coleoptera, Lepidoptera) and revealed orthologs of hclA in all of them. The predicted receptor domain of HCLA is extensively conserved (86-100% identity) among the 16 proteins. Minor changes in the amino acid sequence that includes the putative transmembrane domains (TMs) 1-3 were found in non-drosophilid species only. Substantial amino acid variability was observed in the signal polypeptides, the intracellular loop domains and in TM4, in good accordance with known data on sequence variations in ligand-gated ion channels. Pairwise comparisons revealed three consensus sequences for N-glycosylation, conserved in HCLAs of all species studied, as well as a drosophilid-specific putative phosphorylation site. Real-time PCR analysis demonstrated that hclA-mRNA is abundant in heads of adult Drosophila. However, species- and sex-specific variations of the hclA expression levels were also observed.
Affiliation(s)
- Mladen Iovchev
- Institute of Molecular Biology, Department of Molecular Neurobiology, Sofia 1113, Bulgaria
115
Kondrashov FA, Kondrashov AS. Role of selection in fixation of gene duplications. J Theor Biol 2005; 239:141-51. [PMID: 16242725 DOI: 10.1016/j.jtbi.2005.08.033] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.3] [Received: 02/05/2005] [Revised: 05/25/2005] [Accepted: 05/26/2005] [Indexed: 02/02/2023]
Abstract
New genes commonly appear through complete or partial duplications of pre-existing genes. Duplications of long DNA segments are constantly produced by rare mutations, may become fixed in a population by selection or random drift, and are subject to divergent evolution of the paralogous sequences after fixation, although gene conversion can impede this process. New data shed some light on each of these processes. Mutations which involve duplications can occur through at least two different mechanisms, backward strand slippage during DNA replication and unequal crossing-over. The background rate of duplication of a complete gene in humans is $10^{-9}$ to $10^{-10}$ per generation, although many genes located within hot-spots of large-scale mutation are duplicated much more often. Many gene duplications affect fitness strongly, and are responsible, through gene dosage effects, for a number of genetic diseases. However, high levels of intrapopulation polymorphism caused by presence or absence of long, gene-containing DNA segments imply that some duplications are not under strong selection. The polymorphism to fixation ratios appear to be approximately the same for gene duplications and for presumably selectively neutral nucleotide substitutions, which, according to the McDonald-Kreitman test, is consistent with selective neutrality of duplications. However, this pattern can also be due to negative selection against most segregating duplications and positive selection for at least some duplications which become fixed. Patterns in post-fixation evolution of duplicated genes do not easily reveal the causes of fixations. Many gene duplications which became fixed recently in a variety of organisms were positively selected because the increased expression of the corresponding genes was beneficial. The effects of gene dosage provide a unified framework for studying all phases of the life history of a gene duplication. Application of well-known methods of evolutionary genetics to accumulating data on new, polymorphic, and fixed duplications will enhance our understanding of the role of natural selection in evolution by gene duplication.
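The McDonald-Kreitman style comparison mentioned in this abstract contrasts polymorphism-to-fixation ratios between two classes of variants. A minimal sketch follows; the class name `VariantCounts` and all counts are hypothetical and chosen only to illustrate the arithmetic, not taken from the paper, and a real analysis would add a significance test on the 2x2 table.

```python
from dataclasses import dataclass


@dataclass
class VariantCounts:
    """Counts of segregating (polymorphic) and fixed variants for one class."""
    polymorphic: int
    fixed: int

    def polymorphism_to_fixation(self) -> float:
        if self.fixed == 0:
            raise ValueError("fixed count must be nonzero")
        return self.polymorphic / self.fixed


# Hypothetical counts, for illustration only.
duplications = VariantCounts(polymorphic=12, fixed=40)
neutral_substitutions = VariantCounts(polymorphic=300, fixed=1000)

# Ratio of the two polymorphism-to-fixation ratios. A value near 1 is
# consistent with neutrality of the tested class, although, as the abstract
# notes, a mix of negative and positive selection can produce the same pattern.
ratio_of_ratios = (
    duplications.polymorphism_to_fixation()
    / neutral_substitutions.polymorphism_to_fixation()
)
```

The toy counts were picked so that both classes give the same ratio, mirroring the "approximately the same" observation in the abstract.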
Affiliation(s)
- Fyodor A Kondrashov
- Rybka Research Institute, 25138 Woodfield School Rd., Gaithersburg, MD 20882, USA
116
He X, Zhang J. Gene complexity and gene duplicability. Curr Biol 2005; 15:1016-21. [PMID: 15936271 DOI: 10.1016/j.cub.2005.04.035] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Received: 01/19/2005] [Revised: 04/13/2005] [Accepted: 04/19/2005] [Indexed: 11/22/2022]
Abstract
Eukaryotic genes are on average more complex than prokaryotic genes in terms of expression regulation, protein length, and protein-domain structure [1-5]. Eukaryotes are also known to have a higher rate of gene duplication than prokaryotes do [6, 7]. Because gene duplication is the primary source of new genes [], the average gene complexity in a genome may have been increased by gene duplication if complex genes are preferentially duplicated. Here, we test this "gene complexity and gene duplicability" hypothesis with yeast genomic data. We show that, on average, duplicate genes from either whole-genome or individual-gene duplication have longer protein sequences, more functional domains, and more cis-regulatory motifs than singleton genes. This phenomenon is not a by-product of previously known mechanisms, such as protein function [10-13], evolutionary rate [14, 15], dosage [11], and dosage balance [16], that influence gene duplicability. Rather, it appears to have resulted from the sub-neo-functionalization process in duplicate-gene evolution [11]. Under this process, complex genes are more likely to be retained after duplication because they are prone to subfunctionalization, and gene complexity is regained via subsequent neofunctionalization. Thus, gene duplication increases both gene number and gene complexity, two important factors in the origin of genomic and organismal complexity.
Affiliation(s)
- Xionglei He
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor 48109, USA
|
117
|
Abstract
Gene duplication plays an important role in evolution because it is the primary source of new genes. Many recent studies showed that gene duplicability varies considerably among genes. Several considerations led us to hypothesize that less important genes have higher rates of successful duplications, where gene importance is measured by the fitness reduction caused by the deletion of the gene. Here, we test this hypothesis by comparing the importance of two groups of singleton genes in the yeast Saccharomyces cerevisiae (Sce). Group S genes did not duplicate in four other yeast species examined, whereas group D experienced duplication in these species. Consistent with our hypothesis, we found group D genes to be less important than group S genes. Specifically, 17% of group D genes are essential in Sce, compared to 28% for group S. Furthermore, deleting a group D gene in Sce reduces the fitness by 24% on average, compared to 38% for group S. Our subsequent analysis showed that less important genes have more cis-regulatory motifs, which could lead to a higher chance of subfunctionalization of duplicate genes and result in an enhanced rate of gene retention. Less important genes may also have weaker dosage imbalance effects and cause fewer genetic perturbations when duplicated. Regardless of the cause, our observation indicates that the previous finding of a less severe fitness consequence of deleting a duplicate gene than deleting a singleton gene is at least in part due to the fact that duplicate genes are intrinsically less important than singleton genes and suggests that the contribution of duplicate genes to genetic robustness has been overestimated.
Affiliation(s)
- Xionglei He
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, USA
|
118
|
Cusack BP, Wolfe KH. Changes in alternative splicing of human and mouse genes are accompanied by faster evolution of constitutive exons. Mol Biol Evol 2005; 22:2198-208. [PMID: 16049198 DOI: 10.1093/molbev/msi218] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022] Open
Abstract
Alternative splicing is known to be an important source of protein sequence variation, but its evolutionary impact has not been explored in detail. Studying alternative splicing requires extensive sampling of the transcriptome, but new data sets based on expressed sequence tags aligned to chromosomes make it possible to study alternative splicing on a genome-wide scale. Although genes showing alternative splicing by exon skipping are conserved as compared to the genome as a whole, we find that genes where structural differences between human and mouse result in genome-specific alternatively spliced exons in one species show almost 60% greater nonsynonymous divergence in constitutive exons than genes where exon skipping is conserved. This effect is also seen for genes showing species-specific patterns of alternative splicing where gene structure is conserved. Our observations are not attributable to an inherent difference in rate of evolution between these two sets of proteins or to differences with respect to predictors of evolutionary rate such as expression level, tissue specificity, or genetic redundancy. Where genome-specific alternatively spliced exons are seen in mammals, the vast majority of skipped exons appear to be recent additions to gene structures. Furthermore, among genes with genome-specific alternatively spliced exons, the degree of nonsynonymous divergence in constitutive sequence is a function of the frequency of incorporation of these alternative exons into transcripts. These results suggest that alterations in alternative splicing pattern can have knock-on effects in terms of accelerated sequence evolution in constant regions of the protein.
Affiliation(s)
- Brian P Cusack
- Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin, Ireland
|
119
|
Abstract
Over 35 years ago, Susumu Ohno stated that gene duplication was the single most important factor in evolution. He reiterated this point a few years later in proposing that without duplicated genes the creation of metazoans, vertebrates, and mammals from unicellular organisms would have been impossible. Such big leaps in evolution, he argued, required the creation of new gene loci with previously nonexistent functions. Bold statements such as these, combined with his proposal that at least one whole-genome duplication event facilitated the evolution of vertebrates, have made Ohno an icon in the literature on genome evolution. However, discussion on the occurrence and consequences of gene and genome duplication events has a much longer, and often neglected, history. Here we review literature dealing with the occurrence and consequences of gene duplication, beginning in 1911. We document conceptual and technological advances in gene duplication research from this early research in comparative cytology up to recent research on whole genomes, "transcriptomes," and "interactomes."
Affiliation(s)
- John S Taylor
- Department of Biology, University of Victoria, British Columbia V8W 3N5, Canada.
|
120
|
He X, Zhang J. Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 2005; 169:1157-64. [PMID: 15654095 PMCID: PMC1449125 DOI: 10.1534/genetics.104.037051] [Citation(s) in RCA: 498] [Impact Index Per Article: 26.2] [Received: 09/30/2004] [Accepted: 11/16/2004] [Indexed: 11/18/2022] Open
Abstract
Gene duplication is the primary source of new genes. Duplicate genes that are stably preserved in genomes usually have divergent functions. The general rules governing the functional divergence, however, are not well understood and are controversial. The neofunctionalization (NF) hypothesis asserts that after duplication one daughter gene retains the ancestral function while the other acquires new functions. In contrast, the subfunctionalization (SF) hypothesis argues that duplicate genes experience degenerate mutations that reduce their joint levels and patterns of activity to that of the single ancestral gene. We here show that neither NF nor SF alone adequately explains the genome-wide patterns of yeast protein interaction and human gene expression for duplicate genes. Instead, our analysis reveals rapid SF, accompanied by prolonged and substantial NF in a large proportion of duplicate genes, suggesting a new model termed subneofunctionalization (SNF). Our results demonstrate that enormous numbers of new functions have originated via gene duplication.
Affiliation(s)
- Xionglei He
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
|
121
|
Tripoli G, D'Elia D, Barsanti P, Caggese C. Comparison of the oxidative phosphorylation (OXPHOS) nuclear genes in the genomes of Drosophila melanogaster, Drosophila pseudoobscura and Anopheles gambiae. Genome Biol 2005; 6:R11. [PMID: 15693940 PMCID: PMC551531 DOI: 10.1186/gb-2005-6-2-r11] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 09/24/2004] [Revised: 12/08/2004] [Accepted: 01/07/2005] [Indexed: 01/16/2023] Open
Abstract
An analysis of nuclear-encoded oxidative phosphorylation genes in Drosophila and Anopheles reveals that pairs of duplicated genes have strikingly different expression patterns. Background: In eukaryotic cells, oxidative phosphorylation (OXPHOS) uses the products of both nuclear and mitochondrial genes to generate cellular ATP. Interspecies comparative analysis of these genes, which appear to be under strong functional constraints, may shed light on the evolutionary mechanisms that act on a set of genes correlated by function and subcellular localization of their products. Results: We have identified and annotated the Drosophila melanogaster, D. pseudoobscura and Anopheles gambiae orthologs of 78 nuclear genes encoding mitochondrial proteins involved in oxidative phosphorylation by a comparative analysis of their genomic sequences and organization. We have also identified 47 genes in these three dipteran species each of which shares significant sequence homology with one of the above-mentioned OXPHOS orthologs, and which are likely to have originated by duplication during evolution. Gene structure and intron length are essentially conserved in the three species, although gain or loss of introns is common in A. gambiae. In most tissues of D. melanogaster and A. gambiae the expression level of the duplicate gene is much lower than that of the original gene, and in D. melanogaster at least, its expression is almost always strongly testis-biased, in contrast to the soma-biased expression of the parent gene. Conclusions: Quickly achieving an expression pattern different from the parent genes may be required for new OXPHOS gene duplicates to be maintained in the genome. This may be a general evolutionary mechanism for originating phenotypic changes that could lead to species differentiation.
Affiliation(s)
- Gaetano Tripoli
- University of Bari, DAPEG Section of Genetics, via Amendola 165/A, 70126 Bari, Italy
- Domenica D'Elia
- CNR, Institute of Biomedical Technology, Section of Bari, via Amendola 122/D, 70126 Bari, Italy
- Paolo Barsanti
- University of Bari, DAPEG Section of Genetics, via Amendola 165/A, 70126 Bari, Italy
- Corrado Caggese
- University of Bari, DAPEG Section of Genetics, via Amendola 165/A, 70126 Bari, Italy
|
122
|
Abstract
The resolution of the complete sequences of several hemiascomycete genomes provides new insights into the ways that yeast genomes change in size and in gene contents. These genomes provide evidence of whole-genome duplication occurring before the divergence of Saccharomyces cerevisiae and Candida glabrata, followed by massive gene loss that restored diploidy. The pattern of genome evolution in yeast differs from that in bacteria apparently as a result of stronger selective constraints on bacterial chromosomes.
Affiliation(s)
- Howard Ochman
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, AZ 87521, USA.
|
123
|
Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models. BMC Evol Biol 2004; 4:32. [PMID: 15357876 PMCID: PMC523855 DOI: 10.1186/1471-2148-4-32] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 03/10/2004] [Accepted: 09/09/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The size distribution of gene families in a broad range of genomes is well approximated by a generalized Pareto function. Evolution of ensembles of gene families can be described with Birth, Death, and Innovation Models (BDIMs). Analysis of the properties of different versions of BDIMs has the potential of revealing important features of genome evolution. RESULTS: In this work, we extend our previous analysis of stochastic BDIMs. In addition to the previously examined rational BDIMs, we introduce potentially more realistic logistic BDIMs, in which birth/death rates are limited for the largest families, and show that their properties are similar to those of models that include no such limitation. We show that the mean time required for the formation of the largest gene families detected in eukaryotic genomes is limited by the mean number of duplications per gene and does not increase indefinitely with the model degree. Instead, this time reaches a minimum value, which corresponds to a non-linear rational BDIM with the degree of approximately 2.7. Even for this BDIM, the mean time of the largest family formation is orders of magnitude greater than any realistic estimates based on the timescale of life's evolution. We employed the embedding chains technique to estimate the expected number of elementary evolutionary events (gene duplications and deletions) preceding the formation of gene families of the observed size and found that the mean number of events exceeds the family size by orders of magnitude, suggesting a highly dynamic process of genome evolution. The variance of the time required for the formation of the largest families was found to be extremely large, with the coefficient of variation >> 1. This indicates that some gene families might grow much faster than the mean rate such that the minimal time required for family formation is more relevant for a realistic representation of genome evolution than the mean time. We determined this minimal time using Monte Carlo simulations of family growth from an ensemble of simultaneously evolving singletons. In these simulations, the time elapsed before the formation of the largest family was much shorter than the estimated mean time and was compatible with the timescale of evolution of eukaryotes. CONCLUSIONS: The analysis of stochastic BDIMs presented here shows that non-linear versions of such models can well approximate not only the size distribution of gene families but also the dynamics of their formation during genome evolution. The fact that only higher degree BDIMs are compatible with the observed characteristics of genome evolution suggests that the growth of gene families is self-accelerating, which might reflect differential selective pressure acting on different genes.
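The Monte Carlo idea in this abstract (growing families from an ensemble of simultaneously evolving singletons and recording when the first family reaches a target size) can be sketched as follows. This is a minimal sketch under strong assumptions: it uses the simplest linear birth-death-innovation model with constant per-gene rates, not the rational or logistic BDIMs analyzed in the paper, and the function name `simulate_bdim` and all parameter values are hypothetical.

```python
import random


def simulate_bdim(n_singletons, birth, death, innovation,
                  target_size, max_events, seed=0):
    """Gillespie-style simulation of a linear birth-death-innovation model.

    Each gene duplicates at rate `birth` and is lost at rate `death`;
    new singleton families arrive at rate `innovation`.
    Returns (time, events, sizes); time is None if no family reached
    `target_size` within `max_events` events.
    """
    rng = random.Random(seed)
    sizes = [1] * n_singletons  # one entry per family
    t = 0.0
    events = 0
    while events < max_events:
        total_genes = sum(sizes)
        rate = (birth + death) * total_genes + innovation
        if rate <= 0:
            break  # nothing can happen any more
        t += rng.expovariate(rate)  # waiting time to the next event
        events += 1
        if rng.random() * rate < innovation:
            sizes.append(1)  # innovation: a new singleton family
        else:
            # Picking a gene uniformly selects a family in proportion
            # to its size.
            idx = rng.choices(range(len(sizes)), weights=sizes)[0]
            if rng.random() < birth / (birth + death):
                sizes[idx] += 1  # duplication
            else:
                sizes[idx] -= 1  # gene loss
                if sizes[idx] == 0:
                    sizes.pop(idx)  # family extinct
        if sizes and max(sizes) >= target_size:
            return t, events, sizes
    return None, events, sizes


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    t, events, sizes = simulate_bdim(50, 1.0, 1.0, 0.5, 5, 100_000, seed=1)
    print(t, events, max(sizes) if sizes else 0)
```

Repeating the run over many seeds and taking the smallest returned time gives an empirical analogue of the "minimal time" discussed above. The count of events relative to the target size loosely mirrors the abstract's point that many more elementary events occur than the final family size suggests.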
|
124
|
Jordan IK, Wolf YI, Koonin EV. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol Biol 2004; 4:22. [PMID: 15238160 PMCID: PMC481058 DOI: 10.1186/1471-2148-4-22] [Citation(s) in RCA: 156] [Impact Index Per Article: 7.8] [Received: 03/26/2004] [Accepted: 07/06/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Gene duplication is an important mechanism that can lead to the emergence of new functions during evolution. The impact of duplication on the mode of gene evolution has been the subject of several theoretical and empirical comparative-genomic studies. It has been shown that, shortly after the duplication, genes seem to experience a considerable relaxation of purifying selection. RESULTS: Here we demonstrate two opposite effects of gene duplication on evolutionary rates. Sequence comparisons between paralogs show that, in accord with previous observations, a substantial acceleration in the evolution of paralogs occurs after duplication, presumably due to relaxation of purifying selection. The effect of gene duplication on evolutionary rate was also assessed by sequence comparison between orthologs that have paralogs (duplicates) and those that do not (singletons). It is shown that, in eukaryotes, duplicates, on average, evolve significantly slower than singletons. Eukaryotic ortholog evolutionary rates for duplicates are also negatively correlated with the number of paralogs per gene and the strength of selection between paralogs. A tally of annotated gene functions shows that duplicates tend to be enriched for proteins with known functions, particularly those involved in signaling and related cellular processes; by contrast, singletons include an over-abundance of poorly characterized proteins. CONCLUSIONS: These results suggest that whether or not a gene duplicate is retained by selection depends critically on the pre-existing functional utility of the protein encoded by the ancestral singleton. Duplicates of genes of a higher biological import, which are subject to strong functional constraints on the sequence, are retained relatively more often. Thus, the evolutionary trajectory of duplicated genes appears to be determined by two opposing trends, namely, the post-duplication rate acceleration and the generally slow evolutionary rate owing to the high level of functional constraints.
MESH Headings
- Animals
- Base Composition/genetics
- DNA/genetics
- DNA, Archaeal/genetics
- DNA, Bacterial/genetics
- Evolution, Molecular
- Genes/genetics
- Genes/physiology
- Genes, Archaeal/genetics
- Genes, Archaeal/physiology
- Genes, Bacterial/genetics
- Genes, Bacterial/physiology
- Genes, Duplicate/genetics
- Genes, Duplicate/physiology
- Genes, Fungal/genetics
- Genes, Fungal/physiology
- Genes, Insect/genetics
- Genes, Insect/physiology
- Gram-Negative Bacteria/genetics
- Gram-Positive Bacteria/genetics
- Humans
- Mice
- Mutation/genetics
- Sequence Homology, Nucleic Acid
Affiliation(s)
- I King Jordan
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|