151
Chen Y, Dokholyan NV. The coordinated evolution of yeast proteins is constrained by functional modularity. Trends Genet 2006; 22:416-9. [PMID: 16797778 DOI: 10.1016/j.tig.2006.06.008] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 10/24/2005] [Revised: 04/07/2006] [Accepted: 06/02/2006] [Indexed: 11/20/2022]
Abstract
Functional modularity is a key attribute of cellular systems and has important roles in evolution. However, the extent to which functional modularity affects protein evolution is largely unknown. Here, we analyzed the evolution of both sequence and expression level of proteins in the yeast Saccharomyces cerevisiae and found that proteins within the same functional modules evolve at more similar rates than those between different modules. We also found stronger co-evolution of expression levels between proteins within functional modules than between them. These results suggest that a coordinated evolution of both the sequence and expression level of proteins is constrained by functional modularity.
Affiliation(s)
- Yiwen Chen
- Department of Physics and Astronomy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
152
Heizer EM, Raiford DW, Raymer ML, Doom TE, Miller RV, Krane DE. Amino Acid Cost and Codon-Usage Biases in 6 Prokaryotic Genomes: A Whole-Genome Analysis. Mol Biol Evol 2006; 23:1670-80. [PMID: 16754641 DOI: 10.1093/molbev/msl029] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022] Open
Abstract
For most prokaryotic organisms, amino acid biosynthesis represents a significant portion of their overall energy budget. The difference in the cost of synthesis between amino acids can be striking, differing by as much as 7-fold. Two prokaryotic organisms, Escherichia coli and Bacillus subtilis, have been shown to preferentially utilize less costly amino acids in highly expressed genes, indicating that parsimony in amino acid selection may confer a selective advantage for prokaryotes. This study confirms those findings and extends them to 4 additional prokaryotic organisms: Chlamydia trachomatis, Chlamydophila pneumoniae AR39, Synechocystis sp. PCC 6803, and Thermus thermophilus HB27. Adherence to codon-usage biases for each of these 6 organisms is inversely correlated with a coding region's average amino acid biosynthetic cost in a fashion that is independent of chemoheterotrophic, photoautotrophic, or thermophilic lifestyle. The obligate parasites C. trachomatis and C. pneumoniae AR39 are incapable of synthesizing many of the 20 common amino acids. Removing auxotrophic amino acids from consideration in these organisms does not alter the overall trend of preferential use of energetically inexpensive amino acids in highly expressed genes.
Affiliation(s)
- Esley M Heizer
- Department of Biological Sciences, Wright State University, USA
153
McInerney JO. The causes of protein evolutionary rate variation. Trends Ecol Evol 2006; 21:230-2. [PMID: 16697908 DOI: 10.1016/j.tree.2006.03.008] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 12/15/2005] [Revised: 03/02/2006] [Accepted: 03/13/2006] [Indexed: 11/22/2022]
Abstract
The rate of protein evolution varies more than 1000-fold and, for the past 30 years, it was thought that the rate was determined by protein function. Drummond and co-workers have now shown that a single factor underlying mRNA expression, protein abundance and synonymous codon usage is the chief causal agent of protein evolutionary rate in yeast. It will be interesting to see whether this is shown to be a universal rule for all biological systems.
Affiliation(s)
- James O McInerney
- Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland.
154
Wu G, Nie L, Zhang W. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance. Biochem Biophys Res Commun 2006; 344:114-21. [PMID: 16603130 DOI: 10.1016/j.bbrc.2006.03.124] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/14/2006] [Accepted: 03/21/2006] [Indexed: 11/29/2022]
Abstract
The context-dependent expression of genes is central to biological activities, and significant attention has been given to identifying the factors that contribute to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g., codon usage and amino acid usage). In this study, multiple regression analyses of mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together. Using the AlignACE program, 442 over-represented motifs were identified from the upstream 100 bp region of 293 genes located in the known regulons. Regression of mRNA expression data against the measures of coding and non-coding sequence features indicated that 54.1% of the variations in mRNA abundance can be explained by the presence of upstream motifs, while coding sequences alone account for 29.7% of the variations in mRNA abundance. Interestingly, most of the contribution from coding sequences overlaps with that from upstream motifs; thereby a total of 60.3% of the variations in mRNA abundance can be explained when both coding and non-coding information are included. This result demonstrates that upstream regulatory motifs and coding sequence information contribute to the overall mRNA expression in a combinatorial rather than an additive manner.
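The overlap described in this abstract can be made concrete with a little arithmetic. The sketch below partitions the three reported percentages into unique and shared parts; reading them as a simple commonality-style partition is an assumption made here for illustration, not a calculation reported by the authors, and all variable names are hypothetical.

```python
# Percentages of variance in mRNA abundance quoted in the abstract.
r2_upstream = 54.1   # upstream motifs alone
r2_coding = 29.7     # coding-sequence features alone
r2_combined = 60.3   # both feature sets together

# Assumed partition: whatever the two sets explain separately beyond the
# combined figure must be variance they explain in common.
shared = round(r2_upstream + r2_coding - r2_combined, 1)
unique_upstream = round(r2_combined - r2_coding, 1)
unique_coding = round(r2_combined - r2_upstream, 1)

print(shared, unique_upstream, unique_coding)
```

Under this reading, most of the 29.7% attributed to coding sequences is shared with upstream motifs, which is the sense in which the two contributions are not additive.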
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
155
Xu L, Chen H, Hu X, Zhang R, Zhang Z, Luo ZW. Average Gene Length Is Highly Conserved in Prokaryotes and Eukaryotes and Diverges Only Between the Two Kingdoms. Mol Biol Evol 2006; 23:1107-8. [PMID: 16611645 DOI: 10.1093/molbev/msk019] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Indexed: 11/14/2022] Open
Abstract
The average length of genes in a eukaryote is larger than in a prokaryote, implying that evolution of complexity is related to change of gene lengths. Here, we show that although the average lengths of genes in prokaryotes and eukaryotes are much different, the average lengths of genes are highly conserved within either of the two kingdoms. This suggests that natural selection has clearly set a strong limitation on gene elongation within the kingdom. Furthermore, the average gene size adds another distinct characteristic for the discrimination between the two kingdoms of organisms.
156
Kotlar D, Lavner Y. The action of selection on codon bias in the human genome is related to frequency, complexity, and chronology of amino acids. BMC Genomics 2006; 7:67. [PMID: 16584540 PMCID: PMC1456966 DOI: 10.1186/1471-2164-7-67] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 10/20/2005] [Accepted: 04/03/2006] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: The question of whether synonymous codon choice is affected by cellular tRNA abundance has been positively answered in many organisms. In some recent works, concerning the human genome, this relation has been studied, but no conclusive answers have been found. In the human genome, the variation in base composition and the absence of cellular tRNA count data makes the study of the question more complicated. In this work we study the relation between codon choice and tRNA abundance in the human genome by correcting relative codon usage for background base composition and using a measure based on tRNA-gene copy numbers as a rough estimate of tRNA abundance.
RESULTS: We term major codons to be those codons with a relatively large tRNA-gene copy number for their corresponding amino acid. We use two measures of expression: breadth of expression (the number of tissues in which a gene was expressed) and maximum expression level among tissues (the highest value of expression of a gene among tissues). We show that for half the amino acids in the study (8 of 16) the relative major codon usage rises with breadth of expression. We show that these amino acids are significantly more frequent, are smaller and simpler, and are more ancient than the rest of the amino acids. Similar, although weaker, results were obtained for maximum expression level.
CONCLUSION: There is evidence that codon bias in the human genome is related to selection, although the selection forces acting on codon bias may not be straightforward and may be different for different amino acids. We suggest that, in the first group of amino acids, selection acts to enhance translation efficiency in highly expressed genes by preferring major codons, and acts to reduce translation rate in lowly expressed genes by preferring non-major ones. In the second group of amino acids other selection forces, such as reducing misincorporation rate of expensive amino acids, in terms of their size/complexity, may be in action. The fact that codon usage is more strongly related to breadth of expression than to maximum expression level supports the notion, presented in a recent study, that codon choice may be related to the tRNA abundance in the tissue in which a gene is expressed.
Affiliation(s)
- Daniel Kotlar
- Department of Computer Science, Tel-Hai Academic College, Upper Galilee, 12210, Israel
- Yizhar Lavner
- Department of Computer Science, Tel-Hai Academic College, Upper Galilee, 12210, Israel
157
Gilchrist MA, Wagner A. A model of protein translation including codon bias, nonsense errors, and ribosome recycling. J Theor Biol 2006; 239:417-34. [PMID: 16171830 DOI: 10.1016/j.jtbi.2005.08.007] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Received: 01/18/2005] [Revised: 08/05/2005] [Accepted: 08/08/2005] [Indexed: 11/15/2022]
Abstract
We present and analyse a model of protein translation at the scale of an individual messenger RNA (mRNA) transcript. The model we develop is unique in that it incorporates the phenomena of ribosome recycling and nonsense errors. The model conceptualizes translation as a probabilistic wave of ribosome occupancy traveling down a heterogeneous medium, the mRNA transcript. Our results show that the heterogeneity of the codon translation rates along the mRNA results in short-scale spikes and dips in the wave. Nonsense errors attenuate this wave on a longer scale while ribosome recycling reinforces it. We find that the combination of nonsense errors and codon usage bias can have a large effect on the probability that a ribosome will completely translate a transcript. We also elucidate how these forces interact with ribosome recycling to determine the overall translation rate of an mRNA transcript. We derive a simple cost function for nonsense errors using our model and apply this function to the yeast (Saccharomyces cerevisiae) genome. Using this function we are able to detect position-dependent selection on codon bias which correlates with gene expression levels as predicted a priori. These results indirectly validate our underlying model assumptions and confirm that nonsense errors can play an important role in shaping codon usage bias.
Affiliation(s)
- Michael A Gilchrist
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 37996, USA.
158
Sällström B, Arnaout RA, Davids W, Bjelkmar P, Andersson SGE. Protein evolutionary rates correlate with expression independently of synonymous substitutions in Helicobacter pylori. J Mol Evol 2006; 62:600-14. [PMID: 16586017 DOI: 10.1007/s00239-005-0104-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/25/2005] [Accepted: 12/20/2005] [Indexed: 11/29/2022]
Abstract
In free-living microorganisms, such as Escherichia coli and Saccharomyces cerevisiae, both synonymous and nonsynonymous substitution frequencies correlate with expression levels. Here, we have tested the hypothesis that the correlation between amino acid substitution rates and expression is a by-product of selection for codon bias and translational efficiency in highly expressed genes. To this end, we have examined the correlation between protein evolutionary rates and expression in the human gastric pathogen Helicobacter pylori, where the absence of selection on synonymous sites enables the two types of substitutions to be uncoupled. The results revealed a statistically significant negative correlation between expression levels and nonsynonymous substitutions in both H. pylori and E. coli. We also found that neighboring genes located on the same, but not on opposite strands, evolve at significantly more similar rates than random gene pairs, as expected by co-expression of genes located in the same operon. However, the two species differ in that synonymous substitutions show a strand-specific pattern in E. coli, whereas the weak similarity in synonymous substitutions for neighbors in H. pylori is independent of gene orientation. These results suggest a direct influence of expression levels on nonsynonymous substitution frequencies independent of codon bias and selective constraints on synonymous sites.
Affiliation(s)
- Björn Sällström
- Program of Molecular Evolution, Department of Evolution, Genomics and Systematics, Evolutionary Biology Center, Uppsala University, 752 36 Uppsala, Sweden
159
Xing Y, Lee C. Can RNA selection pressure distort the measurement of Ka/Ks? Gene 2006; 370:1-5. [PMID: 16488091 DOI: 10.1016/j.gene.2005.12.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 10/11/2005] [Revised: 12/15/2005] [Accepted: 12/20/2005] [Indexed: 11/24/2022]
Abstract
Recently, an interesting question has emerged in the evolutionary interpretation of sequence substitution data as evidence of amino acid selection pressure. Specifically, the Ka/Ks metric was designed to measure selection pressure on amino acid substitutions, assuming that the synonymous substitution rate Ks reflects the neutral nucleotide substitution rate. However, there is increasing evidence for selection pressure at silent sites due to constraints of RNA splicing. Is Ka/Ks an appropriate metric for selection pressure on amino acid substitutions, in the presence of other selection pressures acting only at the RNA level (such as selection for exonic splicing enhancers)? Or can the resulting decreases in Ks from such selection pressures introduce bias into the Ka/Ks metric, so that it no longer gives an accurate measure of amino acid level selection pressure? In this review, we present both mathematical models and empirical evidence for these divergent points of view.
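The bias in question is easy to see numerically. The sketch below is only an illustration with made-up values: the rates, the 20% constraint at silent sites, and the helper name `ka_ks` are all hypothetical and are not taken from the review.

```python
def ka_ks(ka, ks):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks

# Hypothetical values chosen purely for illustration.
ka = 0.02                 # nonsynonymous substitutions per site
neutral_ks = 0.10         # synonymous rate if silent sites were neutral
silent_constraint = 0.20  # assumed fraction of silent changes removed by RNA-level selection

observed_ks = neutral_ks * (1 - silent_constraint)

unbiased = ka_ks(ka, neutral_ks)   # ratio against the true neutral rate
inflated = ka_ks(ka, observed_ks)  # ratio against the depressed Ks

print(unbiased, inflated)
```

With the same Ka, a lower observed Ks yields a higher Ka/Ks, so selection acting only at the RNA level could make amino acid level constraint look weaker than it is.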
Affiliation(s)
- Yi Xing
- Molecular Biology Institute, Center for Genomics and Proteomics, Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
160
Herbeck JT, Wall DP. Converging on a general model of protein evolution. Trends Biotechnol 2006; 23:485-7. [PMID: 16054255 DOI: 10.1016/j.tibtech.2005.07.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 05/26/2005] [Revised: 06/08/2005] [Accepted: 07/18/2005] [Indexed: 10/25/2022]
Abstract
The availability of high-throughput genomic databases that establish protein dispensability, expression and interaction networks enables rigorous tests of competing models of protein evolution. Recent research utilizing these new data sets shows that protein evolution is more complex than was previously thought. Several variables, including protein dispensability, expression, functional density, and genetic modularity, appear to have independent effects on the evolutionary rate of proteins, suggesting that proteomes have evolved via an assembly of selectional regimes. These results indicate that a general model of protein evolution will emerge as more functional genomic data from a diversity of organisms accumulate.
Affiliation(s)
- Joshua T Herbeck
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98103, USA.
161
Akashi H, Ko WY, Piao S, John A, Goel P, Lin CF, Vitins AP. Molecular evolution in the Drosophila melanogaster species subgroup: frequent parameter fluctuations on the timescale of molecular divergence. Genetics 2005; 172:1711-26. [PMID: 16387879 PMCID: PMC1456288 DOI: 10.1534/genetics.105.049676] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 11/18/2022] Open
Abstract
Although mutation, genetic drift, and natural selection are well established as determinants of genome evolution, the importance (frequency and magnitude) of parameter fluctuations in molecular evolution is less understood. DNA sequence comparisons among closely related species allow specific substitutions to be assigned to lineages on a phylogenetic tree. In this study, we compare patterns of codon usage and protein evolution in 22 genes (>11,000 codons) among Drosophila melanogaster and five relatives within the D. melanogaster subgroup. We assign changes to eight lineages using a maximum-likelihood approach to infer ancestral states. Uncertainty in ancestral reconstructions is taken into account, at least to some extent, by weighting reconstructions by their posterior probabilities. Four of the eight lineages show potentially genomewide departures from equilibrium synonymous codon usage; three are decreasing and one is increasing in major codon usage. Several of these departures are consistent with lineage-specific changes in selection intensity (selection coefficients scaled to effective population size) at silent sites. Intron base composition and rates and patterns of protein evolution are also heterogeneous among these lineages. The magnitude of forces governing silent, intron, and protein evolution appears to have varied frequently, and in a lineage-specific manner, within the D. melanogaster subgroup.
Affiliation(s)
- Hiroshi Akashi
- Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
162
Kondrashov FA, Ogurtsov AY, Kondrashov AS. Selection in favor of nucleotides G and C diversifies evolution rates and levels of polymorphism at mammalian synonymous sites. J Theor Biol 2005; 240:616-26. [PMID: 16343547 DOI: 10.1016/j.jtbi.2005.10.020] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Received: 06/20/2005] [Revised: 10/26/2005] [Accepted: 10/27/2005] [Indexed: 11/24/2022]
Abstract
The impact of synonymous nucleotide substitutions on fitness in mammals remains controversial. Despite some indications of selective constraint, synonymous sites are often assumed to be neutral, and the rate of their evolution is used as a proxy for mutation rate. We subdivide all sites into four classes in terms of the mutable CpG context, nonCpG, postC, preG, and postCpreG, and compare four-fold synonymous sites and intron sites residing outside transposable elements. The distribution of the rate of evolution across all synonymous sites is trimodal. Rate of evolution at nonCpG synonymous sites, not preceded by C and not followed by G, is approximately 10% below that at such intron sites. In contrast, rate of evolution at postCpreG synonymous sites is approximately 30% above that at such intron sites. Finally, synonymous and intron postC and preG sites evolve at similar rates. The relationship between the levels of polymorphism at the corresponding synonymous and intron sites is very similar to that between their rates of evolution. Within every class, synonymous sites are occupied by G or C much more often than intron sites, whose nucleotide composition is consistent with neutral mutation-drift equilibrium. These patterns suggest that synonymous sites are under weak selection in favor of G and C, with the average coefficient s ≈ 0.25/Ne ≈ 10^-5, where Ne is the effective population size. Such selection decelerates evolution and reduces variability at sites with symmetric mutation, but has the opposite effects at sites where the favored nucleotides are more mutable. The amino-acid composition of proteins dictates that many synonymous sites are CpG-prone, which causes them, on average, to evolve faster and to be more polymorphic than intron sites. An average genotype carries approximately 10^7 suboptimal nucleotides at synonymous sites, implying synergistic epistasis in selection against them.
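The two approximations quoted for the selection coefficient together imply an effective population size, which the following worked arithmetic makes explicit. The variable names are hypothetical, and the implied value is a back-of-the-envelope consequence of the rounded figures in the abstract, not a number the authors state.

```python
# Rounded figures from the abstract: s is roughly 0.25 / Ne and roughly 1e-5.
scaled_coefficient = 0.25      # product Ne * s
selection_coefficient = 1e-5   # s

# Solving s = 0.25 / Ne for Ne.
implied_ne = scaled_coefficient / selection_coefficient

print(round(implied_ne))  # 25000
```

A product Ne·s well below 1 is what makes this selection weak: drift dominates at any single site, and the effect only becomes visible when many sites are pooled.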
Affiliation(s)
- Fyodor A Kondrashov
- Section of Ecology, Behavior and Evolution, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0346, USA.
163
Popescu CE, Borza T, Bielawski JP, Lee RW. Evolutionary rates and expression level in Chlamydomonas. Genetics 2005; 172:1567-76. [PMID: 16361241 PMCID: PMC1456299 DOI: 10.1534/genetics.105.047399] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/18/2022] Open
Abstract
In many biological systems, especially bacteria and unicellular eukaryotes, rates of synonymous and nonsynonymous nucleotide divergence are negatively correlated with the level of gene expression, a phenomenon that has been attributed to natural selection. Surprisingly, this relationship has not been examined in many important groups, including the unicellular model organism Chlamydomonas reinhardtii. Prior to this study, comparative data on protein-coding sequences from C. reinhardtii and its close noninterfertile relative C. incerta were very limited. We compiled and analyzed protein-coding sequences for 67 nuclear genes from these taxa; the sequences were mostly obtained from the C. reinhardtii EST database and our C. incerta EST data. Compositional and synonymous codon usage biases varied among genes within each species but were highly correlated between the orthologous genes of the two species. Relative rates of synonymous and nonsynonymous substitution across genes varied widely and showed a strong negative correlation with the level of gene expression estimated by the codon adaptation index. Our comparative analysis of substitution rates in introns of lowly and highly expressed genes suggests that natural selection has a larger contribution than mutation to the observed correlation between evolutionary rates and gene expression level in Chlamydomonas.
Affiliation(s)
- Cristina E Popescu
- Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada
164
Drummond DA, Raval A, Wilke CO. A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 2005; 23:327-37. [PMID: 16237209 DOI: 10.1093/molbev/msj038] [Citation(s) in RCA: 303] [Impact Index Per Article: 15.9] [Indexed: 11/13/2022] Open
Abstract
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
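Principal component regression, as described in this abstract, regresses the response on the principal components of the predictors rather than on the correlated predictors themselves. The following is a minimal sketch of that idea using NumPy on synthetic data; it is not the authors' code, and the function name, the three noisy predictors, and the single latent factor are assumptions made for illustration.

```python
import numpy as np

def principal_component_regression(X, y):
    """Return the share of variance in y explained by each principal component of X."""
    # Standardize predictors and centre the response.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    # Principal components from the SVD of the standardized predictors.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * S  # component scores; columns are mutually orthogonal
    # Orthogonality lets each coefficient be computed as a separate projection.
    coefs = scores.T @ yc / (S ** 2)
    # Per-component R^2; these add up to the ordinary multiple-regression R^2.
    r2 = (coefs ** 2) * (S ** 2) / (yc @ yc)
    return r2, Vt

# Synthetic data: three noisy readouts of one latent factor (hypothetical stand-ins
# for correlated predictors) and a response driven by the same factor.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.5 * rng.normal(size=n) for _ in range(3)])
y = -latent + 0.5 * rng.normal(size=n)

r2, loadings = principal_component_regression(X, y)
print(r2)
```

Because the components are uncorrelated, the explained variance splits cleanly between them, which avoids the ambiguity that arises when noisy, correlated predictors compete in partial correlation or ordinary multiple regression.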
Affiliation(s)
- D Allan Drummond
- Program in Computation and Neural Systems, California Institute of Technology, Pasadena, USA
165
Carbone A, Madden R. Insights on the evolution of metabolic networks of unicellular translationally biased organisms from transcriptomic data and sequence analysis. J Mol Evol 2005; 61:456-69. [PMID: 16187158 DOI: 10.1007/s00239-004-0317-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/09/2004] [Accepted: 04/20/2005] [Indexed: 11/27/2022]
Abstract
Codon bias is related to metabolic functions in translationally biased organisms, and we argue two points. First, genes with high codon bias describe in meaningful ways the metabolic characteristics of the organism; important metabolic pathways corresponding to crucial characteristics of the lifestyle of an organism, such as photosynthesis, nitrification, anaerobic versus aerobic respiration, sulfate reduction, methanogenesis, and others, happen to involve especially biased genes. Second, gene transcriptional levels of sets of experiments representing a significant variation of biological conditions strikingly confirm, in the case of Saccharomyces cerevisiae, that metabolic preferences are detectable by purely statistical analysis: the high metabolic activity of yeast during fermentation is encoded in the high bias of enzymes involved in the associated pathways, suggesting that this genome was affected by a strong evolutionary pressure that favored a predominantly fermentative metabolism of yeast in the wild. The ensemble of metabolic pathways involving enzymes with high codon bias is rather well defined and remains consistent across many species; even species that have not been considered translationally biased, such as Helicobacter pylori, reveal a weak form of translational bias. We provide numerical evidence, supported by experimental data, of these facts and conclude that the metabolic networks of translationally biased genomes, observable today as projections of eons of evolutionary pressure, can be analyzed numerically and predictions of the role of specific pathways during evolution can be derived. The new concepts of Comparative Pathway Index, used to compare organisms with respect to their metabolic networks, and Evolutionary Pathway Index, used to detect evolutionarily meaningful bias in the genetic code from transcriptional data, are introduced.
Affiliation(s)
- Alessandra Carbone
- Génomique Analytique, Université Pierre et Marie Curie, INSERM U511, 91 Bd de l'Hôpital, 75013 Paris, France.
166
Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH. Why highly expressed proteins evolve slowly. Proc Natl Acad Sci U S A 2005; 102:14338-43. [PMID: 16176987 PMCID: PMC1242296 DOI: 10.1073/pnas.0504070102] [Citation(s) in RCA: 584] [Impact Index Per Article: 30.7] [Indexed: 11/18/2022] Open
Abstract
Much recent work has explored molecular and population-genetic constraints on the rate of protein sequence evolution. The best predictor of evolutionary rate is expression level, for reasons that have remained unexplained. Here, we hypothesize that selection to reduce the burden of protein misfolding will favor protein sequences with increased robustness to translational missense errors. Pressure for translational robustness increases with expression level and constrains sequence evolution. Using several sequenced yeast genomes, global expression and protein abundance data, and sets of paralogs traceable to an ancient whole-genome duplication in yeast, we rule out several confounding effects and show that expression level explains roughly half the variation in Saccharomyces cerevisiae protein evolutionary rates. We examine causes for expression's dominant role and find that genome-wide tests favor the translational robustness explanation over existing hypotheses that invoke constraints on function or translational efficiency. Our results suggest that proteins evolve at rates largely unrelated to their functions and can explain why highly expressed proteins evolve slowly across the tree of life.
Affiliation(s)
- D Allan Drummond
- Program in Computation and Neural Systems and Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125-4100, USA.
167
Stenøien HK, Stephan W. Global mRNA stability is not associated with levels of gene expression in Drosophila melanogaster but shows a negative correlation with codon bias. J Mol Evol 2005; 61:306-14. [PMID: 16044249 DOI: 10.1007/s00239-004-0271-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 09/02/2004] [Accepted: 03/16/2005] [Indexed: 11/26/2022]
Abstract
A multitude of factors contribute to the regulation of gene expression in living cells. The relationship between codon usage bias and gene expression has been extensively studied, and it has been shown that codon bias may have adaptive significance in many unicellular and multicellular organisms. Given the central role of mRNA in post-transcriptional regulation, we hypothesize that mRNA stability is another important factor associated either with positive or negative regulation of gene expression. We have conducted genome-wide studies of the association between gene expression (measured as transcript abundance in public EST databases), mRNA stability, codon bias, GC content, and gene length in Drosophila melanogaster. To remove potential bias of gene length inherently present in EST libraries, gene expression is measured as normalized transcript abundance. It is demonstrated that codon bias and GC content in second codon position are positively associated with transcript abundance. Gene length is negatively associated with transcript abundance. The stability of thermodynamically predicted mRNA secondary structures is not associated with transcript abundance, but there is a negative correlation between mRNA stability and codon bias. This finding does not support the hypothesis that codon bias has evolved as an indirect consequence of selection favoring thermodynamically stable mRNA molecules.
Affiliation(s)
- Hans K Stenøien
- Plant Ecology/Department of Ecology and Evolution, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden

168
Qin H, Wu WB, Comeron JM, Kreitman M, Li WH. Intragenic spatial patterns of codon usage bias in prokaryotic and eukaryotic genomes. Genetics 2005; 168:2245-60. [PMID: 15611189 PMCID: PMC1448744 DOI: 10.1534/genetics.104.030866] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022] Open
Abstract
To study the roles of translational accuracy, translational efficiency, and the Hill-Robertson effect in codon usage bias, we studied the intragenic spatial distribution of synonymous codon usage bias in four prokaryotic (Escherichia coli, Bacillus subtilis, Sulfolobus tokodaii, and Thermotoga maritima) and two eukaryotic (Saccharomyces cerevisiae and Drosophila melanogaster) genomes. We generated supersequences at each codon position across genes in a genome and computed the overall bias at each codon position. By quantitatively evaluating the trend of spatial patterns using isotonic regression, we show that in yeast and prokaryotic genomes, codon usage bias increases along translational direction, which is consistent with purifying selection against nonsense errors. Fruit fly genes show a nearly symmetric M-shaped spatial pattern of codon usage bias, with less bias in the middle and both ends. The low codon usage bias in the middle region is best explained by interference (the Hill-Robertson effect) between selections at different codon positions. In both yeast and fruit fly, spatial patterns of codon usage bias are characteristically different from patterns of GC-content variations. Effect of expression level on the strength of codon usage bias is more conspicuous than its effect on the shape of the spatial distribution.
Affiliation(s)
- Hong Qin
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA

169
Hambuch TM, Parsch J. Patterns of synonymous codon usage in Drosophila melanogaster genes with sex-biased expression. Genetics 2005; 170:1691-700. [PMID: 15937136 PMCID: PMC1449783 DOI: 10.1534/genetics.104.038109] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022] Open
Abstract
The nonrandom use of synonymous codons (codon bias) is a well-established phenomenon in Drosophila. Recent reports suggest that levels of codon bias differ among genes that are differentially expressed between the sexes, with male-expressed genes showing less codon bias than female-expressed genes. To examine the relationship between sex-biased gene expression and level of codon bias on a genomic scale, we surveyed synonymous codon usage in 7276 D. melanogaster genes that were classified as male-, female-, or non-sex-biased in their expression in microarray experiments. We found that male-biased genes have significantly less codon bias than both female- and non-sex-biased genes. This pattern holds for both germline and somatically expressed genes. Furthermore, we find a significantly negative correlation between level of codon bias and degree of sex-biased expression for male-biased genes. In contrast, female-biased genes do not differ from non-sex-biased genes in their level of codon bias and show a significantly positive correlation between codon bias and degree of sex-biased expression. These observations cannot be explained by differences in chromosomal distribution, mutational processes, recombinational environment, gene length, or absolute expression level among genes of the different expression classes. We propose that the observed codon bias differences result from differences in selection at synonymous and/or linked nonsynonymous sites between genes with male- and female-biased expression.
Affiliation(s)
- Tina M Hambuch
- Section of Evolutionary Biology, Department of Biology II, University of Munich (LMU), 82152 Munich, Germany

170
Subramanian S, Kumar S. Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics 2005; 168:373-81. [PMID: 15454550 PMCID: PMC1448110 DOI: 10.1534/genetics.104.028944] [Citation(s) in RCA: 196] [Impact Index Per Article: 10.3] [Indexed: 11/18/2022] Open
Abstract
Natural selection leaves its footprints on protein-coding sequences by modulating their silent and replacement evolutionary rates. In highly expressed genes in invertebrates, these footprints are seen in the higher codon usage bias and lower synonymous divergence. In mammals, the highly expressed genes have a shorter gene length in the genome and the breadth of expression is known to constrain the rate of protein evolution. Here we have examined how the rates of evolution of proteins encoded by the vertebrate genomes are modulated by the amount (intensity) of gene expression. To understand how natural selection operates on proteins that appear to have arisen in earlier and later phases of animal evolution, we have contrasted patterns of mouse proteins that have homologs in invertebrate and protist genomes (Precambrian genes) with those that do not have such detectable homologs (vertebrate-specific genes). We find that the intensity of gene expression relates inversely to the rate of protein sequence evolution on a genomic scale. The most highly expressed genes actually show the lowest total number of substitutions per polypeptide, consistent with cumulative effects of purifying selection on individual amino acid replacements. Precambrian genes exhibit a more pronounced difference in protein evolutionary rates (up to three times) between the genes with high and low expression levels as compared to the vertebrate-specific genes, which appears to be due to the narrower breadth of expression of the vertebrate-specific genes. These results provide insights into the differential relationship and effect of the increasing complexity of animal body form on evolutionary rates of proteins.
Affiliation(s)
- Sankar Subramanian
- Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, Tempe 85287-4501, USA

171
Kay AD, Ashton IW, Gorokhova E, Kerkhoff AJ, Liess A, Litchman E. Toward a stoichiometric framework for evolutionary biology. OIKOS 2005. [DOI: 10.1111/j.0030-1299.2005.14048.x] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.5] [Indexed: 11/27/2022]

172
Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, Eisen MB, Feldman MW. Functional genomic analysis of the rates of protein evolution. Proc Natl Acad Sci U S A 2005; 102:5483-8. [PMID: 15800036 PMCID: PMC555735 DOI: 10.1073/pnas.0501761102] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.8] [Indexed: 11/18/2022] Open
Abstract
The evolutionary rates of proteins vary over several orders of magnitude. Recent work suggests that analysis of large data sets of evolutionary rates in conjunction with the results from high-throughput functional genomic experiments can identify the factors that cause proteins to evolve at such dramatically different rates. To this end, we estimated the evolutionary rates of >3,000 proteins in four species of the yeast genus Saccharomyces and investigated their relationship with levels of expression and protein dispensability. Each protein's dispensability was estimated by the growth rate of mutants deficient for the protein. Our analyses of these improved evolutionary and functional genomic data sets yield three main results. First, dispensability and expression have independent, significant effects on the rate of protein evolution. Second, measurements of expression levels in the laboratory can be used to filter data sets of dispensability estimates, removing variates that are unlikely to reflect real biological effects. Third, structural equation models show that although we may reasonably infer that dispensability and expression have significant effects on protein evolutionary rate, we cannot yet accurately estimate the relative strengths of these effects.
Affiliation(s)
- Dennis P Wall
- Department of Biological Sciences, and Stanford Genome Technology Center, Stanford University, Stanford, CA 94305, USA.

173
Abstract
We have found a negative correlation between evolutionary rate at the protein level (as measured by d(N)) and intron size in Drosophila. Although such a relation is expected if introns reduce Hill-Robertson interference within genes, it seems more likely to be explained by the higher abundance of cis-regulatory elements in introns (especially first introns) in genes under strong selective constraints.
Affiliation(s)
- Gabriel Marais
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, UK

174
Marais G, Domazet-Loso T, Tautz D, Charlesworth B. Correlated evolution of synonymous and nonsynonymous sites in Drosophila. J Mol Evol 2005; 59:771-9. [PMID: 15599509 DOI: 10.1007/s00239-004-2671-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 01/15/2004] [Accepted: 06/30/2004] [Indexed: 11/28/2022]
Abstract
Recent work has shown that Drosophila melanogaster genes with fast-evolving nonsynonymous sites have lower codon usage bias. This pattern has been attributed to interference between positive selection at nonsynonymous sites and weak selection on codon usage. Here we have looked for this correlation in a much larger and less biased dataset, comprising 630 gene pairs from D. melanogaster and D. yakuba. We confirmed that there is a negative correlation between the rate of nonsynonymous substitutions (d(N)) and codon bias in D. melanogaster. We then tested the interference hypothesis and other alternative explanations, including one involving gene expression. We found that d(N) indeed correlates with the level of gene expression. Given that gene expression is a strong determinant of codon bias, the relationship between d(N) and codon bias might be a by-product of gene expression. However, our tests show that none of the hypotheses we consider seem to explain the data fully.
Affiliation(s)
- Gabriel Marais
- Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh, EH9 3JT, Scotland, UK

175
Raghava GPS, Han JH. Correlation and prediction of gene expression level from amino acid and dipeptide composition of its protein. BMC Bioinformatics 2005; 6:59. [PMID: 15773999 PMCID: PMC1083413 DOI: 10.1186/1471-2105-6-59] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 08/14/2004] [Accepted: 03/17/2005] [Indexed: 11/29/2022] Open
Abstract
Background: A large number of papers have been published on the analysis of microarray data, with particular emphasis on normalization of data, detection of differentially expressed genes, clustering of genes and regulatory networks. On the other hand, there are only a few studies on the relation between expression level and the composition of the nucleotide/protein sequence, using expression data. There is a need to understand why particular genes/proteins are expressed more in particular conditions. In this study, we analyze 3468 genes of Saccharomyces cerevisiae obtained from Holstege et al. (1998) to understand the relationship between expression level and amino acid composition.
Results: We compute the correlation between the expression of a gene and the amino acid composition of its protein. It was observed that some residues (like Ala, Gly, Arg and Val) have a significant positive correlation (r > 0.20) and some other residues (like Asp, Leu, Asn and Ser) have a negative correlation (r < -0.15) with the expression of genes. A significant negative correlation (r = -0.18) was also found between length and gene expression. These observations indicate the relationship between percent composition and gene expression level. Thus, attempts have been made to develop a Support Vector Machine (SVM) based method for predicting the expression level of genes from their protein sequence. In this method the SVM is trained with proteins whose gene expression data is known in a given condition. The trained SVM is then used to predict the gene expression of other proteins of the same organism in the same condition. A correlation coefficient r = 0.70 was obtained between predicted and experimentally determined expression of genes, which improves from r = 0.70 to 0.72 when dipeptide composition was used instead of residue composition. The method was evaluated using a 5-fold cross-validation test. We also demonstrate that amino acid composition information along with gene expression data can be used for improving the function classification of proteins.
Conclusion: There is a correlation between gene expression and amino acid composition that can be used to predict the expression level of genes up to a certain extent. A web server based on the above strategy has been developed for calculating the correlation between amino acid composition and gene expression and for prediction of expression level. This server will allow users to study the evolution from expression data.
Affiliation(s)
- Gajendra PS Raghava
- Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31 Hyo-Ja Dong, Pohang 790–784, Republic of Korea
- Bioinformatics Centre, Institute of Microbial Technology, Sector 39A, Chandigarh-160036, India
- Joon H Han
- Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31 Hyo-Ja Dong, Pohang 790–784, Republic of Korea

176
Carlini DB. Context-dependent codon bias and messenger RNA longevity in the yeast transcriptome. Mol Biol Evol 2005; 22:1403-11. [PMID: 15772378 DOI: 10.1093/molbev/msi135] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
Context-dependent codon bias and its relationship with messenger RNA (mRNA) longevity was examined in 4,648 mRNA transcripts of the Saccharomyces cerevisiae transcriptome for which mRNA half-lives have been empirically determined. Surprisingly, rare codon usage (codons used <13 times per 1,000 codons in the genome) increased with mRNA half-life. However, it is shown that this pattern was not due to preference for rare codon use within codon families containing both rare and nonrare codons. Rather, the pattern was due to an increase in the frequency of amino acids encoded solely by rare codons, and a decrease in the frequency of amino acids never encoded by rare codons, with mRNA half-life. When standardized by open reading frame length, the use of consecutive rare codons was also positively correlated with mRNA half-life. There was negative correlation between the usage of synonymous A|T dinucleotides spanning codon boundaries and mRNA half-life, despite the fact that the frequency of AT dinucleotide usage overall, and AT dinucleotide usage at other codon position contexts (e.g., 1-2, 2-3, or 3|1 total), was not correlated with mRNA half-life. The use of A|T dinucleotides at synonymous dicodon boundaries could potentially allow for more efficient 3'-5' degradation by endonucleolytic cleavage.

177
Comeron JM. Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics 2005; 167:1293-304. [PMID: 15280243 PMCID: PMC1470943 DOI: 10.1534/genetics.104.026351] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.3] [Indexed: 01/09/2023] Open
Abstract
We report the results of a comprehensive study of the influence of gene expression on synonymous codons, amino acid composition, and intron presence and size in human protein-coding genes. First, in addition to a strong effect of isochores, we have detected the influence of transcription-associated mutational biases (TAMB) on gene composition. Genes expressed in different tissues show diverse degrees of TAMB, with genes expressed in testis showing the greatest influence. Second, the study of tissues with no evidence of TAMB reveals a consistent set of optimal synonymous codons favored in highly expressed genes. This result exposes the consequences of natural selection on synonymous composition to increase efficiency of translation in the human lineage. Third, overall amino acid composition of proteins closely resembles tRNA abundance but there is no difference in amino acid composition in differentially expressed genes. Fourth, there is a negative relationship between expression and CDS length. Significantly, this is observed only among genes with introns, suggesting that the cause for this relationship in humans cannot be associated only with costs of amino acid biosynthesis. Fifth, we show that broadly and highly expressed genes have more, although shorter, introns. The selective advantage for having more introns in highly expressed genes is likely counterbalanced by containment of transcriptional costs and a minimum exon size for proper splicing.
Affiliation(s)
- Josep M Comeron
- Department of Biological Sciences, University of Iowa, Iowa City, Iowa 52242, USA.

178
Chin CS, Chuang JH, Li H. Genome-wide regulatory complexity in yeast promoters: separation of functionally conserved and neutral sequence. Genome Res 2005; 15:205-13. [PMID: 15653830 PMCID: PMC546519 DOI: 10.1101/gr.3243305] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 09/08/2004] [Accepted: 11/23/2004] [Indexed: 11/25/2022]
Abstract
To gauge the complexity of gene regulation in yeast, it is essential to know how much promoter sequence is functional. Conservation across species can be a sensitive means of detecting functional sequences, provided that the significance of conservation can be accurately calibrated with the local neutral mutation rate. By analyzing yeast coding and promoter sequences, we find that neutral mutation rates in yeast are uniform genome-wide, in contrast to mammals, where neutral mutation rates vary along chromosomes. We develop an approach that uses this uniform rate to estimate the amount of promoter sequence under purifying selection. This amount is approximately 30%, corresponding to roughly 90 bp for a typical promoter. Furthermore, using a hidden Markov model, we are able to separate each promoter into distinct high and low conservation regions. Known regulatory motifs are strongly biased toward high conservation regions, while low conservation regions have mutation rates similar to that of the neutral background. Certain Gene Ontology groupings of genes (e.g., Carbohydrate Metabolism) have large amounts of high conservation sequence, suggesting complexity in their transcriptional regulation. Others (e.g., RNA Processing) have little high conservation sequence and are likely to be simply regulated. The separation of functionally conserved sequence from the neutral background allows us to estimate the complexity of cis-regulation on a genomic scale.
Affiliation(s)
- Chen-Shan Chin
- Department of Biochemistry and Biophysics, University of California, San Francisco, California 94143, USA

179
Gu Z, David L, Petrov D, Jones T, Davis RW, Steinmetz LM. Elevated evolutionary rates in the laboratory strain of Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 2005; 102:1092-7. [PMID: 15647350 PMCID: PMC545845 DOI: 10.1073/pnas.0409159102] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.2] [Indexed: 11/18/2022] Open
Abstract
By using the maximum likelihood method, we made a genome-wide comparison of the evolutionary rates in the lineages leading to the laboratory strain (S288c) and a wild strain (YJM789) of Saccharomyces cerevisiae and found that genes in the laboratory strain tend to evolve faster than in the wild strain. The pattern of elevated evolution suggests that relaxation of selection intensity is the dominant underlying reason, which is consistent with recurrent bottlenecks in the S. cerevisiae laboratory strain population. Supporting this conclusion are the following observations: (i) the increases in nonsynonymous evolutionary rate occur for genes in all functional categories; (ii) most of the synonymous evolutionary rate increases in S288c occur in genes with strong codon usage bias; (iii) genes under stronger negative selection have a larger increase in nonsynonymous evolutionary rate; and (iv) more genes with adaptive evolution were detected in the laboratory strain, but they do not account for the majority of the increased evolution. The present discoveries suggest that experimental and possible industrial manipulations of the laboratory strain of yeast could have had a strong effect on the genetic makeup of this model organism. Furthermore, they imply an evolution of laboratory model organisms away from their wild counterparts, questioning the relevancy of the models especially when extensive laboratory cultivation has occurred. In addition, these results shed light on the evolution of livestock and crop species that have been under human domestication for years.
Affiliation(s)
- Zhenglong Gu
- Stanford Genome Technology Center, 855 California Avenue, Palo Alto, CA 94304, USA

180
Fadiel A, Lithwick S, Naftolin F. The influence of environmental adaptation on bacterial genome structure. Lett Appl Microbiol 2005; 40:12-8. [PMID: 15612996 DOI: 10.1111/j.1472-765x.2004.01619.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
AIMS: Researchers have long been puzzled by the diversity of life. Now that the complete genomic sequence of many organisms has been determined, it is possible to evaluate the impact of organismal variation on sequence structure or vice versa. The aim of this investigation was to explore genomic changes mandated by organismal adaptation to its ecological niches.
METHODS AND RESULTS: Coding sequences from three phylogenetically related bacterial species, namely Mycoplasma genitalium, M. pneumoniae and Ureaplasma urealyticum, were subject to in-depth sequence analyses. M. genitalium and M. pneumoniae both belong to the genus Mycoplasma while U. urealyticum is a member of the genus Ureaplasma. However, M. genitalium and U. urealyticum are urogenital pathogens while M. pneumoniae is a respiratory pathogen. Complete transcriptomes were downloaded from NCBI for each species, and were subject to in silico investigation using in-house software and public sequence analysis tools. Clear similarities in transcriptome structure were identified among the functionally similar species M. genitalium and U. urealyticum while no such relationship was identified among the phylogenetically related species M. genitalium and M. pneumoniae.
CONCLUSIONS: It is plausible to conclude that, in these bacterial species, environmental stimuli might be more influential in shaping sequence signatures than phylogenetic relationships.
SIGNIFICANCE AND IMPACT OF THE STUDY: This study suggests that molecular signatures within the transcriptomes of the species examined are likely to be a product of evolutionary adaptation to diverse environmental ecological stimuli, and not a result of common phylogeny.
Affiliation(s)
- A Fadiel
- The Bioinformatics Supercomputing Centre, The Hospital for Sick Children, Toronto, ON, Canada.

181
Xia X. Mutation and selection on the anticodon of tRNA genes in vertebrate mitochondrial genomes. Gene 2004; 345:13-20. [PMID: 15716092 DOI: 10.1016/j.gene.2004.11.019] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Received: 07/14/2004] [Revised: 10/14/2004] [Accepted: 11/07/2004] [Indexed: 11/22/2022]
Abstract
The H-strand of vertebrate mitochondrial DNA is left single-stranded for hours during the slow DNA replication. This facilitates C-->U mutations on the H-strand (and consequently G-->A mutations on the L-strand) via spontaneous deamination which occurs much more frequently on single-stranded than on double-stranded DNA. For the 12 coding sequences (CDS) collinear with the L-strand, NNY synonymous codon families (where N stands for any of the four nucleotides and Y stands for either C or U) end mostly with C, and NNR and NNN codon families (where R stands for either A or G) end mostly with A. For the lone ND6 gene on the other strand, the codon bias is the opposite, with NNY codon families ending mostly with U and NNR and NNN codon families ending mostly with G. These patterns are consistent with the strand-specific mutation bias. The codon usage biased towards C-ending and A-ending in the 12 CDS sequences affects the codon-anticodon adaptation. The wobble site of the anticodon is always G for NNY codon families dominated by C-ending codons and U for NNR and NNN codon families dominated by A-ending codons. The only, but consistent, exception is the anticodon of tRNA-Met which consistently has a 5'-CAU-3' anticodon base-pairing with the AUG codon (the translation initiation codon) instead of the more frequent AUA. The observed CAU anticodon (matching AUG) would increase the rate of translation initiation but would reduce the rate of peptide elongation because most methionine codons are AUA, whereas the unobserved UAU anticodon (matching AUA) would increase the elongation rate at the cost of translation initiation rate. The consistent CAU anticodon in tRNA-Met suggests the importance of maximizing the rate of translation initiation.
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, 150 Louis, P.O. Box 450, Station A, Ottawa, Ontario, Canada K1N 6N5.

182
Mougel F, Manichanh C, Duchateau N'guyen G, Termier M. Genomic Choice of Codons in 16 Microbial Species. J Biomol Struct Dyn 2004; 22:315-29. [PMID: 15473705 DOI: 10.1080/07391102.2004.10507003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
Abstract
We study the codon usage over whole set of ORFs of 16 unicellular microbial species: eight archaebacteria, seven eubacteria, and one eukarya. We first try to define, for each species, the neutral expected codon usage to better approach subsequently the influence of selection. Overlapping triplets counted from the complete DNA genomic sequence and mean amino acid composition of ORFs allow us to build satisfying expected codon usage for each species. Within species deviation from this neutral model is then studied through Correspondence Analysis and characterization with bias index, N(C)' (effective number of codons reported to neutral model). Our results are compared to previously published ones for three species and let appear good agreement in spite of very different methods. We thus propose set of codons probably preferred by selection for nine other species. In the four last species, no clear preference can be evidenced. Finally, we characterize variation of codon usage over functional categories. We propose that the high degree of bias of proteins involved in translation, ribosomal structure and biogenesis has a positive influence on overexpression of the corresponding genes under optimum growth conditions and is a negative regulator of the same genes when amino acids become limited resources.
Affiliation(s)
- F Mougel
- Bioinformatique des Genomes, IGM, bat. 400, 91405 Orsay CEDEX, France

183
Wright SI, Yau CBK, Looseley M, Meyers BC. Effects of gene expression on molecular evolution in Arabidopsis thaliana and Arabidopsis lyrata. Mol Biol Evol 2004; 21:1719-26. [PMID: 15201397 DOI: 10.1093/molbev/msh191] [Citation(s) in RCA: 123] [Impact Index Per Article: 6.2] [Indexed: 01/12/2023] Open
Abstract
We analyzed the complete genome sequence of Arabidopsis thaliana and sequence data from 83 genes in the outcrossing A. lyrata, to better understand the role of gene expression on the strength of natural selection on synonymous and replacement sites in Arabidopsis. From data on tRNA gene abundance, we find a good concordance between codon preferences and the relative abundance of isoaccepting tRNAs in the complete A. thaliana genome, consistent with models of translational selection. Both EST-based and new quantitative measures of gene expression (MPSS) suggest that codon preferences derived from information on tRNA abundance are more strongly associated with gene expression than those obtained from multivariate analysis, which provides further support for the hypothesis that codon bias in Arabidopsis is under selection mediated by tRNA abundance. Consistent with previous results, analysis of protein evolution reveals a significant correlation between gene expression level and amino acid substitution rate. Analysis by MPSS estimates of gene expression suggests that this effect is primarily the result of a correlation between the number of tissues in which a gene is expressed and the rate of amino acid substitution, which indicates that the degree of tissue specialization may be an important determinant of the rate of protein evolution in Arabidopsis.

184
Fraser HB, Hirsh AE, Wall DP, Eisen MB. Coevolution of gene expression among interacting proteins. Proc Natl Acad Sci U S A 2004; 101:9033-8. [PMID: 15175431 PMCID: PMC439012 DOI: 10.1073/pnas.0402591101] [Citation(s) in RCA: 167] [Impact Index Per Article: 8.4] [Indexed: 11/18/2022] Open
Abstract
Physically interacting proteins or parts of proteins are expected to evolve in a coordinated manner that preserves proper interactions. Such coevolution at the amino acid-sequence level is well documented and has been used to predict interacting proteins, domains, and amino acids. Interacting proteins are also often precisely coexpressed with one another, presumably to maintain proper stoichiometry among interacting components. Here, we show that the expression levels of physically interacting proteins coevolve. We estimate average expression levels of genes from four closely related fungi of the genus Saccharomyces using the codon adaptation index and show that expression levels of interacting proteins exhibit coordinated changes in these different species. We find that this coevolution of expression is a more powerful predictor of physical interaction than is coevolution of amino acid sequence. These results demonstrate that gene expression levels can coevolve, adding another dimension to the study of the coevolution of interacting proteins and underscoring the importance of maintaining coexpression of interacting proteins over evolutionary time. Our results also suggest that expression coevolution can be used for computational prediction of protein-protein interactions.
Affiliation(s)
- Hunter B Fraser
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA.