51
Haddrill PR, Bachtrog D, Andolfatto P. Positive and negative selection on noncoding DNA in Drosophila simulans. Mol Biol Evol 2008; 25:1825-34. [PMID: 18515263 DOI: 10.1093/molbev/msn125] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.4] [Indexed: 11/12/2022]
Abstract
There is now a wealth of evidence that some of the most important regions of the genome are found outside those that encode proteins, and noncoding regions of the genome have been shown to be subject to substantial levels of selective constraint, particularly in Drosophila. Recent work has suggested that these regions may also have been subject to the action of positive selection, with large fractions of noncoding divergence having been driven to fixation by adaptive evolution. However, this work has focused on Drosophila melanogaster, which is thought to have experienced a reduction in effective population size (N(e)), and thus a reduction in the efficacy of selection, compared with its closest relative Drosophila simulans. Here, we examine patterns of evolution at several classes of noncoding DNA in D. simulans and find that all noncoding DNA is subject to the action of negative selection, indicated by reduced levels of polymorphism and divergence and a skew in the frequency spectrum toward rare variants. We find that the signature of negative selection on noncoding DNA and nonsynonymous sites is obscured to some extent by purifying selection acting on preferred to unpreferred synonymous codon mutations. We investigate the extent to which divergence in noncoding DNA is inferred to be the product of positive selection and to what extent these inferences depend on selection on synonymous sites and demography. Based on patterns of polymorphism and divergence for different classes of synonymous substitution, we find the divergence excess inferred in noncoding DNA and nonsynonymous sites in the D. simulans lineage difficult to reconcile with demographic explanations.
Affiliation(s)
- Penelope R Haddrill
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
52
African Drosophila melanogaster and D. simulans populations have similar levels of sequence variability, suggesting comparable effective population sizes. Genetics 2008; 178:405-12. [PMID: 18202383 DOI: 10.1534/genetics.107.080200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Drosophila melanogaster and D. simulans are two closely related species with a similar distribution range. Many studies suggested that D. melanogaster has a smaller effective population size than D. simulans. As most evidence was derived from non-African populations, we readdressed this question by sequencing 10 X-linked loci in five African D. simulans and six African D. melanogaster populations. Contrary to previous results, we found no evidence for higher variability, and thus larger effective population size, in D. simulans. Our observation of similar levels of variability of both species will have important implications for the interpretation of patterns of molecular evolution.
53
Evaluating Evolutionary Constraint on the Rapidly Evolving Gene matK Using Protein Composition. J Mol Evol 2007; 66:85-97. [DOI: 10.1007/s00239-007-9060-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 04/27/2007] [Revised: 08/10/2007] [Accepted: 11/19/2007] [Indexed: 10/22/2022]
54
Macpherson JM, Sella G, Davis JC, Petrov DA. Genomewide spatial correspondence between nonsynonymous divergence and neutral polymorphism reveals extensive adaptation in Drosophila. Genetics 2007; 177:2083-99. [PMID: 18073425 PMCID: PMC2219485 DOI: 10.1534/genetics.107.080226] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.8] [Received: 08/09/2007] [Accepted: 09/18/2007] [Indexed: 11/18/2022]
Abstract
The effect of recurrent selective sweeps is a spatially heterogeneous reduction in neutral polymorphism throughout the genome. The pattern of reduction depends on the selective advantage and recurrence rate of the sweeps. Because many adaptive substitutions responsible for these sweeps also contribute to nonsynonymous divergence, the spatial distribution of nonsynonymous divergence also reflects the distribution of adaptive substitutions. Thus, the spatial correspondence between neutral polymorphism and nonsynonymous divergence may be especially informative about the process of adaptation. Here we study this correspondence using genomewide polymorphism data from Drosophila simulans and the divergence between D. simulans and D. melanogaster. Focusing on highly recombining portions of the autosomes, at a spatial scale appropriate to the study of selective sweeps, we find that neutral polymorphism is both lower and, as measured by a new statistic Q(S), less homogeneous where nonsynonymous divergence is higher and that the spatial structure of this correlation is best explained by the action of strong recurrent selective sweeps. We introduce a method to infer, from the spatial correspondence between polymorphism and divergence, the rate and selective strength of adaptation. Our results independently confirm a high rate of adaptive substitution (approximately 1/3000 generations) and newly suggest that many adaptations are of surprisingly great selective effect (approximately 1%), reducing the effective population size by approximately 15% even in highly recombining regions of the genome.
55
Molina N, van Nimwegen E. Universal patterns of purifying selection at noncoding positions in bacteria. Genome Res 2007; 18:148-60. [PMID: 18032729 DOI: 10.1101/gr.6759507] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Indexed: 11/24/2022]
Abstract
To investigate the dependence of the number of regulatory sites per intergenic region on genome size, we developed a new method for detecting purifying selection at noncoding positions in clades of related bacterial genomes. We comprehensively quantified evidence of purifying selection at noncoding positions across bacteria and found several striking universal patterns. Consistent with selection acting at transcriptional regulatory elements near the transcription start, we find a universal positional profile of selection with respect to gene starts and ends, showing most evidence of selection immediately upstream and least immediately downstream from genes. A further set of universal features indicates that selection for translation initiation efficiency is the major determinant of the sequence composition around translation start in all clades. In addition to a peak in selection at ribosomal binding sites, the region immediately around translation start shows a universal pattern of high adenine frequency, significant selection at silent positions, and avoidance of RNA secondary structure. Surprisingly, although the number of transcription factors (TF) increases quadratically with genome size, we present several lines of evidence that small and large genomes have the same average number of regulatory sites per intergenic region. By comparing the sequence diversity of the most and least conserved DNA words in intergenic regions across clades we provide evidence that the structure of transcription regulatory networks changes dramatically with genome size: Small genomes have a small number of TFs with a large number of target sites, whereas large genomes have a large number of TFs with a small number of target sites each.
Affiliation(s)
- Nacho Molina
- Biozentrum, the University of Basel, and Swiss Institute of Bioinformatics, 4056-CH, Basel, Switzerland
56
Guo X, Wang Y, Keightley PD, Fan L. Patterns of selective constraints in noncoding DNA of rice. BMC Evol Biol 2007; 7:208. [PMID: 17976238 PMCID: PMC2174951 DOI: 10.1186/1471-2148-7-208] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 09/29/2007] [Accepted: 11/01/2007] [Indexed: 11/10/2022]
Abstract
Background: Several studies have investigated the relationships between selective constraints in introns and their length, GC content and location within genes. To date, however, no such investigation has been done in plants. Studies of selective constraints in noncoding DNA have generally involved interspecific comparisons, under the assumption of the same selective pressures acting in each lineage. Such comparisons are limited to cases in which the noncoding sequences are not too strongly diverged so that reliable sequence alignments can be obtained. Here, we investigate selective constraints in a recent segmental duplication that includes 605 paralogous intron pairs that occurred about 7 million years ago in rice (O. sativa). Results: Our principal findings are: (1) intronic divergence is negatively correlated with intron length, a pattern that has previously been described in Drosophila and mammals; (2) there is a signature of strong purifying selection at splice control sites; (3) first introns are significantly longer and have a higher GC content than other introns; (4) the divergences of first and non-first introns are not significantly different from one another, a pattern that differs from Drosophila and mammals; and (5) short introns are more diverged than four-fold degenerate sites suggesting that selection reduces divergence at four-fold sites. Conclusion: Our observation of stronger selective constraints in long introns suggests that functional elements subject to purifying selection may be concentrated within long introns. Our results are consistent with the presence of strong purifying selection at splicing control sites. Selective constraints are not significantly stronger in first introns of rice, as they are in other species.
Affiliation(s)
- Xingyi Guo
- Institute of Crop Science & Institute of Bioinformatics, Zhejiang University, Hangzhou 310029, China.
57
Piganeau G, Moreau H. Screening the Sargasso Sea metagenome for data to investigate genome evolution in Ostreococcus (Prasinophyceae, Chlorophyta). Gene 2007; 406:184-90. [PMID: 17961934 DOI: 10.1016/j.gene.2007.09.015] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 09/03/2007] [Revised: 09/07/2007] [Accepted: 09/20/2007] [Indexed: 11/16/2022]
Abstract
The Sargasso Sea water shotgun sequencing unveiled an unprecedented glimpse of marine prokaryotic diversity and gene content. The sequence data was gathered from 0.8 µm filtered surface water extracts, and revealed picoeukaryotic (cell size < 2 µm) sequences alongside the prokaryotic data. We used the available genome sequence of the picoeukaryote Ostreococcus tauri (Prasinophyceae, Chlorophyta) as a benchmark for the eukaryotic sequence content of the Sargasso Sea metagenome. Sequence data from at least two new Ostreococcus strains were identified and analyzed, and showed a bias towards higher coverage of the AT-rich organellar genomes. The Ostreococcus nuclear sequence data retrieved from the Sargasso metagenome is divided onto 731 scaffolds of average size 3917 bp, and covers 23% of the complete nuclear genome and 14% of the total number of protein coding genes in O. tauri. We used this environmental Ostreococcus sequence data to estimate the level of constraint on intronic and intergenic sequences in this compact genome.
Affiliation(s)
- Gwenaël Piganeau
- Universite Pierre et Marie Curie-Paris6, Laboratoire Arago, BP44, 66651 Banyuls sur Mer Cedex, France.
58
Foltz DW. An Ancient Repeat Sequence in the ATP Synthase β-Subunit Gene of Forcipulate Sea Stars. J Mol Evol 2007; 65:564-73. [PMID: 17909692 DOI: 10.1007/s00239-007-9036-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 06/12/2007] [Revised: 08/10/2007] [Accepted: 08/17/2007] [Indexed: 10/22/2022]
Abstract
A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.
Affiliation(s)
- David W Foltz
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803-1715, USA.
59
Thornton KR, Jensen JD, Becquet C, Andolfatto P. Progress and prospects in mapping recent selection in the genome. Heredity (Edinb) 2007; 98:340-8. [PMID: 17473869 DOI: 10.1038/sj.hdy.6800967] [Citation(s) in RCA: 112] [Impact Index Per Article: 6.6] [Indexed: 11/09/2022]
Abstract
One of the central goals of evolutionary biology is to understand the genetic basis of adaptive evolution. The availability of nearly complete genome sequences from a variety of organisms has facilitated the collection of data on naturally occurring genetic variation on the scale of hundreds of loci to whole genomes. Such data have changed the focus of molecular population genetics from making inferences about adaptive evolution at single loci to identifying which loci, out of hundreds to thousands, have been recent targets of natural selection. A major challenge in this effort is distinguishing the effects of selection from those of the demographic history of populations. Here we review some current progress and remaining challenges in the field.
Affiliation(s)
- K R Thornton
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
60
Bachtrog D. Reduced selection for codon usage bias in Drosophila miranda. J Mol Evol 2007; 64:586-90. [PMID: 17457633 DOI: 10.1007/s00239-006-0257-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 08/30/2006] [Accepted: 01/22/2007] [Indexed: 01/17/2023]
Abstract
Biased codon usage in many species results from a balance among mutation, weak selection, and genetic drift. Here I show that selection to maintain biased codon usage is reduced in Drosophila miranda relative to its ancestor. Analyses of mutation patterns in noncoding DNA suggest that the extent of this reduction cannot be explained by changes in mutation bias or by biased gene conversion. Low levels of variability in D. miranda relative to its sibling species, D. pseudoobscura, suggest that it has a much smaller effective population size. Reduced codon usage bias in D. miranda may thus result from the reduced efficacy of selection against newly arising mutations to unpreferred codons.
Affiliation(s)
- Doris Bachtrog
- Division of Biological Sciences, University of California, San Diego, 9500 Gilman Drive, MC 0116, La Jolla, CA 92093, USA.
61
Dias FC, Ruiz JC, Lopes WCZ, Squina FM, Renzi A, Cruz AK, Tosi LRO. Organization of H locus conserved repeats in Leishmania (Viannia) braziliensis correlates with lack of gene amplification and drug resistance. Parasitol Res 2007; 101:667-76. [PMID: 17393181 DOI: 10.1007/s00436-007-0528-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 01/08/2007] [Accepted: 03/14/2007] [Indexed: 11/27/2022]
Abstract
Resistance to antimonials is a major problem when treating visceral leishmaniasis in India and has already been described for New World parasites. Clinical response to meglumine antimoniate in patients infected with parasites of the Viannia sub-genus can be widely variable, suggesting the presence of mechanisms of drug resistance. In this work, we have compared L. major and L. braziliensis mutants selected in different drugs. The cross-resistance profiles of some cell lines resembled those of mutants bearing H locus amplicons. However, amplified episomal molecules were exclusively detected in L. major mutants. The analysis of the L. braziliensis H region revealed a strong conservation of gene synteny. The typical intergenic repeats that are believed to mediate the amplification of the H locus in species of the Leishmania sub-genus are partially conserved in the Viannia species. The conservation of these non-coding elements in equivalent positions in both species is indicative of their relevance within this locus. The absence of amplicons in L. braziliensis suggests that this species may not favour extra-chromosomal gene amplification as a source of phenotypic heterogeneity and fitness maintenance in changing environments.
Affiliation(s)
- Fabricio C Dias
- Departamento de Biologia Celular e Molecular e Bioagentes Patogênicos, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, 14049-900, Ribeirão Preto, Sao Paulo, Brazil
62
Shapiro JA, Huang W, Zhang C, Hubisz MJ, Lu J, Turissini DA, Fang S, Wang HY, Hudson RR, Nielsen R, Chen Z, Wu CI. Adaptive genic evolution in the Drosophila genomes. Proc Natl Acad Sci U S A 2007; 104:2271-6. [PMID: 17284599 PMCID: PMC1892965 DOI: 10.1073/pnas.0610385104] [Citation(s) in RCA: 180] [Impact Index Per Article: 10.6] [Received: 08/28/2006] [Indexed: 11/18/2022]
Abstract
Determining the extent of adaptive evolution at the genomic level is central to our understanding of molecular evolution. A suitable observation for this purpose would consist of polymorphic data on a large and unbiased collection of genes from two closely related species, each having a large and stable population. In this study, we sequenced 419 genes from 24 lines of Drosophila melanogaster and its close relatives. Together with data from Drosophila simulans, these data reveal the following. (i) Approximately 10% of the loci in regions of normal recombination are much less polymorphic at silent sites than expected, hinting at the action of selective sweeps. (ii) The level of polymorphism is negatively correlated with the rate of nonsynonymous divergence across loci. Thus, even under strict neutrality, the ratio of amino acid to silent nucleotide changes (A:S) between Drosophila species is expected to be 25-40% higher than the A:S ratio for polymorphism when data are pooled across the genome. (iii) The observed A/S ratio between species among the 419 loci is 28.9% higher than the (adjusted) neutral expectation. We estimate that nearly 30% of the amino acid substitutions between D. melanogaster and its close relatives were adaptive. (iv) This signature of adaptive evolution is observable only in regions of normal recombination. Hence, the low level of polymorphism observed in regions of reduced recombination may not be driven primarily by positive selection. Finally, we discuss the theories and data pertaining to the interpretation of adaptive evolution in genomic studies.
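The contrast between A:S ratios for divergence and for polymorphism can be illustrated with a small calculation. This is a minimal sketch, assuming the widely used McDonald-Kreitman-style estimator, alpha = 1 - (A:S for polymorphism) / (A:S for divergence); it does not reproduce the adjusted neutral expectation described in the abstract, and the function name and counts are hypothetical.

```python
def adaptive_fraction(div_a: float, div_s: float,
                      poly_a: float, poly_s: float) -> float:
    """Estimate the fraction of amino acid substitutions that were adaptive.

    div_a, div_s   -- amino acid and silent differences between species
    poly_a, poly_s -- amino acid and silent polymorphisms within a species
    Assumes silent changes are neutral and that the polymorphism A:S ratio
    approximates the neutral expectation for divergence.
    """
    if div_a <= 0 or div_s <= 0 or poly_s <= 0:
        raise ValueError("div_a, div_s and poly_s must be positive")
    ratio_divergence = div_a / div_s
    ratio_polymorphism = poly_a / poly_s
    return 1.0 - ratio_polymorphism / ratio_divergence


# Hypothetical counts: divergence A:S is 30% higher than polymorphism A:S.
alpha = adaptive_fraction(div_a=130, div_s=100, poly_a=100, poly_s=100)
print(f"estimated adaptive fraction: {alpha:.3f}")
```

A negative value from this estimator would indicate an excess of amino acid polymorphism relative to divergence, which is usually read as segregating weakly deleterious variants rather than as a negative rate of adaptation.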
Affiliation(s)
- Wei Huang
- Department of Genetics, Chinese National Human Genome Center at Shanghai, Shanghai 201203, China
- Chenhui Zhang
- Department of Genetics, Chinese National Human Genome Center at Shanghai, Shanghai 201203, China
- Jian Lu
- Departments of *Ecology and Evolution and
- Shu Fang
- Departments of *Ecology and Evolution and
- Rasmus Nielsen
- Institute of Biology, University of Copenhagen, DK-1100 Copenhagen, Denmark
- Zhu Chen
- Department of Genetics, Chinese National Human Genome Center at Shanghai, Shanghai 201203, China
- State Key Laboratory of Medical Genomics and Shanghai Institute of Hematology, Rui-Jin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 201203, China; and
- Chung-I Wu
- Departments of *Ecology and Evolution and
- **International Center for Evolutionary and Genomic Studies, Sun Yat-Sen University, Guangzhou 510275, China
63
Thornton KR, Jensen JD. Controlling the false-positive rate in multilocus genome scans for selection. Genetics 2007; 175:737-50. [PMID: 17110489 PMCID: PMC1800626 DOI: 10.1534/genetics.106.064642] [Citation(s) in RCA: 138] [Impact Index Per Article: 8.1] [Received: 08/09/2006] [Accepted: 10/18/2006] [Indexed: 11/18/2022]
Abstract
Rapid typing of genetic variation at many regions of the genome is an efficient way to survey variability in natural populations in an effort to identify segments of the genome that have experienced recent natural selection. Following such a genome scan, individual regions may be chosen for further sequencing and a more detailed analysis of patterns of variability, often to perform a parametric test for selection and to estimate the strength of a recent selective sweep. We show here that not accounting for the ascertainment of loci in such analyses leads to false inference of natural selection when the true model is selective neutrality, because the procedure of choosing unusual loci (in comparison to the rest of the genome-scan data) selects regions of the genome with genealogies similar to those expected under models of recent directional selection. We describe a simple and efficient correction for this ascertainment bias, which restores the false-positive rate to near-nominal levels. For the parameters considered here, we find that obtaining a test with the expected distribution of P-values depends on accurately accounting both for ascertainment of regions and for demography. Finally, we use simulations to explore the utility of relying on outlier loci to detect recent selective sweeps. We find that measures of diversity and of population differentiation are more effective than summaries of the site-frequency spectrum and that sequencing larger regions (2.5 kbp) in genome-scan studies leads to more power to detect recent selective sweeps.
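The inflation caused by ascertaining outlier loci can be illustrated with a deliberately simplified calculation. The sketch below is not the authors' correction. It assumes independent neutral loci whose null P-values are uniform on [0, 1], and all function names are hypothetical. Testing only the most extreme of n loci at a nominal level alpha yields a false-positive rate of 1 - (1 - alpha)^n, whereas comparing that locus against the null distribution of the minimum restores the nominal rate.

```python
import random


def naive_false_positive_rate(alpha: float, n_loci: int) -> float:
    """Chance that the smallest of n_loci uniform P-values falls below alpha."""
    return 1.0 - (1.0 - alpha) ** n_loci


def corrected_threshold(alpha: float, n_loci: int) -> float:
    """Per-locus cutoff that gives the selected (minimum) locus level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_loci)


def simulate(n_loci: int, threshold: float,
             reps: int = 20000, seed: int = 1) -> float:
    """Monte Carlo rejection rate when only the most extreme locus is tested."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # One genome scan under neutrality: keep the most unusual locus.
        smallest_p = min(rng.random() for _ in range(n_loci))
        if smallest_p < threshold:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    alpha, n_loci = 0.05, 20
    print("analytic naive rate:", naive_false_positive_rate(alpha, n_loci))
    print("simulated naive rate:", simulate(n_loci, alpha))
    print("simulated corrected rate:",
          simulate(n_loci, corrected_threshold(alpha, n_loci)))
```

Real genome-scan loci are neither independent nor free of demographic effects, which is why the abstract stresses accounting for demography as well as ascertainment; this toy model captures only the selection-of-outliers step.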
Affiliation(s)
- Kevin R Thornton
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA.
64
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
65
Bachtrog D, Andolfatto P. Selection, recombination and demographic history in Drosophila miranda. Genetics 2006; 174:2045-59. [PMID: 17028331 PMCID: PMC1698658 DOI: 10.1534/genetics.106.062760] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Received: 06/28/2006] [Accepted: 09/14/2006] [Indexed: 12/30/2022]
Abstract
Selection, recombination, and the demographic history of a species can all have profound effects on genomewide patterns of variability. To assess the impact of these forces in the genome of Drosophila miranda, we examine polymorphism and divergence patterns at 62 loci scattered across the genome. In accordance with recent findings in D. melanogaster, we find that noncoding DNA generally evolves more slowly than synonymous sites, that the distribution of polymorphism frequencies in noncoding DNA is significantly skewed toward rare variants relative to synonymous sites, and that long introns evolve significantly slower than short introns or synonymous sites. These observations suggest that most noncoding DNA is functionally constrained and evolving under purifying selection. However, in contrast to findings in the D. melanogaster species group, we find little evidence of adaptive evolution acting on either coding or noncoding sequences in D. miranda. Levels of linkage disequilibrium (LD) in D. miranda are comparable to those observed in D. melanogaster, but vary considerably among chromosomes. These patterns suggest a significantly lower rate of recombination on autosomes, possibly due to the presence of polymorphic autosomal inversions and/or differences in chromosome sizes. All chromosomes show significant departures from the standard neutral model, including too much heterogeneity in synonymous site polymorphism relative to divergence among loci and a general excess of rare synonymous polymorphisms. These departures from neutral equilibrium expectations are discussed in the context of nonequilibrium models of demography and selection.
Affiliation(s)
- Doris Bachtrog
- Division of Biological Sciences, University of California, San Diego, California 92093, USA.
66
Ko WY, Piao S, Akashi H. Strong regional heterogeneity in base composition evolution on the Drosophila X chromosome. Genetics 2006; 174:349-62. [PMID: 16547109 PMCID: PMC1569809 DOI: 10.1534/genetics.105.054346] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 12/05/2005] [Accepted: 05/08/2006] [Indexed: 11/18/2022]
Abstract
Fluctuations in base composition appear to be prevalent in Drosophila and mammal genome evolution, but their timescale, genomic breadth, and causes remain obscure. Here, we study base composition evolution within the X chromosomes of Drosophila melanogaster and five of its close relatives. Substitutions were inferred on six extant and two ancestral lineages for 14 near-telomeric and 9 nontelomeric genes. GC content evolution is highly variable both within the genome and within the phylogenetic tree. In the lineages leading to D. yakuba and D. orena, GC content at silent sites has increased rapidly near telomeres, but has decreased in more proximal (nontelomeric) regions. D. orena shows a 17-fold excess of GC-increasing vs. AT-increasing synonymous changes within a small (approximately 130-kb) region close to the telomeric end. Base composition changes within introns are consistent with changes in mutation patterns, but stronger GC elevation at synonymous sites suggests contributions of natural selection or biased gene conversion. The Drosophila yakuba lineage shows a less extreme elevation of GC content distributed over a wider genetic region (approximately 1.2 Mb). A lack of change in GC content for most introns within this region suggests a role of natural selection in localized base composition fluctuations.
Affiliation(s)
- Wen-Ya Ko
- Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA
67
Pollard DA, Moses AM, Iyer VN, Eisen MB. Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments. BMC Bioinformatics 2006; 7:376. [PMID: 16904011 PMCID: PMC1613255 DOI: 10.1186/1471-2105-7-376] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 04/13/2006] [Accepted: 08/14/2006] [Indexed: 01/01/2023]
Abstract
Background: Molecular evolutionary studies of noncoding sequences rely on multiple alignments. Yet how multiple alignment accuracy varies across sequence types, tree topologies, divergences and tools, and further how this variation impacts specific inferences, remains unclear. Results: Here we develop a molecular evolution simulation platform, CisEvolver, with models of background noncoding and transcription factor binding site evolution, and use simulated alignments to systematically examine multiple alignment accuracy and its impact on two key molecular evolutionary inferences: transcription factor binding site conservation and divergence estimation. We find that the accuracy of multiple alignments is determined almost exclusively by the pairwise divergence distance of the two most diverged species and that additional species have a negligible influence on alignment accuracy. Conserved transcription factor binding sites align better than surrounding noncoding DNA yet are often found to be misaligned at relatively short divergence distances, such that studies of binding site gain and loss could easily be confounded by alignment error. Divergence estimates from multiple alignments tend to be overestimated at short divergence distances but reach a tool specific divergence at which they cease to increase, leading to underestimation at long divergences. Our most striking finding was that overall alignment accuracy, binding site alignment accuracy and divergence estimation accuracy vary greatly across branches in a tree and are most accurate for terminal branches connecting sister taxa and least accurate for internal branches connecting sub-alignments. Conclusion: Our results suggest that variation in alignment accuracy can lead to errors in molecular evolutionary inferences that could be construed as biological variation. These findings have implications for which species to choose for analyses, what kind of errors would be expected for a given set of species and how multiple alignment tools and phylogenetic inference methods might be improved to minimize or control for alignment errors.
Affiliation(s)
- Daniel A Pollard
  - Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Alan M Moses
  - Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Venky N Iyer
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Michael B Eisen
  - Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
  - Department of Genome Sciences, Genomics Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
  - Center for Integrative Genomics, University of California, Berkeley, CA 94720, USA
68
Wang J, Keightley PD, Johnson T. MCALIGN2: faster, accurate global pairwise alignment of non-coding DNA sequences based on explicit models of indel evolution. BMC Bioinformatics 2006; 7:292. [PMID: 16762073 PMCID: PMC1534069 DOI: 10.1186/1471-2105-7-292] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 11/07/2005] [Accepted: 06/08/2006] [Indexed: 11/10/2022]
Abstract
Background: Non-coding DNA sequences comprise a very large proportion of the total genomic content of mammals, most other vertebrates, many invertebrates, and most plants. Unraveling the functional significance of non-coding DNA depends on how well we are able to align non-coding DNA sequences. However, the alignment of non-coding DNA sequences is more difficult than aligning protein-coding sequences.
Results: Here we present an improved pair-hidden-Markov-Model (pair HMM) based method for performing global pairwise alignment of non-coding DNA sequences. The method uses an explicit model of indel length frequency distribution which can be specified, and allows any time reversible model of nucleotide substitution. The method uses a deterministic global optimiser to find the alignment with the highest posterior probability. We test MCALIGN2 in simulations, and compare it to a previous Monte Carlo based method (MCALIGN), to the pair HMM method of Knudsen and Miyamoto, and to a heuristic method (AVID) that performed very well in a previous simulation study. We show that the pair HMM methods have excellent performance for all combinations of parameter values we have considered. MCALIGN2 is up to ten times faster than MCALIGN. MCALIGN2 is more accurate in resolving indels given an accurate explicit model than heuristic methods, but is computationally slower.
Conclusion: MCALIGN2 produces better quality alignments by explicitly using biological knowledge about the indel length distribution and time reversible models of nucleotide substitution. As a result, it can outperform other available sequence alignment methods for the cases we have considered to align non-coding DNA sequences.
Affiliation(s)
- Jun Wang
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
- Peter D Keightley
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
- Toby Johnson
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
69
Halligan DL, Keightley PD. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Res 2006; 16:875-84. [PMID: 16751341 PMCID: PMC1484454 DOI: 10.1101/gr.5022906] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.1] [Indexed: 01/09/2023]
Abstract
Non-coding DNA comprises approximately 80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
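The constraint measure described in this abstract (the fraction of mutations eliminated by selection, obtained by standardizing to substitution rates in putatively unconstrained sequences) can be illustrated with a short Python sketch. It assumes the common form C = 1 - observed/neutral; this form, the function name, and the example rates are illustrative assumptions and are not taken from the paper's methods or results.

```python
def selective_constraint(observed_rate: float, neutral_rate: float) -> float:
    """Estimate the fraction of mutations removed by selection.

    observed_rate: substitutions per site in the region of interest.
    neutral_rate: substitutions per site in putatively unconstrained sequence.
    The formula C = 1 - observed/neutral is an assumed, simplified form.
    """
    if neutral_rate <= 0:
        raise ValueError("neutral_rate must be positive")
    return 1.0 - observed_rate / neutral_rate


# Hypothetical rates chosen for illustration only, not values from the paper.
c = selective_constraint(observed_rate=0.05, neutral_rate=0.125)
print(f"estimated constraint: {c:.2f}")
```

Under this form, a value near 0 suggests the region evolves at about the neutral rate, while a negative value means it is diverging faster than the neutral reference.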
Affiliation(s)
- Daniel L Halligan
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
70
Schully SD, Hellberg ME. Positive Selection on Nucleotide Substitutions and Indels in Accessory Gland Proteins of the Drosophila pseudoobscura Subgroup. J Mol Evol 2006; 62:793-802. [PMID: 16752217 DOI: 10.1007/s00239-005-0239-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 10/06/2005] [Accepted: 02/20/2006] [Indexed: 01/17/2023]
Abstract
Genes encoding reproductive proteins often diverge rapidly due to positive selection on nucleotide substitutions. While this general pattern is well established, the extent to which specific reproductive genes experience similar selection in different clades has been little explored, nor have possible targets of positive selection other than nucleotide substitutions, such as indels, received much attention. Here, we inspect for the signature of positive selection in the genes encoding five accessory gland proteins (Acps) (Acp26Aa, Acp32CD, Acp53Ea, Acp62F, and Acp70A) originally described from Drosophila melanogaster but with recognizable orthologues in the D. pseudoobscura subgroup. We compare patterns of selection within the D. pseudoobscura subgroup to those in the D. melanogaster subgroup. Similar patterns of positive selection were found in Acp26Aa and Acp62F in the two subgroups, while Acp53Ea and Acp70A experienced purifying selection in both subgroups. These proteins have thus remained targets for similar types of selection over long (>21-MY) periods of time. We also found several indel substitutions and polymorphisms in Acp26Aa and Acp32CD. These indels occur in the same regions as positively selected nucleotide substitutions for Acp26Aa in the D. pseudoobscura subgroup but not in the D. melanogaster subgroup. Rates of indel substitution within Acp26Aa in the D. pseudoobscura subgroup were up to several times those in noncoding regions of the Drosophila genome. This suggests that indel substitutions may be under positive selection and may play a key role in the divergence of some Acps.
Affiliation(s)
- Sheri Dixon Schully
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
71
Keightley PD, Kryukov GV, Sunyaev S, Halligan DL, Gaffney DJ. Evolutionary constraints in conserved nongenic sequences of mammals. Genome Res 2006; 15:1373-8. [PMID: 16204190 PMCID: PMC1240079 DOI: 10.1101/gr.3942005] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 11/24/2022]
Abstract
Mammalian genomes contain many highly conserved nongenic sequences (CNGs) whose functional significance is poorly understood. Sets of CNGs have previously been identified by selecting the most conserved elements from a chromosome or genome, but in these highly selected samples, conservation may be unrelated to purifying selection. Furthermore, conservation of CNGs may be caused by mutation rate variation rather than selective constraints. To account for the effect of selective sampling, we have examined conservation of CNGs in taxa whose evolution is largely independent of the taxa from which the CNGs were initially identified, and we have controlled for mutation rate variation in the genome. We show that selective constraints in CNGs and their flanks are about one-half as strong in hominids as in murids, implying that hominids have accumulated many slightly deleterious mutations in functionally important nongenic regions. This is likely to be a consequence of the low effective population size of hominids leading to a reduced effectiveness of selection. We estimate that there are one and two times as many conserved nucleotides in CNGs as in known protein-coding genes of hominids and murids, respectively. Polymorphism frequencies in CNGs indicate that purifying selection operates in these sequences. During hominid evolution, we estimate that a total of about three deleterious mutations in CNGs and protein-coding genes have been selectively eliminated per diploid genome each generation, implying that deleterious mutations are eliminated from populations non-independently and that sex is necessary for long-term population persistence.
Affiliation(s)
- Peter D Keightley
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
72
Akashi H, Ko WY, Piao S, John A, Goel P, Lin CF, Vitins AP. Molecular evolution in the Drosophila melanogaster species subgroup: frequent parameter fluctuations on the timescale of molecular divergence. Genetics 2005; 172:1711-26. [PMID: 16387879 PMCID: PMC1456288 DOI: 10.1534/genetics.105.049676] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 11/18/2022]
Abstract
Although mutation, genetic drift, and natural selection are well established as determinants of genome evolution, the importance (frequency and magnitude) of parameter fluctuations in molecular evolution is less understood. DNA sequence comparisons among closely related species allow specific substitutions to be assigned to lineages on a phylogenetic tree. In this study, we compare patterns of codon usage and protein evolution in 22 genes (>11,000 codons) among Drosophila melanogaster and five relatives within the D. melanogaster subgroup. We assign changes to eight lineages using a maximum-likelihood approach to infer ancestral states. Uncertainty in ancestral reconstructions is taken into account, at least to some extent, by weighting reconstructions by their posterior probabilities. Four of the eight lineages show potentially genomewide departures from equilibrium synonymous codon usage; three are decreasing and one is increasing in major codon usage. Several of these departures are consistent with lineage-specific changes in selection intensity (selection coefficients scaled to effective population size) at silent sites. Intron base composition and rates and patterns of protein evolution are also heterogeneous among these lineages. The magnitude of forces governing silent, intron, and protein evolution appears to have varied frequently, and in a lineage-specific manner, within the D. melanogaster subgroup.
Affiliation(s)
- Hiroshi Akashi
  - Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
73
Abstract
Most of the phenotypic diversity that we perceive in the natural world is directly attributable to the peculiar structure of the eukaryotic gene, which harbors numerous embellishments relative to the situation in prokaryotes. The most profound changes include introns that must be spliced out of precursor mRNAs, transcribed but untranslated leader and trailer sequences (untranslated regions), modular regulatory elements that drive patterns of gene expression, and expansive intergenic regions that harbor additional diffuse control mechanisms. Explaining the origins of these features is difficult because they each impose an intrinsic disadvantage by increasing the genic mutation rate to defective alleles. To address these issues, a general hypothesis for the emergence of eukaryotic gene structure is provided here. Extensive information on absolute population sizes, recombination rates, and mutation rates strongly supports the view that eukaryotes have reduced genetic effective population sizes relative to prokaryotes, with especially extreme reductions being the rule in multicellular lineages. The resultant increase in the power of random genetic drift appears to be sufficient to overwhelm the weak mutational disadvantages associated with most novel aspects of the eukaryotic gene, supporting the idea that most such changes are simple outcomes of semi-neutral processes rather than direct products of natural selection. However, by establishing an essentially permanent change in the population-genetic environment permissive to the genome-wide repatterning of gene structure, the eukaryotic condition also promoted a reliable resource from which natural selection could secondarily build novel forms of organismal complexity. Under this hypothesis, arguments based on molecular, cellular, and/or physiological constraints are insufficient to explain the disparities in gene, genomic, and phenotypic complexity between prokaryotes and eukaryotes.
Affiliation(s)
- Michael Lynch
  - Department of Biology, Indiana University, Bloomington, USA.
74
Andolfatto P. Adaptive evolution of non-coding DNA in Drosophila. Nature 2005; 437:1149-52. [PMID: 16237443 DOI: 10.1038/nature04107] [Citation(s) in RCA: 462] [Impact Index Per Article: 24.3] [Received: 05/23/2005] [Accepted: 08/02/2005] [Indexed: 11/08/2022]
Abstract
A large fraction of eukaryotic genomes consists of DNA that is not translated into protein sequence, and little is known about its functional significance. Here I show that several classes of non-coding DNA in Drosophila are evolving considerably slower than synonymous sites, and yet show an excess of between-species divergence relative to polymorphism when compared with synonymous sites. The former is a hallmark of selective constraint, but the latter is a signature of adaptive evolution, resembling general patterns of protein evolution in Drosophila. I estimate that about 40-70% of nucleotides in intergenic regions, untranslated portions of mature mRNAs (UTRs) and most intronic DNA are evolutionarily constrained relative to synonymous sites. However, I also use an extension to the McDonald-Kreitman test to show that a substantial fraction of the nucleotide divergence in these regions was driven to fixation by positive selection (about 20% for most intronic and intergenic DNA, and 60% for UTRs). On the basis of these observations, I suggest that a large fraction of the non-translated genome is functionally important and subject to both purifying selection and adaptive evolution. These results imply that, although positive selection is clearly an important facet of protein evolution, adaptive changes to non-coding DNA might have been considerably more common in the evolution of D. melanogaster.
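The McDonald-Kreitman logic in this abstract compares divergence and polymorphism at a test class of sites against synonymous sites. A widely used summary of that comparison is the estimator alpha = 1 - (D_syn * P_test) / (D_test * P_syn), which approximates the fraction of test-class divergence driven by positive selection. The Python sketch below applies it with a noncoding class as the test class. Whether this matches the exact extension used in the paper is an assumption, and the function name and counts are hypothetical.

```python
def mk_alpha(d_test: int, p_test: int, d_syn: int, p_syn: int) -> float:
    """Estimate the fraction of test-class divergence attributed to positive selection.

    d_test, p_test: fixed differences and polymorphisms at the test class
                    (here, a noncoding class such as UTRs or introns).
    d_syn, p_syn:   fixed differences and polymorphisms at synonymous sites,
                    treated as the neutral reference (an assumption).
    """
    if d_test <= 0 or p_syn <= 0:
        raise ValueError("d_test and p_syn must be positive")
    return 1.0 - (d_syn * p_test) / (d_test * p_syn)


# Hypothetical counts for illustration only, not data from the paper.
alpha = mk_alpha(d_test=60, p_test=20, d_syn=100, p_syn=50)
print(f"alpha = {alpha:.3f}")
```

This simple form ignores slightly deleterious polymorphisms and demographic change, both of which can bias alpha, so it is best read as a first-pass summary rather than a full analysis.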
Affiliation(s)
- Peter Andolfatto
  - Section of Ecology, Behavior and Evolution, Division of Biological Sciences, University of California San Diego, La Jolla, California 92093, USA.
75
Bachtrog D. Sex chromosome evolution: molecular aspects of Y-chromosome degeneration in Drosophila. Genome Res 2005; 15:1393-401. [PMID: 16169921 PMCID: PMC1240082 DOI: 10.1101/gr.3543605] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.8] [Indexed: 01/14/2023]
Abstract
Ancient Y-chromosomes of various organisms contain few active genes and an abundance of repetitive DNA. The neo-Y chromosome of Drosophila miranda is in transition from an ordinary autosome to a genetically inert Y-chromosome, while its homolog, the neo-X chromosome, is evolving partial dosage compensation. Here, I compare four large genomic regions located on the neo-sex chromosomes that contain a total of 12 homologous genes. In addition, I investigate the partial coding sequence for 56 more homologous gene pairs from the neo-sex chromosomes. Little modification has occurred on the neo-X chromosome, and genes are highly constrained at the protein level. In contrast, a diverse array of molecular changes is contributing to the observed degeneration of the neo-Y chromosome. In particular, the four large regions surveyed on the neo-Y chromosome harbor several transposable element insertions, large deletions, and a large structural rearrangement. About one-third of all neo-Y-linked genes are nonfunctional, containing either premature stop codons and/or frameshift mutations. Intact genes on the neo-Y are accumulating amino acid and unpreferred codon changes. In addition, both 5'- and 3'-flanking regions of genes and intron sequences are less constrained on the neo-Y relative to the neo-X. Despite heterogeneity in levels of dosage compensation along the neo-X chromosome of D. miranda, the neo-Y chromosome shows surprisingly uniform signs of degeneration.
Affiliation(s)
- Doris Bachtrog
  - Department of Ecology, Behavior and Evolution, University of California, San Diego, La Jolla, California 92093, USA.
76
Abstract
An essential component of the immune system of animals is the production of antimicrobial peptides (AMPs). In vertebrates and termites the protein sequence of some AMPs evolves rapidly under positive selection, suggesting that they may be coevolving with pathogens. However, antibacterial peptides in Drosophila tend to be highly conserved. We have inferred the selection pressures acting on Drosophila antifungal peptides (drosomycins) from both the divergence of drosomycin genes within and between five species of Drosophila and polymorphism data from Drosophila simulans and D. melanogaster. In common with Drosophila antibacterial peptides, there is no evidence of adaptive protein evolution in any of the drosomycin genes, suggesting that they do not coevolve with pathogens. It is possible that this reflects a lack of specific fungal and bacterial parasites in Drosophila populations. The polymorphism data from both species differed from neutrality at one locus, but this was not associated with changes in the protein sequence. The synonymous site diversity was greater in D. simulans than in D. melanogaster, but the diversity both upstream of the genes and at nonsynonymous sites was similar. This can be explained if both upstream and nonsynonymous mutations are slightly deleterious and are removed more effectively from D. simulans due to its larger effective population size.
Affiliation(s)
- Francis M Jiggins
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Ashworth Lab, King's Buildings, West Mains Road, Edinburgh EH9 3JT, Scotland.
77
Haddrill PR, Charlesworth B, Halligan DL, Andolfatto P. Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biol 2005; 6:R67. [PMID: 16086849 PMCID: PMC1273634 DOI: 10.1186/gb-2005-6-8-r67] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.5] [Received: 03/04/2005] [Revised: 04/25/2005] [Accepted: 06/29/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Introns comprise a large fraction of eukaryotic genomes, yet little is known about their functional significance. Regulatory elements have been mapped to some introns, though these are believed to account for only a small fraction of genome wide intronic DNA. No consistent patterns have emerged from studies that have investigated general levels of evolutionary constraint in introns.
RESULTS: We examine the relationship between intron length and levels of evolutionary constraint by analyzing inter-specific divergence at 225 intron fragments in Drosophila melanogaster and Drosophila simulans, sampled from a broad distribution of intron lengths. We document a strongly negative correlation between intron length and divergence. Interestingly, we also find that divergence in introns is negatively correlated with GC content. This relationship does not account for the correlation between intron length and divergence, however, and may simply reflect local variation in mutational rates or biases.
CONCLUSION: Short introns make up only a small fraction of total intronic DNA in the genome. Our finding that long introns evolve more slowly than average implies that, while the majority of introns in the Drosophila genome may experience little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint. Our results suggest that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated. Our finding that GC content and divergence are negatively correlated in introns has important implications for the interpretation of the correlation between divergence and levels of codon bias observed in Drosophila.
Affiliation(s)
- Penelope R Haddrill
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
- Brian Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
- Daniel L Halligan
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
- Peter Andolfatto
  - Section of Ecology, Behavior and Evolution, Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
78
Negre B, Casillas S, Suzanne M, Sánchez-Herrero E, Akam M, Nefedov M, Barbadilla A, de Jong P, Ruiz A. Conservation of regulatory sequences and gene expression patterns in the disintegrating Drosophila Hox gene complex. Genome Res 2005; 15:692-700. [PMID: 15867430 PMCID: PMC1088297 DOI: 10.1101/gr.3468605] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 11/15/2004] [Accepted: 01/26/2005] [Indexed: 11/25/2022]
Abstract
Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. Firstly, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another one including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map precisely the two splits. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closer to the splits. The position and order of these enhancers are conserved, with minor exceptions, between the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems the result of phylogenetic inertia more than functional necessity.
Affiliation(s)
- Bárbara Negre
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain
79
Charlesworth B, Borthwick H, Bartolomé C, Pignatelli P. Estimates of the genomic mutation rate for detrimental alleles in Drosophila melanogaster. Genetics 2005; 167:815-26. [PMID: 15238530 PMCID: PMC1470907 DOI: 10.1534/genetics.103.025262] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Indexed: 11/18/2022]
Abstract
The net rate of mutation to deleterious but nonlethal alleles and the sizes of effects of these mutations are of great significance for many evolutionary questions. Here we describe three replicate experiments in which mutations have been accumulated on chromosome 3 of Drosophila melanogaster by means of single-male backcrosses of heterozygotes for a wild-type third chromosome. Egg-to-adult viability was assayed for nonlethal homozygous chromosomes. The rates of decline in mean and increase in variance (DM and DV, respectively) were estimated. Scaled up to the diploid whole genome, the mean DM for homozygous detrimental mutations over the three experiments was between 0.8 and 1.8%. The corresponding DV estimate was approximately 0.11%. Overall, the results suggest a lower bound estimate of at least 12% for the diploid per genome mutation rate for detrimentals. The upper bound estimates for the mean selection coefficient were between 2 and 10%, depending on the method used. Mutations with selection coefficients of at least a few percent must be the major contributors to the effects detected here and are likely to be caused mostly by transposable element insertions or indels.
Affiliation(s)
- Brian Charlesworth
  - Institute of Cell, Animal and Population Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
80
Abstract
We have found a negative correlation between evolutionary rate at the protein level (as measured by d(N)) and intron size in Drosophila. Although such a relation is expected if introns reduce Hill-Robertson interference within genes, it seems more likely to be explained by the higher abundance of cis-regulatory elements in introns (especially first introns) in genes under strong selective constraints.
Affiliation(s)
- Gabriel Marais
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, UK
81
Evidence for widespread degradation of gene control regions in hominid genomes. PLoS Biol 2005; 3:e42. [PMID: 15678168 PMCID: PMC544929 DOI: 10.1371/journal.pbio.0030042] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.0] [Received: 08/30/2004] [Accepted: 12/01/2004] [Indexed: 01/28/2023]
Abstract
Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human–chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees. A comparison of hominid and rodent lineages reveals that the gene control regions of hominids are not conserved and are accumulating mutations, suggesting widespread degradation of the hominid genome.
82
Axelsson E, Webster MT, Smith NGC, Burt DW, Ellegren H. Comparison of the chicken and turkey genomes reveals a higher rate of nucleotide divergence on microchromosomes than macrochromosomes. Genome Res 2004; 15:120-5. [PMID: 15590944 PMCID: PMC540272 DOI: 10.1101/gr.3021305] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.7] [Indexed: 11/24/2022]
Abstract
A distinctive feature of the avian genome is the large heterogeneity in the size of chromosomes, which are usually classified into a small number of macrochromosomes and numerous microchromosomes. These chromosome classes show characteristic differences in a number of interrelated features that could potentially affect the rate of sequence evolution, such as GC content, gene density, and recombination rate. We studied the effects of these factors by analyzing patterns of nucleotide substitution in two sets of chicken-turkey sequence alignments. First, in a set of 67 orthologous introns, divergence was significantly higher in microchromosomes (chromosomes 11-38; 11.7% divergence) than in both macrochromosomes (chromosomes 1-5; 9.9% divergence; P = 0.016) and intermediate-sized chromosomes (chromosomes 6-10; 9.5% divergence; P = 0.026). At least part of this difference was due to the higher incidence of CpG sites on microchromosomes. Second, using 155 orthologous coding sequences we noted a similar pattern, in which synonymous substitution rates on microchromosomes (13.1%) were significantly higher than were rates on macrochromosomes (10.3%; P = 0.024). Broadly assuming neutrality of introns and synonymous sites, or constraints on such sequences do not differ between chromosomal classes, these observations imply that microchromosomal genes are exposed to more germ line mutations than those on other chromosomes. We also find that dN/dS ratios for genes located on microchromosomes (average, 0.094) are significantly lower than those of macrochromosomes (average, 0.185; P = 0.025), suggesting that the proteins of genes on microchromosomes are under greater evolutionary constraint.
Affiliation(s)
- Erik Axelsson
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36 Uppsala, Sweden
|
83
|
Kern AD, Begun DJ. Patterns of Polymorphism and Divergence from Noncoding Sequences of Drosophila melanogaster and D. simulans: Evidence for Nonequilibrium Processes. Mol Biol Evol 2004; 22:51-62. [PMID: 15456897 DOI: 10.1093/molbev/msh269] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023] Open
Abstract
Despite the fact that D. melanogaster and D. simulans have been the central model system for molecular population genetics, few data are available for noncoding regions. Here, we present an analysis of population genetic data from intergenic regions and comparisons of these data to previously collected data from introns and exons. Polymorphisms and fixations were categorized as A/T to G/C or G/C to A/T changes and were polarized by inferring the ancestral state using both parsimony and maximum likelihood. Noncoding fixations in both D. melanogaster and D. simulans were consistent with equilibrium base-composition evolution. However, polarized noncoding polymorphisms revealed a different pattern. Although A/T to G/C and G/C to A/T polymorphisms in D. simulans were consistent with equilibrium, we observed a highly significant dearth of A/T to G/C polymorphisms in D. melanogaster introns but not in intergenic sequences. Such data could be explained by recent evolution of mutational biases associated with transcription or by lineage-specific selection on base composition. These data reveal the complexity of evolutionary processes acting even on noncoding DNA in Drosophila.
Affiliation(s)
- Andrew D Kern
- Center for Population Biology, University of California, Davis, USA.
|
84
|
Keightley PD, Johnson T. MCALIGN: stochastic alignment of noncoding DNA sequences based on an evolutionary model of sequence evolution. Genome Res 2004; 14:442-50. [PMID: 14993209 PMCID: PMC353231 DOI: 10.1101/gr.1571904] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.2] [Indexed: 11/25/2022]
Abstract
A method is described for performing global alignment of noncoding DNA sequences based on an evolutionary model parameterized by the frequency distribution of lengths of insertion/deletion events (indels) and their rate relative to nucleotide substitutions. A stochastic hill-climbing algorithm is used to search for the most probable alignment between a pair of sequences or three sequences of known phylogenetic relationship. The performance of the procedure, parameterized according to the empirical distribution of indel lengths in noncoding DNA of Drosophila species, is investigated by simulation. We show that there is excellent agreement between true and estimated alignments over a wide range of sequence divergences, and that the method outperforms other available alignment methods.
Affiliation(s)
- Peter D Keightley
- University of Edinburgh, School of Biological Sciences, Ashworth Laboratories, Edinburgh EH9 3JT, UK. Peter.Keightley_at_ed.ac.uk
|