201
Pool JE, Corbett-Detig RB, Sugino RP, Stevens KA, Cardeno CM, Crepeau MW, Duchen P, Emerson JJ, Saelao P, Begun DJ, Langley CH. Population Genomics of Sub-Saharan Drosophila melanogaster: African diversity and non-African admixture. PLoS Genet 2012; 8:e1003080. [PMID: 23284287 PMCID: PMC3527209 DOI: 10.1371/journal.pgen.1003080] [Citation(s) in RCA: 229] [Impact Index Per Article: 19.1] [Received: 05/04/2012] [Accepted: 09/27/2012] [Indexed: 11/25/2022]
Abstract
Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa F_ST were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.
Affiliation(s)
- John E Pool
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, USA.

202
Extraordinary genome stability in the ciliate Paramecium tetraurelia. Proc Natl Acad Sci U S A 2012; 109:19339-44. [PMID: 23129619 DOI: 10.1073/pnas.1210663109] [Citation(s) in RCA: 90] [Impact Index Per Article: 7.5] [Indexed: 11/18/2022]
Abstract
Mutation plays a central role in all evolutionary processes and is also the basis of genetic disorders. Established base-substitution mutation rates in eukaryotes range between ∼5 × 10^-10 and 5 × 10^-8 per site per generation, but here we report a genome-wide estimate for Paramecium tetraurelia that is more than an order of magnitude lower than any previous eukaryotic estimate. Nevertheless, when the mutation rate per cell division is extrapolated to the length of the sexual cycle for this protist, the measure obtained is comparable to that for multicellular species with similar genome sizes. Because Paramecium has a transcriptionally silent germ-line nucleus, these results are consistent with the hypothesis that natural selection operates on the cumulative germ-line replication fidelity per episode of somatic gene expression, with the germ-line mutation rate per cell division evolving downward to the lower barrier imposed by random genetic drift. We observe ciliate-specific modifications of widely conserved amino acid sites in DNA polymerases as one potential explanation for unusually high levels of replication fidelity.

203
Carmel I, Shomron N, Heifetz Y. Does base-pairing strength play a role in microRNA repression? RNA (New York, N.Y.) 2012; 18:1947-1956. [PMID: 23019592 PMCID: PMC3479386 DOI: 10.1261/rna.032185.111] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 12/28/2011] [Accepted: 06/11/2012] [Indexed: 06/01/2023]
Abstract
MicroRNAs (miRNAs) are short, single-stranded RNAs that silence gene expression by either degrading mRNA or repressing translation. Each miRNA regulates a specific set of mRNA "targets" by binding to complementary sequences in their 3' untranslated region. In this study, we examined the importance of the base-pairing strength of the miRNA-target duplex to repression. We hypothesized that if base-pairing strength affects the functionality of miRNA repression, organisms with higher body temperature or that live at higher temperatures will have miRNAs with higher G/C content so that the miRNA-target complex will remain stable. In the nine model organisms examined, we found a significant correlation between the average G/C content of miRNAs and physiological temperature, supporting our hypothesis. Next, for each organism examined, we compared the average G/C content of miRNAs that are conserved among distant organisms and that of miRNAs that are evolutionarily recent. We found that the average G/C content of ancient miRNAs is lower than recent miRNAs in homeotherms, whereas the trend was inversed in poikilotherms, suggesting that G/C content is associated with temperature, thus further supporting our hypothesis. In the organisms examined, the average G/C content of miRNA "seed" sequences was higher than that of mature miRNAs, which was higher than pre-miRNA loops, suggesting an association between the degree of functionality of the sequence and its average G/C content. Our analyses show a possible association between the base-pairing strength of miRNA-targets and the temperature of an organism, suggesting that base-pairing strength plays a role in repression by miRNAs.
Affiliation(s)
- Ido Carmel
- Department of Entomology, The Hebrew University, Rehovot 76100, Israel
- Noam Shomron
- Department of Cell and Developmental Biology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Yael Heifetz
- Department of Entomology, The Hebrew University, Rehovot 76100, Israel

204
Obbard DJ, Maclennan J, Kim KW, Rambaut A, O'Grady PM, Jiggins FM. Estimating divergence dates and substitution rates in the Drosophila phylogeny. Mol Biol Evol 2012; 29:3459-73. [PMID: 22683811 PMCID: PMC3472498 DOI: 10.1093/molbev/mss150] [Citation(s) in RCA: 162] [Impact Index Per Article: 13.5] [Indexed: 01/04/2023]
Abstract
An absolute timescale for evolution is essential if we are to associate evolutionary phenomena, such as adaptation or speciation, with potential causes, such as geological activity or climatic change. Timescales in most phylogenetic studies use geologically dated fossils or phylogeographic events as calibration points, but more recently, it has also become possible to use experimentally derived estimates of the mutation rate as a proxy for substitution rates. The large radiation of drosophilid taxa endemic to the Hawaiian islands has provided multiple calibration points for the Drosophila phylogeny, thanks to the "conveyor belt" process by which this archipelago forms and is colonized by species. However, published date estimates for key nodes in the Drosophila phylogeny vary widely, and many are based on simplistic models of colonization and coalescence or on estimates of island age that are not current. In this study, we use new sequence data from seven species of Hawaiian Drosophila to examine a range of explicit coalescent models and estimate substitution rates. We use these rates, along with a published experimentally determined mutation rate, to date key events in drosophilid evolution. Surprisingly, our estimate for the date for the most recent common ancestor of the genus Drosophila based on mutation rate (25-40 Ma) is closer to being compatible with independent fossil-derived dates (20-50 Ma) than are most of the Hawaiian-calibration models and also has smaller uncertainty. We find that Hawaiian-calibrated dates are extremely sensitive to model choice and give rise to point estimates that range between 26 and 192 Ma, depending on the details of the model. Potential problems with the Hawaiian calibration may arise from systematic variation in the molecular clock due to the long generation time of Hawaiian Drosophila compared with other Drosophila and/or uncertainty in linking island formation dates with colonization dates. 
As either source of error will bias estimates of divergence time, we suggest mutation rate estimates be used until better models are available.
Affiliation(s)
- Darren J Obbard
- Institute of Evolutionary Biology, and Centre for Infection Immunity and Evolution, University of Edinburgh, Edinburgh, United Kingdom.

205
Abstract
Mutation dictates the tempo and mode of evolution, and like all traits, the mutation rate is subject to evolutionary modification. Here, we report refined estimates of the mutation rate for a prokaryote with an exceptionally small genome and for a unicellular eukaryote with a large genome. Combined with prior results, these estimates provide the basis for a potentially unifying explanation for the wide range in mutation rates that exists among organisms. Natural selection appears to reduce the mutation rate of a species to a level that scales negatively with both the effective population size (N_e), which imposes a drift barrier to the evolution of molecular refinements, and the genomic content of coding DNA, which is proportional to the target size for deleterious mutations. As a consequence of an expansion in genome size, some microbial eukaryotes with large N_e appear to have evolved mutation rates that are lower than those known to occur in prokaryotes, but multicellular eukaryotes have experienced elevations in the genome-wide deleterious mutation rate because of substantial reductions in N_e.

206
Abstract
The nature of spontaneous mutations, including their rate, distribution across the genome, and fitness consequences, is of central importance to biology. However, the low rate of mutation has made it difficult to study spontaneous mutagenesis, and few studies have directly addressed these questions. Here, we present a direct estimate of the mutation rate and a description of the properties of new spontaneous mutations in the unicellular green alga Chlamydomonas reinhardtii. We conducted a mutation accumulation experiment for ∼350 generations followed by whole-genome resequencing of two replicate lines. Our analysis identified a total of 14 mutations, including 5 short indels and 9 single base mutations, and no evidence of larger structural mutations. From this, we estimate a total mutation rate of 3.23 × 10^-10/site/generation (95% C.I., 1.82 × 10^-10 to 5.23 × 10^-10) and a single base mutation rate of 2.08 × 10^-10/site/generation (95% C.I., 1.09 × 10^-10 to 3.74 × 10^-10). We observed no mutations from A/T → G/C, suggesting a strong mutational bias toward A/T, although paradoxically, the GC content of the C. reinhardtii genome is very high. Our estimate is only the second direct estimate of the mutation rate from plants and among the lowest spontaneous base-substitution rates known in eukaryotes.
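The two point estimates in this abstract are tied together by the mutation counts it reports: 9 of the 14 mutations were single-base changes, so the single-base rate should be roughly 9/14 of the total rate. The following is a minimal sketch of that consistency check. The function name `scaled_rate` is illustrative, and the assumption that both rates share the same denominator (sites, generations, and lines) is ours rather than something the abstract states.

```python
def scaled_rate(total_rate: float, subset_count: int, total_count: int) -> float:
    """Scale a per-site rate by the fraction of events that fall in a subset.

    Assumption: both rates are computed over the same sites and generations.
    """
    if total_count <= 0 or not 0 <= subset_count <= total_count:
        raise ValueError("need total_count > 0 and 0 <= subset_count <= total_count")
    return total_rate * subset_count / total_count


# Values quoted in the abstract.
total_rate = 3.23e-10  # all mutations, per site per generation
single_base = scaled_rate(total_rate, subset_count=9, total_count=14)
print(f"{single_base:.2e}")  # prints 2.08e-10
```

The result agrees with the reported single-base estimate to the precision given in the abstract.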

207
Saxer G, Havlak P, Fox SA, Quance MA, Gupta S, Fofanov Y, Strassmann JE, Queller DC. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum. PLoS One 2012; 7:e46759. [PMID: 23056439 PMCID: PMC3466296 DOI: 10.1371/journal.pone.0046759] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 06/18/2012] [Accepted: 09/03/2012] [Indexed: 12/18/2022]
Abstract
Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76 × 10^-9, with a Poisson confidence interval of 4.1 × 10^-9 to 9.5 × 10^-9, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9 × 10^-11, with a Poisson confidence interval ranging from 7.4 × 10^-13 to 1.6 × 10^-10, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.
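The abstract reports Poisson confidence intervals without saying how they were computed. One common choice for small counts is the exact (Garwood) interval, sketched below with the standard library only. This is an illustration under that assumption, not the authors' procedure. The helper names are ours, and the direct summation is intended for small counts such as those typical of mutation accumulation experiments.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), by direct summation (small k only)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def _bisect(f, lo: float, hi: float, iters: int = 200) -> float:
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(k: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided interval for a Poisson mean, given an observed count k."""
    hi_bound = 10.0 * (k + 10)
    # Upper limit solves P(X <= k | lam) = alpha / 2.
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, hi_bound)
    if k == 0:
        return 0.0, upper
    # Lower limit solves P(X >= k | lam) = alpha / 2,
    # which is the same as P(X <= k - 1 | lam) = 1 - alpha / 2.
    lower = _bisect(lambda lam: poisson_cdf(k - 1, lam) - (1 - alpha / 2), 0.0, hi_bound)
    return lower, upper


# Hypothetical usage: bounds on the expected count when one event is observed.
lo, hi = exact_poisson_ci(1)
print(lo, hi)
```

To turn the bounds on a count into bounds on a rate, divide them by the same denominator used for the point estimate (sites × generations × lines). The abstract does not give the observed counts or that denominator, so the count of one used above is purely illustrative.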
Affiliation(s)
- Gerda Saxer
- Department of Ecology and Evolutionary Biology, Rice University, Houston, Texas, United States of America.

208
Langley CH, Stevens K, Cardeno C, Lee YCG, Schrider DR, Pool JE, Langley SA, Suarez C, Corbett-Detig RB, Kolaczkowski B, Fang S, Nista PM, Holloway AK, Kern AD, Dewey CN, Song YS, Hahn MW, Begun DJ. Genomic variation in natural populations of Drosophila melanogaster. Genetics 2012; 192:533-98. [PMID: 22673804 PMCID: PMC3454882 DOI: 10.1534/genetics.112.142018] [Citation(s) in RCA: 243] [Impact Index Per Article: 20.3] [Received: 12/22/2011] [Accepted: 05/24/2012] [Indexed: 02/07/2023]
Abstract
This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5'- and 3'-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species.
Affiliation(s)
- Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, CA 95616, USA.

209
Kong A, Frigge ML, Masson G, Besenbacher S, Sulem P, Magnusson G, Gudjonsson SA, Sigurdsson A, Jonasdottir A, Jonasdottir A, Wong WSW, Sigurdsson G, Walters GB, Steinberg S, Helgason H, Thorleifsson G, Gudbjartsson DF, Helgason A, Magnusson OT, Thorsteinsdottir U, Stefansson K. Rate of de novo mutations and the importance of father's age to disease risk. Nature 2012; 488:471-5. [PMID: 22914163 PMCID: PMC3548427 DOI: 10.1038/nature11396] [Citation(s) in RCA: 1371] [Impact Index Per Article: 114.3] [Received: 02/28/2012] [Accepted: 07/04/2012] [Indexed: 02/06/2023]
Abstract
Mutations generate sequence diversity and provide a substrate for selection. The rate of de novo mutations is therefore of major importance to evolution. Here we conduct a study of genome-wide mutation rates by sequencing the entire genomes of 78 Icelandic parent-offspring trios at high coverage. We show that in our samples, with an average father's age of 29.7, the average de novo mutation rate is 1.20 × 10^-8 per nucleotide per generation. Most notably, the diversity in mutation rate of single nucleotide polymorphisms is dominated by the age of the father at conception of the child. The effect is an increase of about two mutations per year. An exponential model estimates paternal mutations doubling every 16.5 years. After accounting for random Poisson variation, father's age is estimated to explain nearly all of the remaining variation in the de novo mutation counts. These observations shed light on the importance of the father's age on the risk of diseases such as schizophrenia and autism.
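The exponential model mentioned in the abstract can be read as a fixed percentage increase per year. The sketch below converts the quoted doubling time of 16.5 years into an annual growth rate and a relative count between two paternal ages. The function names are illustrative, and the simple form 2^(Δage / doubling time) is an assumption about what "doubling every 16.5 years" implies, not the fitted model from the paper.

```python
def annual_growth_from_doubling(doubling_years: float) -> float:
    """Fractional increase per year implied by a fixed doubling time."""
    if doubling_years <= 0:
        raise ValueError("doubling_years must be positive")
    return 2.0 ** (1.0 / doubling_years) - 1.0


def relative_count(age: float, ref_age: float, doubling_years: float) -> float:
    """Expected paternal mutation count at `age` relative to `ref_age`."""
    return 2.0 ** ((age - ref_age) / doubling_years)


growth = annual_growth_from_doubling(16.5)
print(f"{growth:.1%}")  # prints 4.3%

# A father 16.5 years older than the sample mean of 29.7 would be expected
# to transmit about twice as many paternal mutations under this model.
print(relative_count(46.2, 29.7, 16.5))
```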

210
Leffler EM, Bullaughey K, Matute DR, Meyer WK, Ségurel L, Venkat A, Andolfatto P, Przeworski M. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol 2012; 10:e1001388. [PMID: 22984349 PMCID: PMC3439417 DOI: 10.1371/journal.pbio.1001388] [Citation(s) in RCA: 334] [Impact Index Per Article: 27.8] [Indexed: 12/26/2022]
Abstract
Understanding why some species have more genetic diversity than others is central to the study of ecology and evolution, and carries potentially important implications for conservation biology. Yet not only does this question remain unresolved, it has largely fallen into disregard. With the rapid decrease in sequencing costs, we argue that it is time to revive it.
Affiliation(s)
- Ellen M. Leffler
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (EML); (MP)
- Kevin Bullaughey
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Daniel R. Matute
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Wynn K. Meyer
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Laure Ségurel
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, United States of America
- Aarti Venkat
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Peter Andolfatto
- Department of Ecology and Evolutionary Biology and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Molly Przeworski
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Howard Hughes Medical Institute, University of Chicago, Chicago, Illinois, United States of America
- * E-mail: (EML); (MP)

211
Zhu Y, Bergland AO, González J, Petrov DA. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 2012; 7:e41901. [PMID: 22848651 PMCID: PMC3406057 DOI: 10.1371/journal.pone.0041901] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 04/18/2012] [Accepted: 06/28/2012] [Indexed: 11/26/2022]
Abstract
The sequencing of pooled non-barcoded individuals is an inexpensive and efficient means of assessing genome-wide population allele frequencies, yet its accuracy has not been thoroughly tested. We assessed the accuracy of this approach on whole, complex eukaryotic genomes by resequencing pools of largely isogenic, individually sequenced Drosophila melanogaster strains. We called SNPs in the pooled data and estimated false positive and false negative rates using the SNPs called in the individual strains as a reference. We also estimated the allele frequencies of the SNPs using "pooled" data and compared them with "true" frequencies taken from the estimates in the individual strains. We demonstrate that pooled sequencing provides a faithful estimate of population allele frequency with the error well approximated by binomial sampling, and is a reliable means of novel SNP discovery with low false positive rates. However, a sufficient number of strains should be used in the pooling because variation in the amount of DNA derived from individual strains is a substantial source of noise when the number of pooled strains is low. Our results and analysis confirm that pooled sequencing is a very powerful and cost-effective technique for assessing patterns of sequence variation in populations on genome-wide scales, and is applicable to any dataset where sequencing individuals or individual cells is impossible, difficult, time consuming, or expensive.
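To make the "error well approximated by binomial sampling" statement concrete, the sketch below computes the binomial standard error of an allele frequency estimated from a given number of reads. It treats reads as independent draws from the pool and ignores both the sampling of strains into the pool and the unequal DNA contributions the abstract warns about. The function name and the example numbers are hypothetical.

```python
import math


def binomial_se(freq: float, n_sampled: int) -> float:
    """Standard error of an allele frequency estimated from n_sampled draws."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be in [0, 1]")
    if n_sampled <= 0:
        raise ValueError("n_sampled must be positive")
    return math.sqrt(freq * (1.0 - freq) / n_sampled)


# Hypothetical example: true frequency 0.2, 100 reads covering the site.
se = binomial_se(0.2, 100)
print(round(se, 3))  # prints 0.04
```

Under this simple model the error shrinks with the square root of read depth, so quadrupling coverage halves the expected error.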
Affiliation(s)
- Yuan Zhu
- Department of Genetics, Stanford University, Stanford, California, United States of America.

212
Small inverted repeats drive mitochondrial genome evolution in Lake Baikal sponges. Gene 2012; 505:91-9. [PMID: 22669046 DOI: 10.1016/j.gene.2012.05.039] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 01/05/2012] [Revised: 05/15/2012] [Accepted: 05/17/2012] [Indexed: 12/31/2022]
Abstract
Demosponges, the largest and most diverse class in the phylum Porifera, possess mitochondrial DNA (mtDNA) markedly different from that in other animals. Although several studies investigated evolution of demosponge mtDNA among major lineages of the group, the changes within these groups remain largely unexplored. Recently we determined mitochondrial genomic sequence of the Lake Baikal sponge Lubomirskia baicalensis and described proliferation of small inverted repeats (hairpins) that occurred in it since the divergence between L. baicalensis and the most closely related cosmopolitan freshwater sponge Ephydatia muelleri. Here we report mitochondrial genomes of three additional species of Lake Baikal sponges: Swartschewskia papyracea, Rezinkovia echinata and Baikalospongia intermedia morpha profundalis (Demospongiae, Haplosclerida, Lubomirskiidae) and from a more distantly related freshwater sponge Corvomeyenia sp. (Demospongiae, Haplosclerida, Metaniidae). We use these additional sequences to explore mtDNA evolution in Baikalian sponges, paying particular attention to the variation in the rates of nucleotide substitutions and the distribution of hairpins, abundant in these genomes. We show that most of the changes in Lubomirskiidae mitochondrial genomes are due to insertion/deletion/duplication of these elements rather than single nucleotide substitutions. Thus inverted repeats can act as an important force in evolution of mitochondrial genome architecture and be a valuable marker for population- and species-level studies in this group. In addition, we infer (((Rezinkovia+Lubomirskia)+Swartschewskia)+Baikalospongia) phylogeny for the family Lubomirskiidae based on the analysis of mitochondrial coding sequences from freshwater sponges.

213
Smith G, Lohse K, Etges WJ, Ritchie MG. Model-based comparisons of phylogeographic scenarios resolve the intraspecific divergence of cactophilic Drosophila mojavensis. Mol Ecol 2012; 21:3293-307. [PMID: 22571504 DOI: 10.1111/j.1365-294x.2012.05604.x] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 02/02/2023]
Abstract
The cactophilic fly Drosophila mojavensis exhibits considerable intraspecific genetic structure across allopatric geographic regions and shows associations with different host cactus species across its range. The divergence between these populations has been studied for more than 60 years, yet their exact historical relationships have not been resolved. We analysed sequence data from 15 intronic X-linked loci across populations from Baja California, mainland Sonora-Arizona and Mojave Desert regions under an isolation-with-migration model to assess multiple scenarios of divergence. We also compared the results with a pre-existing sequence data set of eight autosomal loci. We derived a population tree with Baja California placed at its base and link their isolation to Pleistocene climatic oscillations. Our estimates suggest the Baja California population diverged from an ancestral Mojave Desert/mainland Sonora-Arizona group around 230,000-270,000 years ago, while the split between the Mojave Desert and mainland Sonora-Arizona populations occurred one glacial cycle later, 117,000-135,000 years ago. Although we found these three populations to be effectively allopatric, model ranking could not rule out the possibility of a low level of gene flow between two of them. Finally, the Mojave Desert population showed a small effective population size, consistent with a historical population bottleneck. We show that model-based inference from multiple loci can provide accurate information on the historical relationships of closely related groups allowing us to set into historical context a classic system of incipient ecological speciation.
Affiliation(s)
- Gilbert Smith
- School of Biology, University of St. Andrews, St. Andrews KY16 9TH, UK.

214
Crawford JE, Lazzaro BP. Assessing the accuracy and power of population genetic inference from low-pass next-generation sequencing data. Front Genet 2012; 3:66. [PMID: 22536207 PMCID: PMC3334522 DOI: 10.3389/fgene.2012.00066] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 01/26/2012] [Accepted: 04/05/2012] [Indexed: 01/17/2023]
Abstract
Next-generation sequencing (NGS) technologies have made it possible to address population genetic questions in almost any system, but high error rates associated with such data can introduce significant biases into downstream analyses, necessitating careful experimental design and interpretation in studies based on short-read sequencing. Exploration of population genetic analyses based on NGS has revealed some of the potential biases, but previous work has emphasized parameters relevant to human population genetics and further examination of parameters relevant to other systems is necessary, including situations where sample sizes are small and genetic variation is high. To assess experimental power to address several principal objectives of population genetic studies under these conditions, we simulated population samples under selective sweep, population growth, and population subdivision models and tested the power to accurately infer population genetic parameters from sequence polymorphism data obtained through simulated 4×, 8×, and 15× read depth sequence data. We found that estimates of population genetic differentiation and population growth parameters were systematically biased when inference was based on 4× sequencing, but biases were markedly reduced at even 8× read depth. We also found that the power to identify footprints of positive selection depends on an interaction between read depth and the strength of selection, with strong selection being recovered consistently at all read depths, but weak selection requiring deeper read depths for reliable detection. Although we have explored only a small subset of the many possible experimental designs and population genetic models, using only one SNP-calling approach, our results reveal some general patterns and provide some assessment of what biases could be expected under similar experimental structures.
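One simple way to see why inference at 4× can be biased while 8× fares much better is to ask how often both alleles of a heterozygous site are even sampled at a given depth. The sketch below is a back-of-the-envelope illustration, not the simulation design used in the study. It assumes a diploid heterozygote, exactly `depth` error-free reads, and each read drawn from either allele with probability 1/2. The function name is ours.

```python
def p_both_alleles_seen(depth: int) -> float:
    """Probability that `depth` error-free reads at a heterozygous site include both alleles."""
    if depth < 1:
        return 0.0
    # All reads come from one allele with probability 2 * (1/2)^depth.
    return 1.0 - 2.0 * 0.5 ** depth


for depth in (4, 8, 15):
    print(depth, round(p_both_alleles_seen(depth), 4))
# prints:
# 4 0.875
# 8 0.9922
# 15 0.9999
```

Under these assumptions one heterozygous site in eight would look homozygous at 4× from sampling alone, before sequencing error or variation in depth is considered.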

215
Beal MA, Glenn TC, Somers CM. Whole genome sequencing for quantifying germline mutation frequency in humans and model species: cautious optimism. Mutat Res 2012; 750:96-106. [PMID: 22178956 DOI: 10.1016/j.mrrev.2011.11.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 10/31/2011] [Revised: 11/29/2011] [Accepted: 11/30/2011] [Indexed: 05/31/2023]
Abstract
Factors affecting the type and frequency of germline mutations in animals are of significant interest from health and toxicology perspectives. However, studies in this field have been limited by the use of markers with low detection power or uncertain relevance to phenotype. Whole genome sequencing (WGS) is now a potential option to directly determine germline mutation type and frequency in family groups at all loci simultaneously. Medical studies have already capitalized on WGS to identify novel mutations in human families for clinical purposes, such as identifying candidate genes contributing to inherited conditions. However, WGS has not yet been used in any studies of vertebrates that aim to quantify changes in germline mutation frequency as a result of environmental factors. WGS is a promising tool for detecting mutation induction, but it is currently limited by several technical challenges. Perhaps the most pressing issue is sequencing error rates that are currently high in comparison to the intergenerational mutation frequency. Different platforms and depths of coverage currently result in a range of 10 to 10^3 false positives for every true mutation. In addition, the cost of WGS is still relatively high, particularly when comparing mutation frequencies among treatment groups with even moderate sample sizes. Despite these challenges, WGS offers the potential for unprecedented insight into germline mutation processes. Refinement of available tools and emergence of new technologies may be able to provide the improved accuracy and reduced costs necessary to make WGS viable in germline mutation studies in the very near future. To streamline studies, researchers may use multiple family triads per treatment group and sequence a targeted (reduced) portion of each genome with high (20-40×) depth of coverage. 
We are optimistic about the application of WGS for quantifying germline mutations, but caution researchers regarding the resource-intensive nature of the work using existing technology.
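The false-positive range quoted in this abstract implies that only a small fraction of raw candidate calls are real. As a rough illustration (a sketch, not a calculation taken from the paper; the function name is hypothetical), if each true mutation is accompanied by k false positives, the fraction of candidates that are real is 1 / (1 + k):

```python
def candidate_precision(false_positives_per_true: float) -> float:
    """Fraction of candidate mutations that are real, given k false positives per true mutation."""
    if false_positives_per_true < 0:
        raise ValueError("false_positives_per_true must be non-negative")
    return 1.0 / (1.0 + false_positives_per_true)


if __name__ == "__main__":
    # The abstract quotes a range of 10 to 10^3 false positives per true mutation.
    for k in (10, 100, 1000):
        print(f"{k:>5} false positives per true mutation -> precision {candidate_precision(k):.4f}")
```

Under this simple reading, candidate validation would be needed for nearly every call at the upper end of the quoted range.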
Affiliation(s)
- Marc A Beal
- University of Regina, Department of Biology, 3737 Wascana Parkway, Regina, Saskatchewan, Canada S4S 0A2
- Travis C Glenn
- University of Georgia, Environmental Health Science, College of Public Health, Athens, GA 30602, USA
- Christopher M Somers
- University of Regina, Department of Biology, 3737 Wascana Parkway, Regina, Saskatchewan, Canada S4S 0A2.
|
216
|
Denver DR, Wilhelm LJ, Howe DK, Gafner K, Dolan PC, Baer CF. Variation in base-substitution mutation in experimental and natural lineages of Caenorhabditis nematodes. Genome Biol Evol 2012; 4:513-22. [PMID: 22436997 PMCID: PMC3342874 DOI: 10.1093/gbe/evs028] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Indexed: 12/14/2022] Open
Abstract
Variation among lineages in the mutation process has the potential to impact diverse biological processes ranging from susceptibilities to genetic disease to the mode and tempo of molecular evolution. The combination of high-throughput DNA sequencing (HTS) with mutation-accumulation (MA) experiments has provided a powerful approach to genome-wide mutation analysis, though insights into mutational variation have been limited by the vast evolutionary distances among the few species analyzed. We performed a HTS analysis of MA lines derived from four Caenorhabditis nematode natural genotypes: C. elegans N2 and PB306 and C. briggsae HK104 and PB800. Total mutation rates did not differ among the four sets of MA lines. A mutational bias toward G:C→A:T transitions and G:C→T:A transversions was observed in all four sets of MA lines. Chromosome-specific rates were mostly stable, though there was some evidence for a slightly elevated X chromosome mutation rate in PB306. Rates were homogeneous among functional coding sequence types and across autosomal cores, arms, and tips. Mutation spectra were similar among the four MA line sets but differed significantly when compared with patterns of natural base-substitution polymorphism for 13/14 comparisons performed. Our findings show that base-substitution mutation processes in these closely related animal lineages are mostly stable but differ from natural polymorphism patterns in these two species.
Affiliation(s)
- Dee R Denver
- Department of Zoology and Center for Genome Research and Biocomputing, Oregon State University, OR, USA.
|
217
|
Izutsu M, Zhou J, Sugiyama Y, Nishimura O, Aizu T, Toyoda A, Fujiyama A, Agata K, Fuse N. Genome features of "Dark-fly", a Drosophila line reared long-term in a dark environment. PLoS One 2012; 7:e33288. [PMID: 22432011 PMCID: PMC3303825 DOI: 10.1371/journal.pone.0033288] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 09/28/2011] [Accepted: 02/08/2012] [Indexed: 11/22/2022] Open
Abstract
Organisms are remarkably adapted to diverse environments by specialized metabolisms, morphology, or behaviors. To address the molecular mechanisms underlying environmental adaptation, we have utilized a Drosophila melanogaster line, termed “Dark-fly”, which has been maintained in constant dark conditions for 57 years (1400 generations). We found that Dark-fly exhibited higher fecundity in dark than in light conditions, indicating that Dark-fly possesses some traits advantageous in darkness. Using next-generation sequencing technology, we determined the whole genome sequence of Dark-fly and identified approximately 220,000 single nucleotide polymorphisms (SNPs) and 4,700 insertions or deletions (InDels) in the Dark-fly genome compared to the genome of the Oregon-R-S strain, a control strain. 1.8% of SNPs were classified as non-synonymous SNPs (nsSNPs: i.e., they alter the amino acid sequence of gene products). Among them, we detected 28 nonsense mutations (i.e., they produce a stop codon in the protein sequence) in the Dark-fly genome. These included genes encoding an olfactory receptor and a light receptor. We also searched runs of homozygosity (ROH) regions as putative regions selected during the population history, and found 21 ROH regions in the Dark-fly genome. We identified 241 genes carrying nsSNPs or InDels in the ROH regions. These include a cluster of alpha-esterase genes that are involved in detoxification processes. Furthermore, analysis of structural variants in the Dark-fly genome showed the deletion of a gene related to fatty acid metabolism. Our results revealed unique features of the Dark-fly genome and provided a list of potential candidate genes involved in environmental adaptation.
Affiliation(s)
- Minako Izutsu
- Laboratory for Biodiversity, Global COE Program, Graduate School of Science, Kyoto University, Kyoto, Japan
- Laboratory for Molecular Developmental Biology, Graduate School of Science, Kyoto University, Kyoto, Japan
- Jun Zhou
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Yuzo Sugiyama
- Laboratory for Biodiversity, Global COE Program, Graduate School of Science, Kyoto University, Kyoto, Japan
- Osamu Nishimura
- Laboratory for Biodiversity, Global COE Program, Graduate School of Science, Kyoto University, Kyoto, Japan
- Tomoyuki Aizu
- Comparative Genomics Laboratory, National Institute of Genetics, Mishima, Japan
- Atsushi Toyoda
- Comparative Genomics Laboratory, National Institute of Genetics, Mishima, Japan
- Asao Fujiyama
- Comparative Genomics Laboratory, National Institute of Genetics, Mishima, Japan
- Kiyokazu Agata
- Laboratory for Biodiversity, Global COE Program, Graduate School of Science, Kyoto University, Kyoto, Japan
- Laboratory for Molecular Developmental Biology, Graduate School of Science, Kyoto University, Kyoto, Japan
- Naoyuki Fuse
- Laboratory for Biodiversity, Global COE Program, Graduate School of Science, Kyoto University, Kyoto, Japan
|
218
|
The role of background selection in shaping patterns of molecular evolution and variation: evidence from variability on the Drosophila X chromosome. Genetics 2012; 191:233-46. [PMID: 22377629 DOI: 10.1534/genetics.111.138073] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Indexed: 11/18/2022] Open
Abstract
In the putatively ancestral population of Drosophila melanogaster, the ratio of silent DNA sequence diversity for X-linked loci to that for autosomal loci is approximately one, instead of the expected "null" value of 3/4. One possible explanation is that background selection (the hitchhiking effect of deleterious mutations) is more effective on the autosomes than on the X chromosome, because of the lack of crossing over in male Drosophila. The expected effects of background selection on neutral variability at sites in the middle of an X chromosome or an autosomal arm were calculated for different models of chromosome organization and methods of approximation, using current estimates of the deleterious mutation rate and distributions of the fitness effects of deleterious mutations. The robustness of the results to different distributions of fitness effects, dominance coefficients, mutation rates, mapping functions, and chromosome size was investigated. The predicted ratio of X-linked to autosomal variability is relatively insensitive to these variables, except for the mutation rate and map length. Provided that the deleterious mutation rate per genome is sufficiently large, it seems likely that background selection can account for the observed X to autosome ratio of variability in the ancestral population of D. melanogaster. The fact that this ratio is much less than one in D. pseudoobscura is also consistent with the model's predictions, since this species has a high rate of crossing over. The results suggest that background selection may play a major role in shaping patterns of molecular evolution and variation.
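The "null" value of 3/4 in this abstract follows from counting chromosome copies: with equal numbers of breeding males and females, there are three X copies for every four autosomal copies. Background selection is commonly summarized as a factor B, the fraction of neutral diversity retained. The sketch below shows how unequal B values on the X and the autosomes shift the ratio. The function name and the B values are illustrative assumptions, not estimates from the paper.

```python
def x_to_autosome_ratio(b_x: float, b_a: float, null_ratio: float = 0.75) -> float:
    """Expected X/autosome neutral diversity ratio.

    b_x, b_a: fraction of neutral diversity retained under background
              selection on the X and on the autosomes (0 < B <= 1).
    null_ratio: ratio expected without selection at linked sites (3/4,
                assuming an equal sex ratio and equal variance in
                offspring number between the sexes).
    """
    for name, b in (("b_x", b_x), ("b_a", b_a)):
        if not 0.0 < b <= 1.0:
            raise ValueError(f"{name} must be in (0, 1]")
    return null_ratio * b_x / b_a


if __name__ == "__main__":
    # Hypothetical values: diversity is reduced more strongly on autosomes.
    print(x_to_autosome_ratio(b_x=1.0, b_a=1.0))  # no background selection
    print(x_to_autosome_ratio(b_x=0.8, b_a=0.6))  # close to one
```

In this simplified form, a ratio near one requires B on the X to be roughly 4/3 of B on the autosomes.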
|
219
|
Kourilsky P. Selfish cellular networks and the evolution of complex organisms. C R Biol 2012; 335:169-79. [PMID: 22464425 DOI: 10.1016/j.crvi.2012.01.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/06/2011] [Accepted: 01/06/2012] [Indexed: 10/28/2022]
Abstract
Human gametogenesis takes years and involves many cellular divisions, particularly in males. Consequently, gametogenesis provides the opportunity to acquire multiple de novo mutations. A significant portion of these is likely to impact the cellular networks linking genes, proteins, RNA and metabolites, which constitute the functional units of cells. A wealth of literature shows that these individual cellular networks are complex, robust and evolvable. To some extent, they are able to monitor their own performance, and display sufficient autonomy to be termed "selfish". Their robustness is linked to quality control mechanisms which are embedded in and act upon the individual networks, thereby providing a basis for selection during gametogenesis. These selective processes are equally likely to affect cellular functions that are not gamete-specific, and the evolution of the most complex organisms, including man, is therefore likely to occur via two pathways: essential housekeeping functions would be regulated and evolve during gametogenesis within the parents before being transmitted to their progeny, while classical selection would operate on other traits of the organisms that shape their fitness with respect to the environment.
|
220
|
Keightley PD. Rates and fitness consequences of new mutations in humans. Genetics 2012; 190:295-304. [PMID: 22345605 PMCID: PMC3276617 DOI: 10.1534/genetics.111.134668] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.0] [Received: 09/13/2011] [Accepted: 11/13/2011] [Indexed: 12/13/2022] Open
Abstract
The human mutation rate per nucleotide site per generation (μ) can be estimated from data on mutation rates at loci causing Mendelian genetic disease, by comparing putatively neutrally evolving nucleotide sequences between humans and chimpanzees and by comparing the genome sequences of relatives. Direct estimates from genome sequencing of relatives suggest that μ is about 1.1 × 10^-8, which is about twofold lower than estimates based on the human-chimp divergence. This implies that an average of ~70 new mutations arise in the human diploid genome per generation. Most of these mutations are paternal in origin, but the male:female mutation rate ratio is currently uncertain and might vary even among individuals within a population. On the basis of a method proposed by Kondrashov and Crow, the genome-wide deleterious mutation rate (U) can be estimated from the product of the number of nucleotide sites in the genome, μ, and the mean selective constraint per site. Although the presence of many weakly selected mutations in human noncoding DNA makes this approach somewhat problematic, estimates are U ≈ 2.2 for the whole diploid genome per generation and 0.35 for mutations that change an amino acid of a protein-coding gene. A genome-wide deleterious mutation rate of 2.2 seems higher than humans could tolerate if natural selection is "hard," but could be tolerated if selection acts on relative fitness differences between individuals or if there is synergistic epistasis. I argue that in the foreseeable future, an accumulation of new deleterious mutations is unlikely to lead to a detectable decline in fitness of human populations.
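The arithmetic behind the figures in this abstract can be sketched as follows. The haploid genome size of about 3.2 × 10^9 sites is an assumption introduced here for illustration (the abstract does not state the value used), and the function names are hypothetical. The implied mean constraint is simply a back-calculation from U ≈ 2.2 under that assumption, not a value reported in the abstract.

```python
MU = 1.1e-8                   # per-site, per-generation mutation rate (from the abstract)
HAPLOID_GENOME_SITES = 3.2e9  # ASSUMED haploid human genome size, for illustration only


def new_mutations_per_diploid_genome(mu: float, haploid_sites: float) -> float:
    """Expected number of new mutations per diploid genome per generation."""
    return 2 * haploid_sites * mu


def deleterious_mutation_rate(mu: float, haploid_sites: float, mean_constraint: float) -> float:
    """U = (diploid number of sites) x mu x (mean selective constraint per site)."""
    return 2 * haploid_sites * mu * mean_constraint


def implied_mean_constraint(u: float, mu: float, haploid_sites: float) -> float:
    """Mean constraint per site that would reproduce a given U."""
    return u / (2 * haploid_sites * mu)


if __name__ == "__main__":
    total = new_mutations_per_diploid_genome(MU, HAPLOID_GENOME_SITES)
    constraint = implied_mean_constraint(2.2, MU, HAPLOID_GENOME_SITES)
    print(f"new mutations per diploid genome: {total:.1f}")
    print(f"mean constraint implied by U = 2.2: {constraint:.4f}")
```

With the assumed genome size, the first function gives a value close to the ~70 new mutations per generation quoted in the abstract.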
Affiliation(s)
- Peter D Keightley
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
221
|
Burke GR, Strand MR. Polydnaviruses of Parasitic Wasps: Domestication of Viruses To Act as Gene Delivery Vectors. Insects 2012; 3:91-119. [PMID: 26467950 PMCID: PMC4553618 DOI: 10.3390/insects3010091] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 11/22/2011] [Revised: 01/07/2012] [Accepted: 01/16/2012] [Indexed: 12/21/2022]
Abstract
Symbiosis is a common phenomenon in which associated organisms can cooperate in ways that increase their ability to survive, reproduce, or utilize hostile environments. Here, we discuss polydnavirus symbionts of parasitic wasps. These viruses are novel in two ways: (1) they have become non-autonomous domesticated entities that cannot replicate outside of wasps; and (2) they function as a delivery vector of genes that ensure successful parasitism of host insects that wasps parasitize. In this review we discuss how these novelties may have arisen, which genes are potentially involved, and what the consequences have been for genome evolution.
Affiliation(s)
- Gaelen R Burke
- Department of Entomology, The University of Georgia, 120 Cedar St., Athens, GA 30601, USA.
- Michael R Strand
- Department of Entomology, The University of Georgia, 120 Cedar St., Athens, GA 30601, USA.
|
222
|
Gundry M, Vijg J. Direct mutation analysis by high-throughput sequencing: from germline to low-abundant, somatic variants. Mutat Res 2012; 729:1-15. [PMID: 22016070 PMCID: PMC3237897 DOI: 10.1016/j.mrfmmm.2011.10.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/03/2011] [Revised: 09/23/2011] [Accepted: 10/05/2011] [Indexed: 08/15/2023]
Abstract
DNA mutations are the source of genetic variation within populations. The majority of mutations with observable effects are deleterious. In humans mutations in the germ line can cause genetic disease. In somatic cells multiple rounds of mutations and selection lead to cancer. The study of genetic variation has progressed rapidly since the completion of the draft sequence of the human genome. Recent advances in sequencing technology, most importantly the introduction of massively parallel sequencing (MPS), have resulted in more than a hundred-fold reduction in the time and cost required for sequencing nucleic acids. These improvements have greatly expanded the use of sequencing as a practical tool for mutation analysis. While in the past the high cost of sequencing limited mutation analysis to selectable markers or small forward mutation targets assumed to be representative for the genome overall, current platforms allow whole genome sequencing for less than $5000. This has already given rise to direct estimates of germline mutation rates in multiple organisms including humans by comparing whole genome sequences between parents and offspring. Here we present a brief history of the field of mutation research, with a focus on classical tools for the measurement of mutation rates. We then review MPS, how it is currently applied and the new insight into human and animal mutation frequencies and spectra that has been obtained from whole genome sequencing. While great progress has been made, we note that the single most important limitation of current MPS approaches for mutation analysis is the inability to address low-abundance mutations that turn somatic tissues into mosaics of cells. Such mutations are at the basis of intra-tumor heterogeneity, with important implications for clinical diagnosis, and could also contribute to somatic diseases other than cancer, including aging. Some possible approaches to gain access to low-abundance mutations are discussed, with a brief overview of new sequencing platforms that are currently waiting in the wings to advance this exploding field even further.
Affiliation(s)
- Michael Gundry
- Albert Einstein College of Medicine, Department of Genetics, New York, NY 10461, United States
|
223
|
Charlesworth B. The effects of deleterious mutations on evolution at linked sites. Genetics 2012; 190:5-22. [PMID: 22219506 PMCID: PMC3249359 DOI: 10.1534/genetics.111.134288] [Citation(s) in RCA: 215] [Impact Index Per Article: 17.9] [Received: 08/26/2011] [Accepted: 11/04/2011] [Indexed: 01/14/2023] Open
Abstract
The process of evolution at a given site in the genome can be influenced by the action of selection at other sites, especially when these are closely linked to it. Such selection reduces the effective population size experienced by the site in question (the Hill-Robertson effect), reducing the level of variability and the efficacy of selection. In particular, deleterious variants are continually being produced by mutation and then eliminated by selection at sites throughout the genome. The resulting reduction in variability at linked neutral or nearly neutral sites can be predicted from the theory of background selection, which assumes that deleterious mutations have such large effects that their behavior in the population is effectively deterministic. More weakly selected mutations can accumulate by Muller's ratchet after a shutdown of recombination, as in an evolving Y chromosome. Many functionally significant sites are probably so weakly selected that Hill-Robertson interference undermines the effective strength of selection upon them, when recombination is rare or absent. This leads to large departures from deterministic equilibrium and smaller effects on linked neutral sites than under background selection or Muller's ratchet. Evidence is discussed that is consistent with the action of these processes in shaping genome-wide patterns of variation and evolution.
Affiliation(s)
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
224
|
Abstract
Differences in gene regulation are thought to play an important role in speciation and adaptation. Comparative genomic studies of gene expression levels have identified a large number of differentially expressed genes among species, and, in a number of cases, also pointed to connections between interspecies differences in gene regulation and differences in ultimate physiological or morphological phenotypes. The mechanisms underlying changes in gene regulation are also being actively studied using comparative genomic approaches. However, the relative importance of different regulatory mechanisms to interspecies differences in gene expression levels is not yet well understood. In particular, it is often difficult to infer causality between apparent differences in regulatory mechanisms and changes in gene expression levels, a challenge that is compounded by the fact that the link between sequence variation and gene regulation is not clear. Indeed, in certain cases, gene regulation can be conserved even when sequences at associated regulatory elements have changed. In this chapter, I examine different genomic approaches to the study of regulatory evolution and the underlying genetic and epigenetic regulatory mechanisms. I try to distinguish between hypothesis-driven and exploratory studies, and argue that the latter class of studies provides valuable information in its own right as well as necessary context for the former. I discuss issues related to study designs and statistical analyses of genomic studies, and review the evidence for natural selection on gene expression levels and associated regulatory mechanisms. Most of the issues that are discussed pertain to the general nature of multivariate genomic data, and thus are often relevant regardless of the technology that is used to collect high-throughput genomic data (for example, microarrays or massively parallel sequencing).
|
225
|
Lee YCG, Reinhardt JA. Widespread polymorphism in the positions of stop codons in Drosophila melanogaster. Genome Biol Evol 2011; 4:533-49. [PMID: 22051795 PMCID: PMC3342867 DOI: 10.1093/gbe/evr113] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Accepted: 10/28/2011] [Indexed: 12/19/2022] Open
Abstract
The mechanisms underlying evolutionary changes in protein length are poorly understood. Protein domains are lost and gained between species and must have arisen first as within-species polymorphisms. Here, we use Drosophila melanogaster population genomic data combined with between species divergence information to understand the evolutionary forces that generate and maintain polymorphisms causing changes in protein length in D. melanogaster. Specifically, we looked for protein length variations resulting from premature termination codons (PTCs) and stop codon losses (SCLs). We discovered that 438 genes contained polymorphisms resulting in truncation of the translated region (PTCs) and 119 genes contained polymorphisms predicted to lengthen the translated region (SCLs). Stop codon polymorphisms (SCPs) (especially PTCs) appear to be more deleterious than other polymorphisms, including protein amino acid changes. Genes harboring SCPs are in general less selectively constrained, more narrowly expressed, and enriched for dispensable biological functions. However, we also observed exceptional cases such as genes that have multiple independent SCPs, alleles that are shared between D. melanogaster and Drosophila simulans, and high-frequency alleles that cause extreme changes in gene length. SCPs likely have an important role in the evolution of these genes.
Affiliation(s)
- Yuh Chwen G. Lee
- Department of Evolution and Ecology, The University of California at Davis
|
226
|
Schrider DR, Hourmozdi JN, Hahn MW. Pervasive multinucleotide mutational events in eukaryotes. Curr Biol 2011; 21:1051-4. [PMID: 21636278 DOI: 10.1016/j.cub.2011.05.013] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.0] [Received: 03/10/2011] [Revised: 04/12/2011] [Accepted: 05/05/2011] [Indexed: 10/18/2022]
Abstract
Many aspects of mutational processes are nonrandom, from the preponderance of transitions relative to transversions to the higher rate of mutation at CpG dinucleotides [1]. However, it is still often assumed that single-nucleotide mutations are independent of one another, each being caused by separate mutational events. The occurrence of multiple, closely spaced substitutions appears to violate assumptions of independence and is often interpreted as evidence for the action of adaptive natural selection [2, 3], balancing selection [4], or compensatory evolution [5, 6]. Here we provide evidence of a frequent, widespread multinucleotide mutational process active throughout eukaryotes. Genomic data from mutation-accumulation experiments, parent-offspring trios, and human polymorphisms all show that simultaneous nucleotide substitutions occur within short stretches of DNA. Regardless of species, such multinucleotide mutations (MNMs) consistently comprise ~3% of the total number of nucleotide substitutions. These results imply that previous adaptive interpretations of multiple, closely spaced substitutions may have been unwarranted and that MNMs must be considered when interpreting sequence data.
Affiliation(s)
- Daniel R Schrider
- Department of Biology, Indiana University Bloomington, Bloomington, IN 47405, USA
|
227
|
Gundry M, Vijg J. Direct mutation analysis by high-throughput sequencing: from germline to low-abundant, somatic variants. Mutat Res 2011; 729:1-15. [PMID: 22016070 DOI: 10.1016/j.mrfmmm.2011.10.001] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 06/03/2011] [Revised: 09/23/2011] [Accepted: 10/05/2011] [Indexed: 12/20/2022]
|
228
|
Fort P, Albertini A, Van-Hua A, Berthomieu A, Roche S, Delsuc F, Pasteur N, Capy P, Gaudin Y, Weill M. Fossil rhabdoviral sequences integrated into arthropod genomes: ontogeny, evolution, and potential functionality. Mol Biol Evol 2011; 29:381-90. [PMID: 21917725 DOI: 10.1093/molbev/msr226] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.3] [Indexed: 11/12/2022] Open
Abstract
Retroelements represent a considerable fraction of many eukaryotic genomes and are considered major drivers of adaptive genetic innovations. Recent discoveries showed that despite not normally using DNA intermediates like retroviruses do, Mononegaviruses (i.e., viruses with nonsegmented, negative-sense RNA genomes) can integrate gene fragments into the genomes of their hosts. This was shown for Bornaviridae and Filoviridae, the sequences of which have been found integrated into the germ line cells of many vertebrate hosts. Here, we show that Rhabdoviridae sequences, the major Mononegavirales family, have integrated only into the genomes of arthropod species. We identified 185 integrated rhabdoviral elements (IREs) coding for nucleoproteins, glycoproteins, or RNA-dependent RNA polymerases; they were mostly found in the genomes of the mosquito Aedes aegypti and the blacklegged tick Ixodes scapularis. Phylogenetic analyses showed that most IREs in A. aegypti derived from multiple independent integration events. Since RNA viruses are subject to much higher substitution rates than their hosts, IREs thus represent fossil traces of the diversity of extinct Rhabdoviruses. Furthermore, analyses of orthologous IREs in A. aegypti field mosquitoes sampled worldwide identified an integrated polymerase IRE fragment that appeared under purifying selection within several million years, which supports a functional role in the host's biology. These results show that A. aegypti was subjected to repeated Rhabdovirus infectious episodes during its evolutionary history, which led to the accumulation of many integrated sequences. They also suggest that like retroviruses, integrated rhabdoviral sequences may participate actively in the evolution of their hosts.
Affiliation(s)
- Philippe Fort
- Centre de Recherche de Biochimie Macromoléculaire, UMR 5237, CNRS, Universités Montpellier 2 et 1, Montpellier, France.
|
229
|
Altshuler I, Demiri B, Xu S, Constantin A, Yan ND, Cristescu ME. An integrated multi-disciplinary approach for studying multiple stressors in freshwater ecosystems: Daphnia as a model organism. Integr Comp Biol 2011; 51:623-33. [PMID: 21873644 DOI: 10.1093/icb/icr103] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.2] [Indexed: 11/13/2022] Open
Abstract
The increased overexploitation of freshwater ecosystems and their extended watersheds often generates a cascade of anthropogenic stressors (e.g., acidification, eutrophication, metal contamination, Ca decline, changes in the physical environment, introduction of invasive species, over-harvesting of resources). The combined effect of these stressors is particularly difficult to study, requiring a coordinated multi-disciplinary effort and insights from various sub-disciplines of biology, including ecology, evolution, toxicology, and genetics. It would also benefit from a well-developed and broadly accepted model system. The freshwater crustacean Daphnia is an excellent model organism for studying multiple stressors because it has been a chosen focus of study in all four of these fields. Daphnia is a widespread keystone species in most freshwater ecosystems, where it is routinely exposed to a multitude of anthropogenic and natural stressors. It has a fully sequenced genome, a well-understood life history and ecology, and a huge library of responses to toxicity. To make the case for its value as a model species, we consider the joint and separate effects of natural and three anthropogenic stressors (climatic change, calcium decline, and metal contaminants) on daphniids. We propose that integrative approaches marrying various subfields of biology can advance our understanding of the combined effects of stressors. Such approaches can involve the measuring of multiple responses at several levels of biological organization from molecules to natural populations. For example, novel interdisciplinary approaches such as transcriptome profiling and mutation accumulation experiments can offer insights into how multiple stressors influence gene transcription and mutation rates across genomes, and, thus, help determine the causal mechanism between environmental stressors and population/community effects as well as long-term evolutionary patterns.
Affiliation(s)
- Ianina Altshuler
- Great Lakes Institute for Environmental Research, University of Windsor, Windsor, ON, Canada.
|
230
|
Qiu S, Zeng K, Slotte T, Wright S, Charlesworth D. Reduced efficacy of natural selection on codon usage bias in selfing Arabidopsis and Capsella species. Genome Biol Evol 2011; 3:868-80. [PMID: 21856647 PMCID: PMC3296465 DOI: 10.1093/gbe/evr085] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 02/01/2023] Open
Abstract
Population genetic theory predicts that the efficacy of natural selection in a self-fertilizing species should be lower than in its outcrossing relatives because inbreeding reduces the effective population size (N(e)) of the selfer. However, previous analyses comparing Arabidopsis thaliana (selfer) with A. lyrata (outcrosser) have not found conclusive support for this prediction. In this study, we addressed this issue by examining silent site polymorphisms (synonymous and intronic), which are expected to be informative about changes in N(e). Two comparisons were made: A. thaliana versus A. lyrata and Capsella rubella (selfer) versus C. grandiflora (outcrosser). Extensive polymorphism data sets were obtained by compiling published data from the literature and by sequencing 354 exon loci in C. rubella and 89 additional loci in C. grandiflora. To extract information from the data effectively for studying these questions, we extended two recently developed models in order to investigate detailed selective differences between synonymous codons, mutational biases, and biased gene conversion (BGC), taking into account the effects of recent changes in population size. We found evidence that selection on synonymous codons is significantly weaker in the selfers compared with the outcrossers and that this difference cannot be fully accounted for by mutational biases or BGC.
Affiliation(s)
- Suo Qiu
- State Key Laboratory of Biocontrol and Key Laboratory of Gene Engineering of the Ministry of Education, Sun Yat-Sen University, Guangzhou, China
|
231
|
de Procé SM, Zeng K, Betancourt AJ, Charlesworth B. Selection on codon usage and base composition in Drosophila americana. Biol Lett 2011; 8:82-5. [PMID: 21849309 DOI: 10.1098/rsbl.2011.0601] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 02/04/2023] Open
Abstract
We have used a polymorphism dataset on introns and coding sequences of X-linked loci in Drosophila americana to estimate the strength of selection on codon usage and/or biased gene conversion (BGC), taking into account a recent population expansion detected by a maximum-likelihood method. Drosophila americana was previously thought to have a stable demographic history, so that this evidence for a recent population expansion means that previous estimates of selection need revision. There was evidence for natural selection or BGC favouring GC over AT variants in introns, which is stronger for GC-rich than GC-poor introns. By comparing introns and coding sequences, we found evidence for selection on codon usage bias, which is much stronger than the forces acting on GC versus AT basepairs in introns.
Affiliation(s)
- Sophie Marion de Procé
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK.
|
232
|
Ho SYW, Lanfear R, Bromham L, Phillips MJ, Soubrier J, Rodrigo AG, Cooper A. Time-dependent rates of molecular evolution. Mol Ecol 2011; 20:3087-101. [PMID: 21740474 DOI: 10.1111/j.1365-294x.2011.05178.x] [Citation(s) in RCA: 360] [Impact Index Per Article: 27.7] [Indexed: 12/15/2022]
Abstract
For over half a century, it has been known that the rate of morphological evolution appears to vary with the time frame of measurement. Rates of microevolutionary change, measured between successive generations, were found to be far higher than rates of macroevolutionary change inferred from the fossil record. More recently, it has been suggested that rates of molecular evolution are also time dependent, with the estimated rate depending on the timescale of measurement. This followed surprising observations that estimates of mutation rates, obtained in studies of pedigrees and laboratory mutation-accumulation lines, exceeded long-term substitution rates by an order of magnitude or more. Although a range of studies have provided evidence for such a pattern, the hypothesis remains relatively contentious. Furthermore, there is ongoing discussion about the factors that can cause molecular rate estimates to be dependent on time. Here we present an overview of our current understanding of time-dependent rates. We provide a summary of the evidence for time-dependent rates in animals, bacteria and viruses. We review the various biological and methodological factors that can cause rates to be time dependent, including the effects of natural selection, calibration errors, model misspecification and other artefacts. We also describe the challenges in calibrating estimates of molecular rates, particularly on the intermediate timescales that are critical for an accurate characterization of time-dependent rates. This has important consequences for the use of molecular-clock methods to estimate timescales of recent evolutionary events.
Affiliation(s)
- Simon Y W Ho
- Centre for Macroevolution and Macroecology, Evolution Ecology & Genetics, Research School of Biology, Australian National University, Canberra, ACT, Australia.
|
233
|
Yeasty clocks: dating genomic changes in yeasts. C R Biol 2011; 334:620-8. [PMID: 21819943 DOI: 10.1016/j.crvi.2011.05.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 11/12/2010] [Accepted: 03/17/2011] [Indexed: 02/04/2023]
Abstract
Calibration of clocks to date evolutionary changes is of primary importance for comparative genomics. In the absence of fossil records, the dating of changes during yeast genome evolution can only rely on the properties of the genomes themselves, given the uncertainty of extrapolations using clocks from other organisms. In this work, we use the experimentally determined mutational rate of Saccharomyces cerevisiae to calculate the numbers of successive generations corresponding to observed sequence polymorphism between strains or species of other yeasts. We then examine synteny conservation across the entire subphylum of Saccharomycotina yeasts, and compare this second clock based on chromosomal rearrangements with the first one based on sequence divergence. A non-linear relationship is observed, that interestingly also applies to insects although, for equivalent sequence divergence, their rate of chromosomal rearrangements is higher than that of yeasts.
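The first clock described in this abstract converts sequence divergence into a number of generations using a per-generation mutation rate. A minimal Python sketch of that conversion follows. It assumes neutral sites, a constant rate, and divergence low enough that multiple hits at the same site can be ignored; the function name and the numeric values are illustrative placeholders, not figures from the paper.

```python
def generations_since_divergence(divergence_per_site: float,
                                 mutation_rate: float) -> float:
    """Estimate generations separating two lineages from their common ancestor.

    Assumes neutral sites and a constant per-site, per-generation mutation
    rate. Differences accumulate along both lineages, hence the factor of 2.
    No correction for multiple substitutions at one site is applied.
    """
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    if divergence_per_site < 0:
        raise ValueError("divergence_per_site must be non-negative")
    return divergence_per_site / (2.0 * mutation_rate)


if __name__ == "__main__":
    # Placeholder inputs chosen only to illustrate the arithmetic.
    assumed_rate = 3e-10        # mutations per site per generation (assumed)
    assumed_divergence = 0.006  # differences per site between two strains (assumed)
    print(generations_since_divergence(assumed_divergence, assumed_rate))
```

Converting generations into calendar time would need a generation time, which is itself uncertain for wild yeasts and is deliberately left out of this sketch.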
|
234
|
Inference of site frequency spectra from high-throughput sequence data: quantification of selection on nonsynonymous and synonymous sites in humans. Genetics 2011; 188:931-40. [PMID: 21596896 DOI: 10.1534/genetics.111.128355] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] Open
Abstract
Sequencing errors and random sampling of nucleotide types among sequencing reads at heterozygous sites present challenges for accurate, unbiased inference of single-nucleotide polymorphism genotypes from high-throughput sequence data. Here, we develop a maximum-likelihood approach to estimate the frequency distribution of the number of alleles in a sample of individuals (the site frequency spectrum), using high-throughput sequence data. Our method assumes binomial sampling of nucleotide types in heterozygotes and random sequencing error. By simulations, we show that close to unbiased estimates of the site frequency spectrum can be obtained if the error rate per base read does not exceed the population nucleotide diversity. We also show that these estimates are reasonably robust if errors are nonrandom. We then apply the method to infer site frequency spectra for zerofold degenerate, fourfold degenerate, and intronic sites of protein-coding genes using the low coverage human sequence data produced by the 1000 Genomes Project phase-one pilot. By fitting a model to the inferred site frequency spectra that estimates parameters of the distribution of fitness effects of new mutations, we find evidence for significant natural selection operating on fourfold sites. We also find that a model with variable effects of mutations at synonymous sites fits the data significantly better than a model with equal mutational effects. Under the variable effects model, we infer that 11% of synonymous mutations are subject to strong purifying selection.
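The two sampling assumptions named in this abstract (binomial sampling of reads at heterozygous sites and random sequencing error) can be illustrated with a per-site genotype likelihood. The Python sketch below is a simplified two-allele version written for illustration only: it is not the paper's estimator, which infers the site frequency spectrum across individuals, and the function names and the symmetric error model are assumptions.

```python
from math import comb


def read_alt_probability(alt_copies: int, error_rate: float) -> float:
    """Probability that one read shows the alternate allele.

    alt_copies is the diploid genotype (0, 1 or 2 alternate copies).
    Assumes two alleles and a symmetric error that flips one into the other.
    """
    p_true = alt_copies / 2.0
    return p_true * (1.0 - error_rate) + (1.0 - p_true) * error_rate


def genotype_likelihoods(alt_reads: int, total_reads: int,
                         error_rate: float) -> list[float]:
    """Binomial likelihood of the read counts under each diploid genotype."""
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    likelihoods = []
    for alt_copies in (0, 1, 2):
        p = read_alt_probability(alt_copies, error_rate)
        likelihoods.append(
            comb(total_reads, alt_reads)
            * p ** alt_reads
            * (1.0 - p) ** (total_reads - alt_reads)
        )
    return likelihoods


if __name__ == "__main__":
    # Three alternate reads out of six with a 1% error rate (made-up counts).
    print(genotype_likelihoods(3, 6, 0.01))
```

With low coverage, a heterozygote can easily yield reads from only one allele, which is why the abstract treats genotype uncertainty explicitly instead of calling genotypes first.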
|
235
|
|
236
|
Schneeberger K, Weigel D. Fast-forward genetics enabled by new sequencing technologies. Trends Plant Sci 2011; 16:282-8. [PMID: 21439889 DOI: 10.1016/j.tplants.2011.02.006] [Citation(s) in RCA: 120] [Impact Index Per Article: 9.2] [Received: 12/14/2010] [Revised: 02/09/2011] [Accepted: 02/11/2011] [Indexed: 05/18/2023]
Abstract
New sequencing technologies are dramatically accelerating progress in forward genetics, and the use of such methods for the rapid identification of mutant alleles will be soon routine in many laboratories. A straightforward extension will be the cloning of major-effect genetic variants in crop species. In the near future, it can be expected that mapping by sequencing will become a centerpiece in efforts to discover the genes responsible for quantitative trait loci. The largest impact, however, might come from the use of these strategies to extract genes from non-model, non-crop plants that exhibit heritable variation in important traits. Deployment of such genes to improve crops or engineer microbes that produce valuable compounds heralds a potential paradigm shift for plant biology.
Affiliation(s)
- Korbinian Schneeberger
- Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
|
237
|
|
238
|
Lawrie DS, Petrov DA, Messer PW. Faster than neutral evolution of constrained sequences: the complex interplay of mutational biases and weak selection. Genome Biol Evol 2011; 3:383-95. [PMID: 21498884 PMCID: PMC3101017 DOI: 10.1093/gbe/evr032] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Indexed: 12/27/2022] Open
Abstract
Comparative genomics has become widely accepted as the major framework for the ascertainment of functionally important regions in genomes. The underlying paradigm of this approach is that most of the functional regions are assumed to be under selective constraint, which in turn reduces the rate of evolution relative to neutrality. This assumption allows detection of functional regions through sequence conservation. However, constraint does not always lead to sequence conservation. When purifying selection is weak and mutation is biased, constrained regions can even evolve faster than neutral sequences and thus can appear to be under positive selection. Moreover, conservation estimates depend also on the orientation of selection relative to mutational biases and can vary over time. In the light of recent data of the ubiquity of mutational biases and weak selective forces, these effects should reduce the power of conservation analyses to define functional regions using comparative genomics data. We argue that the estimation of true mutational biases and the use of explicit evolutionary models are essential to improve methods inferring the action of natural selection and functionality in genome sequences.
|
239
|
Circumventing heterozygosity: sequencing the amplified genome of a single haploid Drosophila melanogaster embryo. Genetics 2011; 188:239-46. [PMID: 21441209 PMCID: PMC3122310 DOI: 10.1534/genetics.111.127530] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 01/09/2023] Open
Abstract
Heterozygosity is a major challenge to efficient, high-quality genomic assembly and to the full genomic survey of polymorphism and divergence. In Drosophila melanogaster, lines derived from equatorial populations are particularly resistant to inbreeding, thus imposing a major barrier to the determination and analysis of genomic variation in natural populations of this model organism. Here we present a simple genome sequencing protocol based on the whole-genome amplification of the gynogenetically derived haploid genome of a progeny of females mated to males homozygous for the recessive male sterile mutation, ms(3)K81. A single “lane” of paired-end sequences (2 × 76 bp) provides a good syntenic assembly with >95% high-quality coverage (more than five reads). The amplification of the genomic DNA moderately inflates the variation in coverage across the euchromatic portion of the genome. It also increases the frequency of chimeric clones, but the low frequency and random genomic distribution of the chimeric clones limit their impact on the final assemblies. This method provides a solid path forward for population genomic sequencing and offers applications to many other systems in which small amounts of genomic DNA have unique experimental relevance.
|
240
|
Appels R, Adelson DL, Moolhuijzen P, Webster H, Barrero R, Bellgard M. Genome studies at the PAG 2011 conference. Funct Integr Genomics 2011; 11:1-11. [PMID: 21360134 DOI: 10.1007/s10142-011-0215-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/10/2011] [Revised: 02/15/2011] [Accepted: 02/15/2011] [Indexed: 01/15/2023]
Abstract
The contents of the plenary lectures presented at the Plant and Animal Genome (PAG) meeting in January 2011 are summarized in order to provide some insights into the advances in plant, animal and microbe genome studies as they impact on our understanding of complex biological systems. The areas of biology covered include the dynamics of genome change, biological recognition processes and the new processes that underpin investment in science. This overview does not attempt to summarize the diversity of activities that are covered during the PAG through workshops, posters and the suppliers of cutting-edge technologies, but reviews major advances in specific research areas.
Affiliation(s)
- R Appels
- Centre for Comparative Genomics, Murdoch University, Perth, 6150, WA, Australia.
|
241
|
Haddrill PR, Zeng K, Charlesworth B. Determinants of synonymous and nonsynonymous variability in three species of Drosophila. Mol Biol Evol 2010; 28:1731-43. [PMID: 21191087 DOI: 10.1093/molbev/msq354] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 12/13/2022] Open
Abstract
We estimated the intensity of selection on preferred codons in Drosophila pseudoobscura and D. miranda at X-linked and autosomal loci, using a published data set on sequence variability at 67 loci, by means of an improved method that takes account of demographic effects. We found evidence for stronger selection at X-linked loci, consistent with their higher levels of codon usage bias. The estimates of the strength of selection and mutational bias in favor of unpreferred codons were similar to those found in other species, after taking into account the fact that D. pseudoobscura showed evidence for a recent expansion in population size. We examined correlates of synonymous and nonsynonymous diversity in these species and found no evidence for effects of recurrent selective sweeps on nonsynonymous mutations, probably because this set of genes has much higher than average levels of selective constraint. There was evidence for correlated effects of levels of selective constraint on protein sequences and on codon usage, as expected under models of selection for translational accuracy. Our analysis of a published data set on D. melanogaster provided evidence for the effects of selective sweeps of nonsynonymous mutations on linked synonymous diversity, but only in the subset of loci that experienced the highest rates of nonsynonymous substitution (about one-quarter of the total) and not at more slowly evolving loci. Our correlational analysis of this data set suggested that both selective constraints on protein sequences and recurrent selective sweeps affect the overall level of codon usage.
Affiliation(s)
- Penelope R Haddrill
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom.
|
242
|
Babbitt GA, Cotter CR. Functional conservation of nucleosome formation selectively biases presumably neutral molecular variation in yeast genomes. Genome Biol Evol 2010; 3:15-22. [PMID: 21135411 PMCID: PMC3014273 DOI: 10.1093/gbe/evq081] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023] Open
Abstract
One prominent pattern of mutational frequency, long appreciated in comparative genomics, is the bias of purine/pyrimidine conserving substitutions (transitions) over purine/pyrimidine altering substitutions (transversions). Traditionally, this transitional bias has been thought to be driven by the underlying rates of DNA mutation and/or repair. However, recent sequencing studies of mutation accumulation lines in model organisms demonstrate that substitutions generally do not accumulate at rates that would indicate a transitional bias. These observations have called into question a very basic assumption of molecular evolution: that naturally occurring patterns of molecular variation in noncoding regions accurately reflect the underlying processes of randomly accumulating neutral mutation in nuclear genomes. Here, in Saccharomyces yeasts, we report a very strong inverse association (r = −0.951, P < 0.004) between the genome-wide frequency of substitutions and their average energetic effect on nucleosome formation, as predicted by a structurally based energy model of DNA deformation around the nucleosome core. We find that transitions occurring at sites positioned nearest the nucleosome surface, which are believed to function most importantly in nucleosome formation, alter the deformation energy of DNA to the nucleosome core by only a fraction of the energy changes typical of most transversions. When we examined the same substitutions set against random background sequences as well as an existing study reporting substitutions arising in mutation accumulation lines of Saccharomyces cerevisiae, we failed to find a similar relationship. These results support the idea that natural selection acting to functionally conserve chromatin organization may contribute significantly to genome-wide transitional bias, even in noncoding regions. Because nucleosome core structure is highly conserved across eukaryotes, our observations may also help to further explain locally elevated transition bias at CpG islands, which are known to destabilize nucleosomes at vertebrate promoters.
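The transition/transversion distinction used in this abstract can be stated compactly in code. The Python sketch below classifies single-base substitutions and computes a transition-to-transversion ratio from a list of (reference, alternate) pairs. It is a generic illustration with hypothetical function names and has no connection to the energy model used by the authors.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change.

    A transition keeps the base class (purine to purine, or pyrimidine to
    pyrimidine); a transversion switches between the two classes.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions) -> float:
    """Ratio of transitions to transversions in an iterable of (ref, alt) pairs."""
    kinds = [classify_substitution(ref, alt) for ref, alt in substitutions]
    n_ts = kinds.count("transition")
    n_tv = kinds.count("transversion")
    if n_tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return n_ts / n_tv


if __name__ == "__main__":
    # Made-up substitutions for demonstration only.
    example = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A"), ("T", "G")]
    print(ts_tv_ratio(example))
```

Each base has one possible transition and two possible transversions, so a ratio of 0.5 is the expectation if all changes were equally likely; observed ratios above that value are what the abstract calls transitional bias.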
|
243
|
Lucas-Lledó JI, Maddamsetti R, Lynch M. Phylogenomic analysis of the uracil-DNA glycosylase superfamily. Mol Biol Evol 2010; 28:1307-17. [PMID: 21135150 DOI: 10.1093/molbev/msq318] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 12/25/2022] Open
Abstract
The spontaneous deamination of cytosine produces uracil mispaired with guanine in DNA, which will produce a mutation unless repaired. In all domains of life, uracil-DNA glycosylases (UDGs) are responsible for the elimination of uracil from DNA. Thus, UDGs contribute to the integrity of the genetic information and their loss results in mutator phenotypes. We are interested in understanding the role of UDG genes in the evolutionary variation of the rate and the spectrum of spontaneous mutations. To this end, we determined the presence or absence of the five main UDG families in more than 1,000 completely sequenced genomes and analyzed their patterns of gene loss and gain in eubacterial lineages. We observe nonindependent patterns of gene loss and gain between UDG families in Eubacteria, suggesting extensive functional overlap on an evolutionary timescale. Given that UDGs prevent transitions at G:C sites, we expected the loss of UDG genes to bias the mutational spectrum toward a lower equilibrium G + C content. To test this hypothesis, we used phylogenetically independent contrasts to compare the G + C content at intergenic and 4-fold redundant sites between lineages where UDG genes have been lost and their sister clades. None of the main UDG families present in Eubacteria was associated with a higher G + C content at intergenic or 4-fold redundant sites. We discuss the reasons for this negative result and report several features of the evolution of the UDG superfamily with implications for their functional study.
Keywords: uracil-DNA glycosylase, mutation rate evolution, mutational bias, GC content, DNA repair, mutator gene.
|
244
|
Zeng K, Charlesworth B. The effects of demography and linkage on the estimation of selection and mutation parameters. Genetics 2010; 186:1411-24. [PMID: 20923980 PMCID: PMC2998320 DOI: 10.1534/genetics.110.122150] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 08/13/2010] [Accepted: 09/27/2010] [Indexed: 11/18/2022] Open
Abstract
We explore the effects of demography and linkage on a maximum-likelihood (ML) method for estimating selection and mutation parameters in a reversible mutation model. This method assumes free recombination between sites and a randomly mating population of constant size and uses information from both polymorphic and monomorphic sites in the sample. Two likelihood-ratio test statistics were constructed under this ML framework: LRTγ for detecting selection and LRTκ for detecting mutational bias. By carrying out extensive simulations, we obtain the following results. When mutations are neutral and population size is constant, LRTγ and LRTκ follow a chi-square distribution with 1 d.f. regardless of the level of linkage, as long as the mutation rate is not very high. In addition, LRTγ and LRTκ are relatively insensitive to demographic effects and selection at linked sites. We find that the ML estimators of the selection and mutation parameters are usually approximately unbiased and that LRTκ usually has good power to detect mutational bias. Finally, with a recombination rate that is typical for Drosophila, LRTγ has good power to detect weak selection acting on synonymous sites. These results suggest that the method should be useful under many different circumstances.
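The abstract states that, under neutrality and constant population size, the likelihood-ratio statistics follow a chi-square distribution with 1 d.f. The Python sketch below shows how such a statistic is turned into a p-value using only the standard library. The log-likelihood inputs are placeholders that would come from fitting the null and alternative models; the function names are hypothetical and this is not the authors' software.

```python
from math import erfc, sqrt


def lrt_statistic(loglik_null: float, loglik_alt: float) -> float:
    """Likelihood-ratio statistic: twice the gain in maximized log-likelihood."""
    return 2.0 * (loglik_alt - loglik_null)


def chi2_1df_pvalue(statistic: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 d.f.

    For 1 d.f. the statistic is a squared standard normal, so
    P(X > x) = erfc(sqrt(x / 2)).
    """
    if statistic <= 0.0:
        return 1.0
    return erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Made-up log-likelihoods for a null model and a model with one extra
    # free parameter (for example, a selection parameter).
    stat = lrt_statistic(loglik_null=-1234.6, loglik_alt=-1231.1)
    print(stat, chi2_1df_pvalue(stat))
```

The chi-square reference distribution is only as good as the assumptions behind it, which is why the abstract checks it by simulation under different levels of linkage and demographic change.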
Affiliation(s)
- Kai Zeng
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
245
|
Obbard DJ, Jiggins FM, Bradshaw NJ, Little TJ. Recent and recurrent selective sweeps of the antiviral RNAi gene Argonaute-2 in three species of Drosophila. Mol Biol Evol 2010; 28:1043-56. [PMID: 20978039 PMCID: PMC3021790 DOI: 10.1093/molbev/msq280] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 12/21/2022] Open
Abstract
Antagonistic host–parasite interactions can drive rapid adaptive evolution in genes of the immune system, and such arms races may be an important force shaping polymorphism in the genome. The RNA interference pathway gene Argonaute-2 (AGO2) is a key component of antiviral defense in Drosophila, and we have previously shown that genes in this pathway experience unusually high rates of adaptive substitution. Here we study patterns of genetic variation in a 100-kbp region around AGO2 in three different species of Drosophila. Our data suggest that recent independent selective sweeps in AGO2 have reduced genetic variation across a region of more than 50 kbp in Drosophila melanogaster, D. simulans, and D. yakuba, and we estimate that selection has fixed adaptive substitutions in this gene every 30–100 thousand years. The strongest signal of recent selection is evident in D. simulans, where we estimate that the most recent selective sweep involved an allele with a selective advantage of the order of 0.5–1% and occurred roughly 13–60 Kya. To evaluate the potential consequences of the recent substitutions on the structure and function of AGO2, we used fold-recognition and homology-based modeling to derive a structural model for the Drosophila protein, and this suggests that recent substitutions in D. simulans are overrepresented at the protein surface. In summary, our results show that selection by parasites can consistently target the same genes in multiple species, resulting in areas of the genome that have markedly reduced genetic diversity.
Affiliation(s)
- Darren J Obbard
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK.
|
246
|
Yu G. GenHtr: a tool for comparative assessment of genetic heterogeneity in microbial genomes generated by massive short-read sequencing. BMC Bioinformatics 2010; 11:508. [PMID: 20939910 PMCID: PMC2967562 DOI: 10.1186/1471-2105-11-508] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/20/2010] [Accepted: 10/12/2010] [Indexed: 12/02/2022] Open
Abstract
Background: Microevolution is the study of short-term changes of alleles within a population and their effects on the phenotype of organisms. The result of this below-species-level evolution is heterogeneity, where populations consist of subpopulations with a large number of structural variations. Heterogeneity analysis is thus essential to our understanding of how selective and neutral forces shape bacterial populations over a short period of time. The Solexa Genome Analyzer, a next-generation sequencing platform, allows millions of short sequencing reads to be obtained with great accuracy, making it possible to study the dynamics of a bacterial population at the whole-genome level. The tool referred to as GenHtr was developed for genome-wide heterogeneity analysis.
Results: For particular bacterial strains, GenHtr relies on a set of Solexa short reads from given bacterial pathogens and their isogenic reference genome to identify heterogeneity sites (the chromosomal positions with multiple variants of genes in the bacterial population) and variations that occur in large gene families. GenHtr accomplishes this by building and comparatively analyzing genome-wide heterogeneity genotypes for both the newly sequenced genomes (using massive short-read sequencing) and their isogenic reference (using simulated data). As proof of concept, this approach was applied to SRX007711, the Solexa sequencing data for a newly sequenced Staphylococcus aureus subsp. USA300 cell line, and demonstrated that it could predict such multiple variants. They include multiple variants of genes critical in pathogenesis, e.g. genes encoding a LysR family transcriptional regulator, 23S ribosomal RNA, and DNA mismatch repair protein MutS. The heterogeneity results in non-synonymous and nonsense mutations, leading to truncated proteins for both LysR and MutS.
Conclusion: GenHtr was developed for genome-wide heterogeneity analysis. Although it is much more time-consuming than Maq, a popular tool for SNP analysis, GenHtr is able to predict potential multiple variants that pre-exist in the bacterial population as well as SNPs that occur in highly duplicated gene families. It is expected that, with the proper experimental design, this analysis can improve our understanding of the molecular mechanism underlying the dynamics and the evolution of drug-resistant bacterial pathogens.
Affiliation(s)
- Gongxin Yu
- Department of Biological Science, Boise State University, 1910 University Drive, Boise, Idaho 83725, USA.
|
247
|
Abstract
There has been an enormous increase in the amount of data on DNA sequence polymorphism available for many organisms in the last decade. New sequencing technologies provide great potential for investigating natural selection in plants using population genomic approaches. However, plant populations frequently show significant departures from the assumptions of standard models used to detect selection and many forms of directional selection do not fit with classical population genetics theory. Here, we explore the extent to which plant populations show departures from standard model assumptions, and the implications this has for detecting selection on molecular variation. A growing number of multilocus studies of nucleotide variation suggest that changes in population size, particularly bottlenecks, and strong subdivision may be common in plants. This demographic variation presents important challenges for models used to infer selection. In addition, selection from standing genetic variation and multiple independent adaptive substitutions can further complicate efforts to understand the nature of selection. We discuss emerging patterns from plant studies and propose that, rather than treating population history as a nuisance variable when testing for selection, the interaction between demography and selection is of fundamental importance for evolutionary studies of plant populations using molecular data.
Affiliation(s)
- Mathieu Siol
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada.
|
248
|
Brockhurst MA, Colegrave N, Rozen DE. Next-generation sequencing as a tool to study microbial evolution. Mol Ecol 2010; 20:972-80. [PMID: 20874764 DOI: 10.1111/j.1365-294x.2010.04835.x] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 12/29/2022]
Abstract
Thanks to their short generation times and large population sizes, microbes evolve rapidly. Evolutionary biologists have exploited this to observe evolution in real time. The falling costs of whole-genome sequencing using next-generation technologies now mean that it is realistic to use this as a tool to study this rapid microbial evolution both in the laboratory and in the wild. Such experiments are being used to accurately estimate the rates of mutation, reveal the genetic targets and dynamics of natural selection, uncover the correlation (or lack thereof) between genetic and phenotypic change, and provide data to test long-standing evolutionary hypotheses. These advances have important implications for our understanding of the within- and between-host evolution of microbial pathogens.
Affiliation(s)
- Michael A Brockhurst
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK.
|
249
|
Nishant KT, Wei W, Mancera E, Argueso JL, Schlattl A, Delhomme N, Ma X, Bustamante CD, Korbel JO, Gu Z, Steinmetz LM, Alani E. The baker's yeast diploid genome is remarkably stable in vegetative growth and meiosis. PLoS Genet 2010; 6:e1001109. [PMID: 20838597 PMCID: PMC2936533 DOI: 10.1371/journal.pgen.1001109] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Received: 05/11/2010] [Accepted: 08/03/2010] [Indexed: 11/18/2022] Open
Abstract
Accurate estimates of mutation rates provide critical information to analyze genome evolution and organism fitness. We used whole-genome DNA sequencing, pulse-field gel electrophoresis, and comparative genome hybridization to determine mutation rates in diploid vegetative and meiotic mutation accumulation lines of Saccharomyces cerevisiae. The vegetative lines underwent only mitotic divisions while the meiotic lines underwent a meiotic cycle every ∼20 vegetative divisions. Similar base substitution rates were estimated for both lines. Given our experimental design, these measures indicated that the meiotic mutation rate is within the range of being equal to zero to being 55-fold higher than the vegetative rate. Mutations detected in vegetative lines were all heterozygous while those in meiotic lines were homozygous. A quantitative analysis of intra-tetrad mating events in the meiotic lines showed that inter-spore mating is primarily responsible for rapidly fixing mutations to homozygosity as well as for removing mutations. We did not observe 1-2 nt insertion/deletion (in-del) mutations in any of the sequenced lines and only one structural variant in a non-telomeric location was found. However, a large number of structural variations in subtelomeric sequences were seen in both vegetative and meiotic lines that did not affect viability. Our results indicate that the diploid yeast nuclear genome is remarkably stable during the vegetative and meiotic cell cycles and support the hypothesis that peripheral regions of chromosomes are more dynamic than gene-rich central sections where structural rearrangements could be deleterious. This work also provides an improved estimate for the mutational load carried by diploid organisms.
Affiliation(s)
- K. T. Nishant
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- Wu Wei
- European Molecular Biology Laboratory, Heidelberg, Germany
- Juan Lucas Argueso
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, United States of America
- Xin Ma
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Carlos D. Bustamante
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Jan O. Korbel
- European Molecular Biology Laboratory, Heidelberg, Germany
- Zhenglong Gu
- Division of Nutritional Sciences, Cornell University, Ithaca, New York, United States of America
- Lars M. Steinmetz
- European Molecular Biology Laboratory, Heidelberg, Germany
- * E-mail: (LMS); (EA)
- Eric Alani
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
- * E-mail: (LMS); (EA)
|
250
|
Hershberg R, Petrov DA. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet 2010; 6:e1001115. [PMID: 20838599 PMCID: PMC2936535 DOI: 10.1371/journal.pgen.1001115] [Citation(s) in RCA: 304] [Impact Index Per Article: 21.7] [Received: 02/09/2010] [Accepted: 08/09/2010] [Indexed: 11/19/2022] Open
Abstract
Mutation is the engine that drives evolution and adaptation forward in that it generates the variation on which natural selection acts. Mutation is a random process that nevertheless occurs according to certain biases. Elucidating mutational biases and the way they vary across species and within genomes is crucial to understanding evolution and adaptation. Here we demonstrate that clonal pathogens that evolve under severely relaxed selection are uniquely suitable for studying mutational biases in bacteria. We estimate mutational patterns using sequence datasets from five such clonal pathogens belonging to four diverse bacterial clades that span most of the range of genomic nucleotide content. We demonstrate that across different types of sites and in all four clades mutation is consistently biased towards AT. This is true even in clades that have high genomic GC content. In all studied cases the mutational bias towards AT is primarily due to the high rate of C/G to T/A transitions. These results suggest that bacterial mutational biases are far less variable than previously thought. They further demonstrate that variation in nucleotide content cannot stem entirely from variation in mutational biases and that natural selection and/or a natural selection-like process such as biased gene conversion strongly affect nucleotide content.
Affiliation(s)
- Ruth Hershberg
- Department of Biology, Stanford University, Stanford, California, United States of America.
|