1
|
Zhao Z, Jin L, Fu YX, Ramsay M, Jenkins T, Leskinen E, Pamilo P, Trexler M, Patthy L, Jorde LB, Ramos-Onsins S, Yu N, Li WH. Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc Natl Acad Sci U S A 2000; 97:11354-8. [PMID: 11005839 PMCID: PMC17204 DOI: 10.1073/pnas.200348197] [Citation(s) in RCA: 150] [Impact Index Per Article: 6.0] [Indexed: 11/18/2022] Open
Abstract
Human DNA sequence variation data are useful for studying the origin, evolution, and demographic history of modern humans and the mechanisms of maintenance of genetic variability in human populations, and for detecting linkage association of disease. Here, we report worldwide variation data from an approximately 10-kilobase noncoding autosomal region. We identified 75 variant sites in 64 humans (128 sequences) and 463 variant sites among the human, chimpanzee, and orangutan sequences. Statistical tests suggested that the region is selectively neutral. The average nucleotide diversity (π) across the region was 0.088% among all of the human sequences obtained, 0.085% among African sequences, and 0.082% among non-African sequences, supporting the view of a low nucleotide diversity (approximately 0.1%) in humans. The π value in non-Africans, comparable to that in Africans, indicates no severe bottleneck during the evolution of modern non-Africans; however, the possibility of a mild bottleneck cannot be excluded because non-Africans showed considerably fewer variants than Africans. The present and two previous large data sets all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The mutation rate was estimated to be 1.15 × 10⁻⁹ per nucleotide per year. Estimates of the long-term effective population size Nₑ by various statistical methods were similar to those in other studies. The age of the most recent common ancestor was estimated to be approximately 1.29 million years ago among all of the sequences obtained and approximately 634,000 years ago among the non-African sequences, providing the first evidence from a noncoding autosomal region for ancient human histories, even among non-Africans.
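For readers unfamiliar with the statistic, nucleotide diversity (π) is conventionally computed as the average number of pairwise differences per site among aligned sequences. The following is a minimal sketch under that standard definition, not the authors' pipeline; the function name `nucleotide_diversity` and the toy alignment are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment (illustrative only, not data from the paper).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ACGTTCGAAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.4f}")
```

This version ignores gaps and ambiguous bases; a real analysis would need to decide how to treat them.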
|
research-article |
25 |
150 |
2
|
Hughes AL, Friedman R, Murray M. Genomewide pattern of synonymous nucleotide substitution in two complete genomes of Mycobacterium tuberculosis. Emerg Infect Dis 2002; 8:1342-6. [PMID: 12453367 PMCID: PMC2738538 DOI: 10.3201/eid0811.020064] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.0] [Indexed: 11/26/2022] Open
Abstract
Comparison of the pattern of synonymous nucleotide substitution between two complete genomes of Mycobacterium tuberculosis at 3298 putatively orthologous loci showed a mean percent difference per synonymous site of 0.000328 ± 0.000022. Although 80.5% of loci showed no synonymous or nonsynonymous nucleotide differences, the level of polymorphism observed at other loci was greater than suggested by previous studies of a small number of loci. This level of nucleotide difference leads to the conservative estimate that the common ancestor of these two genotypes occurred approximately 35,000 years ago, which is twice as high as some recent estimates of the time of origin of this species. Our results suggest that a large number of loci should be examined for an accurate assessment of the level of nucleotide diversity in natural populations of pathogenic microorganisms.
|
brief-report |
23 |
69 |
3
|
Félix MA, Jovelin R, Ferrari C, Han S, Cho YR, Andersen EC, Cutter AD, Braendle C. Species richness, distribution and genetic diversity of Caenorhabditis nematodes in a remote tropical rainforest. BMC Evol Biol 2013; 13:10. [PMID: 23311925 PMCID: PMC3556333 DOI: 10.1186/1471-2148-13-10] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 10/09/2012] [Accepted: 01/07/2013] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND In stark contrast to the wealth of detail about C. elegans developmental biology and molecular genetics, biologists lack basic data for understanding the abundance and distribution of Caenorhabditis species in natural areas that are unperturbed by human influence. METHODS Here we report the analysis of dense sampling from a small, remote site in the Amazonian rain forest of the Nouragues Natural Reserve in French Guiana. RESULTS Sampling of rotting fruits and flowers revealed proliferating populations of Caenorhabditis, with up to three different species co-occurring within a single substrate sample, indicating remarkable overlap of local microhabitats. We isolated six species, representing the highest local species richness for Caenorhabditis encountered to date, including both tropically cosmopolitan and geographically restricted species not previously isolated elsewhere. We also documented the structure of within-species molecular diversity at multiple spatial scales, focusing on 57 C. briggsae isolates from French Guiana. Two distinct genetic subgroups co-occur even within a single fruit. However, the structure of C. briggsae population genetic diversity in French Guiana does not result from strong local patterning but instead presents a microcosm of global patterns of differentiation. We further integrate our observations with new data from nearly 50 additional recently collected C. briggsae isolates from both tropical and temperate regions of the world to re-evaluate local and global patterns of intraspecific diversity, providing the most comprehensive analysis to date for C. briggsae population structure across multiple spatial scales. CONCLUSIONS The abundance and species richness of Caenorhabditis nematodes is high in a Neotropical rainforest habitat that is subject to minimal human interference. 
Microhabitat preferences overlap for different local species, although global distributions include both cosmopolitan and geographically restricted groups. Local samples for the cosmopolitan C. briggsae mirror its pan-tropical patterns of intraspecific polymorphism. It remains an important challenge to decipher what drives Caenorhabditis distributions and diversity within and between species.
|
Research Support, N.I.H., Extramural |
12 |
65 |
4
|
Schou MF, Kristensen TN, Kellermann V, Schlötterer C, Loeschcke V. A Drosophila laboratory evolution experiment points to low evolutionary potential under increased temperatures likely to be experienced in the future. J Evol Biol 2014; 27:1859-68. [PMID: 24925446 DOI: 10.1111/jeb.12436] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 02/19/2014] [Accepted: 05/20/2014] [Indexed: 11/29/2022]
Abstract
The ability to respond evolutionarily to increasing temperatures is important for survival of ectotherms in a changing climate. Recent studies suggest that upper thermal limits may be evolutionarily constrained. We address this hypothesis in a laboratory evolution experiment, encompassing ecologically relevant thermal regimes. To examine the potential for species to respond to climate change, we exposed replicate populations of Drosophila melanogaster to increasing temperatures (0.3 °C every generation) for 20 generations, whereas corresponding replicate control populations were held at benign thermal conditions throughout the experiment. We hypothesized that replicate populations exposed to increasing temperatures would show increased resistance to warm and dry environments compared with replicate control populations. Contrasting replicate populations held at the two thermal regimes showed (i) an increase in desiccation resistance and a decline in heat knock-down resistance in replicate populations exposed to increasing temperatures, (ii) similar egg-to-adult viability and fecundity in replicate populations from the two thermal regimes, when assessed at high stressful temperatures and (iii) no difference in nucleotide diversity between thermal regimes. The limited scope for adaptive evolutionary responses shown in this study highlights the challenges faced by ectotherms under climate change.
|
Research Support, Non-U.S. Gov't |
11 |
61 |
5
|
Sackton TB, Kulathinal RJ, Bergman CM, Quinlan AR, Dopman EB, Carneiro M, Marth GT, Hartl DL, Clark AG. Population genomic inferences from sparse high-throughput sequencing of two populations of Drosophila melanogaster. Genome Biol Evol 2009; 1:449-65. [PMID: 20333214 PMCID: PMC2839279 DOI: 10.1093/gbe/evp048] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Accepted: 11/14/2009] [Indexed: 12/20/2022] Open
Abstract
Short-read sequencing techniques provide the opportunity to capture genome-wide sequence data in a single experiment. A current challenge is to identify questions that shallow-depth genomic data can address successfully and to develop corresponding analytical methods that are statistically sound. Here, we apply the Roche/454 platform to survey natural variation in strains of Drosophila melanogaster from an African (n = 3) and a North American (n = 6) population. Reads were aligned to the reference D. melanogaster genomic assembly, single nucleotide polymorphisms were identified, and nucleotide variation was quantified genome wide. Simulations and empirical results suggest that nucleotide diversity can be accurately estimated from sparse data with as little as 0.2x coverage per line. The unbiased genomic sampling provided by random short-read sequencing also allows insight into distributions of transposable elements and copy number polymorphisms found within populations and demonstrates that short-read sequencing methods provide an efficient means to quantify variation in genome organization and content. Continued development of methods for statistical inference of shallow-depth genome-wide sequencing data will allow such sparse, partial data sets to become the norm in the emerging field of population genomics.
|
Journal Article |
16 |
56 |
6
|
Nabholz B, Sarah G, Sabot F, Ruiz M, Adam H, Nidelet S, Ghesquière A, Santoni S, David J, Glémin S. Transcriptome population genomics reveals severe bottleneck and domestication cost in the African rice (Oryza glaberrima). Mol Ecol 2014; 23:2210-27. [PMID: 24684265 DOI: 10.1111/mec.12738] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 11/25/2013] [Accepted: 03/19/2014] [Indexed: 12/17/2022]
Abstract
The African cultivated rice (Oryza glaberrima) was domesticated in West Africa 3000 years ago. Although less cultivated than the Asian rice (O. sativa), O. glaberrima landraces often display interesting adaptations to rustic environments (e.g. drought). Here, using RNA-seq technology, we were able to compare more than 12,000 transcripts between 9 O. glaberrima, 10 wild O. barthii and one O. meridionalis individuals. With a synonymous nucleotide diversity πs = 0.0006 per site, O. glaberrima appears as the least genetically diverse crop grass ever documented. Using approximate Bayesian computation, we estimated that O. glaberrima experienced a severe bottleneck during domestication. This demographic scenario almost fully accounts for the pattern of genetic diversity across the O. glaberrima genome, as we detected very few outlier regions where positive selection may have further impacted genetic diversity. Moreover, the large excess of derived nonsynonymous substitutions that we detected suggests that the O. glaberrima population suffered from the 'cost of domestication'. In addition, we used this genome-scale data set to demonstrate that (i) O. barthii genetic diversity is positively correlated with recombination rate and negatively with gene density, (ii) expression level is negatively correlated with evolutionary constraint, and (iii) one region on chromosome 5 (position 4-6 Mb) exhibits a clear signature of introgression with a yet unidentified Oryza species. This work represents the first genome-wide survey of the African rice genetic diversity and paves the way for further comparison between the African and the Asian rice, notably regarding the genetics underlying domestication traits.
|
Research Support, Non-U.S. Gov't |
11 |
53 |
7
|
Lu L, Shao D, Qiu X, Sun L, Yan W, Zhou X, Yang L, He Y, Yu S, Xing Y. Natural variation and artificial selection in four genes determine grain shape in rice. New Phytol 2013; 200:1269-80. [PMID: 23952103 DOI: 10.1111/nph.12430] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 04/19/2013] [Accepted: 06/25/2013] [Indexed: 05/20/2023]
Abstract
The size of cultivated rice (Oryza sativa) grains has been altered by both domestication and artificial selection over the course of evolutionary history. Several quantitative trait loci (QTLs) for grain size have been cloned in the past 10 yr. To explore the natural variation in these QTLs, resequencing of grain width and weight 2 (GW2), grain size 5 (GS5) and QTL for seed width 5 (qSW5) and genotyping of grain size 3 (GS3) were performed in the germplasms of 127 varieties of rice (O. sativa) and 10-15 samples of wild rice (Oryza rufipogon). Ten, 10 and 15 haplotypes were observed for GW2, GS5 and qSW5. qSW5 and GS3 had the strongest effects on grain size, which have been widely utilized in rice production, whereas GW2 and GS5 showed more modest effects. GS5 showed small sequence variations in O. sativa germplasm and that of its progenitor O. rufipogon. qSW5 exhibited the highest level of nucleotide diversity. GW2 showed signs of purifying selection. The four grain size genes experienced different selection intensities depending on their genetic effects. In the indica population, linkage disequilibrium (LD) was detected among GS3, qSW5 and GS5. The substantial genetic variation in these four genes provides the flexibility needed to design various rice grain shapes. These findings provide insight into the evolutionary features of grain size genes in rice.
|
|
12 |
49 |
8
|
Campo DS, Xia GL, Dimitrova Z, Lin Y, Forbi JC, Ganova-Raeva L, Punkova L, Ramachandran S, Thai H, Skums P, Sims S, Rytsareva I, Vaughan G, Roh HJ, Purdy MA, Sue A, Khudyakov Y. Accurate Genetic Detection of Hepatitis C Virus Transmissions in Outbreak Settings. J Infect Dis 2016; 213:957-965. [PMID: 26582955 PMCID: PMC5119477 DOI: 10.1093/infdis/jiv542] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 08/15/2015] [Accepted: 10/08/2015] [Indexed: 12/18/2022] Open
Abstract
Hepatitis C is a major public health problem in the United States and worldwide. Outbreaks of hepatitis C virus (HCV) infections are associated with unsafe injection practices, drug diversion, and other exposures to blood and are difficult to detect and investigate. Here, we developed and validated a simple approach for molecular detection of HCV transmissions in outbreak settings. We obtained sequences from the HCV hypervariable region 1 (HVR1), using end-point limiting-dilution (EPLD) technique, from 127 cases involved in 32 epidemiologically defined HCV outbreaks and 193 individuals with unrelated HCV strains. We compared several types of genetic distances and calculated a threshold, using minimal Hamming distances, that identifies transmission clusters in all tested outbreaks with 100% accuracy. The approach was also validated on sequences obtained using next-generation sequencing from HCV strains recovered from 239 individuals, and findings showed the same accuracy as that for EPLD. On average, the nucleotide diversity of the intrahost population was 6.2 times greater in the source case than in any incident case, allowing the correct detection of transmission direction in 8 outbreaks for which source cases were known. A simple and accurate distance-based approach developed here for detecting HCV transmissions streamlines molecular investigation of outbreaks, thus improving the public health capacity for rapid and effective control of hepatitis C.
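The distance-based idea described above can be illustrated with a small sketch: two cases are treated as linked when the minimal Hamming distance between any pair of their intrahost variants falls at or below a threshold. This is an assumption-laden illustration, not the authors' implementation; the function names, the toy sequences, and the `threshold` value of 0.05 are all hypothetical, since the abstract does not state the calibrated threshold.

```python
def hamming(a, b):
    """Proportion of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def min_hamming(sample_a, sample_b):
    """Smallest distance between any variant in one sample and any in the other."""
    return min(hamming(a, b) for a in sample_a for b in sample_b)


def linked(sample_a, sample_b, threshold):
    """Flag a putative transmission link when the minimal distance is small enough."""
    return min_hamming(sample_a, sample_b) <= threshold


# Hypothetical intrahost variant sets (illustrative only).
case_1 = ["ACGTACGTAC", "ACGTACGTTC"]
case_2 = ["ACGTACGTTC", "ACGAACGTTC"]
case_3 = ["TTGTACCTAA", "TTGTACCTAG"]

threshold = 0.05  # hypothetical value, NOT the threshold calibrated in the study
print(linked(case_1, case_2, threshold))  # cases sharing an identical variant
print(linked(case_1, case_3, threshold))  # cases with only distant variants
```

Using the minimum rather than the mean distance means a single shared or near-identical variant is enough to link two cases, which suits intrahost populations that are otherwise diverse.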
|
research-article |
9 |
47 |
9
|
Dong W, Cheng T, Li C, Xu C, Long P, Chen C, Zhou S. Discriminating plants using the DNA barcode rbcLb: an appraisal based on a large data set. Mol Ecol Resour 2013; 14:336-43. [PMID: 24119263 DOI: 10.1111/1755-0998.12185] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 05/09/2013] [Revised: 09/11/2013] [Accepted: 10/01/2013] [Indexed: 11/26/2022]
Abstract
The ideal DNA barcode for plants remains to be discovered, and the candidate barcode rbcL has been met with considerable skepticism since its proposal. In fact, the variability within this gene has never been fully explored across all plant groups from algae to flowering plants, and its performance as a barcode has not been adequately tested. By analysing all of the rbcL sequences currently available in GenBank, we attempted to determine how well a region of rbcL performs as a barcode in species discrimination. We found that the rbcLb region was more variable than the frequently used rbcLa region. Both universal and plant group-specific primers were designed to amplify rbcLb, and the performance of rbcLa and rbcLb was tested in several ways. Using blast, both regions successfully identified all families and nearly all genera; however, the successful species identification rates varied significantly among plant groups, ranging from 24.58% to 85.50% for rbcLa and from 36.67% to 90.89% for rbcLb. Successful species discrimination ranged from 5.19% to 96.33% for rbcLa and from 22.09% to 98.43% for rbcLb in species-rich families, and from 0 to 88.73% for rbcLa and from 2.04% to 100% for rbcLb in species-rich genera. Both regions performed better for lower plants than for higher plants, although rbcLb performed significantly better than rbcLa overall, particularly for angiosperms. Considering the applicability across plants, easy and unambiguous alignment, high primer universality, high sequence quality and high species discrimination power for lower plants, we suggest rbcLb as a universal plant barcode.
|
Research Support, Non-U.S. Gov't |
12 |
45 |
10
|
Silva-Junior OB, Grattapaglia D. Genome-wide patterns of recombination, linkage disequilibrium and nucleotide diversity from pooled resequencing and single nucleotide polymorphism genotyping unlock the evolutionary history of Eucalyptus grandis. New Phytol 2015; 208:830-45. [PMID: 26079595 DOI: 10.1111/nph.13505] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 11/28/2014] [Accepted: 05/06/2015] [Indexed: 05/03/2023]
Abstract
We used high-density single nucleotide polymorphism (SNP) data and whole-genome pooled resequencing to examine the landscape of population recombination (ρ) and nucleotide diversity (θw), assess the extent of linkage disequilibrium (r²) and build the highest density linkage maps for Eucalyptus. At the genome-wide level, linkage disequilibrium (LD) decayed within c. 4-6 kb, slower than previously reported from candidate gene studies, but showing considerable variation from absence to complete LD up to 50 kb. A sharp decrease in the estimate of ρ was seen when going from short to genome-wide inter-SNP distances, highlighting the dependence of this parameter on the scale of observation adopted. Recombination was correlated with nucleotide diversity, gene density and distance from the centromere, with hotspots of recombination enriched for genes involved in chemical reactions and pathways of the normal metabolic processes. The high nucleotide diversity (θw = 0.022) of E. grandis revealed that mutation is more important than recombination in shaping its genomic diversity (ρ/θw = 0.645). Chromosome-wide ancestral recombination graphs allowed us to date the split of E. grandis (1.7-4.8 million yr ago) and identify a scenario for the recent demographic history of the species. Our results have considerable practical importance to Genome Wide Association Studies (GWAS), while indicating bright prospects for genomic prediction of complex phenotypes in eucalypt breeding.
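As background on the diversity statistic used here, Watterson's estimator is conventionally defined as the number of segregating sites divided by the harmonic number a_n = 1 + 1/2 + ... + 1/(n-1) and by the sequence length. The sketch below assumes that textbook definition and is not the authors' pooled-sequencing estimator; the function name and the toy inputs are hypothetical. The last lines simply multiply the two values reported in the abstract to show the ρ they imply.

```python
def watterson_theta(segregating_sites, n_sequences, n_sites):
    """Per-site Watterson's theta: S / (a_n * L), with a_n = sum of 1/i for i < n."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)


# Hypothetical toy inputs (illustrative only, not values from the study).
theta_toy = watterson_theta(segregating_sites=11, n_sequences=4, n_sites=1000)
print(f"toy theta_w = {theta_toy:.4f}")

# Values reported in the abstract: theta_w = 0.022 and rho/theta_w = 0.645.
theta_w = 0.022
rho_over_theta = 0.645
implied_rho = rho_over_theta * theta_w
print(f"implied rho = {implied_rho:.5f}")
```

A ratio below 1 is what the abstract reads as mutation contributing more than recombination to genomic diversity.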
|
|
10 |
45 |
11
|
Xia H, Camus-Kulandaivelu L, Stephan W, Tellier A, Zhang Z. Nucleotide diversity patterns of local adaptation at drought-related candidate genes in wild tomatoes. Mol Ecol 2010; 19:4144-54. [PMID: 20831645 DOI: 10.1111/j.1365-294x.2010.04762.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022]
Abstract
We surveyed nucleotide diversity at two candidate genes LeNCED1 and pLC30-15, involved in an ABA (abscisic acid) signalling pathway, in two closely related tomato species Solanum peruvianum and Solanum chilense. Our six population samples (three for each species) cover a range of mesic to very dry habitats. The ABA pathway plays an important role in the plants' response to drought stress. LeNCED1 is an upstream gene involved in ABA biosynthesis, and pLC30-15 is a dehydrin gene positioned downstream in the pathway. The two genes show very different patterns of nucleotide variation. LeNCED1 exhibits very low nucleotide diversity relative to the eight neutral reference loci that were previously surveyed in these populations. This suggests that strong purifying selection has been acting on this gene. In contrast, pLC30-15 exhibits higher levels of nucleotide diversity and, in particular in S. chilense, higher genetic differentiation between populations than the reference loci, which is indicative of local adaptation. In the more drought-tolerant species S. chilense, one population (from Quicacha) shows a significant haplotype structure, which appears to be the result of positive (diversifying) selection.
|
Research Support, Non-U.S. Gov't |
15 |
43 |
12
|
Testing for the footprint of sexually antagonistic polymorphisms in the pseudoautosomal region of a plant sex chromosome pair. Genetics 2013; 194:663-72. [PMID: 23733787 DOI: 10.1534/genetics.113.152397] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 11/18/2022] Open
Abstract
The existence of sexually antagonistic (SA) polymorphism is widely considered the most likely explanation for the evolution of suppressed recombination of sex chromosome pairs. This explanation is largely untested empirically, and no such polymorphisms have been identified, other than in fish, where no evidence directly implicates these genes in events causing loss of recombination. We tested for the presence of loci with SA polymorphism in the plant Silene latifolia, which is dioecious (with separate male and female individuals) and has a pair of highly heteromorphic sex chromosomes, with XY males. Suppressed recombination between much of the Y and X sex chromosomes evolved in several steps, and the results in Bergero et al. (2013) show that it is still ongoing in the recombining, or pseudoautosomal, regions (PARs) of these chromosomes. We used molecular evolutionary approaches to test for the footprints of SA polymorphisms, based on sequence diversity levels in S. latifolia PAR genes identified by genetic mapping. Nucleotide diversity is high for at least four of six PAR genes identified, and our data suggest the existence of polymorphisms maintained by balancing selection in this genome region, since molecular evolutionary (HKA) tests exclude an elevated mutation rate, and other tests also suggest balancing selection. The presence of sexually antagonistic alleles at a locus or loci in the PAR is suggested by the very different X and Y chromosome allele frequencies for at least one PAR gene.
|
Research Support, Non-U.S. Gov't |
12 |
43 |
13
|
Mosca E, Eckert AJ, Liechty JD, Wegrzyn JL, La Porta N, Vendramin GG, Neale DB. Contrasting patterns of nucleotide diversity for four conifers of Alpine European forests. Evol Appl 2012; 5:762-75. [PMID: 23144662 PMCID: PMC3492901 DOI: 10.1111/j.1752-4571.2012.00256.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 02/03/2012] [Accepted: 02/11/2012] [Indexed: 11/29/2022] Open
Abstract
A candidate gene approach was used to identify levels of nucleotide diversity and to identify genes departing from neutral expectations in coniferous species of the Alpine European forest. Twelve samples were collected from four species that dominate montane and subalpine forests throughout Europe: Abies alba Mill, Larix decidua Mill, Pinus cembra L., and Pinus mugo Turra. A total of 800 genes, originally resequenced in Pinus taeda L., were resequenced across 12 independent trees for each of the four species. Genes were assigned to two categories, candidate and control, defined through homology-based searches to Arabidopsis. Estimates of nucleotide diversity per site varied greatly between polymorphic candidate genes (range: 0.0004–0.1295) and among species (range: 0.0024–0.0082), but were within the previously established ranges for conifers. Tests of neutrality using stringent significance thresholds, performed under the standard neutral model, revealed one to seven outlier loci for each species. Some of these outliers encode proteins that are involved with plant stress responses and form the basis for further evolutionary enquiries.
|
Journal Article |
13 |
43 |
14
|
Hardigan MA, Lorant A, Pincot DDA, Feldmann MJ, Famula RA, Acharya CB, Lee S, Verma S, Whitaker VM, Bassil N, Zurn J, Cole GS, Bird K, Edger PP, Knapp SJ. Unraveling the Complex Hybrid Ancestry and Domestication History of Cultivated Strawberry. Mol Biol Evol 2021; 38:2285-2305. [PMID: 33507311 PMCID: PMC8136507 DOI: 10.1093/molbev/msab024] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Indexed: 02/06/2023] Open
Abstract
Cultivated strawberry (Fragaria × ananassa) is one of our youngest domesticates, originating in early eighteenth-century Europe from spontaneous hybrids between wild allo-octoploid species (Fragaria chiloensis and Fragaria virginiana). The improvement of horticultural traits by 300 years of breeding has enabled the global expansion of strawberry production. Here, we describe the genomic history of strawberry domestication from the earliest hybrids to modern cultivars. We observed a significant increase in heterozygosity among interspecific hybrids and a decrease in heterozygosity among domesticated descendants of those hybrids. Selective sweeps were found across the genome in early and modern phases of domestication; 59–76% of the selectively swept genes originated in the three less dominant ancestral subgenomes. Contrary to the tenet that genetic diversity is limited in cultivated strawberry, we found that the octoploid species harbor massive allelic diversity and that F. × ananassa harbors as much allelic diversity as either wild founder. We identified 41.8 M subgenome-specific DNA variants among resequenced wild and domesticated individuals. Strikingly, 98% of common alleles and 73% of total alleles were shared between wild and domesticated populations. Moreover, genome-wide estimates of nucleotide diversity were virtually identical in F. chiloensis, F. virginiana, and F. × ananassa (π = 0.0059–0.0060). We found, however, that nucleotide diversity and heterozygosity were significantly lower in modern F. × ananassa populations that have experienced significant genetic gains and have produced numerous agriculturally important cultivars.
|
Research Support, U.S. Gov't, Non-P.H.S. |
4 |
43 |
15
|
Cis- and Trans-regulatory Effects on Gene Expression in a Natural Population of Drosophila melanogaster. Genetics 2017; 206:2139-2148. [PMID: 28615283 DOI: 10.1534/genetics.117.201459] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 02/27/2017] [Accepted: 06/06/2017] [Indexed: 12/30/2022] Open
Abstract
Cis- and trans-regulatory mutations are important contributors to transcriptome evolution. Quantifying their relative contributions to intraspecific variation in gene expression is essential for understanding the population genetic processes that underlie evolutionary changes in gene expression. Here, we have examined this issue by quantifying genome-wide, allele-specific expression (ASE) variation using a crossing scheme that produces F1 hybrids between 18 different Drosophila melanogaster strains sampled from the Drosophila Genetic Reference Panel and a reference strain from another population. Head and body samples from F1 adult females were subjected to RNA sequencing and the subsequent ASE quantification. Cis- and trans-regulatory effects on expression variation were estimated from these data. A higher proportion of genes showed significant cis-regulatory variation (∼28%) than those that showed significant trans-regulatory variation (∼9%). The sizes of cis-regulatory effects on expression variation were 1.98 and 1.88 times larger than trans-regulatory effects in heads and bodies, respectively. A generalized linear model analysis revealed that both cis- and trans-regulated expression variation was strongly associated with nonsynonymous nucleotide diversity and tissue specificity. Interestingly, trans-regulated variation showed a negative correlation with local recombination rate. Also, our analysis on proximal transposable element (TE) insertions suggested that they affect transcription levels of ovary-expressed genes more pronouncedly than genes not expressed in the ovary, possibly due to defense mechanisms against TE mobility in the germline. Collectively, our detailed quantification of ASE variations from a natural population has revealed a number of new relationships between genomic factors and the effects of cis- and trans-regulatory factors on expression variation.
|
Research Support, Non-U.S. Gov't |
8 |
42 |
16
|
Feng C, Pettersson M, Lamichhaney S, Rubin CJ, Rafati N, Casini M, Folkvord A, Andersson L. Moderate nucleotide diversity in the Atlantic herring is associated with a low mutation rate. eLife 2017; 6:e23907. [PMID: 28665273 PMCID: PMC5524536 DOI: 10.7554/elife.23907] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 12/04/2016] [Accepted: 06/28/2017] [Indexed: 12/23/2022] Open
Abstract
The Atlantic herring is one of the most abundant vertebrates on earth but its nucleotide diversity is moderate (π = 0.3%), only three-fold higher than in human. Here, we present a pedigree-based estimation of the mutation rate in this species. Based on whole-genome sequencing of four parents and 12 offspring, the estimated mutation rate is 2.0 × 10⁻⁹ per base per generation. We observed a high degree of parental mosaicism indicating that a large fraction of these de novo mutations occurred during early germ cell development. The estimated mutation rate, the lowest among vertebrates analyzed to date, partially explains the discrepancy between the rather low nucleotide diversity in herring and its huge census population size. But a species like the herring will never reach its expected nucleotide diversity because of fluctuations in population size over the millions of years it takes to build up high nucleotide diversity.
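As a rough worked check of the figures in this abstract, the sketch below assumes the standard neutral-equilibrium relation π ≈ 4·Ne·μ for a diploid population. That relation is an assumption of this example, not a formula stated in the abstract, and the function name `implied_effective_size` is hypothetical. Under that assumption, the reported π and μ imply a long-term effective population size of a few hundred thousand, far below what a "huge census population size" would suggest.

```python
def implied_effective_size(pi: float, mu: float) -> float:
    """Effective size Ne implied by pi = 4 * Ne * mu (diploid, neutral equilibrium).

    Assumption: the population is at mutation-drift equilibrium, which the
    abstract itself argues is unlikely to hold exactly for herring.
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (4.0 * mu)


# Values reported in the abstract: pi = 0.3%, mu = 2.0e-9 per base per generation.
ne_herring = implied_effective_size(pi=0.003, mu=2.0e-9)
print(f"Implied long-term Ne: {ne_herring:,.0f}")
```

The calculation is only an order-of-magnitude illustration; it ignores the population-size fluctuations that the abstract names as a reason diversity stays below its expected value.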
|
research-article |
8 |
39 |
17
|
Yi DK, Lee HL, Sun BY, Chung MY, Kim KJ. The complete chloroplast DNA sequence of Eleutherococcus senticosus (Araliaceae); comparative evolutionary analyses with other three asterids. Mol Cells 2012; 33:497-508. [PMID: 22555800 PMCID: PMC3887725 DOI: 10.1007/s10059-012-2281-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 12/14/2011] [Revised: 03/11/2012] [Accepted: 03/14/2012] [Indexed: 11/25/2022]
Abstract
This study reports the complete chloroplast (cp) DNA sequence of Eleutherococcus senticosus (GenBank: JN 637765), an endangered endemic species. The genome is 156,768 bp in length, and contains a pair of inverted repeat (IR) regions of 25,930 bp each, a large single copy (LSC) region of 86,755 bp and a small single copy (SSC) region of 18,153 bp. The structural organization, gene and intron contents, gene order, AT content, codon usage, and transcription units of the E. senticosus chloroplast genome are similar to that of typical land plant cp DNA. We aligned and analyzed the sequences of 86 coding genes, 19 introns and 113 intergenic spacers (IGS) in three different taxonomic hierarchies; Eleutherococcus vs. Panax, Eleutherococcus vs. Daucus, and Eleutherococcus vs. Nicotiana. The distribution of indels, the number of polymorphic sites and nucleotide diversity indicate that positional constraint is more important than functional constraint for the evolution of cp genome sequences in Asterids. For example, the intron sequences in the LSC region exhibited base substitution rates 5-11-times higher than that of the IR regions, while the intron sequences in the SSC region evolved 7-14-times faster than those in the IR region. Furthermore, the Ka/Ks ratio of the gene coding sequences supports a stronger evolutionary constraint in the IR region than in the LSC or SSC regions. Therefore, our data suggest that selective sweeps by base collection mechanisms more frequently eliminate polymorphisms in the IR region than in other regions. Chloroplast genome regions that have high levels of base substitutions also show higher incidences of indels. Thirty-five simple sequence repeat (SSR) loci were identified in the Eleutherococcus chloroplast genome. Of these, 27 are homopolymers, while six are di-polymers and two are tri-polymers. 
In addition to the SSR loci, we also identified 18 medium-sized repeat units ranging from 22 to 79 bp, 11 of which are distributed in the IGS or intron regions. These medium-sized repeats may contribute to developing a cp genome-specific gene introduction vector because these regions may be used as specific recombination sites.
|
research-article |
13 |
38 |
18
|
Genetic Diversity on the Human X Chromosome Does Not Support a Strict Pseudoautosomal Boundary. Genetics 2016; 203:485-92. [PMID: 27010023 PMCID: PMC4858793 DOI: 10.1534/genetics.114.172692] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 12/06/2015] [Accepted: 03/11/2016] [Indexed: 11/18/2022]
Abstract
Unlike the autosomes, recombination between the X chromosome and the Y chromosome is often thought to be constrained to two small pseudoautosomal regions (PARs) at the tips of each sex chromosome. PAR1 spans the first 2.7 Mb of the proximal arm of the human sex chromosomes, whereas the much smaller PAR2 encompasses the distal 320 kb of the long arm of each sex chromosome. In addition to PAR1 and PAR2, there is a human-specific X-transposed region that was duplicated from the X to the Y chromosome. The X-transposed region is often not excluded from X-specific analyses, unlike the PARs, because it is not thought to routinely recombine. Genetic diversity is expected to be higher in recombining regions than in nonrecombining regions because recombination reduces the effect of linked selection. In this study, we investigated patterns of genetic diversity in noncoding regions across the entire X chromosome of a global sample of 26 unrelated genetic females. We found that genetic diversity in PAR1 is significantly greater than in the nonrecombining regions (nonPARs). However, rather than an abrupt drop in diversity at the pseudoautosomal boundary, there is a gradual reduction in diversity from the recombining through the nonrecombining regions, suggesting that recombination between the human sex chromosomes spans across the currently defined pseudoautosomal boundary. A consequence of recombination spanning this boundary potentially includes increasing the rate of sex-linked disorders (e.g., de la Chapelle) and sex chromosome aneuploidies. In contrast, diversity in PAR2 is not significantly elevated compared to the nonPARs, suggesting that recombination is not obligatory in PAR2. Finally, diversity in the X-transposed region is higher than in the surrounding nonPARs, providing evidence that recombination may occur with some frequency between the X and Y chromosomes in the X-transposed region.
|
Research Support, Non-U.S. Gov't |
9 |
35 |
19
|
Dutoit L, Vijay N, Mugal CF, Bossu CM, Burri R, Wolf J, Ellegren H. Covariation in levels of nucleotide diversity in homologous regions of the avian genome long after completion of lineage sorting. Proc Biol Sci 2017; 284:20162756. [PMID: 28202815 PMCID: PMC5326536 DOI: 10.1098/rspb.2016.2756] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 12/12/2016] [Accepted: 01/18/2017] [Indexed: 12/30/2022]
Abstract
Closely related species may show similar levels of genetic diversity in homologous regions of the genome owing to shared ancestral variation still segregating in the extant species. However, after completion of lineage sorting, such covariation is not necessarily expected. On the other hand, if the processes that govern genetic diversity are conserved, diversity may potentially covary even among distantly related species. We mapped regions of conserved synteny between the genomes of two divergent bird species, collared flycatcher and hooded crow, and identified more than 600 Mb of homologous regions (66% of the genome). From analyses of whole-genome resequencing data in large population samples of both species, we found nucleotide diversity in 200 kb windows to be well correlated (Spearman's ρ = 0.407). The correlation remained highly similar after excluding coding sequences. To explain this covariation, we suggest that a stable avian karyotype and a conserved landscape of recombination rate variation render the diversity-reducing effects of linked selection similar in divergent bird lineages. Principal component regression analysis of several potential explanatory variables driving heterogeneity in flycatcher diversity levels revealed the strongest effects from recombination rate variation and density of coding sequence targets for selection, consistent with linked selection. It is also possible that a stable karyotype is associated with a conserved genomic mutation environment contributing to covariation in diversity levels between lineages. Our observations imply that genetic diversity is to some extent predictable.
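The windowed comparison described in this abstract rests on a rank correlation between per-window diversity values in the two species. The following is a minimal sketch of Spearman's ρ with average ranks for ties, written in plain Python. The function names and the five toy window values are invented for illustration; they are not the authors' pipeline or data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-window diversity values for five homologous windows.
flycatcher_pi = [0.0031, 0.0042, 0.0018, 0.0050, 0.0027]
crow_pi = [0.0012, 0.0019, 0.0015, 0.0021, 0.0010]
rho = spearman_rho(flycatcher_pi, crow_pi)
```

Because the statistic depends only on ranks, it is insensitive to the two species having different absolute diversity levels, which is what makes it a natural choice for a cross-species comparison of this kind.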
|
research-article |
8 |
31 |
20
|
Caragiulo A, Dias-Freedman I, Clark JA, Rabinowitz S, Amato G. Mitochondrial DNA sequence variation and phylogeography of Neotropic pumas (Puma concolor). Mitochondrial DNA 2013; 25:304-12. [PMID: 23789770 DOI: 10.3109/19401736.2013.800486] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 11/13/2022]
Abstract
Pumas occupy the largest latitudinal range of any New World terrestrial mammal. Human population growth and associated habitat reduction have reduced their North American range by nearly two-thirds, but the impact of human expansion in Central and South America on puma populations is not clear. We examined mitochondrial DNA diversity of pumas across the majority of their range, with a focus on Central and South America. Four mitochondrial gene regions (1140 base pairs) revealed 16 unique haplotypes differentiating pumas into three geographic groupings: North America, Central America and South America. These groups were highly differentiated as indicated by significant pairwise FST values. North American samples were genetically homogeneous compared to Central and South American samples, and South American pumas were the most diverse and ancestral. These findings support an earlier hypothesis that North America was recolonized by founding pumas from Central and South America.
|
Research Support, Non-U.S. Gov't |
12 |
30 |
21
|
Zhao W, Sun YQ, Pan J, Sullivan AR, Arnold ML, Mao JF, Wang XR. Effects of landscapes and range expansion on population structure and local adaptation. The New Phytologist 2020; 228:330-343. [PMID: 32323335 DOI: 10.1111/nph.16619] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 01/13/2020] [Accepted: 04/15/2020] [Indexed: 05/25/2023]
Abstract
Understanding the origin and distribution of genetic diversity across landscapes is critical for predicting the future of organisms in changing climates. This study investigated how adaptive and demographic forces have shaped diversity and population structure in Pinus densata, a keystone species on the Qinghai-Tibetan Plateau (QTP). We examined the distribution of genomic diversity across the range of P. densata using exome capture sequencing. We applied spatially explicit tests to dissect the impacts of allele surfing, geographic isolation and environmental gradients on population differentiation and forecasted how this genetic legacy may limit the persistence of P. densata in future climates. We found that allele surfing from range expansion could explain the distribution of 39% of the c. 48 000 genotyped single nucleotide polymorphisms (SNPs). Uncorrected, these allele frequency clines severely confounded inferences of selection. After controlling for demographic processes, isolation-by-environment explained 9.2-19.5% of the genetic structure, with c. 4.0% of loci being affected by selection. Allele surfing and genotype-environment associations resulted in genomic mismatch under projected climate scenarios. We illustrate that significant local adaptation, when coupled with reduced diversity as a result of demographic history, constrains potential evolutionary response to climate change. The strong signal of genomic vulnerability in P. densata may be representative of other QTP endemics.
|
|
5 |
29 |
22
|
Shi FX, Li MR, Li YL, Jiang P, Zhang C, Pan YZ, Liu B, Xiao HX, Li LF. The impacts of polyploidy, geographic and ecological isolations on the diversification of Panax (Araliaceae). BMC Plant Biology 2015; 15:297. [PMID: 26690782 PMCID: PMC4687065 DOI: 10.1186/s12870-015-0669-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 09/09/2015] [Accepted: 11/23/2015] [Indexed: 05/12/2023]
Abstract
BACKGROUND: Panax L. is a medicinally important genus within the family Araliaceae, in which almost all species are of cultural significance for traditional Chinese medicine. Previous studies suggested two independent origins of the East Asia and North America disjunct distribution of this genus and that multiple rounds of whole genome duplications (WGDs) might have occurred during its evolution. RESULTS: We employed multiple chloroplast and nuclear markers to investigate the evolution and diversification of Panax. Our phylogenetic analyses confirmed previous observations of the independent origins of the disjunct distribution and showed that both ancient and recent WGDs have occurred within Panax. The estimations of divergence time implied that the ancient WGD might have occurred before the establishment of Panax. Thereafter, at least two independent recent WGD events have occurred within Panax, one of which has led to the formation of three geographically isolated tetraploid species: P. ginseng, P. japonicus and P. quinquefolius. Population genetic analyses showed that the diploid species P. notoginseng harbored significantly lower nucleotide diversity than the two tetraploid species P. ginseng and P. quinquefolius, and the three species showed distinct nucleotide variation patterns at exon regions. CONCLUSION: Our findings based on the phylogenetic and population genetic analyses, coupled with the species distribution patterns of Panax, suggested that the two rounds of WGD, along with geographic and ecological isolation, might together have contributed to the evolution and diversification of this genus.
|
research-article |
10 |
28 |
23
|
Li Q, Yan W, Chen H, Tan C, Han Z, Yao W, Li G, Yuan M, Xing Y. Duplication of OsHAP family genes and their association with heading date in rice. Journal of Experimental Botany 2016; 67:1759-68. [PMID: 26798026 PMCID: PMC4783360 DOI: 10.1093/jxb/erv566] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 05/03/2023]
Abstract
Heterotrimeric Heme Activator Protein (HAP) family genes are involved in the regulation of flowering in plants. It is not clear how many HAP genes regulate heading date in rice. In this study, we identified 35 HAP genes, including seven newly identified genes, and performed gene duplication and candidate gene-based association analyses. Analyses showed that segmental duplication and tandem duplication are the main mechanisms of HAP gene duplication. Expression profiling and functional identification indicated that duplication probably diversifies the functions of HAP genes. A nucleotide diversity analysis revealed that 13 HAP genes underwent selection. A candidate gene-based association analysis detected four HAP genes related to heading date. An investigation of transgenic plants or mutants of 23 HAP genes confirmed that overexpression of at least four genes delayed heading date under long-day conditions, including the previously cloned Ghd8/OsHAP3H. Our results indicate that the large number of HAP genes in rice was mainly produced by gene duplication, and a few HAP genes function to regulate heading date. Selection of HAP genes is probably caused by their diverse functions rather than regulation of heading.
|
research-article |
9 |
25 |
24
|
Schou MF, Loeschcke V, Bechsgaard J, Schlötterer C, Kristensen TN. Unexpected high genetic diversity in small populations suggests maintenance by associative overdominance. Mol Ecol 2017; 26:6510-6523. [PMID: 28746770 DOI: 10.1111/mec.14262] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/04/2017] [Revised: 06/23/2017] [Accepted: 06/28/2017] [Indexed: 12/17/2022]
Abstract
The effective population size (Ne) is a central factor in determining maintenance of genetic variation. The neutral theory predicts that loss of variation depends on Ne, with less genetic drift in larger populations. We monitored genetic drift in 42 Drosophila melanogaster populations of different adult census population sizes (10, 50 or 500) using pooled RAD sequencing. In small populations, variation was lost at a substantially lower rate than expected. This observation was consistent across two ecologically relevant thermal regimes, one stable and one with a stressful increase in temperature across generations. Estimated ratios between Ne and adult census size were consistently higher in small than in larger populations. The finding provides evidence for a slower than expected loss of genetic diversity and consequently a higher than expected long-term evolutionary potential in small fragmented populations. More genetic diversity was retained in areas of low recombination, suggesting that associative overdominance, driven by disfavoured homozygosity of recessive deleterious alleles, is responsible for the maintenance of genetic diversity in smaller populations. Consistent with this hypothesis, the X-chromosome, which is largely free of recessive deleterious alleles due to hemizygosity in males, fits neutral expectations even in small populations. Our experiments provide experimental answers to a range of unexpected patterns in natural populations, ranging from variable diversity on X-chromosomes and autosomes to surprisingly high levels of nucleotide diversity in small populations.
|
Journal Article |
8 |
24 |
25
|
Luo A, Lan H, Ling C, Zhang A, Shi L, Ho SYW, Zhu C. A simulation study of sample size for DNA barcoding. Ecol Evol 2015; 5:5869-79. [PMID: 26811761 PMCID: PMC4717336 DOI: 10.1002/ece3.1846] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/17/2015] [Revised: 10/20/2015] [Accepted: 10/21/2015] [Indexed: 01/31/2023]
Abstract
For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of genetic polymorphism and for delimiting species. We used a simulation approach to examine the effects of sample size on four estimators of genetic polymorphism related to DNA barcoding: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Our results showed that mismatch distributions derived from subsamples of ≥20 individuals usually bore a close resemblance to that of the full dataset. Estimates of nucleotide diversity from subsamples of ≥20 individuals tended to be bell‐shaped around that of the full dataset, whereas estimates from smaller subsamples were not. As expected, greater sampling generally led to an increase in the number of haplotypes. We also found that subsamples of ≥20 individuals allowed a good estimate of the maximum pairwise distance of the full dataset, while smaller ones were associated with a high probability of underestimation. Overall, our study confirms the expectation that larger samples are beneficial for the efficacy of DNA barcoding and suggests that a minimum sample size of 20 individuals is needed in practice for each population.
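The subsampling idea in this abstract can be sketched in a few lines: compute nucleotide diversity on the full set of aligned sequences, then repeatedly recompute it on random subsamples of a fixed size and compare the spread of estimates. The code below is a minimal illustration under stated assumptions, not the authors' simulation. The function names, the four toy sequences, and the choice to measure diversity as the mean proportion of differing sites per pair are all choices made for this example.

```python
import random
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


def subsample_diversity(seqs, size, replicates, seed=0):
    """Diversity estimates from random subsamples drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same draws
    return [nucleotide_diversity(rng.sample(seqs, size)) for _ in range(replicates)]


# Toy alignment (invented): four sequences of eight sites each.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
full_pi = nucleotide_diversity(toy)
estimates = subsample_diversity(toy, size=3, replicates=5)
```

With real barcode data, one would raise `size` stepwise and look for the point at which the subsample estimates cluster around the full-dataset value, which is the behaviour the abstract reports for subsamples of 20 or more individuals.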
|
Journal Article |
10 |
23 |