1
|
Cornuet JM, Luikart G. Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics 1996; 144:2001-14. [PMID: 8978083 PMCID: PMC1207747 DOI: 10.1093/genetics/144.4.2001] [Citation(s) in RCA: 2475] [Impact Index Per Article: 85.3] [Indexed: 02/03/2023]
Abstract
When a population experiences a reduction of its effective size, it generally develops a heterozygosity excess at selectively neutral loci, i.e., the heterozygosity computed from a sample of genes is larger than the heterozygosity expected from the number of alleles found in the sample if the population were at mutation drift equilibrium. The heterozygosity excess persists only a certain number of generations until a new equilibrium is established. Two statistical tests for detecting a heterozygosity excess are described. They require measurements of the number of alleles and heterozygosity at each of several loci from a population sample. The first test determines if the proportion of loci with heterozygosity excess is significantly larger than expected at equilibrium. The second test establishes if the average of standardized differences between observed and expected heterozygosities is significantly different from zero. Type I and II errors have been evaluated by computer simulations, varying sample size, number of loci, bottleneck size, time elapsed since the beginning of the bottleneck and level of variability of loci. These analyses show that the most useful markers for bottleneck detection are those evolving under the infinite allele model (IAM) and they provide guidelines for selecting sample sizes of individuals and loci. The usefulness of these tests for conservation biology is discussed.
|
research-article |
29 |
2475 |
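The first test in the Cornuet and Luikart abstract above asks whether more loci show a heterozygosity excess than expected at equilibrium. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes, purely for illustration, that each locus has a fixed probability (`p_excess`, default 0.5) of showing an excess at mutation-drift equilibrium; the expected heterozygosities are treated as given inputs, since the abstract does not say how they are computed from the number of alleles. The function names and the example values are hypothetical.

```python
from math import comb


def gene_diversity(allele_counts):
    """Unbiased gene diversity (expected heterozygosity) at one locus.

    allele_counts: number of sampled gene copies carrying each allele.
    """
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two gene copies")
    homozygosity = sum((c / n) ** 2 for c in allele_counts)
    return n / (n - 1) * (1.0 - homozygosity)


def sign_test_excess(observed, expected, p_excess=0.5):
    """One-sided sign test on the number of loci with heterozygosity excess.

    p_excess is an ASSUMED per-locus probability of excess at equilibrium.
    Returns (number of loci with excess, binomial upper-tail probability).
    """
    n = len(observed)
    k = sum(1 for h_obs, h_exp in zip(observed, expected) if h_obs > h_exp)
    p_value = sum(
        comb(n, i) * p_excess**i * (1.0 - p_excess) ** (n - i)
        for i in range(k, n + 1)
    )
    return k, p_value


# Hypothetical per-locus heterozygosities for eight loci.
observed = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59, 0.70, 0.64]
expected = [0.50, 0.52, 0.60, 0.50, 0.58, 0.55, 0.61, 0.57]
k, p_value = sign_test_excess(observed, expected)
```

With these made-up inputs, seven of eight loci show an excess, and the tail probability is the chance of seeing seven or more excesses under the assumed per-locus probability.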
2
|
Cornuet JM, Piry S, Luikart G, Estoup A, Solignac M. New methods employing multilocus genotypes to select or exclude populations as origins of individuals. Genetics 1999; 153:1989-2000. [PMID: 10581301 PMCID: PMC1460843 DOI: 10.1093/genetics/153.4.1989] [Citation(s) in RCA: 879] [Impact Index Per Article: 33.8] [Indexed: 11/13/2022]
Abstract
A new method for assigning individuals of unknown origin to populations, based on the genetic distance between individuals and populations, was compared to two existing methods based on the likelihood of multilocus genotypes. The distribution of the assignment criterion (genetic distance or genotype likelihood) for individuals of a given population was used to define the probability that an individual belongs to the population. Using this definition, it becomes possible to exclude a population as the origin of an individual, a useful extension of the currently available assignment methods. Using simulated data based on the coalescent process, the different methods were evaluated, varying the time of divergence of populations, the mutation model, the sample size, and the number of loci. Likelihood-based methods (especially the Bayesian method) always performed better than distance methods. Other things being equal, genetic markers were always more efficient when evolving under the infinite allele model than under the stepwise mutation model, even for equal values of the differentiation parameter FST. Using the Bayesian method, a 100% correct assignment rate can be achieved by scoring ca. 10 microsatellite loci (H ≈ 0.6) on 30-50 individuals from each of 10 populations when the FST is near 0.1.
|
research-article |
26 |
879 |
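The likelihood-based assignment mentioned in the Cornuet et al. (1999) abstract above can be illustrated with a small Python sketch. This is a generic frequency-based version under stated assumptions (Hardy-Weinberg proportions, independent loci, and a small floor frequency for alleles not seen in a population), not the Bayesian or distance method evaluated in the paper, and it does not implement the exclusion probability. All names, allele labels, and frequencies are hypothetical.

```python
from math import log


def genotype_log_likelihood(genotype, freqs, floor=0.01):
    """Log-likelihood of a diploid multilocus genotype in one population.

    genotype: list of (allele_a, allele_b) pairs, one per locus.
    freqs: list of {allele: frequency} dicts, one per locus.
    floor: ASSUMED frequency for alleles absent from the population sample.
    """
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        pa = locus_freqs.get(a, floor)
        pb = locus_freqs.get(b, floor)
        # Homozygote: p^2; heterozygote: 2pq (Hardy-Weinberg assumption).
        total += log(pa * pa) if a == b else log(2.0 * pa * pb)
    return total


def assign(genotype, populations):
    """Return the best-scoring population and all log-likelihood scores."""
    scores = {
        name: genotype_log_likelihood(genotype, freqs)
        for name, freqs in populations.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical allele frequencies at two microsatellite loci.
populations = {
    "pop_A": [{"120": 0.7, "122": 0.3}, {"88": 0.6, "90": 0.4}],
    "pop_B": [{"120": 0.2, "122": 0.8}, {"88": 0.1, "90": 0.9}],
}
individual = [("120", "120"), ("88", "90")]
best, scores = assign(individual, populations)
```

The floor frequency is one simple way to avoid a zero likelihood when an individual carries an allele missing from a reference sample; other corrections are possible.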
3
|
Luikart G, Allendorf FW, Cornuet JM, Sherwin WB. Distortion of allele frequency distributions provides a test for recent population bottlenecks. J Hered 1998; 89:238-47. [PMID: 9656466 DOI: 10.1093/jhered/89.3.238] [Citation(s) in RCA: 864] [Impact Index Per Article: 32.0] [Indexed: 11/13/2022]
Abstract
We use population genetics theory and computer simulations to demonstrate that population bottlenecks cause a characteristic mode-shift distortion in the distribution of allele frequencies at selectively neutral loci. Bottlenecks cause alleles at low frequency (< 0.1) to become less abundant than alleles in one or more intermediate allele frequency classes (e.g., 0.1-0.2). This distortion is transient and likely to be detectable for only a few dozen generations. Consequently, only recent bottlenecks are likely to be detected by tests for distortions in distributions of allele frequencies. We illustrate and evaluate a qualitative graphical method for detecting a bottleneck-induced distortion of allele frequency distributions. The simple novel method requires no information on historical population sizes or levels of genetic variation; it requires only samples of 5 to 20 polymorphic loci and approximately 30 individuals. The graphical method often differentiates between empirical datasets from bottlenecked and nonbottlenecked natural populations. Computer simulations show that the graphical method is likely (P > 0.80) to detect an allele frequency distortion after a bottleneck of ≤ 20 breeding individuals when 8 to 10 polymorphic microsatellite loci are analyzed.
|
|
27 |
864 |
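The mode-shift idea in the Luikart et al. abstract above (low-frequency alleles becoming less abundant than alleles in an intermediate frequency class) can be sketched in a few lines of Python. The sketch assumes ten equal-width frequency classes and treats classes 0.1-0.9 as "intermediate"; both choices, the function names, and the example frequencies are illustrative assumptions rather than the published graphical method.

```python
def frequency_class_counts(allele_freqs, n_classes=10):
    """Count alleles falling in each equal-width frequency class."""
    counts = [0] * n_classes
    for f in allele_freqs:
        if not 0.0 < f <= 1.0:
            raise ValueError("allele frequencies must lie in (0, 1]")
        # A frequency of exactly 1.0 is placed in the last class.
        idx = min(int(f * n_classes), n_classes - 1)
        counts[idx] += 1
    return counts


def has_mode_shift(allele_freqs):
    """True if the lowest class holds fewer alleles than some intermediate class."""
    counts = frequency_class_counts(allele_freqs)
    return counts[0] < max(counts[1:-1])


# Hypothetical allele frequencies pooled over loci.
stable = [0.02, 0.03, 0.05, 0.08, 0.04, 0.15, 0.25, 0.38, 0.62, 0.85]
bottlenecked = [0.05, 0.12, 0.15, 0.18, 0.25, 0.33, 0.45, 0.62]

stable_shift = has_mode_shift(stable)
bottlenecked_shift = has_mode_shift(bottlenecked)
```

In the made-up `stable` sample, rare alleles dominate (an L-shaped distribution), so no shift is flagged; in the `bottlenecked` sample, the 0.1-0.2 class is the most abundant.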
4
|
Cornuet JM, Pudlo P, Veyssier J, Dehne-Garcia A, Gautier M, Leblois R, Marin JM, Estoup A. DIYABC v2.0: a software to make approximate Bayesian computation inferences about population history using single nucleotide polymorphism, DNA sequence and microsatellite data. Bioinformatics 2014; 30:1187-1189. [PMID: 24389659 DOI: 10.1093/bioinformatics/btt763] [Citation(s) in RCA: 640] [Impact Index Per Article: 58.2] [Received: 07/31/2013] [Accepted: 12/25/2013] [Indexed: 12/30/2022]
Abstract
MOTIVATION: DIYABC is a software package for a comprehensive analysis of population history using approximate Bayesian computation on DNA polymorphism data. Version 2.0 implements a number of new features and analytical methods. It allows (i) the analysis of single nucleotide polymorphism data at a large number of loci, apart from microsatellite and DNA sequence data, (ii) efficient Bayesian model choice using linear discriminant analysis on summary statistics and (iii) the serial launching of multiple post-processing analyses. DIYABC v2.0 also includes a user-friendly graphical interface with various new options. It can be run on three operating systems: GNU/Linux, Microsoft Windows and Apple OS X. AVAILABILITY: Freely available with a detailed notice document and example projects to academic users at http://www1.montpellier.inra.fr/CBGP/diyabc. CONTACT: estoup@supagro.inra.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
Journal Article |
11 |
640 |
5
|
Estoup A, Jarne P, Cornuet JM. Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Mol Ecol 2002; 11:1591-604. [PMID: 12207711 DOI: 10.1046/j.1365-294x.2002.01576.x] [Citation(s) in RCA: 533] [Impact Index Per Article: 23.2] [Indexed: 11/20/2022]
Abstract
Homoplasy has recently attracted the attention of population geneticists, as a consequence of the popularity of highly variable stepwise mutating markers such as microsatellites. Microsatellite alleles generally refer to DNA fragments of different size (electromorphs). Electromorphs are identical in state (i.e. have identical size), but are not necessarily identical by descent due to convergent mutation(s). Homoplasy occurring at microsatellites is thus referred to as size homoplasy. Using new analytical developments and computer simulations, we first evaluate the effect of the mutation rate, the mutation model, the effective population size and the time of divergence between populations on size homoplasy at the within and between population levels. We then review the few experimental studies that used various molecular techniques to detect size homoplasious events at some microsatellite loci. The relationship between this molecularly accessible size homoplasy size and the actual amount of size homoplasy is not trivial, the former being considerably influenced by the molecular structure of microsatellite core sequences. In a third section, we show that homoplasy at microsatellite electromorphs does not represent a significant problem for many types of population genetics analyses realized by molecular ecologists, the large amount of variability at microsatellite loci often compensating for their homoplasious evolution. The situations where size homoplasy may be more problematic involve high mutation rates and large population sizes together with strong allele size constraints.
|
Review |
23 |
533 |
6
|
Cornuet JM, Santos F, Beaumont MA, Robert CP, Marin JM, Balding DJ, Guillemaud T, Estoup A. Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 2008; 24:2713-9. [PMID: 18842597 PMCID: PMC2639274 DOI: 10.1093/bioinformatics/btn514] [Citation(s) in RCA: 463] [Impact Index Per Article: 27.2] [Indexed: 11/13/2022]
Abstract
Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios and compute bias and precision measures for a given scenario and known values of parameters (the current version applies to unlinked microsatellite data). This article describes key methods used in the program and provides its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC. Availability: The software DIY ABC is freely available at http://www.montpellier.inra.fr/CBGP/diyabc. Contact: j.cornuet@imperial.ac.uk. Supplementary information: Supplementary data are also available at http://www.montpellier.inra.fr/CBGP/diyabc.
|
Research Support, Non-U.S. Gov't |
17 |
463 |
7
|
Legras JL, Merdinoglu D, Cornuet JM, Karst F. Bread, beer and wine: Saccharomyces cerevisiae diversity reflects human history. Mol Ecol 2007; 16:2091-102. [PMID: 17498234 DOI: 10.1111/j.1365-294x.2007.03266.x] [Citation(s) in RCA: 349] [Impact Index Per Article: 20.5] [Indexed: 12/15/2022]
Abstract
Fermented beverages and foods have played a significant role in most societies worldwide for millennia. To better understand how the yeast species Saccharomyces cerevisiae, the main fermenting agent, evolved along this historical and expansion process, we analysed the genetic diversity among 651 strains from 56 different geographical origins, worldwide. Their genotyping at 12 microsatellite loci revealed 575 distinct genotypes organized in subgroups of yeast types, i.e. bread, beer, wine, sake. Some of these groups presented unexpected relatedness: Bread strains displayed a combination of alleles intermediate between beer and wine strains, and strains used for rice wine and sake were most closely related to beer and bread strains. However, up to 28% of genetic diversity between these technological groups was associated with geographical differences which suggests local domestications. Focusing on wine yeasts, a group of Lebanese strains were basal in an F(ST) tree, suggesting a Mesopotamia-based origin of most wine strains. In Europe, migration of wine strains occurred through the Danube Valley, and around the Mediterranean Sea. An approximate Bayesian computation approach suggested a postglacial divergence (most probable period 10,000-12,000 bp). As our results suggest intimate association between man and wine yeast across centuries, we hypothesize that yeast followed man and vine migrations as a commensal member of grapevine flora.
|
Journal Article |
17 |
349 |
8
|
Estoup A, Rousset F, Michalakis Y, Cornuet JM, Adriamanga M, Guyomard R. Comparative analysis of microsatellite and allozyme markers: a case study investigating microgeographic differentiation in brown trout (Salmo trutta). Mol Ecol 1998; 7:339-53. [PMID: 9561790 DOI: 10.1046/j.1365-294x.1998.00362.x] [Citation(s) in RCA: 332] [Impact Index Per Article: 12.3] [Indexed: 02/07/2023]
Abstract
A comparative study between microsatellite and allozyme markers was conducted on natural populations of resident brown trout (Salmo trutta) sampled over a reduced geographical scale and on hatchery strains. The higher level of polymorphism observed at microsatellite loci resulted in higher power of statistical tests for differentiation among population samples and for genotypic linkage disequilibrium. Genetic distances of Cavalli-Sforza and Edwards were on average two times larger for microsatellites than for allozymes but multilocus FST estimates computed over the entire set of populations were not significantly different for both categories of markers. Assignment tests of individual fish to the set of sampled populations demonstrated a much higher efficiency of microsatellites compared to allozymes. Pairwise multilocus FST estimates were significantly correlated to waterway distances and there was a significant tendency for the incorrectly classified individuals to be assigned to one of the nearest populations, indicating that isolation-by-distance acted significantly on brown trout populations. This increase of differentiation with distance was higher for allozymes than for microsatellites. Traditional measures of genetic differentiation (Cavalli-Sforza and Edwards' chord distance and FST) were compared for microsatellites to recently proposed statistics taking into account allele size differences (Goldstein's distance and PST). Using Goldstein's distance for neighbour-joining analysis did not improve the tree structure resolution. Multilocus estimates of PST and FST were not significantly different when computed over the entire set of populations but no significant correlation was detected between matrices of pairwise multilocus PST estimates and waterway distances.
|
Comparative Study |
27 |
332 |
9
|
Cornuet JM, Ravigné V, Estoup A. Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0). BMC Bioinformatics 2010; 11:401. [PMID: 20667077 PMCID: PMC2919520 DOI: 10.1186/1471-2105-11-401] [Citation(s) in RCA: 331] [Impact Index Per Article: 22.1] [Received: 04/02/2010] [Accepted: 07/28/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Approximate Bayesian computation (ABC) is a recent flexible class of Monte-Carlo algorithms increasingly used to make model-based inference on complex evolutionary scenarios that have acted on natural populations. The software DIYABC offers a user-friendly interface allowing non-expert users to consider population histories involving any combination of population divergences, admixtures and population size changes. We here describe and illustrate new developments of this software that mainly include (i) inference from DNA sequence data in addition to, or separately from, microsatellite data, (ii) the possibility to analyze five categories of loci considering balanced or non-balanced sex ratios: autosomal diploid, autosomal haploid, X-linked, Y-linked and mitochondrial, and (iii) the possibility to perform model checking computation to assess the "goodness-of-fit" of a model, a feature of ABC analysis that has been so far neglected. RESULTS: We used controlled simulated data sets generated under evolutionary scenarios involving various divergence and admixture events to evaluate the effect of mixing autosomal microsatellite, mtDNA and/or nuclear autosomal DNA sequence data on inferences. This evaluation included the comparison of competing scenarios and the quantification of their relative support, and the estimation of parameter posterior distributions under a given scenario. We also considered a set of scenarios often compared when making ABC inferences on the routes of introduction of invasive species to illustrate the interest of the new model checking option of DIYABC to assess model misfit. CONCLUSIONS: Our new developments of the integrated software DIYABC should be particularly useful to make inference on complex evolutionary scenarios involving both recent and ancient historical events and using various types of molecular markers in diploid or haploid organisms. They offer a handy way for non-expert users to achieve model checking computation within an ABC framework, hence filling a gap in ABC analysis. The software DIYABC V1.0 is freely available at http://www1.montpellier.inra.fr/CBGP/diyabc.
|
Research Support, Non-U.S. Gov't |
15 |
331 |
10
|
Estoup A, Garnery L, Solignac M, Cornuet JM. Microsatellite variation in honey bee (Apis mellifera L.) populations: hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 1995; 140:679-95. [PMID: 7498746 PMCID: PMC1206644 DOI: 10.1093/genetics/140.2.679] [Citation(s) in RCA: 264] [Impact Index Per Article: 8.8] [Indexed: 01/25/2023]
Abstract
Samples from nine populations belonging to three African (intermissa, scutellata and capensis) and four European (mellifera, ligustica, carnica and cecropia) Apis mellifera subspecies were scored for seven microsatellite loci. A large amount of genetic variation (between seven and 30 alleles per locus) was detected. Average heterozygosity and average number of alleles were significantly higher in African than in European subspecies, in agreement with larger effective population sizes in Africa. Microsatellite analyses confirmed that A. mellifera evolved in three distinct and deeply differentiated lineages previously detected by morphological and mitochondrial DNA studies. Dendrogram analysis of workers from a given population indicated that super-sisters cluster together when using a sufficient number of microsatellite data whereas half-sisters do not. An index of classification was derived to summarize the clustering of different taxonomic levels in large phylogenetic trees based on individual genotypes. Finally, individual population x loci data were used to test the adequacy of the two alternative mutation models, the infinite allele model (IAM) and the stepwise mutation models. The better fit overall of the IAM probably results from the majority of the microsatellites used including repeats of two or three different length motifs (compound microsatellites).
|
research-article |
30 |
264 |
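The Estoup et al. (1995) abstract above contrasts the infinite allele model (IAM) and the stepwise mutation model. As a worked illustration of how the two models differ, the Python sketch below evaluates the classical equilibrium expectations for heterozygosity in a diploid population under each model, with theta = 4 x effective size x mutation rate. The abstract does not say which statistics were used to test the models, so this is background arithmetic under that assumption, and the function names are hypothetical.

```python
from math import sqrt


def h_iam(theta):
    """Equilibrium expected heterozygosity under the infinite allele model."""
    return theta / (1.0 + theta)


def h_smm(theta):
    """Equilibrium expected heterozygosity under the stepwise mutation model."""
    return 1.0 - 1.0 / sqrt(1.0 + 2.0 * theta)


# Compare the two expectations over a few hypothetical theta values.
thetas = [0.1, 0.5, 1.0, 5.0, 10.0]
table = [(t, h_iam(t), h_smm(t)) for t in thetas]
```

For any positive theta, the IAM expectation exceeds the stepwise one, because stepwise mutations can recreate existing allele sizes (homoplasy) and so generate fewer distinguishable alleles.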
11
|
Lombaert E, Guillemaud T, Cornuet JM, Malausa T, Facon B, Estoup A. Bridgehead effect in the worldwide invasion of the biocontrol harlequin ladybird. PLoS One 2010; 5:e9743. [PMID: 20305822 PMCID: PMC2840033 DOI: 10.1371/journal.pone.0009743] [Citation(s) in RCA: 257] [Impact Index Per Article: 17.1] [Received: 02/12/2010] [Accepted: 02/23/2010] [Indexed: 11/30/2022]
Abstract
Recent studies of the routes of worldwide introductions of alien organisms suggest that many widespread invasions could have stemmed not from the native range, but from a particularly successful invasive population, which serves as the source of colonists for remote new territories. We call here this phenomenon the invasive bridgehead effect. Evaluating the likelihood of such a scenario is heuristically challenging. We solved this problem by using approximate Bayesian computation methods to quantitatively compare complex invasion scenarios based on the analysis of population genetics (microsatellite variation) and historical (first observation dates) data. We applied this approach to the Harlequin ladybird Harmonia axyridis (HA), a coccinellid native to Asia that was repeatedly introduced as a biocontrol agent without becoming established for decades. We show that the recent burst of worldwide invasions of HA followed a bridgehead scenario, in which an invasive population in eastern North America acted as the source of the colonists that invaded the European, South American and African continents, with some admixture with a biocontrol strain in Europe. This demonstration of a mechanism of invasion via a bridgehead has important implications both for invasion theory (i.e., a single evolutionary shift in the bridgehead population versus multiple changes in case of introduced populations becoming invasive independently) and for ongoing efforts to manage invasions by alien organisms (i.e., heightened vigilance against invasive bridgeheads).
|
Research Support, Non-U.S. Gov't |
15 |
257 |
12
|
Estoup A, Solignac M, Cornuet JM, Goudet J, Scholl A. Genetic differentiation of continental and island populations of Bombus terrestris (Hymenoptera: Apidae) in Europe. Mol Ecol 1996; 5:19-31. [PMID: 9147693 DOI: 10.1111/j.1365-294x.1996.tb00288.x] [Citation(s) in RCA: 222] [Impact Index Per Article: 7.7] [Indexed: 02/04/2023]
Abstract
Ten microsatellite loci and a partial sequence of the COII mitochondrial gene were used to investigate genetic differentiation in B. terrestris, a bumble bee of interest for its high-value crop pollination. The analysis included eight populations from the European continent, five from Mediterranean islands (six subspecies altogether) and one from Tenerife (initially described as a colour form of B. terrestris but recently considered as a separate species, B. canariensis). Eight of the 10 microsatellite loci displayed high levels of polymorphism in most populations. In B. terrestris populations, the total number of alleles detected per polymorphic locus ranged from 3 to 16, with observed allelic diversity from 3.8 ± 0.5 to 6.5 ± 1.4 and average calculated heterozygosities from 0.41 ± 0.09 to 0.65 ± 0.07. B. canariensis showed a significantly lower average calculated heterozygosity (0.12 ± 0.08) and observed allelic diversity (1.5 ± 0.04) as compared to both continental and island populations of B. terrestris. No significant differentiation was found among populations of B. terrestris from the European continent. In contrast, island populations were all significantly and most of them strongly differentiated from continental populations. B. terrestris mitochondrial DNA is characterized by a low nucleotide diversity: 0.18% ± 0.07%, 0.20% ± 0.04% and 0.27% ± 0.04% for the continental populations, the island populations and all populations together, respectively. The only haplotype found in the Tenerife population differs by a single nucleotide substitution from the most common continental haplotype of B. terrestris. This situation, identical to that of Tyrrhenian islands populations and quite different from that of B. lucorum (15 substitutions between terrestris and lucorum mtDNA) casts doubts on the species status of B. canariensis. The large genetic distance between the Tenerife and B. terrestris populations estimated from microsatellite data results, most probably, from a severe bottleneck in the Canary island population. Microsatellite and mitochondrial DNA data call for the protection of the island populations of B. terrestris against importation of bumble bees of foreign origin which are used as crop pollinators.
|
|
29 |
222 |
13
|
Gautier M, Gharbi K, Cezard T, Foucaud J, Kerdelhué C, Pudlo P, Cornuet JM, Estoup A. The effect of RAD allele dropout on the estimation of genetic variation within and between populations. Mol Ecol 2012; 22:3165-78. [DOI: 10.1111/mec.12089] [Citation(s) in RCA: 219] [Impact Index Per Article: 16.8] [Received: 07/12/2012] [Revised: 09/04/2012] [Accepted: 09/12/2012] [Indexed: 12/17/2022]
|
|
13 |
219 |
14
|
Franck P, Garnery L, Loiseau A, Oldroyd BP, Hepburn HR, Solignac M, Cornuet JM. Genetic diversity of the honeybee in Africa: microsatellite and mitochondrial data. Heredity (Edinb) 2001; 86:420-30. [PMID: 11520342 DOI: 10.1046/j.1365-2540.2001.00842.x] [Citation(s) in RCA: 209] [Impact Index Per Article: 8.7] [Indexed: 11/20/2022]
Abstract
A total of 738 colonies from 64 localities along the African continent have been analysed using the DraI RFLP of the COI-COII mitochondrial region. Mitochondrial DNA of African honeybees appears to be composed of three highly divergent lineages. The African lineage previously reported (named A) is present in almost all the localities except those from north-eastern Africa. In this area, two newly described lineages (called O and Y), putatively originating from the Near East, are observed in high proportion. This suggests an important differentiation of Ethiopian and Egyptian honeybees from those of other African areas. The A lineage is also present in high proportion in populations from the Iberian Peninsula and Sicily. Furthermore, eight populations from Morocco, Guinea, Malawi and South Africa have been assayed with six microsatellite loci and compared to a set of eight additional populations from Europe and the Middle East. The African populations display higher genetic variability than European populations at all microsatellite loci studied thus far. This suggests that African populations have larger effective sizes than European ones. According to their microsatellite allele frequencies, the eight African populations cluster together, but are divided in two subgroups. These are the populations from Morocco and those from the other African countries. The populations from southern Europe show very low levels of 'Africanization' at nuclear microsatellite loci. Because nuclear and mitochondrial DNA often display discordant patterns of differentiation in the honeybee, the use of both kinds of markers is preferable when assessing the phylogeography of Apis mellifera and to determine the taxonomic status of the subspecies.
|
|
24 |
209 |
15
|
Estoup A, Solignac M, Harry M, Cornuet JM. Characterization of (GT)n and (CT)n microsatellites in two insect species: Apis mellifera and Bombus terrestris. Nucleic Acids Res 1993; 21:1427-31. [PMID: 8464734 PMCID: PMC309328 DOI: 10.1093/nar/21.6.1427] [Citation(s) in RCA: 200] [Impact Index Per Article: 6.3] [Indexed: 01/30/2023]
Abstract
A set of 52 (CT)n and 23 (GT)n microsatellites in honeybee, 24 (CT)n and 2 (GT)n microsatellites in bumble-bee (n > 6) have been isolated from partial genomic libraries and sequenced. On average, (CT)n and (GT)n microsatellites occur every 15 kb and 34 kb in honeybee and every 40 kb and 500 kb in bumble-bee, respectively. The prevailing categories are imperfect repeats for (CT)n microsatellites in bumble-bee, and perfect repeats for both (CT)n and (GT)n microsatellites in honeybee. Comparisons with data available in vertebrates indicate a lower proportion of perfect repeats in bees but length distributions are very similar regardless of the phylum. This result extends to insects the concept of an evolutionary conservation for quantitative and qualitative characteristics of (CT)n and (GT)n microsatellites. Many (CT)n and (GT)n repeats are surrounded with various types of microsatellites, revealing an associative distribution of short repeat sequences. As expected, a high level of intrapopulational polymorphism has been found with one tested honeybee microsatellite. Also, flanking regions of this microsatellite are similar enough to allow PCR amplification in several other species of Apis and Bombus.
|
research-article |
32 |
200 |
16
|
Pudlo P, Marin JM, Estoup A, Cornuet JM, Gautier M, Robert CP. Reliable ABC model choice via random forests. Bioinformatics 2015; 32:859-66. [PMID: 26589278 DOI: 10.1093/bioinformatics/btv684] [Citation(s) in RCA: 196] [Impact Index Per Article: 19.6] [Received: 05/29/2015] [Accepted: 09/30/2015] [Indexed: 01/25/2023]
Abstract
MOTIVATION: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. RESULTS: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. AVAILABILITY AND IMPLEMENTATION: The proposed methodology is implemented in the R package abcrf available on the CRAN. CONTACT: jean-michel.marin@umontpellier.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
Journal Article |
10 |
196 |
17
|
Excoffier L, Estoup A, Cornuet JM. Bayesian analysis of an admixture model with mutations and arbitrarily linked markers. Genetics 2005; 169:1727-38. [PMID: 15654099 PMCID: PMC1449551 DOI: 10.1534/genetics.104.036236] [Citation(s) in RCA: 183] [Impact Index Per Article: 9.2] [Indexed: 11/18/2022] Open
Abstract
We introduce here a Bayesian analysis of a classical admixture model in which all parameters are simultaneously estimated. Our approach follows the approximate Bayesian computation (ABC) framework, relying on massive simulations and a rejection-regression algorithm. Although computationally intensive, this approach can easily deal with complex mutation models and partially linked loci, and it can be thoroughly validated without much additional computation cost. Compared to a recent maximum-likelihood (ML) method, the ABC approach leads to similarly accurate estimates of admixture proportions in the case of recent admixture events, but it is found superior when the admixture is more ancient. All other parameters of the admixture model such as the divergence time between parental populations, the admixture time, and the population sizes are also well estimated, unlike the ML method. The use of partially linked markers does not introduce any particular bias in the estimation of admixture, but ML confidence intervals are found too narrow if linkage is not specifically accounted for. The application of our method to an artificially admixed domestic bee population from northwest Italy suggests that the admixture occurred in the last 10-40 generations and that the parental Apis mellifera and A. ligustica populations were completely separated since the last glacial maximum.
|
Research Support, Non-U.S. Gov't |
20 |
183 |
18
|
Berthier P, Beaumont MA, Cornuet JM, Luikart G. Likelihood-based estimation of the effective population size using temporal changes in allele frequencies: a genealogical approach. Genetics 2002; 160:741-51. [PMID: 11861575 PMCID: PMC1461962 DOI: 10.1093/genetics/160.2.741] [Citation(s) in RCA: 158] [Impact Index Per Article: 6.9] [Indexed: 11/14/2022] Open
Abstract
A new genetic estimator of the effective population size (N(e)) is introduced. This likelihood-based (LB) estimator uses two temporally spaced genetic samples of individuals from a population. We compared its performance to that of the classical F-statistic-based N(e) estimator (N(eFk)) by using data from simulated populations with known N(e) and real populations. The new likelihood-based estimator (N(eLB)) showed narrower credible intervals and greater accuracy than N(eFk) when genetic drift was strong, but performed only slightly better when genetic drift was relatively weak. When drift was strong (e.g., N(e) = 20 for five generations), as few as approximately 10 loci (heterozygosity of 0.6; samples of 30 individuals) are sufficient to consistently achieve credible intervals with an upper limit <50 using the LB method. In contrast, approximately 20 loci are required for the same precision when using the classical F-statistic approach. The N(eLB) estimator is much improved over the classical method when there are many rare alleles. It will be especially useful in conservation biology because it less often overestimates N(e) than does N(eFk) and thus is less likely to erroneously suggest that a population is large and has a low extinction risk.
|
research-article |
23 |
158 |
19
|
Solignac M, Vautrin D, Loiseau A, Mougel F, Baudry E, Estoup A, Garnery L, Haberl M, Cornuet JM. Five hundred and fifty microsatellite markers for the study of the honeybee (Apis mellifera L.) genome. Mol Ecol Notes 2003. [DOI: 10.1046/j.1471-8286.2003.00436.x] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.5] [Indexed: 11/20/2022]
|
|
22 |
143 |
20
|
Estoup A, Beaumont M, Sennedot F, Moritz C, Cornuet JM. Genetic analysis of complex demographic scenarios: spatially expanding populations of the cane toad, Bufo marinus. Evolution 2004; 58:2021-36. [PMID: 15521459 DOI: 10.1111/j.0014-3820.2004.tb00487.x] [Citation(s) in RCA: 133] [Impact Index Per Article: 6.3] [Indexed: 11/29/2022]
Abstract
Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm, to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than the other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker, demographic, and historical features. The method presented here is the only robust, model-based method available so far that allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
|
Research Support, Non-U.S. Gov't |
21 |
133 |
21
|
Malausa T, Bethenod MT, Bontemps A, Bourguet D, Cornuet JM, Ponsard S. Assortative mating in sympatric host races of the European corn borer. Science 2005; 308:258-60. [PMID: 15821092 DOI: 10.1126/science.1107577] [Citation(s) in RCA: 133] [Impact Index Per Article: 6.7] [Indexed: 11/02/2022]
Abstract
Although a growing body of work supports the plausibility of sympatric speciation in animals, the practical difficulties of directly quantifying reproductive isolation between diverging taxa remain an obstacle to analyzing this process. We used a combination of genetic and biogeochemical markers to produce a direct field estimate of assortative mating in phytophagous insect populations. We show that individuals of the same insect species, the European corn borer Ostrinia nubilalis, that develop on different host plants can display almost absolute reproductive isolation (the proportion of assortative mating was >95%), even in the absence of temporal or spatial isolation.
|
Research Support, Non-U.S. Gov't |
20 |
133 |
22
|
Beaumont MA, Nielsen R, Robert C, Hey J, Gaggiotti O, Knowles L, Estoup A, Panchal M, Corander J, Hickerson M, Sisson SA, Fagundes N, Chikhi L, Beerli P, Vitalis R, Cornuet JM, Huelsenbeck J, Foll M, Yang Z, Rousset F, Balding D, Excoffier L. In defence of model-based inference in phylogeography. Mol Ecol 2010; 19:436-446. [PMID: 29284924 DOI: 10.1111/j.1365-294x.2009.04515.x] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.2] [Indexed: 11/28/2022]
Abstract
Recent papers have promoted the view that model-based methods in general, and those based on Approximate Bayesian Computation (ABC) in particular, are flawed in a number of ways, and are therefore inappropriate for the analysis of phylogeographic data. These papers further argue that Nested Clade Phylogeographic Analysis (NCPA) offers the best approach in statistical phylogeography. In order to remove the confusion and misconceptions introduced by these papers, we justify and explain the reasoning behind model-based inference. We argue that ABC is a statistically valid approach, alongside other computational statistical techniques that have been successfully used to infer parameters and compare models in population genetics. We also examine the NCPA method and highlight numerous deficiencies, when used with either single or multiple loci. We further show that the ages of clades are carelessly used to infer ages of demographic events, that these ages are estimated under a simple model of panmixia and population stationarity but are then used under different and unspecified models to test hypotheses, a usage that invalidates these testing procedures. We conclude by encouraging researchers to study and use model-based inference in population genetics.
|
Journal Article |
15 |
123 |
23
|
Garnery L, Cornuet JM, Solignac M. Evolutionary history of the honey bee Apis mellifera inferred from mitochondrial DNA analysis. Mol Ecol 1992; 1:145-54. [PMID: 1364272 DOI: 10.1111/j.1365-294x.1992.tb00170.x] [Citation(s) in RCA: 123] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Variability of mitochondrial DNA (mtDNA) of the honey bee Apis mellifera L. has been investigated by restriction and sequence analyses on a sample of 68 colonies from ten different subspecies. The 19 mtDNA types detected are clustered in three major phylogenetic lineages. These clades correspond well to three groups of populations with distinct geographical distributions: branch A for African subspecies (intermissa, monticola, scutellata, adansonii and capensis), branch C for North Mediterranean subspecies (caucasica, carnica and ligustica) and branch M for the West European populations (mellifera subspecies). These results partially confirm previous hypotheses based on morphometrical and allozymic studies, the main difference concerning North African populations, now assigned to branch A instead of branch M. The pattern of spatial structuring suggests the Middle East as the centre of dispersion of the species, in accordance with the geographic areas of the other species of the same genus. Based on a conservative 2% divergence rate per Myr, the separation of the three branches has been dated at about 1 Myr BP.
|
Journal Article |
31 |
123 |
24
|
Franck P, Garnery L, Celebrano G, Solignac M, Cornuet JM. Hybrid origins of honeybees from Italy (Apis mellifera ligustica) and Sicily (A. m. sicula). Mol Ecol 2000; 9:907-21. [PMID: 10886654 DOI: 10.1046/j.1365-294x.2000.00945.x] [Citation(s) in RCA: 119] [Impact Index Per Article: 4.8] [Indexed: 11/20/2022]
Abstract
The genetic variability of honeybee populations Apis mellifera ligustica, in continental Italy, and of A. m. sicula, in Sicily, was investigated using nuclear (microsatellite) and mitochondrial markers. Six populations (236 individual bees) and 17 populations (664 colonies) were, respectively, analysed using eight microsatellite loci and DraI restriction fragment length polymorphism (RFLP) of the cytochrome oxidase I (COI)-cytochrome oxidase II (COII) region. Microsatellite loci globally confirmed the southeastern European heritage of both subspecies (evolutionary branch C). However, A. m. ligustica mitochondrial DNA (mtDNA) appeared to be a composite of the two European (M and C) lineages over most of the Italian peninsula, and only mitotypes from the African (A) lineage were found in A. m. sicula samples. This demonstrates a hybrid origin for both subspecies. For A. m. ligustica, the most widely exported subspecies, this hybrid origin has long been obscured by the fact that in the main area of queen production (from which most of the previous ligustica bee samples originated) the M mitochondrial lineage is absent, whereas it is present almost everywhere else in Italy. This presents a new view of the evolutionary history of European honeybees. For instance, the Iberian peninsula was considered as the unique refuge for the M branch during the Quaternary ice periods. Our results show that the Apennine peninsula played a similar role. The differential distribution of nuclear and mitochondrial markers observed in Italy seems to be a general feature of introgressed honeybee populations. Presumably, it stems from the social nature of the species, in which the two genome compartments are differentially affected by the two (individual and colonial) reproduction levels.
|
|
25 |
119 |
25
|
Estoup A, Wilson IJ, Sullivan C, Cornuet JM, Moritz C. Inferring population history from microsatellite and enzyme data in serially introduced cane toads, Bufo marinus. Genetics 2001; 159:1671-87. [PMID: 11779806 PMCID: PMC1461904 DOI: 10.1093/genetics/159.4.1671] [Citation(s) in RCA: 119] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022] Open
Abstract
Much progress has been made on inferring population history from molecular data. However, complex demographic scenarios have been considered rarely or have proved intractable. The serial introduction of the South-Central American cane toad Bufo marinus in various Caribbean and Pacific islands involves four major phases: a possible genetic admixture during the first introduction, a bottleneck associated with founding, a transitory population boom, and finally, a demographic stabilization. A large amount of historical and demographic information is available for those introductions and can be combined profitably with molecular data. We used a Bayesian approach to combine this information with microsatellite (10 loci) and enzyme (22 loci) data and used a rejection algorithm to simultaneously estimate the demographic parameters describing the four major phases of the introduction history. The general historical trends supported by microsatellites and enzymes were similar. However, there was a stronger support for a larger bottleneck at introductions for microsatellites than enzymes and for a more balanced genetic admixture for enzymes than for microsatellites. Very little information was obtained from either marker about the transitory population boom observed after each introduction. Possible explanations for differences in resolution of demographic events and discrepancies between results obtained with microsatellites and enzymes were explored. Limits of our model and method for the analysis of nonequilibrium populations were discussed.
|
research-article |
24 |
119 |