1
|
French CM, Damasceno RP, Vasconcellos MM, Rodrigues MT, Carnaval AC, Hickerson MJ. Elevational Range Impacts Connectivity and Predicted Deme Sizes From Models of Habitat Suitability. Mol Ecol 2025; 34:e17593. [PMID: 39569697 DOI: 10.1111/mec.17593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 11/04/2024] [Accepted: 11/06/2024] [Indexed: 11/22/2024]
Abstract
In integrative distributional, demographic and coalescent (iDDC) modelling, a critical component is the statistical relationship between habitat suitability and local population sizes. This study explores this relationship in two Enyalius lizard species from the Brazilian Atlantic Forest, the high-elevation E. iheringii and the low-elevation E. catenatus, and examines how the transformation from suitability to population size affects spatiotemporal demographic inference. Most previous iDDC studies assumed a linear relationship, but this study hypothesises that the relationship may be nonlinear, especially for high-elevation species with broader environmental tolerances. We test two key hypotheses: (1) The habitat suitability to population size relationship is nonlinear for E. iheringii (high-elevation) and linear for E. catenatus (low-elevation); and (2) E. iheringii exhibits higher effective migration across populations than E. catenatus. Our findings provide clear support for hypothesis (2), but mixed support for hypothesis (1), with strong model support for a nonlinear transformation in the high-elevation E. iheringii and some (albeit weak) support for a nonlinear transformation also in E. catenatus. The iDDC models allow us to generate landscape-wide maps of predicted genetic diversity for both species, revealing that genetic diversity predictions for the high-elevation E. iheringii align with estimated patterns of historical range stability, whereas predictions for low-elevation E. catenatus are distinct from range-wide stability predictions. This research highlights the importance of accurately modelling the habitat suitability to population size relationship in iDDC studies, contributing to our understanding of species' demographic responses to environmental changes.
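The contrast between a linear and a nonlinear suitability-to-deme-size transformation can be pictured with a minimal sketch. The abstract does not state which functional forms were fitted, so the logistic curve, the function names, and the `midpoint` and `steepness` parameters below are assumptions chosen purely for illustration.

```python
import math


def linear_deme_size(suitability, max_size):
    """Deme size proportional to habitat suitability (suitability in [0, 1])."""
    return max_size * suitability


def sigmoid_deme_size(suitability, max_size, midpoint=0.5, steepness=10.0):
    """Hypothetical nonlinear alternative: a logistic curve in suitability.

    Demes stay small below the midpoint and approach max_size above it.
    """
    return max_size / (1.0 + math.exp(-steepness * (suitability - midpoint)))


# Compare the two transformations on a few illustrative suitability values.
for s in (0.1, 0.5, 0.9):
    print(s, round(linear_deme_size(s, 1000)), round(sigmoid_deme_size(s, 1000)))
```

Under the logistic form, cells of low suitability contribute far fewer individuals than under the linear form, which is one way a nonlinear transformation could change predicted connectivity.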
Affiliation(s)
- Connor M French
- Biology Department, City College of New York, New York, USA
- Biology Ph.D. Program, Graduate Center, City University of New York, New York, USA
| | - Roberta P Damasceno
- Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Mariana M Vasconcellos
- Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Department of Biology, Evolutionary Ecology Group, University of Antwerp, Antwerp, Belgium
| | - Miguel T Rodrigues
- Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
| | - Ana C Carnaval
- Biology Department, City College of New York, New York, USA
- Biology Ph.D. Program, Graduate Center, City University of New York, New York, USA
| | - Michael J Hickerson
- Biology Department, City College of New York, New York, USA
- Biology Ph.D. Program, Graduate Center, City University of New York, New York, USA
- Division of Invertebrate Zoology, American Museum of Natural History, New York, New York, USA
|
2
|
Lu R, Zhu H, Wu X. Estimating mutation rates in a Markov branching process using approximate Bayesian computation. J Theor Biol 2023; 565:111467. [PMID: 36963627 DOI: 10.1016/j.jtbi.2023.111467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 02/15/2023] [Accepted: 03/15/2023] [Indexed: 03/26/2023]
Abstract
Estimating microbial mutation rates is an essential task in evolutionary biology, with wide-ranging applications in related fields such as virology, epidemiology, clinical and public health, and antibiotic research. Significant progress has been made on this research since 1943 when Luria-Delbrück fluctuation analysis was first introduced. However, existing estimators of mutation rates are heavily reliant on model assumptions in fluctuation analysis, and become less applicable to real microbial experiments which deviate from the model assumptions. To overcome this difficulty, we propose to model fluctuation experimental data by a two-type Markov branching process (MBP) and use approximate Bayesian computation (ABC) to estimate the mutation probability parameters. Such an ABC-based mutation rate estimator is based on intensive simulations from the mutation process, thereby taking advantage of modern computing power. Most importantly, its likelihood-free feature allows more complex and realistic setups of the mutation process, especially when the distribution of the number of mutants cannot be easily derived. To further improve computation efficiency, we use a Gaussian process surrogate to substitute the simulator in the ABC algorithm, and call the resulting estimator GPS-ABC. Simulation studies show that, when used to estimate a constant mutation rate in MBP, ABC-based estimators generally outperform traditional moment or likelihood-based estimators. When mutations occur in two stages, i.e., in MBP with a piece-wise constant mutation rate function, traditional mutation rate estimators are no longer applicable, yet GPS-ABC still achieves reasonable estimates. Finally, the proposed GPS-ABC estimator is used to analyze real fluctuation experimental datasets for studying drug resistance.
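The kind of forward simulator that an ABC estimator repeatedly calls can be sketched as follows. This is a deliberately simplified, discrete-generation stand-in for the continuous-time two-type Markov branching process named in the abstract, not the authors' model: the synchronous divisions, the function name `simulate_culture`, and all parameter values are assumptions for illustration.

```python
import random


def simulate_culture(mutation_prob, generations, rng):
    """Grow one culture from a single wild-type cell.

    Simplifying assumptions (not from the source): all cells divide
    synchronously, each daughter of a wild-type cell mutates independently
    with probability mutation_prob, and mutants never revert.
    Returns (wild_type_count, mutant_count).
    """
    wild, mutants = 1, 0
    for _ in range(generations):
        daughters = 2 * wild
        new_mutants = sum(1 for _ in range(daughters) if rng.random() < mutation_prob)
        wild = daughters - new_mutants
        mutants = 2 * mutants + new_mutants
    return wild, mutants


# Mutant counts across parallel cultures, as in a fluctuation experiment.
rng = random.Random(1)
counts = [simulate_culture(1e-3, 12, rng)[1] for _ in range(20)]
print(counts)
```

In an ABC setting, summary statistics of such simulated mutant counts would be compared with those of the observed cultures for each candidate mutation probability.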
Affiliation(s)
- Ruijin Lu
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
| | - Hongxiao Zhu
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America
| | - Xiaowei Wu
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, United States of America.
|
3
|
Baey C, Smith HG, Rundlöf M, Olsson O, Clough Y, Sahlin U. Calibration of a bumble bee foraging model using Approximate Bayesian Computation. Ecol Modell 2023. [DOI: 10.1016/j.ecolmodel.2022.110251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
|
4
|
Cheng YH, Sun MT, Wang N, Gao CZ, Peng HQ, Zhang JY, Gu MM, Lu DB. Population Genetics of Oncomelania hupensis Snails from New-Emerging Snail Habitats in a Currently Schistosoma japonicum Non-Endemic Area. Trop Med Infect Dis 2023; 8:tropicalmed8010042. [PMID: 36668949 PMCID: PMC9861412 DOI: 10.3390/tropicalmed8010042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/29/2022] [Revised: 12/29/2022] [Accepted: 01/03/2023] [Indexed: 01/09/2023] Open
Abstract
Schistosomiasis is still one of the most significant neglected tropical diseases worldwide, and China is endemic for Schistosoma japonicum. With its great achievement in schistosomiasis control, the government of China has set the goal to eliminate the parasitic disease at the country level by 2030. However, one major challenge is the remaining huge areas of habitats for the intermediate host Oncomelania hupensis. This is further exacerbated by an increasing number of newly emerging snail habitats reported each year. Therefore, population genetic studies of snails in such areas will be useful in evaluating the effect of snail control and/or snail dispersal. We therefore sampled snails from newly emerging habitats in Taicang of Jiangsu, China, a currently S. japonicum non-endemic area, from 2014 to 2017, and performed population genetic analyses based on nine microsatellites. Results showed that all snail populations had low genetic diversity, and most genetic variation originated from within snail populations. The estimated effective population size for the 2015 population was infinite. All snails could be separated into two clusters, and further DIYABC analysis revealed that both the 2016 and the 2017 populations may derive from the 2015 population, indicating that the 2017 population must have been missed in the field survey performed in 2016. These findings may have implications for the development of more practical guidelines for snail monitoring and control.
|
5
|
Perez MF, Bonatelli IAS, Romeiro-Brito M, Franco FF, Taylor NP, Zappi DC, Moraes EM. Coalescent-based species delimitation meets deep learning: Insights from a highly fragmented cactus system. Mol Ecol Resour 2021; 22:1016-1028. [PMID: 34669256 DOI: 10.1111/1755-0998.13534] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/03/2021] [Revised: 09/16/2021] [Accepted: 10/12/2021] [Indexed: 11/26/2022]
Abstract
Delimiting species boundaries is a major goal in evolutionary biology. An increasing volume of literature has focused on the challenges of investigating cryptic diversity within complex evolutionary scenarios of speciation, including gene flow and demographic fluctuations. New methods based on model selection, such as approximate Bayesian computation, approximate likelihoods, and machine learning are promising tools arising in this field. Here, we introduce a framework for species delimitation using the multispecies coalescent model coupled with a deep learning algorithm based on convolutional neural networks (CNNs). We compared this strategy with a similar ABC approach. We applied both methods to test species boundary hypotheses based on current and previous taxonomic delimitations as well as genetic data (sequences from 41 loci) in Pilosocereus aurisetus, a cactus species complex with a sky-island distribution and taxonomic uncertainty. To validate our method, we also applied the same strategy on data from widely accepted species from the genus Drosophila. The results show that our CNN approach has a high capacity to distinguish among the simulated species delimitation scenarios, with higher accuracy than ABC. For the cactus data set, a splitter hypothesis without gene flow showed the highest probability in both CNN and ABC approaches, a result agreeing with previous taxonomic classifications and in line with the sky-island distribution and low dispersal of P. aurisetus. Our results highlight the cryptic diversity within the P. aurisetus complex and show that CNNs are a promising approach for distinguishing complex evolutionary histories, even outperforming the accuracy of other model-based approaches such as ABC.
Affiliation(s)
- Manolo F Perez
- Departamento de Biologia, Universidade Federal de São Carlos, Sorocaba, Brazil; Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, Brazil
| | - Isabel A S Bonatelli
- Departamento de Biologia, Universidade Federal de São Carlos, Sorocaba, Brazil; Departamento de Ecologia e Biologia Evolutiva, Universidade Federal de São Paulo, Diadema, Brazil
| | | | - Fernando F Franco
- Departamento de Biologia, Universidade Federal de São Carlos, Sorocaba, Brazil
| | | | - Daniela C Zappi
- Programa de Pós Graduação em Botânica, Instituto de Ciências Biológicas, Universidade de Brasília, Brasília, Brazil
| | - Evandro M Moraes
- Departamento de Biologia, Universidade Federal de São Carlos, Sorocaba, Brazil
|
6
|
Park M, Vinaroz M, Jitkrittum W. ABCDP: Approximate Bayesian Computation with Differential Privacy. Entropy (Basel, Switzerland) 2021; 23:961. [PMID: 34441101 PMCID: PMC8391538 DOI: 10.3390/e23080961] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/19/2021] [Revised: 07/15/2021] [Accepted: 07/20/2021] [Indexed: 11/17/2022]
Abstract
We developed a novel approximate Bayesian computation (ABC) framework, ABCDP, which produces differentially private (DP) and approximate posterior samples. Our framework takes advantage of the sparse vector technique (SVT), widely studied in the differential privacy literature. SVT incurs the privacy cost only when a condition (whether a quantity of interest is above/below a threshold) is met. If the condition is sparsely met during the repeated queries, SVT can drastically reduce the cumulative privacy loss, unlike the usual case where every query incurs the privacy loss. In ABC, the quantity of interest is the distance between observed and simulated data, and only when the distance is below a threshold can we take the corresponding prior sample as a posterior sample. Hence, applying SVT to ABC is an organic way to transform an ABC algorithm to a privacy-preserving variant with minimal modification, but yields the posterior samples with a high privacy level. We theoretically analyzed the interplay between the noise added for privacy and the accuracy of the posterior samples. We apply ABCDP to several data simulators and show the efficacy of the proposed framework.
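The accept/reject step described in this abstract (keep a prior sample only when the distance between observed and simulated data falls below a threshold) can be sketched as plain rejection ABC. This sketch does not include the sparse vector technique or any privacy noise, so it is not ABCDP itself; the Gaussian toy model, the function name `abc_rejection`, and all parameter values are assumptions for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, distance, threshold, n_draws, rng):
    """Plain rejection ABC: return the prior draws whose simulated data
    lie within `threshold` of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)           # draw a parameter from the prior
        simulated = simulator(theta, rng)    # simulate data under that parameter
        if distance(observed, simulated) < threshold:
            accepted.append(theta)           # keep it as an approximate posterior sample
    return accepted


# Toy problem (illustrative only): infer the mean of a unit-variance Gaussian.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    distance=lambda a, b: abs(statistics.fmean(a) - statistics.fmean(b)),
    threshold=0.2,
    n_draws=5000,
    rng=rng,
)
print(len(posterior), statistics.fmean(posterior))
```

Because most prior draws are rejected, the threshold condition is met only sparsely, which is the property the abstract says the sparse vector technique exploits.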
Affiliation(s)
- Mijung Park
- Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Margarita Vinaroz
- Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany
- Department of Computer Science, University of Tübingen, 72076 Tübingen, Germany
|
7
|
Clemente F, Unterländer M, Dolgova O, Amorim CEG, Coroado-Santos F, Neuenschwander S, Ganiatsou E, Cruz Dávalos DI, Anchieri L, Michaud F, Winkelbach L, Blöcher J, Arizmendi Cárdenas YO, Sousa da Mota B, Kalliga E, Souleles A, Kontopoulos I, Karamitrou-Mentessidi G, Philaniotou O, Sampson A, Theodorou D, Tsipopoulou M, Akamatis I, Halstead P, Kotsakis K, Urem-Kotsou D, Panagiotopoulos D, Ziota C, Triantaphyllou S, Delaneau O, Jensen JD, Moreno-Mayar JV, Burger J, Sousa VC, Lao O, Malaspinas AS, Papageorgopoulou C. The genomic history of the Aegean palatial civilizations. Cell 2021; 184:2565-2586.e21. [PMID: 33930288 PMCID: PMC8127963 DOI: 10.1016/j.cell.2021.03.039] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 04/17/2020] [Revised: 09/17/2020] [Accepted: 03/18/2021] [Indexed: 12/30/2022]
Abstract
The Cycladic, the Minoan, and the Helladic (Mycenaean) cultures define the Bronze Age (BA) of Greece. Urbanism, complex social structures, craft and agricultural specialization, and the earliest forms of writing characterize this iconic period. We sequenced six Early to Middle BA whole genomes, along with 11 mitochondrial genomes, sampled from the three BA cultures of the Aegean Sea. The Early BA (EBA) genomes are homogeneous and derive most of their ancestry from Neolithic Aegeans, contrary to earlier hypotheses that the Neolithic-EBA cultural transition was due to massive population turnover. EBA Aegeans were shaped by relatively small-scale migration from East of the Aegean, as evidenced by the Caucasus-related ancestry also detected in Anatolians. In contrast, Middle BA (MBA) individuals of northern Greece differ from EBA populations in showing ∼50% Pontic-Caspian Steppe-related ancestry, dated at ca. 2,600-2,000 BCE. Such gene flow events during the MBA contributed toward shaping present-day Greek genomes.
Affiliation(s)
- Florian Clemente
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Martina Unterländer
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece; Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Olga Dolgova
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain
| | - Carlos Eduardo G Amorim
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Francisco Coroado-Santos
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Samuel Neuenschwander
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Vital-IT, Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Elissavet Ganiatsou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diana I Cruz Dávalos
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Lucas Anchieri
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Frédéric Michaud
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Laura Winkelbach
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Jens Blöcher
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Yami Ommar Arizmendi Cárdenas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Bárbara Sousa da Mota
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Eleni Kalliga
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Angelos Souleles
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Ioannis Kontopoulos
- Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark
| | | | - Olga Philaniotou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Adamantios Sampson
- Department of Mediterranean Studies, University of the Aegean, 85132 Rhodes, Greece
| | - Dimitra Theodorou
- Ephorate of Antiquities of Kozani, Hellenic Ministry of Culture and Sports, 50004 Kozani, Greece
| | - Metaxia Tsipopoulou
- Ephor Emerita of Antiquities, Hellenic Ministry of Culture and Sports, 10682 Athens, Greece
| | - Ioannis Akamatis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Paul Halstead
- Department of Archaeology, University of Sheffield, Minalloy House, 10-16 Regent St., Sheffield S1 3NJ, UK
| | - Kostas Kotsakis
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Dushka Urem-Kotsou
- Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece
| | - Diamantis Panagiotopoulos
- Institute of Classical Archaeology, University of Heidelberg, Marstallhof 4, 69117 Heidelberg, Germany
| | - Christina Ziota
- Ephorate of Antiquities of Florina, Hellenic Ministry of Culture and Sports, 53100 Florina, Greece
| | - Sevasti Triantaphyllou
- Department of History and Archaeology, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
| | - J Víctor Moreno-Mayar
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland; Center for GeoGenetics, GLOBE Institute, University of Copenhagen, 1350 Copenhagen, Denmark; National Institute of Genomic Medicine (INMEGEN), 14610 Mexico City, Mexico
| | - Joachim Burger
- Palaeogenetics Group, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, 55099 Mainz, Germany
| | - Vitor C Sousa
- CE3C, Centre for Ecology, Evolution and Environmental Changes, Faculty of Sciences of the University of Lisbon, 1749-016 Lisbon, Portugal
| | - Oscar Lao
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Anna-Sapfo Malaspinas
- Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.
| | - Christina Papageorgopoulou
- Laboratory of Physical Anthropology, Department of History and Ethnology, Democritus University of Thrace, 69100 Komotini, Greece.
|
8
|
Taniguchi S, Bertl J, Futschik A, Kishino H, Okazaki T. Waves Out of the Korean Peninsula and Inter- and Intra-Species Replacements in Freshwater Fishes in Japan. Genes (Basel) 2021; 12:303. [PMID: 33669929 PMCID: PMC7924830 DOI: 10.3390/genes12020303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/22/2020] [Revised: 02/11/2021] [Accepted: 02/18/2021] [Indexed: 11/24/2022] Open
Abstract
The Japanese archipelago is located at the periphery of the continent of Asia. Rivers in the Japanese archipelago, separated from the continent of Asia by about 17 Ma, have experienced an intermittent exchange of freshwater fish taxa through a narrow land bridge generated by lowered sea level. As the Korean Peninsula and Japanese archipelago were not covered by an ice sheet during glacial periods, phylogeographical analyses in this region can trace the history of biota that were, for a long time, beyond the last glacial maximum. In this study, we analyzed the phylogeography of four freshwater fish taxa, Hemibarbus longirostris, dark chub Nipponocypris temminckii, Tanakia spp. and Carassius spp., whose distributions include both the Korean Peninsula and Western Japan. We found for each taxon that a small component of diverse Korean clades of freshwater fishes migrated in waves into the Japanese archipelago to form the current phylogeographic structure of biota. The replacements of indigenous populations by succeeding migrants may have also influenced the phylogeography.
Affiliation(s)
- Shoji Taniguchi
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Johanna Bertl
- Department of Mathematics, Aarhus University, Ny Munkegade, 118, bldg. 1530, 8000 Aarhus C, Denmark
| | - Andreas Futschik
- Department of Applied Statistics, Johannes Kepler University Linz, Altenberger Str. 69, 4040 Linz, Austria
| | - Hirohisa Kishino
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
| | - Toshio Okazaki
- Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1, Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
|
9
|
Sanchez T, Cury J, Charpiat G, Jay F. Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation. Mol Ecol Resour 2020; 21:2645-2660. [DOI: 10.1111/1755-0998.13224] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 01/17/2020] [Revised: 06/19/2020] [Accepted: 07/02/2020] [Indexed: 12/28/2022]
Affiliation(s)
- Théophile Sanchez
- Laboratoire de Recherche en Informatique CNRS UMR 8623 Université Paris‐Saclay Orsay France
| | - Jean Cury
- Laboratoire de Recherche en Informatique CNRS UMR 8623 Université Paris‐Saclay Orsay France
| | - Guillaume Charpiat
- Laboratoire de Recherche en Informatique CNRS UMR 8623 Université Paris‐Saclay Orsay France
| | - Flora Jay
- Laboratoire de Recherche en Informatique CNRS UMR 8623 Université Paris‐Saclay Orsay France
|
10
|
Abstract
Despite the efforts made to reconstruct the history of modern humans, there are still poorly explored regions that are key for understanding the phylogeography of our species. One of them is the Philippines, which is crucial to unravel the colonization of Southeast Asia and Oceania but where little is known about when and how the first humans arrived. In order to shed light on this settlement, we collected samples from 157 individuals from the Philippines with the four grandparents belonging to the same region and mitochondrial variants older than 20,000 years. Next, we analyzed the hypervariable I mtDNA region by approximate Bayesian computation based on extensive spatially explicit computer simulations to select among several migration routes towards the Philippines and to estimate population genetic parameters of this colonization. We found that the colonization of the Philippines occurred more than 60,000 years ago, with long-distance dispersal and from both north and south migration routes. Our results also suggest an environmental scenario especially optimal for humans, with large carrying capacity and population growth, in comparison to other regions of Asia. In all, our study suggests a rapid expansion of modern humans towards the Philippines that could be associated with the establishment of maritime technologies and favorable environmental conditions.
|
11
|
Suárez-Atilano M, Cuarón AD, Vázquez-Domínguez E. Deciphering Geographical Affinity and Reconstructing Invasion Scenarios of Boa imperator on the Caribbean Island of Cozumel. Copeia 2019. [DOI: 10.1643/cg-18-102] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/24/2022]
Affiliation(s)
- Marco Suárez-Atilano
- Departamento de Ecología de la Biodiversidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Ap. Postal 70-275, Ciudad Universitaria, Ciudad de México, 04510, México
| | - Alfredo D. Cuarón
- SACBÉ—Servicios Ambientales, Conservación Biológica y Educación A.C., Casa del General 1er piso, Rancho Chichihualco, km 4.5 Carretera Costera Zona Hotelera Norte, Cozumel, Quintana Roo 77613, México
| | - Ella Vázquez-Domínguez
- Departamento de Ecología de la Biodiversidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Ap. Postal 70-275, Ciudad Universitaria, Ciudad de México, 04510, México
|
12
|
Feder AF, Pennings PS, Hermisson J, Petrov DA. Evolutionary Dynamics in Structured Populations Under Strong Population Genetic Forces. G3 (Bethesda, Md.) 2019; 9:3395-3407. [PMID: 31462443 PMCID: PMC6778802 DOI: 10.1534/g3.119.400605] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/04/2019] [Accepted: 08/16/2019] [Indexed: 12/16/2022]
Abstract
In the long-term neutral equilibrium, high rates of migration between subpopulations result in little population differentiation. However, in the short-term, even very abundant migration may not be enough for subpopulations to equilibrate immediately. In this study, we investigate dynamical patterns of short-term population differentiation in adapting populations via stochastic and analytical modeling through time. We characterize a regime in which selection and migration interact to create non-monotonic patterns of population differentiation over time when migration is weaker than selection, but stronger than drift. We demonstrate how these patterns can be leveraged to estimate high migration rates using approximate Bayesian computation. We apply this approach to estimate fast migration in a rapidly adapting intra-host Simian-HIV population sampled from different anatomical locations. We find differences in estimated migration rates between different compartments, even though all are above [Formula: see text] = 1. This work demonstrates how studying demographic processes on the timescale of selective sweeps illuminates processes too fast to leave signatures on neutral timescales.
Affiliation(s)
- Alison F Feder
- Department of Biology, Stanford University
- Department of Integrative Biology, University of California Berkeley
|
13
|
Wang W, Zheng Y, Zhao J, Yao M. Low genetic diversity in a critically endangered primate: shallow evolutionary history or recent population bottleneck? BMC Evol Biol 2019; 19:134. [PMID: 31242851 PMCID: PMC6595580 DOI: 10.1186/s12862-019-1451-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/11/2019] [Accepted: 05/31/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Current patterns of population genetic variation may have been shaped by long-term evolutionary history and contemporary demographic processes. Understanding the underlying mechanisms that yield those patterns is crucial for informed conservation of endangered species. The critically endangered white-headed langur, Trachypithecus leucocephalus, is endemic to a narrow range in southwest China. This species shows very low genetic diversity in its 2 main relict populations, Fusui and Chongzuo. Whether this has been caused by a short evolutionary history or recent population declines is unknown. Therefore, we investigated the contributions of historical and recent population demographic changes to population genetic diversity by using 15 nuclear microsatellite markers and mitochondrial DNA (mtDNA) control region sequences. RESULTS: Using genetic data from 214 individuals we found a total of 9 mtDNA haplotypes in the Fusui population but only 1 haplotype in the Chongzuo population, and we found an overall low genetic diversity (haplotype and nucleotide diversities: h = 0.486 ± 0.036; π = 0.0028 ± 0.0003). The demographic history inferred from mtDNA and microsatellite markers revealed no evidence for historical population size fluctuations or recent population bottlenecks. Simulations of possible population divergence histories inferred by DIYABC analysis supported a recent divergence of the Chongzuo population from the Fusui population and no population bottlenecks. CONCLUSIONS: Despite severe population declines caused by anthropogenic activities in the last century, the low genetic diversity of the extant white-headed langur populations is most likely primarily due to the species' shallow evolutionary history and to a recent, local population founder event.
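Haplotype diversity (h), reported in this abstract, is commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator or software was used, so the sketch below assumes that standard formula; the sample labels are made-up toy data, not the langur haplotypes.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample: one common haplotype and two rare ones (hypothetical labels).
sample = ["H1"] * 7 + ["H2"] * 2 + ["H3"]
print(round(haplotype_diversity(sample), 3))
```

A population fixed for a single haplotype, as described for Chongzuo, gives h = 0 under this estimator.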
Affiliation(s)
- Weiran Wang
- School of Life Sciences, Peking University, Beijing, 100871, China; Institute of Ecology, Peking University, Beijing, 100871, China; Beijing National Day School, Beijing, 100871, China
| | - Yitao Zheng
- School of Life Sciences, Peking University, Beijing, 100871, China; Institute of Ecology, Peking University, Beijing, 100871, China
| | - Jindong Zhao
- School of Life Sciences, Peking University, Beijing, 100871, China; Institute of Ecology, Peking University, Beijing, 100871, China
| | - Meng Yao
- School of Life Sciences, Peking University, Beijing, 100871, China; Institute of Ecology, Peking University, Beijing, 100871, China
| |
Collapse
|
14
|
Cohen P, Privman E. Speciation and hybridization in invasive fire ants. BMC Evol Biol 2019; 19:111. [PMID: 31142287 PMCID: PMC6542140 DOI: 10.1186/s12862-019-1437-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/29/2018] [Accepted: 05/13/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: A major focus of evolutionary biology is the formation of reproductive barriers leading to divergence and ultimately, speciation. Often, it is not clear whether the separation of populations is complete or if there still is ongoing gene flow in the form of rare cases of admixture, known as isolation with migration. Here, we studied the speciation of two fire ant species, Solenopsis invicta and Solenopsis richteri, both native to South America, both inadvertently introduced to North America in the early twentieth century. While the two species are known to admix in the introduced range, in the native range no hybrids were found. RESULTS: We conducted a population genomic survey of native and introduced populations of the two species using reduced representation genomic sequencing of 337 samples. Using maximum likelihood analysis over native range samples, we found no evidence of any gene flow between the species since they diverged. We estimated their time of divergence to 190,000 (100,000-350,000) generations ago. Modelling the demographic history of native and introduced S. invicta populations, we evaluated their divergence times and historic and contemporary population sizes, including the original founder population in North America, which was estimated at 26 (10-93) unrelated singly-mated queens. CONCLUSIONS: We provide evidence for complete genetic isolation maintained between two invasive species in their native range, based, for the first time, on large scale genomic data analysis. The results lay the foundations for further studies into different stages in the formation of genetic barriers in dynamic, invasive populations.
Affiliation(s)
- Pnina Cohen
  - Department of Evolution and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
- Eyal Privman
  - Department of Evolution and Environmental Biology, Institute of Evolution, University of Haifa, Haifa, Israel
|
15
|
Izbicki R, Lee AB, Pospisil T. ABC–CDE: Toward Approximate Bayesian Computation With Complex High-Dimensional Data and Limited Simulations. J Comput Graph Stat 2019. [DOI: 10.1080/10618600.2018.1546594] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Affiliation(s)
- Rafael Izbicki
  - Department of Statistics, Federal University of São Carlos, São Carlos, Brazil
- Ann B. Lee
  - Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA
- Taylor Pospisil
  - Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA
|
16
|
Genomic evidence for shared common ancestry of East African hunting-gathering populations and insights into local adaptation. Proc Natl Acad Sci U S A 2019; 116:4166-4175. [PMID: 30782801 DOI: 10.1073/pnas.1817678116] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 12/18/2022] Open
Abstract
Anatomically modern humans arose in Africa ∼300,000 years ago, but the demographic and adaptive histories of African populations are not well-characterized. Here, we have generated a genome-wide dataset from 840 Africans, residing in western, eastern, southern, and northern Africa, belonging to 50 ethnicities, and speaking languages belonging to four language families. In addition to agriculturalists and pastoralists, our study includes 16 populations that practice, or until recently have practiced, a hunting-gathering (HG) lifestyle. We observe that genetic structure in Africa is broadly correlated not only with geography, but to a lesser extent, with linguistic affiliation and subsistence strategy. Four East African HG (EHG) populations that are geographically distant from each other show evidence of common ancestry: the Hadza and Sandawe in Tanzania, who speak languages with clicks classified as Khoisan; the Dahalo in Kenya, whose language has remnant clicks; and the Sabue in Ethiopia, who speak an unclassified language. Additionally, we observed common ancestry between central African rainforest HGs and southern African San, the latter of whom speak languages with clicks classified as Khoisan. With the exception of the EHG, central African rainforest HGs, and San, other HG groups in Africa appear genetically similar to neighboring agriculturalist or pastoralist populations. We additionally demonstrate that infectious disease, immune response, and diet have played important roles in the adaptive landscape of African history. However, while the broad biological processes involved in recent human adaptation in Africa are often consistent across populations, the specific loci affected by selective pressures more often vary across populations.
|
17
|
Rodrigues G, Prangle D, Sisson S. Recalibration: A post-processing method for approximate Bayesian computation. Comput Stat Data Anal 2018. [DOI: 10.1016/j.csda.2018.04.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/17/2022]
|
18
|
Parker A, Simpson MJ, Baker RE. The impact of experimental design choices on parameter inference for models of growing cell colonies. Royal Society Open Science 2018; 5:180384. [PMID: 30225025 PMCID: PMC6124093 DOI: 10.1098/rsos.180384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/09/2018] [Accepted: 07/18/2018] [Indexed: 06/08/2023]
Abstract
To better understand development, repair and disease progression, it is useful to quantify the behaviour of proliferative and motile cell populations as they grow and expand to fill their local environment. Inferring parameters associated with mechanistic models of cell colony growth using quantitative data collected from carefully designed experiments provides a natural means to elucidate the relative contributions of various processes to the growth of the colony. In this work, we explore how experimental design impacts our ability to infer parameters for simple models of the growth of proliferative and motile cell populations. We adopt a Bayesian approach, which allows us to characterize the uncertainty associated with estimates of the model parameters. Our results suggest that experimental designs that incorporate initial spatial heterogeneities in cell positions facilitate parameter inference without the requirement of cell tracking, while designs that involve uniform initial placement of cells require cell tracking for accurate parameter inference. As cell tracking is an experimental bottleneck in many studies of this type, our recommendations for experimental design provide for significant potential time and cost savings in the analysis of cell colony growth.
Affiliation(s)
- Andrew Parker
  - Mathematical Institute, University of Oxford, Oxford, UK
- Matthew J. Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
- Ruth E. Baker
  - Mathematical Institute, University of Oxford, Oxford, UK
|
19
|
Fraïsse C, Roux C, Gagnaire PA, Romiguier J, Faivre N, Welch JJ, Bierne N. The divergence history of European blue mussel species reconstructed from Approximate Bayesian Computation: the effects of sequencing techniques and sampling strategies. PeerJ 2018; 6:e5198. [PMID: 30083438 PMCID: PMC6071616 DOI: 10.7717/peerj.5198] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 02/01/2018] [Accepted: 06/19/2018] [Indexed: 01/25/2023] Open
Abstract
Genome-scale diversity data are increasingly available in a variety of biological systems, and can be used to reconstruct the past evolutionary history of species divergence. However, extracting the full demographic information from these data is not trivial, and requires inferential methods that account for the diversity of coalescent histories throughout the genome. Here, we evaluate the potential and limitations of one such approach. We reexamine a well-known system of mussel sister species, using the joint site frequency spectrum (jSFS) of synonymous mutations computed either from exome capture or RNA-seq, in an Approximate Bayesian Computation (ABC) framework. We first assess the best sampling strategy (number of: individuals, loci, and bins in the jSFS), and show that model selection is robust to variation in the number of individuals and loci. In contrast, different binning choices when summarizing the jSFS strongly affect the results: including classes of low and high frequency shared polymorphisms can more effectively reveal recent migration events. We then take advantage of the flexibility of ABC to compare more realistic models of speciation, including variation in migration rates through time (i.e., periodic connectivity) and across genes (i.e., genome-wide heterogeneity in migration rates). We show that these models were consistently selected as the most probable, suggesting that mussels have experienced a complex history of gene flow during divergence and that the species boundary is semi-permeable. Our work provides a comprehensive evaluation of ABC demographic inference in mussels based on the coding jSFS, and supplies guidelines for employing different sequencing techniques and sampling strategies. We emphasize, perhaps surprisingly, that inferences are less limited by the volume of data than by the way in which they are analyzed.
Affiliation(s)
- Christelle Fraïsse
  - Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
  - Department of Genetics, University of Cambridge, Cambridge, UK
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Camille Roux
  - Université de Lille, Unité Evo-Eco-Paléo (EEP), UMR 8198, Villeneuve d’Ascq, France
- Pierre-Alexandre Gagnaire
  - Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Jonathan Romiguier
  - Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Nicolas Faivre
  - Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
  - Department of Genetics, University of Cambridge, Cambridge, UK
- John J. Welch
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Nicolas Bierne
  - Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
  - Department of Genetics, University of Cambridge, Cambridge, UK
|
20
|
Chatterjee S, Linford MR. Reordered (Sorted) Spectra. A Tool for Understanding Pattern Recognition Entropy (PRE) and Spectra in General. Bulletin of the Chemical Society of Japan 2018. [DOI: 10.1246/bcsj.20180027] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022]
Affiliation(s)
- Shiladitya Chatterjee
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, Utah 84602, USA
- Matthew R. Linford
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, Utah 84602, USA
|
21
|
Giurghita D, Husmeier D. Statistical modelling of cell movement. Stat Neerl 2018. [DOI: 10.1111/stan.12140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Diana Giurghita
  - School of Mathematics and Statistics; University of Glasgow; Glasgow G12 8QQ UK
- Dirk Husmeier
  - School of Mathematics and Statistics; University of Glasgow; Glasgow G12 8QQ UK
|
22
|
Barbu CM, Sethuraman K, Billig EMW, Levy MZ. Two-scale dispersal estimation for biological invasions via synthetic likelihood. Ecography 2018; 41:661-672. [PMID: 30104817 PMCID: PMC6086346 DOI: 10.1111/ecog.02575] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/08/2023]
Abstract
Biological invasions reshape environments and affect the ecological and economic welfare of states and communities. Such invasions advance on multiple spatial scales, complicating their control. When modeling stochastic dispersal processes, intractable likelihoods and autocorrelated data complicate parameter estimation. As with other approaches, the recent synthetic likelihood framework for stochastic models uses summary statistics to reduce this complexity; however, it additionally provides usable likelihoods, facilitating the use of existing likelihood-based machinery. Here, we extend this framework to parameterize multi-scale spatio-temporal dispersal models and compare existing and newly developed spatial summary statistics to characterize dispersal patterns. We provide general methods to evaluate potential summary statistics and present a fitting procedure that accurately estimates dispersal parameters on simulated data. Finally, we apply our methods to quantify the short and long range dispersal of Chagas disease vectors in urban Arequipa, Peru, and assess the feasibility of a purely reactive strategy to contain the invasion.
Affiliation(s)
- Corentin M. Barbu
  - Department of Biostatistics & Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
  - UMR Agronomie, INRA, AgroParisTech, Université Paris-Saclay, 78850 Thiverval-Grignon, France
- Karthik Sethuraman
  - Department of Biostatistics & Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- Erica M. W. Billig
  - Department of Biostatistics & Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- Michael Z. Levy
  - Department of Biostatistics & Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
|
23
|
Schrider DR, Kern AD. Supervised Machine Learning for Population Genetics: A New Paradigm. Trends Genet 2018; 34:301-312. [PMID: 29331490 PMCID: PMC5905713 DOI: 10.1016/j.tig.2017.12.005] [Citation(s) in RCA: 220] [Impact Index Per Article: 31.4] [Received: 10/20/2017] [Revised: 11/29/2017] [Accepted: 12/08/2017] [Indexed: 01/21/2023]
Abstract
As population genomic datasets grow in size, researchers are faced with the daunting task of making sense of a flood of information. To keep pace with this explosion of data, computational methodologies for population genetic inference are rapidly being developed to best utilize genomic sequence data. In this review we discuss a new paradigm that has emerged in computational population genomics: that of supervised machine learning (ML). We review the fundamentals of ML, discuss recent applications of supervised ML to population genetics that outperform competing methods, and describe promising future directions in this area. Ultimately, we argue that supervised ML is an important and underutilized tool that has considerable potential for the world of evolutionary genomics.
Affiliation(s)
- Daniel R Schrider
  - Department of Genetics, and Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 08554, USA
- Andrew D Kern
  - Department of Genetics, and Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 08554, USA
|
24
|
Lintusaari J, Gutmann MU, Dutta R, Kaski S, Corander J. Fundamentals and Recent Developments in Approximate Bayesian Computation. Syst Biol 2018; 66:e66-e82. [PMID: 28175922 PMCID: PMC5837704 DOI: 10.1093/sysbio/syw077] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 10/30/2015] [Revised: 08/09/2016] [Accepted: 08/09/2016] [Indexed: 12/16/2022] Open
Abstract
Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.]
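The requirement stated in this abstract, that ABC only needs the ability to sample from the model, can be illustrated with a minimal rejection-ABC sketch. Everything specific below is an assumption made for illustration and is not taken from the cited paper: the toy Normal(mu, 1) simulator, the Uniform(-5, 5) prior, the sample mean as summary statistic, the tolerance, and the function names `simulate` and `rejection_abc`.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Hypothetical simulator: n independent draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def rejection_abc(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within `tolerance` of the observed one."""
    obs_summary = statistics.fmean(observed)  # summary statistic: sample mean
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # assumed prior: Uniform(-5, 5)
        sim_summary = statistics.fmean(simulate(mu, len(observed), rng))
        # The likelihood is never evaluated; only simulated and observed
        # summaries are compared.
        if abs(sim_summary - obs_summary) <= tolerance:
            accepted.append(mu)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # synthetic "observed" data, true mu = 2
posterior_sample = rejection_abc(observed, n_draws=5000, tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior_sample)
```

The accepted values of `mu` approximate a sample from the posterior; a smaller tolerance should tighten the approximation at the cost of fewer accepted draws.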
Affiliation(s)
- Jarno Lintusaari
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Helsinki Institute for Information Technology HIIT, Espoo, Finland
- Michael U Gutmann
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Helsinki Institute for Information Technology HIIT, Espoo, Finland
  - Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Ritabrata Dutta
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Helsinki Institute for Information Technology HIIT, Espoo, Finland
- Samuel Kaski
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Helsinki Institute for Information Technology HIIT, Espoo, Finland
- Jukka Corander
  - Helsinki Institute for Information Technology HIIT, Espoo, Finland
  - Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
  - Department of Biostatistics, University of Oslo, Oslo, Norway
|
25
|
Karabatsos G, Leisen F. An approximate likelihood perspective on ABC methods. Statistics Surveys 2018. [DOI: 10.1214/18-ss120] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/19/2022]
|
26
|
Pimenta J, Lopes AM, Comas D, Amorim A, Arenas M. Evaluating the Neolithic Expansion at Both Shores of the Mediterranean Sea. Mol Biol Evol 2017; 34:3232-3242. [DOI: 10.1093/molbev/msx256] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023] Open
|
27
|
McDermott PL, Wikle CK, Millspaugh J. Hierarchical Nonlinear Spatio-temporal Agent-Based Models for Collective Animal Movement. Journal of Agricultural, Biological and Environmental Statistics 2017. [DOI: 10.1007/s13253-017-0289-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
|
28
|
Buerkle CA. Inconvenient truths in population and speciation genetics point towards a future beyond allele frequencies. J Evol Biol 2017; 30:1498-1500. [DOI: 10.1111/jeb.13106] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 03/31/2017] [Revised: 04/21/2017] [Accepted: 04/24/2017] [Indexed: 11/28/2022]
Affiliation(s)
- C. A. Buerkle
  - Department of Botany; University of Wyoming; Laramie WY USA
|
29
|
Inferring rooted species trees from unrooted gene trees using approximate Bayesian computation. Mol Phylogenet Evol 2017; 116:13-24. [PMID: 28780022 DOI: 10.1016/j.ympev.2017.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/15/2016] [Revised: 03/26/2017] [Accepted: 07/22/2017] [Indexed: 02/01/2023]
Abstract
Methods for inferring species trees from gene trees motivated by incomplete lineage sorting typically use either rooted gene trees to infer a rooted species tree, or use unrooted gene trees to infer an unrooted species tree, which is then typically rooted using one or more outgroups. Theoretically, however, it has been known since 2011 that it is possible to consistently infer the root of the species tree directly from unrooted gene trees without assuming an outgroup. Here, we use approximate Bayesian computation to infer the root of the species tree from unrooted gene trees assuming the multispecies coalescent model. It is hoped that this approach will be useful in cases where an appropriate outgroup is difficult to find and gene trees do not follow a molecular clock. This approach could also be useful when there is prior information that makes a small number of root locations plausible in an unrooted species tree.
|
30
|
Cabrera AA, Palsbøll PJ. Inferring past demographic changes from contemporary genetic data: A simulation-based evaluation of the ABC methods implemented in diyabc. Mol Ecol Resour 2017; 17:e94-e110. [DOI: 10.1111/1755-0998.12696] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 12/07/2016] [Revised: 06/12/2017] [Accepted: 06/20/2017] [Indexed: 01/19/2023]
Affiliation(s)
- Andrea A. Cabrera
  - Marine Evolution and Conservation; Groningen Institute of Evolutionary Life Sciences; University of Groningen; Groningen The Netherlands
- Per J. Palsbøll
  - Marine Evolution and Conservation; Groningen Institute of Evolutionary Life Sciences; University of Groningen; Groningen The Netherlands
|
31
|
Gutmann MU, Dutta R, Kaski S, Corander J. Likelihood-free inference via classification. Statistics and Computing 2017; 28:411-425. [PMID: 31997856 PMCID: PMC6956883 DOI: 10.1007/s11222-017-9738-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/27/2016] [Accepted: 02/28/2017] [Indexed: 06/10/2023]
Abstract
Increasingly complex generative models are being used across disciplines as they allow for realistic characterization of data, but a common difficulty with them is the prohibitively large computational cost to evaluate the likelihood function and thus to perform likelihood-based statistical inference. A likelihood-free inference framework has emerged where the parameters are identified by finding values that yield simulated data resembling the observed data. While widely applicable, a major difficulty in this framework is how to measure the discrepancy between the simulated and observed data. Transforming the original problem into a problem of classifying the data into simulated versus observed, we find that classification accuracy can be used to assess the discrepancy. The complete arsenal of classification methods becomes thereby available for inference of intractable generative models. We validate our approach using theory and simulations for both point estimation and Bayesian inference, and demonstrate its use on real data by inferring an individual-based epidemiological model for bacterial infections in child care centers.
Affiliation(s)
- Ritabrata Dutta
  - InterDisciplinary Institute of Data Science, Università della Svizzera italiana, Lugano, Switzerland
- Samuel Kaski
  - Helsinki Institute for Information Technology, Department of Computer Science, Aalto University, Espoo, Finland
- Jukka Corander
  - Department of Biostatistics, University of Oslo, Oslo, Norway
  - Helsinki Institute for Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
|
32
|
Inferring epidemiological parameters from phylogenies using regression-ABC: A comparative study. PLoS Comput Biol 2017; 13:e1005416. [PMID: 28263987 PMCID: PMC5358897 DOI: 10.1371/journal.pcbi.1005416] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 04/27/2016] [Revised: 03/20/2017] [Accepted: 02/16/2017] [Indexed: 02/06/2023] Open
Abstract
Inferring epidemiological parameters such as the R0 from time-scaled phylogenies is a timely challenge. Most current approaches rely on likelihood functions, which raise specific issues that range from computing these functions to finding their maxima numerically. Here, we present a new regression-based Approximate Bayesian Computation (ABC) approach, which we base on a large variety of summary statistics intended to capture the information contained in the phylogeny and its corresponding lineage-through-time plot. The regression step involves the Least Absolute Shrinkage and Selection Operator (LASSO) method, which is a robust machine learning technique. It allows us to readily deal with the large number of summary statistics, while avoiding resorting to Markov Chain Monte Carlo (MCMC) techniques. To compare our approach to existing ones, we simulated target trees under a variety of epidemiological models and settings, and inferred parameters of interest using the same priors. We found that, for large phylogenies, the accuracy of our regression-ABC is comparable to that of likelihood-based approaches involving birth-death processes implemented in BEAST2. Our approach even outperformed these when inferring the host population size with a Susceptible-Infected-Removed epidemiological model. It also clearly outperformed a recent kernel-ABC approach when assuming a Susceptible-Infected epidemiological model with two host types. Lastly, by re-analyzing data from the early stages of the recent Ebola epidemic in Sierra Leone, we showed that regression-ABC provides more realistic estimates for the duration parameters (latency and infectiousness) than the likelihood-based method. Overall, ABC based on a large variety of summary statistics and a regression method able to perform variable selection and avoid overfitting is a promising approach to analyze large phylogenies. 
Given the rapid evolution of many pathogens, analysing their genomes by means of phylogenies can inform us about how they spread. This is the focus of the field known as “phylodynamics”. Most existing methods inferring epidemiological parameters from virus phylogenies are limited by the difficulty of handling complex likelihood functions, which commonly incorporate latent variables. Here, we use an alternative method known as regression-based Approximate Bayesian Computation (ABC), which circumvents this problem by using simulations and dataset comparisons. Since phylogenies are difficult to compare to one another, we introduce many summary statistics to describe them and take advantage of current machine learning techniques able to perform variable selection. We show that the accuracy we reach is comparable to that of existing methods. This accuracy increases with phylogeny size and can even be higher than that of existing methods for some parameters. Overall, regression-based ABC opens new perspectives to infer epidemiological parameters from large phylogenies.
|
33
|
Guirao-Rico S, Sánchez-Gracia A, Charlesworth D. Sequence diversity patterns suggesting balancing selection in partially sex-linked genes of the plant Silene latifolia are not generated by demographic history or gene flow. Mol Ecol 2017; 26:1357-1370. [PMID: 28035715 DOI: 10.1111/mec.13969] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/11/2016] [Revised: 12/02/2016] [Accepted: 12/12/2016] [Indexed: 01/16/2023]
Abstract
DNA sequence diversity in genes in the partially sex-linked pseudoautosomal region (PAR) of the sex chromosomes of the plant Silene latifolia is higher than expected from within-species diversity of other genes. This could be the footprint of sexually antagonistic (SA) alleles that are maintained by balancing selection in a PAR gene (or genes) and affect polymorphism in linked genome regions. SA selection is predicted to occur during sex chromosome evolution, but it is important to test whether the unexpectedly high sequence polymorphism could be explained without it, purely by the combined effects of partial linkage with the sex-determining region and the population's demographic history, including possible introgression from Silene dioica. To test this, we applied approximate Bayesian computation-based model choice to autosomal sequence diversity data, to find the most plausible scenario for the recent history of S. latifolia and then to estimate the posterior density of the most relevant parameters. We then used these densities to simulate variation to be expected at PAR genes. We conclude that an excess of variants at high frequencies at PAR genes should arise in S. latifolia populations only for genes with strong associations with fully sex-linked genes, which requires closer linkage with the fully sex-linked region than that estimated for the PAR genes where apparent deviations from neutrality were observed. These results support the need to invoke selection to explain the S. latifolia PAR gene diversity, and encourage further work to test the possibility of balancing selection due to sexual antagonism.
Affiliation(s)
- Sara Guirao-Rico
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JN, UK
- Alejandro Sánchez-Gracia
  - Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat, Universitat de Barcelona, Av. Diagonal 643, Barcelona, 08028, Spain
- Deborah Charlesworth
  - Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JN, UK
|
34
|
Rousset F, Gouy A, Martinez-Almoyna C, Courtiol A. The summary-likelihood method and its implementation in the Infusion package. Mol Ecol Resour 2016; 17:110-119. [DOI: 10.1111/1755-0998.12627] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/28/2016] [Revised: 10/03/2016] [Accepted: 10/11/2016] [Indexed: 11/26/2022]
Affiliation(s)
- François Rousset
- CNRS; IRD; EPHE; CC065; Institut des Sciences de l’Évolution; University of Montpellier; Pl. E. Bataillon 34095 Montpellier France
- Institut de Biologie Computationnelle; University of Montpellier; CC05019, 860 rue St Priest, 34095 Montpellier France
| | - Alexandre Gouy
- CNRS; IRD; EPHE; CC065; Institut des Sciences de l’Évolution; University of Montpellier; Pl. E. Bataillon 34095 Montpellier France
- Institute of Ecology and Evolution; University of Berne; Baltzerstrasse 6 CH-3012 Berne Switzerland
| | - Camille Martinez-Almoyna
- CNRS; IRD; EPHE; CC065; Institut des Sciences de l’Évolution; University of Montpellier; Pl. E. Bataillon 34095 Montpellier France
| | - Alexandre Courtiol
- Leibniz Institute for Zoo and Wildlife Research; 10315 Berlin Germany
- Berlin Center for Genomics in Biodiversity Research (BeGenDiv); 14195 Berlin Germany
| |
|
35
|
Importance of incomplete lineage sorting and introgression in the origin of shared genetic variation between two closely related pines with overlapping distributions. Heredity (Edinb) 2016; 118:211-220. [PMID: 27649619 PMCID: PMC5315522 DOI: 10.1038/hdy.2016.72] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2015] [Revised: 06/24/2016] [Accepted: 06/29/2016] [Indexed: 02/01/2023] Open
Abstract
Genetic variation shared between closely related species may be due to retention of ancestral polymorphisms because of incomplete lineage sorting (ILS) and/or introgression following secondary contact. It is challenging to distinguish ILS and introgression because they generate similar patterns of shared genetic diversity, but this is nonetheless essential for inferring accurately the history of species with overlapping distributions. To address this issue, we sequenced 33 independent intron loci across the genome of two closely related pine species (Pinus massoniana Lamb. and Pinus hwangshanensis Hisa) from Southeast China. Population structure analyses revealed that the species showed slightly more admixture in parapatric populations than in allopatric populations. Levels of interspecific differentiation were lower in parapatry than in allopatry. Approximate Bayesian computation suggested that the most likely speciation scenario explaining this pattern was a long period of isolation followed by a secondary contact. Ecological niche modeling suggested that a gradual range expansion of P. hwangshanensis during the Pleistocene climatic oscillations could have been the cause of the overlap. Our study therefore suggests that secondary introgression, rather than ILS, explains most of the shared nuclear genomic variation between these two species and demonstrates the complementarity of population genetics and ecological niche modeling in understanding gene flow history. Finally, we discuss the importance of contrasting results from markers with different dynamics of migration, namely nuclear, chloroplast and mitochondrial DNA.
|
36
|
Kousathanas A, Leuenberger C, Helfer J, Quinodoz M, Foll M, Wegmann D. Likelihood-Free Inference in High-Dimensional Models. Genetics 2016; 203:893-904. [PMID: 27052569 PMCID: PMC4896201 DOI: 10.1534/genetics.116.187567] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2016] [Accepted: 04/04/2016] [Indexed: 11/18/2022] Open
Abstract
Methods that bypass analytical evaluations of the likelihood function have become an indispensable tool for statistical inference in many fields of science. These so-called likelihood-free methods rely on accepting and rejecting simulations based on summary statistics, which limits them to low-dimensional models for which the value of the likelihood is large enough to result in manageable acceptance rates. To get around these issues, we introduce a novel, likelihood-free Markov chain Monte Carlo (MCMC) method combining two key innovations: updating only one parameter per iteration and accepting or rejecting this update based on subsets of statistics approximately sufficient for this parameter. This increases acceptance rates dramatically, rendering this approach suitable even for models of very high dimensionality. We further derive that for linear models, a one-dimensional combination of statistics per parameter is sufficient and can be found empirically with simulations. Finally, we demonstrate that our method readily scales to models of very high dimensionality, using toy models as well as by jointly inferring the effective population size, the distribution of fitness effects (DFE) of segregating mutations, and selection coefficients for each locus from data of a recent experiment on the evolution of drug resistance in influenza.
Affiliation(s)
- Athanasios Kousathanas
- Department of Biology and Biochemistry, University of Fribourg, 1700 Fribourg, Switzerland Swiss Institute of Bioinformatics, 1700 Fribourg, Switzerland
| | | | - Jonas Helfer
- Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge Massachusetts 02139
| | - Mathieu Quinodoz
- Department of Computational Biology, University of Lausanne, 1200 Lausanne, Switzerland
| | - Matthieu Foll
- International Agency for Research on Cancer, 69372 Lyon, France
| | - Daniel Wegmann
- Department of Biology and Biochemistry, University of Fribourg, 1700 Fribourg, Switzerland Swiss Institute of Bioinformatics, 1700 Fribourg, Switzerland
| |
|
37
|
Sheehan S, Song YS. Deep Learning for Population Genetic Inference. PLoS Comput Biol 2016; 12:e1004845. [PMID: 27018908 PMCID: PMC4809617 DOI: 10.1371/journal.pcbi.1004845] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2015] [Accepted: 03/02/2016] [Indexed: 02/05/2023] Open
Abstract
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Affiliation(s)
- Sara Sheehan
- Department of Computer Science, Smith College, Northampton, Massachusetts, United States of America
- Computer Science Division, UC Berkeley, Berkeley, California, United States of America
| | - Yun S. Song
- Computer Science Division, UC Berkeley, Berkeley, California, United States of America
- Department of Statistics, UC Berkeley, Berkeley, California, United States of America
- Department of Integrative Biology, UC Berkeley, Berkeley, California, United States of America
- Department of Mathematics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| |
|
38
|
Deschamps M, Laval G, Fagny M, Itan Y, Abel L, Casanova JL, Patin E, Quintana-Murci L. Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes. Am J Hum Genet 2016; 98:5-21. [PMID: 26748513 DOI: 10.1016/j.ajhg.2015.11.014] [Citation(s) in RCA: 170] [Impact Index Per Article: 18.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2015] [Accepted: 11/06/2015] [Indexed: 01/25/2023] Open
Abstract
Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000-13,000 years, the period at which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change (containing variation acquired from archaic hominins or adaptive variants in specific populations), improving our understanding of the relative biological importance of innate immunity pathways in natural conditions.
Affiliation(s)
- Matthieu Deschamps
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France; CNRS URA3012, 75015 Paris, France; Université Pierre et Marie Curie, Cellule Pasteur UPMC, 75015 Paris, France
| | - Guillaume Laval
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France; CNRS URA3012, 75015 Paris, France
| | - Maud Fagny
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France; CNRS URA3012, 75015 Paris, France; Université Pierre et Marie Curie, Cellule Pasteur UPMC, 75015 Paris, France
| | - Yuval Itan
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
| | - Laurent Abel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U.1163, 75015 Paris, France; Imagine Institute, Paris Descartes University, 75015 Paris, France
| | - Jean-Laurent Casanova
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U.1163, 75015 Paris, France; Imagine Institute, Paris Descartes University, 75015 Paris, France; Howard Hughes Medical Institute, New York, NY 10065, USA; Pediatric Hematology-Immunology Unit, Necker Hospital for Sick Children, 75015 Paris, France
| | - Etienne Patin
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France; CNRS URA3012, 75015 Paris, France
| | - Lluis Quintana-Murci
- Unit of Human Evolutionary Genetics, Institut Pasteur, 75015 Paris, France; CNRS URA3012, 75015 Paris, France.
| |
|
39
|
Pudlo P, Marin JM, Estoup A, Cornuet JM, Gautier M, Robert CP. Reliable ABC model choice via random forests. Bioinformatics 2015; 32:859-66. [PMID: 26589278 DOI: 10.1093/bioinformatics/btv684] [Citation(s) in RCA: 186] [Impact Index Per Article: 18.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2015] [Accepted: 09/30/2015] [Indexed: 01/25/2023]
Abstract
MOTIVATION: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. RESULTS: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. AVAILABILITY AND IMPLEMENTATION: The proposed methodology is implemented in the R package abcrf available on the CRAN. CONTACT: jean-michel.marin@umontpellier.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pierre Pudlo
- Université de Montpellier, IMAG, Montpellier, Institut de Biologie Computationnelle (IBC), Montpellier
| | - Jean-Michel Marin
- Université de Montpellier, IMAG, Montpellier, Institut de Biologie Computationnelle (IBC), Montpellier
| | - Arnaud Estoup
- Institut de Biologie Computationnelle (IBC), Montpellier, CBGP, INRA, Montpellier
| | | | - Mathieu Gautier
- Institut de Biologie Computationnelle (IBC), Montpellier, CBGP, INRA, Montpellier
| | - Christian P Robert
- Université Paris Dauphine, CEREMADE, Paris, France and University of Warwick, Coventry, UK
| |
|
40
|
Paz-Vinas I, Loot G, Stevens VM, Blanchet S. Evolutionary processes driving spatial patterns of intraspecific genetic diversity in river ecosystems. Mol Ecol 2015; 24:4586-604. [PMID: 26284462 DOI: 10.1111/mec.13345] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2014] [Revised: 07/30/2015] [Accepted: 08/13/2015] [Indexed: 01/17/2023]
Abstract
Describing, understanding and predicting the spatial distribution of genetic diversity is a central issue in biological sciences. In river landscapes, it is generally predicted that neutral genetic diversity should increase downstream, but there have been few attempts to test and validate this assumption across taxonomic groups. Moreover, it is still unclear what are the evolutionary processes that may generate this apparent spatial pattern of diversity. Here, we quantitatively synthesized published results from diverse taxa living in river ecosystems, and we performed a meta-analysis to show that a downstream increase in intraspecific genetic diversity (DIGD) actually constitutes a general spatial pattern of biodiversity that is repeatable across taxa. We further demonstrated that DIGD was stronger for strictly waterborne dispersing than for overland dispersing species. However, for a restricted data set focusing on fishes, there was no evidence that DIGD was related to particular species traits. We then searched for general processes underlying DIGD by simulating genetic data in dendritic-like river systems. Simulations revealed that the three processes we considered (downstream-biased dispersal, increase in habitat availability downstream and upstream-directed colonization) might generate DIGD. Using random forest models, we identified from simulations a set of highly informative summary statistics allowing discriminating among the processes causing DIGD. Finally, combining these discriminant statistics and approximate Bayesian computations on a set of twelve empirical case studies, we hypothesized that DIGD were most likely due to the interaction of two of these three processes and that contrary to expectation, they were not solely caused by downstream-biased dispersal.
Affiliation(s)
- I Paz-Vinas
- Centre National de la Recherche Scientifique (CNRS), École Nationale de Formation Agronomique (ENFA), UMR 5174 EDB (Laboratoire Évolution & Diversité Biologique), Université Paul Sabatier, 118 route de Narbonne, 31062, Toulouse Cedex 4, France; UPS, UMR 5174 (EDB), Université de Toulouse, 118 route de Narbonne, 31062, Toulouse Cedex 4, France; UMR 7263 - IMBE, Équipe EGE, Centre Saint-Charles, Aix-Marseille Université, CNRS, IRD, Université d'Avignon et des Pays de Vaucluse, Case 36, 3 place Victor Hugo, 13331, Marseille Cedex 3, France
| | - G Loot
- UPS, UMR 5174 (EDB), Université de Toulouse, 118 route de Narbonne, 31062, Toulouse Cedex 4, France; Station d'Écologie Expérimentale du CNRS à Moulis, USR 2936, Centre National de la Recherche Scientifique (CNRS), 2 route du CNRS, 09200, Moulis, France
| | - V M Stevens
- Station d'Écologie Expérimentale du CNRS à Moulis, USR 2936, Centre National de la Recherche Scientifique (CNRS), 2 route du CNRS, 09200, Moulis, France
| | - S Blanchet
- Centre National de la Recherche Scientifique (CNRS), École Nationale de Formation Agronomique (ENFA), UMR 5174 EDB (Laboratoire Évolution & Diversité Biologique), Université Paul Sabatier, 118 route de Narbonne, 31062, Toulouse Cedex 4, France; Station d'Écologie Expérimentale du CNRS à Moulis, USR 2936, Centre National de la Recherche Scientifique (CNRS), 2 route du CNRS, 09200, Moulis, France
| |
|
41
|
Benazzo A, Ghirotto S, Vilaça ST, Hoban S. Using ABC and microsatellite data to detect multiple introductions of invasive species from a single source. Heredity (Edinb) 2015; 115:262-72. [PMID: 25920671 DOI: 10.1038/hdy.2015.38] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2014] [Revised: 03/12/2015] [Accepted: 03/14/2015] [Indexed: 11/09/2022] Open
Abstract
The introduction of invasive species to new locations (that is, biological invasions) can have major impact on biodiversity, agriculture and public health. As such, determining the routes and modality of introductions with genetic data has become a fundamental goal in molecular ecology. To assist with this goal, new statistical methods and frameworks have been developed, such as approximate Bayesian computation (ABC) for inferring invasion history. Here, we present a model of invasion accounting for multiple introductions from a single source (MISS), a heretofore largely unexplored model. We simulate microsatellite data to evaluate the power of ABC to distinguish between single and multiple introductions from the same source, under a range of demographic parameters. We also apply ABC to microsatellite data from three invasions of bumblebee in New Zealand. In addition, we assess the performance of several methods of summary statistics selection. Our simulated results suggested good ability to distinguish between one- and two-wave models over much but not all of the parameter space tested, independent of summary statistics used. Globally, parameter estimation was good except for bottleneck timing. For one of the bumblebee species, we clearly rejected the MISS model, while for the other two we found inconclusive results. Since a second wave may provide genetic reinforcement to initial colonists, help relieve inbreeding among founders, or increase the hazard of the invasion, its detection may be crucial for managing invasions; we suggest that the MISS model could be considered as a potential model in future theoretical and empirical studies of invasions.
Affiliation(s)
- A Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - S Ghirotto
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - S T Vilaça
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - S Hoban
- [1] Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy; [2] National Institute for Mathematical and Biological Synthesis (NIMBioS), University of Tennessee, Knoxville, TN, USA
| |
|
42
|
Bekara MEA, Courcoul A, Bénet JJ, Durand B. Modeling tuberculosis dynamics, detection and control in cattle herds. PLoS One 2014; 9:e108584. [PMID: 25254369 PMCID: PMC4177924 DOI: 10.1371/journal.pone.0108584] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2014] [Accepted: 09/02/2014] [Indexed: 11/18/2022] Open
Abstract
Epidemiological models are key tools for designing and evaluating detection and control strategies against animal infectious diseases. In France, after decades of decrease of bovine tuberculosis (bTB) incidence, the disease keeps circulating. Increasing prevalence levels are observed in several areas, where the detection and control strategy could be adapted. The objective of this work was to design and calibrate a model of the within-herd transmission of bTB. The proposed model is a stochastic model operating in discrete-time. Three health states were distinguished: susceptible, latent and infected. Dairy and beef herd dynamics and bTB detection and control programs were explicitly represented. Approximate Bayesian computation was used to estimate three model parameters from field data: the transmission parameter when animals are inside (βinside) and outside (βoutside) buildings, and the duration of the latent phase. An independent dataset was used for model validation. The estimated median was 0.43 [0.16–0.84] per month for βinside and 0.08 [0.01–0.32] per month for βoutside. The median duration of the latent period was estimated at 3.5 [2–8] months. The sensitivity analysis showed only minor influences of fixed parameter values on these posterior estimates. Validation based on an independent dataset showed that in more than 80% of herds, the observed proportion of animals with detected lesions was between the 2.5% and 97.5% percentiles of the simulated distribution. In the absence of control program and once bTB has become enzootic within a herd, the median effective reproductive ratio was estimated to be 2.2 in beef herds and 1.7 in dairy herds. These low estimates are consistent with field observations of a low prevalence level in French bTB-infected herds.
Affiliation(s)
- Mohammed El Amine Bekara
- University Paris Est, Anses, Laboratory of Animal Health, Epidemiology Unit, Maisons-Alfort, France
| | - Aurélie Courcoul
- University Paris Est, Anses, Laboratory of Animal Health, Epidemiology Unit, Maisons-Alfort, France
| | - Jean-Jacques Bénet
- University Paris Est, National Veterinary School of Alfort (ENVA), EpiMAI Unit, Maisons-Alfort, France
| | - Benoit Durand
- University Paris Est, Anses, Laboratory of Animal Health, Epidemiology Unit, Maisons-Alfort, France
| |
|
43
|
Prangle D, Blum MGB, Popovic G, Sisson SA. Diagnostic tools for approximate Bayesian computation using the coverage property. AUST NZ J STAT 2014. [DOI: 10.1111/anzs.12087] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- D. Prangle
- Mathematics and Statistics Department; Lancaster University; Lancaster UK
| | - M. G. B. Blum
- Centre National de la Recherche Scientifique, Laboratoire TIMC-IMAG, UMR 5525; Université Joseph Fourier; Grenoble F-38041 France
| | - G. Popovic
- School of Mathematics and Statistics; University of New South Wales; Sydney Australia
| | - S. A. Sisson
- School of Mathematics and Statistics; University of New South Wales; Sydney Australia
| |
|
44
|
Robinson JD, Bunnefeld L, Hearn J, Stone GN, Hickerson MJ. ABC inference of multi-population divergence with admixture from unphased population genomic data. Mol Ecol 2014; 23:4458-71. [PMID: 25113024 PMCID: PMC4285295 DOI: 10.1111/mec.12881] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2013] [Revised: 08/04/2014] [Accepted: 08/06/2014] [Indexed: 01/13/2023]
Abstract
Rapidly developing sequencing technologies and declining costs have made it possible to collect genome-scale data from population-level samples in nonmodel systems. Inferential tools for historical demography given these data sets are, at present, underdeveloped. In particular, approximate Bayesian computation (ABC) has yet to be widely embraced by researchers generating these data. Here, we demonstrate the promise of ABC for analysis of the large data sets that are now attainable from nonmodel taxa through current genomic sequencing technologies. We develop and test an ABC framework for model selection and parameter estimation, given histories of three-population divergence with admixture. We then explore different sampling regimes to illustrate how sampling more loci, longer loci or more individuals affects the quality of model selection and parameter estimation in this ABC framework. Our results show that inferences improved substantially with increases in the number and/or length of sequenced loci, while less benefit was gained by sampling large numbers of individuals. Optimal sampling strategies given our inferential models included at least 2000 loci, each approximately 2 kb in length, sampled from five diploid individuals per population, although specific strategies are model and question dependent. We tested our ABC approach through simulation-based cross-validations and illustrate its application using previously analysed data from the oak gall wasp, Biorhiza pallida.
Affiliation(s)
- John D Robinson
- Department of Biology, City College of New York, 160 Convent Ave., MR 526, New York, NY, 10031, USA
| | | | | | | | | |
|
45
|
Matrix inversions for chromosomal inversions: a method to construct summary statistics in complex coalescent models. Theor Popul Biol 2014; 97:1-10. [PMID: 25091264 DOI: 10.1016/j.tpb.2014.07.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2013] [Revised: 07/09/2014] [Accepted: 07/22/2014] [Indexed: 12/27/2022]
Abstract
Chromosomal inversions allow genetic divergence of locally adapted populations by reducing recombination between chromosomes with different arrangements. While patterns of genetic variation within inverted regions are increasingly documented, inferential methods are largely missing to analyze such data. Previous work has provided expectations for coalescence patterns of neutral sites linked to an inversion polymorphism in two locally adapted populations. Here, we define a method to construct summary statistics in such complex population structure models. Under a scenario of selection on the inversion breakpoints, we first construct estimators of the migration rate between the two habitats, and of the recombination rate of a nucleotide site between the two inversion backgrounds. Next, we analyze the disequilibrium between two sites within an inversion and provide an estimator of the distinct recombination rate between these two sites in homokaryotypes and heterokaryotypes. These estimators should be suitable summary statistics for simulation-based methods that can handle the complex dependences in the data.
|
46
|
Abstract
Mathematical models have been central to ecology for nearly a century. Simple models of population dynamics have allowed us to understand fundamental aspects underlying the dynamics and stability of ecological systems. What has remained a challenge, however, is to meaningfully interpret experimental or observational data in light of mathematical models. Here, we review recent developments, notably in the growing field of approximate Bayesian computation (ABC), that allow us to calibrate mathematical models against available data. Estimating the population demographic parameters from data remains a formidable statistical challenge. Here, we attempt to give a flavor and overview of ABC and its applications in population biology and ecology and eschew a detailed technical discussion in favor of a general discussion of the advantages and potential pitfalls this framework offers to population biologists.
Affiliation(s)
- Michael P.H. Stumpf
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Sir Ernst Chain Building, Imperial College London, London, SW7 2AZ, UK
| |
|
47
|
Amount of information needed for model choice in Approximate Bayesian Computation. PLoS One 2014; 9:e99581. [PMID: 24959900 PMCID: PMC4069000 DOI: 10.1371/journal.pone.0099581] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2014] [Accepted: 05/16/2014] [Indexed: 11/19/2022] Open
Abstract
Approximate Bayesian Computation (ABC) has become a popular technique in evolutionary genetics for elucidating population structure and history due to its flexibility. The statistical inference framework has benefited from significant progress in recent years. In population genetics, however, its outcome depends heavily on the amount of information in the dataset, whether that be the level of genetic variation or the number of samples and loci. Here we look at the power to reject a simple constant population size coalescent model in favor of a bottleneck model in datasets of varying quality. Not only is this power dependent on the number of samples and loci, but it also depends strongly on the level of nucleotide diversity in the observed dataset. Whilst overall model choice in an ABC setting is fairly powerful and quite conservative with regard to false positives, detecting weaker bottlenecks is problematic in smaller or less genetically diverse datasets and limits the inferences possible in non-model organism where the amount of information regarding the two models is often limited. Our results show it is important to consider these limitations when performing an ABC analysis and that studies should perform simulations based on the size and nature of the dataset in order to fully assess the power of the study.
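The rejection-style model choice that this abstract evaluates can be illustrated with a minimal sketch. It does not reproduce the study's coalescent simulations: the two toy models `A` and `B` (normal data with standard deviation 1 or 3), the uniform prior on the mean, the summary statistics and the 1% acceptance fraction are all assumptions made for illustration only. The posterior probability of each model is approximated by its share among the simulations closest to the observed summaries.

```python
import math
import random
import statistics

# Hypothetical toy models (not the coalescent models of the study):
# model name -> standard deviation of the simulated data.
MODELS = {"A": 1.0, "B": 3.0}


def summarize(data):
    """Summary statistics used for the comparison: sample mean and sample sd."""
    return (statistics.fmean(data), statistics.stdev(data))


def simulate(model, n, rng):
    """Draw a nuisance mean from a uniform prior, then simulate n observations."""
    mu = rng.uniform(-5.0, 5.0)
    return [rng.gauss(mu, MODELS[model]) for _ in range(n)]


def abc_model_choice(observed, n_sims=4000, keep_frac=0.01, seed=1):
    """Approximate posterior model probabilities by rejection ABC."""
    rng = random.Random(seed)
    s_obs = summarize(observed)
    draws = []
    for _ in range(n_sims):
        model = rng.choice(sorted(MODELS))  # uniform prior over the models
        s_sim = summarize(simulate(model, len(observed), rng))
        draws.append((math.dist(s_obs, s_sim), model))
    draws.sort(key=lambda d: d[0])  # closest simulations first
    kept = draws[: max(1, int(keep_frac * n_sims))]
    return {m: sum(1 for _, k in kept if k == m) / len(kept) for m in MODELS}


# Pseudo-observed data generated under model "B" (an assumption of this demo).
obs_rng = random.Random(0)
observed = [obs_rng.gauss(1.0, 3.0) for _ in range(50)]
posterior = abc_model_choice(observed)
```

Repeating the last three lines over many pseudo-observed datasets of a given size, and counting how often the generating model receives the higher probability, gives a rough power estimate of the kind the abstract recommends computing before an analysis.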
|
48
|
Fountain T, Duvaux L, Horsburgh G, Reinhardt K, Butlin RK. Human-facilitated metapopulation dynamics in an emerging pest species, Cimex lectularius. Mol Ecol 2014; 23:1071-84. [PMID: 24446663 PMCID: PMC4016754 DOI: 10.1111/mec.12673] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 01/09/2013] [Revised: 01/06/2014] [Accepted: 01/13/2014] [Indexed: 12/01/2022]
Abstract
The number and demographic history of colonists can have dramatic consequences for the way in which genetic diversity is distributed and maintained in a metapopulation. The bed bug (Cimex lectularius) is a re-emerging pest species whose close association with humans has led to frequent local extinction and colonization, that is, to metapopulation dynamics. Pest control limits the lifespan of subpopulations, causing frequent local extinctions, and human-facilitated dispersal allows the colonization of empty patches. Founder events often result in drastic reductions in diversity and an increased influence of genetic drift. Coupled with restricted migration, this can lead to rapid population differentiation. We therefore predicted strong population structuring. Here, using 21 newly characterized microsatellite markers and approximate Bayesian computation (ABC), we investigate simplified versions of two classical models of metapopulation dynamics, in a coalescent framework, to estimate the number and genetic composition of founders in the common bed bug. We found very limited diversity within infestations but high degrees of structuring across the city of London, with extreme levels of genetic differentiation between infestations (FST = 0.59). ABC results suggest a common origin of all founders of a given subpopulation and that the numbers of colonists were low, implying that even a single mated female is enough to found a new infestation successfully. These patterns of colonization are close to the predictions of the propagule pool model, where all founders originate from the same parental infestation. These results show that aspects of metapopulation dynamics can be captured in simple models and provide insights that are valuable for the future targeted control of bed bug infestations.
Affiliation(s)
- Toby Fountain
- Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Department of Biosciences, University of Helsinki, PO Box 65 (Viikinkaari 1), FI-00014, Helsinki, Finland
|
49
|
Bodare S, Stocks M, Yang JC, Lascoux M. Origin and demographic history of the endemic Taiwan spruce (Picea morrisonicola). Ecol Evol 2013; 3:3320-33. [PMID: 24223271 PMCID: PMC3797480 DOI: 10.1002/ece3.698] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/03/2013] [Revised: 06/26/2013] [Accepted: 06/27/2013] [Indexed: 11/08/2022] Open
Abstract
Taiwan spruce (Picea morrisonicola) is a vulnerable conifer species endemic to the island of Taiwan. A warming climate and competition from subtropical tree species have limited the range of Taiwan spruce to the higher altitudes of the island. Using seeds sampled from an area in the central mountain range of Taiwan, 15 nuclear loci were sequenced in order to measure genetic variation and to assess the long-term genetic stability of the species. Genetic diversity is low and comparable to other spruce species with limited ranges such as Picea breweriana, Picea chihuahuana, and Picea schrenkiana. Importantly, analysis using approximate Bayesian computation (ABC) provides evidence for a drastic decline in the effective population size approximately 0.3–0.5 million years ago (mya). We used simulations to show that this is unlikely to be a false-positive result due to the limited sample used here. To investigate the phylogenetic origin of Taiwan spruce, additional sequencing was performed in the Chinese spruce Picea wilsonii and combined with previously published data for three other mainland China species, Picea purpurea, Picea likiangensis, and P. schrenkiana. Analysis of population structure revealed that P. morrisonicola clusters most closely with P. wilsonii, and coalescent analyses using the program MIMAR dated the split to 4–8 mya, coincident with the formation of Taiwan. Accounting for the population decrease that occurred after the split, however, led to a much more recent origin.
Affiliation(s)
- Sofia Bodare
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
50
|
Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M. Robust demographic inference from genomic and SNP data. PLoS Genet 2013; 9:e1003905. [PMID: 24204310 PMCID: PMC3812088 DOI: 10.1371/journal.pgen.1003905] [Citation(s) in RCA: 883] [Impact Index Per Article: 73.6] [Received: 02/28/2013] [Accepted: 09/11/2013] [Indexed: 01/09/2023] Open
Abstract
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
Affiliation(s)
- Laurent Excoffier
- CMPG, Institute of Ecology and Evolution, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Isabelle Dupanloup
- CMPG, Institute of Ecology and Evolution, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Emilia Huerta-Sánchez
- Center for Theoretical Evolutionary Genomics, Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Vitor C. Sousa
- CMPG, Institute of Ecology and Evolution, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Matthieu Foll
- CMPG, Institute of Ecology and Evolution, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|