1
Fonseca EM, Carstens BC. Artificial intelligence enables unified analysis of historical and landscape influences on genetic diversity. Mol Phylogenet Evol 2024; 198:108116. [PMID: 38871263 DOI: 10.1016/j.ympev.2024.108116] [Received: 07/05/2023] [Revised: 04/04/2024] [Accepted: 06/04/2024] [Indexed: 06/15/2024]
Abstract
While genetic variation in any species is potentially shaped by a range of processes, phylogeography and landscape genetics are largely concerned with inferring how environmental conditions and landscape features impact neutral intraspecific diversity. However, even as both disciplines have come to utilize SNP data over the last decades, analytical approaches have remained for the most part focused on either broad-scale inferences of historical processes (phylogeography) or on more localized inferences about environmental and/or landscape features (landscape genetics). Here we demonstrate that an artificial intelligence model-based analytical framework can consider both deeper historical factors and landscape-level processes in an integrated analysis. We implement this framework using data collected from two Brazilian anurans, the Brazilian sibilator frog (Leptodactylus troglodytes) and granular toad (Rhinella granulosa). Our results indicate that historical demographic processes shape most of the genetic variation in the sibilator frog, while landscape processes primarily influence variation in the granular toad. The machine learning framework used here allows both historical and landscape processes to be considered equally, rather than requiring researchers to make an a priori decision about which factors are important.
Affiliation(s)
- Emanuel M Fonseca
- Museum of Biological Diversity & Department of Evolution, Ecology and Organismal Biology, The Ohio State University, 1315 Kinnear Rd., Columbus OH 43212, USA
- Bryan C Carstens
- Museum of Biological Diversity & Department of Evolution, Ecology and Organismal Biology, The Ohio State University, 1315 Kinnear Rd., Columbus OH 43212, USA.
2
Rincón-Barrado M, Villaverde T, Perez MF, Sanmartín I, Riina R. The sweet tabaiba or there and back again: phylogeographical history of the Macaronesian Euphorbia balsamifera. Annals of Botany 2024; 133:883-904. [PMID: 38197716 PMCID: PMC11082519 DOI: 10.1093/aob/mcae001] [Received: 10/09/2023] [Accepted: 03/01/2024] [Indexed: 01/11/2024]
Abstract
BACKGROUND AND AIMS Biogeographical relationships between the Canary Islands and north-west Africa are often explained by oceanic dispersal and geographical proximity. Sister-group relationships between Canarian and eastern African/Arabian taxa, the 'Rand Flora' pattern, are rare among plants and have been attributed to the extinction of north-western African populations. Euphorbia balsamifera is the only representative species of this pattern that is distributed in the Canary Islands and north-west Africa; it is also one of few species present in all seven islands. Previous studies placed African populations of E. balsamifera as sister to the Canarian populations, but this relationship was based on herbarium samples with highly degraded DNA. Here, we test the extinction hypothesis by sampling new continental populations; we also expand the Canarian sampling to examine the dynamics of island colonization and diversification. METHODS Using target enrichment with genome skimming, we reconstructed phylogenetic relationships within E. balsamifera and between this species and its disjunct relatives. A single nucleotide polymorphism dataset obtained from the target sequences was used to infer population genetic diversity patterns. We used convolutional neural networks to discriminate among alternative Canary Islands colonization scenarios. KEY RESULTS The results confirmed the Rand Flora sister-group relationship between western E. balsamifera and Euphorbia adenensis in the Eritreo-Arabian region and recovered an eastern-western geographical structure among E. balsamifera Canarian populations. Convolutional neural networks supported a scenario of east-to-west island colonization, followed by population extinctions in Lanzarote and Fuerteventura and recolonization from Tenerife and Gran Canaria; a signal of admixture between the eastern island and north-west African populations was recovered. 
CONCLUSIONS Our findings support the Surfing Syngameon Hypothesis for the colonization of the Canary Islands by E. balsamifera, but also a recent back-colonization to the continent. Populations of E. balsamifera from northwest Africa are not the remnants of an ancestral continental stock, but originated from migration events from Lanzarote and Fuerteventura. This is further evidence that oceanic archipelagos are not a sink for biodiversity, but may be a source of new genetic variability.
Affiliation(s)
- Mario Rincón-Barrado
- Real Jardín Botánico (RJB), CSIC, Madrid, 28014, Spain
- Centro Nacional de Biotecnología (CNB), CSIC, Madrid, 28049, Spain
- Tamara Villaverde
- Universidad Rey Juan Carlos (URJC), Área de Biodiversidad y Conservación, Móstoles, 28933, Spain
- Manolo F Perez
- Institut de Systématique, Evolution, Biodiversité (ISYEB – UMR 7205 CNRS), Muséum National d’Histoire Naturelle, SU, EPHE & UA, Paris, France
- Ricarda Riina
- Real Jardín Botánico (RJB), CSIC, Madrid, 28014, Spain
3
Cervantes CR, Montes JR, Rosas U, Arias S. Phylogenetic discordance and integrative species delimitation in the Mammillaria haageana species complex (Cactaceae). Mol Phylogenet Evol 2023; 187:107891. [PMID: 37517507 DOI: 10.1016/j.ympev.2023.107891] [Received: 04/24/2023] [Revised: 06/15/2023] [Accepted: 07/26/2023] [Indexed: 08/01/2023]
Abstract
Species complexes consist of very close phylogenetic relatives, where morphological similarities make it difficult to distinguish between them using traditional taxonomic methods. Here, we focused on the long-standing challenge of species delimitation in the Mammillaria haageana complex, a group that presents great morphological diversity that makes its taxonomy a puzzle. Our work integrates genomic, morphological, and ecological data to establish the taxonomic limits in the M. haageana complex, and we also studied the evolutionary relationships with the remainder of the M. ser. Supertextae species. Our genetic analyses, as well as morphological and ecological evidence, led us to propose that the M. haageana complex is made up of six distinct entities (M. acultzingensis, M. conspicua, M. haageana, M. lanigera, M. meissneri, and M. san-angelensis), mainly as a result of ecological speciation. A recent taxonomic proposal considered these taxa as a single species; therefore, we propose their recognition at the species level. Our results also show a high level of incomplete lineage sorting rather than reticulation, which is especially likely in recently diverged species such as those comprising M. ser. Supertextae. The species hypotheses proposed here may be useful in future extinction risk assessments and conservation strategies.
Affiliation(s)
- Cristian R Cervantes
- Unidad de Síntesis en Sistemática y Evolución, Instituto de Biología, Circuito Exterior s.n., Ciudad Universitaria, Ciudad de México 04510, México; Posgrado en Ciencias Biológicas, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, Ciudad de México 04510, México.
- José-Rubén Montes
- Posgrado en Ciencias Biológicas, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, Ciudad de México 04510, México
- Ulises Rosas
- Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Tercer Circuito Exterior, Ciudad Universitaria, Coyoacán, Ciudad de México 04510, México
- Salvador Arias
- Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Tercer Circuito Exterior, Ciudad Universitaria, Coyoacán, Ciudad de México 04510, México
4
Romeiro-Brito M, Khan G, Perez MF, Zappi DC, Taylor NP, Olsthoorn G, Franco FF, Moraes EM. Revisiting phylogeny, systematics, and biogeography of a Pleistocene radiation. American Journal of Botany 2023; 110:1-17. [PMID: 36708517 DOI: 10.1002/ajb2.16134] [Received: 08/14/2022] [Revised: 01/03/2023] [Accepted: 01/05/2023] [Indexed: 05/11/2023]
Abstract
PREMISE Pilosocereus (Cactaceae) is an important dry forest element in all subregions and transitional zones of the neotropics, with the highest diversity in eastern Brazil. The genus is subdivided into informal taxonomic groups; however, most of these are not supported by recent molecular phylogenetic inferences. This lack of confidence is probably due to the use of an insufficient number of loci and the complexity of cactus diversification. Here, we explored the species relationships in Pilosocereus in more detail, integrating multilocus phylogenetic approaches with the assessment of the ancestral range and the effect of geography on diversification shifts. METHODS We used 28 nuclear, plastid, and mitochondrial loci from 54 plant samples of 31 Pilosocereus species for phylogenetic analyses. We used concatenated and coalescent phylogenetic trees and Bayesian models to estimate the most likely ancestral range and diversification shifts. RESULTS All Pilosocereus species were clustered in the same branch, except P. bohlei. The phylogenetic relationships were more associated with the geographic distribution than taxonomic affinities among taxa. The genus began diversifying during the Plio-Pleistocene transition in the Caatinga domain and experienced an increased diversification rate during the Calabrian age. CONCLUSIONS We recovered a well-supported multispecies coalescent phylogeny. Our results refine the pattern of rapid diversification of Pilosocereus species across neotropical drylands during the Pleistocene and highlight the need for taxonomic rearrangements in the genus. We recovered a pulse of diversification during the Pleistocene that was likely driven by multiple dispersal and vicariance events within and among the Caatinga, Cerrado, and Atlantic Forest domains.
Affiliation(s)
- Monique Romeiro-Brito
- Departamento de Biologia, Universidade Federal de São Carlos (UFSCar), Sorocaba, SP, 18052-780, Brazil
- Gulzar Khan
- Institute for Biology and Environmental Sciences, Carl von Ossietzky-University Oldenburg, Carl von Ossietzky-Str. 9-11, 26111, Oldenburg, Germany
- Manolo F Perez
- Departamento de Genética e Evolução, Universidade Federal de São Carlos (UFSCar), São Carlos, SP, 13565-905, Brazil
- Daniela C Zappi
- Programa de Pós-Graduação em Botânica, Instituto de Ciências Biológicas, Universidade de Brasília (UNB), PO Box 04457, Brasília, DF, 70910-970, Brazil
- Nigel P Taylor
- University of Gibraltar, Gibraltar Botanic Gardens Campus, The Alameda, PO Box 843, GX11 1AA, Gibraltar
- Fernando F Franco
- Departamento de Biologia, Universidade Federal de São Carlos (UFSCar), Sorocaba, SP, 18052-780, Brazil
- Evandro M Moraes
- Departamento de Biologia, Universidade Federal de São Carlos (UFSCar), Sorocaba, SP, 18052-780, Brazil
5
Korfmann K, Gaggiotti OE, Fumagalli M. Deep Learning in Population Genetics. Genome Biol Evol 2023; 15:6997869. [PMID: 36683406 PMCID: PMC9897193 DOI: 10.1093/gbe/evad008] [Received: 10/13/2022] [Revised: 12/19/2022] [Accepted: 01/16/2023] [Indexed: 01/24/2023]
Abstract
Population genetics is transitioning into a data-driven discipline thanks to the availability of large-scale genomic data and the need to study increasingly complex evolutionary scenarios. With likelihood and Bayesian approaches becoming either intractable or computationally unfeasible, machine learning, and in particular deep learning, algorithms are emerging as popular techniques for population genetic inferences. These approaches rely on algorithms that learn non-linear relationships between the input data and the model parameters being estimated through representation learning from training data sets. Deep learning algorithms currently employed in the field comprise discriminative and generative models with fully connected, convolutional, or recurrent layers. Additionally, a wide range of powerful simulators to generate training data under complex scenarios are now available. The application of deep learning to empirical data sets mostly replicates previous findings of demography reconstruction and signals of natural selection in model organisms. To showcase the feasibility of deep learning to tackle new challenges, we designed a branched architecture to detect signals of recent balancing selection from temporal haplotypic data, which exhibited good predictive performance on simulated data. Investigations on the interpretability of neural networks, their robustness to uncertain training data, and creative representation of population genetic data, will provide further opportunities for technological advancements in the field.
Affiliation(s)
- Kevin Korfmann
- Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, Germany
- Oscar E Gaggiotti
- Centre for Biological Diversity, Sir Harold Mitchell Building, University of St Andrews, Fife KY16 9TF, UK
6
Lu-Irving P, Bragg JG, Rossetto M, King K, O’Brien M, van der Merwe MM. Capturing Genetic Diversity in Seed Collections: An Empirical Study of Two Congeners with Contrasting Mating Systems. Plants (Basel, Switzerland) 2023; 12:522. [PMID: 36771606 PMCID: PMC9921034 DOI: 10.3390/plants12030522] [Received: 11/10/2022] [Revised: 01/10/2023] [Accepted: 01/13/2023] [Indexed: 06/18/2023]
Abstract
Plant mating systems shape patterns of genetic diversity and impact the long-term success of populations. As such, they are relevant to the design of seed collections aiming to maximise genetic diversity (e.g., germplasm conservation, ecological restoration). However, for most species, little is known empirically about how variation in mating systems and genetic diversity is distributed. We investigated the relationship between genetic diversity and mating systems in two functionally similar, co-occurring species of Hakea (Proteaceae), and evaluated the extent to which genetic diversity was captured in seeds. We genotyped hundreds of seedlings and mother plants via DArTseq, and developed novel implementations of two approaches to inferring the mating system from SNP data. A striking contrast in patterns of genetic diversity between H. sericea and H. teretifolia was revealed, consistent with a contrast in their mating systems. While both species had mixed mating systems, H. sericea was found to be habitually selfing, while H. teretifolia more evenly employed both selfing and outcrossing. In both species, seed collection schemes maximised genetic diversity by increasing the number of maternal lines and sites sampled, but twice as many sites were needed for the selfing species to capture equivalent levels of genetic variation at a regional scale.
Affiliation(s)
- Patricia Lu-Irving
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
- Jason G. Bragg
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
- Maurizio Rossetto
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
- Kit King
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
- Mitchell O’Brien
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
- Commonwealth Scientific and Industrial Research Organisation (CSIRO), Innovation Quarter Westmead, Level 3, East Tower, 158-164 Hawkesbury Rd., Westmead, NSW 2145, Australia
- Marlien M. van der Merwe
- Research Centre for Ecosystem Resilience, Australian Institute of Botanical Science, Royal Botanic Gardens Sydney, Mrs Macquaries Rd., Sydney, NSW 2000, Australia
7
Sanchez T, Bray EM, Jobic P, Guez J, Letournel AC, Charpiat G, Cury J, Jay F. dnadna: a deep learning framework for population genetics inference. Bioinformatics 2022; 39:6851140. [PMID: 36445000 PMCID: PMC9825738 DOI: 10.1093/bioinformatics/btac765] [Received: 11/17/2021] [Revised: 10/30/2022] [Accepted: 11/28/2022] [Indexed: 11/30/2022]
Abstract
MOTIVATION We present dnadna, a flexible python-based software for deep learning inference in population genetics. It is task-agnostic and aims at facilitating the development, reproducibility, dissemination and re-usability of neural networks designed for population genetic data. RESULTS dnadna defines multiple user-friendly workflows. First, users can implement new architectures and tasks, while benefiting from dnadna utility functions, training procedure and test environment, which saves time and decreases the likelihood of bugs. Second, the implemented networks can be re-optimized based on user-specified training sets and/or tasks. Newly implemented architectures and pre-trained networks are easily shareable with the community for further benchmarking or other applications. Finally, users can apply pre-trained networks in order to predict evolutionary history from alternative real or simulated genetic datasets, without requiring extensive knowledge in deep learning or coding in general. dnadna comes with a peer-reviewed, exchangeable neural network, allowing demographic inference from SNP data, that can be used directly or retrained to solve other tasks. Toy networks are also available to ease the exploration of the software, and we expect that the range of available architectures will keep expanding thanks to community contributions. AVAILABILITY AND IMPLEMENTATION dnadna is a Python (≥3.7) package, its repository is available at gitlab.com/mlgenetics/dnadna and its associated documentation at mlgenetics.gitlab.io/dnadna/.
Affiliation(s)
- Pierre Jobic
- Université Paris-Saclay, CNRS UMR 9015, INRIA, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400 Orsay, France
- ENS Paris-Saclay, 91190 Gif-sur-Yvette, France
- Jérémy Guez
- Université Paris-Saclay, CNRS UMR 9015, INRIA, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400 Orsay, France
- UMR7206 Eco-Anthropologie, Muséum National d’Histoire Naturelle, CNRS, Université de Paris, 75016 Paris, France
- Anne-Catherine Letournel
- Université Paris-Saclay, CNRS UMR 9015, INRIA, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400 Orsay, France
- Guillaume Charpiat
- Université Paris-Saclay, CNRS UMR 9015, INRIA, Laboratoire Interdisciplinaire des Sciences du Numérique, 91400 Orsay, France
- Jean Cury
- To whom correspondence should be addressed.
- Flora Jay
- To whom correspondence should be addressed.
8
Kirschner P, Perez MF, Záveská E, Sanmartín I, Marquer L, Schlick-Steiner BC, Alvarez N, Steiner FM, Schönswetter P. Congruent evolutionary responses of European steppe biota to late Quaternary climate change. Nat Commun 2022; 13:1921. [PMID: 35396388 PMCID: PMC8993823 DOI: 10.1038/s41467-022-29267-8] [Received: 05/17/2021] [Accepted: 03/08/2022] [Indexed: 11/09/2022]
Abstract
Quaternary climatic oscillations had a large impact on European biogeography. Alternation of cold and warm stages caused recurrent glaciations, massive vegetation shifts, and large-scale range alterations in many species. The Eurasian steppe biome and its grasslands are a noteworthy example; they underwent climate-driven, large-scale contractions during warm stages and expansions during cold stages. Here, we evaluate the impact of these range alterations on the late Quaternary demography of several phylogenetically distant plant and insect species, typical of the Eurasian steppes. We compare three explicit demographic hypotheses by applying an approach combining convolutional neural networks with approximate Bayesian computation. We identified congruent demographic responses of cold stage expansion and warm stage contraction across all species, but also species-specific effects. The demographic history of the Eurasian steppe biota reflects major paleoecological turning points in the late Quaternary and emphasizes the role of climate as a driving force underlying patterns of genetic variance on the biome level.
Affiliation(s)
- Philipp Kirschner
- Department of Botany, University of Innsbruck, Sternwartestraße 15, 6020, Innsbruck, Austria; Department of Ecology, University of Innsbruck, Technikerstraße 25, 6020, Innsbruck, Austria.
- Manolo F Perez
- Real Jardín Botánico, CSIC, Plaza de Murillo 2, 28014, Madrid, Spain; Departamento de Genética e Evolução, Universidade Federal de São Carlos, Rodovia Washington Luís, km 235, 13565905, São Carlos, Brazil
- Eliška Záveská
- Department of Botany, University of Innsbruck, Sternwartestraße 15, 6020, Innsbruck, Austria; Institute of Botany of the Czech Academy of Sciences, Zámek 1, 25243, Průhonice, Czech Republic
- Isabel Sanmartín
- Real Jardín Botánico, CSIC, Plaza de Murillo 2, 28014, Madrid, Spain
- Laurent Marquer
- Department of Botany, University of Innsbruck, Sternwartestraße 15, 6020, Innsbruck, Austria
- Nadir Alvarez
- Geneva Natural History Museum of Geneva, Route de Malagnou 1, 1208, Genève, Switzerland; Department of Genetics and Evolution, University of Geneva, Boulevard D'Yvoy 4, 1205, Genève, Switzerland
- Florian M Steiner
- Department of Ecology, University of Innsbruck, Technikerstraße 25, 6020, Innsbruck, Austria
- Peter Schönswetter
- Department of Botany, University of Innsbruck, Sternwartestraße 15, 6020, Innsbruck, Austria.
9
Evolutionary Genetics of Cacti: Research Biases, Advances and Prospects. Genes (Basel) 2022; 13:genes13030452. [PMID: 35328006 PMCID: PMC8952820 DOI: 10.3390/genes13030452] [Received: 01/25/2022] [Revised: 02/22/2022] [Accepted: 02/25/2022] [Indexed: 02/01/2023]
Abstract
Here, we present a review of the studies of evolutionary genetics (phylogenetics, population genetics, and phylogeography) using genetic data as well as genome scale assemblies in Cactaceae (Caryophyllales, Angiosperms), a major lineage of succulent plants with astonishing diversity on the American continent. To this end, we performed a literature survey (1992–2021) to obtain detailed information regarding key aspects of studies investigating cactus evolution. Specifically, we summarize the advances in the following aspects: molecular markers, species delimitation, phylogenetics, hybridization, biogeography, and genome assemblies. In brief, we observed substantial growth in the studies conducted with molecular markers in the past two decades. However, we found biases in taxonomic/geographic sampling and the use of traditional markers and statistical approaches. We discuss some methodological and social challenges for engaging the cactus community in genomic research. We also stress the importance of integrative approaches, coalescent methods, and international collaboration to advance the understanding of cactus evolution.