1
Keogh SM, Johnson NA, Smith CH, Sietman BE, Garner JT, Randklev CR, Simons AM. Secondary contact erodes Pleistocene diversification in a wide-ranging freshwater mussel (Quadrula). Mol Ecol 2025; 34:e17572. [PMID: 39543938 DOI: 10.1111/mec.17572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 10/08/2024] [Accepted: 10/16/2024] [Indexed: 11/17/2024]
Abstract
The isolated river drainages of eastern North America serve as a natural laboratory to investigate the roles of allopatry and secondary contact in the evolutionary trajectories of recently diverged lineages. Drainage divides facilitate allopatric speciation, but due to their sensitivity to climatic and geomorphological changes, neighboring rivers frequently coalesce, creating recurrent opportunities of isolation and contact throughout the history of aquatic lineages. The freshwater mussel Quadrula quadrula is widely distributed across isolated rivers of eastern North America and possesses high phenotypic and molecular variation across its range. We integrate sequence data from three genomes, including female- and male-inherited mitochondrial markers and thousands of nuclear encoded SNPs with morphology and geography to illuminate the group's divergence history. Across contemporary isolated rivers, we found continuums of molecular and morphological variation, following a pattern of isolation by distance. In contact zones, hybridization was frequent with no apparent fitness consequences, as advanced hybrids were common. Accordingly, we recognize Q. quadrula as a single cohesive species with subspecific variation (Q. quadrula rumphiana). Demographic modeling and divergence dating supported a divergence history characterized by allopatric vicariance followed by secondary contact, likely driven by river rearrangements and Pleistocene glacial cycles. Despite clinal range-wide variation and hybridization in contact zones, the process-based species delimitation tool delimitR, which considers demographic scenarios like secondary contact, supported the delimitation of the maximum number of species tested. As such, when interpreting species delimitation results, we suggest careful consideration of spatial sampling and subsequent geographic patterns of biological variation, particularly for wide-ranging taxa.
Affiliation(s)
- Sean M Keogh
  - Gantz Family Collections Center, Field Museum of Natural History, Chicago, Illinois, USA
  - Bell Museum of Natural History, University of Minnesota, St. Paul, Minnesota, USA
- Nathan A Johnson
  - U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, Florida, USA
- Chase H Smith
  - Department of Integrative Biology, University of Texas, Austin, Texas, USA
- Bernard E Sietman
  - Minnesota Department of Natural Resources, Center for Aquatic Mollusk Programs, Lake City, Minnesota, USA
- Jeffrey T Garner
  - Alabama Division of Wildlife and Freshwater Fisheries, Florence, Alabama, USA
- Charles R Randklev
  - Texas A&M Natural Resources Institute, AgriLife Research Center, Dallas, Texas, USA
- Andrew M Simons
  - Bell Museum of Natural History, University of Minnesota, St. Paul, Minnesota, USA
  - Department of Fisheries, Wildlife, and Conservation Biology, University of Minnesota, St. Paul, Minnesota, USA
2
Soares LS, Bombarely A, Freitas LB. How many species are there? Lineage diversification and hidden speciation in Solanaceae from highland grasslands in southern South America. Annals of Botany 2024; 134:1291-1305. [PMID: 39196773 PMCID: PMC11688538 DOI: 10.1093/aob/mcae144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Accepted: 08/26/2024] [Indexed: 08/30/2024]
Abstract
BACKGROUND AND AIMS Species delimitation can be challenging when analysing recently diverged species, especially those taxonomically synonymized owing to morphological similarities. We aimed to untangle the relationships between two grassland species, Petunia guarapuavensis and Petunia scheideana, exploring the dynamics of fast divergence and addressing their species delimitation. METHODS We used a low-coverage genome sequencing and population genomic approach to distinguish species and populations between P. guarapuavensis and P. scheideana. Our analysis focused on detecting structuration, hybridization/introgression and phylogenetic patterns. We used demographic models to support species delimitation while exploring potential phylogeographical barriers influencing gene flow. KEY RESULTS Our findings indicated differentiation between the two species and revealed another lineage, which was phylogenetically distinct from the others and had no evidence of gene flow with them. The presence of a river acted as a phylogeographical barrier, limiting gene flow and allowing for structuration between closely related lineages. The optimal species delimitation scenario involved secondary contact between well-established lineages. CONCLUSIONS The rapid divergence observed in these Petunia species explains the lack of significant morphological differences, because floral diagnostic traits in species sharing pollinators tend to evolve more slowly. This study highlights the complexity of species delimitation in recently diverged groups and emphasizes the importance of genomic approaches in understanding evolutionary relationships and speciation dynamics.
Affiliation(s)
- Luana S Soares
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Aureliano Bombarely
  - Instituto de Biologia Molecular y Celular de Plantas (IBMCP) (CSIC-UPV), Valencia, Spain
- Loreta B Freitas
  - Department of Genetics, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
3
Titus BM, Gibbs HL, Simões N, Daly M. Topology Testing and Demographic Modeling Illuminate a Novel Speciation Pathway in the Greater Caribbean Sea Following the Formation of the Isthmus of Panama. Syst Biol 2024; 73:758-768. [PMID: 39041315 DOI: 10.1093/sysbio/syae045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 05/03/2024] [Accepted: 07/19/2024] [Indexed: 07/24/2024]
Abstract
Recent genomic analyses have highlighted the prevalence of speciation with gene flow in many taxa and have underscored the importance of accounting for these reticulate evolutionary processes when constructing species trees and generating parameter estimates. This is especially important for deepening our understanding of speciation in the sea where fast-moving ocean currents, expanses of deep water, and periodic episodes of sea level rise and fall act as soft and temporary allopatric barriers that facilitate both divergence and secondary contact. Under these conditions, gene flow is not expected to cease completely while contemporary distributions are expected to differ from historical ones. Here, we conduct range-wide sampling for Pederson's cleaner shrimp (Ancylomenes pedersoni), a species complex from the Greater Caribbean that contains three clearly delimited mitochondrial lineages with both allopatric and sympatric distributions. Using mtDNA barcodes and a genomic ddRADseq approach, we combine classic phylogenetic analyses with extensive topology testing and demographic modeling (10 site frequency replicates × 45 evolutionary models × 50 model simulations/replicate = 22,500 simulations) to test species boundaries and reconstruct the evolutionary history of what was expected to be a simple case study. Instead, our results indicate a history of allopatric divergence, secondary contact, introgression, and endemic hybrid speciation that we hypothesize was driven by the final closure of the Isthmus of Panama and the strengthening of the Gulf Stream Current ~3.5 Ma. The history of this species complex recovered by model-based methods that allow reticulation differs from that recovered by standard phylogenetic analyses and is unexpected given contemporary distributions. 
The geologically and biologically meaningful insights gained by our model selection analyses illuminate what is likely a novel pathway of species formation not previously documented that resulted from one of the most biogeographically significant events in Earth's history.
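The simulation count quoted in this abstract can be checked with a few lines of arithmetic. The sketch below only multiplies the three figures given in the text; the variable names are illustrative and are not taken from the authors' pipeline.

```python
# Figures quoted in the abstract; the variable names are hypothetical.
sfs_replicates = 10          # site frequency spectrum replicates
evolutionary_models = 45     # candidate demographic models
sims_per_replicate = 50      # model simulations per replicate

# Total number of simulations across all models and replicates.
total_simulations = sfs_replicates * evolutionary_models * sims_per_replicate

# Simulations contributed by each individual model.
sims_per_model = sfs_replicates * sims_per_replicate

print(total_simulations)  # 22500
print(sims_per_model)     # 500
```

Each of the 45 models therefore accounts for 500 of the 22,500 simulations.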
Affiliation(s)
- Benjamin M Titus
  - Department of Biological Sciences, University of Alabama, 1325 Science and Engineering Complex, Tuscaloosa, AL 35487, USA
  - Dauphin Island Sea Lab, 101 Bienville Blvd, Dauphin Island, AL 36528, USA
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 1315 Kinnear Rd, Columbus, OH 43212, USA
- H Lisle Gibbs
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 1315 Kinnear Rd, Columbus, OH 43212, USA
- Nuno Simões
  - Facultad de Ciencias, Universidad Nacional Autonoma de Mexico-Sisal, Puerto de abrigo s/n, Sisal, CP 97356 Yucatán, Mexico
  - International Chair for Coastal and Marine Studies in Mexico, Harte Research Institute for Gulf of Mexico Studies, Texas A&M University, 6300 Ocean Dr, Corpus Christi, TX 78412, USA
  - Laboratorio Nacional de Resiliencia Costera (LANRESC, CONACYT), 97356 Sisal, Yucatán, Mexico
- Marymegan Daly
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 1315 Kinnear Rd, Columbus, OH 43212, USA
4
Fonseca EM, Pope NS, Peterman WE, Werneck FP, Colli GR, Carstens BC. Genetic structure and landscape effects on gene flow in the Neotropical lizard Norops brasiliensis (Squamata: Dactyloidae). Heredity (Edinb) 2024; 132:284-295. [PMID: 38575800 PMCID: PMC11166928 DOI: 10.1038/s41437-024-00682-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 03/12/2024] [Accepted: 03/18/2024] [Indexed: 04/06/2024]
Abstract
One key research goal of evolutionary biology is to understand the origin and maintenance of genetic variation. In the Cerrado, the South American savanna located primarily in the Central Brazilian Plateau, many hypotheses have been proposed to explain how landscape features (e.g., geographic distance, river barriers, topographic compartmentalization, and historical climatic fluctuations) have promoted genetic structure by mediating gene flow. Here, we asked whether these landscape features have influenced the genetic structure and differentiation in the lizard species Norops brasiliensis (Squamata: Dactyloidae). To achieve our goal, we used a genetic clustering analysis and estimated an effective migration surface to assess genetic structure in the focal species. Optimized isolation-by-resistance models and a simulation-based approach combined with machine learning (convolutional neural network; CNN) were then used to infer current and historical effects on population genetic structure through 12 unique landscape models. We recovered five geographically distributed populations that are separated by regions of lower-than-expected gene flow. The results of the CNN showed that geographic distance is the sole predictor of genetic variation in N. brasiliensis, and that slope, rivers, and historical climate had no discernible influence on gene flow. Our novel CNN approach was accurate (89.5%) in differentiating each landscape model. CNN and other machine learning approaches are still largely unexplored in landscape genetics studies, representing promising avenues for future research with increasingly accessible genomic datasets.
Affiliation(s)
- Emanuel M Fonseca
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Nathaniel S Pope
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR, 97403, USA
- William E Peterman
  - School of Environment and Natural Resources, The Ohio State University, Columbus, OH, USA
- Fernanda P Werneck
  - Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisas da Amazônia (INPA), Manaus, Brazil
- Guarino R Colli
  - Departamento de Zoologia, Universidade de Brasília, Brasília, Brazil
- Bryan C Carstens
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
5
Tran LN, Sun CK, Struck TJ, Sajan M, Gutenkunst RN. Computationally Efficient Demographic History Inference from Allele Frequencies with Supervised Machine Learning. Mol Biol Evol 2024; 41:msae077. [PMID: 38636507 PMCID: PMC11082913 DOI: 10.1093/molbev/msae077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 04/08/2024] [Accepted: 04/12/2024] [Indexed: 04/20/2024]
Abstract
Inferring past demographic history of natural populations from genomic data is of central concern in many studies across research fields. Previously, our group had developed dadi, a widely used demographic history inference method based on the allele frequency spectrum (AFS) and maximum composite-likelihood optimization. However, dadi's optimization procedure can be computationally expensive. Here, we present donni (demography optimization via neural network inference), a new inference method based on dadi that is more efficient while maintaining comparable inference accuracy. For each dadi-supported demographic model, donni simulates the expected AFS for a range of model parameters then trains a set of Mean Variance Estimation neural networks using the simulated AFS. Trained networks can then be used to instantaneously infer the model parameters from future genomic data summarized by an AFS. We demonstrate that for many demographic models, donni can infer some parameters, such as population size changes, very well and other parameters, such as migration rates and times of demographic events, fairly well. Importantly, donni provides both parameter and confidence interval estimates from input AFS with accuracy comparable to parameters inferred by dadi's likelihood optimization while bypassing its long and computationally intensive evaluation process. donni's performance demonstrates that supervised machine learning algorithms may be a promising avenue for developing more sustainable and computationally efficient demographic history inference methods.
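As a rough illustration of what a Mean Variance Estimation network optimizes, the sketch below writes out the Gaussian negative log-likelihood commonly used for such models, where the network predicts both a mean and a variance for each parameter. This is an assumption about the general technique, not donni's actual code; the function names `gaussian_nll` and `normal_interval` are hypothetical.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of `target` under N(mean, exp(log_var)).

    Predicting the log-variance keeps the variance positive without
    constraining the network output.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (target - mean) ** 2 / var)


def normal_interval(mean, log_var, z=1.96):
    """Approximate 95% interval implied by a predicted mean and log-variance."""
    sd = math.exp(0.5 * log_var)
    return (mean - z * sd, mean + z * sd)


# A prediction that matches the target with unit variance has the minimum
# loss for that variance, 0.5 * log(2 * pi).
print(gaussian_nll(target=1.0, mean=1.0, log_var=0.0))

# The same predicted variance yields an interval around the predicted mean.
print(normal_interval(mean=0.0, log_var=0.0))
```

Because the loss penalizes both a poor mean and a poorly calibrated variance, a trained network of this kind can return a point estimate and an interval in a single forward pass, which is consistent with the abstract's description of parameter and confidence interval estimates.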
Affiliation(s)
- Linh N Tran
  - Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ 85721, USA
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
- Connie K Sun
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
- Travis J Struck
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
- Mathews Sajan
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
- Ryan N Gutenkunst
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
6
Tian Y, Yang X, Chen N, Li C, Yang W. Data-driven interpretable analysis for polysaccharide yield prediction. Environmental Science and Ecotechnology 2024; 19:100321. [PMID: 38021368 PMCID: PMC10661693 DOI: 10.1016/j.ese.2023.100321] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Revised: 09/17/2023] [Accepted: 09/17/2023] [Indexed: 12/01/2023]
Abstract
Cornstalks show promise as a raw material for polysaccharide production through xylanase. Rapid and accurate prediction of polysaccharide yield can facilitate process optimization, eliminating the need for extensive experimentation in actual production to refine reaction conditions, thereby saving time and costs. However, the intricate interplay of enzymatic factors poses challenges in predicting and optimizing polysaccharide yield accurately. Here, we introduce an innovative data-driven approach leveraging multiple artificial intelligence techniques to enhance polysaccharide production. We propose a machine learning framework to identify highly accurate polysaccharide yield prediction modeling methods and uncover optimal enzymatic parameter combinations. Notably, Random Forest (RF) and eXtreme Gradient Boost (XGB) demonstrate robust performance, achieving prediction accuracies of 93.0% and 95.6%, respectively, while an independently developed deep neural network (DNN) model achieves 91.1% accuracy. A feature importance analysis of XGB reveals the enzyme solution volume's dominant role (43.7%), followed by time (20.7%), substrate concentration (15%), temperature (15%), and pH (5.6%). Further interpretability analysis unveils complex parameter interactions and potential optimization strategies. This data-driven approach, incorporating machine learning, deep learning, and interpretable analysis, offers a viable pathway for polysaccharide yield prediction and the potential recovery of various agricultural residues.
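The feature importance shares reported for the XGB model can be ranked and sanity-checked in a few lines. The sketch below uses only the percentages quoted in the abstract; the dictionary name is illustrative and the code does not reproduce the authors' model.

```python
# Importance shares (percent) as quoted in the abstract; the dict name is hypothetical.
xgb_importance = {
    "enzyme solution volume": 43.7,
    "time": 20.7,
    "substrate concentration": 15.0,
    "temperature": 15.0,
    "pH": 5.6,
}

# Rank features from most to least important (ties keep their listed order).
ranked = sorted(xgb_importance.items(), key=lambda item: item[1], reverse=True)
for name, share in ranked:
    print(f"{name}: {share:.1f}%")

# The quoted shares should account for the whole model.
total = sum(xgb_importance.values())
print(round(total, 1))  # 100.0
```

The shares sum to 100%, and the top two features (enzyme solution volume and time) together account for roughly 64% of the importance.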
Affiliation(s)
- Yushi Tian
  - School of Resource and Environment, Northeast Agriculture University, Harbin, 150030, PR China
- Xu Yang
  - School of Resource and Environment, Northeast Agriculture University, Harbin, 150030, PR China
- Nianhua Chen
  - School of Resource and Environment, Northeast Agriculture University, Harbin, 150030, PR China
- Chunyan Li
  - School of Resource and Environment, Northeast Agriculture University, Harbin, 150030, PR China
- Wulin Yang
  - College of Environmental Sciences and Engineering, Peking University, Beijing, 100871, PR China
7
Tran LN, Sun CK, Struck TJ, Sajan M, Gutenkunst RN. Computationally efficient demographic history inference from allele frequencies with supervised machine learning. bioRxiv: the preprint server for biology 2024:2023.05.24.542158. [PMID: 38405827 PMCID: PMC10888863 DOI: 10.1101/2023.05.24.542158] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/27/2024]
Abstract
Inferring past demographic history of natural populations from genomic data is of central concern in many studies across research fields. Previously, our group had developed dadi, a widely used demographic history inference method based on the allele frequency spectrum (AFS) and maximum composite likelihood optimization. However, dadi's optimization procedure can be computationally expensive. Here, we developed donni (demography optimization via neural network inference), a new inference method based on dadi that is more efficient while maintaining comparable inference accuracy. For each dadi-supported demographic model, donni simulates the expected AFS for a range of model parameters then trains a set of Mean Variance Estimation neural networks using the simulated AFS. Trained networks can then be used to instantaneously infer the model parameters from future input data AFS. We demonstrated that for many demographic models, donni can infer some parameters, such as population size changes, very well and other parameters, such as migration rates and times of demographic events, fairly well. Importantly, donni provides both parameter and confidence interval estimates from input AFS with accuracy comparable to parameters inferred by dadi's likelihood optimization while bypassing its long and computationally intensive evaluation process. donni's performance demonstrates that supervised machine learning algorithms may be a promising avenue for developing more sustainable and computationally efficient demographic history inference methods.
Affiliation(s)
- Linh N. Tran
  - Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ, USA
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, USA
- Connie K. Sun
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, USA
- Travis J. Struck
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, USA
- Mathews Sajan
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, USA
- Ryan N. Gutenkunst
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, USA
8
Tapondjou Nkonmeneck WP, Allen KE, Hime PM, Knipp KN, Kameni MM, Tchassem AM, Gonwouo LN, Brown RM. Diversification and historical demography of Rhampholeon spectrum in West-Central Africa. PLoS One 2022; 17:e0277107. [PMID: 36525408 PMCID: PMC9757597 DOI: 10.1371/journal.pone.0277107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 10/19/2022] [Indexed: 12/23/2022]
Abstract
Pygmy Chameleons of the genus Rhampholeon represent a moderately diverse, geographically circumscribed radiation, with most species (18 out of 19 extant taxa) limited to East Africa. The one exception is Rhampholeon spectrum, a species restricted to West-Central African rainforests. We set out to characterize the geographic basis of genetic variation in this disjunctly distributed Rhampholeon species using a combination of multilocus Sanger data and genomic sequences to explore population structure and range-wide phylogeographic patterns. We also employed demographic analyses and niche modeling to distinguish between alternate explanations to contextualize the impact of past geological and climatic events on the present-day distribution of intraspecific genetic variation. Phylogenetic analyses suggest that R. spectrum is a complex of five geographically delimited populations grouped into two major clades (montane vs. lowland). We found pronounced population structure suggesting that divergence and, potentially, speciation began between the late Miocene and the Pleistocene. Sea level changes during the Pleistocene climatic oscillations resulted in allopatric divergence associated with dispersal over an ocean channel barrier and colonization of Bioko Island. Demographic inferences and range stability mapping each support diversification models with secondary contact due to population contraction in lowland and montane refugia during the interglacial period. Allopatric divergence, congruent with isolation caused by geologic uplift of the East African rift system, the "descent into the Icehouse," and aridification of sub-Saharan Africa during the Eocene-Oligocene are identified as the key events explaining the population divergence between R. spectrum and its closely related sister clade from the Eastern Arc Mountains. Our results unveil cryptic genetic diversity in R. spectrum, suggesting the possibility of a species complex distributed across the Lower Guinean Forest and the Island of Bioko. We highlight the major element of species diversification that modelled today's diversity and distributions in most West-Central African vertebrates.
Affiliation(s)
- Walter Paulin Tapondjou Nkonmeneck
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
  - Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States of America
- Kaitlin E. Allen
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
  - Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States of America
- Paul M. Hime
  - Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States of America
- Kristen N. Knipp
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
  - Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States of America
- Marina M. Kameni
  - Laboratory of Zoology, Faculty of Science, University of Yaoundé I, Yaoundé, Cameroon
- Arnaud M. Tchassem
  - Laboratory of Zoology, Faculty of Science, University of Yaoundé I, Yaoundé, Cameroon
- LeGrand N. Gonwouo
  - Laboratory of Zoology, Faculty of Science, University of Yaoundé I, Yaoundé, Cameroon
- Rafe M. Brown
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas, United States of America
  - Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States of America
9
Vaux F, Parvizi E, Craw D, Fraser CI, Waters J. Parallel recolonizations generate distinct genomic sectors in kelp following high-magnitude earthquake disturbance. Mol Ecol 2022; 31:4818-4831. [PMID: 35582778 PMCID: PMC9540901 DOI: 10.1111/mec.16535] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/25/2021] [Revised: 04/25/2022] [Accepted: 05/12/2022] [Indexed: 11/30/2022]
Abstract
Large-scale disturbance events have the potential to drastically reshape biodiversity patterns. Notably, newly vacant habitat space cleared by disturbance can be colonized by multiple lineages, which can lead to the evolution of distinct spatial "sectors" of genetic diversity within a species. We test for disturbance-driven sectoring of genetic diversity in intertidal southern bull kelp, Durvillaea antarctica (Chamisso) Hariot, following the high-magnitude 1855 Wairarapa earthquake in New Zealand. Specifically, we use genotyping-by-sequencing (GBS) to analyse fine-scale population structure across the uplift zone and apply machine learning to assess the fit of alternative recolonization models. Our analysis reveals that specimens from the uplift zone carry distinctive genomic signatures potentially linked to post-earthquake recolonization processes. Specifically, our analysis identifies two parapatric spatial-genomic sectors of D. antarctica at Turakirae Head, which experienced the most dramatic uplift. Based on phylogeographical modelling, we infer that bull kelp in the Wellington region was probably a source for recolonization of the heavily uplifted Turakirae Head coastline, via two parallel, eastward recolonization events. By identifying multiple parapatric genotypic sectors within a recently recolonized coastal region, the current study provides support for the hypothesis that competing lineage expansions can generate striking spatial structuring of genetic diversity, even in highly dispersive taxa.
Affiliation(s)
- Felix Vaux
  - Department of Zoology, University of Otago, Dunedin, New Zealand
- Elahe Parvizi
  - Department of Zoology, University of Otago, Dunedin, New Zealand
- Dave Craw
  - Department of Geology, University of Otago, Dunedin, New Zealand
10
Carstens BC, Moshier SP. Giant tree frogs exemplify the promise of integrating multiple types of data in phylogeographic investigations. Mol Ecol 2022; 31:3971-3974. [PMID: 35779007 DOI: 10.1111/mec.16593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 06/13/2022] [Accepted: 06/28/2022] [Indexed: 11/29/2022]
Abstract
Hugall et al. (2002) is one of the seminal publications from the single locus era of phylogeographic research. These authors were among the first to argue that genetic data are ideally suited to test hypotheses that are ultimately derived from other sources of information. While the testing of predictions from the fossil record has long been important to molecular systematics (e.g., Donoghue et al., 1989), phylogeographic investigations into the more recent evolutionary past lack a fossil record in most focal taxa. In lieu of fossils, which were not available for the small snails that served as the focal taxon, Hugall et al. (2002) applied the (then) new technique of environmental modelling to identify regions within the species range with habitat that was predicted to be stable throughout the Holocene. They then present data that suggest that these regions correspond to the areas with high genetic diversity. Apart from the inferences about snail evolutionary history, the core argument of Hugall et al. (2002) is that consilience (i.e., agreement between inferences drawn from different sources of data) is an important goal for phylogeographic investigation. Consilience in the inferences drawn from independent types of data has a multiplicative effect; when it is present, the researcher is likely to have more confidence in their inference than would be possible from any one source of data. The manuscript by Jaynes et al. (2022) is a splendid illustration of this principle.
Affiliation(s)
- Bryan C Carstens
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, USA
- Shelby P Moshier
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, USA
11
Smith ML, Wallace J, Tank DC, Sullivan J, Carstens BC. The role of multiple Pleistocene refugia in promoting diversification in the Pacific Northwest. Mol Ecol 2022; 31:4402-4416. [PMID: 35780485 DOI: 10.1111/mec.16595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2021] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 10/17/2022]
Abstract
Pleistocene glacial cycles drastically changed the distributions of taxa endemic to temperate rainforests in the Pacific Northwest, with many experiencing reduced habitat suitability during glacial periods. In this study, we investigate whether glacial cycles promoted intraspecific divergence and whether subsequent range changes led to secondary contact and gene flow. For seven invertebrate species endemic to the PNW, we estimated Species Distribution Models (SDMs) and projected them onto current and historical climate conditions to assess how habitat suitability changed during glacial cycles. Using single nucleotide polymorphism (SNP) data from these species, we assessed population genetic structure and used a machine-learning approach to compare models with and without gene flow between populations upon secondary contact after the Last Glacial Maximum (LGM). Finally, we estimated divergence times and rates of gene flow between populations. SDMs suggest that there was less suitable habitat in the North Cascades and Northern Rocky Mountains during glacial compared to interglacial periods, resulting in reduced habitat suitability and habitat fragmentation during the LGM. Our genomic data identify population structure in all taxa, and support gene flow upon secondary contact in five of the seven taxa. Parameter estimates suggest that population divergences date to the later Pleistocene for most populations. Our results support a role of refugial dynamics in driving intraspecific divergence in the Cascades Range. In these invertebrates, population structure often does not correspond to current biogeographic or environmental barriers. Rather, population structure may reflect refugial lineages that have since expanded their ranges, often leading to secondary contact between once isolated lineages.
Affiliation(s)
- Megan L Smith
  - Department of Evolution, Ecology & Organismal Biology, The Ohio State University, 318 W. 12th Avenue, 300 Aronoff Labs, Columbus, OH 43210-1293, USA
- Jessica Wallace
  - Department of Evolution, Ecology & Organismal Biology, The Ohio State University, 318 W. 12th Avenue, 300 Aronoff Labs, Columbus, OH 43210-1293, USA
- David C Tank
  - Department of Botany and Rocky Mountain Herbarium, University of Wyoming, 1000 E. University Ave., Laramie, WY 82071, USA
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID 83844-3051, USA
  - Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID 83844-3051, USA
- Jack Sullivan
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID 83844-3051, USA
  - Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID 83844-3051, USA
- Bryan C Carstens
  - Department of Evolution, Ecology & Organismal Biology, The Ohio State University, 318 W. 12th Avenue, 300 Aronoff Labs, Columbus, OH 43210-1293, USA
|
12
|
Titus BM, Daly M. Population genomics for symbiotic anthozoans: can reduced representation approaches be used for taxa without reference genomes? Heredity (Edinb) 2022; 128:338-351. [PMID: 35418670 PMCID: PMC9076904 DOI: 10.1038/s41437-022-00531-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2020] [Revised: 03/25/2022] [Accepted: 03/25/2022] [Indexed: 11/08/2022] Open
Abstract
Population genetic studies of symbiotic anthozoans have been historically challenging because their endosymbioses with dinoflagellates have impeded marker development. Genomic approaches like reduced representation sequencing alleviate marker development issues but produce anonymous loci, and without a reference genome, it is unknown which organism is contributing to the observed patterns. Alternative methods such as bait-capture sequencing targeting Ultra-Conserved Elements are now possible but costly. Thus, RADseq remains attractive, but how useful are these methods for symbiotic anthozoan taxa without a reference genome to separate anthozoan from algal sequences? We explore this through a case-study using a double-digest RADseq dataset for the sea anemone Bartholomea annulata. We assembled a holobiont dataset (3854 loci) for 101 individuals, then used a reference genome to create an aposymbiotic dataset (1402 loci). For both datasets, we investigated population structure and used coalescent simulations to estimate demography and population parameters. We demonstrate complete overlap in the spatial patterns of genetic diversity, demographic histories, and population parameter estimates for holobiont and aposymbiotic datasets. We hypothesize that the unique combination of anthozoan biology, diversity of the endosymbionts, and the manner in which assembly programs identify orthologous loci alleviates the need for reference genomes in some circumstances. We explore this hypothesis by assembling an additional 21 datasets using the assembly programs pyRAD and Stacks. We conclude that RADseq methods are more tractable for symbiotic anthozoans without reference genomes than previously realized.
Affiliation(s)
- Benjamin M Titus
  - Department of Biological Sciences, University of Alabama, Tuscaloosa, AL, USA
  - Dauphin Island Sea Lab, Dauphin Island, AL, USA
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Marymegan Daly
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
|
13
|
Ruffley M, Smith ML, Espíndola A, Turck DF, Mitchell N, Carstens B, Sullivan J, Tank DC. Genomic evidence of an ancient Inland Temperate Rainforest in the Pacific Northwest of North America. Mol Ecol 2022; 31:2985-3001. [PMID: 35322900 PMCID: PMC9322681 DOI: 10.1111/mec.16431] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/27/2021] [Revised: 01/15/2022] [Accepted: 02/21/2022] [Indexed: 12/02/2022]
Abstract
The disjunct temperate rainforests of the Pacific Northwest of North America (PNW) are characterized by late‐successional dominant tree species Thuja plicata (western redcedar) and Tsuga heterophylla (western hemlock). The demographic histories of these species, along with the PNW rainforest ecosystem in its entirety, have been heavily impacted by geological and climatic changes the PNW has experienced over the last 5 million years, including mountain orogeny and repeated Pleistocene glaciations. These environmental events have ultimately shaped the history of these species, with inland populations potentially being extirpated during the Pleistocene glaciations. Here, we collect genomic data for both species across their ranges to test multiple demographic models, each reflecting a different phylogeographical hypothesis on how the ecosystem‐dominating species may have responded to dramatic climatic change. Our results indicate that inland and coastal populations in both species diverged ~2.5 million years ago in the early Pleistocene and experienced decreases in population size during glacial cycles, with subsequent population expansion. Importantly, we found evidence for gene flow between coastal and inland populations during the mid‐Holocene. It is likely that intermittent migration in these species during this time has prevented allopatric speciation via genetic drift alone. In conclusion, our results from combining genomic data and demographic inference procedures establish that populations of the ecosystem dominants Thuja plicata and Tsuga heterophylla persisted in refugia located in both the coastal and inland regions of the PNW throughout the Pleistocene, with populations expanding and contracting in response to glacial cycles with occasional gene flow.
Affiliation(s)
- Megan Ruffley
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Institute for Bioinformatics and Evolutionary Studies (IBEST), 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Department of Plant Biology, Carnegie Institution for Science, 260 Panama St, Stanford, CA, 94305, USA
- Megan L Smith
  - Department of Evolution, Ecology, and Organismal Biology & Museum of Biological Diversity, The Ohio State University, 1315 Kinnear Rd, Columbus, OH, 43212, USA
  - Department of Biology and Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA
- Anahí Espíndola
  - Department of Entomology, University of Maryland, 4291 Fieldhouse Dr, College Park, MD, 20742, USA
- Daniel F Turck
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Stillinger Herbarium, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
- Niels Mitchell
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
- Bryan Carstens
  - Department of Evolution, Ecology, and Organismal Biology & Museum of Biological Diversity, The Ohio State University, 1315 Kinnear Rd, Columbus, OH, 43212, USA
- Jack Sullivan
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Institute for Bioinformatics and Evolutionary Studies (IBEST), 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
- David C Tank
  - Department of Biological Sciences, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Institute for Bioinformatics and Evolutionary Studies (IBEST), 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Stillinger Herbarium, University of Idaho, 875 Perimeter Dr. MS 3051, Moscow, ID, 83844-3051, USA
  - Department of Botany & Rocky Mountain Herbarium, University of Wyoming, 1000 E. University Ave, Laramie, WY, 82071, USA
|
14
|
Blischak PD, Barker MS, Gutenkunst RN. Chromosome-scale inference of hybrid speciation and admixture with convolutional neural networks. Mol Ecol Resour 2021; 21:2676-2688. [PMID: 33682305 PMCID: PMC8675098 DOI: 10.1111/1755-0998.13355] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/30/2020] [Revised: 01/26/2021] [Accepted: 02/05/2021] [Indexed: 11/30/2022]
Abstract
Inferring the frequency and mode of hybridization among closely related organisms is an important step for understanding the process of speciation and can help to uncover reticulated patterns of phylogeny more generally. Phylogenomic methods to test for the presence of hybridization come in many varieties and typically operate by leveraging expected patterns of genealogical discordance in the absence of hybridization. An important assumption made by these tests is that the data (genes or SNPs) are independent given the species tree. However, when the data are closely linked, it is especially important to consider their nonindependence. Recently, deep learning techniques such as convolutional neural networks (CNNs) have been used to perform population genetic inferences with linked SNPs coded as binary images. Here, we use CNNs for selecting among candidate hybridization scenarios using the tree topology (((P1, P2), P3), Out) and a matrix of pairwise nucleotide divergence (dXY) calculated in windows across the genome. Using coalescent simulations to train and independently test a neural network showed that our method, HyDe-CNN, was able to accurately perform model selection for hybridization scenarios across a wide breadth of parameter space. We then used HyDe-CNN to test models of admixture in Heliconius butterflies, as well as comparing it to phylogeny-based introgression statistics. Given the flexibility of our approach, the dropping cost of long-read sequencing and the continued improvement of CNN architectures, we anticipate that inferences of hybridization using deep learning methods like ours will help researchers to better understand patterns of admixture in their study organisms.
Affiliation(s)
- Paul D. Blischak
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA
- Michael S. Barker
  - Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
- Ryan N. Gutenkunst
  - Department of Molecular & Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA
|
15
|
Parvizi E, Dutoit L, Fraser CI, Craw D, Waters JM. Concordant phylogeographic responses to large-scale coastal disturbance in intertidal macroalgae and their epibiota. Mol Ecol 2021; 31:646-657. [PMID: 34695264 DOI: 10.1111/mec.16245] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/15/2021] [Revised: 10/13/2021] [Accepted: 10/20/2021] [Indexed: 01/05/2023]
Abstract
Major ecological disturbance events can provide opportunities to assess multispecies responses to upheaval. In particular, catastrophic disturbances that regionally extirpate habitat-forming species can potentially influence the genetic diversity of large numbers of codistributed taxa. However, due to the rarity of such disturbance events over ecological timeframes, the genetic dynamics of multispecies recolonization processes have remained little understood. Here, we use single nucleotide polymorphism (SNP) data from multiple coastal species to track the dynamics of cocolonization events in response to ancient earthquake disturbance in southern New Zealand. Specifically, we use a comparative phylogeographic approach to understand the extent to which epifauna (with varying ecological associations with their macroalgal hosts) share comparable spatial and temporal recolonization patterns. Our study reveals concordant disturbance-related phylogeographic breaks in two intertidal macroalgal species along with two associated epibiotic species (a chiton and an isopod). By contrast, two codistributed species, one of which is an epibiotic amphipod and the other a subtidal macroalga, show few, if any, genetic effects of palaeoseismic coastal uplift. Phylogeographic model selection reveals similar post-uplift recolonization routes for the epibiotic chiton and isopod and their macroalgal hosts. Additionally, codemographic analyses support synchronous population expansions of these four phylogeographically similar taxa. Our findings indicate that coastal paleoseismic activity has driven concordant impacts on multiple codistributed species, with concerted recolonization events probably facilitated by macroalgal rafting. These results highlight that high-resolution comparative genomic data can help reconstruct concerted multispecies responses to recent ecological disturbance.
Affiliation(s)
- Elahe Parvizi
  - Department of Zoology, University of Otago, Dunedin, New Zealand
- Ludovic Dutoit
  - Department of Zoology, University of Otago, Dunedin, New Zealand
- Ceridwen I Fraser
  - Department of Marine Science, University of Otago, Dunedin, New Zealand
- Dave Craw
  - Department of Geology, University of Otago, Dunedin, New Zealand
|
16
|
Stone BW, Wolfe AD. Phylogeographic analysis of shrubby beardtongues reveals range expansions during the Last Glacial Maximum and implicates the Klamath Mountains as a hotspot for hybridization. Mol Ecol 2021; 30:3826-3839. [PMID: 34013537 DOI: 10.1111/mec.15992] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/06/2020] [Revised: 05/12/2021] [Accepted: 05/14/2021] [Indexed: 12/26/2022]
Abstract
Quaternary glacial cycles often altered species' geographic distributions, which in turn altered the geographic structure of species' genetic diversity. In many cases, glacial expansion forced species in temperate climates to contract their ranges and reside in small pockets of suitable habitat (refugia), where they were likely to interact closely with other species, setting the stage for potential gene exchange. These introgression events, in turn, would have degraded species boundaries, making the inference of phylogenetic relationships challenging. Using high-throughput sequence data, we employed a combination of species distribution models and hybridization tests to assess the effect of glaciation on the geographic distributions, phylogenetic relationships, and patterns of gene flow of five species of Penstemon subgenus Dasanthera, long-lived shrubby angiosperms distributed throughout the Pacific Northwest of North America. Surprisingly, we found that rather than reducing their ranges to small refugia, most Penstemon subgenus Dasanthera species experienced increased suitable habitat during the Last Glacial Maximum relative to the present day. We also found substantial evidence for gene exchange between species, with the bulk of introgression events occurring in or near the Klamath Mountains of southwestern Oregon and northwestern California. Subsequently, our phylogenetic inference reveals blurred taxonomic boundaries in the Klamath Mountains, where introgression is most prevalent. Our results question the classical paradigm of temperate species' responses to glaciation and highlight the importance of contextualizing phylogenetic inference with species' histories of introgression.
Affiliation(s)
- Benjamin W Stone
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Andrea D Wolfe
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
|
17
|
Allen KE, Greenbaum E, Hime PM, Tapondjou N. WP, Sterkhova VV, Kusamba C, Rödel M, Penner J, Peterson AT, Brown RM. Rivers, not refugia, drove diversification in arboreal, sub-Saharan African snakes. Ecol Evol 2021; 11:6133-6152. [PMID: 34141208 PMCID: PMC8207163 DOI: 10.1002/ece3.7429] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/18/2020] [Revised: 02/12/2021] [Accepted: 02/18/2021] [Indexed: 12/26/2022] Open
Abstract
The relative roles of rivers versus refugia in shaping the high levels of species diversity in tropical rainforests have been widely debated for decades. Only recently has it become possible to take an integrative approach to test predictions derived from these hypotheses using genomic sequencing and paleo-species distribution modeling. Herein, we tested the predictions of the classic river, refuge, and river-refuge hypotheses on diversification in the arboreal sub-Saharan African snake genus Toxicodryas. We used dated phylogeographic inferences, population clustering analyses, demographic model selection, and paleo-distribution modeling to conduct a phylogenomic and historical demographic analysis of this genus. Our results revealed significant population genetic structure within both Toxicodryas species, corresponding geographically to river barriers and divergence times from the mid-Miocene to Pliocene. Our demographic analyses supported the interpretation that rivers are indications of strong barriers to gene flow among populations since their divergence. Additionally, we found no support for a major contraction of suitable habitat during the last glacial maximum, allowing us to reject both the refuge and river-refuge hypotheses in favor of the river-barrier hypothesis. Based on conservative interpretations of our species delimitation analyses with the Sanger and ddRAD data sets, two new cryptic species are identified from east-central Africa. This study highlights the complexity of diversification dynamics in the African tropics and the advantages of integrative approaches to studying speciation in tropical regions.
Affiliation(s)
- Kaitlin E. Allen
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
- Eli Greenbaum
  - Department of Biological Sciences, University of Texas at El Paso, El Paso, TX, USA
- Paul M. Hime
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
- Walter P. Tapondjou N.
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
- Viktoria V. Sterkhova
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
- Chifundera Kusamba
  - Laboratoire d’Hérpétologie, Département de Biologie, Centre de Recherche en Sciences Naturelles, Lwiro, Democratic Republic of Congo
- Mark‐Oliver Rödel
  - Museum für Naturkunde – Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
- Johannes Penner
  - Museum für Naturkunde – Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
  - Chair of Wildlife Ecology and Management, University of Freiburg, Freiburg, Germany
- A. Townsend Peterson
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
- Rafe M. Brown
  - Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, USA
  - Biodiversity Institute, University of Kansas, Lawrence, KS, USA
|
18
|
Fonseca EM, Colli GR, Werneck FP, Carstens BC. Phylogeographic model selection using convolutional neural networks. Mol Ecol Resour 2021; 21:2661-2675. [PMID: 33973350 DOI: 10.1111/1755-0998.13427] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/07/2020] [Revised: 04/02/2021] [Accepted: 04/28/2021] [Indexed: 11/26/2022]
Abstract
The discipline of phylogeography has evolved rapidly in terms of the analytical toolkit used to analyse large genomic data sets. Despite substantial advances, analytical tools that could potentially address the challenges posed by increased model complexity have not been fully explored. For example, deep learning techniques are underutilized for phylogeographic model selection. In non-model organisms, the lack of information about their ecology and evolution can lead to uncertainty about which demographic models are appropriate. Here, we assess the utility of convolutional neural networks (CNNs) for assessing demographic models in South American lizards in the genus Norops. Three demographic scenarios (constant, expansion, and bottleneck) were considered for each of four inferred population-level lineages, and we found that the overall model accuracy was higher than 98% for all lineages. We then evaluated a set of 26 models that accounted for evolutionary relationships, gene flow, and changes in effective population size among the four lineages, identifying a single model with an estimated overall accuracy of 87% when using CNNs. The inferred demography of the lizard system suggests that gene flow between non-sister populations and changes in effective population sizes through time, probably in response to Pleistocene climatic oscillations, have shaped genetic diversity in this system. Approximate Bayesian computation (ABC) was applied to provide a comparison to the performance of CNNs. ABC was unable to identify a single model among the larger set of 26 models in the subsequent analysis. Our results demonstrate that CNNs can be easily and usefully incorporated into the phylogeographer's toolkit.
Affiliation(s)
- Emanuel M Fonseca
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
- Guarino R Colli
  - Departamento de Zoologia, Universidade de Brasília, Brasília, Brazil
- Fernanda P Werneck
  - Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisas da Amazônia (INPA), Manaus, Brazil
- Bryan C Carstens
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
|
19
|
Collin FD, Durif G, Raynal L, Lombaert E, Gautier M, Vitalis R, Marin JM, Estoup A. Extending approximate Bayesian computation with supervised machine learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest. Mol Ecol Resour 2021; 21:2598-2613. [PMID: 33950563 PMCID: PMC8596733 DOI: 10.1111/1755-0998.13413] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 07/10/2020] [Revised: 03/29/2021] [Accepted: 04/28/2021] [Indexed: 01/07/2023]
Abstract
Simulation-based methods such as approximate Bayesian computation (ABC) are well-adapted to the analysis of complex scenarios of populations and species genetic history. In this context, supervised machine learning (SML) methods provide attractive statistical solutions to conduct efficient inferences about scenario choice and parameter estimation. The Random Forest methodology (RF) is a powerful ensemble of SML algorithms used for classification or regression problems. Random Forest allows conducting inferences at a low computational cost, without preliminary selection of the relevant components of the ABC summary statistics, and bypassing the derivation of ABC tolerance levels. We have implemented a set of RF algorithms to process inferences using simulated data sets generated from an extended version of the population genetic simulator implemented in DIYABC v2.1.0. The resulting computer package, named DIYABC Random Forest v1.0, integrates two functionalities into a user-friendly interface: the simulation under custom evolutionary scenarios of different types of molecular data (microsatellites, DNA sequences or SNPs) and RF treatments including statistical tools to evaluate the power and accuracy of inferences. We illustrate the functionalities of DIYABC Random Forest v1.0 for both scenario choice and parameter estimation through the analysis of pseudo-observed and real data sets corresponding to pool-sequencing and individual-sequencing SNP data sets. Because of the properties inherent to the implemented RF methods and the large feature vector (including various summary statistics and their linear combinations) available for SNP data, DIYABC Random Forest v1.0 can efficiently contribute to the analysis of large SNP data sets to make inferences about complex population genetic histories.
Affiliation(s)
- Ghislain Durif
  - IMAG, Univ Montpellier, CNRS, UMR 5149, Montpellier, France
- Louis Raynal
  - IMAG, Univ Montpellier, CNRS, UMR 5149, Montpellier, France
- Eric Lombaert
  - ISA, INRAE, CNRS, Univ Côte d'Azur, Sophia Antipolis, France
- Mathieu Gautier
  - CBGP, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
- Renaud Vitalis
  - CBGP, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
- Arnaud Estoup
  - CBGP, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
|
20
|
Xue AT, Schrider DR, Kern AD. Discovery of Ongoing Selective Sweeps within Anopheles Mosquito Populations Using Deep Learning. Mol Biol Evol 2021; 38:1168-1183. [PMID: 33022051 PMCID: PMC7947845 DOI: 10.1093/molbev/msaa259] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 01/19/2023] Open
Abstract
Identification of partial sweeps, which include both hard and soft sweeps that have not currently reached fixation, provides crucial information about ongoing evolutionary responses. To this end, we introduce partialS/HIC, a deep learning method to discover selective sweeps from population genomic data. partialS/HIC uses a convolutional neural network for image processing, which is trained with a large suite of summary statistics derived from coalescent simulations incorporating population-specific history, to distinguish between completed versus partial sweeps, hard versus soft sweeps, and regions directly affected by selection versus those merely linked to nearby selective sweeps. We perform several simulation experiments under various demographic scenarios to demonstrate partialS/HIC's performance, which exhibits excellent resolution for detecting partial sweeps. We also apply our classifier to whole genomes from eight mosquito populations sampled across sub-Saharan Africa by the Anopheles gambiae 1000 Genomes Consortium, elucidating both continent-wide patterns as well as sweeps unique to specific geographic regions. These populations have experienced intense insecticide exposure over the past two decades, and we observe a strong overrepresentation of sweeps at insecticide resistance loci. Our analysis thus provides a list of candidate adaptive loci that may be relevant to mosquito control efforts. More broadly, our supervised machine learning approach introduces a method to distinguish between completed and partial sweeps, as well as between hard and soft sweeps, under a variety of demographic scenarios. As whole-genome data rapidly accumulate for a greater diversity of organisms, partialS/HIC addresses an increasing demand for useful selection scan tools that can track in-progress evolutionary dynamics.
Affiliation(s)
- Alexander T Xue
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC
- Andrew D Kern
  - Institute of Ecology and Evolution, 5289 University of Oregon, Eugene, OR
|
21
|
Nye J, Mondal M, Bertranpetit J, Laayouni H. A fully integrated machine learning scan of selection in the chimpanzee genome. NAR Genom Bioinform 2021; 2:lqaa061. [PMID: 33575612 PMCID: PMC7671310 DOI: 10.1093/nargab/lqaa061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2019] [Revised: 06/11/2020] [Accepted: 07/31/2020] [Indexed: 11/13/2022] Open
Abstract
After diverging, each chimpanzee subspecies has been the target of unique selective pressures. Here, we employ a machine learning approach to classify regions as under positive selection or neutrality genome-wide. The regions determined to be under selection reflect the unique demographic and adaptive history of each subspecies. The results indicate that effective population size is important for determining the proportion of the genome under positive selection. The chimpanzee subspecies share signals of selection in genes associated with immunity and gene regulation. With these results, we have created a selection map for each population that can be displayed in a genome browser (www.hsb.upf.edu/chimp_browser). This study is the first to use a detailed demographic history and machine learning to map selection genome-wide in chimpanzee. The chimpanzee selection map will improve our understanding of the impact of selection on closely related subspecies and will empower future studies of chimpanzee.
Affiliation(s)
- Jessica Nye
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Mayukh Mondal
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Jaume Bertranpetit
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Hafid Laayouni
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
|
22
|
Martin BT, Chafin TK, Douglas MR, Placyk JS, Birkhead RD, Phillips CA, Douglas ME. The choices we make and the impacts they have: Machine learning and species delimitation in North American box turtles (Terrapene spp.). Mol Ecol Resour 2021; 21:2801-2817. [PMID: 33566450 DOI: 10.1111/1755-0998.13350] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/15/2020] [Revised: 01/20/2021] [Accepted: 02/05/2021] [Indexed: 12/26/2022]
Abstract
Model-based approaches that attempt to delimit species are hampered by computational limitations as well as the unfortunate tendency by users to disregard algorithmic assumptions. Alternatives are clearly needed, and machine-learning (M-L) is attractive in this regard as it functions without the need to explicitly define a species concept. Unfortunately, its performance will vary according to which (of several) bioinformatic parameters are invoked. Herein, we gauge the effectiveness of M-L-based species-delimitation algorithms by parsing 64 variably-filtered versions of a ddRAD-derived SNP data set collected from North American box turtles (Terrapene spp.). Our filtering strategies included: (i) minor allele frequencies (MAF) of 5%, 3%, 1%, and 0% (= none), and (ii) maximum missing data per-individual/per-population at 25%, 50%, 75%, and 100% (= no filtering). We found that species-delimitation via unsupervised M-L impacted the signal-to-noise ratio in our data, as well as the discordance among resolved clades. The latter may also reflect biogeographic history, gene flow, incomplete lineage sorting, or combinations thereof (as corroborated from previously observed patterns of differential introgression). Our results substantiate M-L as a viable species-delimitation method, but also demonstrate how commonly observed patterns of phylogenetic discordance can seriously impact M-L-classification.
Affiliation(s)
- Bradley T Martin
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Tyler K Chafin
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Marlis R Douglas
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- John S Placyk
- Department of Biology, University of Texas, Tyler, TX, USA; Science Division, Trinity Valley Community College, Athens, Texas, USA
- Christopher A Phillips
- Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL, USA
- Michael E Douglas
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
23
Ghirotto S, Vizzari MT, Tassi F, Barbujani G, Benazzo A. Distinguishing among complex evolutionary models using unphased whole-genome data through random forest approximate Bayesian computation. Mol Ecol Resour 2020; 21:2614-2628. [PMID: 33000507 DOI: 10.1111/1755-0998.13263] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/13/2019] [Revised: 08/28/2020] [Accepted: 09/07/2020] [Indexed: 01/25/2023]
Abstract
Inferring past demographic histories is crucial in population genetics, and the number of complete genomes now available should in principle facilitate this inference. In practice, however, the available inferential methods suffer from severe limitations. Although hundreds of complete genomes can be simultaneously analysed, complex demographic processes can easily exceed computational constraints, and the procedures to evaluate the reliability of the estimates further increase the computational effort. Here we present an approximate Bayesian computation framework based on the random forest algorithm (ABC-RF), to infer complex past population processes using complete genomes. To this aim, we propose to summarize the data by the full genomic distribution of the four mutually exclusive categories of segregating sites (FDSS), a statistic that is fast to compute from unphased genome data and that does not require the ancestral state of alleles to be known. We constructed an efficient ABC pipeline and tested how accurately it allows one to recognize the true model among models of increasing complexity, using simulated data and taking into account different sampling strategies in terms of the number of individuals analysed and the number and size of the genetic loci considered. We also compared the FDSS with the unfolded and folded site frequency spectrum (SFS), and for these statistics we highlighted the experimental conditions maximizing the inferential power of the ABC-RF procedure. We finally analysed real data sets, testing models on the dispersal of anatomically modern humans out of Africa and exploring the evolutionary relationships of the three species of Orangutan inhabiting Borneo and Sumatra.
Affiliation(s)
- Silvia Ghirotto
- Department of Mathematics and Computer Science, University of Ferrara, Ferrara, Italy
- Maria Teresa Vizzari
- Department of Mathematics and Computer Science, University of Ferrara, Ferrara, Italy
- Francesca Tassi
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
- Guido Barbujani
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
- Andrea Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
24
Xue AT, Hickerson MJ. Comparative phylogeographic inference with genome‐wide data from aggregated population pairs. Evolution 2020; 74:808-830. [DOI: 10.1111/evo.13945] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/08/2017] [Revised: 01/24/2020] [Accepted: 01/29/2020] [Indexed: 12/20/2022]
Affiliation(s)
- Alexander T. Xue
- Subprogram in Ecology, Evolutionary Biology, and Behavior, Department of Biology, Graduate Center of City University of New York, New York, NY 10016
- Subprogram in Ecology, Evolutionary Biology, and Behavior, Department of Biology, City College of City University of New York, New York, NY 10031
- Human Genetics Institute of New Jersey and Department of Genetics, Rutgers University, Piscataway, NJ 08854
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
- Michael J. Hickerson
- Subprogram in Ecology, Evolutionary Biology, and Behavior, Department of Biology, Graduate Center of City University of New York, New York, NY 10016
- Subprogram in Ecology, Evolutionary Biology, and Behavior, Department of Biology, City College of City University of New York, New York, NY 10031
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024
25
Smith ML, Carstens BC. Process-based species delimitation leads to identification of more biologically relevant species. Evolution 2019; 74:216-229. [PMID: 31705650 DOI: 10.1111/evo.13878] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 03/28/2019] [Revised: 10/08/2019] [Accepted: 10/12/2019] [Indexed: 12/23/2022]
Abstract
Most approaches to species delimitation to date have considered divergence-only models. Although these models are appropriate for allopatric speciation, their failure to incorporate many of the population-level processes that drive speciation, such as gene flow (e.g., in sympatric speciation), places an unnecessary limit on our collective understanding of the processes that produce biodiversity. To consider these processes while inferring species boundaries, we introduce the R-package delimitR and apply it to identify species boundaries in the reticulate taildropper slug (Prophysaon andersoni). Results suggest that secondary contact is an important mechanism driving speciation in this system. By considering process, we both avoid erroneous inferences that can be made when population-level processes such as secondary contact drive speciation but only divergence is considered, and gain insight into the process of speciation in terrestrial slugs. Further, we apply delimitR to three published empirical datasets and find results corroborating previous findings. Finally, we evaluate the performance of delimitR using simulation studies, and find that error rates are near zero when comparing models that include lineage divergence and gene flow for three populations with a modest number of Single Nucleotide Polymorphisms (SNPs; 1500) and moderate divergence times (<100,000 generations). When we apply delimitR to a complex model set (i.e., including divergence, gene flow, and population size changes), error rates are moderate (∼0.15; 10,000 SNPs), and, when present, misclassifications occur among highly similar models.
Affiliation(s)
- Megan L Smith
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, 43210
- Bryan C Carstens
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, Ohio, 43210
26
Derkarabetian S, Castillo S, Koo PK, Ovchinnikov S, Hedin M. A demonstration of unsupervised machine learning in species delimitation. Mol Phylogenet Evol 2019; 139:106562. [PMID: 31323334 PMCID: PMC6880864 DOI: 10.1016/j.ympev.2019.106562] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 03/17/2019] [Revised: 07/03/2019] [Accepted: 07/15/2019] [Indexed: 01/13/2023]
Abstract
One major challenge to delimiting species with genetic data is successfully differentiating population structure from species-level divergence, an issue exacerbated in taxa inhabiting naturally fragmented habitats. Many fields of science are now using machine learning, and in evolutionary biology supervised machine learning has recently been used to infer species boundaries. These supervised methods require training data with associated labels. Conversely, unsupervised machine learning (UML) uses inherent data structure and does not require user-specified training labels, potentially providing more objectivity in species delimitation. In the context of integrative taxonomy, we demonstrate the utility of three UML approaches (random forests, variational autoencoders, t-distributed stochastic neighbor embedding) for species delimitation in an arachnid taxon with high population genetic structure (Opiliones, Laniatores, Metanonychus). We find that UML approaches successfully cluster samples according to species-level divergences and not high levels of population structure, while model-based validation methods severely over-split putative species. UML offers intuitive data visualization in two-dimensional space, the ability to accommodate various data types, and has potential in many areas of systematic and evolutionary biology. We argue that machine learning methods are ideally suited for species delimitation and may perform well in many natural systems and across taxa with diverse biological characteristics.
Affiliation(s)
- Shahan Derkarabetian
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, United States; Department of Biology, San Diego State University, San Diego, CA 92182, United States; Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, Riverside, CA 92521, United States.
- Stephanie Castillo
- Department of Biology, San Diego State University, San Diego, CA 92182, United States; Department of Entomology, University of California, Riverside, Riverside, CA 92521, United States
- Peter K Koo
- Howard Hughes Medical Institute, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, United States
- Sergey Ovchinnikov
- Center for Systems Biology, Harvard University, Cambridge, MA 02138, United States
- Marshal Hedin
- Department of Biology, San Diego State University, San Diego, CA 92182, United States
27
Titus BM, Blischak PD, Daly M. Genomic signatures of sympatric speciation with historical and contemporary gene flow in a tropical anthozoan (Hexacorallia: Actiniaria). Mol Ecol 2019; 28:3572-3586. [PMID: 31233641 DOI: 10.1111/mec.15157] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 06/20/2018] [Revised: 05/21/2019] [Accepted: 06/04/2019] [Indexed: 12/23/2022]
Abstract
Sympatric diversification is recognized to have played an important role in the evolution of biodiversity. However, an in situ sympatric origin for codistributed taxa is difficult to demonstrate because different evolutionary processes can lead to similar biogeographic outcomes, especially in ecosystems that can readily facilitate secondary contact due to a lack of hard barriers to dispersal. Here we use a genomic (ddRADseq), model-based approach to delimit a species complex of tropical sea anemones that are codistributed on coral reefs throughout the Tropical Western Atlantic. We use coalescent simulations in fastsimcoal2 and ordinary differential equations in Moments to test competing diversification scenarios that span the allopatric-sympatric continuum. Our results suggest that the corkscrew sea anemone Bartholomea annulata is a cryptic species complex whose members are codistributed throughout their range. Simulation and model selection analyses from both approaches suggest these lineages experienced historical and contemporary gene flow, supporting a sympatric origin, but an alternative secondary contact model receives appreciable model support in fastsimcoal2. Leveraging the genome of the closely related Exaiptasia diaphana, we identify five loci under divergent selection between cryptic B. annulata lineages that fall within mRNA transcripts or CDS regions. Our study provides a rare empirical, genomic example of sympatric speciation in a tropical anthozoan and the first range-wide molecular study of a tropical sea anemone, underscoring that anemone diversity is under-described in the tropics, and highlighting the need for additional systematic studies into these ecologically and economically important species.
Affiliation(s)
- Benjamin M Titus
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA; Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, USA
- Paul D Blischak
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA; Department of Ecology and Evolutionary Biology, The University of Arizona, Tucson, AZ, USA
- Marymegan Daly
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
28
Rankin AM, Wilke T, Lucid M, Leonard W, Espíndola A, Smith ML, Carstens BC, Sullivan J. Complex interplay of ancient vicariance and recent patterns of geographical speciation in north-western North American temperate rainforests explains the phylogeny of jumping slugs (Hemphillia spp.). Biol J Linn Soc Lond 2019. [DOI: 10.1093/biolinnean/blz040] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022]
Abstract
The history of the currently disjunct temperate rainforests of the Pacific Northwest of North America has shaped the evolution and diversity of endemics. This study focuses on how geological and climatic perturbations have driven speciation in the area by isolating lineages. We investigated the phylogenetic relationships and historical biogeography of the endemic jumping slugs (genus Hemphillia) using a multi-locus phylogeny. We evaluated the spatial distribution and divergence times of major lineages, generated ancestral area probabilities and inferred the biogeographical history of the genus. Our study revealed eight genetic lineages that formed three clades: one clade consisting of two Coast/Cascade lineages, and two reciprocally monophyletic clades that each contain a Coast/Cascade and two Rocky Mountains taxa. The results of the biogeographical analysis suggest that the ancestral range of the genus occupied Coast/Cascade habitats and then spread across into Northern Rocky Mountain interior habitats with subsequent fragmentations isolating coastal and inland lineages. Finally, there have been more recent speciation events among three lineage pairs that have shaped shallow structures of all clades. We add to our knowledge of the biogeographical history of the region in that we discovered diversification and speciation events that have occurred in ways more complex than previously thought.
Affiliation(s)
- Andrew M Rankin
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- Thomas Wilke
- Animal Ecology and Systematics, Justus Liebig University, Heinrich-Buff-Ring (IFZ), Giessen, Germany
- Michael Lucid
- Idaho Department of Fish and Game, Coeur d’Alene, ID, USA
- Anahí Espíndola
- Department of Entomology, University of Maryland, College Park, MD, USA
- Megan L Smith
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
- Bryan C Carstens
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
- Jack Sullivan
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
29
Sullivan J, Smith ML, Espíndola A, Ruffley M, Rankin A, Tank D, Carstens B. Integrating life history traits into predictive phylogeography. Mol Ecol 2019; 28:2062-2073. [DOI: 10.1111/mec.15029] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 08/20/2018] [Revised: 01/04/2019] [Accepted: 01/16/2019] [Indexed: 11/29/2022]
Affiliation(s)
- Jack Sullivan
- Department of Biological Sciences, University of Idaho, Moscow, Idaho
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho
- Megan L. Smith
- Department of Ecology, Evolution and Organismal Biology, The Ohio State University, Columbus, Ohio
- Anahí Espíndola
- Department of Biological Sciences, University of Idaho, Moscow, Idaho
- Department of Entomology, University of Maryland, College Park, Maryland
- Megan Ruffley
- Department of Biological Sciences, University of Idaho, Moscow, Idaho
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho
- Andrew Rankin
- Department of Biological Sciences, University of Idaho, Moscow, Idaho
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho
- David Tank
- Department of Biological Sciences, University of Idaho, Moscow, Idaho
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, Idaho
- Bryan Carstens
- Department of Ecology, Evolution and Organismal Biology, The Ohio State University, Columbus, Ohio
30
Barratt CD, Bwong BA, Jehle R, Liedtke HC, Nagel P, Onstein RE, Portik DM, Streicher JW, Loader SP. Vanishing refuge? Testing the forest refuge hypothesis in coastal East Africa using genome-wide sequence data for seven amphibians. Mol Ecol 2018; 27:4289-4308. [DOI: 10.1111/mec.14862] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 02/20/2018] [Revised: 08/08/2018] [Accepted: 08/29/2018] [Indexed: 01/03/2023]
Affiliation(s)
- Christopher D. Barratt
- Department of Environmental Sciences; University of Basel; Basel Switzerland
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig; Leipzig Germany
- Beryl A. Bwong
- Department of Environmental Sciences; University of Basel; Basel Switzerland
- Herpetology Section; National Museums of Kenya; Nairobi Kenya
- Robert Jehle
- School of Environment and Life Sciences; University of Salford; Salford UK
- H. Christoph Liedtke
- Department of Environmental Sciences; University of Basel; Basel Switzerland
- Ecology, Evolution and Developmental Group; Department of Wetland Ecology; Estación Biológica de Doñana (CSIC); Sevilla Spain
- Peter Nagel
- Department of Environmental Sciences; University of Basel; Basel Switzerland
- Renske E. Onstein
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig; Leipzig Germany
- Daniel M. Portik
- Department of Biology; The University of Texas at Arlington; Arlington Texas
- Department of Ecology and Evolutionary Biology; University of Arizona; Tucson Arizona
- Simon P. Loader
- Department of Environmental Sciences; University of Basel; Basel Switzerland
- Department of Life Sciences; Natural History Museum; London UK
31
Fraïsse C, Roux C, Gagnaire PA, Romiguier J, Faivre N, Welch JJ, Bierne N. The divergence history of European blue mussel species reconstructed from Approximate Bayesian Computation: the effects of sequencing techniques and sampling strategies. PeerJ 2018; 6:e5198. [PMID: 30083438 PMCID: PMC6071616 DOI: 10.7717/peerj.5198] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 02/01/2018] [Accepted: 06/19/2018] [Indexed: 01/25/2023]
Abstract
Genome-scale diversity data are increasingly available in a variety of biological systems, and can be used to reconstruct the past evolutionary history of species divergence. However, extracting the full demographic information from these data is not trivial, and requires inferential methods that account for the diversity of coalescent histories throughout the genome. Here, we evaluate the potential and limitations of one such approach. We reexamine a well-known system of mussel sister species, using the joint site frequency spectrum (jSFS) of synonymous mutations computed either from exome capture or RNA-seq, in an Approximate Bayesian Computation (ABC) framework. We first assess the best sampling strategy (number of individuals, loci, and bins in the jSFS), and show that model selection is robust to variation in the number of individuals and loci. In contrast, different binning choices when summarizing the jSFS strongly affect the results: including classes of low and high frequency shared polymorphisms can more effectively reveal recent migration events. We then take advantage of the flexibility of ABC to compare more realistic models of speciation, including variation in migration rates through time (i.e., periodic connectivity) and across genes (i.e., genome-wide heterogeneity in migration rates). We show that these models were consistently selected as the most probable, suggesting that mussels have experienced a complex history of gene flow during divergence and that the species boundary is semi-permeable. Our work provides a comprehensive evaluation of ABC demographic inference in mussels based on the coding jSFS, and supplies guidelines for employing different sequencing techniques and sampling strategies. We emphasize, perhaps surprisingly, that inferences are less limited by the volume of data than by the way in which they are analyzed.
Affiliation(s)
- Christelle Fraïsse
- Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Department of Genetics, University of Cambridge, Cambridge, UK
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Camille Roux
- Université de Lille, Unité Evo-Eco-Paléo (EEP), UMR 8198, Villeneuve d’Ascq, France
- Pierre-Alexandre Gagnaire
- Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Jonathan Romiguier
- Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Nicolas Faivre
- Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Department of Genetics, University of Cambridge, Cambridge, UK
- John J. Welch
- Department of Genetics, University of Cambridge, Cambridge, UK
- Nicolas Bierne
- Institut des Sciences de l’Evolution UMR5554, University Montpellier, CNRS, IRD, EPHE, Montpellier, France
- Department of Genetics, University of Cambridge, Cambridge, UK
32
Smith ML, Ruffley M, Rankin AM, Espíndola A, Tank DC, Sullivan J, Carstens BC. Testing for the presence of cryptic diversity in tail-dropper slugs (Prophysaon) using molecular data. Biol J Linn Soc Lond 2018. [DOI: 10.1093/biolinnean/bly067] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Affiliation(s)
- Megan L Smith
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Aronoff Labs, Columbus, OH, USA
- Megan Ruffley
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- Andrew M Rankin
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- Anahí Espíndola
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- David C Tank
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- Jack Sullivan
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, University of Idaho, Moscow, ID, USA
- Bryan C Carstens
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Aronoff Labs, Columbus, OH, USA
33
Ruffley M, Smith ML, Espíndola A, Carstens BC, Sullivan J, Tank DC. Combining allele frequency and tree-based approaches improves phylogeographic inference from natural history collections. Mol Ecol 2018; 27:1012-1024. [PMID: 29334417 PMCID: PMC5878120 DOI: 10.1111/mec.14491] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/05/2017] [Revised: 12/07/2017] [Accepted: 12/08/2017] [Indexed: 01/25/2023]
Abstract
Model selection approaches in phylogeography have allowed researchers to evaluate the support for competing demographic histories, which provides a mode of inference and a measure of uncertainty in understanding climatic and spatial influences on intraspecific diversity. Here, to rank all models in the comparison set and determine what proportion of the total support the top-ranked model garners, we conduct model selection using two analytical approaches: allele frequency-based, implemented in fastsimcoal2, and gene tree-based, implemented in phrapl. We then expand this model selection framework by including an assessment of absolute fit of the models to the data. For this, we utilize DNA isolated from existing natural history collections that span the distribution of red alder (Alnus rubra) in the Pacific Northwest of North America to generate genomic data for the evaluation of 13 demographic scenarios. The quality of DNA recovered from herbarium specimen leaf tissue was assessed for its utility and effectiveness in demographic model selection, specifically in the two approaches mentioned. We present strong support for the use of herbarium tissue in the generation of genomic DNA, albeit with the inclusion of additional quality control checks prior to library preparation and analyses with multiple approaches that incorporate various data. Analyses with allele frequency spectra and gene trees predominantly support A. rubra having experienced an ancient vicariance event with intermittent and frequent gene flow between the disjunct populations. Additionally, the data consistently fit the most frequently selected model, corroborating the model selection techniques. Finally, these results suggest that the A. rubra disjunct populations do not represent separate species.
Affiliation(s)
- Megan Ruffley
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- Stillinger Herbarium, University of Idaho, Moscow, ID, USA
- Megan L Smith
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
- Anahí Espíndola
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- Bryan C Carstens
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
- Jack Sullivan
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- David C Tank
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- Stillinger Herbarium, University of Idaho, Moscow, ID, USA
34
Stone GN, White SC, Csóka G, Melika G, Mutun S, Pénzes Z, Sadeghi SE, Schönrogge K, Tavakoli M, Nicholls JA. Tournament ABC analysis of the western Palaearctic population history of an oak gall wasp, Synergus umbraculus. Mol Ecol 2017; 26:6685-6703. [DOI: 10.1111/mec.14372] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/29/2017] [Revised: 09/16/2017] [Accepted: 09/18/2017] [Indexed: 01/17/2023]
Affiliation(s)
- Graham N. Stone
- Institute of Evolutionary Biology; University of Edinburgh; Edinburgh UK
- Sarah C. White
- Institute of Evolutionary Biology; University of Edinburgh; Edinburgh UK
- György Csóka
- National Agricultural Research and Innovation Centre; Forest Research Institute; Mátrafüred Hungary
- George Melika
- Plant Health and Molecular Biology Laboratory; Directorate of Plant Protection, Soil Conservation and Agri-environment; Budapest Hungary
- Serap Mutun
- Department of Biology; Faculty of Science and Arts; Abant İzzet Baysal University; Bolu Turkey
- Zsolt Pénzes
- Department of Ecology; Faculty of Science and Informatics; University of Szeged; Szeged Hungary
- S. Ebrahim Sadeghi
- Agricultural Research, Education and Extension Organization (AREEO); Research Institute of Forests and Rangelands of Iran; Tehran Iran
- Majid Tavakoli
- Lorestan Agriculture and Natural Resources Research Center; Khorramabad Lorestan Iran
- James A. Nicholls
- Institute of Evolutionary Biology; University of Edinburgh; Edinburgh UK