1
Sinaiko G, Cao Y, Dietrich CH. Phylogenomics of the leafhopper genus Neoaliturus Distant, 1918 (Hemiptera: Cicadellidae: Deltocephalinae) reveals genetically divergent lineages in the invasive beet leafhopper. Mol Phylogenet Evol 2024; 195:108071. [PMID: 38579933 DOI: 10.1016/j.ympev.2024.108071] [Received: 01/29/2024] [Revised: 03/20/2024] [Accepted: 04/02/2024] [Indexed: 04/07/2024]
Abstract
Phylogenomic analysis based on nucleotide sequences of 398 nuclear gene loci for 67 representatives of the leafhopper genus Neoaliturus yielded well-resolved estimates of relationships among species of the genus. Subgenus Neoaliturus (Neoaliturus) is consistently paraphyletic with respect to Neoaliturus (Circulifer). The analysis revealed the presence of at least ten genetically divergent clades among specimens consistent with the previous morphology-based definition of the leafhopper genus "Circulifer" which includes three previously recognized "species complexes." Specimens of the American beet leafhopper, N. tenellus (Baker), collected from the southwestern USA consistently group with one of these clades, comprising specimens from the eastern Mediterranean. Some of the remaining lineages are consistent with ecological differences previously observed among eastern Mediterranean populations and suggest that N. tenellus, as previously defined, comprises multiple monophyletic species, distinguishable by slight morphological differences.
Affiliation(s)
- Guy Sinaiko
  - School of Zoology, Tel-Aviv University, Tel-Aviv 6997801, Israel
- Yanghui Cao
  - Key Laboratory of Plant Protection Resources and Pest Management of the Ministry of Education, Entomological Museum, Northwest A&F University, Yangling, Shaanxi 712100, China
- Christopher H Dietrich
  - Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL 61820, USA
2
Bernot JP, Owen CL, Wolfe JM, Meland K, Olesen J, Crandall KA. Major Revisions in Pancrustacean Phylogeny and Evidence of Sensitivity to Taxon Sampling. Mol Biol Evol 2023; 40:msad175. [PMID: 37552897 PMCID: PMC10414812 DOI: 10.1093/molbev/msad175] [Received: 10/30/2022] [Revised: 06/14/2023] [Accepted: 06/19/2023] [Indexed: 08/10/2023]
Abstract
The clade Pancrustacea, comprising crustaceans and hexapods, is the most diverse group of animals on earth, containing over 80% of animal species and half of animal biomass. It has been the subject of several recent phylogenomic analyses, yet relationships within Pancrustacea show a notable lack of stability. Here, the phylogeny is estimated with expanded taxon sampling, particularly of malacostracans. We show small changes in taxon sampling have large impacts on phylogenetic estimation. By analyzing identical orthologs between two slightly different taxon sets, we show that the differences in the resulting topologies are due primarily to the effects of taxon sampling on the phylogenetic reconstruction method. We compare trees resulting from our phylogenomic analyses with those from the literature to explore the large tree space of pancrustacean phylogenetic hypotheses and find that statistical topology tests reject the previously published trees in favor of the maximum likelihood trees produced here. Our results reject several clades including Caridoida, Eucarida, Multicrustacea, Vericrustacea, and Syncarida. Notably, we find Copepoda nested within Allotriocarida with high support and recover a novel relationship between decapods, euphausiids, and syncarids that we refer to as the Syneucarida. With denser taxon sampling, we find Stomatopoda sister to this latter clade, which we collectively name Stomatocarida, dividing Malacostraca into three clades: Leptostraca, Peracarida, and Stomatocarida. A new Bayesian divergence time estimation is conducted using 13 vetted fossils. We review our results in the context of other pancrustacean phylogenetic hypotheses and highlight 15 key taxa to sample in future studies.
Affiliation(s)
- James P Bernot
  - Department of Invertebrate Zoology, US National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Christopher L Owen
  - Systematic Entomology Laboratory, USDA-ARS, ℅ National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Joanna M Wolfe
  - Museum of Comparative Zoology and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Kenneth Meland
  - Department of Biology, University of Bergen, Bergen, Norway
- Jørgen Olesen
  - Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- Keith A Crandall
  - Department of Invertebrate Zoology, US National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
  - Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
3
Pardo-De la Hoz CJ, Magain N, Piatkowski B, Cornet L, Dal Forno M, Carbone I, Miadlikowska J, Lutzoni F. Ancient Rapid Radiation Explains Most Conflicts Among Gene Trees and Well-Supported Phylogenomic Trees of Nostocalean Cyanobacteria. Syst Biol 2023; 72:694-712. [PMID: 36827095 DOI: 10.1093/sysbio/syad008] [Received: 03/18/2022] [Revised: 02/12/2023] [Accepted: 02/22/2023] [Indexed: 02/25/2023]
Abstract
Prokaryotic genomes are often considered to be mosaics of genes that do not necessarily share the same evolutionary history due to widespread horizontal gene transfers (HGTs). Consequently, representing evolutionary relationships of prokaryotes as bifurcating trees has long been controversial. However, studies reporting conflicts among gene trees derived from phylogenomic data sets have shown that these conflicts can be the result of artifacts or evolutionary processes other than HGT, such as incomplete lineage sorting, low phylogenetic signal, and systematic errors due to substitution model misspecification. Here, we present the results of an extensive exploration of phylogenetic conflicts in the cyanobacterial order Nostocales, for which previous studies have inferred strongly supported conflicting relationships when using different concatenated phylogenomic data sets. We found that most of these conflicts are concentrated in deep clusters of short internodes of the Nostocales phylogeny, where the great majority of individual genes have low resolving power. We then inferred phylogenetic networks to detect HGT events while also accounting for incomplete lineage sorting. Our results indicate that most conflicts among gene trees are likely due to incomplete lineage sorting linked to an ancient rapid radiation, rather than to HGTs. Moreover, the short internodes of this radiation fit the expectations of the anomaly zone, i.e., a region of the tree parameter space where a species tree is discordant with its most likely gene tree. We demonstrated that concatenation of different sets of loci can recover up to 17 distinct and well-supported relationships within the putative anomaly zone of Nostocales, corresponding to the observed conflicts among well-supported trees based on concatenated data sets from previous studies. 
Our findings highlight the important role of rapid radiations as a potential cause of strongly conflicting phylogenetic relationships when using phylogenomic data sets of bacteria. We propose that polytomies may be the most appropriate phylogenetic representation of these rapid radiations that are part of anomaly zones, especially when all possible genomic markers have been considered to infer these phylogenies. Keywords: anomaly zone; bacteria; horizontal gene transfer; incomplete lineage sorting; Nostocales; phylogenomic conflict; rapid radiation; Rhizonema.
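The anomaly zone mentioned in this abstract can be made concrete for the simplest case. The sketch below is not taken from the paper; it assumes the four-taxon asymmetric species tree boundary usually attributed to Degnan and Rosenberg (2006), where x and y are the lengths (in coalescent units) of the deeper and shallower internal branches. The function names are hypothetical.

```python
import math

def anomaly_zone_limit(x):
    """Boundary a(x) for a four-taxon asymmetric species tree.

    x is the deeper internal branch length in coalescent units (x > 0).
    Assumed form: a(x) = log(2/3 + (3e^{2x} - 2) / (18(e^{3x} - e^{2x}))).
    """
    if x <= 0:
        raise ValueError("x must be positive")
    ratio = (3 * math.exp(2 * x) - 2) / (18 * (math.exp(3 * x) - math.exp(2 * x)))
    return math.log(2 / 3 + ratio)

def in_anomaly_zone(x, y):
    """True if the pair of consecutive internal branches (x, y) lies in the zone."""
    return y < anomaly_zone_limit(x)

# Short consecutive internodes fall inside the zone; longer ones do not.
print(in_anomaly_zone(0.1, 0.05))
print(in_anomaly_zone(0.5, 0.05))
```

Because a(x) drops below zero once x exceeds roughly 0.27 coalescent units, only clusters of very short internodes, such as those produced by a rapid radiation, can satisfy the condition.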
Affiliation(s)
- Nicolas Magain
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
- Bryan Piatkowski
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Luc Cornet
  - Evolution and Conservation Biology, InBioS Research Center, Université de Liège, Liège 4000, Belgium
  - BCCM/IHEM, Mycology and Aerobiology, Sciensano, Brussels, Belgium
- Ignazio Carbone
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC 27606, USA
4
Nunes R, Storer C, Doleck T, Kawahara AY, Pierce NE, Lohman DJ. Predictors of sequence capture in a large-scale anchored phylogenomics project. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.943361] [Indexed: 03/19/2023]
Abstract
Next-generation sequencing (NGS) technologies have revolutionized phylogenomics by decreasing the cost and time required to generate sequence data from multiple markers or whole genomes. Further, the fragmented DNA of biological specimens collected decades ago can be sequenced with NGS, reducing the need for collecting fresh specimens. Sequence capture, also known as anchored hybrid enrichment, is a method to produce reduced representation libraries for NGS. The technique uses single-stranded oligonucleotide probes that hybridize with pre-selected regions of the genome that are sequenced via NGS, culminating in a dataset of numerous orthologous loci from multiple taxa. Phylogenetic analyses using these sequences have the potential to resolve deep and shallow phylogenetic relationships. Identifying the factors that affect sequence capture success could save time, money, and valuable specimens that might be destructively sampled despite low likelihood of sequencing success. We investigated the impacts of specimen age, preservation method, and DNA concentration on sequence capture (number of captured sequences and sequence quality) while accounting for taxonomy and extracted tissue type in a large-scale butterfly phylogenomics project. This project used two probe sets to extract 391 loci or a subset of 13 loci from over 6,000 butterfly specimens. We found that sequence capture is a resilient method capable of amplifying loci in samples of varying age (0–111 years), preservation method (alcohol, papered, pinned), and DNA concentration (0.020–316 ng/μl). Regression analyses demonstrate that sequence capture is positively correlated with DNA concentration. However, sequence capture and DNA concentration are negatively correlated with sample age and preservation method. Our findings suggest that sequence capture projects should prioritize the use of alcohol-preserved samples younger than 20 years old when available. 
In the absence of such specimens, dried samples of any age can yield sequence data, albeit with returns that diminish with increasing age.
5
Su X, Liu T, Liu YP, Harris AJ, Chen JY. Adaptive radiation in Orinus, an endemic alpine grass of the Qinghai-Tibet Plateau, based on comparative transcriptomic analysis. J Plant Physiol 2022; 277:153786. [PMID: 35963042 DOI: 10.1016/j.jplph.2022.153786] [Received: 04/29/2022] [Revised: 07/20/2022] [Accepted: 07/20/2022] [Indexed: 06/15/2023]
Abstract
The species of Orinus (Poaceae) are important alpine plants with a variety of phenotypic traits and potential uses in molecular breeding of drought-tolerant forage crops. However, the genetic basis of evolutionary adaptation and diversification in the genus is still unclear. In the present study, we obtained transcriptomes for the two most divergent species, O. thoroldii and O. kokonoricus, using the Illumina platform and de novo assembly. In total, we generated 23,029 and 24,086 unigenes with N50 values of 1188 and 1203 for O. thoroldii and O. kokonoricus, respectively, and identified 19,005 pairs of putative orthologs between the two species of Orinus. For these orthologs, estimates of non-synonymous/synonymous substitution rate ratios indicated that 568 pairs may be under strong positive selection (Ka/Ks > 1), and Gene Ontology (GO) enrichment analysis revealed that significantly enriched pathways were in DNA repair and resistance to abiotic stress. Meanwhile, the divergence between O. thoroldii and O. kokonoricus was estimated at 3.2 million years ago (Mya), and the recent evolutionary branch is an allotetraploid species, Cleistogenes songorica. We also detected a Ks peak of ∼0.60 for Orinus. Additionally, we identified 188 pairs of differentially expressed genes (DEGs) between the two species of Orinus, which were significantly enriched in stress resistance and lateral root development. Thus, we consider that species diversification and evolutionary adaptation in this genus were initiated by environmental selection, followed by phenotypic differentiation, finally leading to niche separation on the Qinghai-Tibet Plateau.
Affiliation(s)
- Xu Su
  - School of Life Sciences, Qinghai Normal University, Xining, 810008, China; Academy of Plateau Science and Sustainability, Qinghai Normal University, Xining, 810016, China; Key Laboratory of Medicinal Animal and Plant Resources of the Qinghai-Tibet Plateau in Qinghai Province, Qinghai Normal University, Xining, 810008, China; Key Laboratory of Land Surface Processes and Ecological Conservation of the Qinghai-Tibet Plateau, The Ministry of Education, Qinghai Normal University, Xining, 810008, China
- Tao Liu
  - School of Life Sciences, Qinghai Normal University, Xining, 810008, China; School of Geographical Science, Qinghai Normal University, Xining, 810008, China
- Yu Ping Liu
  - School of Life Sciences, Qinghai Normal University, Xining, 810008, China; Key Laboratory of Medicinal Animal and Plant Resources of the Qinghai-Tibet Plateau in Qinghai Province, Qinghai Normal University, Xining, 810008, China
- A J Harris
  - Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Jin Yuan Chen
  - School of Life Sciences, Qinghai Normal University, Xining, 810008, China; Key Laboratory of Medicinal Animal and Plant Resources of the Qinghai-Tibet Plateau in Qinghai Province, Qinghai Normal University, Xining, 810008, China
6
Out of chaos: Phylogenomics of Asian Sonerileae. Mol Phylogenet Evol 2022; 175:107581. [PMID: 35810973 DOI: 10.1016/j.ympev.2022.107581] [Received: 01/05/2022] [Revised: 05/23/2022] [Accepted: 05/26/2022] [Indexed: 11/22/2022]
Abstract
Sonerileae is a diverse Melastomataceae lineage comprising ca. 1000 species in 44 genera, with >70% of genera and species distributed in Asia. Asian Sonerileae are taxonomically intractable with obscure generic circumscriptions. The backbone phylogeny of this group remains poorly resolved, possibly due to complexity caused by rapid species radiation in early and middle Miocene, which hampers further systematic study. Here, we used genome resequencing data to reconstruct the phylogeny of Asian Sonerileae. Three parallel datasets, viz. single-copy ortholog (SCO), genomic SNPs, and whole plastome, were assembled from genome resequencing data of 205 species for this purpose. Based on these genome-scale data, we provided the first well resolved phylogeny of Asian Sonerileae, with 34 major clades identified and 74% of the interclade relationships consistently resolved by both SCO and genomic data. Meanwhile, widespread phylogenetic discordance was detected among SCO gene trees as well as species trees reconstructed using different tree estimation methods (concatenation/site-based coalescent method/summary method) or different datasets (SCO/genomic/plastome). We explored sources of discordance using multiple approaches and found that the observed discordance in Asian Sonerileae was mainly caused by a combination of biased distribution of missing data, random noise from uninformative genes, incomplete lineage sorting, and hybridization/introgression. Exploration of these sources can enable us to generate hypotheses for future testing, which is the first step towards understanding the evolution of Asian Sonerileae. We also detected high levels of homoplasy for some characters traditionally used in taxonomy, which explains current chaotic generic delimitations. The backbone phylogeny of Asian Sonerileae revealed in this study offers a solid basis for future taxonomic revision at the generic level.
7
Abreu EF, Pavan SE, Tsuchiya MTN, McLean BS, Wilson DE, Percequillo AR, Maldonado JE. Old specimens for old branches: Assessing effects of sample age in resolving a rapid Neotropical radiation of squirrels. Mol Phylogenet Evol 2022; 175:107576. [PMID: 35809853 DOI: 10.1016/j.ympev.2022.107576] [Received: 02/21/2022] [Revised: 06/10/2022] [Accepted: 07/01/2022] [Indexed: 11/15/2022]
Abstract
Ultraconserved Elements (UCEs) have been useful to resolve challenging phylogenies of non-model clades, unpuzzling long-conflicted relationships in key branches of the Tree of Life at both deep and shallow levels. UCEs are often reliably recovered from historical samples, unlocking a vast number of preserved natural history specimens for analysis. However, the extent to which sample age and preservation method impact UCE recovery as well as downstream inferences remains unclear. Furthermore, there is an ongoing debate on how to curate, filter, and properly analyze UCE data when locus recovery is uneven across sample age and quality. In the present study we address these questions with an empirical dataset composed of over 3800 UCE loci from 219 historical and modern samples of Sciuridae, a globally distributed and ecologically important family of rodents. We provide a genome-scale phylogeny of two squirrel subfamilies (Sciurillinae and Sciurinae: Sciurini) and investigate their placement within Sciuridae. For historical specimens, recovery of UCE loci and mean length per locus were inversely related to sample age; deeper sequencing improved the number of UCE loci recovered but not locus length. Most of our phylogenetic inferences, performed on six datasets with alternative data-filtering strategies and using three distinct optimality criteria, resulted in distinct topologies. Datasets containing more loci (40% and 50% taxa representativeness matrices) yielded more concordant topologies and higher support values than strictly filtered datasets (60% matrices), particularly with IQ-Tree and SVDquartets, while filtering based on information content provided better topological resolution for inferences with the coalescent gene-tree based approach in ASTRAL-III. 
We resolved deep relationships in Sciuridae (including among the five currently recognized subfamilies) and relationships among the deepest branches of Sciurini, but conflicting relationships remain at both genus- and species-levels for the rapid Neotropical tree squirrel radiation. Our results suggest that phylogenomic consensus can be difficult and heavily influenced by the age of available samples and the filtering steps used to optimize dataset properties.
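The "taxa representativeness" matrices in this abstract can be read as occupancy filters: a locus is kept only if it was recovered from at least a given fraction of the samples. The following is a minimal sketch under that assumption; the function name, locus names, and toy data are hypothetical and are not taken from the study.

```python
def filter_by_occupancy(locus_taxa, n_taxa, min_fraction):
    """Return the sorted names of loci recovered in at least min_fraction of taxa.

    locus_taxa maps a locus name to the set of taxa in which it was recovered.
    """
    return sorted(
        locus
        for locus, taxa in locus_taxa.items()
        if len(taxa) / n_taxa >= min_fraction
    )

# Toy data: three loci recovered from different subsets of five samples.
locus_taxa = {
    "uce-1": {"s1", "s2", "s3", "s4"},
    "uce-2": {"s1", "s2"},
    "uce-3": {"s1", "s2", "s3"},
}

relaxed = filter_by_occupancy(locus_taxa, n_taxa=5, min_fraction=0.35)
strict = filter_by_occupancy(locus_taxa, n_taxa=5, min_fraction=0.70)
print(len(relaxed), len(strict))
```

Raising the threshold shrinks the matrix, which is the trade-off the abstract describes between the number of loci retained and the completeness of each locus.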
Affiliation(s)
- Edson F Abreu
  - Laboratório de Mamíferos, Departamento de Ciências Biológicas, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brazil; Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Silvia E Pavan
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Mirian T N Tsuchiya
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA; Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Bryan S McLean
  - Department of Biology, University of North Carolina Greensboro, Greensboro, NC, USA
- Don E Wilson
  - Division of Mammals, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Alexandre R Percequillo
  - Laboratório de Mamíferos, Departamento de Ciências Biológicas, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brazil
- Jesús E Maldonado
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
8
Gatesy J, Springer MS. Phylogenomic Coalescent Analyses of Avian Retroelements Infer Zero-Length Branches at the Base of Neoaves, Emergent Support for Controversial Clades, and Ancient Introgressive Hybridization in Afroaves. Genes (Basel) 2022; 13:1167. [PMID: 35885951 PMCID: PMC9324441 DOI: 10.3390/genes13071167] [Received: 04/23/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 01/25/2023]
Abstract
Retroelement insertions (RIs) are low-homoplasy characters that are ideal data for addressing deep evolutionary radiations, where gene tree reconstruction errors can severely hinder phylogenetic inference with DNA and protein sequence data. Phylogenomic studies of Neoaves, a large clade of birds (>9000 species) that first diversified near the Cretaceous-Paleogene boundary, have yielded an array of robustly supported, contradictory relationships among deep lineages. Here, we reanalyzed a large RI matrix for birds using recently proposed quartet-based coalescent methods that enable inference of large species trees including branch lengths in coalescent units, clade support, statistical tests for gene flow, and combined analysis with DNA-sequence-based gene trees. Genome-scale coalescent analyses revealed extremely short branches at the base of Neoaves, meager branch support, and limited congruence with previous work at the most challenging nodes. Despite widespread topological conflicts with DNA-sequence-based trees, combined analyses of RIs with thousands of gene trees show emergent support for multiple higher-level clades (Columbea, Passerea, Columbimorphae, Otidimorphae, Phaethoquornithes). RIs express asymmetrical support for deep relationships within the subclade Afroaves that hints at ancient gene flow involving the owl lineage (Strigiformes). Because DNA-sequence data are challenged by gene tree reconstruction error, analysis of RIs represents one approach for improving gene tree-based methods when divergences are deep, internodes are short, terminal branches are long, and introgressive hybridization further confounds species-tree inference.
Affiliation(s)
- John Gatesy
  - Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Mark S. Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA
9
Abstract
Motivation: Phylogenomics faces a dilemma: on the one hand, the most accurate species and gene tree estimation methods are those that co-estimate them; on the other hand, these co-estimation methods do not scale to moderately large numbers of species. The summary-based methods, which first infer gene trees independently and then combine them, are much more scalable but are prone to gene tree estimation error, which is inevitable when inferring trees from limited-length data. Gene tree estimation error is not just random noise and can create biases such as long-branch attraction. Results: We introduce a scalable likelihood-based approach to co-estimation under the multi-species coalescent model. The method, called quartet co-estimation (QuCo), takes as input independently inferred distributions over gene trees and computes the most likely species tree topology and internal branch length for each quartet, marginalizing over gene tree topologies and ignoring branch lengths by making several simplifying assumptions. It then updates the gene tree posterior probabilities based on the species tree. The focus on gene tree topologies and the heuristic division into quartets enable fast likelihood calculations. We benchmark our method with extensive simulations for quartet trees in zones known to produce biased species trees and further with larger trees. We also run QuCo on a biological dataset of bees. Our results show better accuracy than the summary-based approach ASTRAL run on estimated gene trees. Availability and implementation: QuCo is available at https://github.com/maryamrabiee/quco. Supplementary information: Supplementary data are available at Bioinformatics online.
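The per-quartet likelihood idea can be illustrated with the standard multi-species coalescent result for an unrooted quartet: if the internal branch has length d in coalescent units, a gene tree matches the species tree topology with probability 1 - (2/3)e^(-d), and each of the two alternative topologies has probability (1/3)e^(-d). The sketch below is not QuCo's implementation; it takes plain topology counts instead of gene tree distributions, and its function names are hypothetical.

```python
import math

def quartet_topology_probs(d):
    """Probabilities (matching, alternative 1, alternative 2) of the three
    unrooted quartet gene tree topologies for internal branch length d."""
    minor = math.exp(-d) / 3.0
    return (1.0 - 2.0 * minor, minor, minor)

def ml_quartet(counts):
    """Maximum-likelihood quartet topology and internal branch length.

    counts maps each of the three topology labels to the number of gene trees
    showing it. The most frequent topology maximizes the likelihood, and d
    follows from inverting p = 1 - (2/3)e^(-d).
    """
    total = sum(counts.values())
    best = max(counts, key=counts.get)
    p = counts[best] / total
    if p <= 1.0 / 3.0:
        d_hat = 0.0          # no signal beyond a star tree
    elif p >= 1.0:
        d_hat = math.inf     # no discordance observed
    else:
        d_hat = -math.log(1.5 * (1.0 - p))
    return best, d_hat

best, d_hat = ml_quartet({"AB|CD": 60, "AC|BD": 20, "AD|BC": 20})
print(best, round(d_hat, 3))
```

A quartet whose dominant topology barely exceeds one third of the gene trees therefore implies an internal branch close to zero length, which is the regime where gene tree estimation error matters most.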
Affiliation(s)
- Maryam Rabiee
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
10
Amaral DT, Romeiro-Brito M, Bonatelli IAS. Exploring Phylogenetic Relationships and Divergence Times of Bioluminescent Species Using Genomic and Transcriptomic Data. Methods Mol Biol 2022; 2525:409-423. [PMID: 35836087 DOI: 10.1007/978-1-0716-2473-9_32] [Indexed: 06/15/2023]
Abstract
Next-generation sequencing (NGS) has dominated the scene of genomics and evolutionary biology as a great amount of genomic data has been accumulated for a diverse set of species. At the same time, phylogenetic approaches and programs are in development to allow better use of such large datasets. Phylogenomics appears as a promising field to accommodate and explore all the information of NGS data in phylogenetic methods, being an important approach to investigate the evolution of bioluminescence in different organisms. To guarantee accurate results in phylogenomic studies, it is mandatory to correctly identify orthologous genes in phylogenetic reconstruction. Here, we show a simplified step-by-step framework to perform phylogenetic analysis along with divergence time estimation, beginning with an ortholog search. As empirical data, we use transcriptome sequences of six species of the Elateroidea superfamily (Coleoptera). We introduce several bioinformatics tools for handling genomic data, especially those available in the software OrthoFinder, IQ-TREE, BEAST2, and TreePL.
Affiliation(s)
- Danilo T Amaral
  - Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
  - Programa de Pós Graduação em Biologia Comparada, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo (USP), Ribeirão Preto, Brazil
- Monique Romeiro-Brito
  - Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
- Isabel A S Bonatelli
  - Departamento de Ecologia e Biologia Evolutiva, Universidade Federal de São Paulo (UNIFESP), Diadema, São Paulo, Brazil
11
Mahbub M, Wahab Z, Reaz R, Rahman MS, Bayzid MS. wQFM: Highly Accurate Genome-scale Species Tree Estimation from Weighted Quartets. Bioinformatics 2021; 37:3734-3743. [PMID: 34086858 DOI: 10.1093/bioinformatics/btab428] [Received: 11/30/2020] [Revised: 05/24/2021] [Accepted: 06/03/2021] [Indexed: 02/01/2023]
Abstract
MOTIVATION: Species tree estimation from genes sampled from throughout the whole genome is complicated by gene tree-species tree discordance. Incomplete lineage sorting (ILS) is one of the most frequent causes of this discordance, where alleles can coexist in populations for periods that may span several speciation events. Quartet-based summary methods for estimating species trees from a collection of gene trees are becoming popular due to their high accuracy and statistical guarantee under ILS. Generating quartets with appropriate weights, where weights correspond to the relative importance of quartets, and subsequently amalgamating the weighted quartets to infer a single coherent species tree can allow for a statistically consistent way of estimating species trees. However, handling weighted quartets is challenging. RESULTS: We propose wQFM, a highly accurate method for species tree estimation from multi-locus data, by extending the quartet FM (QFM) algorithm to a weighted setting. wQFM was assessed on a collection of simulated and real biological datasets, including the avian phylogenomic dataset, which is one of the largest phylogenomic datasets to date. We compared wQFM with wQMC, which is the best alternative method for weighted quartet amalgamation, and with ASTRAL, which is one of the most accurate and widely used coalescent-based species tree estimation methods. Our results suggest that wQFM matches or improves upon the accuracy of wQMC and ASTRAL. AVAILABILITY: wQFM is available in open source form at https://github.com/Mahim1997/wQFM-2020. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
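One common way to obtain weighted quartets is to count how many gene trees induce each quartet topology. The sketch below shows only that weighting step under this assumption; it does not implement wQFM's amalgamation. Each gene tree is represented, hypothetically, as a taxon set plus one side of each non-trivial bipartition, and all function names are illustrative.

```python
from collections import Counter
from itertools import combinations

def quartet_label(pair1, pair2):
    """Canonical text label such as 'A,B|C,D' for the quartet pair1 | pair2."""
    left = ",".join(sorted(pair1))
    right = ",".join(sorted(pair2))
    return "|".join(sorted([left, right]))

def induced_quartet(splits, four):
    """Label of the quartet topology a tree induces on four taxa, or None if
    no bipartition of the tree separates them two against two."""
    a, b, c, d = four
    candidates = (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c)))
    for pair1, pair2 in candidates:
        for side in splits:
            in1 = [t in side for t in pair1]
            in2 = [t in side for t in pair2]
            if (all(in1) and not any(in2)) or (all(in2) and not any(in1)):
                return quartet_label(pair1, pair2)
    return None

def weighted_quartets(gene_trees):
    """Weight of each quartet topology = number of gene trees inducing it."""
    weights = Counter()
    for taxa, splits in gene_trees:
        for four in combinations(sorted(taxa), 4):
            label = induced_quartet(splits, four)
            if label is not None:
                weights[label] += 1
    return weights

# Two toy gene trees on five taxa: ((A,B),C,(D,E)) and ((A,C),B,(D,E)).
taxa = {"A", "B", "C", "D", "E"}
gene_trees = [
    (taxa, [frozenset({"A", "B"}), frozenset({"D", "E"})]),
    (taxa, [frozenset({"A", "C"}), frozenset({"D", "E"})]),
]
weights = weighted_quartets(gene_trees)
print(weights.most_common(3))
```

Quartets on which the gene trees agree receive higher weights, which is the information a weighted amalgamation method can exploit and an unweighted one discards.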
Affiliation(s)
- Mahim Mahbub
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Zahin Wahab
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Rezwana Reaz
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- M Saifur Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
12
Farah IT, Islam MM, Zinat KT, Rahman AH, Bayzid MS. Species tree estimation from gene trees by minimizing deep coalescence and maximizing quartet consistency: a comparative study and the presence of pseudo species tree terraces. Syst Biol 2021; 70:1213-1231. [PMID: 33844023 DOI: 10.1093/sysbio/syab026] [Received: 03/31/2020] [Revised: 03/25/2021] [Accepted: 03/29/2021] [Indexed: 11/14/2022]
Abstract
Species tree estimation from multi-locus datasets is extremely challenging, especially in the presence of gene tree heterogeneity across the genome due to incomplete lineage sorting (ILS). Summary methods have been developed which estimate gene trees and then combine the gene trees to estimate a species tree by optimizing various optimality scores. In this study, we have extended and adapted the concept of phylogenetic terraces to species tree estimation by "summarizing" a set of gene trees, where multiple species trees with distinct topologies may have exactly the same optimality score (i.e., quartet score, extra lineage score, etc.). We particularly investigated the presence and impacts of equally optimal trees in species tree estimation from multi-locus data using summary methods by taking ILS into account. We analyzed two of the most popular ILS-aware optimization criteria: maximize quartet consistency (MQC) and minimize deep coalescence (MDC). Methods based on MQC are provably statistically consistent, whereas MDC is not a consistent criterion for species tree estimation. We present a comprehensive comparative study of these two optimality criteria. Our experiments, on a collection of datasets simulated under ILS, indicate that MDC may result in a quartet consistency score competitive with or identical to that of MQC, but could be significantly worse than MQC in terms of tree accuracy, demonstrating the presence and impacts of equally optimal species trees. This is the first known study that provides the conditions for the datasets to have equally optimal trees in the context of phylogenomic inference using summary methods.
Affiliation(s)
- Ishrat Tanzila Farah
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Md Muktadirul Islam
  - Applied Statistics and Data Science (ASDS), Department of Statistics, Jahangirnagar University, Dhaka-1342, Bangladesh
- Kazi Tasnim Zinat
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
  - Department of Computer Science, University of Maryland, College Park, Maryland, USA
- Atif Hasan Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
|
13
|
Barba-Montoya J, Tao Q, Kumar S. Using a GTR+Γ substitution model for dating sequence divergence when stationarity and time-reversibility assumptions are violated. Bioinformatics 2021; 36:i884-i894. [PMID: 33381826 DOI: 10.1093/bioinformatics/btaa820] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 09/07/2020] [Indexed: 11/15/2022]
Abstract
MOTIVATION: As the number and diversity of species and genes grow in contemporary datasets, two common assumptions made in all molecular dating methods, namely the time-reversibility and stationarity of the substitution process, become untenable. No software tools for molecular dating allow researchers to relax these two assumptions in their data analyses. Frequently, the same General Time Reversible (GTR) model across lineages, along with gamma-distributed (+Γ) rates across sites, is used in relaxed clock analyses, which assumes time-reversibility and stationarity of the substitution process. Many reports have quantified the impact of violations of these underlying assumptions on molecular phylogeny, but none have systematically analyzed their impact on divergence time estimates.
RESULTS: We quantified the bias on time estimates that resulted from using the GTR + Γ model for the analysis of computer-simulated nucleotide sequence alignments that were evolved with non-stationary (NS) and non-reversible (NR) substitution models. We tested Bayesian and RelTime approaches that do not require a molecular clock for estimating divergence times. Divergence times obtained using a GTR + Γ model differed only slightly (∼3% on average) from the expected times for NR datasets, but the difference was larger for NS datasets (∼10% on average). The use of only a few calibrations reduced these biases considerably (∼5%). Confidence and credibility intervals from GTR + Γ analysis usually contained correct times. Therefore, the bias introduced by the use of the GTR + Γ model to analyze datasets, in which the time-reversibility and stationarity assumptions are violated, is likely not large and can be reduced by applying multiple calibrations.
AVAILABILITY AND IMPLEMENTATION: All datasets are deposited in Figshare: https://doi.org/10.6084/m9.figshare.12594638.
Affiliation(s)
- Jose Barba-Montoya
  - Institute for Genomics and Evolutionary Medicine
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Qiqing Tao
  - Institute for Genomics and Evolutionary Medicine
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
- Sudhir Kumar
  - Institute for Genomics and Evolutionary Medicine
  - Department of Biology, Temple University, Philadelphia, PA 19122, USA
  - Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
|
14
|
Rabiee M, Mirarab S. SODA: Multi-locus species delimitation using quartet frequencies. Bioinformatics 2021; 36:5623-5631. [PMID: 33555318 DOI: 10.1093/bioinformatics/btaa1010] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/18/2020] [Revised: 10/19/2020] [Accepted: 11/21/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Species delimitation, the process of deciding how to group a set of organisms into units called species, is one of the most challenging problems in evolutionary computational biology. While many methods exist for species delimitation, most based on the coalescent theory, few are scalable to very large datasets, and methods that scale tend not to be accurate. Species delimitation is closely related to species tree inference from discordant gene trees, a problem that has enjoyed rapid advances in recent years.
RESULTS: In this paper, we build on the accuracy and scalability of recent quartet-based methods for species tree estimation and propose a new method called SODA for species delimitation. SODA relies heavily on a recently developed method for testing zero branch length in species trees. In extensive simulations, we show that SODA can easily scale to very large datasets while maintaining high accuracy.
AVAILABILITY: The code and data presented here are available on https://github.com/maryamrabiee/SODA.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Maryam Rabiee
  - Computer Science and Engineering, University of California, San Diego, US
- Siavash Mirarab
  - Electrical and Computer Engineering, University of California, San Diego, US
|
15
|
Collapsing dubiously resolved gene-tree branches in phylogenomic coalescent analyses. Mol Phylogenet Evol 2021; 158:107092. [PMID: 33545272 DOI: 10.1016/j.ympev.2021.107092] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/10/2020] [Revised: 12/30/2020] [Accepted: 01/28/2021] [Indexed: 01/15/2023]
Abstract
In two-step coalescent analyses of phylogenomic data, gene-tree topologies are treated as fixed prior to species-tree inference. Although all gene-tree conflict is assumed to be caused by lineage sorting when applying these methods, in empirical datasets much of the conflict can be caused by estimation error. Weakly supported and even arbitrarily resolved clades are important sources of this estimation error for gene trees inferred from few informative characters relative to the number of sampled terminals, and the resulting extraneous conflict among gene trees can negatively impact species-tree inference. In this study, we quantified the relative severity of alternative methods for collapsing gene-tree branches for seven empirical datasets and quantified their effects on species-tree inference. The branch-collapsing methods that we employed were based on the strict consensus of optimal topologies, various bootstrap thresholds, and 0% approximate likelihood ratio test (SH-like aLRT) support. Up to 86% of internal gene-tree branches are dubiously or arbitrarily resolved in reanalyses of these published phylogenomic datasets, and collapsing these branches increased inferred species-tree coalescent branch lengths by up to 455%. For two datasets, the longer inferred branch lengths sometimes impacted inference of anomaly-zone conditions. Although branch-collapsing methods did not consistently affect the species-tree topology, they often increased branch support. The more severe and clearly justified gene-tree branch-collapsing methods, which we recommend be broadly applied for two-step coalescent analyses, are use of the strict consensus in parsimony analyses and the collapse of clades with 0% SH-like aLRT support in likelihood analyses. Collapsing dubiously or arbitrarily resolved branches in gene trees sometimes improved congruence between coalescent-based results and concatenation trees. In such cases, we contend that the resolution provided by concatenation should be preferred and that incomplete lineage sorting is a poor explanation for the initial conflict between phylogenetic approaches.
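Threshold-based branch collapsing, as described in the abstract above, can be sketched in a few lines. This is an illustrative sketch only, not the pipeline used in the study: it assumes a hypothetical tree encoding in which a leaf is a string and an internal node is a `(support, children)` pair, with `None` as the support of the root, and the function name `collapse` is made up for this example.

```python
def collapse(node, threshold):
    """Collapse every internal branch whose support is <= threshold.

    A collapsed node is removed and its children are attached directly
    to its parent, turning the parent into a polytomy.
    """
    if isinstance(node, str):  # leaves are returned unchanged
        return node
    support, children = node
    new_children = []
    for child in children:
        child = collapse(child, threshold)
        is_internal = not isinstance(child, str)
        if is_internal and child[0] is not None and child[0] <= threshold:
            new_children.extend(child[1])  # splice grandchildren into this node
        else:
            new_children.append(child)
    return (support, new_children)


# Toy gene tree with two clades at 0% support (values are invented).
tree = (None, [(0.0, ["A", "B"]), (95.0, ["C", (0.0, ["D", "E"])])])
collapsed = collapse(tree, 0.0)
print(collapsed)
```

With a threshold of 0, only the clades with 0% support are dissolved, and the clade with 95% support is kept but becomes a three-way polytomy.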
|
16
|
Uckele KA, Adams RP, Schwarzbach AE, Parchman TL. Genome-wide RAD sequencing resolves the evolutionary history of serrate leaf Juniperus and reveals discordance with chloroplast phylogeny. Mol Phylogenet Evol 2020; 156:107022. [PMID: 33242585 DOI: 10.1016/j.ympev.2020.107022] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/11/2020] [Revised: 10/06/2020] [Accepted: 11/17/2020] [Indexed: 12/22/2022]
Abstract
Juniper (Juniperus) is an ecologically important conifer genus of the Northern Hemisphere, the members of which are often foundational tree species of arid regions. The serrate leaf margin clade is native to topologically variable regions in North America, where hybridization has likely played a prominent role in their diversification. Here we use a reduced-representation sequencing approach (ddRADseq) to generate a phylogenomic data set for 68 accessions representing all 22 species in the serrate leaf margin clade, as well as a number of close and distant relatives, to improve understanding of diversification in this group. Phylogenetic analyses using three methods (SVDquartets, maximum likelihood, and Bayesian) yielded highly congruent and well-resolved topologies. These phylogenies provided improved resolution relative to past analyses based on Sanger sequencing of nuclear and chloroplast DNA, and were largely consistent with taxonomic expectations based on geography and morphology. Calibration of a Bayesian phylogeny with fossil evidence produced divergence time estimates for the clade consistent with a late Oligocene origin in North America, followed by a period of elevated diversification between 12 and 5 Mya. Comparison of the ddRADseq phylogenies with a phylogeny based on Sanger-sequenced chloroplast DNA revealed five instances of pronounced discordance, illustrating the potential for chloroplast introgression, chloroplast transfer, or incomplete lineage sorting to influence organellar phylogeny. Our results improve understanding of the pattern and tempo of diversification in Juniperus, and highlight the utility of reduced-representation sequencing for resolving phylogenetic relationships in non-model organisms with reticulation and recent divergence.
Affiliation(s)
- Kathryn A Uckele
  - Department of Biology, MS 314, University of Nevada, Reno, Max Fleischmann Agriculture Building, 1664 N Virginia St., Reno, NV 89557, USA
- Robert P Adams
  - Baylor University, Utah Lab, 201 N 5500 W, Hurricane, UT 84790, USA
- Andrea E Schwarzbach
  - Department of Health and Biomedical Sciences, University of Texas - Rio Grande Valley, 1 W University Drive, Brownsville, TX 78520, USA
- Thomas L Parchman
  - Department of Biology, MS 314, University of Nevada, Reno, Max Fleischmann Agriculture Building, 1664 N Virginia St., Reno, NV 89557, USA
|
17
|
Paris DH, Kelly DJ, Fuerst PA, Day NPJ, Richards AL. A Brief History of the Major Rickettsioses in the Asia-Australia-Pacific Region: A Capstone Review for the Special Issue of TMID. Trop Med Infect Dis 2020; 5:tropicalmed5040165. [PMID: 33121158 PMCID: PMC7709643 DOI: 10.3390/tropicalmed5040165] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/12/2020] [Revised: 10/15/2020] [Accepted: 10/16/2020] [Indexed: 12/19/2022]
Abstract
The rickettsioses of the "Far East" or Asia-Australia-Pacific region include but are not limited to endemic typhus, scrub typhus, and more recently, tick typhus or spotted fever. These diseases embody the diversity of rickettsial disease worldwide and allow us to interconnect the various contributions to this special issue of Tropical Medicine and Infectious Disease. The impact of rickettsial diseases, particularly of scrub typhus, was substantial during the wars and "police actions" of the last 80 years. However, the post-World War II arrival of effective antibiotics reduced their impact, when recognized and adequately treated (chloramphenicol and tetracyclines). Presently, however, scrub typhus appears to be emerging and spreading into regions not previously reported. Better diagnostics, or higher population mobility, change in antimicrobial policies, even global warming, have been proposed as possible culprits of this phenomenon. Further, sporadic reports of possible antibiotic resistance have received the attention of clinicians and epidemiologists, raising interest in developing and testing novel diagnostics to facilitate medical diagnosis. We present a brief history of rickettsial diseases, their relative importance within the region, focusing on the so-called "tsutsugamushi triangle", the past and present impact of these diseases within the region, and indicate how historically, these often-confused diseases were ingeniously distinguished from one another. Moreover, we will discuss the importance of DNA-sequencing efforts for Orientia tsutsugamushi, obtained from patient blood, vector chiggers, and rodent reservoirs, particularly for the dominant 56-kD type-specific antigen gene (tsa56), and whole-genome sequences, which are increasing our knowledge of the diversity of this unique agent. We explore and discuss the potential of sequencing and other effective tools to geographically trace rickettsial disease agents, and develop control strategies to better mitigate the rickettsioses.
Affiliation(s)
- Daniel H. Paris
  - Department of Medicine, Swiss Tropical and Public Health Institute, 4051 Basel, Switzerland
  - Department of Clinical Research, University of Basel, 4051 Basel, Switzerland
  - Correspondence: Tel.: +41-61-284-8111
- Daryl J. Kelly
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Paul A. Fuerst
  - Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA
- Nicholas P. J. Day
  - Mahidol-Oxford Tropical Medicine Research Programme, Faculty of Tropical Medicine, Mahidol University, 420/6 Rajvithee Road, Bangkok 10400, Thailand
  - Center for Tropical Medicine, Nuffield Department of Clinical Medicine, Churchill Hospital, Old Road, Headington, Oxford OX3 7LJ, UK
- Allen L. Richards
  - Department of Preventive Medicine and Biostatistics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
|
18
|
Phylogenomic analysis of trichomycterid catfishes (Teleostei: Siluriformes) inferred from ultraconserved elements. Sci Rep 2020; 10:2697. [PMID: 32060350 PMCID: PMC7021825 DOI: 10.1038/s41598-020-59519-w] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 04/17/2019] [Accepted: 01/28/2020] [Indexed: 11/22/2022]
Abstract
The family Trichomycteridae is one of the most diverse groups of freshwater catfishes in South and Central America with eight subfamilies, 41 genera and more than 300 valid species. Its members are widely distributed throughout South America, reaching Costa Rica in Central America and are recognized by extraordinary anatomical specializations and trophic diversity. In order to assess the phylogenetic relationships of Trichomycteridae, we collected sequence data from ultraconserved elements (UCEs) of the genome from 141 specimens of Trichomycteridae and 12 outgroup species. We used a concatenated matrix to assess the phylogenetic relationships by Bayesian inference (BI) and maximum likelihood (ML) searches and a coalescent analysis of species trees. The results show a highly resolved phylogeny with broad agreement among the three distinct analyses, providing overwhelming support for the monophyletic status of subfamily Trichomycterinae including Ituglanis and Scleronema. Previous relationship hypotheses among subfamilies are strongly corroborated, such as the sister relationship between Copionodontinae and Trichogeninae forming a sister clade to the remaining trichomycterids and the intrafamilial clade TSVSG (Tridentinae-Stegophilinae-Vandelliinae-Sarcoglanidinae-Glanapteryginae). Monophyly of Glanapteryginae and Sarcoglanidinae was not supported and the enigmatic Potamoglanis is placed outside Tridentinae.
|
19
|
Springer MS, Molloy EK, Sloan DB, Simmons MP, Gatesy J. ILS-Aware Analysis of Low-Homoplasy Retroelement Insertions: Inference of Species Trees and Introgression Using Quartets. J Hered 2019; 111:147-168. [DOI: 10.1093/jhered/esz076] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/13/2019] [Accepted: 12/12/2019] [Indexed: 12/20/2022]
Abstract
DNA sequence alignments have provided the majority of data for inferring phylogenetic relationships with both concatenation and coalescent methods. However, DNA sequences are susceptible to extensive homoplasy, especially for deep divergences in the Tree of Life. Retroelement insertions have emerged as a powerful alternative to sequences for deciphering evolutionary relationships because these data are nearly homoplasy-free. In addition, retroelement insertions satisfy the “no intralocus-recombination” assumption of summary coalescent methods because they are singular events and better approximate neutrality relative to DNA loci commonly sampled in phylogenomic studies. Retroelements have traditionally been analyzed with parsimony, distance, and network methods. Here, we analyze retroelement data sets for vertebrate clades (Placentalia, Laurasiatheria, Balaenopteroidea, Palaeognathae) with 2 ILS-aware methods that operate by extracting, weighting, and then assembling unrooted quartets into a species tree. The first approach constructs a species tree from retroelement bipartitions with ASTRAL, and the second method is based on split-decomposition with parsimony. We also develop a Quartet-Asymmetry test to detect hybridization using retroelements. Both ILS-aware methods recovered the same species-tree topology for each data set. The ASTRAL species trees for Laurasiatheria have consecutive short branch lengths in the anomaly zone whereas Palaeognathae is outside of this zone. For the Balaenopteroidea data set, which includes rorquals (Balaenopteridae) and gray whale (Eschrichtiidae), both ILS-aware methods resolved balaenopterids as paraphyletic. Application of the Quartet-Asymmetry test to this data set detected 19 different quartets of species for which historical introgression may be inferred. Evidence for introgression was not detected in the other data sets.
Affiliation(s)
- Mark S Springer
  - Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA
- Erin K Molloy
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
- Daniel B Sloan
  - Department of Biology, Colorado State University, Fort Collins, CO
- Mark P Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO
- John Gatesy
  - Division of Vertebrate Zoology and Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY
|
20
|
Gatesy J, Sloan DB, Warren JM, Baker RH, Simmons MP, Springer MS. Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts. Mol Phylogenet Evol 2019; 139:106539. [DOI: 10.1016/j.ympev.2019.106539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/03/2018] [Revised: 06/10/2019] [Accepted: 06/17/2019] [Indexed: 12/26/2022]
|
21
|
Widhelm TJ, Grewe F, Huang JP, Mercado-Díaz JA, Goffinet B, Lücking R, Moncada B, Mason-Gamer R, Lumbsch HT. Multiple historical processes obscure phylogenetic relationships in a taxonomically difficult group (Lobariaceae, Ascomycota). Sci Rep 2019; 9:8968. [PMID: 31222061 PMCID: PMC6586878 DOI: 10.1038/s41598-019-45455-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 08/05/2018] [Accepted: 06/03/2019] [Indexed: 12/19/2022]
Abstract
In the age of next-generation sequencing, the number of loci available for phylogenetic analyses has increased by orders of magnitude. But despite this dramatic increase in the amount of data, some phylogenomic studies have revealed rampant gene-tree discordance that can be caused by many historical processes, such as rapid diversification, gene duplication, or reticulate evolution. We used a target enrichment approach to sample 400 single-copy nuclear genes and estimate the phylogenetic relationships of 13 genera in the lichen-forming family Lobariaceae to address the effect of data type (nucleotides and amino acids) and phylogenetic reconstruction method (concatenation and species tree approaches). Furthermore, we examined datasets for evidence of historical processes, such as rapid diversification and reticulate evolution. We found incongruence associated with sequence data types (nucleotide vs. amino acid sequences) and with different methods of phylogenetic reconstruction (species tree vs. concatenation). The resulting phylogenetic trees provided evidence for rapid and reticulate evolution based on extremely short branches in the backbone of the phylogenies. The observed rapid and reticulate diversifications may explain conflicts among gene trees and the challenges to resolving evolutionary relationships. Based on divergence times, the diversification at the backbone occurred near the Cretaceous-Paleogene (K-Pg) boundary (65 Mya) which is consistent with other rapid diversifications in the tree of life. Although some phylogenetic relationships within the Lobariaceae family remain with low support, even with our powerful phylogenomic dataset of up to 376 genes, our use of target-capturing data allowed for the novel exploration of the mechanisms underlying phylogenetic and systematic incongruence.
Affiliation(s)
- Todd J Widhelm
  - Field Museum, Science and Education, Chicago, 60605, USA
  - University of Illinois at Chicago, Biological Sciences, Chicago, 60607, USA
- Felix Grewe
  - Field Museum, Grainger Bioinformatics Center, Chicago, 60605, USA
- Jen-Pan Huang
  - Field Museum, Science and Education, Chicago, 60605, USA
  - Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Bernard Goffinet
  - University of Connecticut, Ecology and Evolutionary Biology, Storrs, 06268, USA
- Robert Lücking
  - Botanischer Garten und Botanisches Museum, Herbarium, Berlin, 14195, Germany
- Bibiana Moncada
  - Universidad Distrital Francisco José de Caldas, Torre de Laboratorios, Herbario, Bogotá, 11021, Colombia
|
22
|
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019; 7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022]
Abstract
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Affiliation(s)
- Gustavo A. Bravo
  - Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Alexandre Antonelli
  - Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
  - Gothenburg Global Biodiversity Centre, Göteborg, Sweden
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
  - Gothenburg Botanical Garden, Göteborg, Sweden
- Christine D. Bacon
  - Gothenburg Global Biodiversity Centre, Göteborg, Sweden
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Krzysztof Bartoszek
  - Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Mozes P. K. Blom
  - Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Stella Huynh
  - Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
- Graham Jones
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- L. Lacey Knowles
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Sangeet Lamichhaney
  - Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Thomas Marcussen
  - Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Hélène Morlon
  - Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
- Luay K. Nakhleh
  - Department of Computer Science, Rice University, Houston, TX, USA
- Bengt Oxelman
  - Gothenburg Global Biodiversity Centre, Göteborg, Sweden
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Bernard Pfeil
  - Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Alexander Schliep
  - Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Fernanda P. Werneck
  - Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
- John Wiedenhoeft
  - Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
  - Department of Computer Science, Rutgers University, Piscataway, NJ, USA
- Sandi Willows-Munro
  - School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
- Scott V. Edwards
  - Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
  - Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
|
23
|
Devitt TJ, Wright AM, Cannatella DC, Hillis DM. Species delimitation in endangered groundwater salamanders: Implications for aquifer management and biodiversity conservation. Proc Natl Acad Sci U S A 2019; 116:2624-2633. [PMID: 30642970 PMCID: PMC6377464 DOI: 10.1073/pnas.1815014116] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Indexed: 11/18/2022]
Abstract
Groundwater-dependent species are among the least-known components of global biodiversity, as well as some of the most vulnerable because of rapid groundwater depletion at regional and global scales. The karstic Edwards-Trinity aquifer system of west-central Texas is one of the most species-rich groundwater systems in the world, represented by dozens of endemic groundwater-obligate species with narrow, naturally fragmented distributions. Here, we examine how geomorphological and hydrogeological processes have driven population divergence and speciation in a radiation of salamanders (Eurycea) endemic to the Edwards-Trinity system using phylogenetic and population genetic analysis of genome-wide DNA sequence data. Results revealed complex patterns of isolation and reconnection driven by surface and subsurface hydrology, resulting in both adaptive and nonadaptive population divergence and speciation. Our results uncover cryptic species diversity and refine the borders of several threatened and endangered species. The US Endangered Species Act has been used to bring state regulation to unrestricted groundwater withdrawals in the Edwards (Balcones Fault Zone) Aquifer, where listed species are found. However, the Trinity and Edwards-Trinity (Plateau) aquifers harbor additional species with similarly small ranges that currently receive no protection from regulatory programs designed to prevent groundwater depletion. Based on regional climate models that predict increased air temperature, together with hydrologic models that project decreased springflow, we conclude that Edwards-Trinity salamanders and other codistributed groundwater-dependent organisms are highly vulnerable to extinction within the next century.
Affiliation(s)
- Thomas J Devitt
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX 78712
  - Biodiversity Center, The University of Texas at Austin, Austin, TX 78712
- April M Wright
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX 78712
  - Biodiversity Center, The University of Texas at Austin, Austin, TX 78712
- David C Cannatella
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX 78712
  - Biodiversity Center, The University of Texas at Austin, Austin, TX 78712
- David M Hillis
  - Department of Integrative Biology, The University of Texas at Austin, Austin, TX 78712
  - Biodiversity Center, The University of Texas at Austin, Austin, TX 78712
|
24
|
Liu L, Anderson C, Pearl D, Edwards SV. Modern Phylogenomics: Building Phylogenetic Trees Using the Multispecies Coalescent Model. Methods Mol Biol 2019; 1910:211-239. [PMID: 31278666 DOI: 10.1007/978-1-4939-9074-0_7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 06/09/2023]
Abstract
The multispecies coalescent (MSC) model provides a compelling framework for building phylogenetic trees from multilocus DNA sequence data. The pure MSC is best thought of as a special case of so-called "multispecies network coalescent" models, in which gene flow is allowed among branches of the tree, whereas MSC methods assume there is no gene flow between diverging species. Early implementations of the MSC, such as "parsimony" or "democratic vote" approaches to combining information from multiple gene trees, as well as concatenation, in which DNA sequences from multiple gene trees are combined into a single "supergene," were quickly shown to be inconsistent in some regions of tree space, in so far as they converged on the incorrect species tree as more gene trees and sequence data were accumulated. The anomaly zone, a region of tree space in which the most frequent gene tree is different from the species tree, is one such region where many so-called "coalescent" methods are inconsistent. Second-generation implementations of the MSC employed Bayesian or likelihood models; these are consistent in all regions of gene tree space, but Bayesian methods in particular are incapable of handling the large phylogenomic data sets currently available. Two-step methods, such as MP-EST and ASTRAL, in which gene trees are first estimated and then combined to estimate an overarching species tree, are currently popular in part because they can handle large phylogenomic data sets. These methods are consistent in the anomaly zone but can sometimes provide inappropriate measures of tree support or apportion error and signal in the data inappropriately. MP-EST in particular employs a likelihood model which can be conveniently manipulated to perform statistical tests of competing species trees, incorporating the likelihood of the collected gene trees on each species tree in a likelihood ratio test. 
Such tests provide a useful alternative to the multilocus bootstrap, which only indirectly tests the appropriateness of competing species trees. We illustrate these tests and implementations of the MSC with examples and suggest that MSC methods are a useful class of models effectively using information from multiple loci to build phylogenetic trees.
Affiliation(s)
- Liang Liu
- Department of Statistics, University of Georgia, Athens, GA, USA
| | | | - Dennis Pearl
- Department of Statistics, Pennsylvania State University, University Park, PA, USA
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology & Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA.
| |
|
25
|
Richards EJ, Brown JM, Barley AJ, Chong RA, Thomson RC. Variation Across Mitochondrial Gene Trees Provides Evidence for Systematic Error: How Much Gene Tree Variation Is Biological? Syst Biol 2018; 67:847-860. [PMID: 29471536 DOI: 10.1093/sysbio/syy013] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 08/01/2017] [Accepted: 02/15/2018] [Indexed: 12/28/2022] Open
Abstract
The use of large genomic data sets in phylogenetics has highlighted extensive topological variation across genes. Much of this discordance is assumed to result from biological processes. However, variation among gene trees can also be a consequence of systematic error driven by poor model fit, and the relative importance of biological vs. methodological factors in explaining gene tree variation is a major unresolved question. Using mitochondrial genomes to control for biological causes of gene tree variation, we estimate the extent of gene tree discordance driven by systematic error and employ posterior prediction to highlight the role of model fit in producing this discordance. We find that the amount of discordance among mitochondrial gene trees is similar to the amount of discordance found in other studies that assume only biological causes of variation. This similarity suggests that the role of systematic error in generating gene tree variation is underappreciated and critical evaluation of fit between assumed models and the data used for inference is important for the resolution of unresolved phylogenetic questions.
Affiliation(s)
- Emilie J Richards
- Department of Biology, University of Hawai'i, 2538 McCarthy Mall, Edmondson Hall 2016, Honolulu, HI 96822, USA
- Department of Biology, University of North Carolina, 120 South Road, Coker Hall CB 3280 Chapel Hill, NC 27599, USA
| | - Jeremy M Brown
- Department of Biological Sciences and Museum of Natural Science, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
| | - Anthony J Barley
- Department of Biology, University of Hawai'i, 2538 McCarthy Mall, Edmondson Hall 2016, Honolulu, HI 96822, USA
| | - Rebecca A Chong
- Department of Biology, University of Hawai'i, 2538 McCarthy Mall, Edmondson Hall 2016, Honolulu, HI 96822, USA
| | - Robert C Thomson
- Department of Biology, University of Hawai'i, 2538 McCarthy Mall, Edmondson Hall 2016, Honolulu, HI 96822, USA
| |
|
26
|
Rabiee M, Sayyari E, Mirarab S. Multi-allele species reconstruction using ASTRAL. Mol Phylogenet Evol 2018; 130:286-296. [PMID: 30393186 DOI: 10.1016/j.ympev.2018.10.033] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 11/19/2017] [Revised: 10/23/2018] [Accepted: 10/24/2018] [Indexed: 11/29/2022]
Abstract
Genome-wide phylogeny reconstruction is becoming increasingly common, and one driving factor behind these phylogenomic studies is the promise that the potential discordance between gene trees and the species tree can be modeled. Incomplete lineage sorting is one cause of discordance that bridges population genetic and phylogenetic processes. ASTRAL is a species tree reconstruction method that seeks to find the tree with minimum quartet distance to an input set of inferred gene trees. However, the published ASTRAL algorithm only works with one sample per species. To account for polymorphisms in present-day species, one can sample multiple individuals per species to create multi-allele datasets. Here, we introduce how ASTRAL can handle multi-allele datasets. We show that the quartet-based optimization problem extends naturally, and we introduce heuristic methods for building the search space specifically for the case of multi-individual datasets. We study the accuracy and scalability of the multi-individual version of ASTRAL-III using extensive simulation studies and compare it to NJst, the only other scalable method that can handle these datasets. We do not find strong evidence that using multiple individuals dramatically improves accuracy. When we study the trade-off between sampling more genes versus more individuals, we find that sampling more genes is more effective than sampling more individuals, even under conditions that we study where trees are shallow (median length: ≈1Ne) and ILS is extremely high.
Affiliation(s)
- Maryam Rabiee
- Department of Computer Science and Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093, United States
| | - Erfan Sayyari
- Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093, United States
| | - Siavash Mirarab
- Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093, United States.
| |
|
27
|
Degnan JH. Modeling Hybridization Under the Network Multispecies Coalescent. Syst Biol 2018; 67:786-799. [PMID: 29846734 PMCID: PMC6101600 DOI: 10.1093/sysbio/syy040] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 09/16/2017] [Revised: 05/13/2018] [Accepted: 05/16/2018] [Indexed: 11/13/2022] Open
Abstract
Simultaneously modeling hybridization and the multispecies coalescent is becoming increasingly common, and inference of species networks in this context is now implemented in several software packages. This article addresses some of the conceptual issues and decisions to be made in this modeling, including whether or not to use branch lengths and issues with model identifiability. This article is based on a talk given at a Spotlight Session at the Evolution 2017 meeting in Portland, Oregon. This session included several talks about modeling hybridization and gene flow in the presence of incomplete lineage sorting. Other talks given at this meeting are also included in this special issue of Systematic Biology.
Affiliation(s)
- James H Degnan
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA
| |
|
28
|
SVDquest: Improving SVDquartets species tree estimation using exact optimization within a constrained search space. Mol Phylogenet Evol 2018. [DOI: 10.1016/j.ympev.2018.03.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 11/22/2022]
|
29
|
Adams RH, Schield DR, Card DC, Castoe TA. Assessing the Impacts of Positive Selection on Coalescent-Based Species Tree Estimation and Species Delimitation. Syst Biol 2018; 67:1076-1090. [DOI: 10.1093/sysbio/syy034] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/20/2017] [Accepted: 05/05/2018] [Indexed: 11/13/2022] Open
Affiliation(s)
- Richard H Adams
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Dr., Arlington, TX 76019, USA
| | - Drew R Schield
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Dr., Arlington, TX 76019, USA
| | - Daren C Card
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Dr., Arlington, TX 76019, USA
| | - Todd A Castoe
- Department of Biology, University of Texas at Arlington, 501 S. Nedderman Dr., Arlington, TX 76019, USA
| |
|
30
|
Galen SC, Borner J, Martinsen ES, Schaer J, Austin CC, West CJ, Perkins SL. The polyphyly of Plasmodium: comprehensive phylogenetic analyses of the malaria parasites (order Haemosporida) reveal widespread taxonomic conflict. R Soc Open Sci 2018; 5:171780. [PMID: 29892372 PMCID: PMC5990803 DOI: 10.1098/rsos.171780] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Received: 11/01/2017] [Accepted: 04/20/2018] [Indexed: 05/29/2023]
Abstract
The evolutionary relationships among the apicomplexan blood pathogens known as the malaria parasites (order Haemosporida), some of which infect nearly 200 million humans each year, have remained a vexing phylogenetic problem due to limitations in taxon sampling, character sampling and the extreme nucleotide base composition biases that are characteristic of this clade. Previous phylogenetic work on the malaria parasites has often lacked sufficient representation of the broad taxonomic diversity within the Haemosporida or the multi-locus sequence data needed to resolve deep evolutionary relationships, rendering our understanding of haemosporidian life-history evolution and the origin of the human malaria parasites incomplete. Here we present the most comprehensive phylogenetic analysis of the malaria parasites conducted to date, using samples from a broad diversity of vertebrate hosts that includes numerous enigmatic and poorly known haemosporidian lineages in addition to genome-wide multi-locus sequence data. We find that when base composition differences are corrected for during phylogenetic analysis, we recover a well-supported topology indicating that the evolutionary history of the malaria parasites was characterized by a complex series of transitions in life-history strategies and host usage. Notably, we find that Plasmodium, the malaria parasite genus that includes the species of human medical concern, is polyphyletic, with the life-history traits characteristic of this genus having evolved in a dynamic manner across the phylogeny. We find support for multiple instances of gain and loss of asexual proliferation in host blood cells and production of haemozoin pigment, two traits that have been used for taxonomic classification as well as considered to be important factors for parasite virulence and used as drug targets. Lastly, our analysis illustrates the need for a widespread reassessment of malaria parasite taxonomy.
Affiliation(s)
- Spencer C. Galen
- Sackler Institute for Comparative Genomics, American Museum of Natural History, Central Park West at 79th St., New York, NY 10024, USA
- Richard Gilder Graduate School, American Museum of Natural History, Central Park West at 79th St., New York, NY 10024, USA
| | - Janus Borner
- Institute of Zoology, Biocenter Grindel, University of Hamburg, Martin-Luther-King-Platz 3, D-20146 Hamburg, Germany
| | - Ellen S. Martinsen
- Center for Conservation Genomics, Smithsonian Conservation Biology Institute, National Zoological Park, PO Box 37012, MRC5503, Washington, DC 20013-7012, USA
| | - Juliane Schaer
- Department of Biology, Humboldt University, 10115, Berlin, Germany
| | - Christopher C. Austin
- Department of Biological Sciences, Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
| | | | - Susan L. Perkins
- Sackler Institute for Comparative Genomics, American Museum of Natural History, Central Park West at 79th St., New York, NY 10024, USA
| |
|
31
|
Platt RN, Faircloth BC, Sullivan KAM, Kieran TJ, Glenn TC, Vandewege MW, Lee TE, Baker RJ, Stevens RD, Ray DA. Conflicting Evolutionary Histories of the Mitochondrial and Nuclear Genomes in New World Myotis Bats. Syst Biol 2018; 67:236-249. [PMID: 28945862 PMCID: PMC5837689 DOI: 10.1093/sysbio/syx070] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 03/17/2017] [Revised: 07/31/2017] [Accepted: 08/15/2017] [Indexed: 01/05/2023] Open
Abstract
The rapid diversification of Myotis bats into more than 100 species is one of the most extensive mammalian radiations available for study. Efforts to understand relationships within Myotis have primarily utilized mitochondrial markers and trees inferred from nuclear markers lacked resolution. Our current understanding of relationships within Myotis is therefore biased towards a set of phylogenetic markers that may not reflect the history of the nuclear genome. To resolve this, we sequenced the full mitochondrial genomes of 37 representative Myotis, primarily from the New World, in conjunction with targeted sequencing of 3648 ultraconserved elements (UCEs). We inferred the phylogeny and explored the effects of concatenation and summary phylogenetic methods, as well as combinations of markers based on informativeness or levels of missing data, on our results. Of the 294 phylogenies generated from the nuclear UCE data, all are significantly different from phylogenies inferred using mitochondrial genomes. Even within the nuclear data, quartet frequencies indicate that around half of all UCE loci conflict with the estimated species tree. Several factors can drive such conflict, including incomplete lineage sorting, introgressive hybridization, or even phylogenetic error. Despite the degree of discordance between nuclear UCE loci and the mitochondrial genome and among UCE loci themselves, the most common nuclear topology is recovered in one quarter of all analyses with strong nodal support. Based on these results, we re-examine the evolutionary history of Myotis to better understand the phenomena driving their unique nuclear, mitochondrial, and biogeographic histories.
Affiliation(s)
- Roy N Platt
- Department of Biological Sciences, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| | - Brant C Faircloth
- Department of Biological Sciences and Museum of Natural Science, Louisiana State University, 202 Life Science Building, Baton Rouge, LA, USA
| | - Kevin A M Sullivan
- Department of Biological Sciences, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| | - Troy J Kieran
- Department of Environmental Health Science, University of Georgia, 206 Environmental Health Sciences Building, Athens, GA, USA
| | - Travis C Glenn
- Department of Environmental Health Science, University of Georgia, 206 Environmental Health Sciences Building, Athens, GA, USA
| | - Michael W Vandewege
- Department of Biological Sciences, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| | - Thomas E Lee
- Department of Biology, Abilene Christian University, 1600 Campus Ct. Abilene, TX, USA
| | - Robert J Baker
- Department of Biological Sciences, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| | - Richard D Stevens
- Natural Resource Management, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| | - David A Ray
- Department of Biological Sciences, Texas Tech University, 2901 Main St, Lubbock, TX, USA
| |
|
32
|
Knowles LL, Huang H, Sukumaran J, Smith SA. A matter of phylogenetic scale: Distinguishing incomplete lineage sorting from lateral gene transfer as the cause of gene tree discord in recent versus deep diversification histories. Am J Bot 2018; 105:376-384. [PMID: 29710372 DOI: 10.1002/ajb2.1064] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/31/2017] [Accepted: 01/30/2018] [Indexed: 06/08/2023]
Abstract
PREMISE OF THE STUDY: Discordant gene trees are commonly encountered when sequences from thousands of loci are applied to estimate phylogenetic relationships. Several processes contribute to this discord. Yet, we have no methods that jointly model different sources of conflict when estimating phylogenies. An alternative to analyzing entire genomes or all the sequenced loci is to identify a subset of loci for phylogenetic analysis. If we can identify data partitions that are most likely to reflect descent from a common ancestor (i.e., discordant loci that indeed reflect incomplete lineage sorting [ILS], as opposed to some other process, such as lateral gene transfer [LGT]), we can analyze this subset using powerful coalescent-based species-tree approaches. METHODS: Test data sets were simulated where discord among loci could arise from ILS and LGT. Data sets were analyzed using the newly developed program CLASSIPHY (Huang et al., ) to assess whether our ability to distinguish the cause of discord among loci varied when ILS and LGT occurred in the recent versus deep past and whether the accuracy of these inferences was affected by the mutational process. KEY RESULTS: We show that accuracy of probabilistic classification of individual loci by the cause of discord differed when ILS and LGT events occurred more recently compared with the distant past and that the signal-to-noise ratio arising from the mutational process contributes to difficulties in inferring LGT data partitions. CONCLUSIONS: We discuss our findings in terms of the promise and limitations of identifying subsets of loci for species-tree inference that will not violate the underlying coalescent model (i.e., data partitions in which ILS, and not LGT, contributes to discord). We also discuss the empirical implications of our work given the many recalcitrant nodes in the tree of life (e.g., origins of angiosperms, amniotes, or Neoaves), and recent arguments for concatenating loci.
Affiliation(s)
- L Lacey Knowles
- Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, 1109 Geddes Avenue, Ann Arbor, MI, 48109-1079, USA
| | - Huateng Huang
- Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, 1109 Geddes Avenue, Ann Arbor, MI, 48109-1079, USA
| | - Jeet Sukumaran
- Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, 1109 Geddes Avenue, Ann Arbor, MI, 48109-1079, USA
| | - Stephen A Smith
- Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, 1109 Geddes Avenue, Ann Arbor, MI, 48109-1079, USA
| |
|
33
|
Scornavacca C, Galtier N. Incomplete Lineage Sorting in Mammalian Phylogenomics. Syst Biol 2018; 66:112-120. [PMID: 28173480 DOI: 10.1093/sysbio/syw082] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 01/18/2016] [Revised: 03/25/2016] [Accepted: 09/04/2016] [Indexed: 01/05/2023] Open
Abstract
The impact of incomplete lineage sorting (ILS) on phylogenetic conflicts among genes, and the related issue of whether to account for ILS in species tree reconstruction, are matters of intense controversy. Here, focusing on full-genome data in placental mammals, we empirically test two assumptions underlying current usage of tree-building methods that account for ILS. We show that in this data set (i) distinct exons from a common gene do not share a common genealogy, and (ii) ILS is only a minor determinant of the existing phylogenetic conflict. These results shed new light on the relevance and conditions of applicability of ILS-aware methods in phylogenomic analyses of protein coding sequences.
Affiliation(s)
- Celine Scornavacca
- UMR 5554-Institute of Evolutionary Sciences, University Montpellier, CNRS, IRD, EPHE, Place E. Bataillon-CC64, Montpellier, France
| | - Nicolas Galtier
- UMR 5554-Institute of Evolutionary Sciences, University Montpellier, CNRS, IRD, EPHE, Place E. Bataillon-CC64, Montpellier, France
| |
|
34
|
Shih KM, Chang CT, Chung JD, Chiang YC, Hwang SY. Adaptive Genetic Divergence Despite Significant Isolation-by-Distance in Populations of Taiwan Cow-Tail Fir (Keteleeria davidiana var. formosana). Front Plant Sci 2018; 9:92. [PMID: 29449860 PMCID: PMC5799944 DOI: 10.3389/fpls.2018.00092] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/02/2017] [Accepted: 01/17/2018] [Indexed: 05/05/2023]
Abstract
Double digest restriction site-associated DNA sequencing (ddRADseq) is a tool for delivering genome-wide single nucleotide polymorphism (SNP) markers for non-model organisms useful in resolving fine-scale population structure and detecting signatures of selection. This study performs population genetic analysis, based on ddRADseq data, of a coniferous species, Keteleeria davidiana var. formosana, disjunctly distributed in northern and southern Taiwan, for investigation of population adaptive divergence in response to environmental heterogeneity. A total of 13,914 SNPs were detected and used to assess genetic diversity, FST outlier detection, population genetic structure, and individual assignments of five populations (62 individuals) of K. davidiana var. formosana. Principal component analysis (PCA), individual assignments, and the neighbor-joining tree were successful in differentiating individuals between northern and southern populations of K. davidiana var. formosana, but apparent gene flow between the southern DW30 population and northern populations was also revealed. Fifteen of 23 highly differentiated SNPs identified were found to be strongly associated with environmental variables, suggesting isolation-by-environment (IBE). However, multiple matrix regression with randomization analysis revealed strong IBE as well as significant isolation-by-distance. Environmental impacts on divergence were found between populations of the North and South regions and also between the two southern neighboring populations. BLASTN annotation of the sequences flanking outlier SNPs gave significant hits for three of 23 markers that might have biological relevance to mitochondrial homeostasis involved in the survival of locally adapted lineages. Species delimitation between K. davidiana var. formosana and its ancestor, K. davidiana, was also examined (72 individuals). 
This study has produced highly informative population genomic data for the understanding of population attributes, such as diversity, connectivity, and adaptive divergence associated with large- and small-scale environmental heterogeneity in K. davidiana var. formosana.
Affiliation(s)
- Kai-Ming Shih
- Department of Life Science, National Taiwan Normal University, Taipei, Taiwan
| | - Chung-Te Chang
- Department of Geography, National Taiwan University, Taipei, Taiwan
| | - Jeng-Der Chung
- Division of Silviculture, Taiwan Forestry Research Institute, Taipei, Taiwan
| | - Yu-Chung Chiang
- Department of Biological Sciences, National Sun Yat-Sen University, Kaohsiung, Taiwan
| | - Shih-Ying Hwang
- Department of Life Science, National Taiwan Normal University, Taipei, Taiwan
| |
|
35
|
Blom MPK, Bragg JG, Potter S, Moritz C. Accounting for Uncertainty in Gene Tree Estimation: Summary-Coalescent Species Tree Inference in a Challenging Radiation of Australian Lizards. Syst Biol 2018; 66:352-366. [PMID: 28039387 DOI: 10.1093/sysbio/syw089] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 09/14/2015] [Accepted: 09/27/2016] [Indexed: 11/12/2022] Open
Abstract
Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should, therefore, examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic data sets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ${\sim}2000$ exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate. 
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic data sets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference. [Coalescence; concatenation; Cryptoblepharus; exon capture; gene tree; phylogenomics; species tree.].
Affiliation(s)
- Mozes P K Blom
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
| | - Jason G Bragg
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
| | - Sally Potter
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
| | - Craig Moritz
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
| |
|
36
|
Zhu S, Degnan JH. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent. Syst Biol 2018; 66:283-298. [PMID: 27780899 DOI: 10.1093/sysbio/syw097] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/10/2015] [Accepted: 03/08/2016] [Indexed: 11/13/2022] Open
Abstract
Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. [gene tree, hybridization, identifiability, maximum likelihood, species tree, phylogeny.].
Affiliation(s)
- Sha Zhu
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - James H Degnan
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87110, USA
| |
|
37
|
Duchêne DA, Bragg JG, Duchêne S, Neaves LE, Potter S, Moritz C, Johnson RN, Ho SYW, Eldridge MDB. Analysis of Phylogenomic Tree Space Resolves Relationships Among Marsupial Families. Syst Biol 2017; 67:400-412. [DOI: 10.1093/sysbio/syx076] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 07/29/2017] [Accepted: 09/08/2017] [Indexed: 02/02/2023] Open
Affiliation(s)
- David A Duchêne
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Jason G Bragg
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - National Herbarium of NSW, The Royal Botanic Gardens and Domain Trust, Sydney, NSW 2000, Australia
- Sebastián Duchêne
  - Centre for Systems Genomics, The University of Melbourne, Melbourne, VIC 3010, Australia
- Linda E Neaves
  - Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney, NSW 2000, Australia
- Sally Potter
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
  - Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney, NSW 2000, Australia
- Craig Moritz
  - Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Rebecca N Johnson
  - Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney, NSW 2000, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Mark D B Eldridge
  - Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney, NSW 2000, Australia
|
38
|
Molloy EK, Warnow T. To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods. Syst Biol 2017; 67:285-303. [DOI: 10.1093/sysbio/syx077] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Received: 12/08/2016] [Accepted: 09/13/2017] [Indexed: 01/27/2023] Open
Affiliation(s)
- Erin K Molloy
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Tandy Warnow
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
39
|
Olave M, Avila LJ, Sites JW, Morando M. Detecting hybridization by likelihood calculation of gene tree extra lineages given explicit models. Methods Ecol Evol 2017. [DOI: 10.1111/2041-210x.12846] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Melisa Olave
  - Patagonian Institute for the Study of Continental Ecosystems – The National Scientific and Technical Research Council (IPEEC‐CONICET), Puerto Madryn, Chubut, Argentina
  - Department of Biology, University of Konstanz, Konstanz, Germany
- Luciano J. Avila
  - Patagonian Institute for the Study of Continental Ecosystems – The National Scientific and Technical Research Council (IPEEC‐CONICET), Puerto Madryn, Chubut, Argentina
- Jack W. Sites
  - Department of Biology and M. L. Bean Life Science Museum, Brigham Young University (BYU), Provo, UT, USA
- Mariana Morando
  - Patagonian Institute for the Study of Continental Ecosystems – The National Scientific and Technical Research Council (IPEEC‐CONICET), Puerto Madryn, Chubut, Argentina
|
40
|
Inferring rooted species trees from unrooted gene trees using approximate Bayesian computation. Mol Phylogenet Evol 2017; 116:13-24. [PMID: 28780022 DOI: 10.1016/j.ympev.2017.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/15/2016] [Revised: 03/26/2017] [Accepted: 07/22/2017] [Indexed: 02/01/2023]
Abstract
Methods for inferring species trees from gene trees motivated by incomplete lineage sorting typically use either rooted gene trees to infer a rooted species tree, or use unrooted gene trees to infer an unrooted species tree, which is then typically rooted using one or more outgroups. Theoretically, however, it has been known since 2011 that it is possible to consistently infer the root of the species tree directly from unrooted gene trees without assuming an outgroup. Here, we use approximate Bayesian computation to infer the root of the species tree from unrooted gene trees assuming the multispecies coalescent model. It is hoped that this approach will be useful in cases where an appropriate outgroup is difficult to find and gene trees do not follow a molecular clock. This approach could also be useful when there is prior information that makes a small number of root locations plausible in an unrooted species tree.
|
41
|
Owen CL, Marshall DC, Hill KBR, Simon C. How the Aridification of Australia Structured the Biogeography and Influenced the Diversification of a Large Lineage of Australian Cicadas. Syst Biol 2017; 66:569-589. [PMID: 28123112 DOI: 10.1093/sysbio/syw078] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 04/03/2014] [Accepted: 08/24/2016] [Indexed: 11/14/2022] Open
Abstract
Over the last 30 million years, Australia's landscape has undergone dramatic cooling and drying due to the establishment of the Antarctic Circumpolar Current and change in global CO$_{2}$ levels. Studies have shown that many Australian organisms went extinct during these major cooling events, while others experienced adaptive radiations and increases in diversification rates as a result of exploiting new niches in the arid zone. Despite the many studies on diversification and biogeography in Australia, few have been continent-wide and none have focused on a group of organisms adapted to feeding on plants. We studied 162 species of cicadas in the Australian Pauropsalta complex, a large generic lineage within the tribe Cicadettini. We asked whether there were changes in the diversification rate of Pauropsalta over time and, if so: 1) which clades were associated with the rate change, 2) did timing of rate shifts correspond to known periods of dramatic historical climate change, and 3) did increases in diversification rate along select lineages correspond to adaptive radiations with movement into the arid zone? To address these questions, we estimated a molecular phylogeny of the Pauropsalta complex using ${\sim}$5300 bp of nucleotide sequence data distributed among five loci (one mtDNA locus and four nDNA loci). We found that this large group of cicadas did not diversify at a constant rate as they spread through Australia; instead the signature of decreasing diversification rate changed roughly around the time of the expansion of the east Antarctic ice sheets ${\sim}$16 Ma and the glaciation of the northern hemisphere ${\sim}$3 Ma. Unlike other Australian taxa, the Pauropsalta complex did not explosively radiate in response to an early invasion of the arid zone. Instead multiple groups invaded the arid zone and experienced rates of diversification similar to mesic-distributed taxa.
We found evidence for relictual groups, located in pre-Mesozoic habitat, that have not diversified and continue to reside on mesic hosts in isolated "habitat islands". Future work should focus on groups of similar ages with similar distribution patterns to determine whether this tempo and pattern of diversification and biogeography is consistent with evidence from other phytophagous insects.
Affiliation(s)
- Christopher L Owen
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
  - Computational Biology Institute, George Washington University, Innovation Hall, Suite 305, 45085 University Drive, Ashburn, VA 20147-2766, USA
- David C Marshall
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
- Kathy B R Hill
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
- Chris Simon
  - Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
|
42
|
Distribution of coalescent histories under the coalescent model with gene flow. Mol Phylogenet Evol 2016; 105:177-192. [DOI: 10.1016/j.ympev.2016.08.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/13/2016] [Revised: 08/16/2016] [Accepted: 08/31/2016] [Indexed: 12/19/2022]
|
43
|
Hime PM, Hotaling S, Grewelle RE, O'Neill EM, Voss SR, Shaffer HB, Weisrock DW. The influence of locus number and information content on species delimitation: an empirical test case in an endangered Mexican salamander. Mol Ecol 2016; 25:5959-5974. [PMID: 27748559 DOI: 10.1111/mec.13883] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 03/21/2016] [Revised: 09/19/2016] [Accepted: 09/26/2016] [Indexed: 02/02/2023]
Abstract
Perhaps the most important recent advance in species delimitation has been the development of model-based approaches to objectively diagnose species diversity from genetic data. Additionally, the growing accessibility of next-generation sequence data sets provides powerful insights into genome-wide patterns of divergence during speciation. However, applying complex models to large data sets is time-consuming and computationally costly, requiring careful consideration of the influence of both individual and population sampling, as well as the number and informativeness of loci on species delimitation conclusions. Here, we investigated how locus number and information content affect species delimitation results for an endangered Mexican salamander species, Ambystoma ordinarium. We compared results for an eight-locus, 137-individual data set and an 89-locus, seven-individual data set. For both data sets, we used species discovery methods to define delimitation models and species validation methods to rigorously test these hypotheses. We also used integrated demographic model selection tools to choose among delimitation models, while accounting for gene flow. Our results indicate that while cryptic lineages may be delimited with relatively few loci, sampling larger numbers of loci may be required to ensure that enough informative loci are available to accurately identify and validate shallow-scale divergences. These analyses highlight the importance of striking a balance between dense sampling of loci and individuals, particularly in shallowly diverged lineages. They also suggest the presence of a currently unrecognized, endangered species in the western part of A. ordinarium's range.
Affiliation(s)
- Paul M Hime
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
- Scott Hotaling
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
- Richard E Grewelle
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
- Eric M O'Neill
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
- S Randal Voss
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
- H Bradley Shaffer
  - Department of Ecology and Evolutionary Biology and the La Kretz Center for California Conservation Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- David W Weisrock
  - Department of Biology, University of Kentucky, Lexington, KY, 40506, USA
|
44
|
Gaubert P, Njiokou F, Ngua G, Afiademanyo K, Dufour S, Malekani J, Bi SG, Tougard C, Olayemi A, Danquah E, Djagoun CAMS, Kaleme P, Mololo CN, Stanley W, Luo SJ, Antunes A. Phylogeography of the heavily poached African common pangolin (Pholidota, Manis tricuspis) reveals six cryptic lineages as traceable signatures of Pleistocene diversification. Mol Ecol 2016; 25:5975-5993. [PMID: 27862533 DOI: 10.1111/mec.13886] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/16/2015] [Revised: 09/24/2016] [Accepted: 09/27/2016] [Indexed: 01/03/2023]
Abstract
Knowledge of faunal diversification in African rainforests remains scarce. We used phylogeography to assess (i) the role of Pleistocene climatic oscillations in the diversification of the African common pangolin (Manis tricuspis) and (ii) the utility of our multilocus approach for taxonomic delineation and trade tracing of this heavily poached species. We sequenced 101 individuals for two mitochondrial DNA (mtDNA), two nuclear DNA and one Y-borne gene fragments (totalling 2602 bp). We used a time-calibrated, Bayesian inference phylogenetic framework and conducted character-based, genetic and phylogenetic delineation of species hypotheses within African common pangolins. We identified six geographic lineages partitioned into western Africa, Ghana, the Dahomey Gap, western central Africa, Gabon and central Africa, all diverging during the Middle to Late Pleistocene. MtDNA (cytochrome b + control region) was the sole locus to provide diagnostic characters for each of the six lineages. Tree-based Bayesian delimitation methods using single- and multilocus approaches gave high support for 'species' level recognition of the six African common pangolin lineages. Although the diversification of African common pangolins occurred during Pleistocene cyclical glaciations, causative correlation with traditional rainforest refugia and riverine barriers in Africa was not straightforward. We conclude that six cryptic lineages exist within African common pangolins, which might be of major relevance for future conservation strategies. The high discriminative power of the mtDNA markers used in this study should allow efficient molecular tracing of the regional origin of African common pangolin seizures.
Affiliation(s)
- Philippe Gaubert
  - Institut des Sciences de l'Evolution de Montpellier (ISEM) - UM-CNRS-IRD-EPHE-CIRAD, Université de Montpellier, Place Eugène Bataillon - CC 64, 34095, Montpellier Cedex 05, France
  - CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
- Flobert Njiokou
  - Laboratoire de Parasitologie et Ecologie, Faculté des Sciences, Université de Yaoundé I, BP 812, Yaoundé, Cameroon
- Gabriel Ngua
  - Amigos de la Naturaleza y del Desarrollo de Guinea Ecuatorial (ANDEGE), Barrio Ukomba, S/N, Bata, Equatorial Guinea
- Komlan Afiademanyo
  - Département de Zoologie et de Biologie Animale, Université de Lomé, BP 1515, Lomé, Togo
- Jean Malekani
  - Department of Biology, University of Kinshasa, PO Box 218, Kinshasa XI, Democratic Republic of Congo
- Sery Gonedelé Bi
  - Laboratoire de Génétique, Université Félix Houphouët Boigny d'Abidjan-Cocody, 22 BP 582, Abidjan 22, Côte d'Ivoire
- Christelle Tougard
  - Institut des Sciences de l'Evolution de Montpellier (ISEM) - UM-CNRS-IRD-EPHE-CIRAD, Université de Montpellier, Place Eugène Bataillon - CC 64, 34095, Montpellier Cedex 05, France
- Ayodeji Olayemi
  - Natural History Museum, Obafemi Awolowo University, HO 220005, Ile-Ife, Nigeria
- Emmanuel Danquah
  - Department of Wildlife and Range Management, Faculty of Renewable Natural Resources, Kwame Nkrumah University of Science and Technology, University Post Office, Kumasi, Ghana
- Chabi A M S Djagoun
  - Laboratory of Applied Ecology, Faculty of Agronomic Sciences, University of Abomey-Calavi, 01 BP 526 LEA-FSA, Cotonou, Benin
- Prince Kaleme
  - Laboratoire de Mammalogie, Département de Biologie, Centre de Recherches en Sciences Naturelles (CRSN) - Lwiro, DS (Dépêche Spéciale) Bukavu, Democratic Republic of Congo
  - Department of Zoology, University of Johannesburg, PO Box 524, Auckland Park 2006, South Africa
- Casimir Nebesse Mololo
  - Université de Kisangani, Faculté des Sciences, B.P. 2012, Kisangani, Democratic Republic of Congo
- William Stanley
  - Science and Education, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, IL, 60605, USA
- Shu-Jin Luo
  - School of Life Sciences, Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
- Agostinho Antunes
  - CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007, Porto, Portugal
|
45
|
Simmons MP. Mutually exclusive phylogenomic inferences at the root of the angiosperms: Amborella is supported as sister and Observed Variability is biased. Cladistics 2016; 33:488-512. [DOI: 10.1111/cla.12177] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Accepted: 09/17/2016] [Indexed: 01/16/2023] Open
Affiliation(s)
- Mark P. Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO 80523-1878, USA
|
46
|
Hamilton CA, Lemmon AR, Lemmon EM, Bond JE. Expanding anchored hybrid enrichment to resolve both deep and shallow relationships within the spider tree of life. BMC Evol Biol 2016; 16:212. [PMID: 27733110 PMCID: PMC5062932 DOI: 10.1186/s12862-016-0769-y] [Citation(s) in RCA: 123] [Impact Index Per Article: 15.4] [Received: 06/04/2016] [Accepted: 09/28/2016] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Despite considerable effort, progress in spider molecular systematics has lagged behind many other comparable arthropod groups, thereby hindering family-level resolution, classification, and testing of important macroevolutionary hypotheses. Recently, alternative targeted sequence capture techniques have provided molecular systematics a powerful tool for resolving relationships across the Tree of Life. One of these approaches, Anchored Hybrid Enrichment (AHE), is designed to recover hundreds of unique orthologous loci from across the genome, for resolving both shallow and deep-scale evolutionary relationships within non-model systems. Herein we present a modification of the AHE approach that expands its use for application in spiders, with a particular emphasis on the infraorder Mygalomorphae. RESULTS Our aim was to design a set of probes that effectively capture loci informative at a diversity of phylogenetic timescales. Following identification of putative arthropod-wide loci, we utilized homologous transcriptome sequences from 17 species across all spiders to identify exon boundaries. Conserved regions with variable flanking regions were then sought across the tick genome, three published araneomorph spider genomes, and raw genomic reads of two mygalomorph taxa. Following development of the 585 target loci in the Spider Probe Kit, we applied AHE across three taxonomic depths to evaluate performance: deep-level spider family relationships (33 taxa, 327 loci); family and generic relationships within the mygalomorph family Euctenizidae (25 taxa, 403 loci); and species relationships in the North American tarantula genus Aphonopelma (83 taxa, 581 loci). At the deepest level, all three major spider lineages (the Mesothelae, Mygalomorphae, and Araneomorphae) were supported with high bootstrap support. 
Strong support was also found throughout the Euctenizidae, including generic relationships within the family and species relationships within the genus Aptostichus. As in the Euctenizidae, virtually identical topologies were inferred with high support throughout Aphonopelma. CONCLUSIONS The Spider Probe Kit, the first implementation of AHE methodology in Class Arachnida, holds great promise for gathering the types and quantities of molecular data needed to accelerate an understanding of the spider Tree of Life by providing a mechanism whereby different researchers can confidently and effectively use the same loci for independent projects, yet allowing synthesis of data across independent research groups.
Affiliation(s)
- Chris A. Hamilton
  - Department of Biological Sciences, Auburn University & Auburn University Museum of Natural History, Auburn, AL USA
- Alan R. Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL USA
- Jason E. Bond
  - Department of Biological Sciences, Auburn University & Auburn University Museum of Natural History, Auburn, AL USA
|
47
|
Gatesy J, Meredith RW, Janecka JE, Simmons MP, Murphy WJ, Springer MS. Resolution of a concatenation/coalescence kerfuffle: partitioned coalescence support and a robust family‐level tree for Mammalia. Cladistics 2016; 33:295-332. [DOI: 10.1111/cla.12170] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Accepted: 06/30/2016] [Indexed: 12/14/2022] Open
Affiliation(s)
- John Gatesy
  - Department of Biology, University of California, Riverside, CA 92521, USA
- Robert W. Meredith
  - Department of Biology and Molecular Biology, Montclair State University, Montclair, NJ 07043, USA
- Jan E. Janecka
  - Department of Biological Sciences, Duquesne University, Pittsburgh, PA 15282, USA
- Mark P. Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO 80523, USA
- William J. Murphy
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Mark S. Springer
  - Department of Biology, University of California, Riverside, CA 92521, USA
|
48
|
Algorithmic improvements to species delimitation and phylogeny estimation under the multispecies coalescent. J Math Biol 2016; 74:447-467. [DOI: 10.1007/s00285-016-1034-0] [Citation(s) in RCA: 186] [Impact Index Per Article: 23.3] [Received: 03/26/2015] [Revised: 05/31/2016] [Indexed: 11/25/2022]
|
49
|
Solís-Lemus C, Yang M, Ané C. Inconsistency of Species Tree Methods under Gene Flow. Syst Biol 2016; 65:843-51. [DOI: 10.1093/sysbio/syw030] [Citation(s) in RCA: 107] [Impact Index Per Article: 13.4] [Received: 10/20/2015] [Accepted: 04/01/2016] [Indexed: 11/14/2022] Open
|
50
|
Massatti R, Reznicek AA, Knowles LL. Utilizing RADseq data for phylogenetic analysis of challenging taxonomic groups: A case study in Carex sect. Racemosae. Am J Bot 2016; 103:337-347. [PMID: 26851268 DOI: 10.3732/ajb.1500315] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 07/05/2015] [Accepted: 12/29/2015] [Indexed: 06/05/2023]
Abstract
PREMISE OF THE STUDY Relationships among closely related and recently diverged taxa can be especially difficult to resolve. Here we use both Sanger sequencing and next-generation RADseq data sets to estimate phylogenetic relationships among species of Carex section Racemosae (Cyperaceae), a clade largely restricted to high latitudes and elevations. Interest in relationships among these taxa derives from questions about the species' biogeographic histories and possible links between diversification and Pleistocene glaciations. METHODS A combination of approaches and molecular markers was used to estimate relationships among Carex species within sect. Racemosae and taxa from closely related sections. Nuclear and chloroplast loci generated by Sanger sequencing were analyzed with *BEAST, and SNP data from RADseq loci were analyzed as a concatenated data set using maximum likelihood and as independent loci using SVDquartets. KEY RESULTS Sanger sequencing data sets resolved relationships among taxa at intermediate phylogenetic depths (albeit with low levels of support). Only the RADseq data resolved relationships with strong support at all phylogenetic depths. Moreover, different methods and data partitions of the RADseq data resulted in nearly identical topologies. Carex sect. Racemosae is a strongly supported clade, although a handful of species were found to group with closely related sections. Herbarium specimens up to 35 yr old successfully produced informative RADseq data. CONCLUSIONS Despite the short read lengths of RADseq data, they nevertheless resolved relationships that Sanger sequencing data did not. Resolution of the phylogenetic relationships among recently and rapidly diversifying taxa within sect. Racemosae clades suggests a role for the Pleistocene glaciations in clade diversification.
Affiliation(s)
- Rob Massatti
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
- Anton A Reznicek
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
- L Lacey Knowles
  - Department of Ecology and Evolutionary Biology, The University of Michigan, Ann Arbor, Michigan, 41809-1079 USA
|