1
Duchêne DA, Duchêne S, Stiller J, Heller R, Ho SYW. ClockstaRX: Testing Molecular Clock Hypotheses With Genomic Data. Genome Biol Evol 2024; 16:evae064. [PMID: 38526019 PMCID: PMC10999959 DOI: 10.1093/gbe/evae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 01/11/2024] [Accepted: 03/21/2024] [Indexed: 03/26/2024] Open
Abstract
Phylogenomic data provide valuable opportunities for studying evolutionary rates and timescales. These analyses require theoretical and statistical tools based on molecular clocks. We present ClockstaRX, a flexible platform for exploring and testing evolutionary rate signals in phylogenomic data. Here, information about evolutionary rates in branches across gene trees is placed in Euclidean space, allowing data transformation, visualization, and hypothesis testing. ClockstaRX implements formal tests for identifying groups of loci and branches that make a large contribution to patterns of rate variation. This information can then be used to test for drivers of genomic evolutionary rates or to inform models for molecular dating. Drawing on the results of a simulation study, we recommend forms of data exploration and filtering that might be useful prior to molecular-clock analyses.
Affiliation(s)
- David A Duchêne
  - Center for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen 1352, Denmark
- Sebastián Duchêne
  - Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3010, Australia
- Josefin Stiller
  - Villum Centre for Biodiversity Genomics, University of Copenhagen, 2100 Copenhagen, Denmark
- Rasmus Heller
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
2
Wang Z, Wang YW, Kasuga T, Lopez-Giraldez F, Zhang Y, Zhang Z, Wang Y, Dong C, Sil A, Trail F, Yarden O, Townsend JP. Lineage-specific genes are clustered with HET-domain genes and respond to environmental and genetic manipulations regulating reproduction in Neurospora. PLoS Genet 2023; 19:e1011019. [PMID: 37934795 PMCID: PMC10684091 DOI: 10.1371/journal.pgen.1011019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 11/28/2023] [Accepted: 10/16/2023] [Indexed: 11/09/2023] Open
Abstract
Lineage-specific genes (LSGs) have long been postulated to play roles in the establishment of genetic barriers to intercrossing and speciation. In the genome of Neurospora crassa, most of the 670 Neurospora LSGs that are aggregated adjacent to the telomeres are clustered with 61% of the HET-domain genes, some of which regulate self-recognition and define vegetative incompatibility groups. In contrast, the LSG-encoding proteins possess few to no domains that would help to identify potential functional roles. Possible functional roles of LSGs were further assessed by performing transcriptomic profiling in genetic mutants and in response to environmental alterations, as well as examining gene knockouts for phenotypes. Among the 342 LSGs that are dynamically expressed during both asexual and sexual phases, 64% were detectable on unusual carbon sources such as furfural, a wildfire-produced chemical that is a strong inducer of sexual development, and the structurally-related furan 5-hydroxymethyl furfural (HMF). Expression of a significant portion of the LSGs was sensitive to light and temperature, factors that also regulate the switch from asexual to sexual reproduction. Furthermore, expression of the LSGs was significantly affected in the knockouts of adv-1 and pp-1 that regulate hyphal communication, and expression of more than one quarter of the LSGs was affected by perturbation of the mating locus. These observations encouraged further investigation of the roles of clustered lineage-specific and HET-domain genes in ecology and reproduction regulation in Neurospora, especially the regulation of the switch from the asexual growth to sexual reproduction, in response to dramatic environmental conditions changes.
Affiliation(s)
- Zheng Wang
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Yen-Wen Wang
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Takao Kasuga
  - College of Biological Sciences, University of California, Davis, California, United States of America
- Yang Zhang
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Zhang Zhang
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Yaning Wang
  - Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Caihong Dong
  - Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Anita Sil
  - Department of Microbiology and Immunology, University of California, San Francisco, California, United States of America
- Frances Trail
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, Michigan, United States of America
- Oded Yarden
  - Department of Plant Pathology and Microbiology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Jeffrey P. Townsend
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
  - Department of Ecology and Evolutionary Biology, Program in Microbiology, and Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
3
Mongiardino Koch N, Tilic E, Miller AK, Stiller J, Rouse GW. Confusion will be my epitaph: genome-scale discordance stifles phylogenetic resolution of Holothuroidea. Proc Biol Sci 2023; 290:20230988. [PMID: 37434530 PMCID: PMC10336381 DOI: 10.1098/rspb.2023.0988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Accepted: 06/12/2023] [Indexed: 07/13/2023] Open
Abstract
Sea cucumbers (Holothuroidea) are a diverse clade of echinoderms found from intertidal waters to the bottom of the deepest oceanic trenches. Their reduced skeletons and limited number of phylogenetically informative traits have long obfuscated morphological classifications. Sanger-sequenced molecular datasets have also failed to constrain the position of major lineages. Noteworthy, topological uncertainty has hindered a resolution for Neoholothuriida, a highly diverse clade of Permo-Triassic age. We perform the first phylogenomic analysis of Holothuroidea, combining existing datasets with 13 novel transcriptomes. Using a highly curated dataset of 1100 orthologues, our efforts recapitulate previous results, struggling to resolve interrelationships among neoholothuriid clades. Three approaches to phylogenetic reconstruction (concatenation under both site-homogeneous and site-heterogeneous models, and coalescent-aware inference) result in alternative resolutions, all of which are recovered with strong support and across a range of datasets filtered for phylogenetic usefulness. We explore this intriguing result using gene-wise log-likelihood scores and attempt to correlate these with a large set of gene properties. While presenting novel ways of exploring and visualizing support for alternative trees, we are unable to discover significant predictors of topological preference, and our efforts fail to favour one topology. Neoholothuriid genomes seem to retain an amalgam of signals derived from multiple phylogenetic histories.
Affiliation(s)
- Ekin Tilic
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
  - Department of Marine Zoology, Senckenberg Research Institute and Museum, Frankfurt, Germany
- Allison K. Miller
  - Anatomy Department, University of Otago, Dunedin, Otago, New Zealand
- Josefin Stiller
  - Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Greg W. Rouse
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
4
Mulhair PO, McCarthy CGP, Siu-Ting K, Creevey CJ, O'Connell MJ. Filtering artifactual signal increases support for Xenacoelomorpha and Ambulacraria sister relationship in the animal tree of life. Curr Biol 2022; 32:5180-5188.e3. [PMID: 36356574 DOI: 10.1016/j.cub.2022.10.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/13/2021] [Revised: 08/09/2022] [Accepted: 10/18/2022] [Indexed: 11/10/2022]
Abstract
Conflicting studies place a group of bilaterian invertebrates containing xenoturbellids and acoelomorphs, the Xenacoelomorpha, as either the primary emerging bilaterian phylum [1,2,3,4,5,6] or within Deuterostomia, sister to Ambulacraria [7,8,9,10,11]. Although their placement as sister to the rest of Bilateria supports relatively simple morphology in the ancestral bilaterian, their alternative placement within Deuterostomia suggests a morphologically complex ancestral bilaterian along with extensive loss of major phenotypic traits in the Xenacoelomorpha. Recent studies have questioned whether Deuterostomia should be considered monophyletic at all [10,12,13]. Hidden paralogy and poor phylogenetic signal present a major challenge for reconstructing species phylogenies [14,15,16,17,18]. Here, we assess whether these issues have contributed to the conflict over the placement of Xenacoelomorpha. We reanalyzed published datasets, enriching for orthogroups whose gene trees support well-resolved clans elsewhere in the animal tree [16]. We find that most genes in previously published datasets violate incontestable clans, suggesting that hidden paralogy and low phylogenetic signal affect the ability to reconstruct branching patterns at deep nodes in the animal tree. We demonstrate that removing orthogroups that cannot recapitulate incontestable relationships alters the final topology that is inferred, while simultaneously improving the fit of the model to the data. We discover increased, but ultimately not conclusive, support for the existence of Xenambulacraria in our set of filtered orthogroups. At a time when we are progressing toward sequencing all life on the planet, we argue that long-standing contentious issues in the tree of life will be resolved using smaller amounts of better quality data that can be modeled adequately [19].
Affiliation(s)
- Peter O Mulhair
  - Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK
  - Computational and Molecular Evolutionary Biology Research Group, School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
- Charley G P McCarthy
  - Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK
- Karen Siu-Ting
  - Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast BT9 5DL, UK
- Christopher J Creevey
  - Institute for Global Food Security, School of Biological Sciences, Queen's University Belfast, Belfast BT9 5DL, UK
- Mary J O'Connell
  - Computational and Molecular Evolutionary Biology Research Group, School of Life Sciences, Faculty of Medicine and Health Sciences, University of Nottingham, Nottingham NG7 2RD, UK
  - Computational and Molecular Evolutionary Biology Research Group, School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
5
Dornburg A, Mallik R, Wang Z, Bernal MA, Thompson B, Bruford EA, Nebert DW, Vasiliou V, Yohe LR, Yoder JA, Townsend JP. Placing human gene families into their evolutionary context. Hum Genomics 2022; 16:56. [PMID: 36369063 PMCID: PMC9652883 DOI: 10.1186/s40246-022-00429-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Accepted: 10/12/2022] [Indexed: 11/13/2022] Open
Abstract
Following the draft sequence of the first human genome over 20 years ago, we have achieved unprecedented insights into the rules governing its evolution, often with direct translational relevance to specific diseases. However, staggering sequence complexity has also challenged the development of a more comprehensive understanding of human genome biology. In this context, interspecific genomic studies between humans and other animals have played a critical role in our efforts to decode human gene families. In this review, we focus on how the rapid surge of genome sequencing of both model and non-model organisms now provides a broader comparative framework poised to empower novel discoveries. We begin with a general overview of how comparative approaches are essential for understanding gene family evolution in the human genome, followed by a discussion of analyses of gene expression. We show how homology can provide insights into the genes and gene families associated with immune response, cancer biology, vision, chemosensation, and metabolism, by revealing similarity in processes among distant species. We then explain methodological tools that provide critical advances and show the limitations of common approaches. We conclude with a discussion of how these investigations position us to gain fundamental insights into the evolution of gene families among living organisms in general. We hope that our review catalyzes additional excitement and research on the emerging field of comparative genomics, while aiding the placement of the human genome into its existentially evolutionary context.
Affiliation(s)
- Alex Dornburg
  - Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
- Rittika Mallik
  - Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
- Zheng Wang
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Moisés A Bernal
  - Department of Biological Sciences, College of Science and Mathematics, Auburn University, Auburn, AL, USA
- Brian Thompson
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, CT, USA
- Elspeth A Bruford
  - Department of Haematology, University of Cambridge School of Clinical Medicine, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Daniel W Nebert
  - Department of Environmental Health, Center for Environmental Genetics, University of Cincinnati Medical Center, P.O. Box 670056, Cincinnati, OH, 45267, USA
  - Department of Pediatrics and Molecular Developmental Biology, Division of Human Genetics, Cincinnati Children's Hospital, Cincinnati, OH, 45229, USA
- Vasilis Vasiliou
  - Department of Environmental Health Sciences, Yale School of Public Health, New Haven, CT, USA
- Laurel R Yohe
  - Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
- Jeffrey A Yoder
  - Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
- Jeffrey P Townsend
  - Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
6
Lozano-Fernandez J. A Practical Guide to Design and Assess a Phylogenomic Study. Genome Biol Evol 2022; 14:evac129. [PMID: 35946263 PMCID: PMC9452790 DOI: 10.1093/gbe/evac129] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 08/03/2022] [Indexed: 11/13/2022] Open
Abstract
Over the last decade, molecular systematics has undergone a change of paradigm as high-throughput sequencing now makes it possible to reconstruct evolutionary relationships using genome-scale datasets. The advent of "big data" molecular phylogenetics provided a battery of new tools for biologists but simultaneously brought new methodological challenges. The increase in analytical complexity comes at the price of highly specific training in computational biology and molecular phylogenetics, resulting very often in a polarized accumulation of knowledge (technical on one side and biological on the other). Interpreting the robustness of genome-scale phylogenetic studies is not straightforward, particularly as new methodological developments have consistently shown that the general belief of "more genes, more robustness" often does not apply, and because there is a range of systematic errors that plague phylogenomic investigations. This is particularly problematic because phylogenomic studies are highly heterogeneous in their methodology, and best practices are often not clearly defined. The main aim of this article is to present what I consider as the ten most important points to take into consideration when planning a well-thought-out phylogenomic study and while evaluating the quality of published papers. The goal is to provide a practical step-by-step guide that can be easily followed by nonexperts and phylogenomic novices in order to assess the technical robustness of phylogenomic studies or improve the experimental design of a project.
Affiliation(s)
- Jesus Lozano-Fernandez
  - Department of Genetics, Microbiology and Statistics, Biodiversity Research Institute (IRBio), University of Barcelona, Avd. Diagonal 643, 08028 Barcelona, Spain
  - Institute of Evolutionary Biology (CSIC – Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003 Barcelona, Spain
7
Abreu EF, Pavan SE, Tsuchiya MTN, McLean BS, Wilson DE, Percequillo AR, Maldonado JE. Old specimens for old branches: Assessing effects of sample age in resolving a rapid Neotropical radiation of squirrels. Mol Phylogenet Evol 2022; 175:107576. [PMID: 35809853 DOI: 10.1016/j.ympev.2022.107576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/21/2022] [Revised: 06/10/2022] [Accepted: 07/01/2022] [Indexed: 11/15/2022]
Abstract
Ultraconserved Elements (UCEs) have been useful to resolve challenging phylogenies of non-model clades, unpuzzling long-conflicted relationships in key branches of the Tree of Life at both deep and shallow levels. UCEs are often reliably recovered from historical samples, unlocking a vast number of preserved natural history specimens for analysis. However, the extent to which sample age and preservation method impact UCE recovery as well as downstream inferences remains unclear. Furthermore, there is an ongoing debate on how to curate, filter, and properly analyze UCE data when locus recovery is uneven across sample age and quality. In the present study we address these questions with an empirical dataset composed of over 3800 UCE loci from 219 historical and modern samples of Sciuridae, a globally distributed and ecologically important family of rodents. We provide a genome-scale phylogeny of two squirrel subfamilies (Sciurillinae and Sciurinae: Sciurini) and investigate their placement within Sciuridae. For historical specimens, recovery of UCE loci and mean length per locus were inversely related to sample age; deeper sequencing improved the number of UCE loci recovered but not locus length. Most of our phylogenetic inferences (performed on six datasets with alternative data-filtering strategies, and using three distinct optimality criteria) resulted in distinct topologies. Datasets containing more loci (40% and 50% taxa representativeness matrices) yielded more concordant topologies and higher support values than strictly filtered datasets (60% matrices) particularly with IQ-Tree and SVDquartets, while filtering based on information content provided better topological resolution for inferences with the coalescent gene-tree based approach in ASTRAL-III. We resolved deep relationships in Sciuridae (including among the five currently recognized subfamilies) and relationships among the deepest branches of Sciurini, but conflicting relationships remain at both genus- and species-levels for the rapid Neotropical tree squirrel radiation. Our results suggest that phylogenomic consensus can be difficult and heavily influenced by the age of available samples and the filtering steps used to optimize dataset properties.
Affiliation(s)
- Edson F Abreu
  - Laboratório de Mamíferos, Departamento de Ciências Biológicas, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brazil
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Silvia E Pavan
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
- Mirian T N Tsuchiya
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
  - Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, USA
- Bryan S McLean
  - Department of Biology, University of North Carolina Greensboro, Greensboro, NC, USA
- Don E Wilson
  - Division of Mammals, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Alexandre R Percequillo
  - Laboratório de Mamíferos, Departamento de Ciências Biológicas, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brazil
- Jesús E Maldonado
  - Center for Conservation Genomics, Smithsonian National Zoo and Conservation Biology Institute, Washington, DC, USA
8
Del Amparo R, Arenas M. Consequences of Substitution Model Selection on Protein Ancestral Sequence Reconstruction. Mol Biol Evol 2022; 39:6628884. [PMID: 35789388 PMCID: PMC9254009 DOI: 10.1093/molbev/msac144] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/04/2023] Open
Abstract
The selection of the best-fitting substitution model of molecular evolution is a traditional step for phylogenetic inferences, including ancestral sequence reconstruction (ASR). However, a few recent studies suggested that applying this procedure does not affect the accuracy of phylogenetic tree reconstruction. Here, we revisited this debate topic by analyzing the influence of selection among substitution models of protein evolution, with focus on exchangeability matrices, on the accuracy of ASR using simulated and real data. We found that the selected best-fitting substitution model produces the most accurate ancestral sequences, especially if the data present large genetic diversity. Indeed, ancestral sequences reconstructed under substitution models with similar exchangeability matrices were similar, suggesting that if the selected best-fitting model cannot be used for the reconstruction, applying a model similar to the selected one is preferred. We conclude that selecting among substitution models of protein evolution is recommended for reconstructing accurate ancestral sequences.
Affiliation(s)
- Roberto Del Amparo
  - CINBIO, Universidade de Vigo, Vigo, Spain
  - Departamento de Bioquímica, Xenética e Immunoloxía, Universidade de Vigo, Vigo, Spain
- Miguel Arenas
  - CINBIO, Universidade de Vigo, Vigo, Spain
  - Departamento de Bioquímica, Xenética e Immunoloxía, Universidade de Vigo, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), Vigo, Spain
9
Hernández-Gutiérrez R, van den Berg C, Granados Mendoza C, Peñafiel Cevallos M, Freire M. E, Lemmon EM, Lemmon AR, Magallón S. Localized Phylogenetic Discordance Among Nuclear Loci Due to Incomplete Lineage Sorting and Introgression in the Family of Cotton and Cacao (Malvaceae). Front Plant Sci 2022; 13:850521. [PMID: 35498660 PMCID: PMC9043901 DOI: 10.3389/fpls.2022.850521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Accepted: 03/14/2022] [Indexed: 06/14/2023]
Abstract
The economically important cotton and cacao family (Malvaceae sensu lato) has long been recognized as a monophyletic group. However, the relationships among some subfamilies are still unclear as discordant phylogenetic hypotheses keep arising when different sources of molecular data are analyzed. Phylogenetic discordance has previously been hypothesized to be the result of both introgression and incomplete lineage sorting (ILS), but the extent and source of discordance have not yet been evaluated in the context of loci derived from massive sequencing strategies and for a wide representation of the family. Furthermore, no formal methods have been applied to evaluate if the detected phylogenetic discordance among phylogenomic datasets influences phylogenetic dating estimates of the concordant relationships. The objective of this research was to generate a phylogenetic hypothesis of Malvaceae from nuclear genes. Specifically, we aimed to (1) investigate the presence of major discordance among hundreds of nuclear gene histories of Malvaceae; (2) evaluate the potential source of discordance; and (3) examine whether discordance and loci heterogeneity influence time estimates of the origin and diversification of subfamilies. Our study is based on a comprehensive dataset representing 96 genera of the nine subfamilies and 268 nuclear loci. Both concatenated and coalescence-based approaches were followed for phylogenetic inference. Using branch lengths and topology, we located the placement of introgression events to directly evaluate whether discordance is due to introgression rather than ILS. To estimate divergence times, concordance and molecular rate were considered. We filtered loci based on congruence with the species tree and then obtained the molecular rate of each locus to distribute them into three different sets corresponding to shared molecular rate ranges. Bayesian dating was performed for each of the different sets of loci with the same parameters and calibrations. Phylogenomic discordance was detected between methods, as well as gene histories. At deep coalescent times, we found discordance in the position of five subclades probably due to ILS and a relatively small proportion of introgression. Divergence time estimation with each set of loci generated overlapping clade ages, indicating that, even with different molecular rate and gene histories, calibrations generally provide a strong prior.
Affiliation(s)
- Rebeca Hernández-Gutiérrez
  - Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Cássio van den Berg
  - Departamento de Ciencias Biológicas, Universidade Estadual de Feira de Santana, Feira de Santana, Brazil
- Carolina Granados Mendoza
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Efraín Freire M.
  - Herbario Nacional del Ecuador (QCNE), Instituto Nacional de Biodiversidad, Quito, Ecuador
- Emily Moriarty Lemmon
  - Department of Biological Science, Florida State University, Tallahassee, FL, United States
- Alan R. Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL, United States
- Susana Magallón
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico
10
Yardeni G, Viruel J, Paris M, Hess J, Groot Crego C, de La Harpe M, Rivera N, Barfuss MHJ, Till W, Guzmán-Jacob V, Krömer T, Lexer C, Paun O, Leroy T. Taxon-specific or universal? Using target capture to study the evolutionary history of rapid radiations. Mol Ecol Resour 2021; 22:927-945. [PMID: 34606683 PMCID: PMC9292372 DOI: 10.1111/1755-0998.13523] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/28/2021] [Revised: 09/09/2021] [Accepted: 09/22/2021] [Indexed: 12/20/2022]
Abstract
Target capture has emerged as an important tool for phylogenetics and population genetics in nonmodel taxa. Whereas developing taxon‐specific capture probes requires sustained efforts, available universal kits may have a lower power to reconstruct relationships at shallow phylogenetic scales and within rapidly radiating clades. We present here a newly developed target capture set for Bromeliaceae, a large and ecologically diverse plant family with highly variable diversification rates. The set targets 1776 coding regions, including genes putatively involved in key innovations, with the aim to empower testing of a wide range of evolutionary hypotheses. We compare the relative power of this taxon‐specific set, Bromeliad1776, to the universal Angiosperms353 kit. The taxon‐specific set results in higher enrichment success across the entire family; however, the overall performance of both kits to reconstruct phylogenetic trees is relatively comparable, highlighting the vast potential of universal kits for resolving evolutionary relationships. For more detailed phylogenetic or population genetic analyses, for example the exploration of gene tree concordance, nucleotide diversity or population structure, the taxon‐specific capture set presents clear benefits. We discuss the potential lessons that this comparative study provides for future phylogenetic and population genetic investigations, in particular for the study of evolutionary radiations.
Collapse
Affiliation(s)
- Gil Yardeni
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | | | - Margot Paris
- Unit of Ecology & Evolution, Department of Biology, University of Fribourg, Fribourg, Switzerland
| | - Jaqueline Hess
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria.,Department of Soil Ecology, Helmholtz Centre for Environmental Research, UFZ, Halle (Saale), Germany
| | - Clara Groot Crego
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria.,Vienna Graduate School of Population Genetics, Vienna, Austria
| | - Marylaure de La Harpe
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Norma Rivera
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Michael H J Barfuss
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Walter Till
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Valeria Guzmán-Jacob
- Biodiversity, Macroecology and Biogeography, University of Goettingen, Göttingen, Germany
| | - Thorsten Krömer
- Centro de Investigaciones Tropicales, Universidad Veracruzana, Xalapa, Mexico
| | - Christian Lexer
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Ovidiu Paun
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| | - Thibault Leroy
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
| |
|
11
|
Duchêne DA, Mather N, Van Der Wal C, Ho SYW. Excluding loci with substitution saturation improves inferences from phylogenomic data. Syst Biol 2021; 71:676-689. [PMID: 34508605 PMCID: PMC9016599 DOI: 10.1093/sysbio/syab075] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/19/2020] [Accepted: 09/07/2021] [Indexed: 11/21/2022] Open
Abstract
The historical signal in nucleotide sequences becomes eroded over time by substitutions occurring repeatedly at the same sites. This phenomenon, known as substitution saturation, is recognized as one of the primary obstacles to deep-time phylogenetic inference using genome-scale data sets. We present a new test of substitution saturation and demonstrate its performance in simulated and empirical data. For some of the 36 empirical phylogenomic data sets that we examined, we detect substitution saturation in around 50% of loci. We found that saturation tends to be flagged as problematic in loci with highly discordant phylogenetic signals across sites. Within each data set, the loci with smaller numbers of informative sites are more likely to be flagged as containing problematic levels of saturation. The entropy saturation test proposed here is sensitive to high evolutionary rates relative to the evolutionary timeframe, while also being sensitive to several factors known to mislead phylogenetic inference, including short internal branches relative to external branches, short nucleotide sequences, and tree imbalance. Our study demonstrates that excluding loci with substitution saturation can be an effective means of mitigating the negative impact of multiple substitutions on phylogenetic inferences. [Phylogenetic model performance; phylogenomics; substitution model; substitution saturation; test statistics.]
Affiliation(s)
- David A Duchêne
- Centre for Evolutionary Hologenomics, University of Copenhagen, 1352 Copenhagen, Denmark
| | - Niklas Mather
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Cara Van Der Wal
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Simon Y W Ho
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| |
|
12
|
Mongiardino Koch N. Phylogenomic Subsampling and the Search for Phylogenetically Reliable Loci. Mol Biol Evol 2021; 38:4025-4038. [PMID: 33983409 DOI: 10.1101/2021.02.13.431075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/21/2023] Open
Abstract
Phylogenomic subsampling is a procedure by which small sets of loci are selected from large genome-scale data sets and used for phylogenetic inference. This step is often motivated by either computational limitations associated with the use of complex inference methods or as a means of testing the robustness of phylogenetic results by discarding loci that are deemed potentially misleading. Although many alternative methods of phylogenomic subsampling have been proposed, little effort has gone into comparing their behavior across different data sets. Here, I calculate multiple gene properties for a range of phylogenomic data sets spanning animal, fungal, and plant clades, uncovering a remarkable predictability in their patterns of covariance. I also show how these patterns provide a means for ordering loci by both their rate of evolution and their relative phylogenetic usefulness. This method of retrieving phylogenetically useful loci is found to be among the top performing when compared with alternative subsampling protocols. Relatively common approaches such as minimizing potential sources of systematic bias or increasing the clock-likeness of the data are found to fare worse than selecting loci at random. Likewise, the general utility of rate-based subsampling is found to be limited: loci evolving at both low and high rates are among the least effective, and even those evolving at optimal rates can still widely differ in usefulness. This study shows that many common subsampling approaches introduce unintended effects in off-target gene properties and proposes an alternative multivariate method that simultaneously optimizes phylogenetic signal while controlling for known sources of bias.
|
14
|
Literman R, Schwartz R. Genome-Scale Profiling Reveals Noncoding Loci Carry Higher Proportions of Concordant Data. Mol Biol Evol 2021; 38:2306-2318. [PMID: 33528497 PMCID: PMC8136493 DOI: 10.1093/molbev/msab026] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022] Open
Abstract
Many evolutionary relationships remain controversial despite whole-genome sequencing data. These controversies arise, in part, due to challenges associated with accurately modeling the complex phylogenetic signal coming from genomic regions experiencing distinct evolutionary forces. Here, we examine how different regions of the genome support or contradict well-established relationships among three mammal groups using millions of orthologous parsimony-informative biallelic sites (PIBS) distributed across primate, rodent, and Pecora genomes. We compared PIBS concordance percentages among locus types (e.g. coding sequences (CDS), introns, intergenic regions), and contrasted PIBS utility over evolutionary timescales. Sites derived from noncoding sequences provided more data and proportionally more concordant sites compared with those from CDS in all clades. CDS PIBS were also predominant drivers of tree incongruence in two cases of topological conflict. PIBS derived from most locus types provided surprisingly consistent support for splitting events spread across the timescales we examined, although we find evidence that CDS and intronic PIBS may, respectively and to a limited degree, inform disproportionately about older and younger splits. In this era of accessible whole-genome sequence data, these results 1) suggest benefits to more intentionally focusing on noncoding loci as robust data for tree inference and 2) reinforce the importance of accurate modeling, especially when using CDS data.
Affiliation(s)
- Robert Literman
- Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA; Center for Food Safety and Applied Nutrition, Office of Regulatory Science, U.S. Food and Drug Administration, College Park, MD, USA
| | - Rachel Schwartz
- Department of Biological Sciences, University of Rhode Island, South Kingstown, RI, USA
| |
|
15
|
Beaulieu JM, O'Meara BC, Gilchrist MA. A Spatially Explicit Model of Stabilizing Selection for Improving Phylogenetic Inference. Mol Biol Evol 2021; 38:1641-1652. [PMID: 33306127 PMCID: PMC8042768 DOI: 10.1093/molbev/msaa318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
Abstract
Ultraconserved elements (UCEs) are stretches of hundreds of nucleotides with highly conserved cores flanked by variable regions. Although the selective forces responsible for the preservation of UCEs are unknown, they are nonetheless believed to contain phylogenetically meaningful information from deep to shallow divergence events. Phylogenetic applications of UCEs assume the same degree of rate heterogeneity applies across the entire locus, including variable flanking regions. We present a Wright–Fisher model of selection on nucleotides (SelON) which includes the effects of mutation, drift, and spatially varying, stabilizing selection for an optimal nucleotide sequence. The SelON model assumes the strength of stabilizing selection follows a position-dependent Gaussian function whose exact shape can vary between UCEs. We evaluate SelON by comparing its performance to a simpler and spatially invariant GTR+Γ model using an empirical data set of 400 vertebrate UCEs used to determine the phylogenetic position of turtles. We observe much improvement in model fit of SelON over the GTR+Γ model, and support for turtles as sister to lepidosaurs. Overall, the UCE-specific parameters SelON estimates provide a compact way of quantifying the strength and variation in selection within and across UCEs. SelON can also be extended to include more realistic mapping functions between sequence and stabilizing selection as well as allow for greater levels of rate heterogeneity. By more explicitly modeling the nature of selection on UCEs, SelON and similar approaches can be used to better understand the biological mechanisms responsible for their preservation across highly divergent taxa and long evolutionary time scales.
Affiliation(s)
- Jeremy M Beaulieu
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
| | - Brian C O'Meara
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
| | - Michael A Gilchrist
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
| |
|
16
|
Vankan M, Ho SYW, Duchêne DA. Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. Syst Biol 2021; 71:490-500. [PMID: 34255084 PMCID: PMC8830059 DOI: 10.1093/sysbio/syab051] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/03/2021] [Revised: 06/18/2021] [Indexed: 11/12/2022] Open
Abstract
Phylogenetic analyses of genomic data provide a powerful means of reconstructing the evolutionary relationships among organisms, yet such analyses are often hindered by conflicting phylogenetic signals among loci. Identifying the signals that are most influential to species-tree estimation can help to inform the choice of data for phylogenomic analysis. We investigated this in an analysis of 30 phylogenomic data sets. For each data set, we examined the association between several branch-length characteristics of gene trees and the distance between these gene trees and the corresponding species trees. We found that the distance of each gene tree to the species tree inferred from the full data set was positively associated with variation in root-to-tip distances and negatively associated with mean branch support. However, no such associations were found for gene-tree length, a measure of the overall substitution rate at each locus. We further explored the usefulness of the best-performing branch-based characteristics for selecting loci for phylogenomic analyses. We found that loci that yield gene trees with high variation in root-to-tip distances have a disproportionately distant signal of tree topology compared with the complete data sets. These results suggest that rate variation across lineages should be taken into consideration when exploring and even selecting loci for phylogenomic analysis. [Branch support; data filtering; nucleotide substitution model; phylogenomics; substitution rate; summary coalescent methods.]
Affiliation(s)
- Mezzalina Vankan
- School of Life and Environmental Sciences, University of Sydney, NSW 2006, Australia; Research School of Biology, Australian National University, ACT 2601, Australia
| | - Simon Y W Ho
- School of Life and Environmental Sciences, University of Sydney, NSW 2006, Australia
| | - David A Duchêne
- Research School of Biology, Australian National University, ACT 2601, Australia; Centre for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
| |
|
17
|
Tao Q, Barba-Montoya J, Huuki LA, Durnan MK, Kumar S. Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics. Mol Biol Evol 2021; 37:1819-1831. [PMID: 32119075 PMCID: PMC7253201 DOI: 10.1093/molbev/msaa049] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022] Open
Abstract
The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.
Affiliation(s)
- Qiqing Tao
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA
| | - Jose Barba-Montoya
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA
| | - Louise A Huuki
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
| | - Mary Kathleen Durnan
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA
| | - Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA; Department of Biology, Temple University, Philadelphia, PA; Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
| |
|
18
|
Takezaki N. Resolving the Early Divergence Pattern of Teleost Fish Using Genome-Scale Data. Genome Biol Evol 2021; 13:6178791. [PMID: 33739405 PMCID: PMC8103497 DOI: 10.1093/gbe/evab052] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 03/10/2021] [Indexed: 12/13/2022] Open
Abstract
Regarding the phylogenetic relationship of the three primary groups of teleost fishes, Osteoglossomorpha (bonytongues and others), Elopomorpha (eels and relatives), Clupeocephala (the remaining teleost fish), early morphological studies hypothesized the first divergence of Osteoglossomorpha, whereas the recent prevailing view is the first divergence of Elopomorpha. Molecular studies supported all the possible relationships of the three primary groups. This study analyzed genome-scale data from four previous studies: 1) 412 genes from 12 species, 2) 772 genes from 15 species, 3) 1,062 genes from 30 species, and 4) 491 UCE loci from 27 species. The effects of the species, loci, and models used on the constructed tree topologies were investigated. In the analyses of the data sets (1)–(3), although the first divergence of Clupeocephala that left the other two groups in a sister relationship was supported by concatenated sequences and gene trees of all the species and genes, the first divergence of Elopomorpha among the three groups was supported using species and/or genes with low divergence of sequence and amino-acid frequencies. This result corresponded to that of the UCE data set (4), whose sequence divergence was low, which supported the first divergence of Elopomorpha with high statistical significance. The increase in accuracy of the phylogenetic construction by using species and genes with low sequence divergence was predicted by a phylogenetic informativeness approach and confirmed by computer simulation. These results supported that Elopomorpha was the first basal group of teleost fish to have diverged, consistent with the prevailing view of recent morphological studies.
Affiliation(s)
- Naoko Takezaki
- Life Science Research Center, Kagawa University, Mikicho, Kitagun, Kagawa, Japan
| |
|
19
|
Neumann JS, Desalle R, Narechania A, Schierwater B, Tessler M. Morphological Characters Can Strongly Influence Early Animal Relationships Inferred from Phylogenomic Data Sets. Syst Biol 2021; 70:360-375. [PMID: 32462193 PMCID: PMC7875439 DOI: 10.1093/sysbio/syaa038] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/14/2018] [Revised: 01/27/2020] [Accepted: 01/29/2020] [Indexed: 12/19/2022] Open
Abstract
There are considerable phylogenetic incongruencies between morphological and phylogenomic data for the deep evolution of animals. This has contributed to a heated debate over the earliest-branching lineage of the animal kingdom: the sister to all other Metazoa (SOM). Here, we use published phylogenomic data sets (~45,000-400,000 characters in size with ~15-100 taxa) that focus on early metazoan phylogeny to evaluate the impact of incorporating morphological data sets (~15-275 characters). We additionally use small exemplar data sets to quantify how increased taxon sampling can help stabilize phylogenetic inferences. We apply a plethora of common methods, that is, likelihood models and their "equivalent" under parsimony: character weighting schemes. Our results are at odds with the typical view of phylogenomics, that is, that genomic-scale data sets will swamp out inferences from morphological data. Instead, weighting morphological data 2-10× in both likelihood and parsimony can in some cases "flip" which phylum is inferred to be the SOM. This typically results in the molecular hypothesis of Ctenophora as the SOM flipping to Porifera (or occasionally Placozoa). However, greater taxon sampling improves phylogenetic stability, with some of the larger molecular data sets (>200,000 characters and up to ~100 taxa) showing node stability even with ≥100× upweighting of morphological data. Accordingly, our analyses have three strong messages. 1) The assumption that genomic data will automatically "swamp out" morphological data is not always true for the SOM question. Morphological data have a strong influence in our analyses of combined data sets, even when outnumbered thousands of times by molecular data. Morphology therefore should not be counted out a priori. 2) We here quantify for the first time how the stability of the SOM node improves for several genomic data sets when the taxon sampling is increased. 3) The patterns of "flipping points" (i.e., the weighting of morphological data it takes to change the inferred SOM) carry information about the phylogenetic stability of matrices. The weighting space is an innovative way to assess comparability of data sets that could be developed into a new sensitivity analysis tool. [Metazoa; Morphology; Phylogenomics; Weighting.].
Affiliation(s)
- Johannes S Neumann
- Richard Gilder Graduate School, American Museum of Natural History, New York, NY 10024, USA
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
| | - Rob Desalle
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
| | - Apurva Narechania
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
| | - Bernd Schierwater
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
- ITZ, Division of Ecology and Evolution, Tierärztliche Hochschule Hannover, Bünteweg 9, 30559 Hannover, Germany
| | - Michael Tessler
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
| |
|
20
|
Phylogenomics of manakins (Aves: Pipridae) using alternative locus filtering strategies based on informativeness. Mol Phylogenet Evol 2020; 155:107013. [PMID: 33217578 DOI: 10.1016/j.ympev.2020.107013] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/13/2019] [Revised: 11/07/2020] [Accepted: 11/11/2020] [Indexed: 01/11/2023]
Abstract
Target capture sequencing effectively generates molecular marker arrays useful for molecular systematics. These extensive data sets are advantageous where previous studies using a few loci have failed to resolve relationships confidently. Moreover, target capture is well-suited to fragmented source DNA, allowing data collection from species that lack fresh tissues. Herein we use target capture to generate data for a phylogeny of the avian family Pipridae (manakins), a group that has been the subject of many behavioral and ecological studies. Most manakin species feature lek mating systems, where males exhibit complex behavioral displays including mechanical and vocal sounds, coordinated movements of multiple males, and high speed movements. We analyzed thousands of ultraconserved element (UCE) loci along with a smaller number of coding exons and their flanking regions from all but one species of Pipridae. We examined three different methods of phylogenetic estimation (concatenation and two multispecies coalescent methods). Phylogenetic inferences using UCE data yielded strongly supported estimates of phylogeny regardless of analytical method. Exon probes had limited capability to capture sequence data and resulted in phylogeny estimates with reduced support and modest topological differences relative to the UCE trees, although these conflicts had limited support. Two genera were paraphyletic among all analyses and data sets, with Antilophia nested within Chiroxiphia and Tyranneutes nested within Neopelma. The Chiroxiphia-Antilophia clade was an exception to the generally high support we observed; the topology of this clade differed among analyses, even those based on UCE data. To further explore relationships within this group, we employed two filtering strategies to remove low-information loci. Those analyses resulted in distinct topologies, suggesting that the relationships we identified within Chiroxiphia-Antilophia should be interpreted with caution. 
Despite the existence of a few continuing uncertainties, our analyses resulted in a robust phylogenetic hypothesis of the family Pipridae that provides a comparative framework for future ecomorphological and behavioral studies.
|
21
|
Singhal S, Colston TJ, Grundler MR, Smith SA, Costa GC, Colli GR, Moritz C, Pyron RA, Rabosky DL. Congruence and Conflict in the Higher-Level Phylogenetics of Squamate Reptiles: An Expanded Phylogenomic Perspective. Syst Biol 2020; 70:542-557. [PMID: 32681800 DOI: 10.1093/sysbio/syaa054] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 07/16/2019] [Revised: 05/05/2020] [Accepted: 07/05/2020] [Indexed: 12/16/2022] Open
Abstract
Genome-scale data have the potential to clarify phylogenetic relationships across the tree of life but have also revealed extensive gene tree conflict. This seeming paradox, whereby larger data sets both increase statistical confidence and uncover significant discordance, suggests that understanding sources of conflict is important for accurate reconstruction of evolutionary history. We explore this paradox in squamate reptiles, the vertebrate clade comprising lizards, snakes, and amphisbaenians. We collected an average of 5103 loci for 91 species of squamates that span higher-level diversity within the clade, which we augmented with publicly available sequences for an additional 17 taxa. Using a locus-by-locus approach, we evaluated support for alternative topologies at 17 contentious nodes in the phylogeny. We identified shared properties of conflicting loci, finding that rate and compositional heterogeneity drives discordance between gene trees and species tree and that conflicting loci rarely overlap across contentious nodes. Finally, by comparing our tests of nodal conflict to previous phylogenomic studies, we confidently resolve 9 of the 17 problematic nodes. We suggest this locus-by-locus and node-by-node approach can build consensus on which topological resolutions remain uncertain in phylogenomic studies of other contentious groups. [Anchored hybrid enrichment (AHE); gene tree conflict; molecular evolution; phylogenomic concordance; target capture; ultraconserved elements (UCE).].
Affiliation(s)
- Sonal Singhal
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biology, CSU Dominguez Hills, Carson, CA 90747, USA
| | - Timothy J Colston
- Department of Biological Sciences, The George Washington University, Washington D.C. 20052, USA; Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
| | - Maggie R Grundler
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA; Department of Environmental Science, Policy, & Management, University of California Berkeley, Berkeley, CA 94720, USA
| | - Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Gabriel C Costa
- Department of Biology and Environmental Sciences, Auburn University at Montgomery, Montgomery, AL, USA
| | - Guarino R Colli
- Departamento de Zoologia, Universidade de Brasília, Brasília, DF, Brazil
| | - Craig Moritz
- Division of Ecology and Evolution, Research School of Biology, and Centre for Biodiversity Analysis, The Australian National University, 46 Sullivans Creek Road, Acton, ACT 2601, Australia
| | - R Alexander Pyron
- Department of Biological Sciences, The George Washington University, Washington D.C. 20052, USA
| | - Daniel L Rabosky
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
22
|
Unraveling the Global Phylodynamic and Phylogeographic Expansion of Mycoplasma gallisepticum: Understanding the Origin and Expansion of This Pathogen in Ecuador. Pathogens 2020; 9:pathogens9090674. [PMID: 32825097 PMCID: PMC7557814 DOI: 10.3390/pathogens9090674] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/04/2020] [Revised: 07/31/2020] [Accepted: 08/18/2020] [Indexed: 12/17/2022] Open
Abstract
Mycoplasma gallisepticum (MG) is among the most significant problems in the poultry industry worldwide, representing a serious threat to international trade. Despite the fact that the mgc2 gene has been widely used for diagnostic and molecular characterization purposes, there is a lack of evidence supporting the reliability of this gene as a marker for molecular epidemiology approaches. Therefore, the current study aimed to assess the accuracy of the mgc2 gene for phylogenetic, phylodynamic, and phylogeographic evaluations. Furthermore, the global phylodynamic expansion of MG is described, and the origin and extension of the outbreak caused by MG in Ecuador were tracked and characterized. The results obtained strongly supported the use of the mgc2 gene as a reliable phylogenetic marker and accurate estimator for the temporal and phylogeographic structure reconstruction of MG. The phylodynamic analysis denoted the failures in the current policies to control MG and highlighted the imperative need to implement more sensitive methodologies of diagnosis and more efficient vaccines. Framed in Ecuador, the present study provides the first piece of evidence of the circulation of virulent field MG strains in Ecuadorian commercial poultry. The findings derived from the current study provide novel and significant insights into the origin, diversification, and evolutionary process of MG globally.
|
23
|
Bagley JC, Uribe-Convers S, Carlsen MM, Muchhala N. Utility of targeted sequence capture for phylogenomics in rapid, recent angiosperm radiations: Neotropical Burmeistera bellflowers as a case study. Mol Phylogenet Evol 2020; 152:106769. [PMID: 32081762 DOI: 10.1016/j.ympev.2020.106769] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 12/09/2019] [Revised: 02/10/2020] [Accepted: 02/12/2020] [Indexed: 02/06/2023]
Abstract
Targeted sequence capture is a promising approach for large-scale phylogenomics. However, rapid evolutionary radiations pose significant challenges for phylogenetic inference (e.g. incomplete lineage sorting (ILS), phylogenetic noise), and the ability of targeted nuclear loci to resolve species trees despite such issues remains poorly studied. We test the utility of targeted sequence capture for inferring phylogenetic relationships in rapid, recent angiosperm radiations, focusing on Burmeistera bellflowers (Campanulaceae), which diversified into ~130 species over less than 3 million years. We compared phylogenies estimated from supercontig (exons plus flanking sequences), exon-only, and flanking-only datasets with 506-546 loci (~4.7 million bases) for 46 Burmeistera species/lineages and 10 outgroup taxa. Nuclear loci resolved backbone nodes and many congruent internal relationships with high support in concatenation and coalescent-based species tree analyses, and inferences were largely robust to effects of missing taxa and base composition biases. Nevertheless, species trees were incongruent between datasets, and gene trees exhibited remarkably high levels of conflict (~4-60% congruence, ~40-99% conflict) not simply driven by poor gene tree resolution. Higher gene tree heterogeneity at shorter branches suggests an important role of ILS, as expected for rapid radiations. Phylogenetic informativeness analyses also suggest this incongruence has resulted from low resolving power at short internal branches, consistent with ILS, and homoplasy at deeper nodes, with exons exhibiting much greater risk of incorrect topologies due to homoplasy than other datasets. Our findings suggest that targeted sequence capture is feasible for resolving rapid, recent angiosperm radiations, and that results based on supercontig alignments containing nuclear exons and flanking sequences have higher phylogenetic utility and accuracy than either alone. We use our results to make practical recommendations for future target capture-based studies of Burmeistera and other rapid angiosperm radiations, including that such studies should analyze supercontigs to maximize the phylogenetic information content of loci.
Affiliation(s)
- Justin C Bagley
- Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA; Department of Biology, Virginia Commonwealth University, Richmond, VA 23284, USA.
- Simon Uribe-Convers
- Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA
- Mónica M Carlsen
- Research Department, Science and Conservation Division, Missouri Botanical Garden, St. Louis, MO 63110, USA
- Nathan Muchhala
- Department of Biology, University of Missouri-St. Louis, St. Louis, MO 63121, USA
|
24
|
Abstract
Background: Locating the root node of the "tree of life" (ToL) is one of the hardest problems in phylogenetics, given the time depth. The root-node, or the universal common ancestor (UCA), groups descendants into organismal clades/domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA, and that Asgard archaea are sister to other archaea. The other 2D-ToL proposes that eukaryotes emerged from within archaea and places Asgard archaea as sister to eukaryotes. Williams et al. (Nature Ecol. Evol. 4: 138-147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: The poor resolution of the archaea in their analysis, despite employing amino acid alignments from thousands of proteins and the best-fitting substitution models, contradicts their conclusions. We argue that they overlooked important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data. Which 2D-ToL is better supported depends on which kind of molecular features are better for resolving common ancestors at the roots of clades: protein-domains or their component amino acids. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. Clarifications: It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. We show that protein structural-domains support more reliable phylogenetic reconstructions of deep-diverging clades in the ToL. Accordingly, eukaryotes and akaryotes are better supported clades in a 2D-ToL.
Affiliation(s)
- David Morrison
- Department of Organismal Biology, Systematic Biology, Uppsala University, Uppsala, 752 36, Sweden
|
25
|
Phillips AJ, Dornburg A, Zapfe KL, Anderson FE, James SW, Erséus C, Moriarty Lemmon E, Lemmon AR, Williams BW. Phylogenomic Analysis of a Putative Missing Link Sparks Reinterpretation of Leech Evolution. Genome Biol Evol 2020; 11:3082-3093. [PMID: 31214691 PMCID: PMC6598468 DOI: 10.1093/gbe/evz120] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Accepted: 06/12/2019] [Indexed: 12/17/2022] Open
Abstract
Leeches (Hirudinida) comprise a charismatic, yet often maligned group of worms. Despite their ecological, economic, and medical importance, a general consensus on the phylogenetic relationships of major hirudinidan lineages is lacking. This absence of a consistent, robust phylogeny of early-diverging lineages has hindered our understanding of the underlying processes that enabled evolutionary diversification of this clade. Here, we used an anchored hybrid enrichment-based phylogenomic approach, capturing hundreds of loci to investigate phylogenetic relationships among major hirudinidan lineages and their closest living relatives. We recovered Branchiobdellida as sister to a clade that includes all major lineages of hirudinidans and Acanthobdella, casting doubt on the utility of Acanthobdella as a “missing link” between hirudinidans and the clitellate group formerly known as Oligochaeta. Further, our results corroborate the reciprocal monophyly of jawed and proboscis-bearing leeches. Our phylogenomic resolution of early-diverging leeches provides a useful framework for illuminating the evolution of key adaptations and host–symbiont associations that have allowed leeches to colonize a wide diversity of habitats worldwide.
Affiliation(s)
- Anna J Phillips
- Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, District of Columbia
- Alex Dornburg
- North Carolina Museum of Natural Sciences, Research Laboratory, Raleigh, North Carolina
- Katerina L Zapfe
- North Carolina Museum of Natural Sciences, Research Laboratory, Raleigh, North Carolina; Department of Biological Sciences, Clemson University
- Christer Erséus
- Department of Biological and Environmental Sciences, University of Gothenburg, Sweden
- Alan R Lemmon
- Department of Scientific Computing, Florida State University
- Bronwyn W Williams
- North Carolina Museum of Natural Sciences, Research Laboratory, Raleigh, North Carolina
|
26
|
Phylogenetic informativeness analyses to clarify past diversification processes in Cucurbitaceae. Sci Rep 2020; 10:488. [PMID: 31949198 PMCID: PMC6965171 DOI: 10.1038/s41598-019-57249-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/18/2019] [Accepted: 12/20/2019] [Indexed: 01/12/2023] Open
Abstract
Phylogenomic studies have so far mostly relied on genome skimming or target sequence capture, which suffer from representation bias and can fail to resolve relationships even with hundreds of loci. Here, we explored the potential of phylogenetic informativeness and tree confidence analyses to interpret phylogenomic datasets. We studied Cucurbitaceae because their small genome size allows cost-efficient genome skimming, and many relationships in the family remain controversial, preventing inferences on the evolution of characters such as sexual system or floral morphology. Genome skimming and PCR allowed us to retrieve the plastome, 57 single copy nuclear genes, and the nuclear ribosomal ITS from 29 species representing all but one tribe of Cucurbitaceae. Node support analyses revealed few inter-locus conflicts but a pervasive lack of phylogenetic signal among plastid loci, suggesting a fast divergence of Cucurbitaceae tribes. Data filtering based on phylogenetic informativeness and risk of homoplasy clarified tribe-level relationships, which support two independent evolutions of fringed petals in the family. Our study illustrates how formal analysis of phylogenomic data can increase our understanding of past diversification processes. Our data and results will facilitate the design of well-sampled phylogenomic studies in Cucurbitaceae and related families.
|
27
|
Muñoz M, Restrepo-Montoya D, Kumar N, Iraola G, Herrera G, Ríos-Chaparro DI, Díaz-Arévalo D, Patarroyo MA, Lawley TD, Ramírez JD. Comparative genomics identifies potential virulence factors in Clostridium tertium and C. paraputrificum. Virulence 2019; 10:657-676. [PMID: 31304854 PMCID: PMC6629180 DOI: 10.1080/21505594.2019.1637699] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/26/2019] [Revised: 05/23/2019] [Accepted: 06/25/2019] [Indexed: 01/23/2023] Open
Abstract
Some well-known Clostridiales species such as Clostridium difficile and C. perfringens are agents of high impact diseases worldwide. Nevertheless, other foreseen Clostridiales species have recently emerged such as Clostridium tertium and C. paraputrificum. Three fecal isolates were identified as Clostridium tertium (Gcol.A2 and Gcol.A43) and C. paraputrificum (Gcol.A11) during public health screening for C. difficile infections in Colombia. C. paraputrificum genomes were highly diverse and contained large numbers of accessory genes. Genetic diversity and accessory gene percentage were lower among the C. tertium genomes than in the C. paraputrificum genomes. Sequences homologous to those encoding the C. difficile tcdA and tcdB toxins and other potential virulence factors were also identified. EndoA interferase, a toxic component of the type II toxin-antitoxin system, was found among the C. tertium genomes. toxA was the only toxin-encoding gene detected in Gcol.A43, the Colombian isolate with an experimentally determined high cytotoxic effect. Gcol.A2 and Gcol.A43 had higher sporulation efficiencies than Gcol.A11 (84.5%, 83.8% and 57.0%, respectively), as supported by the greater number of proteins associated with sporulation pathways in the C. tertium genomes compared with the C. paraputrificum genomes (33.3 and 28.4 on average, respectively). This work allowed complete genome description of two Clostridiales species, revealing high levels of intra-taxa diversity and accessory genomes containing virulence factor-encoding genes (especially in C. paraputrificum), with proteins involved in sporulation processes more highly represented in C. tertium. These findings suggest the need to advance the study of these species, which are of potential importance for public health.
Affiliation(s)
- Marina Muñoz
- Grupo de Investigaciones Microbiológicas – UR (GIMUR), Programa de Biología, Facultad de Ciencias Naturales y Matemáticas, Universidad del Rosario, Bogotá, Colombia
- Posgrado Interfacultades, Doctorado en Biotecnología, Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
- Daniel Restrepo-Montoya
- Grupo de Investigaciones Microbiológicas – UR (GIMUR), Programa de Biología, Facultad de Ciencias Naturales y Matemáticas, Universidad del Rosario, Bogotá, Colombia
- Genomics and Bioinformatics Program, North Dakota State University, Fargo, ND, USA
- Nitin Kumar
- Host–Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK
- Gregorio Iraola
- Microbial Genomics Laboratory, Institut Pasteur Montevideo, Montevideo, Uruguay
- Center for Integrative Biology, Universidad Mayor, Santiago de Chile, Chile
- Giovanny Herrera
- Grupo de Investigaciones Microbiológicas – UR (GIMUR), Programa de Biología, Facultad de Ciencias Naturales y Matemáticas, Universidad del Rosario, Bogotá, Colombia
- Dora I. Ríos-Chaparro
- Grupo de Investigaciones Microbiológicas – UR (GIMUR), Programa de Biología, Facultad de Ciencias Naturales y Matemáticas, Universidad del Rosario, Bogotá, Colombia
- Diana Díaz-Arévalo
- Molecular Biology and Immunology Department, Fundación Instituto de Inmunología de Colombia (FIDIC), Bogotá, Colombia
- Faculty of Animal Sciences, Universidad de Ciencias Aplicadas y Ambientales (UDCA), Bogotá, Colombia
- Manuel A. Patarroyo
- Molecular Biology and Immunology Department, Fundación Instituto de Inmunología de Colombia (FIDIC), Bogotá, Colombia
- School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Trevor D. Lawley
- Host–Microbiota Interactions Laboratory, Wellcome Trust Sanger Institute, Hinxton, UK
- Juan David Ramírez
- Grupo de Investigaciones Microbiológicas – UR (GIMUR), Programa de Biología, Facultad de Ciencias Naturales y Matemáticas, Universidad del Rosario, Bogotá, Colombia
|
28
|
Hamilton CA, St Laurent RA, Dexter K, Kitching IJ, Breinholt JW, Zwick A, Timmermans MJTN, Barber JR, Kawahara AY. Phylogenomics resolves major relationships and reveals significant diversification rate shifts in the evolution of silk moths and relatives. BMC Evol Biol 2019; 19:182. [PMID: 31533606 PMCID: PMC6751749 DOI: 10.1186/s12862-019-1505-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/18/2019] [Accepted: 08/29/2019] [Indexed: 03/13/2023] Open
Abstract
BACKGROUND Silkmoths and their relatives constitute the ecologically and taxonomically diverse superfamily Bombycoidea, which includes some of the most charismatic species of Lepidoptera. Despite displaying spectacular forms and diverse ecological traits, relatively little attention has been given to understanding their evolution and drivers of their diversity. To begin to address this problem, we created a new Bombycoidea-specific Anchored Hybrid Enrichment (AHE) probe set and sampled up to 571 loci for 117 taxa across all major lineages of the Bombycoidea, with a newly developed DNA extraction protocol that allows Lepidoptera specimens to be readily sequenced from pinned natural history collections. RESULTS The well-supported tree was overall consistent with prior morphological and molecular studies, although some taxa were misplaced. The bombycid Arotros Schaus was formally transferred to Apatelodidae. We identified important evolutionary patterns (e.g., morphology, biogeography, and differences in speciation and extinction), and our analysis of diversification rates highlights the stark increases that exist within the Sphingidae (hawkmoths) and Saturniidae (wild silkmoths). CONCLUSIONS Our study establishes a backbone for future evolutionary, comparative, and taxonomic studies of Bombycoidea. We postulate that the rate shifts identified are due to the well-documented bat-moth "arms race". Our research highlights the flexibility of AHE to generate genomic data from a wide range of museum specimens, both age and preservation method, and will allow researchers to tap into the wealth of biological data residing in natural history collections around the globe.
Affiliation(s)
- C A Hamilton
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA.
- Department of Entomology, Plant Pathology & Nematology, University of Idaho, Moscow, ID, 83844, USA.
- R A St Laurent
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- K Dexter
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- I J Kitching
- Department of Life Sciences, Natural History Museum, Cromwell Road, London, SW7 5BD, UK
- J W Breinholt
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA
- RAPiD Genomics, 747 SW 2nd Avenue #314, Gainesville, FL, 32601, USA
- A Zwick
- Australian National Insect Collection, CSIRO, Clunies Ross St, Acton, ACT, Canberra, 2601, Australia
- M J T N Timmermans
- Department of Natural Sciences, Middlesex University, The Burroughs, London, NW4 4BT, UK
- J R Barber
- Department of Biological Sciences, Boise State University, Boise, ID, 83725, USA
- A Y Kawahara
- Florida Museum of Natural History, University of Florida, Gainesville, FL, 32611, USA.
|
29
|
Zhang C, Liu T, Yuan X, Huang H, Yao G, Mo X, Xue X, Yan H. The plastid genome and its implications in barcoding specific-chemotypes of the medicinal herb Pogostemon cablin in China. PLoS One 2019; 14:e0215512. [PMID: 30986249 PMCID: PMC6464210 DOI: 10.1371/journal.pone.0215512] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/11/2018] [Accepted: 04/03/2019] [Indexed: 12/04/2022] Open
Abstract
Pogostemon cablin (Blanco) Benth. (Patchouli) is not only an important essential oil plant, but also a valuable medicinal plant in China. P. cablin in China can be divided into three cultivars (Shipai, Gaoyao, and Hainan) and two chemotypes (pogostone-type and patchoulol-type). The pogostone-type and patchoulol-type are, respectively, used for medicinals and perfumes. In this study, we sequenced and characterized the plastid genomes for all three Chinese cultivars and aimed to develop a chemotype-specific barcode for future quality control. The plastid genomes of P. cablin cultivars ranged from 152,461 to 152,462 bp in length and comprise 114 genes including 80 protein coding genes, 30 tRNA genes, and four rRNA genes. Phylogenetic analyses suggested that P. cablin cultivars clustered with the other two Pogostemon species with strong support. Although extremely conserved in P. cablin plastid genomes, 58 cpSSRs were filtered out among the three cultivars. One single variable locus, cpSSR, was discovered. The cpSSR genotypes successfully matched the chemotypes of Chinese patchouli, which was further supported by PCR-based Sanger sequences in more Chinese patchouli samples. The barcode developed in this study is thought to be a simple and reliable quality control method for Chinese P. cablin on the market.
Affiliation(s)
- Caiyun Zhang
- Guangdong Food and Drug Vocational College, Guangzhou, China
- Tongjian Liu
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Xun Yuan
- College of Life Sciences, South China Agricultural University, Guangzhou, China
- Huirun Huang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Gang Yao
- South China Limestone Plants Research Centre, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, China
- Xiaolu Mo
- Guangdong Food and Drug Vocational College, Guangzhou, China
- Xue Xue
- Guangdong Food and Drug Vocational College, Guangzhou, China
- Haifei Yan
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
|
30
|
Abadi S, Azouri D, Pupko T, Mayrose I. Model selection may not be a mandatory step for phylogeny reconstruction. Nat Commun 2019; 10:934. [PMID: 30804347 PMCID: PMC6389923 DOI: 10.1038/s41467-019-08822-w] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Received: 06/06/2018] [Accepted: 01/29/2019] [Indexed: 11/29/2022] Open
Abstract
Determining the most suitable model for phylogeny reconstruction constitutes a fundamental step in numerous evolutionary studies. Over the years, various criteria for model selection have been proposed, leading to debate over which criterion is preferable. However, the necessity of this procedure has not been questioned to date. Here, we demonstrate that although incongruency regarding the selected model is frequent over empirical and simulated data, all criteria lead to very similar inferences. When topologies and ancestral sequence reconstruction are the desired output, choosing one criterion over another is not crucial. Moreover, skipping model selection and using instead the most parameter-rich model, GTR+I+G, leads to similar inferences, thus rendering this time-consuming step nonessential, at least under current strategies of model selection.
Affiliation(s)
- Shiran Abadi
- School of Plant Sciences and Food Security, Tel Aviv University, Ramat Aviv, Tel-Aviv, 69978, Israel
- Dana Azouri
- School of Plant Sciences and Food Security, Tel Aviv University, Ramat Aviv, Tel-Aviv, 69978, Israel
- School of Molecular Cell Biology & Biotechnology, Tel Aviv University, Ramat Aviv, Tel-Aviv, 69978, Israel
- Tal Pupko
- School of Molecular Cell Biology & Biotechnology, Tel Aviv University, Ramat Aviv, Tel-Aviv, 69978, Israel.
- Itay Mayrose
- School of Plant Sciences and Food Security, Tel Aviv University, Ramat Aviv, Tel-Aviv, 69978, Israel.
|