1
Edwards SV, Cloutier A, Cockburn G, Driver R, Grayson P, Katoh K, Baldwin MW, Sackton TB, Baker AJ. A nuclear genome assembly of an extinct flightless bird, the little bush moa. Science Advances 2024; 10:eadj6823. [PMID: 38781323 DOI: 10.1126/sciadv.adj6823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 04/17/2024] [Indexed: 05/25/2024]
Abstract
We present a draft genome of the little bush moa (Anomalopteryx didiformis), one of approximately nine species of extinct flightless birds from Aotearoa, New Zealand, using ancient DNA recovered from a fossil bone from the South Island. We recover a complete mitochondrial genome at 249.9× depth of coverage and almost 900 megabases of a male moa nuclear genome at ~4 to 5× coverage, with sequence contiguity sufficient to identify more than 85% of avian universal single-copy orthologs. We describe a diverse landscape of transposable elements and satellite repeats, estimate a long-term effective population size of ~240,000, identify a diverse suite of olfactory receptor genes and an opsin repertoire with sensitivity in the ultraviolet range, show that the wingless moa phenotype is likely not attributable to gene loss or pseudogenization, and identify potential function-altering coding sequence variants in moa that could be synthesized for future functional assays. This genomic resource should support further studies of avian evolution and morphological divergence.
Affiliation(s)
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Alison Cloutier
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Glenn Cockburn
  - Evolution of Sensory Systems Research Group, Max Planck Institute for Biological Intelligence, 82319 Seewiesen, Germany
- Robert Driver
  - Department of Biology, East Carolina University, E 5th Street, Greenville, NC 27605, USA
- Phil Grayson
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Kazutaka Katoh
  - Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita 565-0871, Japan
- Maude W Baldwin
  - Evolution of Sensory Systems Research Group, Max Planck Institute for Biological Intelligence, 82319 Seewiesen, Germany
- Timothy B Sackton
  - Informatics Group, Harvard University, 38 Oxford Street, Cambridge, MA 02138, USA
- Allan J Baker
  - Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcox Street, Toronto, ON M5S 3B2, Canada
  - Department of Natural History, Royal Ontario Museum, 100 Queen's Park, Toronto, ON M5S 2C6, Canada
2
Li WH, Chuong CM, Chen CK, Wu P, Jiang TX, Harn HIC, Liu TY, Yu Z, Lu J, Chang YM, Yue Z, Lin J, Vu TD, Huang TY, Ng CS. Transition from natal downs to juvenile feathers: conserved regulatory switches in Neoaves. Research Square 2023:rs.3.rs-3382427. [PMID: 37886492 PMCID: PMC10602114 DOI: 10.21203/rs.3.rs-3382427/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
The transition from natal downs for heat conservation to juvenile feathers for simple flight is a remarkable environmental adaptation process in avian evolution. However, the underlying epigenetic mechanism for this primary feather transition is mostly unknown. Here we conducted time-ordered gene co-expression network construction, epigenetic analysis, and functional perturbations in developing feather follicles to elucidate four downy-juvenile feather transition events. We discovered that LEF1 works as a key hub of Wnt signaling to build rachis and converts radial downy to bilateral symmetry. Extracellular matrix reorganization leads to peripheral pulp formation, which mediates epithelial-mesenchymal interactions for branching morphogenesis. ACTA2 compartments dermal papilla stem cells for feather cycling. Novel usage of scale keratins strengthens feather sheath with SOX14 as the epigenetic regulator. We found this primary feather transition largely conserved in chicken (precocial) and zebra finch (altricial) and discussed the possibility that this evolutionary adaptation process started in feathered dinosaurs.
Affiliation(s)
- Ping Wu
  - University of Southern California
- Hans I-Chen Harn
  - Department of Pathology, Keck School of Medicine, University of Southern California
- Tzu-Yu Liu
  - Department of Pathology, Keck School of Medicine, University of Southern California
- Zhou Yu
  - Department of Pathology, Keck School of Medicine, University of Southern California
- Jiayi Lu
  - Department of Pathology, Keck School of Medicine, University of Southern California
- Trieu-Duc Vu
  - Foundation for Advancement of International Science
- Tao-Yu Huang
  - Biodiversity Research Center, Academia Sinica, Taipei
3
Ortiz-Sepulveda CM, Genete M, Blassiau C, Godé C, Albrecht C, Vekemans X, Van Bocxlaer B. Target enrichment of long open reading frames and ultraconserved elements to link microevolution and macroevolution in non-model organisms. Mol Ecol Resour 2023; 23:659-679. [PMID: 36349833 DOI: 10.1111/1755-0998.13735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2021] [Revised: 10/09/2022] [Accepted: 10/19/2022] [Indexed: 11/10/2022]
Abstract
Despite the increasing accessibility of high-throughput sequencing, obtaining high-quality genomic data on non-model organisms without proximate well-assembled and annotated genomes remains challenging. Here, we describe a workflow that takes advantage of distant genomic resources and ingroup transcriptomes to select and jointly enrich long open reading frames (ORFs) and ultraconserved elements (UCEs) from genomic samples for integrative studies of microevolutionary and macroevolutionary dynamics. This workflow is applied to samples of the African unionid bivalve tribe Coelaturini (Parreysiinae) at basin and continent-wide scales. Our results indicate that ORFs are efficiently captured without prior identification of intron-exon boundaries. The enrichment of UCEs was less successful, but nevertheless produced substantial data sets. Exploratory continent-wide phylogenetic analyses with ORF supercontigs (>515,000 parsimony informative sites) resulted in a fully resolved phylogeny, the backbone of which was also retrieved with UCEs (>11,000 informative sites). Variant calling on ORFs and UCEs of Coelaturini from the Malawi Basin produced ~2000 SNPs per population pair. Estimates of nucleotide diversity and population differentiation were similar for ORFs and UCEs. They were low compared to previous estimates in molluscs, but comparable to those in recently diversifying Malawi cichlids and other taxa at an early stage of speciation. Skimming off-target sequence data from the same enriched libraries of Coelaturini from the Malawi Basin, we reconstructed the maternally-inherited mitogenome, which displays the gene order inferred for the most recent common ancestor of Unionidae. Overall, our workflow and results provide exciting perspectives for integrative genomic studies of microevolutionary and macroevolutionary dynamics in non-model organisms.
Affiliation(s)
- Mathieu Genete
  - CNRS, Univ. Lille, UMR 8198 - Evo-Eco-Paleo, F-59000 Lille, France
- Cécile Godé
  - CNRS, Univ. Lille, UMR 8198 - Evo-Eco-Paleo, F-59000 Lille, France
- Christian Albrecht
  - Department of Animal Ecology and Systematics, Justus Liebig University, D-35392 Giessen, Germany; Department of Biology, Mbarara University of Science and Technology, Mbarara, Uganda
- Xavier Vekemans
  - CNRS, Univ. Lille, UMR 8198 - Evo-Eco-Paleo, F-59000 Lille, France
4
Song H, Wang Q, Zhang Z, Lin K, Pang E. Identification of clade-wide putative cis-regulatory elements from conserved non-coding sequences in Cucurbitaceae genomes. Horticulture Research 2023; 10:uhad038. [PMID: 37799630 PMCID: PMC10548412 DOI: 10.1093/hr/uhad038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 02/20/2023] [Indexed: 10/07/2023]
Abstract
Cis-regulatory elements regulate gene expression and play an essential role in the development and physiology of organisms. Many conserved non-coding sequences (CNSs) function as cis-regulatory elements. They control the development of various lineages. However, predicting clade-wide cis-regulatory elements across several closely related species remains challenging. Based on the relationship between CNSs and cis-regulatory elements, we present a computational approach that predicts the clade-wide putative cis-regulatory elements in 12 Cucurbitaceae genomes. Using 12-way whole-genome alignment, we first obtained 632 112 CNSs in Cucurbitaceae. Next, we identified 16 552 Cucurbitaceae-wide cis-regulatory elements based on collinearity among all 12 Cucurbitaceae plants. Furthermore, we predicted 3 271 potential regulatory pairs in the cucumber genome, of which 98 were verified using integrative RNA sequencing and ChIP sequencing datasets from samples collected during various fruit development stages. The CNSs, Cucurbitaceae-wide cis-regulatory elements, and their target genes are accessible at http://cmb.bnu.edu.cn/cisRCNEs_cucurbit/. These elements are valuable resources for functionally annotating CNSs and their regulatory roles in Cucurbitaceae genomes.
Affiliation(s)
- Hongtao Song
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Qi Wang
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Zhonghua Zhang
  - College of Horticulture, Qingdao Agricultural University, Qingdao 266109, China
- Kui Lin
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
- Erli Pang
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing 100875, China
5
Genome Evolution and the Future of Phylogenomics of Non-Avian Reptiles. Animals (Basel) 2023; 13:ani13030471. [PMID: 36766360 PMCID: PMC9913427 DOI: 10.3390/ani13030471] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/16/2022] [Revised: 01/13/2023] [Accepted: 01/15/2023] [Indexed: 02/01/2023]
Abstract
Non-avian reptiles comprise a large proportion of amniote vertebrate diversity, with squamate reptiles (lizards and snakes) recently overtaking birds as the most species-rich tetrapod radiation. Despite displaying an extraordinary diversity of phenotypic and genomic traits, genomic resources in non-avian reptiles have accumulated more slowly than they have in mammals and birds, the remaining amniotes. Here we review the remarkable natural history of non-avian reptiles, with a focus on the physical traits, genomic characteristics, and sequence compositional patterns that comprise key axes of variation across amniotes. We argue that the high evolutionary diversity of non-avian reptiles can fuel a new generation of whole-genome phylogenomic analyses. A survey of phylogenetic investigations in non-avian reptiles shows that sequence capture-based approaches are the most commonly used, with studies of markers known as ultraconserved elements (UCEs) especially well represented. However, many other types of markers exist and are increasingly being mined from genome assemblies in silico, including some with greater information potential than UCEs for certain investigations. We discuss the importance of high-quality genomic resources and methods for bioinformatically extracting a range of marker sets from genome assemblies. Finally, we encourage herpetologists working in genomics, genetics, evolutionary biology, and other fields to work collectively towards building genomic resources for non-avian reptiles, especially squamates, that rival those already in place for mammals and birds. Overall, the development of this cross-amniote phylogenomic tree of life will contribute to illuminate interesting dimensions of biodiversity across non-avian reptiles and broader amniotes.
6
Espíndola-Hernández P, Mueller JC, Kempenaers B. Genomic signatures of the evolution of a diurnal lifestyle in Strigiformes. G3 Genes|Genomes|Genetics 2022; 12:6595023. [PMID: 35640557 PMCID: PMC9339318 DOI: 10.1093/g3journal/jkac135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Accepted: 05/17/2022] [Indexed: 11/25/2022]
Abstract
Understanding the targets of selection associated with changes in behavioral traits represents an important challenge of current evolutionary research. Owls (Strigiformes) are a diverse group of birds, most of which are considered nocturnal raptors. However, a few owl species independently adopted a diurnal lifestyle in their recent evolutionary history. We searched for signals of accelerated rates of evolution associated with a diurnal lifestyle using a genome-wide comparative approach. We estimated substitution rates in coding and noncoding conserved regions of the genome of seven owl species, including three diurnal species. Substitution rates of the noncoding elements were more accelerated than those of protein-coding genes. We identified new, owl-specific conserved noncoding elements as candidates of parallel evolution during the emergence of diurnality in owls. Our results shed light on the molecular basis of adaptation to a new niche and highlight the importance of regulatory elements for evolutionary changes in behavior. These elements were often involved in the neuronal development of the brain.
Affiliation(s)
- Pamela Espíndola-Hernández
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, 82319 Seewiesen, Germany
- Jakob C Mueller
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, 82319 Seewiesen, Germany
- Bart Kempenaers
  - Department of Behavioural Ecology and Evolutionary Genetics, Max Planck Institute for Ornithology, 82319 Seewiesen, Germany
7
Zhu T, Flouri T, Yang Z. A simulation study to examine the impact of recombination on phylogenomic inferences under the multispecies coalescent model. Mol Ecol 2022; 31:2814-2829. [PMID: 35313033 PMCID: PMC9321900 DOI: 10.1111/mec.16433] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/17/2021] [Revised: 01/25/2022] [Accepted: 02/28/2022] [Indexed: 11/28/2022]
Affiliation(s)
- Tianqi Zhu
  - Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Tomáš Flouri
  - Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
8
Beaulieu JM, O'Meara BC, Gilchrist MA. A Spatially Explicit Model of Stabilizing Selection for Improving Phylogenetic Inference. Mol Biol Evol 2021; 38:1641-1652. [PMID: 33306127 PMCID: PMC8042768 DOI: 10.1093/molbev/msaa318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
Ultraconserved elements (UCEs) are stretches of hundreds of nucleotides with highly conserved cores flanked by variable regions. Although the selective forces responsible for the preservation of UCEs are unknown, they are nonetheless believed to contain phylogenetically meaningful information from deep to shallow divergence events. Phylogenetic applications of UCEs assume the same degree of rate heterogeneity applies across the entire locus, including variable flanking regions. We present a Wright–Fisher model of selection on nucleotides (SelON) which includes the effects of mutation, drift, and spatially varying, stabilizing selection for an optimal nucleotide sequence. The SelON model assumes the strength of stabilizing selection follows a position-dependent Gaussian function whose exact shape can vary between UCEs. We evaluate SelON by comparing its performance to a simpler and spatially invariant GTR+Γ model using an empirical data set of 400 vertebrate UCEs used to determine the phylogenetic position of turtles. We observe much improvement in model fit of SelON over the GTR+Γ model, and support for turtles as sister to lepidosaurs. Overall, the UCE-specific parameters SelON estimates provide a compact way of quantifying the strength and variation in selection within and across UCEs. SelON can also be extended to include more realistic mapping functions between sequence and stabilizing selection as well as allow for greater levels of rate heterogeneity. By more explicitly modeling the nature of selection on UCEs, SelON and similar approaches can be used to better understand the biological mechanisms responsible for their preservation across highly divergent taxa and long evolutionary time scales.
Affiliation(s)
- Jeremy M Beaulieu
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Brian C O'Meara
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
- Michael A Gilchrist
  - Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
9
Huang J, Bennett J, Flouri T, Leaché AD, Yang Z. Phase Resolution of Heterozygous Sites in Diploid Genomes is Important to Phylogenomic Analysis under the Multispecies Coalescent Model. Syst Biol 2021; 71:334-352. [PMID: 34143216 PMCID: PMC8977997 DOI: 10.1093/sysbio/syab047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/28/2020] [Revised: 06/03/2021] [Accepted: 06/21/2021] [Indexed: 01/01/2023]
Abstract
Genome sequencing projects routinely generate haploid consensus sequences from diploid genomes, which are effectively chimeric sequences with the phase at heterozygous sites resolved at random. The impact of phasing errors on phylogenomic analyses under the multispecies coalescent (MSC) model is largely unknown. Here, we conduct a computer simulation to evaluate the performance of four phase-resolution strategies (the true phase resolution, the diploid analytical integration algorithm which averages over all phase resolutions, computational phase resolution using the program PHASE, and random resolution) on estimation of the species tree and evolutionary parameters in analysis of multilocus genomic data under the MSC model. We found that species tree estimation is robust to phasing errors when species divergences were much older than average coalescent times but may be affected by phasing errors when the species tree is shallow. Estimation of parameters under the MSC model with and without introgression is affected by phasing errors. In particular, random phase resolution causes serious overestimation of population sizes for modern species and biased estimation of cross-species introgression probability. In general, the impact of phasing errors is greater when the mutation rate is higher, the data include more samples per species, and the species tree is shallower with recent divergences. Use of phased sequences inferred by the PHASE program produced small biases in parameter estimates. We analyze two real data sets, one of East Asian brown frogs and another of Rocky Mountains chipmunks, to demonstrate that heterozygote phase-resolution strategies have similar impacts on practical data analyses. We suggest that genome sequencing projects should produce unphased diploid genotype sequences if fully phased data are too challenging to generate, and avoid haploid consensus sequences, which have heterozygous sites phased at random. In case the analytical integration algorithm is computationally unfeasible, computational phasing prior to population genomic analyses is an acceptable alternative. [BPP; introgression; multispecies coalescent; phase; species tree.]
Affiliation(s)
- Jun Huang
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Mathematics, Beijing Jiaotong University, Beijing, 100044, P.R. China
- Jeremy Bennett
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Ecology and Evolutionary Biology, University of Connecticut, 75 N. Eagleville Road, Unit 3043, Storrs, CT 06269-3043, USA
- Tomáš Flouri
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Adam D Leaché
  - Department of Biology & Burke Museum of Natural History and Culture, University of Washington, Seattle, WA 98195-1800, USA
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
10
Arcila D, Hughes LC, Meléndez-Vazquez F, Baldwin CC, White W, Carpenter K, Williams JT, Santos MD, Pogonoski J, Miya M, Ortí G, Betancur-R R. Testing the utility of alternative metrics of branch support to address the ancient evolutionary radiation of tunas, stromateoids, and allies (Teleostei: Pelagiaria). Syst Biol 2021; 70:1123-1144. [PMID: 33783539 DOI: 10.1093/sysbio/syab018] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/23/2021] [Accepted: 03/13/2021] [Indexed: 12/19/2022]
Abstract
The use of high-throughput sequencing technologies to produce genome-scale datasets was expected to settle some long-standing controversies across the Tree of Life, particularly in areas where short branches occur at deep timescales. Instead, these datasets have often yielded many well-supported but conflicting topologies, and highly variable gene-tree distributions. A variety of branch-support metrics beyond the nonparametric bootstrap are now available to assess how robust a phylogenetic hypothesis may be, as well as new methods to quantify gene-tree discordance. We applied multiple branch support metrics to an ancient group of marine fishes (Teleostei: Pelagiaria) whose interfamilial relationships have proven difficult to resolve due to a rapid accumulation of lineages very early in its history. We analyzed hundreds of loci including published UCE data and newly generated exonic data along with their flanking regions to represent all 16 extant families for more than 150 out of 284 valid species in the group. Branch support was lower for interfamilial relationships (except the SH-like aLRT and aBayes methods) regardless of the type of marker used. Several nodes that were highly supported with bootstrap had very low site and gene-tree concordance, revealing underlying conflict. Despite this conflict, we were able to identify four consistent interfamilial clades, each comprised of two or three families. Combining exons with their flanking regions also produced increased branch lengths in the deep branches of the pelagiarian tree. Our results demonstrate the limitations of employing current metrics of branch support and species-tree estimation when assessing the confidence of ancient evolutionary radiations and emphasize the necessity to embrace alternative measurements to explore phylogenetic uncertainty and discordance in phylogenomic datasets.
Affiliation(s)
- Dahiana Arcila
  - Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.; Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Lily C Hughes
  - Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.; Department of Organismal Biology and Anatomy, The University of Chicago, Illinois, Chicago, U.S.A.; Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- Fernando Meléndez-Vazquez
  - Department of Ichthyology, Sam Noble Oklahoma Museum of Natural History, Norman, Oklahoma, U.S.A.; Department of Biology, University of Oklahoma, Norman, Oklahoma, U.S.A.
- Carole C Baldwin
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- William White
  - CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Kent Carpenter
  - Department of Biological Sciences, Old Dominion University, Norfolk, Virginia, U.S.A.
- Jeffrey T Williams
  - Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
- John Pogonoski
  - CSIRO Australian National Fish Collection, National Research Collections Australia, Hobart, Hobart, Tasmania, Australia
- Masaki Miya
  - Natural History Museum and Institute, Chiba, Aoba-cho, Chuo-ku, Chiba, Japan
- Guillermo Ortí
  - Department of Biological Sciences, The George Washington University, Washington, District of Columbia, U.S.A.; Department of Vertebrate Zoology, Smithsonian Institution National Museum of Natural History, Washington, District of Columbia, U.S.A.
11
Lv X, Hu J, Hu Y, Li Y, Xu D, Ryder OA, Irwin DM, Yu L. Diverse phylogenomic datasets uncover a concordant scenario of laurasiatherian interordinal relationships. Mol Phylogenet Evol 2020; 157:107065. [PMID: 33387649 DOI: 10.1016/j.ympev.2020.107065] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/09/2020] [Revised: 12/22/2020] [Accepted: 12/24/2020] [Indexed: 10/22/2022]
Abstract
Resolving the interordinal relationships in the mammalian superorder Laurasiatheria has been among the most intractable problems in higher-level mammalian systematics, with many conflicting hypotheses having been proposed. The present study collected three different sources of genome-scale data with comprehensive taxon sampling of laurasiatherian species, including two protein-coding datasets (4,186 protein-coding genes for an amino acid dataset comprising 2,761,247 amino acid residues and a nucleotide dataset comprising 5,516,340 nucleotides from 1st and 2nd codon positions), an intronic dataset (1,210 introns comprising 1,162,723 nucleotides) and an ultraconserved elements (UCEs) dataset (1,246 UCEs comprising 1,946,472 nucleotides) from 40 species representing all six laurasiatherian orders and 7 non-laurasiatherian outgroups. Remarkably, phylogenetic trees reconstructed with the four datasets using different tree-building methods (RAxML, FastTree, ASTRAL and MP-EST) all supported the relationship (Eulipotyphla, (Chiroptera, ((Carnivora, Pholidota), (Cetartiodactyla, Perissodactyla)))). We find a resolution of interordinal relationships of Laurasiatheria among all types of markers used in the present study, and the likelihood ratio tests for tree comparisons confirmed that the present tree topology is the optimal hypothesis compared to other examined hypotheses. Jackknifing subsampling analyses demonstrate that the results of laurasiatherian tree reconstruction varied with the number of loci and ordinal representatives used, which are likely the two main contributors to phylogenetic disagreements of Laurasiatheria seen in previous studies. Our study provides significant insight into laurasiatherian evolution, and moreover, an important methodological strategy and reference for resolving phylogenies of adaptive radiation, which have been a long-standing challenge in the field of phylogenetics.
Affiliation(s)
- Xue Lv
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China; School of Life Sciences, Yunnan University, Kunming, China
- Jingyang Hu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China; School of Life Sciences, Yunnan University, Kunming, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming, China
- Yiwen Hu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China; School of Life Sciences, Yunnan University, Kunming, China
- Yitian Li
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China; School of Life Sciences, Yunnan University, Kunming, China
- Dongming Xu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Kunming, China
- Oliver A Ryder
  - Institute for Conservation Research, San Diego Zoo Global, Escondido, CA, USA
- David M Irwin
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada
- Li Yu
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming, China
12
Hughes LC, Ortí G, Saad H, Li C, White WT, Baldwin CC, Crandall KA, Arcila D, Betancur-R R. Exon probe sets and bioinformatics pipelines for all levels of fish phylogenomics. Mol Ecol Resour 2020; 21:816-833. [PMID: 33084200 DOI: 10.1111/1755-0998.13287] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/20/2020] [Accepted: 10/09/2020] [Indexed: 11/28/2022]
Abstract
Exon markers have a long history of use in phylogenetics of ray-finned fishes, the most diverse clade of vertebrates with more than 35,000 species. As the number of published genomes increases, it has become easier to test exons and other genetic markers for signals of ancient duplication events and filter out paralogues that can mislead phylogenetic analysis. We present seven new probe sets for current target-capture phylogenomic protocols that capture 1,104 exons explicitly filtered for paralogues using gene trees. These seven probe sets span the diversity of teleost fishes, including four sets that target five hyperdiverse percomorph clades which together comprise ca. 17,000 species (Carangaria, Ovalentaria, Eupercaria, and Syngnatharia + Pelagiaria combined). We additionally included probes to capture legacy nuclear exons and mitochondrial markers that have been commonly used in fish phylogenetics (despite some exons being flagged for paralogues) to facilitate integration of old and new molecular phylogenetic matrices. We tested these probes experimentally for 56 fish species (eight species per probe set) and merged new exon-capture sequence data into an existing data matrix of 1,104 exons and 300 ray-finned fish species. We provide an optimized bioinformatics pipeline to assemble exon capture data from raw reads to alignments for downstream analysis. We show that legacy loci with known paralogues are at risk of assembling duplicated sequences with target-capture, but we also assembled many useful orthologous sequences that can be integrated with many PCR-generated matrices. These probe sets are a valuable resource for advancing fish phylogenomics because targeted exons can easily be extracted from increasingly available whole genome and transcriptome data sets, and also may be integrated with existing PCR-based exon and mitochondrial data.
Affiliation(s)
- Lily C Hughes
  - Department of Biological Sciences, George Washington University, Washington, DC, USA
  - Computational Biology Institute, Milken Institute of Public Health, George Washington University, Washington, DC, USA
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Guillermo Ortí
  - Department of Biological Sciences, George Washington University, Washington, DC, USA
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Hadeel Saad
  - Department of Biological Sciences, George Washington University, Washington, DC, USA
- Chenhong Li
  - College of Fisheries and Life Sciences, Shanghai Ocean University, Shanghai, China
- William T White
  - CSIRO Australian National Fish Collection, National Research Collections of Australia, Hobart, TAS, Australia
- Carole C Baldwin
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
- Keith A Crandall
  - Department of Biological Sciences, George Washington University, Washington, DC, USA
  - Computational Biology Institute, Milken Institute of Public Health, George Washington University, Washington, DC, USA
- Dahiana Arcila
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
  - Sam Noble Oklahoma Museum of Natural History, Norman, OK, USA
  - Department of Biology, University of Oklahoma, Norman, OK, USA
13
Van Dam MH, Henderson JB, Esposito L, Trautwein M. Genomic Characterization and Curation of UCEs Improves Species Tree Reconstruction. Syst Biol 2020; 70:307-321. [PMID: 32750133 PMCID: PMC7875437 DOI: 10.1093/sysbio/syaa063] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/16/2019] [Revised: 07/26/2020] [Accepted: 07/29/2020] [Indexed: 12/12/2022]
Abstract
Ultraconserved genomic elements (UCEs) are generally treated as independent loci in phylogenetic analyses. The identification pipeline for UCE probes does not require prior knowledge of genetic identity, only selecting loci that are highly conserved, single copy, without repeats, and of a particular length. Here, we characterized UCEs from 11 phylogenomic studies across the animal tree of life, from birds to marine invertebrates. We found that within vertebrate lineages, UCEs are mostly intronic and intergenic, while in invertebrates, the majority are in exons. We then curated four different sets of UCE markers by genomic category from five different studies including: birds, mammals, fish, Hymenoptera (ants, wasps, and bees), and Coleoptera (beetles). Of genes captured by UCEs, we find that many are represented by two or more UCEs, corresponding to nonoverlapping segments of a single gene. We considered these UCEs to be nonindependent, merged all UCEs that belonged to a particular gene, constructed gene and species trees, and then evaluated the subsequent effect of merging cogenic UCEs on gene and species tree reconstruction. Average bootstrap support for merged UCE gene trees was significantly improved across all data sets apparently driven by the increase in loci length. Additionally, we conducted simulations and found that gene trees generated from merged UCEs were more accurate than those generated by unmerged UCEs. As loci length improves gene tree accuracy, this modest degree of UCE characterization and curation impacts downstream analyses and demonstrates the advantages of incorporating basic genomic characterizations into phylogenomic analyses. [Anchored hybrid enrichment; ants; ASTRAL; bait capture; carangimorph; Coleoptera; conserved nonexonic elements; exon capture; gene tree; Hymenoptera; mammal; phylogenomic markers; songbird; species tree; ultraconserved elements; weevils.]
Affiliation(s)
- Matthew H Van Dam
  - Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
  - Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- James B Henderson
  - Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- Lauren Esposito
  - Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
  - Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
- Michelle Trautwein
  - Entomology Department, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
  - Center for Comparative Genomics, Institute for Biodiversity Science and Sustainability, California Academy of Sciences, 55 Music Concourse Dr., San Francisco, CA 94118, USA
14
Huang J, Flouri T, Yang Z. A Simulation Study to Examine the Information Content in Phylogenomic Data Sets under the Multispecies Coalescent Model. Mol Biol Evol 2020; 37:3211-3224. [DOI: 10.1093/molbev/msaa166] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/05/2023]
Abstract
We use computer simulation to examine the information content in multilocus data sets for inference under the multispecies coalescent model. Inference problems considered include estimation of evolutionary parameters (such as species divergence times, population sizes, and cross-species introgression probabilities), species tree estimation, and species delimitation based on Bayesian comparison of delimitation models. We found that the number of loci is the most influential factor for almost all inference problems examined. Although the number of sequences per species does not appear to be important to species tree estimation, it is very influential to species delimitation. Increasing the number of sites and the per-site mutation rate both increase the mutation rate for the whole locus and these have the same effect on estimation of parameters, but the sequence length has a greater effect than the per-site mutation rate for species tree estimation. We discuss the computational costs when the data size increases and provide guidelines concerning the subsampling of genomic data to enable the application of full-likelihood methods of inference.
Affiliation(s)
- Jun Huang
  - Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
  - Department of Mathematics, Beijing Jiaotong University, Beijing, P.R. China
- Tomáš Flouri
  - Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
15
Jiao X, Yang Z. Defining Species When There is Gene Flow. Syst Biol 2020; 70:108-119. [PMID: 32617579 DOI: 10.1093/sysbio/syaa052] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/27/2020] [Revised: 06/23/2020] [Accepted: 06/23/2020] [Indexed: 12/20/2022]
Abstract
Whatever one's definition of species, it is generally expected that individuals of the same species should be genetically more similar to each other than they are to individuals of another species. Here, we show that in the presence of cross-species gene flow, this expectation may be incorrect. We use the multispecies coalescent model with continuous-time migration or episodic introgression to study the impact of gene flow on genetic differences within and between species and highlight a surprising but plausible scenario in which different population sizes and asymmetrical migration rates cause a genetic sequence to be on average more closely related to a sequence from another species than to a sequence from the same species. Our results highlight the extraordinary impact that even a small amount of gene flow may have on the genetic history of the species. We suggest that contrasting long-term migration rate and short-term hybridization rate, both of which can be estimated using genetic data, may be a powerful approach to detecting the presence of reproductive barriers and to define species boundaries. [Gene flow; introgression; migration; multispecies coalescent; species concept; species delimitation.]
Affiliation(s)
- Xiyun Jiao
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Ziheng Yang
  - Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
16
Chan KO, Hutter CR, Wood PL, Grismer LL, Brown RM. Larger, unfiltered datasets are more effective at resolving phylogenetic conflict: Introns, exons, and UCEs resolve ambiguities in Golden-backed frogs (Anura: Ranidae; genus Hylarana). Mol Phylogenet Evol 2020; 151:106899. [PMID: 32590046 DOI: 10.1016/j.ympev.2020.106899] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/03/2020] [Revised: 05/18/2020] [Accepted: 06/17/2020] [Indexed: 01/01/2023]
Abstract
Using FrogCap, a recently-developed sequence-capture protocol, we obtained >12,000 highly informative exons, introns, and ultraconserved elements (UCEs), which we used to illustrate variation in evolutionary histories of these classes of markers, and to resolve long-standing systematic problems in Southeast Asian Golden-backed frogs of the genus-complex Hylarana. We also performed a comprehensive suite of analyses to assess the relative performance of different genetic markers, data filtering strategies, tree inference methods, and different measures of branch support. To reduce gene tree estimation error, we filtered the data using different thresholds of taxon completeness (missing data) and parsimony informative sites (PIS). We then estimated species trees using concatenated datasets and Maximum Likelihood (IQ-TREE) in addition to summary (ASTRAL-III), distance-based (ASTRID), and site-based (SVDQuartets) multispecies coalescent methods. Topological congruence and branch support were examined using traditional bootstrap, local posterior probabilities, gene concordance factors, quartet frequencies, and quartet scores. Our results did not yield a single concordant topology. Instead, introns, exons, and UCEs clearly possessed different phylogenetic signals, resulting in conflicting, yet strongly-supported phylogenetic estimates. However, a combined analysis comprising the most informative introns, exons, and UCEs converged on a similar topology across all analyses, with the exception of SVDQuartets. Bootstrap values were consistently high despite high levels of incongruence and high proportions of gene trees supporting conflicting topologies. Although low bootstrap values did indicate low heuristic support, high bootstrap support did not necessarily reflect congruence or support for the correct topology. 
This study reiterates findings of some previous studies, which demonstrated that traditional bootstrap values can produce positively misleading measures of support in large phylogenomic datasets. We also showed a remarkably strong positive relationship between branch length and topological congruence across all datasets, implying that very short internodes remain a challenge to resolve, even with orders of magnitude more data than ever before. Overall, our results demonstrate that more data from unfiltered or combined datasets produced superior results. Although data filtering reduced gene tree incongruence, decreased amounts of data also biased phylogenetic estimation. A point of diminishing returns was evident, at which higher congruence (from more stringent filtering) at the expense of amount of data led to topological error as assessed by comparison to more complete datasets across different genomic markers. Additionally, we showed that applying a parameter-rich model to a partitioned analysis of concatenated data produces better results compared to unpartitioned, or even partitioned analysis using model selection. Despite some lingering uncertainties, a combined analysis of our genomic data and sequences supplemented from GenBank (on the basis of a few gene regions) revealed highly supported novel systematic arrangements. Based on these new findings, we transfer Amnirana nicobariensis into the genus Indosylvirana; and I. milleti and Hylarana celebensis to the genus Papurana. We also provisionally place H. attigua in the genus Papurana pending verification from positively identified (voucher substantiated) samples.
Affiliation(s)
- Kin Onn Chan
  - Lee Kong Chian Natural History Museum, Faculty of Science, National University of Singapore, 2 Conservatory Drive, 117377, Singapore
- Carl R Hutter
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
  - Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Perry L Wood
  - Museum of Natural Sciences and Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
  - Department of Biological Sciences & Museum of Natural History, Auburn University, Auburn, AL 36849, USA
- L Lee Grismer
  - Herpetology Laboratory, Department of Biology, La Sierra University, 4500 Riverwalk Parkway, Riverside, CA 92505, USA
- Rafe M Brown
  - Biodiversity Institute and Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045, USA
17
Deep-Time Demographic Inference Suggests Ecological Release as Driver of Neoavian Adaptive Radiation. DIVERSITY-BASEL 2020. [DOI: 10.3390/d12040164] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Assessing the applicability of theory to major adaptive radiations in deep time represents an extremely difficult problem in evolutionary biology. Neoaves, which includes 95% of living birds, is believed to have undergone a period of rapid diversification roughly coincident with the Cretaceous–Paleogene (K-Pg) boundary. We investigate whether basal neoavian lineages experienced an ecological release in response to ecological opportunity, as evidenced by density compensation. We estimated effective population sizes (Ne) of basal neoavian lineages by combining coalescent branch lengths (CBLs) and the numbers of generations between successive divergences. We used a modified version of Accurate Species TRee Algorithm (ASTRAL) to estimate CBLs directly from insertion–deletion (indel) data, as well as from gene trees using DNA sequence and/or indel data. We found that some divergences near the K-Pg boundary involved unexpectedly high gene tree discordance relative to the estimated number of generations between speciation events. The simplest explanation for this result is an increase in Ne, despite the caveats discussed herein. It appears that at least some early neoavian lineages, similar to the ancestor of the clade comprising doves, mesites, and sandgrouse, experienced ecological release near the time of the K-Pg mass extinction.
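The relationship the abstract relies on can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: it assumes the standard diploid definition of a coalescent unit, in which a branch spanning T generations has length T / (2 Ne) in coalescent units, so Ne = T / (2 × CBL). The function name and the input numbers are hypothetical.

```python
def effective_population_size(generations: float, coalescent_units: float) -> float:
    """Estimate Ne for a diploid lineage.

    Assumes branch length in coalescent units = generations / (2 * Ne),
    which rearranges to Ne = generations / (2 * coalescent_units).
    """
    if generations <= 0 or coalescent_units <= 0:
        raise ValueError("generations and coalescent_units must be positive")
    return generations / (2.0 * coalescent_units)


# Hypothetical inputs: a branch spanning one million generations that is
# estimated to be 2.0 coalescent units long.
ne = effective_population_size(generations=1_000_000, coalescent_units=2.0)
print(ne)  # 250000.0
```

Under this relationship, a short coalescent branch length for a fixed number of generations (that is, high gene tree discordance) implies a larger Ne, which is the direction of inference described in the abstract.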
18
Lou F, Zhang Y, Song N, Ji D, Gao T. Comprehensive Transcriptome Analysis Reveals Insights into Phylogeny and Positively Selected Genes of Sillago Species. Animals (Basel) 2020; 10:ani10040633. [PMID: 32272562 PMCID: PMC7222750 DOI: 10.3390/ani10040633] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/24/2020] [Revised: 03/31/2020] [Accepted: 04/01/2020] [Indexed: 01/09/2023]
Abstract
Sillago species live in demersal environments and face multiple stressors, such as localized oxygen depletion, sulfide accumulation, and high turbidity. In this study, we performed transcriptome analyses of seven Sillago species to provide insights into the phylogeny and positively selected genes of these species. After de novo assembly, 82,024, 58,102, 63,807, 85,990, 102,185, 69,748, and 102,903 unigenes were generated from S. japonica, S. aeolus, S. sp.1, S. sihama, S. sp.2, S. parvisquamis, and S. sinica, respectively. Furthermore, 140 shared orthologous exon markers were identified and then applied to reconstruct the phylogenetic relationships of the seven Sillago species. The reconstructed phylogenetic structure was significantly congruent with the prevailing morphological and molecular biological view of Sillago species relationships. In addition, a total of 44 genes were identified to be positively selected, and these genes were potential participants in the stress response, material (carbohydrate, amino acid and lipid) and energy metabolism, growth and differentiation, embryogenesis, visual sense, and other biological processes. We suspected that these genes possibly allowed Sillago species to increase their ecological adaptation to multiple environmental stressors.
Affiliation(s)
- Fangrui Lou
  - Fishery College, Zhejiang Ocean University, Zhoushan 316022, Zhejiang, China
- Yuan Zhang
  - Fishery College, Ocean University of China, Qingdao 266003, Shandong, China
- Na Song
  - Fishery College, Ocean University of China, Qingdao 266003, Shandong, China
- Dongping Ji
  - Agricultural Machinery Service Center, Fangchenggang 538000, Guangxi, China
- Tianxiang Gao
  - Fishery College, Zhejiang Ocean University, Zhoushan 316022, Zhejiang, China
  - Correspondence: Tel.: +86-580-2089-333
19
Karin BR, Gamble T, Jackman TR. Optimizing Phylogenomics with Rapidly Evolving Long Exons: Comparison with Anchored Hybrid Enrichment and Ultraconserved Elements. Mol Biol Evol 2020; 37:904-922. [PMID: 31710677 PMCID: PMC7038749 DOI: 10.1093/molbev/msz263] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022]
Abstract
Marker selection has emerged as an important component of phylogenomic study design due to rising concerns of the effects of gene tree estimation error, model misspecification, and data-type differences. Researchers must balance various trade-offs associated with locus length and evolutionary rate among other factors. The most commonly used reduced representation data sets for phylogenomics are ultraconserved elements (UCEs) and Anchored Hybrid Enrichment (AHE). Here, we introduce Rapidly Evolving Long Exon Capture (RELEC), a new set of loci that targets single exons that are both rapidly evolving (evolutionary rate faster than RAG1) and relatively long in length (>1,500 bp), while at the same time avoiding paralogy issues across amniotes. We compare the RELEC data set to UCEs and AHE in squamate reptiles by aligning and analyzing orthologous sequences from 17 squamate genomes, composed of 10 snakes and 7 lizards. The RELEC data set (179 loci) outperforms AHE and UCEs by maximizing per-locus genetic variation while maintaining presence and orthology across a range of evolutionary scales. RELEC markers show higher phylogenetic informativeness than UCE and AHE loci, and RELEC gene trees show greater similarity to the species tree than AHE or UCE gene trees. Furthermore, with fewer loci, RELEC remains computationally tractable for full Bayesian coalescent species tree analyses. We contrast RELEC to and discuss important aspects of comparable methods, and demonstrate how RELEC may be the most effective set of loci for resolving difficult nodes and rapid radiations. We provide several resources for capturing or extracting RELEC loci from other amniote groups.
Affiliation(s)
- Benjamin R Karin
  - Department of Biology, Villanova University, Villanova, PA
  - Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA
- Tony Gamble
  - Department of Biological Sciences, Marquette University, Milwaukee, WI
  - Milwaukee Public Museum, Milwaukee, WI
  - Bell Museum of Natural History, University of Minnesota, St. Paul, MN
- Todd R Jackman
  - Department of Biology, Villanova University, Villanova, PA
20
Du Y, Wu S, Edwards SV, Liu L. The effect of alignment uncertainty, substitution models and priors in building and dating the mammal tree of life. BMC Evol Biol 2019; 19:203. [PMID: 31694538 PMCID: PMC6833305 DOI: 10.1186/s12862-019-1534-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/07/2019] [Accepted: 10/21/2019] [Indexed: 11/29/2022]
Abstract
BACKGROUND The flood of genomic data to help build and date the tree of life requires automation at several critical junctures, most importantly during sequence assembly and alignment. It is widely appreciated that automated alignment protocols can yield inaccuracies, but the relative impact of various sources of error on phylogenomic analysis is not yet known. This study employs an updated mammal data set of 5162 coding loci sampled from 90 species to evaluate the effects of alignment uncertainty, substitution models, and fossil priors on gene tree, species tree, and divergence time estimation. Additionally, a novel coalescent likelihood ratio test is introduced for comparing competing species trees against a given set of gene trees. RESULTS The aligned DNA sequences of 5162 loci from 90 species were trimmed and filtered using trimAl and two filtering protocols. The final dataset contains 4 sets of alignments - before trimming, after trimming, filtered by a recently proposed pipeline, and further filtered by comparing ML gene trees for each locus with the concatenation tree. Our analyses suggest that the average discordance among the coalescent trees is significantly smaller than that among the concatenation trees estimated from the 4 sets of alignments or with different substitution models. There is no significant difference among the divergence times estimated with different substitution models. However, the divergence dates estimated from the alignments after trimming are more recent than those estimated from the alignments before trimming. CONCLUSIONS Our results highlight that alignment uncertainty of the updated mammal data set and the choice of substitution models have little impact on tree topologies yielded by coalescent methods for species tree estimation, whereas they are more influential on the trees made by concatenation. 
Given the choice of calibration scheme and clock models, divergence time estimates are robust to the choice of substitution models, but removing alignments deemed problematic by trimming algorithms can lead to more recent dates. Although the fossil prior is important in divergence time estimation, Bayesian estimates of divergence times in this data set are driven primarily by the sequence data.
Affiliation(s)
- Yan Du
  - Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30606 USA
- Shaoyuan Wu
  - Jiangsu Key Laboratory of Phylogenomics & Comparative Genomics, School of Life Sciences, Jiangsu Normal University, Xuzhou, Jiangsu 221116 People’s Republic of China
- Scott V. Edwards
  - Department of Organismic & Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138 USA
- Liang Liu
  - Department of Statistics and Institute of Bioinformatics, University of Georgia, 310 Herty Drive, Athens, GA 30606 USA
21
Cloutier A, Sackton TB, Grayson P, Clamp M, Baker AJ, Edwards SV. Whole-Genome Analyses Resolve the Phylogeny of Flightless Birds (Palaeognathae) in the Presence of an Empirical Anomaly Zone. Syst Biol 2019; 68:937-955. [PMID: 31135914 PMCID: PMC6857515 DOI: 10.1093/sysbio/syz019] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 02/09/2018] [Revised: 03/06/2019] [Accepted: 04/09/2019] [Indexed: 01/17/2023]
Abstract
Palaeognathae represent one of the two basal lineages in modern birds, and comprise the volant (flighted) tinamous and the flightless ratites. Resolving palaeognath phylogenetic relationships has historically proved difficult, and short internal branches separating major palaeognath lineages in previous molecular phylogenies suggest that extensive incomplete lineage sorting (ILS) might have accompanied a rapid ancient divergence. Here, we investigate palaeognath relationships using genome-wide data sets of three types of noncoding nuclear markers, together totaling 20,850 loci and over 41 million base pairs of aligned sequence data. We recover a fully resolved topology placing rheas as the sister to kiwi and emu + cassowary that is congruent across marker types for two species tree methods (MP-EST and ASTRAL-II). This topology is corroborated by patterns of insertions for 4274 CR1 retroelements identified from multispecies whole-genome screening, and is robustly supported by phylogenomic subsampling analyses, with MP-EST demonstrating particularly consistent performance across subsampling replicates as compared to ASTRAL. In contrast, analyses of concatenated data supermatrices recover rheas as the sister to all other nonostrich palaeognaths, an alternative that lacks retroelement support and shows inconsistent behavior under subsampling approaches. While statistically supporting the species tree topology, conflicting patterns of retroelement insertions also occur and imply high amounts of ILS across short successive internal branches, consistent with observed patterns of gene tree heterogeneity. 
Coalescent simulations and topology tests indicate that the majority of observed topological incongruence among gene trees is consistent with coalescent variation rather than arising from gene tree estimation error alone, and estimated branch lengths for short successive internodes in the inferred species tree fall within the theoretical range encompassing the anomaly zone. Distributions of empirical gene trees confirm that the most common gene tree topology for each marker type differs from the species tree, signifying the existence of an empirical anomaly zone in palaeognaths.
Affiliation(s)
- Alison Cloutier
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Department of Ornithology, Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Timothy B Sackton
  - Informatics Group, Harvard University, 28 Oxford Street, Cambridge, MA 02138, USA
- Phil Grayson
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Department of Ornithology, Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
- Michele Clamp
  - Informatics Group, Harvard University, 28 Oxford Street, Cambridge, MA 02138, USA
- Allan J Baker
  - Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcox Street, Toronto, Ontario M5S 3B2, Canada
  - Department of Natural History, Royal Ontario Museum, 100 Queen’s Park, Toronto, Ontario M5S 2C6, Canada
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
  - Department of Ornithology, Museum of Comparative Zoology, Harvard University, 26 Oxford Street, Cambridge, MA 02138, USA
22
Pie MR, Bornschein MR, Ribeiro LF, Faircloth BC, McCormack JE. Phylogenomic species delimitation in microendemic frogs of the Brazilian Atlantic Forest. Mol Phylogenet Evol 2019; 141:106627. [PMID: 31539606 DOI: 10.1016/j.ympev.2019.106627] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/08/2019] [Revised: 08/17/2019] [Accepted: 09/17/2019] [Indexed: 10/26/2022]
Abstract
The advent of next-generation sequencing allows researchers to use large-scale datasets for species delimitation analyses, yet one can envision an inflection point where the added accuracy of including more loci does not offset the increased computational burden. One alternative to including all loci could be to prioritize the analysis of loci for which there is an expectation of high informativeness. Here, we explore the issue of species delimitation and locus selection with montane species from two anuran genera that have been isolated in sky islands across the southern Brazilian Atlantic Forest: Melanophryniscus (Bufonidae) and Brachycephalus (Brachycephalidae). To delimit species, we obtained genetic data using target enrichment of ultraconserved elements from 32 populations (13 for Melanophryniscus and 19 for Brachycephalus), and we were able to create datasets that included over 800 loci with no missing data. We ranked loci according to their number of parsimony-informative sites, and we performed species delimitation analyses using BPP with the most informative 10, 20, 40, 80, 160, 320, and 640 loci. We identified three types of phylogenetic node: nodes with either consistently high or low support regardless of the number of loci or their informativeness and nodes that were initially poorly supported where support became stronger as we included more data. When viewed across all sensitivity analyses, our results suggest that the current species richness in both genera is likely underestimated. In addition, our results show the effects of different sampling strategies on species delimitation using phylogenomic datasets.
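The locus-ranking step described above can be illustrated with a short sketch. It is not the authors' pipeline: it assumes the usual definition of a parsimony-informative site (an alignment column with at least two character states, each present in at least two sequences), counts only unambiguous A/C/G/T bases, and uses hypothetical function names and toy alignments.

```python
from collections import Counter


def parsimony_informative_sites(alignment):
    """Count columns with at least two states that each occur in >= 2 sequences."""
    valid = set("ACGT")  # gaps and ambiguity codes are ignored (an assumption)
    count = 0
    for column in zip(*alignment):
        states = Counter(base for base in (c.upper() for c in column) if base in valid)
        if sum(1 for n in states.values() if n >= 2) >= 2:
            count += 1
    return count


def top_loci(loci, k):
    """Return the names of the k loci with the most parsimony-informative sites."""
    ranked = sorted(loci, key=lambda name: parsimony_informative_sites(loci[name]),
                    reverse=True)
    return ranked[:k]


# Toy alignments (hypothetical data): each locus is a list of aligned sequences.
loci = {
    "locus_a": ["ACGT", "ACGT", "ATGA", "ATGA"],  # 2 informative columns
    "locus_b": ["AAAA", "AAAA", "AAAT", "AAAA"],  # 0 (the T is a singleton)
    "locus_c": ["ACGT", "ACGT", "TCGT", "TCGT"],  # 1 informative column
}
print(top_loci(loci, 2))  # ['locus_a', 'locus_c']
```

Repeating the downstream analysis with k set to 10, 20, 40, and so on mirrors the nested subsets of most-informative loci described in the abstract.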
Affiliation(s)
- Marcio R Pie
  - Departamento de Zoologia, Universidade Federal do Paraná, CEP 81531-980 Curitiba, Paraná, Brazil
  - Mater Natura - Instituto de Estudos Ambientais, CEP 80250-020 Curitiba, Paraná, Brazil
- Marcos R Bornschein
  - Mater Natura - Instituto de Estudos Ambientais, CEP 80250-020 Curitiba, Paraná, Brazil
  - Instituto de Biociências, Universidade Estadual Paulista, Praça Infante Dom Henrique s/nº, Parque Bitaru, CEP 11330-900 São Vicente, São Paulo, Brazil
- Luiz F Ribeiro
  - Mater Natura - Instituto de Estudos Ambientais, CEP 80250-020 Curitiba, Paraná, Brazil
  - Escola de Ciências da Vida, Pontifícia Universidade Católica do Paraná, CEP 80215-901 Curitiba, Paraná, Brazil
- Brant C Faircloth
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
- John E McCormack
  - Moore Laboratory of Zoology, Occidental College, 1600 Campus Road, Los Angeles, CA 90041, USA
23
Yuan H, Atta C, Tornabene L, Li C. Assexon: Assembling Exon Using Gene Capture Data. Evol Bioinform Online 2019; 15:1176934319874792. [PMID: 31523128 PMCID: PMC6732846 DOI: 10.1177/1176934319874792] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/15/2019] [Accepted: 08/19/2019] [Indexed: 12/30/2022]
Abstract
Exon capture across species has been one of the most broadly applied approaches to acquire multi-locus data in phylogenomic studies of non-model organisms. Methods for assembling loci from short-read sequences (e.g., Illumina platforms) that rely on mapping reads to a reference genome may not be suitable for studies comprising species across a wide phylogenetic spectrum; thus, de novo assembly methods are more generally applied. Current approaches for assembling targeted exons from short reads are not well optimized, as they cannot (1) assemble loci with low read depth, (2) handle large files efficiently, or (3) reliably address issues with paralogs. Thus, we present Assexon: a streamlined pipeline that de novo assembles targeted exons and their flanking sequences from raw reads. We tested our method using reads from Lepisosteus osseus (4.37 Gb) and Boleophthalmus pectinirostris (2.43 Gb), which were captured using baits designed from the genome sequences of Lepisosteus oculatus and Oreochromis niloticus, respectively. We compared the performance of Assexon to PHYLUCE and HybPiper, which are commonly used pipelines to assemble ultra-conserved element (UCE) and Hyb-seq data. A custom exon capture analysis pipeline (CP) developed by Yuan et al. was compared as well. Assexon accurately assembled 3400 to 3800 (20%-28%) more loci than PHYLUCE and 1900 to 2300 (8%-14%) more loci than HybPiper across different levels of phylogenetic divergence. Assexon ran at least twice as fast as PHYLUCE and HybPiper. The number of loci assembled using CP was comparable with Assexon in both tests, while Assexon ran at least 7 times faster than CP. In addition, some steps of CP require the user's interaction and are not fully automated, and this user time was not counted in our calculation. Both Assexon and CP retrieved no paralogs in the testing runs, but PHYLUCE and HybPiper did. In conclusion, Assexon is a tool for accurate and efficient assembly of large read sets from exon capture experiments. Furthermore, Assexon includes scripts to filter poorly aligned coding regions and flanking regions, calculate summary statistics of loci, and select loci with reliable phylogenetic signal. Assexon is available at https://github.com/yhadevol/Assexon.
Affiliation(s)
- Hao Yuan
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution (Shanghai Ocean University), Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Shanghai Ocean University), Ministry of Education, Shanghai, China
| | - Calder Atta
- School of Aquatic and Fishery Sciences and the Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, USA
| | - Luke Tornabene
- School of Aquatic and Fishery Sciences and the Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, USA
| | - Chenhong Li
- Shanghai Universities Key Laboratory of Marine Animal Taxonomy and Evolution (Shanghai Ocean University), Shanghai, China; Shanghai Collaborative Innovation for Aquatic Animal Genetics and Breeding, Shanghai, China; Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources (Shanghai Ocean University), Ministry of Education, Shanghai, China
|
24
|
Ericson PGP, Qu Y, Rasmussen PC, Blom MPK, Rheindt FE, Irestedt M. Genomic differentiation tracks earth-historic isolation in an Indo-Australasian archipelagic pitta (Pittidae; Aves) complex. BMC Evol Biol 2019; 19:151. [PMID: 31340765 PMCID: PMC6657069 DOI: 10.1186/s12862-019-1481-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/26/2018] [Accepted: 07/16/2019] [Indexed: 01/01/2023]
Abstract
Background: Allopatric speciation has played a particularly important role in archipelagic settings where populations evolve in isolation after colonizing different islands. The Indo-Australasian island realm is an unparalleled natural laboratory of biotic diversification. Here we explore how the level of earth-historic isolation has influenced genetic differentiation across the region by investigating phylogeographic patterns in the Pitta sordida species complex.
Results: We generated a de novo genome and compared population genomics of 29 individuals of Pitta sordida from the entire distributional range, and we reconstructed phylogenetic relationships using mitogenomes, a multi-nuclear gene dataset and single nucleotide polymorphisms (SNPs). We found deep divergence between an eastern and a western group of taxa across Indo-Australasia. Within both groups we identified major lineages that are geographically separated into the Philippines, Borneo, western Sundaland, and New Guinea, respectively. Although these lineages are genetically well-differentiated, suggesting long-term isolation, there are signatures of extensive gene flow within each lineage throughout the Pleistocene, despite the wide geographic range occupied by some of them. We found little evidence of hybridization or introgression among the studied taxa, with the exception of forsteni from Sulawesi. This individual, belonging to the eastern clade, is genetically admixed between the western and eastern clades. Geographically this makes sense, as Sulawesi is not far from Borneo, which houses a population of hooded pittas that belongs to the western clade.
Conclusions: We found that geological vicariance events cannot explain the current genetic differentiation in the Pitta sordida species complex. Instead, the glacial-interglacial cycles may have played a major role therein. During glacials the sea level could be up to 120 m lower than today, and land bridges formed within both the Sunda Shelf and the Sahul Shelf, permitting dispersal of floral and faunal elements. The geographic distribution of hooded pittas shows the importance of overwater, "stepping-stone" dispersals not only to deep-sea islands, but also from one shelf to the other. The most parsimonious hypothesis is an Asian ancestral home of the Pitta sordida species complex and a colonization from west to east, probably via Wallacea.
Affiliation(s)
- Per G P Ericson
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden
| | - Yanhua Qu
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden; Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Pamela C Rasmussen
- Department of Integrative Biology and MSU Museum, Michigan State University, East Lansing, 48824, MI, USA; Bird Group, The Natural History Museum, Akeman Street, Tring, HP23 6AP, UK
| | - Mozes P K Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden
| | - Frank E Rheindt
- Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 119077, Singapore
| | - Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, PO Box 50007, SE-104 05, Stockholm, Sweden
|
25
|
Comparative Phylogenomics, a Stepping Stone for Bird Biodiversity Studies. DIVERSITY-BASEL 2019. [DOI: 10.3390/d11070115] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 01/02/2023]
Abstract
Birds are a group with immense availability of genomic resources, and hundreds of forthcoming genomes at the doorstep. We review recent developments in whole-genome sequencing, phylogenomics, and comparative genomics of birds. Short-read-based genome assemblies are common, largely due to the efforts of the Bird 10K genome project (B10K). Chromosome-level assemblies are expected to increase due to improved long-read sequencing. The available genomic data have enabled the reconstruction of the bird tree of life with increasing confidence and resolution, but challenges remain in the early splits of Neoaves due to their explosive diversification after the Cretaceous-Paleogene (K-Pg) event. Continued genomic sampling of the bird tree of life will not just better reflect their evolutionary history but also shine new light onto the organization of phylogenetic signal and conflict across the genome. The comparatively simple architecture of avian genomes makes them a powerful system to study the molecular foundation of bird-specific traits. Birds are on the verge of becoming an extremely resourceful system to study biodiversity from the nucleotide up.
|
26
|
Sackton TB, Grayson P, Cloutier A, Hu Z, Liu JS, Wheeler NE, Gardner PP, Clarke JA, Baker AJ, Clamp M, Edwards SV. Convergent regulatory evolution and loss of flight in paleognathous birds. Science 2019; 364:74-78. [DOI: 10.1126/science.aat7244] [Citation(s) in RCA: 125] [Impact Index Per Article: 25.0] [Received: 03/28/2018] [Accepted: 02/27/2019] [Indexed: 01/05/2023]
Abstract
A core question in evolutionary biology is whether convergent phenotypic evolution is driven by convergent molecular changes in proteins or regulatory regions. We combined phylogenomic, developmental, and epigenomic analysis of 11 new genomes of paleognathous birds, including an extinct moa, to show that convergent evolution of regulatory regions, more so than protein-coding genes, is prevalent among developmental pathways associated with independent losses of flight. A Bayesian analysis of 284,001 conserved noncoding elements, 60,665 of which are corroborated as enhancers by open chromatin states during development, identified 2355 independent accelerations along lineages of flightless paleognaths, with functional consequences for driving gene expression in the developing forelimb. Our results suggest that the genomic landscape associated with morphological convergence in ratites has a substantial shared regulatory component.
|
27
|
Bravo GA, Antonelli A, Bacon CD, Bartoszek K, Blom MPK, Huynh S, Jones G, Knowles LL, Lamichhaney S, Marcussen T, Morlon H, Nakhleh LK, Oxelman B, Pfeil B, Schliep A, Wahlberg N, Werneck FP, Wiedenhoeft J, Willows-Munro S, Edwards SV. Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics. PeerJ 2019; 7:e6399. [PMID: 30783571 PMCID: PMC6378093 DOI: 10.7717/peerj.6399] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/05/2018] [Accepted: 01/07/2019] [Indexed: 12/23/2022]
Abstract
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Alexandre Antonelli
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
- Gothenburg Botanical Garden, Göteborg, Sweden
| | - Christine D. Bacon
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Krzysztof Bartoszek
- Department of Computer and Information Science, Linköping University, Linköping, Sweden
| | - Mozes P. K. Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Stella Huynh
- Institut de Biologie, Université de Neuchâtel, Neuchâtel, Switzerland
| | - Graham Jones
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - L. Lacey Knowles
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
| | - Sangeet Lamichhaney
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
| | - Thomas Marcussen
- Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
| | - Hélène Morlon
- Institut de Biologie, Ecole Normale Supérieure de Paris, Paris, France
| | - Luay K. Nakhleh
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Bengt Oxelman
- Gothenburg Global Biodiversity Centre, Göteborg, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Bernard Pfeil
- Department of Biological and Environmental Sciences, University of Gothenburg, Göteborg, Sweden
| | - Alexander Schliep
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
| | - Fernanda P. Werneck
- Coordenação de Biodiversidade, Programa de Coleções Científicas Biológicas, Instituto Nacional de Pesquisa da Amazônia, Manaus, AM, Brazil
| | - John Wiedenhoeft
- Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
- Department of Computer Science, Rutgers University, Piscataway, NJ, USA
| | - Sandi Willows-Munro
- School of Life Sciences, University of Kwazulu-Natal, Pietermaritzburg, South Africa
| | - Scott V. Edwards
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Gothenburg Centre for Advanced Studies in Science and Technology, Chalmers University of Technology and University of Gothenburg, Göteborg, Sweden
|
28
|
A simple strategy for recovering ultraconserved elements, exons, and introns from low coverage shotgun sequencing of museum specimens: Placement of the partridge genus Tropicoperdix within the Galliformes. Mol Phylogenet Evol 2018; 129:304-314. [DOI: 10.1016/j.ympev.2018.09.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/26/2018] [Revised: 07/23/2018] [Accepted: 09/06/2018] [Indexed: 11/19/2022]
|
29
|
Collins RA, Hrbek T. An In Silico Comparison of Protocols for Dated Phylogenomics. Syst Biol 2018; 67:633-650. [PMID: 29319797 DOI: 10.1093/sysbio/syx089] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/23/2015] [Accepted: 10/24/2017] [Indexed: 01/02/2023]
Abstract
In the age of genome-scale DNA sequencing, choice of molecular marker arguably remains an important decision in planning a phylogenetic study. Using published genomes from 23 primate species, we make a standardized comparison of four of the most frequently used protocols in phylogenomics, viz., targeted sequence-enrichment using ultraconserved element and exon-capture probes, and restriction-site-associated DNA sequencing (RADseq and ddRADseq). Here, we present a procedure to perform in silico extractions from genomes and create directly comparable data sets for each class of marker. We then compare these data sets in terms of both phylogenetic resolution and ability to consistently and precisely estimate clade ages using fossil-calibrated molecular-clock models. Furthermore, we were also able to directly compare these results to previously published data sets from Sanger-sequenced nuclear exons and mitochondrial genomes under the same analytical conditions. Our results show that for uncontroversial nodes all data classes performed equally well (with the exception of the mitochondrial genome data set and the smallest ddRADseq data set); that is, they recovered the same well-supported topology. However, for one difficult-to-resolve node comprising a rapid diversification, we report well-supported but conflicting topologies among the marker classes, consistent with the mismodeling of gene tree heterogeneity as demonstrated by species tree analyses of single nucleotide polymorphisms. Likewise, clade age estimates showed consistent discrepancies between data sets under strict and relaxed clock models; for recent nodes, clade ages estimated by nuclear exon data sets were younger than those of the UCE, RADseq and mitochondrial data, but vice versa for the deepest nodes in the primate phylogeny. This observation is explained by temporal differences in phylogenetic informativeness (PI), with the data sets with strong PI peaks toward the present underestimating the deepest node ages. Finally, we conclude by emphasizing that while huge numbers of loci are probably not required for uncontroversial phylogenetic questions (for which practical considerations such as ease of data generation, sharing, and aggregating therefore become increasingly important), accurately modeling heterogeneous data remains as relevant as ever for the more recalcitrant problems.
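To make the idea of an in silico extraction concrete, the following is a minimal Python sketch of a simulated single-enzyme restriction digest with size selection, loosely in the spirit of RADseq. It is not the procedure used in the paper: the function names (`digest`, `size_select`), the cut offset convention, and the toy sequence are hypothetical. The EcoRI recognition site GAATTC (cut after the first G) is used only as a familiar example.

```python
def digest(sequence, site, cut_offset):
    """Cut a sequence at every occurrence of a recognition site.

    cut_offset is the position within the site where the cut falls
    (e.g. 1 for G^AATTC). Returns the resulting fragments in order.
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length falls within the inclusive size window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    # Hypothetical toy "genome" containing two EcoRI sites.
    genome = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
    fragments = digest(genome, "GAATTC", cut_offset=1)
    print(fragments)
    print(size_select(fragments, min_len=5, max_len=8))
```

A realistic pipeline would also need to handle both strands, double digests for ddRADseq, and read-length truncation, none of which this sketch attempts.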
Affiliation(s)
- Rupert A Collins
- Laboratório de Evolução e Genética Animal, Department of Genetics, Federal University of Amazonas, Av. Rodrigo Otavio Ramos, 3000, Manaus, AM, 69077-000, Brazil; School of Biological Sciences, Life Sciences Building, University of Bristol, 24 Tyndall Ave, Bristol BS8 1TH, UK
| | - Tomas Hrbek
- Laboratório de Evolução e Genética Animal, Department of Genetics, Federal University of Amazonas, Av. Rodrigo Otavio Ramos, 3000, Manaus, AM, 69077-000, Brazil; Department of Biology, 4102 LSB Brigham Young University, Provo, UT, 84602, USA
|
30
|
Comprehensive phylogeny of ray-finned fishes (Actinopterygii) based on transcriptomic and genomic data. Proc Natl Acad Sci U S A 2018; 115:6249-6254. [PMID: 29760103 DOI: 10.1073/pnas.1719358115] [Citation(s) in RCA: 306] [Impact Index Per Article: 51.0] [Indexed: 01/01/2023]
Abstract
Our understanding of phylogenetic relationships among bony fishes has been transformed by analysis of a small number of genes, but uncertainty remains around critical nodes. Genome-scale inferences so far have sampled a limited number of taxa and genes. Here we leveraged 144 genomes and 159 transcriptomes to investigate fish evolution with an unparalleled scale of data: >0.5 Mb from 1,105 orthologous exon sequences from 303 species, representing 66 out of 72 ray-finned fish orders. We apply phylogenetic tests designed to trace the effect of whole-genome duplication events on gene trees and find paralogy-free loci using a bioinformatics approach. Genome-wide data support the structure of the fish phylogeny, and hypothesis-testing procedures appropriate for phylogenomic datasets using explicit gene genealogy interrogation settle some long-standing uncertainties, such as the branching order at the base of the teleosts and among early euteleosts, and the sister lineage to the acanthomorph and percomorph radiations. Comprehensive fossil calibrations date the origin of all major fish lineages before the end of the Cretaceous.
|
31
|
Polychronopoulos D, King JWD, Nash AJ, Tan G, Lenhard B. Conserved non-coding elements: developmental gene regulation meets genome organization. Nucleic Acids Res 2018; 45:12611-12624. [PMID: 29121339 PMCID: PMC5728398 DOI: 10.1093/nar/gkx1074] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 07/26/2017] [Accepted: 10/24/2017] [Indexed: 12/20/2022]
Abstract
Comparative genomics has revealed a class of non-protein-coding genomic sequences that display an extraordinary degree of conservation between two or more organisms, regularly exceeding that found within protein-coding exons. These elements, collectively referred to as conserved non-coding elements (CNEs), are non-randomly distributed across chromosomes and tend to cluster in the vicinity of genes with regulatory roles in multicellular development and differentiation. CNEs are organized into functional ensembles called genomic regulatory blocks: dense clusters of elements that collectively coordinate the expression of shared target genes, and whose span in many cases coincides with topologically associated domains. CNEs display sequence properties that set them apart from other sequences under constraint, and have recently been proposed as useful markers for the reconstruction of the evolutionary history of organisms. Disruption of several of these elements is known to contribute to diseases linked with development, and cancer. The emergence, evolutionary dynamics and functions of CNEs still remain poorly understood, and new approaches are required to enable comprehensive CNE identification and characterization. Here, we review current knowledge and identify challenges that need to be tackled to resolve the impasse in understanding extreme non-coding conservation.
Affiliation(s)
- Dimitris Polychronopoulos
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK
| | - James W D King
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK
| | - Alexander J Nash
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK
| | - Ge Tan
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK
| | - Boris Lenhard
- Computational Regulatory Genomics Group, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK; Sars International Centre for Marine Molecular Biology, University of Bergen, Thormøhlensgate 55, N-5008 Bergen, Norway
|
32
|
The evolution of the macrophage-specific enhancer (Fms intronic regulatory element) within the CSF1R locus of vertebrates. Sci Rep 2017; 7:17115. [PMID: 29215000 PMCID: PMC5719456 DOI: 10.1038/s41598-017-15999-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/21/2017] [Accepted: 11/03/2017] [Indexed: 01/07/2023]
Abstract
The Csf1r locus encodes the receptor for macrophage colony-stimulating factor, which controls the proliferation, differentiation and survival of macrophages. The 300 bp Fms intronic regulatory element (FIRE), within the second intron of Csf1r, is necessary and sufficient to direct macrophage-specific transcription. We have analysed the conservation and divergence of the FIRE DNA sequence in vertebrates. FIRE is present in the same location in the Csf1r locus in reptile, avian and mammalian genomes. Nearest neighbor analysis based upon this element alone largely recapitulates phylogenies inferred from much larger genomic sequence datasets. One core element, containing binding sites for AP1 family and the macrophage-specific transcription factor, PU.1, is conserved from lizards to humans. Around this element, the FIRE sequence is conserved within clades with the most conserved elements containing motifs for known myeloid-expressed transcription factors. Conversely, there is little alignment between clades outside the AP1/PU.1 element. The analysis favours a hybrid between “enhanceosome” and “smorgasbord” models of enhancer function, in which elements cooperate to bind components of the available transcription factor milieu.
|
33
|
Genomic evidence reveals a radiation of placental mammals uninterrupted by the KPg boundary. Proc Natl Acad Sci U S A 2017; 114:E7282-E7290. [PMID: 28808022 DOI: 10.1073/pnas.1616744114] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Indexed: 11/18/2022]
Abstract
The timing of the diversification of placental mammals relative to the Cretaceous-Paleogene (KPg) boundary mass extinction remains highly controversial. In particular, there have been seemingly irreconcilable differences in the dating of the early placental radiation not only between fossil-based and molecular datasets but also among molecular datasets. To help resolve this discrepancy, we performed genome-scale analyses using 4,388 loci from 90 taxa, including representatives of all extant placental orders and transcriptome data from flying lemurs (Dermoptera) and pangolins (Pholidota). Depending on the gene partitioning scheme, molecular clock model, and genic deviation from molecular clock assumptions, extensive sensitivity analyses recovered widely varying diversification scenarios for placental mammals from a given gene set, ranging from a deep Cretaceous origin and diversification to a scenario spanning the KPg boundary, suggesting that the use of suboptimal molecular clock markers and methodologies is a major cause of controversies regarding placental diversification timing. We demonstrate that reconciliation between molecular and paleontological estimates of placental divergence times can be achieved using the appropriate clock model and gene partitioning scheme while accounting for the degree to which individual genes violate molecular clock assumptions. A birth-death-shift analysis suggests that placental mammals underwent a continuous radiation across the KPg boundary without apparent interruption by the mass extinction, paralleling a genus-level radiation of multituberculates and ecomorphological diversification of both multituberculates and therians. These findings suggest that the KPg catastrophe evidently played a limited role in placental diversification, which, instead, was likely a delayed response to the slightly earlier radiation of angiosperms.
|