1
Steenwyk JL, King N. The promise and pitfalls of synteny in phylogenomics. PLoS Biol 2024; 22:e3002632. [PMID: 38768403 PMCID: PMC11105162 DOI: 10.1371/journal.pbio.3002632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Reconstructing the tree of life remains a central goal in biology. Early methods, which relied on small numbers of morphological or genetic characters, often yielded conflicting evolutionary histories, undermining confidence in the results. Investigations based on phylogenomics, which use hundreds to thousands of loci for phylogenetic inquiry, have provided a clearer picture of life's history, but certain branches remain problematic. To resolve difficult nodes on the tree of life, 2 recent studies tested the utility of synteny, the conserved collinearity of orthologous genetic loci in 2 or more organisms, for phylogenetics. Synteny exhibits compelling phylogenomic potential while also raising new challenges. This Essay identifies and discusses specific opportunities and challenges that bear on the value of synteny data and other rare genomic changes for phylogenomic studies. Synteny-based analyses of highly contiguous genome assemblies mark a new chapter in the phylogenomic era and the quest to reconstruct the tree of life.
Affiliation(s)
- Jacob L. Steenwyk
  - Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Nicole King
  - Howard Hughes Medical Institute, University of California, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
2
Dylus D, Altenhoff A, Majidian S, Sedlazeck FJ, Dessimoz C. Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree. Nat Biotechnol 2024; 42:139-147. [PMID: 37081138 PMCID: PMC10791578 DOI: 10.1038/s41587-023-01753-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/18/2022] [Accepted: 03/16/2023] [Indexed: 04/22/2023]
Abstract
Current methods for inference of phylogenetic trees require running complex pipelines at substantial computational and labor costs, with additional constraints in sequencing coverage, assembly and annotation quality, especially for large datasets. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes and bypasses traditional steps in phylogeny inference, such as genome assembly, annotation and all-versus-all sequence comparisons, while retaining accuracy. In a benchmark encompassing a broad variety of datasets, Read2Tree is 10-100 times faster than assembly-based approaches and in most cases more accurate, the exception being when sequencing coverage is high and reference species very distant. Here, to illustrate the broad applicability of the tool, we reconstruct a yeast tree of life of 435 species spanning 590 million years of evolution. We also apply Read2Tree to >10,000 Coronaviridae samples, accurately classifying highly diverse animal samples and near-identical severe acute respiratory syndrome coronavirus 2 sequences on a single tree. The speed, accuracy and versatility of Read2Tree enable comparative genomics at scale.
Affiliation(s)
- David Dylus
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - F. Hoffmann-La Roche Ltd, Immunology, Infectious Disease, and Ophthalmology (I2O), Roche Pharmaceutical Research and Early Development (pRED), Basel, Switzerland
- Adrian Altenhoff
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computer Science, ETH, Zurich, Switzerland
- Sina Majidian
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computer Science, University College London, London, UK
  - Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
3
McLay TGB, Fowler RM, Fahey PS, Murphy DJ, Udovicic F, Cantrill DJ, Bayly MJ. Phylogenomics reveals extreme gene tree discordance in a lineage of dominant trees: hybridization, introgression, and incomplete lineage sorting blur deep evolutionary relationships despite clear species groupings in Eucalyptus subgenus Eudesmia. Mol Phylogenet Evol 2023; 187:107869. [PMID: 37423562 DOI: 10.1016/j.ympev.2023.107869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 06/29/2023] [Accepted: 06/30/2023] [Indexed: 07/11/2023]
Abstract
Eucalypts are a large and ecologically important group of plants on the Australian continent, and understanding their evolution is important in understanding evolution of the unique Australian flora. Previous phylogenies using plastome DNA, nuclear-ribosomal DNA, or random genome-wide SNPs, have been confounded by limited genetic sampling or by idiosyncratic biological features of the eucalypts, including widespread plastome introgression. Here we present phylogenetic analyses of Eucalyptus subgenus Eudesmia (22 species from western, northern, central and eastern Australia), in the first study to apply a target-capture sequencing approach using custom, eucalypt-specific baits (of 568 genes) to a lineage of Eucalyptus. Multiple accessions of all species were included, and target-capture data were supplemented by separate analyses of plastome genes (average of 63 genes per sample). Analyses revealed a complex evolutionary history likely shaped by incomplete lineage sorting and hybridization. Gene tree discordance generally increased with phylogenetic depth. Species, or groups of species, toward the tips of the tree are mostly supported, and three major clades are identified, but the branching order of these clades cannot be confirmed with confidence. Multiple approaches to filtering the nuclear dataset, by removing genes or samples, could not reduce gene tree conflict or resolve these relationships. Despite inherent complexities in eucalypt evolution, the custom bait kit devised for this research will be a powerful tool for investigating the evolutionary history of eucalypts more broadly.
Affiliation(s)
- Todd G B McLay
  - Royal Botanic Gardens Victoria, Melbourne 3004, Vic, Australia; School of BioSciences, The University of Melbourne, Parkville 3010, Vic, Australia
- Rachael M Fowler
  - School of BioSciences, The University of Melbourne, Parkville 3010, Vic, Australia
- Patrick S Fahey
  - Research Centre for Ecosystem Resilience, The Royal Botanic Garden Sydney, Sydney 2000, NSW, Australia; Qld Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia 4072, Qld, Australia
- Daniel J Murphy
  - Royal Botanic Gardens Victoria, Melbourne 3004, Vic, Australia; School of BioSciences, The University of Melbourne, Parkville 3010, Vic, Australia
- Frank Udovicic
  - Royal Botanic Gardens Victoria, Melbourne 3004, Vic, Australia
- David J Cantrill
  - Royal Botanic Gardens Victoria, Melbourne 3004, Vic, Australia; School of BioSciences, The University of Melbourne, Parkville 3010, Vic, Australia
- Michael J Bayly
  - School of BioSciences, The University of Melbourne, Parkville 3010, Vic, Australia
4
Dylus D, Altenhoff A, Majidian S, Sedlazeck FJ, Dessimoz C. Read2Tree: scalable and accurate phylogenetic trees from raw reads. bioRxiv 2022:2022.04.18.488678. [PMID: 36561179 PMCID: PMC9774205 DOI: 10.1101/2022.04.18.488678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes. In a benchmark encompassing a broad variety of datasets, our assembly-free approach was 10-100x faster than conventional approaches, and in most cases more accurate, the exception being when sequencing coverage was high and reference species very distant. To illustrate the broad applicability of the tool, we reconstructed a yeast tree of life of 435 species spanning 590 million years of evolution. Applied to Coronaviridae samples, Read2Tree accurately classified highly diverse animal samples and near-identical SARS-CoV-2 sequences on a single tree, thereby exhibiting remarkable breadth and depth. The speed, accuracy, and versatility of Read2Tree enable comparative genomics at scale.
Affiliation(s)
- David Dylus
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Present address: F. Hoffmann-La Roche Ltd, Immunology, Infectious Disease, and Ophthalmology (I2O), Roche Pharmaceutical Research and Early Development (pRED), Basel, 4070, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Adrian Altenhoff
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computer Science, ETH, 8092 Zurich, Switzerland
- Sina Majidian
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computer Science, University College London, London WC1E 6BT, UK
  - Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London WC1E, UK
5
Saha P, Sarkar D. Structural and information-theoretic complexity measures of brain networks: Evolutionary aspects and implications. Biosystems 2022; 218:104711. [DOI: 10.1016/j.biosystems.2022.104711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 03/21/2022] [Accepted: 05/23/2022] [Indexed: 11/16/2022]
6
McLay TGB, Birch JL, Gunn BF, Ning W, Tate JA, Nauheimer L, Joyce EM, Simpson L, Schmidt-Lebuhn AN, Baker WJ, Forest F, Jackson CJ. New targets acquired: Improving locus recovery from the Angiosperms353 probe set. Applications in Plant Sciences 2021; 9:APS311420. [PMID: 34336399 PMCID: PMC8312740 DOI: 10.1002/aps3.11420] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 10/06/2020] [Accepted: 03/15/2021] [Indexed: 05/10/2023]
Abstract
PREMISE: Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6-18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery.
METHODS: We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a 'mega353' target file, with each locus represented by 17-373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets.
RESULTS: Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%.
DISCUSSION: Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.
Affiliation(s)
- Todd G. B. McLay
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia
  - School of Biosciences, University of Melbourne, Melbourne, Australia
  - Centre for Australian National Biodiversity Research, CSIRO, Canberra, Australia
- Joanne L. Birch
  - School of Biosciences, University of Melbourne, Melbourne, Australia
- Bee F. Gunn
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia
  - School of Biosciences, University of Melbourne, Melbourne, Australia
- Weixuan Ning
  - School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Jennifer A. Tate
  - School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Lars Nauheimer
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Elizabeth M. Joyce
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Lalita Simpson
  - James Cook University, Cairns, Australia
  - Australian Tropical Herbarium, James Cook University, Cairns, Australia
- Félix Forest
  - Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom
- Chris J. Jackson
  - National Herbarium of Victoria, Royal Botanic Gardens Victoria, Melbourne, Australia