1
Harris BJ, Sheridan PO, Davín AA, Gubry-Rangin C, Szöllősi GJ, Williams TA. Rooting Species Trees Using Gene Tree-Species Tree Reconciliation. Methods Mol Biol 2022; 2569:189-211. [PMID: 36083449 DOI: 10.1007/978-1-0716-2691-7_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Interpreting phylogenetic trees requires a root, which provides the direction of evolution and polarizes ancestor-descendant relationships. But inferring the root using genetic data is difficult, particularly in cases where the closest available outgroup is only distantly related, which are common for microbes. In this chapter, we present a workflow for estimating rooted species trees and the evolutionary history of the gene families that evolve within them using probabilistic gene tree-species tree reconciliation. We illustrate the pipeline using a small dataset of prokaryotic genomes, for which the example scripts can be run using modest computer resources. We describe the rooting method used in this work in the context of other rooting strategies and discuss some of the limitations and opportunities presented by probabilistic gene tree-species tree reconciliation methods.
Affiliation(s)
- Brogan J Harris
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Paul O Sheridan
  - School of Biological Sciences, University of Bristol, Bristol, UK
  - School of Biological Sciences, University of Aberdeen, Aberdeen, UK
- Adrián A Davín
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- Gergely J Szöllősi
  - Dept. of Biological Physics, Eötvös Loránd University, Budapest, Hungary
  - MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, Budapest, Hungary
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, UK
2
Davín AA, Schrempf D, Williams TA, Hugenholtz P, Szöllősi GJ. Relative Time Inference Using Lateral Gene Transfers. Methods Mol Biol 2022; 2569:75-94. [PMID: 36083444 DOI: 10.1007/978-1-0716-2691-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Many organisms are able to incorporate exogenous DNA into their genomes. This process, called lateral gene transfer (LGT), has the potential to benefit the recipient organism by providing useful coding sequences, such as antibiotic resistance genes or enzymes that expand the organism's metabolic niche. For evolutionary biologists, LGTs have often been considered a nuisance because they complicate the reconstruction of the underlying species tree that many analyses aim to recover. However, LGT events between distinct organisms harbor information on the relative divergence time of the donor and recipient lineages. As a result, transfers provide a novel and as yet mostly unexplored source of information to determine the order of divergence of clades, with the potential for absolute dating if linked to the fossil record.
Affiliation(s)
- Adrián A Davín
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
- Dominik Schrempf
  - Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
- Tom A Williams
  - School of Biological Sciences, University of Bristol, Bristol, UK
- Philip Hugenholtz
  - Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
- Gergely J Szöllősi
  - Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary
  - MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary
  - Institute of Evolution, Centre for Ecological Research, Budapest, Hungary
3
Mao Y, Hou S, Shi J, Economo EP. TREEasy: An automated workflow to infer gene trees, species trees, and phylogenetic networks from multilocus data. Mol Ecol Resour 2020; 20. [PMID: 32073732 DOI: 10.1111/1755-0998.13149] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/10/2019] [Revised: 01/27/2020] [Accepted: 02/10/2020] [Indexed: 11/30/2022]
Abstract
Multilocus genomic data sets can be used to infer a rich set of information about the evolutionary history of a lineage, including gene trees, species trees, and phylogenetic networks. However, user-friendly tools to run such integrated analyses are lacking, and workflows often require tedious reformatting and handling time to shepherd data through a series of individual programs. Here, we present TREEasy, a tool written in Python that performs automated sequence alignment (with MAFFT), gene tree inference (with IQ-Tree), species tree inference from concatenated data (with IQ-Tree and RAxML-NG), species tree inference from gene trees (with ASTRAL, MP-EST, and STELLS2), and phylogenetic network inference (with SNaQ and PhyloNet). The tool requires only FASTA files and nine parameters as inputs. It can be run from the command line or through a graphical user interface (GUI). As examples, we reproduced a recent analysis of staghorn coral evolution, and performed a new analysis on the evolution of the "WGD clade" of yeast. The latter revealed novel patterns that were not identified by previous analyses. TREEasy represents a reliable and simple tool to accelerate research in systematic biology (https://github.com/MaoYafei/TREEasy).
Affiliation(s)
- Yafei Mao
  - Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Japan
- Siqing Hou
  - Cognitive Neurorobotics Research Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Japan
- Junfeng Shi
  - Shanghai Key Laboratory of Stomatology, Shanghai Research Institute of Stomatology, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Evan P Economo
  - Biodiversity and Biocomplexity Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Japan
4
Pérez-Losada M, Arenas M, Galán JC, Bracho MA, Hillung J, García-González N, González-Candelas F. High-throughput sequencing (HTS) for the analysis of viral populations. INFECTION GENETICS AND EVOLUTION 2020; 80:104208. [PMID: 32001386 DOI: 10.1016/j.meegid.2020.104208] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/30/2019] [Revised: 01/21/2020] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The development of High-Throughput Sequencing (HTS) technologies is having a major impact on the genomic analysis of viral populations. Current HTS platforms can capture nucleic acid variation across millions of genes for both selected amplicons and full viral genomes. HTS has already facilitated the discovery of new viruses, hinted at new taxonomic classifications and provided a deeper and broader understanding of their diversity, population and genetic structure. Hence, HTS has already replaced standard Sanger sequencing in basic and applied research fields, but the next step is its implementation as a routine technology for the analysis of viruses in clinical settings. The most likely application of this implementation will be the analysis of viral genomics, because the huge population sizes, high mutation rates and very fast replacement of viral populations have demonstrated the limited information obtained with Sanger technology. In this review, we describe new technologies and provide guidelines for the high-throughput sequencing and genetic and evolutionary analyses of viral populations and metaviromes, including software applications. With the development of new HTS technologies, new and refurbished molecular and bioinformatic tools are also constantly being developed to process and integrate HTS data. These allow assembling viral genomes and inferring viral population diversity and dynamics. Finally, we also present several applications of these approaches to the analysis of viral clinical samples, including transmission clusters and outbreak characterization.
Affiliation(s)
- Marcos Pérez-Losada
  - Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão 4485-661, Portugal
- Miguel Arenas
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain; Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain
- Juan Carlos Galán
  - Microbiology Service, Hospital Ramón y Cajal, Madrid, Spain; CIBER in Epidemiology and Public Health, Spain
- Mª Alma Bracho
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain
- Julia Hillung
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain
- Neris García-González
  - Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain
- Fernando González-Candelas
  - CIBER in Epidemiology and Public Health, Spain; Joint Research Unit "Infection and Public Health" FISABIO-University of Valencia, Valencia, Spain; Institute for Integrative Systems Biology (I2SysBio), CSIC-University of Valencia, Valencia, Spain
5
Jones KE, Fér T, Schmickl RE, Dikow RB, Funk VA, Herrando‐Moraira S, Johnston PR, Kilian N, Siniscalchi CM, Susanna A, Slovák M, Thapa R, Watson LE, Mandel JR. An empirical assessment of a single family-wide hybrid capture locus set at multiple evolutionary timescales in Asteraceae. APPLICATIONS IN PLANT SCIENCES 2019; 7:e11295. [PMID: 31667023 PMCID: PMC6814182 DOI: 10.1002/aps3.11295] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/27/2019] [Accepted: 09/05/2019] [Indexed: 05/23/2023]
Abstract
PREMISE: Hybrid capture with high-throughput sequencing (Hyb-Seq) is a powerful tool for evolutionary studies. The applicability of an Asteraceae family-specific Hyb-Seq probe set and the outcomes of different phylogenetic analyses are investigated here. METHODS: Hyb-Seq data from 112 Asteraceae samples were organized into groups at different taxonomic levels (tribe, genus, and species). For each group, data sets of non-paralogous loci were built and proportions of parsimony informative characters estimated. The impacts of analyzing alternative data sets, removing long branches, and type of analysis on tree resolution and inferred topologies were investigated in tribe Cichorieae. RESULTS: Alignments of the Asteraceae family-wide Hyb-Seq locus set were parsimony informative at all taxonomic levels. Levels of resolution and topologies inferred at shallower nodes differed depending on the locus data set and the type of analysis, and were affected by the presence of long branches. DISCUSSION: The approach used to build a Hyb-Seq locus data set influenced resolution and topologies inferred in phylogenetic analyses. Removal of long branches improved the reliability of topological inferences in maximum likelihood analyses. The Asteraceae Hyb-Seq probe set is applicable at multiple taxonomic depths, which demonstrates that probe sets do not necessarily need to be lineage-specific.
Affiliation(s)
- Katy E. Jones
  - Botanischer Garten und Botanisches Museum Berlin, Freie Universität Berlin, Königin‐Luise‐Str. 6–8, 14195 Berlin, Germany
- Tomáš Fér
  - Department of Botany, Faculty of Science, Charles University, Benátská 2, CZ 12800 Prague, Czech Republic
- Roswitha E. Schmickl
  - Department of Botany, Faculty of Science, Charles University, Benátská 2, CZ 12800 Prague, Czech Republic
  - Institute of Botany, The Czech Academy of Sciences, Zámek 1, CZ 25243 Průhonice, Czech Republic
- Rebecca B. Dikow
  - Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, D.C. 20013‐7012, USA
- Vicki A. Funk
  - Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, D.C. 20013‐7012, USA
- Paul R. Johnston
  - Freie Universität Berlin, Evolutionary Biology, Berlin, Germany
  - Berlin Center for Genomics in Biodiversity Research, Berlin, Germany
  - Leibniz‐Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany
- Norbert Kilian
  - Botanischer Garten und Botanisches Museum Berlin, Freie Universität Berlin, Königin‐Luise‐Str. 6–8, 14195 Berlin, Germany
- Carolina M. Siniscalchi
  - Department of Biological Sciences, University of Memphis, Memphis, Tennessee 38152, USA
  - Center for Biodiversity, University of Memphis, Memphis, Tennessee 38152, USA
- Alfonso Susanna
  - Botanic Institute of Barcelona (IBB‐CSIC‐ICUB), Pg. del Migdia s.n., ES 08038 Barcelona, Spain
- Marek Slovák
  - Department of Botany, Faculty of Science, Charles University, Benátská 2, CZ 12800 Prague, Czech Republic
  - Plant Science and Biodiversity Centre, Slovak Academy of Sciences, SK‐84523 Bratislava, Slovakia
- Ramhari Thapa
  - Department of Biological Sciences, University of Memphis, Memphis, Tennessee 38152, USA
  - Center for Biodiversity, University of Memphis, Memphis, Tennessee 38152, USA
- Linda E. Watson
  - Department of Plant Biology, Ecology, and Evolution, Oklahoma State University, Stillwater, Oklahoma 74078, USA
- Jennifer R. Mandel
  - Department of Biological Sciences, University of Memphis, Memphis, Tennessee 38152, USA
  - Center for Biodiversity, University of Memphis, Memphis, Tennessee 38152, USA
6
Pérez-Losada M, Arenas M, Castro-Nallar E. Microbial sequence typing in the genomic era. INFECTION, GENETICS AND EVOLUTION : JOURNAL OF MOLECULAR EPIDEMIOLOGY AND EVOLUTIONARY GENETICS IN INFECTIOUS DISEASES 2018; 63:346-359. [PMID: 28943406 PMCID: PMC5908768 DOI: 10.1016/j.meegid.2017.09.022] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/22/2017] [Revised: 09/18/2017] [Accepted: 09/19/2017] [Indexed: 12/18/2022]
Abstract
Next-generation sequencing (NGS), also known as high-throughput sequencing, is changing the field of microbial genomics research. NGS allows for a more comprehensive analysis of the diversity, structure and composition of microbial genes and genomes compared to the traditional automated Sanger capillary sequencing at a lower cost. NGS strategies have expanded the versatility of standard and widely used typing approaches based on nucleotide variation in several hundred DNA sequences and a few gene fragments (MLST, MLVA, rMLST and cgMLST). NGS can now accommodate variation in thousands or millions of sequences from selected amplicons to full genomes (WGS, NGMLST and HiMLST). To extract signals from high-dimensional NGS data and make valid statistical inferences, novel analytic and statistical techniques are needed. In this review, we describe standard and new approaches for microbial sequence typing at gene and genome levels and guidelines for subsequent analysis, including methods and computational frameworks. We also present several applications of these approaches to some disciplines, namely genotyping, phylogenetics and molecular epidemiology.
Affiliation(s)
- Marcos Pérez-Losada
  - Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Ashburn, VA 20147, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão 4485-661, Portugal; Children's National Medical Center, Washington, DC 20010, USA
- Miguel Arenas
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
- Eduardo Castro-Nallar
  - Universidad Andrés Bello, Center for Bioinformatics and Integrative Biology, Facultad de Ciencias Biológicas, Santiago 8370146, Chile