1
Winn JC, Maduna SN, Bester-van der Merwe AE. A comprehensive phylogenomic study unveils evolutionary patterns and challenges in the mitochondrial genomes of Carcharhiniformes: A focus on Triakidae. Genomics 2024; 116:110771. [PMID: 38147941 DOI: 10.1016/j.ygeno.2023.110771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 12/14/2023] [Accepted: 12/22/2023] [Indexed: 12/28/2023]
Abstract
The complex evolutionary patterns in the mitochondrial genome (mitogenome) of the most species-rich shark order, the Carcharhiniformes (ground sharks), have led to challenges in the phylogenomic reconstruction of the families and genera belonging to the order, particularly the family Triakidae (houndsharks). The current state of Triakidae phylogeny remains controversial, with arguments for both monophyly and paraphyly within the family. We hypothesize that this variability is triggered by the selection of different a priori partitioning schemes to account for site and gene heterogeneity within the mitogenome. Here we used an extensive statistical framework to select the a priori partitioning scheme for inference of the mitochondrial phylogenomic relationships within Carcharhiniformes, tested site-heterogeneous CAT + GTR + G4 models and incorporated the multi-species coalescent model (MSCM) into our analyses to account for the influence of gene tree discordance on species tree inference. We included five newly assembled houndshark mitogenomes to increase resolution of Triakidae. During the assembly procedure, we uncovered a 714-bp duplication in the mitogenome of Galeorhinus galeus. Phylogenetic reconstruction confirmed monophyly within Triakidae and the existence of two distinct clades of the expanded Mustelus genus. The latter alludes to potential evolutionary reversal of reproductive mode from placental to aplacental, suggesting that reproductive mode has played a role in the trajectory of adaptive divergence. These new sequences have the potential to contribute to population genomic investigations, species phylogeography delineation, environmental DNA metabarcoding databases and, ultimately, improved conservation strategies for these ecologically and economically important species.
Affiliation(s)
- Jessica C Winn
  - Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, Western Cape 7602, South Africa
- Simo N Maduna
  - Department of Ecosystems in the Barents Region, Svanhovd Research Station, Norwegian Institute of Bioeconomy Research, 9925 Svanvik, Norway
- Aletta E Bester-van der Merwe
  - Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, Western Cape 7602, South Africa
2
Sha N, Li Z, Sun Q, Han Y, Tian L, Wu Y, Li X, Shi Y, Zhang J, Peng J, Wang L, Dang Z, Liang C. Elucidation of the evolutionary history of Stipa in China using comparative transcriptomic analysis. Front Plant Sci 2023; 14:1275018. [PMID: 38148860 PMCID: PMC10751131 DOI: 10.3389/fpls.2023.1275018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 11/08/2023] [Indexed: 12/28/2023]
Abstract
Phylogenetic analysis provides crucial insights into the evolutionary relationships and diversification patterns within specific taxonomic groups. In this study, we aimed to identify the phylogenetic relationships and explore the evolutionary history of Stipa using transcriptomic data. Samples of 12 Stipa species were collected from the Qinghai-Tibet Plateau and Mongolian Plateau, where they are widely distributed, and transcriptome sequencing was performed using their fresh spikelet tissues. Using bidirectional best BLAST analysis, we identified two sets of one-to-one orthologous genes shared between Brachypodium distachyon and the 12 Stipa species (9397 and 2300 sequences, respectively), as well as 62 single-copy orthologous genes. Concatenation methods were used to construct a robust phylogenetic tree for Stipa, and molecular dating was used to estimate divergence times. Our results indicated that Stipa originated during the Pliocene. In approximately 0.8 million years, it diverged into two major clades each consisting of native species from the Mongolian Plateau and the Qinghai-Tibet Plateau, respectively. The evolution of Stipa was closely associated with the development of northern grassland landscapes. Important external factors such as global cooling during the Pleistocene, changes in monsoonal circulation, and tectonic movements contributed to the diversification of Stipa. This study provided a highly supported phylogenetic framework for understanding the evolution of the Stipa genus in China and insights into its diversification patterns.
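The "bidirectional best BLAST" step mentioned in this abstract is commonly implemented as a reciprocal-best-hit filter. The following is only a minimal sketch of that general idea, not the authors' pipeline: it assumes each BLAST result has already been reduced to (query, subject, bitscore) tuples, and all gene names and scores are made up for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples
    (an assumed, simplified stand-in for a parsed BLAST table).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical hit tables: species A searched against B, and B against A.
a_vs_b = [("A1", "B1", 250.0), ("A1", "B2", 90.0),
          ("A2", "B2", 180.0), ("A3", "B2", 60.0)]
b_vs_a = [("B1", "A1", 245.0), ("B2", "A2", 175.0), ("B2", "A3", 65.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, A3's best hit is B2, but B2's best hit is A2, so A3 is not reported as a one-to-one ortholog candidate.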
Affiliation(s)
- Na Sha
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Zhiyong Li
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Qiang Sun
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Ying Han
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Li Tian
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Yantao Wu
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Xing Li
  - Institute of Landscape and Environment, Inner Mongolia Academy of Forestry Science, Hohhot, Inner Mongolia, China
- Yabo Shi
  - School of Resources and Environment, Baotou Teachers’ College, Baotou, Inner Mongolia, China
- Jinghui Zhang
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Jiangtao Peng
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Lixin Wang
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Zhenhua Dang
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
- Cunzhu Liang
  - Key Laboratory of Ecology and Resource Use of the Mongolian Plateau, Ministry of Education of China, Collaborative Innovation Center for Grassland Ecological Security, School of Ecology and Environment, Inner Mongolia University, Hohhot, Inner Mongolia, China
3
Xian Q, Wang S, Liu Y, Kan S, Zhang W. Structure-Based GC Investigation Sheds New Light on ITS2 Evolution in Corydalis Species. Int J Mol Sci 2023; 24:7716. [PMID: 37175423 PMCID: PMC10178233 DOI: 10.3390/ijms24097716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 04/20/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023] [Open access]
Abstract
Guanine and cytosine (GC) content is a fundamental component of genetic diversity and essential for phylogenetic analyses. However, the GC content of the ribosomal internal transcribed spacer 2 (ITS2) remains unknown, despite the fact that ITS2 is a widely used phylogenetic marker. Here, the ITS2 was high-throughput sequenced from 29 Corydalis species, and their GC contents were comparatively investigated in the context of ITS2's characteristic secondary structure and concerted evolution. Our results showed that the GC contents of ITS2 were 131% higher than those of their adjacent 5.8S regions, suggesting that ITS2 underwent GC-biased evolution. These GCs were distributed in a heterogeneous manner in the ITS2 secondary structure, with the paired regions being 130% larger than the unpaired regions, indicating that GC is chosen for thermodynamic stability. In addition, species with homogeneous ITS2 sequences were always GC-rich, supporting GC-biased gene conversion (gBGC), which occurred with ITS2's concerted evolution. The RNA substitution model inferred also showed a GC preference among base pair transformations, which again supports gBGC. Overall, structurally based GC investigation reveals that ITS2 evolves under structural stability and gBGC selection, significantly increasing its GC content.
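The comparison of GC content between paired and unpaired regions of a secondary structure can be illustrated with a short calculation. This is a sketch under stated assumptions, not the authors' method: it assumes the structure is given in dot-bracket notation (parentheses for paired bases, dots for unpaired ones), and the sequence and structure below are invented.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_by_pairing(seq, structure):
    """Return (GC of paired bases, GC of unpaired bases).

    `structure` is assumed to be dot-bracket notation of the same length
    as `seq`: '(' and ')' mark paired positions, '.' marks unpaired ones.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    paired = "".join(b for b, s in zip(seq, structure) if s in "()")
    unpaired = "".join(b for b, s in zip(seq, structure) if s == ".")
    return gc_fraction(paired), gc_fraction(unpaired)


# Hypothetical 12-nt hairpin: a 4-bp stem enclosing a 4-nt loop.
seq = "GGCAAGAUUGCC"
structure = "((((....))))"

paired_gc, unpaired_gc = gc_by_pairing(seq, structure)
print(paired_gc, unpaired_gc)
```

For this toy hairpin the stem is GC-rich relative to the loop, which is the kind of contrast the abstract describes for ITS2.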
Affiliation(s)
- Qing Xian
  - Marine College, Shandong University, Weihai 264209, China
- Suyin Wang
  - Marine College, Shandong University, Weihai 264209, China
- Yanyan Liu
  - College of Plant Protection, Henan Agricultural University, Zhengzhou 450002, China
- Shenglong Kan
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Wei Zhang
  - Marine College, Shandong University, Weihai 264209, China
4
Cornuault J, Sanmartín I. A road map for phylogenetic models of species trees. Mol Phylogenet Evol 2022; 173:107483. [DOI: 10.1016/j.ympev.2022.107483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2021] [Revised: 03/09/2022] [Accepted: 04/05/2022] [Indexed: 10/18/2022]
5
Ningsih R, Arfa Yanti N. Molecular Identification of Phytophthora sp. From Indonesian Cocoa Using Phylogenetic Analysis. Pak J Biol Sci 2022; 25:245-253. [PMID: 35234015 DOI: 10.3923/pjbs.2022.245.253] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Background and Objective: Diseases caused by Phytophthora species cause widespread damage worldwide and are troubling cocoa farmers in Indonesia. The specific species causing disease in an area can be ascertained by characterizing its rDNA fragments. This study aimed to identify Phytophthora sp. samples from cocoa plantations in Southeast Sulawesi, Indonesia, based on phylogenetic analysis of rDNA fragments. Materials and Methods: The rDNA fragments of Phytophthora sp. were amplified by PCR (polymerase chain reaction) with the Phytophthora-specific primers Phy-F and Phy-R, which amplify the ITS1, 5.8S rRNA and ITS2 regions. The rDNA fragments were then sequenced and analyzed with BLAST (Basic Local Alignment Search Tool), provided by NCBI (National Center for Biotechnology Information) at www.ncbi.nlm.nih.gov/blast, to locally align the DNA sequences against GenBank data; MEGA 7.0.26 was used to construct the phylogenetic tree. Results: DNA sequencing showed that the 786 bp rDNA fragment consisted of the complete sequences of ITS1 (210 bp), 5.8S rRNA (162 bp) and ITS2 (414 bp). Phylogenetic tree analysis using the maximum likelihood method with 1000 bootstrap replications showed that the rDNA of the Phytophthora sp. isolates and 29 comparator isolates formed 2 large groups. Phytophthora sp. formed a subgroup with Phytophthora palmivora with a bootstrap value of 99%. Conclusion: The Phytophthora spreading in cocoa plantations in Southeast Sulawesi, Indonesia, belongs to the same group as Phytophthora palmivora.
6
Chen D, Hosner PA, Dittmann DL, O'Neill JP, Birks SM, Braun EL, Kimball RT. Divergence time estimation of Galliformes based on the best gene shopping scheme of ultraconserved elements. BMC Ecol Evol 2021; 21:209. [PMID: 34809586 PMCID: PMC8609756 DOI: 10.1186/s12862-021-01935-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/06/2021] [Accepted: 11/08/2021] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND Divergence time estimation is fundamental to understanding many aspects of the evolution of organisms, such as character evolution, diversification, and biogeography. With the development of sequence technology, improved analytical methods, and knowledge of fossils for calibration, it is possible to obtain robust molecular dating results. However, while phylogenomic datasets show great promise in phylogenetic estimation, the best ways to leverage the large amounts of data for divergence time estimation have not been well explored. A potential solution is to focus on a subset of data for divergence time estimation, which can significantly reduce the computational burden and avoid problems with data heterogeneity that may bias results. RESULTS In this study, we obtained thousands of ultraconserved elements (UCEs) from 130 extant galliform taxa, including representatives of all genera, to determine the divergence times throughout galliform history. We tested the effects of different "gene shopping" schemes on divergence time estimation using a carefully vetted, previously validated set of fossils. We found that commonly used clock-like schemes may not be suitable for UCE dating (or other data types) where some loci have little information. We suggest that partitioning (e.g., with PartitionFinder) and selecting tree-like partitions may be good strategies for choosing a subset of data for divergence time estimation from UCEs. Our galliform time tree is largely consistent with other molecular clock studies of mitochondrial and nuclear loci. With our increased taxon sampling, a well-resolved topology, carefully vetted fossil calibrations, and suitable molecular dating methods, we obtained a high-quality galliform time tree.
CONCLUSIONS We provide a robust galliform backbone time tree that can be combined with more fossil records to further facilitate our understanding of the evolution of Galliformes and can be used as a resource for comparative and biogeographic studies in this group.
Affiliation(s)
- De Chen
  - MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China
  - Department of Biology, University of Florida, Gainesville, FL, USA
- Peter A Hosner
  - Department of Biology, University of Florida, Gainesville, FL, USA
  - Natural History Museum of Denmark and Center for Global Mountain Biodiversity, University of Copenhagen, Copenhagen, Denmark
- Donna L Dittmann
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
- John P O'Neill
  - Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
- Sharon M Birks
  - Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, USA
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL, USA
7
Pyrka I, Stefanaki A, Vlachonasios KE. DNA Barcoding of St. John's wort (Hypericum spp.) Growing Wild in North-Eastern Greece. Planta Med 2021; 87:528-537. [PMID: 33618378 DOI: 10.1055/a-1379-3249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Plants of the genus Hypericum, commonly known as "St. John's wort" ("spathohorto" or "valsamo" in Greek), have been used since antiquity for their therapeutic properties. Wild-harvested Hypericum plants are still popular today in herbal medicines, commercially exploited due to their bioactive compounds, hypericin and hyperforin, which have antidepressant, antimicrobial and antiviral activity. Species identification of commercial products is therefore important and DNA barcoding, a molecular method that uses small sequences of organisms' genome as barcodes, can be useful in this direction. In this study, we collected plants of the genus Hypericum that grow wild in North-Eastern Greece and explored the efficiency of matK, and trnH-psbA regions as DNA barcodes for their identification. We focused on 5 taxa, namely H. aucheri, H. montbretii, H. olympicum, H. perforatum subsp. perforatum, and H. thasium, the latter a rare Balkan endemic species collected for the first time from mainland Greece. matK (using the genus-specific primers designed herein), trnH-psbA, and their combination were effectively used for the identification of the 5 Hypericum taxa and the discrimination of different H. perforatum subsp. perforatum populations. These barcodes were also able to discriminate Greek populations of H. perforatum, H. aucheri, H. montbretii, and H. olympicum from populations of the same species growing in other countries.
Affiliation(s)
- Ioanna Pyrka
  - Postgraduate Studies Program, Conservation of Biodiversity and Sustainable Exploitation of Native Plants (BNP), School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Konstantinos E Vlachonasios
  - Postgraduate Studies Program, Conservation of Biodiversity and Sustainable Exploitation of Native Plants (BNP), School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
  - Department of Botany, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
  - Natural Products Research Centre of Excellence (NatPro-AUTh), Center of Interdisciplinary Research and Innovation of Aristotle University of Thessaloniki (CIRI-AUTh), Thessaloniki, Greece
8
EntroPhylo: An entropy-based tool to select phylogenetic informative regions and primer design. Infect Genet Evol 2021; 92:104857. [PMID: 33838312 DOI: 10.1016/j.meegid.2021.104857] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/24/2020] [Revised: 03/18/2021] [Accepted: 04/05/2021] [Indexed: 11/24/2022]
Abstract
We present a novel entropy-based computational tool that selects phylogenetic informative genomic regions associated with degenerate primer design. This tool identifies proper phylogenetic markers and proposes suitable degenerate primers to amplify and sequence them. The algorithm calculates the entropy value per site, and the selected region is used for primer design. In order to evaluate the tool, sequences of bovine papillomavirus L1 gene were obtained. Once the molecular region was selected, the primers were designed by the software and used in a PCR reaction for viral detection. Three positive samples were tested with four different concentrations, and it was possible to detect the virus in all samples. The results show the applicability of a tool that can select informative regions for phylogenetic analysis and design primers to amplify and sequence these regions, becoming relevant for several studies focusing on pathogen detection, as well as phylogenetic and genetics studies of populations.
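The per-site entropy calculation described in this abstract can be illustrated with Shannon entropy over alignment columns. The sketch below is not the EntroPhylo implementation: the window-selection rule (highest summed entropy over a fixed width) is an assumption made for illustration, and the alignment is a made-up toy.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(seqs):
    """Entropy per site for equal-length aligned sequences."""
    return [site_entropy(col) for col in zip(*seqs)]


def most_variable_window(entropies, width):
    """Return (start, end) of the window with the highest summed entropy.

    This selection rule is an illustrative assumption, not the tool's
    documented criterion. Ties resolve to the leftmost window.
    """
    scores = [sum(entropies[i:i + width])
              for i in range(len(entropies) - width + 1)]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, start + width


# Hypothetical 4-sequence alignment; only columns 2 and 3 vary.
seqs = ["ACGTAC", "ACGAAC", "ACCTAC", "ACTGAC"]
entropies = alignment_entropy(seqs)
window = most_variable_window(entropies, width=2)
print(entropies, window)
```

Conserved columns score 0 bits, so in a primer-design setting they would be natural candidates for flanking primer sites around the variable window.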
9
Abstract
The phylogeny of Neoaves, the largest clade of extant birds, has remained unclear despite intense study. The difficulty associated with resolving the early branches in Neoaves is likely driven by the rapid radiation of this group. However, conflicts among studies may be exacerbated by the data type analyzed. For example, analyses of coding exons typically yield trees that place Strisores (nightjars and allies) sister to the remaining Neoaves, while analyses of non-coding data typically yield trees where Mirandornites (flamingos and grebes) is the sister of the remaining Neoaves. Our understanding of data type effects is hampered by the fact that previous analyses have used different taxa, loci, and types of non-coding data. Herein, we provide strong corroboration of the data type effects hypothesis for Neoaves by comparing trees based on coding and non-coding data derived from the same taxa and gene regions. A simple analytical method known to minimize biases due to base composition (coding nucleotides as purines and pyrimidines) resulted in coding exon data with increased congruence to the non-coding topology using concatenated analyses. These results improve our understanding of the resolution of neoavian phylogeny and point to a challenge—data type effects—that is likely to be an important factor in phylogenetic analyses of birds (and many other taxonomic groups). Using our results, we provide a summary phylogeny that identifies well-corroborated relationships and highlights specific nodes where future efforts should focus.
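The recoding mentioned in this abstract, in which nucleotides are coded as purines and pyrimidines, is usually called RY coding: A and G become R, and C and T become Y. A minimal sketch follows; the handling of gaps and ambiguity codes (left unchanged) is an assumption, and the sequences are invented.

```python
# Purines (A, G) map to R; pyrimidines (C, T, and U for RNA) map to Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(seq):
    """Recode a nucleotide sequence as purines (R) and pyrimidines (Y).

    Gaps and ambiguity codes are left unchanged (an assumed convention).
    """
    return seq.upper().translate(RY_TABLE)


recoded = ry_code("ATGCCGTA-N")
print(recoded)

# Two hypothetical sequences that differ only by transitions
# (A<->G, C<->T) become identical after RY coding.
same_after_recoding = ry_code("ATGC") == ry_code("GCAT")
print(same_after_recoding)
```

Because transitions are erased, only transversions remain informative, which is why this recoding can reduce the influence of base-composition differences among taxa.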
10
Maldonado E, Antunes A. LMAP_S: Lightweight Multigene Alignment and Phylogeny eStimation. BMC Bioinformatics 2019; 20:739. [PMID: 31888452 PMCID: PMC6937843 DOI: 10.1186/s12859-019-3292-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/04/2019] [Accepted: 11/26/2019] [Indexed: 01/22/2023] [Open access]
Abstract
Background Recent advances in genome sequencing technologies and the cost drop in high-throughput sequencing continue to give rise to a deluge of data available for downstream analyses. Among others, evolutionary biologists often make use of genomic data to uncover phenotypic diversity and adaptive evolution in protein-coding genes. Therefore, multiple sequence alignments (MSA) and phylogenetic trees (PT) need to be estimated with optimal results. However, the preparation of an initial dataset of multiple sequence file(s) (MSF) and the steps involved can be challenging when considering an extensive amount of data. Thus, it becomes necessary to develop a tool that removes the potential source of error and automates the time-consuming steps of a typical workflow with high-throughput and optimal MSA and PT estimations. Results We introduce LMAP_S (Lightweight Multigene Alignment and Phylogeny eStimation), a user-friendly command-line and interactive package, designed to handle an improved alignment and phylogeny estimation workflow: MSF preparation, MSA estimation, outlier detection, refinement, consensus, phylogeny estimation, comparison and editing, among which file and directory organization, execution, and manipulation of information are automated, with minimal manual user intervention. LMAP_S was developed for the workstation multi-core environment and provides a unique advantage for processing multiple datasets. Our software proved to be efficient throughout the workflow, including the (unlimited) handling of more than 20 datasets. Conclusions We have developed a simple and versatile LMAP_S package enabling researchers to effectively estimate MSAs and PTs for multiple datasets in a high-throughput fashion. LMAP_S integrates more than 25 software tools, providing overall more than 65 algorithm choices distributed in five stages. At minimum, one FASTA file is required within a single input directory.
To our knowledge, no other software combines MSA and phylogeny estimation with as many alternatives and provides means to find optimal MSAs and phylogenies. Moreover, we used a case study comparing methodologies that highlighted the usefulness of our software. LMAP_S has been developed as an open-source package, allowing its integration into more complex open-source bioinformatics pipelines. The LMAP_S package is released under the GPLv3 license and is freely available at https://lmap-s.sourceforge.io/.
Affiliation(s)
- Emanuel Maldonado
  - CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
- Agostinho Antunes
  - CIIMAR/CIMAR - Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208, Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007, Porto, Portugal
11
Belinky F, Sela I, Rogozin IB, Koonin EV. Crossing fitness valleys via double substitutions within codons. BMC Biol 2019; 17:105. [PMID: 31842858 PMCID: PMC6916188 DOI: 10.1186/s12915-019-0727-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/26/2019] [Accepted: 11/20/2019] [Indexed: 02/07/2023] [Open access]
Abstract
BACKGROUND Single nucleotide substitutions in protein-coding genes can be divided into synonymous (S), with little fitness effect, and non-synonymous (N) ones that alter amino acids and thus generally have a greater effect. Most of the N substitutions are affected by purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases potentially could alleviate the deleterious effect of single substitutions, making them subject to positive selection. To elucidate the effects of selection on double substitutions in all codons, it is critical to differentiate selection from mutational biases. RESULTS We addressed the evolutionary regimes of within-codon double substitutions in 37 groups of closely related prokaryotic genomes from diverse phyla by comparing the fractions of double substitutions within codons to those of the equivalent double S substitutions in adjacent codons. Under the assumption that substitutions occur one at a time, all within-codon double substitutions can be represented as "ancestral-intermediate-final" sequences (where "intermediate" refers to the first single substitution and "final" refers to the second substitution) and can be partitioned into four classes: (1) SS, S intermediate-S final; (2) SN, S intermediate-N final; (3) NS, N intermediate-S final; and (4) NN, N intermediate-N final. We found that the selective pressure on the second substitution markedly differs among these classes of double substitutions. Analogous to single S (synonymous) substitutions, SS double substitutions evolve neutrally, whereas analogous to single N (non-synonymous) substitutions, SN double substitutions are subject to purifying selection. In contrast, NS show positive selection on the second step because the original amino acid is recovered. 
The NN double substitutions are heterogeneous and can be subject to either purifying or positive selection, or evolve neutrally, depending on the amino acid similarity between the final or intermediate and the ancestral states. CONCLUSIONS The results of the present, comprehensive analysis of the evolutionary landscape of within-codon double substitutions reaffirm the largely conservative regime of protein evolution. However, the second step of a double substitution can be subject to positive selection when the first step is deleterious. Such positive selection can result in frequent crossing of valleys on the fitness landscape.
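The four classes of within-codon double substitutions (SS, SN, NS, NN) can be made concrete with a small classifier. This is a sketch based on one reading of the abstract, not the authors' code: it assumes that both the intermediate and the final codon are compared with the ancestral amino acid (so NS means the original amino acid is recovered), and it uses the standard genetic code. The example codons are chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_double_substitution(ancestral, intermediate, final):
    """Classify an ancestral -> intermediate -> final codon path.

    Each letter is S if the codon encodes the ancestral amino acid and
    N otherwise (an interpretation of the abstract, stated as an assumption).
    """
    if (hamming(ancestral, intermediate) != 1
            or hamming(intermediate, final) != 1
            or hamming(ancestral, final) != 2):
        raise ValueError("expected two single-base steps at different positions")
    anc_aa = CODON_TABLE[ancestral]
    first = "S" if CODON_TABLE[intermediate] == anc_aa else "N"
    second = "S" if CODON_TABLE[final] == anc_aa else "N"
    return first + second


# Illustrative paths (codons chosen by hand, not taken from the study).
examples = {
    ("TTA", "CTA", "CTG"): None,  # Leu -> Leu -> Leu
    ("TTA", "CTA", "CCA"): None,  # Leu -> Leu -> Pro
    ("AGC", "TGC", "TCC"): None,  # Ser -> Cys -> Ser (serine recovered)
    ("AAA", "GAA", "GCA"): None,  # Lys -> Glu -> Ala
}
for path in examples:
    examples[path] = classify_double_substitution(*path)
    print(path, examples[path])
```

The serine example (AGC to TCC via TGC) shows the NS case the abstract highlights: the first step changes the amino acid and the second step restores it.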
Affiliation(s)
- Frida Belinky
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Itamar Sela
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Igor B Rogozin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
12
Zhang D, Gao F, Jakovlić I, Zou H, Zhang J, Li WX, Wang GT. PhyloSuite: An integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour 2019; 20:348-355. [DOI: 10.1111/1755-0998.13096] [Citation(s) in RCA: 825] [Impact Index Per Article: 137.5] [Received: 05/13/2019] [Revised: 09/12/2019] [Accepted: 09/24/2019] [Indexed: 01/12/2023]
Affiliation(s)
- Dong Zhang
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Fangluan Gao
  - Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Hong Zou
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Wen X. Li
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Gui T. Wang
  - Key Laboratory of Aquaculture Disease Control, Ministry of Agriculture, and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
13
DeBiasse MB, Ryan JF. Phylotocol: Promoting Transparency and Overcoming Bias in Phylogenetics. Syst Biol 2019; 68:672-678. [PMID: 30597106 PMCID: PMC6568013 DOI: 10.1093/sysbio/syy090] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/20/2018] [Revised: 12/20/2018] [Accepted: 12/20/2018] [Indexed: 11/22/2022] [Open access]
Abstract
The integrity of science requires that the process be based on sound experimental design and objective methodology. Strategies that increase reproducibility and transparency in science protect this integrity by reducing conscious and unconscious biases. Given the large number of analysis options and the constant development of new methodologies in phylogenetics, this field is one that would particularly benefit from more transparent research design. Herein, we introduce phylotocol (fi lō 'ta kôl), an a priori protocol-driven approach in which all analyses are planned and documented at the start of a project. The phylotocol template is simple and the implementation options are flexible to reduce administrative burdens and allow researchers to adapt it to their needs without restricting scientific creativity. While the primary goal of phylotocol is to increase transparency and accountability, it has a number of auxiliary benefits including improving study design and reproducibility, enhancing collaboration and education, and increasing the likelihood of project completion. Our goal with this Point of View article is to encourage a dialog about transparency in phylogenetics and the best strategies to bring transparent research practices to our field.
Affiliation(s)
- Melissa B DeBiasse
- Whitney Laboratory for Marine Bioscience, 9505 Ocean Shore Boulevard, St. Augustine, FL 32080, USA
- Department of Biology, University of Florida, 220 Bartram Hall, Gainesville, FL, 32611, USA
- Joseph F Ryan
- Whitney Laboratory for Marine Bioscience, 9505 Ocean Shore Boulevard, St. Augustine, FL 32080, USA
- Department of Biology, University of Florida, 220 Bartram Hall, Gainesville, FL, 32611, USA
14
Campbell MA, Sado T, Shinzato C, Koyanagi R, Okamoto M, Miya M. Multilocus phylogenetic analysis of the first molecular data from the rare and monotypic Amarsipidae places the family within the Pelagia and highlights limitations of existing data sets in resolving pelagian interrelationships. Mol Phylogenet Evol 2018. [DOI: 10.1016/j.ympev.2018.03.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/17/2022]
15
Blom MPK, Bragg JG, Potter S, Moritz C. Accounting for Uncertainty in Gene Tree Estimation: Summary-Coalescent Species Tree Inference in a Challenging Radiation of Australian Lizards. Syst Biol 2018; 66:352-366. [PMID: 28039387 DOI: 10.1093/sysbio/syw089] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 09/14/2015] [Accepted: 09/27/2016] [Indexed: 11/12/2022]
Abstract
Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should, therefore, examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic data sets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ${\sim}2000$ exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate. 
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic data sets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference. [Coalescence; concatenation; Cryptoblepharus; exon capture; gene tree; phylogenomics; species tree.].
Affiliation(s)
- Mozes P K Blom
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
- Jason G Bragg
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
- Sally Potter
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
- Craig Moritz
- Research School of Biology, Australian National University, Canberra ACT 0200, Australia
16
Kolařík M, Vohník M. When the ribosomal DNA does not tell the truth: The case of the taxonomic position of Kurtia argillacea, an ericoid mycorrhizal fungus residing among Hymenochaetales. Fungal Biol 2017; 122:1-18. [PMID: 29248111 DOI: 10.1016/j.funbio.2017.09.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 06/09/2017] [Revised: 09/13/2017] [Accepted: 09/27/2017] [Indexed: 11/19/2022]
Abstract
The nuclear ribosomal DNA (nuc-rDNA) is widely used for the identification and phylogenetic reconstruction of Agaricomycetes. However, nuc-rDNA-based phylogenies may sometimes be in conflict with phylogenetic relationships derived from protein coding genes. In this study, the taxonomic position of the basidiomycetous mycobiont that forms the recently discovered sheathed ericoid mycorrhiza was investigated, because its nuc-rDNA is highly dissimilar to any other available fungal sequences in terms of nucleotide composition and length, and its nuc-rDNA-based phylogeny is inconclusive and significantly disagrees with protein coding sequences and morphological data. In the present work, this mycobiont was identified as Kurtia argillacea (= Hyphoderma argillaceum) residing in the order Hymenochaetales (Basidiomycota). Bioinformatic screening of the Kurtia ribosomal DNA sequence indicates that it represents a gene with a non-standard substitution rate or nucleotide composition heterogeneity rather than a deep paralogue or a pseudogene. Such a phenomenon probably also occurs in other lineages of the Fungi and should be taken into consideration when nuc-rDNA (especially that with unusual nucleotide composition) is used as a sole marker for phylogenetic reconstructions. Kurtia argillacea so far represents the only confirmed non-sebacinoid ericoid mycorrhizal fungus in the Basidiomycota and its intriguing placement among mostly saprobic and parasitic Hymenochaetales begs further investigation of its eco-physiology.
Affiliation(s)
- Miroslav Kolařík
- Laboratory of Fungal Genetics and Metabolism, Institute of Microbiology, Czech Academy of Sciences (CAS), Vídeňská 1083, CZ-14220 Prague, Czech Republic.
- Martin Vohník
- Department of Mycorrhizal Symbioses, Institute of Botany CAS, CZ-252 43 Průhonice, Czech Republic; Department of Experimental Plant Biology, Faculty of Science, Charles University, Viničná 5, CZ-128 44 Prague, Czech Republic
17
Zaucha J, Heddle JG. Resurrecting the Dead (Molecules). Comput Struct Biotechnol J 2017; 15:351-358. [PMID: 28652896 PMCID: PMC5472138 DOI: 10.1016/j.csbj.2017.05.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/15/2017] [Revised: 05/11/2017] [Accepted: 05/21/2017] [Indexed: 12/15/2022]
Abstract
Biological molecules, like organisms themselves, are subject to genetic drift and may even become "extinct". Molecules that are no longer extant in living systems are of high interest for several reasons including insight into how existing life forms evolved and the possibility that they may have new and useful properties no longer available in currently functioning molecules. Predicting the sequence/structure of such molecules and synthesizing them so that their properties can be tested is the basis of "molecular resurrection" and may lead not only to a deeper understanding of evolution, but also to the production of artificial proteins with novel properties and even to insight into how life itself began.
Affiliation(s)
- Jan Zaucha
- Department of Computer Science, University of Bristol, Life Sciences Building, 24 Tyndall Avenue, Bristol BS8 1TQ, United Kingdom
- Jonathan G. Heddle
- Bionanoscience and Biochemistry Laboratory, Jagiellonian University, Malopolska Centre of Biotechnology, Gronstajowa 7A, 30-387 Kraków, Poland
18
Sha LN, Fan X, Li J, Liao JQ, Zeng J, Wang Y, Kang HY, Zhang HQ, Zheng YL, Zhou YH. Contrasting evolutionary patterns of multiple loci uncover new aspects in the genome origin and evolutionary history of Leymus (Triticeae; Poaceae). Mol Phylogenet Evol 2017; 114:175-188. [PMID: 28533082 DOI: 10.1016/j.ympev.2017.05.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/21/2016] [Revised: 05/14/2017] [Accepted: 05/16/2017] [Indexed: 12/28/2022]
Abstract
Leymus Hochst. (Triticeae: Poaceae), a group of allopolyploid species with the NsXm genomes, is a perennial genus with diversity in morphology, cytology, ecology, and distribution in the Triticeae. To investigate the genome origin and evolutionary history of Leymus, three unlinked low-copy nuclear genes (Acc1, Pgk1, and GBSSI) and three chloroplast regions (trnL-F, matK, and rbcL) of 32 Leymus species were analyzed with those of 36 diploid species representing 18 basic genomes in the Triticeae. The phylogenetic relationships were reconstructed using Bayesian inference, Maximum parsimony, and NeighborNet methods. A time-calibrated phylogeny was generated to estimate the evolutionary history of Leymus. The results suggest that reticulate evolution has occurred in Leymus species, with several distinct progenitors contributing to the Leymus. The molecular data in resolution of the Xm-genome lineage resulted in two apparently contradictory results, with one placing the Xm-genome lineage as closely related to the P/F genome and the other splitting the Xm-genome lineage as sister to the Ns-genome donor. Our results suggested that (1) the Ns genome of Leymus was donated by Psathyrostachys, and additional Ns-containing alleles may be introgressed into some Leymus polyploids by recurrent hybridization; (2) The phylogenetic incongruence regarding the resolution of the Xm-genome lineage suggested that the Xm genome of Leymus was closely related to the P genome of Agropyron; (3) Both Ns- and Xm-genome lineages served as the maternal donor during the speciation of Leymus species; (4) The Pseudoroegneria, Lophopyrum and Australopyrum genomes contributed to some Leymus species.
Affiliation(s)
- Li-Na Sha
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China; Key Laboratory of Crop Genetic Resources and Improvement, Ministry of Education, Sichuan Agricultural University, Yaan 625014, Sichuan, China
- Xing Fan
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China
- Jun Li
- Crop Research Institute, Sichuan Academy of Agricultural Science, Chengdu 610066, Sichuan, China
- Jin-Qiu Liao
- College of Life Science, Sichuan Agricultural University, Yaan 625014, Sichuan, China
- Jian Zeng
- College of Resources, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China
- Yi Wang
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China
- Hou-Yang Kang
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China
- Hai-Qin Zhang
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China
- You-Liang Zheng
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China; Key Laboratory of Crop Genetic Resources and Improvement, Ministry of Education, Sichuan Agricultural University, Yaan 625014, Sichuan, China
- Yong-Hong Zhou
- Triticeae Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China; Key Laboratory of Crop Genetic Resources and Improvement, Ministry of Education, Sichuan Agricultural University, Yaan 625014, Sichuan, China.
19
Lu L, Cox CJ, Mathews S, Wang W, Wen J, Chen Z. Optimal data partitioning, multispecies coalescent and Bayesian concordance analyses resolve early divergences of the grape family (Vitaceae). Cladistics 2017; 34:57-77. [DOI: 10.1111/cla.12191] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Accepted: 01/04/2017] [Indexed: 12/25/2022]
Affiliation(s)
- Limin Lu
- State Key Laboratory of Systematic and Evolutionary Botany Institute of Botany Chinese Academy of Sciences Beijing 100093 China
- Cymon J. Cox
- Centro de Ciências do Mar Universidade do Algarve Gambelas Faro 8005‐319 Portugal
- Sarah Mathews
- CSIRO National Research Collections Australian National Herbarium Canberra ACT 2601 Australia
- Wei Wang
- State Key Laboratory of Systematic and Evolutionary Botany Institute of Botany Chinese Academy of Sciences Beijing 100093 China
- Jun Wen
- Department of Botany National Museum of Natural History MRC166, Smithsonian Institution Washington DC 20013‐7012 USA
- Zhiduan Chen
- State Key Laboratory of Systematic and Evolutionary Botany Institute of Botany Chinese Academy of Sciences Beijing 100093 China
20
Baca SM, Toussaint EF, Miller KB, Short AE. Molecular phylogeny of the aquatic beetle family Noteridae (Coleoptera: Adephaga) with an emphasis on data partitioning strategies. Mol Phylogenet Evol 2017; 107:282-292. [DOI: 10.1016/j.ympev.2016.10.016] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 02/23/2016] [Revised: 10/11/2016] [Accepted: 10/22/2016] [Indexed: 01/06/2023]
21
Buenaventura E, Pape T. Multilocus and multiregional phylogeny reconstruction of the genus Sarcophaga (Diptera, Sarcophagidae). Mol Phylogenet Evol 2017; 107:619-629. [DOI: 10.1016/j.ympev.2016.12.028] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/16/2016] [Revised: 12/20/2016] [Accepted: 12/21/2016] [Indexed: 11/24/2022]
22
Abstract
Molecular evolution can reveal the relationship between sets of homologous sequences and the patterns of change that occur during their evolution. An important aspect of these studies is the inference of a phylogenetic tree, which explicitly describes evolutionary relationships between homologous sequences. This chapter provides an introduction to evolutionary trees and how to infer them from sequence data using some commonly used inferential methodology. It focuses on statistical methods for inferring trees and how to assess the confidence one should have in any resulting tree, with a particular emphasis on the underlying assumptions of the methods and how they might affect the tree estimate. There is also some discussion of the underlying algorithms used to perform tree search and recommendations regarding the performance of different algorithms. Finally, there are a few practical guidelines, including how to combine multiple software packages to improve inference, and a comparison between Bayesian and Maximum likelihood phylogenetics.
Affiliation(s)
- Simon Whelan
- Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden.
- David A Morrison
- Department of Organism Biology, Uppsala University, Uppsala, Sweden
23
Evolutionary switches between two serine codon sets are driven by selection. Proc Natl Acad Sci U S A 2016; 113:13109-13113. [PMID: 27799560 DOI: 10.1073/pnas.1615832113] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022]
Abstract
Serine is the only amino acid that is encoded by two disjoint codon sets so that a tandem substitution of two nucleotides is required to switch between the two sets. Previously published evidence suggests that, for the most evolutionarily conserved serines, the codon set switch occurs by simultaneous substitution of two nucleotides. Here we report a genome-wide reconstruction of the evolution of serine codons in triplets of closely related species from diverse prokaryotes and eukaryotes. The results indicate that the great majority of codon set switches proceed by two consecutive nucleotide substitutions, via a threonine or cysteine intermediate, and are driven by selection. These findings imply a strong pressure of purifying selection in protein evolution, which in the case of serine codon set switches occurs via an initial deleterious substitution quickly followed by a second, compensatory substitution. The result is frequent reversal of amino acid replacements and, at short evolutionary distances, pervasive homoplasy.
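The codon arithmetic behind this abstract can be checked directly. The following is a minimal sketch, assuming the standard nuclear genetic code; the variable and function names (`tcn_set`, `agy_set`, `intermediates`) are illustrative and do not come from the paper. It confirms that the two serine codon sets are at least two substitutions apart and that every single-substitution intermediate on a two-step path encodes threonine or cysteine.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


serine = [codon for codon, aa in CODON_TABLE.items() if aa == "S"]
tcn_set = sorted(c for c in serine if c.startswith("TC"))  # TCT, TCC, TCA, TCG
agy_set = sorted(c for c in serine if c.startswith("AG"))  # AGT, AGC

# Smallest number of substitutions separating the two sets.
min_distance = min(hamming(a, b) for a in tcn_set for b in agy_set)


def intermediates(a, b):
    """Codons exactly one substitution away from both a and b."""
    return [c for c in CODON_TABLE if hamming(a, c) == 1 and hamming(c, b) == 1]


# Amino acids encoded by the intermediates of all two-step switches.
intermediate_aas = {
    CODON_TABLE[c]
    for a in tcn_set
    for b in agy_set
    if hamming(a, b) == 2
    for c in intermediates(a, b)
}

print(min_distance, sorted(intermediate_aas))
```

Here `T` and `C` in the printed set are the one-letter amino acid codes for threonine and cysteine, not nucleotides.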
24
Bromberg R, Grishin NV, Otwinowski Z. Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer. PLoS Comput Biol 2016; 12:e1004985. [PMID: 27336403 PMCID: PMC4918981 DOI: 10.1371/journal.pcbi.1004985] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 10/22/2015] [Accepted: 05/10/2016] [Indexed: 01/20/2023]
Abstract
Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz. Due to their lack of distinct morphological features, bacteria and archaea were extremely difficult to classify until technology was developed to obtain their DNA sequences; these sequences could then be compared to estimate evolutionary relationships. 
Now, due to technological advances, there is a flood of available sequences from a wide variety of organisms. These advances have spurred the development of algorithms which can estimate evolutionary relationships using whole genomes, in contrast to the more traditional methods which used single genes earlier and now typically use groups of conserved genes. However, there are many challenges when attempting to infer evolutionary relationships, in particular horizontal gene transfer, where DNA is transferred from one organism to another, resulting in an organism’s genome containing DNA that does not reflect its evolution by descent. We developed a new whole-genome method for estimating evolutionary distances which identifies and corrects for horizontal transfer. We found that for SlopeTree and all other whole-genome methods we applied, horizontal transfer causes some evolutionary distances to be grossly underestimated, and that our correction corrects for this.
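The core idea named in the abstract, measuring how the number of exact substring matches decays with match length, can be illustrated with a toy example. This is a sketch under stated assumptions and not the SlopeTree implementation: the functions `shared_kmer_count` and `decay_slope`, the simulated sequences, and the choice of a log-linear least-squares fit are all hypothetical, and none of the corrections described in the abstract (horizontal transfer, composition, low complexity) are attempted.

```python
import math
import random


def shared_kmer_count(a, b, k):
    """Count positions in `a` whose length-k substring occurs anywhere in `b`."""
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return sum(a[i:i + k] in kmers_b for i in range(len(a) - k + 1))


def decay_slope(a, b, ks):
    """Least-squares slope of log(match count) against match length k."""
    counts = [(k, shared_kmer_count(a, b, k)) for k in ks]
    points = [(k, math.log(n)) for k, n in counts if n > 0]
    mean_k = sum(k for k, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((k - mean_k) * (y - mean_y) for k, y in points)
    den = sum((k - mean_k) ** 2 for k, _ in points)
    return num / den


# Simulated data (an assumption for illustration only).
rng = random.Random(0)
ancestor = "".join(rng.choice("ACGT") for _ in range(400))
# Toy divergence: replace every 8th base with a different base.
shift = {"A": "C", "C": "G", "G": "T", "T": "A"}
diverged = "".join(
    shift[c] if i % 8 == 0 else c for i, c in enumerate(ancestor)
)

ks = range(3, 7)
slope_same = decay_slope(ancestor, ancestor, ks)
slope_div = decay_slope(ancestor, diverged, ks)
print(slope_same, slope_div)
```

In this toy setting the diverged pair shows a much steeper (more negative) slope than the identical pair, which is the kind of signal a decay-based distance would build on.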
Affiliation(s)
- Raquel Bromberg
- Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, United States of America
- Nick V. Grishin
- Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, United States of America
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, United States of America
- Zbyszek Otwinowski
- Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, United States of America
25
Marinho MAT, Wolff M, Ramos-Pastrana Y, de Azeredo-Espin AML, Amorim DDS. The first phylogenetic study of Mesembrinellidae (Diptera: Oestroidea) based on molecular data: clades and congruence with morphological characters. Cladistics 2016; 33:134-152. [DOI: 10.1111/cla.12157] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Accepted: 01/27/2016] [Indexed: 11/28/2022]
Affiliation(s)
- Marco Antonio Tonus Marinho
- Laboratório de Morfologia e Evolução de Diptera; Departamento de Biologia; Faculdade de Filosofia, Ciências e Letras (FFCLRP); Universidade de São Paulo (USP); CEP 14040-901 Ribeirão Preto SP Brazil
- Marta Wolff
- Grupo de Entomología; Universidad de Antioquia; Calle 67 n° 53-108 Medellín Colombia
- Yardany Ramos-Pastrana
- Grupo de Entomología; Universidad de Antioquia; Calle 67 n° 53-108 Medellín Colombia
- Museo de Historia Natural; Centro de Investigaciones de la Biodiversidad Andino-Amazonica (INBIANAM); Grupo Fauna Silvestre; Universidad de la Amazonia; Carrera 11 n° 6-69 Florencia Caquetá Colombia
- Ana Maria Lima de Azeredo-Espin
- Laboratório Genética e Evolução Animal; Centro de Biologia Molecular e Engenharia Genética (CBMEG); Universidade Estadual de Campinas (UNICAMP); CEP 13083-875 Campinas SP Brazil
- Departamento de Genética, Evolução e Bioagentes (DGEB) Instituto de Biologia (IB); Universidade Estadual de Campinas; CEP 13083-970 Campinas SP Brazil
- Dalton de Souza Amorim
- Laboratório de Morfologia e Evolução de Diptera; Departamento de Biologia; Faculdade de Filosofia, Ciências e Letras (FFCLRP); Universidade de São Paulo (USP); CEP 14040-901 Ribeirão Preto SP Brazil
26
Zhang L, Wu W, Yan HF, Ge XJ. Phylotranscriptomic Analysis Based on Coalescence was Less Influenced by the Evolving Rates and the Number of Genes: A Case Study in Ericales. Evol Bioinform Online 2016; 11:81-91. [PMID: 26819541 PMCID: PMC4718149 DOI: 10.4137/ebo.s22448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/03/2015] [Revised: 09/24/2015] [Accepted: 09/28/2015] [Indexed: 12/19/2022]
Abstract
Advances in high-throughput sequencing have generated a vast amount of transcriptomic data that are being increasingly used in phylogenetic reconstruction. However, processing the vast datasets for a huge number of genes and even identifying optimal analytical methodology are challenging. Through de novo sequenced and retrieved data from public databases, we identified 221 orthologous protein-coding genes to reconstruct the phylogeny of Ericales, an order characterized by rapid ancient radiation. Seven species representing different families in Ericales were used as in-groups. Both concatenation and coalescence methods yielded the same well-supported topology as previous studies, with only two nodes conflicting with previously reported relationships. The results revealed that a partitioning strategy could improve the traditional concatenation methodology. Rapidly evolving genes negatively affected the concatenation analysis, while slowly evolving genes slightly affected the coalescence analysis. The coalescence methods usually accommodated rate heterogeneity better and required fewer genes to yield well-supported topologies than the concatenation methods with both real and simulated data.
Affiliation(s)
- Lu Zhang
- Key Laboratory of Plant Resource Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- University of Chinese Academy of Sciences, Beijing, China
- Wei Wu
- Key Laboratory of Plant Resource Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Hai-Fei Yan
- Key Laboratory of Plant Resource Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Xue-Jun Ge
- Key Laboratory of Plant Resource Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
27
Behura SK. Insect phylogenomics. Insect Mol Biol 2015; 24:403-11. [PMID: 25963452 PMCID: PMC4503476 DOI: 10.1111/imb.12174] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 10/16/2014] [Revised: 03/10/2015] [Accepted: 04/04/2015] [Indexed: 05/08/2023]
Abstract
Phylogenomics, the integration of phylogenetics with genome data, has emerged as a powerful approach to study the evolution and systematics of species. Recently, several studies employing phylogenomic tools have provided better insights into insect evolution. Next-generation sequencing methods are now increasingly used by entomologists to generate genomic and transcript sequences of various insect species and strains. These data provide opportunities for comparative genomics and large-scale multigene phylogenies of diverse lineages of insects. Phylogenomic investigations help us to better understand systematic and evolutionary relationships of insect species that play important roles as herbivores, predators, detritivores, pollinators and disease vectors. It is important that we critically assess the prospects and limitations of phylogenomic methods. In this review, I describe the current status, outline the major challenges and remark on potential future applications of phylogenomic tools in studying insect systematics and evolution.
Affiliation(s)
- S K Behura
- Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, USA
28
Gross A, Hosoya T, Zhao YJ, Baral HO. Hymenoscyphus linearis sp. nov: another close relative of the ash dieback pathogen H. fraxineus. Mycol Prog 2015. [DOI: 10.1007/s11557-015-1041-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 01/09/2023]
29
Blair C, Méndez de la Cruz FR, Law C, Murphy RW. Molecular phylogenetics and species delimitation of leaf-toed geckos (Phyllodactylidae: Phyllodactylus) throughout the Mexican tropical dry forest. Mol Phylogenet Evol 2015; 84:254-65. [DOI: 10.1016/j.ympev.2015.01.003] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 06/29/2014] [Revised: 01/07/2015] [Accepted: 01/09/2015] [Indexed: 10/24/2022]
30
Frandsen PB, Calcott B, Mayer C, Lanfear R. Automatic selection of partitioning schemes for phylogenetic analyses using iterative k-means clustering of site rates. BMC Evol Biol 2015; 15:13. [PMID: 25887041 PMCID: PMC4327964 DOI: 10.1186/s12862-015-0283-7] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 08/11/2014] [Accepted: 01/13/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND Model selection is a vital part of most phylogenetic analyses, and accounting for the heterogeneity in evolutionary patterns across sites is particularly important. Mixture models and partitioning are commonly used to account for this variation, and partitioning is the most popular approach. Most current partitioning methods require some a priori partitioning scheme to be defined, typically guided by known structural features of the sequences, such as gene boundaries or codon positions. Recent evidence suggests that these a priori boundaries often fail to adequately account for variation in rates and patterns of evolution among sites. Furthermore, new phylogenomic datasets such as those assembled from ultra-conserved elements lack obvious structural features on which to define a priori partitioning schemes. The upshot is that, for many phylogenetic datasets, partitioned models of molecular evolution may be inadequate, thus limiting the accuracy of downstream phylogenetic analyses. RESULTS We present a new algorithm that automatically selects a partitioning scheme via the iterative division of the alignment into subsets of similar sites based on their rates of evolution. We compare this method to existing approaches using a wide range of empirical datasets, and show that it consistently leads to large increases in the fit of partitioned models of molecular evolution when measured using AICc and BIC scores. In doing so, we demonstrate that some related approaches to solving this problem may have been associated with a small but important bias. CONCLUSIONS Our method provides an alternative to traditional approaches to partitioning, such as dividing alignments by gene and codon position. Because our method is data-driven, it can be used to estimate partitioned models for all types of alignments, including those that are not amenable to traditional approaches to partitioning.
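A single division step of the kind the abstract describes, splitting sites into subsets with similar rates, might look like the sketch below. It is a plain one-dimensional k-means with k = 2 and is not the authors' implementation: the function name `split_by_rate`, the initialisation at the minimum and maximum rate, and the example rates are assumptions. How the published method estimates site rates and decides whether to keep a split is not stated in the abstract and is not modelled here.

```python
def split_by_rate(rates, max_iter=100):
    """Split site indices into a slow and a fast subset with 1-D 2-means.

    `rates` holds one evolutionary rate per alignment site (hypothetical
    input; estimating the rates themselves is outside this sketch).
    """
    low, high = min(rates), max(rates)
    if low == high:
        # All sites share one rate, so there is nothing to split.
        return list(range(len(rates))), []
    slow, fast = [], []
    for _ in range(max_iter):
        # Assign each site to the nearer of the two cluster centres.
        slow = [i for i, r in enumerate(rates) if abs(r - low) <= abs(r - high)]
        fast = [i for i, r in enumerate(rates) if abs(r - low) > abs(r - high)]
        # Move each centre to the mean rate of its assigned sites.
        new_low = sum(rates[i] for i in slow) / len(slow)
        new_high = sum(rates[i] for i in fast) / len(fast)
        if (new_low, new_high) == (low, high):
            break  # assignments are stable
        low, high = new_low, new_high
    return slow, fast


# Hypothetical per-site rates: three slow sites and three fast sites.
site_rates = [0.1, 0.2, 0.15, 2.0, 2.2, 1.9]
slow_sites, fast_sites = split_by_rate(site_rates)
print(slow_sites, fast_sites)
```

Applying the same split again to each resulting subset would give the iterative division mentioned in the abstract, with a stopping rule chosen by the analyst.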
Affiliation(s)
- Paul B Frandsen
- Office of Research Information Services, Office of the CIO, Smithsonian Institution, Washington, D.C., USA; Department of Entomology, Rutgers University, New Brunswick, New Jersey, USA.
- Brett Calcott
- School of Life Sciences, Arizona State University, Tempe, AZ, USA.
- Christoph Mayer
- Zoologisches Forschungsmuseum Alexander Koenig (ZFMK)/Zentrum für Molekulare Biodiversitätsforschung (ZMB), Bonn, Germany.
- Robert Lanfear
- Ecology Evolution and Genetics, Research School of Biology, Australian National University, Canberra, ACT, Australia; National Evolutionary Synthesis Center, Durham, NC, USA; Department of Biological Sciences, Macquarie University, Sydney, Australia.
31
Abstract
Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice.
Affiliation(s)
- David Kainer
- Division of Evolution, Ecology and Genetics, Research School of Biology, The Australian National University, Canberra, ACT, Australia
| | - Robert Lanfear
- Division of Evolution, Ecology and Genetics, Research School of Biology, The Australian National University, Canberra, ACT, Australia; National Evolutionary Synthesis Center, Durham, NC; Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia
|
32
|
Wang YC, Wang JD, Chen CH, Chen YW, Li C. A novel BLAST-Based Relative Distance (BBRD) method can effectively group members of protein arginine methyltransferases and suggest their evolutionary relationship. Mol Phylogenet Evol 2015; 84:101-11. [PMID: 25576770 DOI: 10.1016/j.ympev.2014.12.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/08/2014] [Revised: 11/24/2014] [Accepted: 12/05/2014] [Indexed: 01/06/2023]
Abstract
We developed a novel BLAST-Based Relative Distance (BBRD) method by Pearson's correlation coefficient to avoid the problems of tedious multiple sequence alignment and complicated outgroup selection. We showed its application on reconstructing reliable phylogeny for nucleotide and protein sequences as exemplified by the fmr-1 gene and dihydrolipoamide dehydrogenase, respectively. We then used BBRD to resolve 124 protein arginine methyltransferases (PRMTs) that are homologues of nine mammalian PRMTs. The tree placed the uncharacterized PRMT9 with PRMT7 in the same clade, outside of all the Type I PRMTs including PRMT1 and its vertebrate paralogue PRMT8, PRMT3, PRMT6, PRMT2 and PRMT4. The PRMT7/9 branch then connects with the type II PRMT5. Some non-vertebrates contain different PRMTs without high sequence homology with the mammalian PRMTs. For example, in the case of Drosophila arginine methyltransferase (DART) and Trypanosoma brucei methyltransferases (TbPRMTs) in the analyses, the BBRD program grouped them with specific clades and thus suggested their evolutionary relationships. The BBRD method thus provided a great tool to construct a reliable tree for members of protein families through evolution.
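The abstract above states only that BBRD derives distances from BLAST results using Pearson's correlation coefficient; it does not give the exact procedure. The sketch below shows one plausible reading, under the assumption that each sequence is represented by a profile of BLAST scores against a common set of subjects and that the distance between two sequences is one minus the Pearson correlation of their profiles. The score values, sequence names, and the `1 - r` transform are hypothetical and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def correlation_distances(score_profiles):
    """Map each pair of names to 1 - r (an assumed distance transform)."""
    names = sorted(score_profiles)
    return {
        (a, b): 1.0 - pearson(score_profiles[a], score_profiles[b])
        for a in names
        for b in names
    }


# Hypothetical BLAST bit scores of each sequence against subjects seqA, seqB, seqC.
profiles = {
    "seqA": [500.0, 450.0, 120.0],
    "seqB": [450.0, 520.0, 110.0],
    "seqC": [120.0, 110.0, 480.0],
}

distances = correlation_distances(profiles)
print(distances[("seqA", "seqB")], distances[("seqA", "seqC")])
```

A distance matrix of this kind could then be passed to any distance-based tree-building method; sequences with similar score profiles end up close together without a multiple sequence alignment.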
Affiliation(s)
- Yi-Chun Wang
- Department of Biomedical Sciences, Chung Shan Medical University, No. 110, Sec. 1, Jianguo N. Rd., Taichung 40201, Taiwan; Department of Medical Research, Chung Shan Medical University Hospital, No. 110, Sec. 1, Jianguo N. Rd., Taichung 40201, Taiwan.
| | - Jing-Doo Wang
- Department of Computer Science and Information Engineering, Asia University, No. 500, Lioufeng Rd., Wufeng District, Taichung 41354, Taiwan; Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
| | - Chin-Han Chen
- Department of Biomedical Sciences, Chung Shan Medical University, No. 110, Sec. 1, Jianguo N. Rd., Taichung 40201, Taiwan
| | - Yi-Wen Chen
- Department of Life Science, Tunghai University, No. 1727, Sec. 4, Taiwan Boulevard, Xitun District, Taichung 40704, Taiwan
| | - Chuan Li
- Department of Biomedical Sciences, Chung Shan Medical University, No. 110, Sec. 1, Jianguo N. Rd., Taichung 40201, Taiwan; Department of Medical Research, Chung Shan Medical University Hospital, No. 110, Sec. 1, Jianguo N. Rd., Taichung 40201, Taiwan.
|
33
|
Aizawa M, Yoshimaru H, Takahashi M, Kawahara T, Sugita H, Saito H, Sabirov RN. Genetic structure of Sakhalin spruce (Picea glehnii) in northern Japan and adjacent regions revealed by nuclear microsatellites and mitochondrial gene sequences. J Plant Res 2015; 128:91-102. [PMID: 25421922 DOI: 10.1007/s10265-014-0682-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/10/2014] [Accepted: 09/09/2014] [Indexed: 06/04/2023]
Abstract
The genetic structure of Sakhalin spruce (Picea glehnii) was studied across the natural range of the species, including two small isolated populations in south Sakhalin and Hayachine, by using six microsatellite loci and maternally inherited mitochondrial gene sequences. We also analyzed P. jezoensis, a sympatric spruce in the range. Genetic diversity of P. glehnii was higher in central Hokkaido and the lowest in the Hayachine. Bayesian clustering and principal coordinate analysis by using the microsatellites indicated that the Hayachine was clearly distinct from other populations, implying that it had undergone strong genetic drift since the last glacial period. P. glehnii harbored four mitochondrial haplotypes, two of which were shared with P. jezoensis. One of the two was observed without geographical concentration, suggesting its derivation from ancestral polymorphism. Another was observed in south Sakhalin and in P. jezoensis across Sakhalin. The Bayesian clustering--by using four microsatellite loci, including P. jezoensis populations--indicated unambiguous species delimitation, but with possible admixture of P. jezoensis genes into P. glehnii in south Sakhalin, where P. glehnii is abundantly overwhelmed by P. jezoensis; this might explain the occurrence of introgression of the haplotype of P. jezoensis into P. glehnii.
Affiliation(s)
- Mineaki Aizawa
- Department of Forest Science, Faculty of Agriculture, Utsunomiya University, 350, Mine-machi, Utsunomiya, Tochigi, 321-8505, Japan.
|
34
|
|
35
|
Escudero M, Eaton DA, Hahn M, Hipp AL. Genotyping-by-sequencing as a tool to infer phylogeny and ancestral hybridization: A case study in Carex (Cyperaceae). Mol Phylogenet Evol 2014; 79:359-67. [DOI: 10.1016/j.ympev.2014.06.026] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 03/06/2014] [Revised: 06/19/2014] [Accepted: 06/30/2014] [Indexed: 11/27/2022]
|
36
|
Arbizu C, Ruess H, Senalik D, Simon PW, Spooner DM. Phylogenomics of the carrot genus (Daucus, Apiaceae). Am J Bot 2014; 101:1666-1685. [PMID: 25077508 DOI: 10.3732/ajb.1400106] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 06/03/2023]
Abstract
PREMISE OF THE STUDY We explored the utility of multiple nuclear orthologs for the taxonomic resolution of wild and cultivated carrot, Daucus species. METHODS We studied the phylogeny of 92 accessions of 13 species and two subspecies of Daucus and 15 accessions of related genera (107 accessions total) with DNA sequences of 94 nuclear orthologs. Reiterative analyses examined data of both alleles using ambiguity codes or a single allele with the highest coverage, trimmed vs. untrimmed homopolymers; pure exonic vs. pure intronic data; the use of all 94 markers vs. a reduced subset of markers; and analysis of a concatenated data set vs. a coalescent (species tree) approach. KEY RESULTS Our maximum parsimony and maximum likelihood trees were highly resolved, with 100% bootstrap support for most of the external and many of the internal clades. They resolved multiple accessions of many different species as monophyletic with strong support, but failed to support other species. The single allele analysis gave slightly better topological resolution; trimming homopolymers failed to increase taxonomic resolution; the exonic data had a smaller proportion of parsimony-informative characters. Similar results demonstrating the same dominant topology can be obtained with many fewer markers. A Bayesian concordance analysis provided an overall similar phylogeny, but the coalescent analysis provided drastic changes in topology to all the above. CONCLUSIONS Our research highlights some difficult species groups in Daucus and misidentifications in germplasm collections. It highlights a useful subset of markers and approaches for future studies of dominant topologies in Daucus.
Affiliation(s)
- Carlos Arbizu
- U. S. Department of Agriculture, Agricultural Research Service, Vegetable Crops Research Unit; and Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA
| | - Holly Ruess
- U. S. Department of Agriculture, Agricultural Research Service, Vegetable Crops Research Unit; and Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA
| | - Douglas Senalik
- U. S. Department of Agriculture, Agricultural Research Service, Vegetable Crops Research Unit; and Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA
| | - Philipp W Simon
- U. S. Department of Agriculture, Agricultural Research Service, Vegetable Crops Research Unit; and Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA
| | - David M Spooner
- U. S. Department of Agriculture, Agricultural Research Service, Vegetable Crops Research Unit; and Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA
|
37
|
Tang CQ, Obertegger U, Fontaneto D, Barraclough TG. Sexual species are separated by larger genetic gaps than asexual species in rotifers. Evolution 2014; 68:2901-16. [PMID: 24975991 PMCID: PMC4262011 DOI: 10.1111/evo.12483] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 01/20/2014] [Accepted: 06/13/2014] [Indexed: 12/17/2022]
Abstract
Why organisms diversify into discrete species instead of showing a continuum of genotypic and phenotypic forms is an important yet rarely studied question in speciation biology. Does species discreteness come from adaptation to fill discrete niches or from interspecific gaps generated by reproductive isolation? We investigate the importance of reproductive isolation by comparing genetic discreteness, in terms of intra- and interspecific variation, between facultatively sexual monogonont rotifers and obligately asexual bdelloid rotifers. We calculated the age (phylogenetic distance) and average pairwise genetic distance (raw distance) within and among evolutionarily significant units of diversity in six bdelloid clades and seven monogonont clades sampled for 4211 individuals in total. We find that monogonont species are more discrete than bdelloid species with respect to divergence between species but exhibit similar levels of intraspecific variation (species cohesiveness). This pattern arises because bdelloids have diversified into discrete genetic clusters at a faster net rate than monogononts. Although sampling biases or differences in ecology that are independent of sexuality might also affect these patterns, the results are consistent with the hypothesis that bdelloids diversified at a faster rate into less discrete species because their diversification does not depend on the evolution of reproductive isolation.
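The abstract above relies on average pairwise genetic distance (raw distance) within and among units of diversity. A minimal sketch of that calculation follows, assuming aligned sequences of equal length, the uncorrected p-distance as the raw distance, and gap positions (`-`) excluded pairwise. The function names and toy sequences are hypothetical, and the study's actual pipeline may differ.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_within(seqs):
    """Average pairwise distance among sequences of one group."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group1, group2):
    """Average pairwise distance between sequences of two groups."""
    pairs = list(product(group1, group2))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment split into two groups.
group_a = ["AAAA", "AAAT", "AATT"]
group_b = ["TTTT", "TTTA"]

print(mean_within(group_a), mean_between(group_a, group_b))
```

Comparing the between-group average with the within-group averages gives a simple measure of how discrete two clusters are.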
Affiliation(s)
- Cuong Q Tang
- Department of Life Sciences, Imperial College London, Ascot, Berkshire, SL5 7PY, United Kingdom.
|
38
|
Utility of indels for species-level identification of a biologically complex plant group: a study with intergenic spacer in Citrus. Mol Biol Rep 2014; 41:7217-22. [PMID: 25048292 DOI: 10.1007/s11033-014-3606-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 11/22/2013] [Accepted: 07/10/2014] [Indexed: 11/27/2022]
Abstract
The Consortium for the Barcode of Life plant working group proposed using defined portions of the plastid genes rbcL and matK, either singly or in combination, as the standard DNA barcode for plants. However, DNA barcode-based identification of biologically complex plant groups is always a challenging task because of natural hybridization. Here, we examined the use of indel polymorphism in trnH-psbA and trnL-trnF sequences for rapid species identification of citrus. DNA was isolated from young leaves of selected citrus species, and the matK gene (~800 bp) and trnH-psbA spacer (~450 bp) of chloroplast DNA were amplified for species-level identification. The sequences within the group taxa of Citrus were aligned using the ClustalX program. A few obvious misalignments were corrected manually using the similarity criterion. We identified a 54 bp inverted repeat or palindrome sequence (27-80 regions) and 6 multi-residue indel coding regions. Large inverted repeats in cpDNA provided authentication at the higher taxonomic levels. These diagnostic indel markers from trnH-psbA were successful in identifying different species (5 out of 7) within the studied Citrus, except Citrus limon and Citrus medica. These two closely related species are distinguished by the 6 bp deletion in trnL-trnF. This study demonstrated that the indel polymorphism-based approach easily characterizes the Citrus species, and the same may be applied to other complex groups. Likewise, other indels occurring in intergenic spacers of chloroplast regions may be tested for rapid identification of other secondary citrus species.
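The abstract above reports an inverted repeat or palindrome in the trnH-psbA spacer. As an illustration of what a DNA palindrome means here, the sketch below finds the longest stretch of a sequence that equals its own reverse complement. This is a generic centre-expansion search written for this note, not the procedure used in the study, and the example sequence is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_rc_palindrome(seq):
    """Return (start, end) of the longest reverse-complement palindrome, half-open.

    A perfect reverse-complement palindrome always has even length, so the
    search expands outwards from every gap between adjacent bases.
    """
    seq = seq.upper()
    best = (0, 0)
    for center in range(1, len(seq)):
        left, right = center - 1, center
        # Extend while the outer bases are complementary to each other.
        while (
            left >= 0
            and right < len(seq)
            and COMPLEMENT.get(seq[left]) == seq[right]
        ):
            left -= 1
            right += 1
        start, end = left + 1, right
        if end - start > best[1] - best[0]:
            best = (start, end)
    return best


# Hypothetical sequence containing the palindrome GAATTC.
example = "CCGAATTCAA"
start, end = longest_rc_palindrome(example)
print(start, end, example[start:end])
```

Ambiguity codes such as N never extend a match in this sketch, and imperfect inverted repeats with a spacer between the arms would need a more tolerant search.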
|
39
|
Mello B, Schrago CG. Assignment of Calibration Information to Deeper Phylogenetic Nodes is More Effective in Obtaining Precise and Accurate Divergence Time Estimates. Evol Bioinform Online 2014; 10:79-85. [PMID: 24855333 PMCID: PMC4022701 DOI: 10.4137/ebo.s13908] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 12/15/2013] [Revised: 03/25/2014] [Accepted: 04/03/2014] [Indexed: 11/25/2022] Open
Abstract
Divergence time estimation has become an essential tool for understanding macroevolutionary events. Molecular dating aims to obtain reliable inferences, which, within a statistical framework, means jointly increasing the accuracy and precision of estimates. Bayesian dating methods exhibit the property of a linear relationship between uncertainty and estimated divergence dates. This relationship occurs even if the number of sites approaches infinity and places a limit on the maximum precision of node ages. However, how the placement of calibration information may affect the precision of divergence time estimates remains an open question. In this study, relying on simulated and empirical data, we investigated how the location of calibration within a phylogeny affects the accuracy and precision of time estimates. We found that calibration priors set at median and deep phylogenetic nodes were associated with higher precision values compared to analyses involving calibration at the shallowest node. The results were independent of the tree symmetry. An empirical mammalian dataset produced results that were consistent with those generated by the simulated sequences. Assigning time information to the deeper nodes of a tree is crucial to guarantee the accuracy and precision of divergence times. This finding highlights the importance of the appropriate choice of outgroups in molecular dating.
Affiliation(s)
- Beatriz Mello
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Carlos G Schrago
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
|
40
|
Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol Biol 2014. [PMID: 24742000 DOI: 10.1186/1472-2148-14-82] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/08/2023] Open
Abstract
BACKGROUND Partitioning involves estimating independent models of molecular evolution for different subsets of sites in a sequence alignment, and has been shown to improve phylogenetic inference. Current methods for estimating best-fit partitioning schemes, however, are only computationally feasible with datasets of fewer than 100 loci. This is a problem because datasets with thousands of loci are increasingly common in phylogenetics. METHODS We develop two novel methods for estimating best-fit partitioning schemes on large phylogenomic datasets: strict and relaxed hierarchical clustering. These methods use information from the underlying data to cluster together similar subsets of sites in an alignment, and build on clustering approaches that have been proposed elsewhere. RESULTS We compare the performance of our methods to each other, and to existing methods for selecting partitioning schemes. We demonstrate that while strict hierarchical clustering has the best computational efficiency on very large datasets, relaxed hierarchical clustering provides scalable efficiency and returns dramatically better partitioning schemes as assessed by common criteria such as AICc and BIC scores. CONCLUSIONS These two methods provide the best current approaches to inferring partitioning schemes for very large datasets. We provide free open-source implementations of the methods in the PartitionFinder software. We hope that the use of these methods will help to improve the inferences made from large phylogenomic datasets.
Affiliation(s)
- Robert Lanfear
- Ecology Evolution and Genetics, Research School of Biology, Australian National University, Canberra, ACT, Australia.
|
41
|
Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol Biol 2014; 14:82. [PMID: 24742000 PMCID: PMC4012149 DOI: 10.1186/1471-2148-14-82] [Citation(s) in RCA: 433] [Impact Index Per Article: 39.4] [Received: 11/19/2013] [Accepted: 04/03/2014] [Indexed: 11/10/2022] Open
|
42
|
Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol Biol 2014. [PMID: 24742000 DOI: 10.6084/m9.figshare.938920] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 04/08/2023] Open
|
43
|
Blair C, Heckman KL, Russell AL, Yoder AD. Multilocus coalescent analyses reveal the demographic history and speciation patterns of mouse lemur sister species. BMC Evol Biol 2014; 14:57. [PMID: 24661555 PMCID: PMC3987692 DOI: 10.1186/1471-2148-14-57] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 12/18/2013] [Accepted: 03/18/2014] [Indexed: 12/20/2022] Open
Abstract
Background Debate continues as to whether allopatric speciation or peripatric speciation through a founder effect is the predominant force driving evolution in vertebrates. The mouse lemurs of Madagascar are a system in which evolution has generated a large number of species over a relatively recent time frame. Here, we examine speciation patterns in a pair of sister species of mouse lemur, Microcebus murinus and M. griseorufus. These two species have ranges that are disparately proportioned in size, with M. murinus showing a much more extensive range that marginally overlaps that of M. griseorufus. Given that these two species are sister taxa, the asymmetric but overlapping geographic ranges are consistent with a model of peripatric speciation. To test this hypothesis, we analyze DNA sequence data from four molecular markers using coalescent methods. If the peripatric speciation model is supported, we predict substantially greater genetic diversity in M. murinus, relative to M. griseorufus. Further, we expect a larger effective population size in M. murinus and in the common ancestor of the two species than in M. griseorufus, with a concomitant decrease in gene tree/species tree incongruence in the latter and weak signs of demographic expansion in M. murinus. Results Our results reject a model of peripatric divergence. Coalescent effective population size estimates were similar for both extant species and larger than that estimated for their most recent common ancestor. Gene tree results show similar levels of incomplete lineage sorting within species with respect to the species tree, and locus-specific estimates of genetic diversity are concordant for both species. Multilocus demographic analyses suggest range expansions for M. murinus, with this species also experiencing more recent population declines over the past 160 thousand years. 
Conclusions Results suggest that speciation occurred in allopatry from a common ancestor narrowly distributed throughout southwest Madagascar, with subsequent range expansion for M. murinus. Population decline in M. murinus is likely related to patterns of climate change in Madagascar throughout the Pleistocene, potentially exacerbated by continual anthropogenic perturbation. Genome-level data are needed to quantify the role of niche specialization and adaptation in shaping the current ranges of these species.
Affiliation(s)
- Christopher Blair
- Department of Biology, Duke University, Box 90338, BioSci 130 Science Drive, Durham, NC 27708, USA.
|
44
|
García-Pereira MJ, Carvajal-Rodríguez A, Whelan S, Caballero A, Quesada H. Impact of deep coalescence and recombination on the estimation of phylogenetic relationships among species using AFLP markers. Mol Phylogenet Evol 2014; 76:102-9. [PMID: 24631855 DOI: 10.1016/j.ympev.2014.03.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/08/2013] [Revised: 02/26/2014] [Accepted: 03/04/2014] [Indexed: 10/25/2022]
Abstract
Deep coalescence and the nongenealogical pattern of descent caused by recombination have emerged as a common problem for phylogenetic inference at the species level. Here we use computer simulations to assess whether AFLP-based phylogenies are robust to the uncertainties introduced by these factors. Our results indicate that phylogenetic signal can prevail even in the face of extensive deep coalescence, allowing recovery of the correct species tree topology. The impact of recombination on tree accuracy was related to total tree depth and species effective population size. The correct tree topology could be recovered under many simulation settings due to a trade-off between the conflicting signals resulting from intra-locus recombination and the benefits of the joint consideration of unlinked loci that better matched overall the true species tree. Errors in tree topology were determined not only by deep coalescence, but also by the timing of divergence and the tree-building errors arising from an insufficient number of characters. DNA sequences generally outperformed AFLPs under every simulated scenario, but this difference in performance was nearly negligible when a sufficient number of AFLP characters were sampled. Our simulations suggest that the impact of deep coalescence and intra-locus recombination on the reliability of AFLP trees could be minimal for effective population sizes equal to or lower than 10,000 (typical of many vertebrates and tree plants) given tree depths above 0.02 substitutions per site.
Affiliation(s)
- María Jesús García-Pereira
- Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidad de Vigo, 36310 Vigo, Spain.
| | - Antonio Carvajal-Rodríguez
- Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidad de Vigo, 36310 Vigo, Spain.
| | - Simon Whelan
- Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala 75236-SE, Sweden.
| | - Armando Caballero
- Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidad de Vigo, 36310 Vigo, Spain.
| | - Humberto Quesada
- Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidad de Vigo, 36310 Vigo, Spain.
|
45
|
Muñoz-Pajares AJ. SIDIER: substitution and indel distances to infer evolutionary relationships. Methods Ecol Evol 2013. [DOI: 10.1111/2041-210x.12118] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Affiliation(s)
- Antonio Jesús Muñoz-Pajares
- Centro de Investigação em Biodiversidade e Recursos Genéticos; CIBIO; Campus Agrário de Vairão Rua Padre Armando Quintas 4485-661 Vairão Portugal
|
46
|
Inference of global HIV-1 sequence patterns and preliminary feature analysis. Virol Sin 2013; 28:228-38. [PMID: 23913180 DOI: 10.1007/s12250-013-3348-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/04/2013] [Accepted: 07/26/2013] [Indexed: 12/12/2022] Open
Abstract
The epidemiology of HIV-1 varies in different areas of the world, and it is possible that this complexity may leave unique footprints in the viral genome. Thus, we attempted to find significant patterns in global HIV-1 genome sequences. By applying the rule inference algorithm RIPPER (Repeated Incremental Pruning to Produce Error Reduction) to multiple sequence alignments of Env sequences from four classes of compiled datasets, we generated four sets of signature patterns. We found that these patterns were able to distinguish southeastern Asian from nonsoutheastern Asian sequences with 97.5% accuracy, Chinese from non-Chinese sequences with 98.3% accuracy, African from non-African sequences with 88.4% accuracy, and southern African from non-southern African sequences with 91.2% accuracy. These patterns showed different associations with subtypes and with amino acid positions. In addition, some signature patterns were characteristic of the geographic area from which the sample was taken. Amino acid features corresponding to the phylogenetic clustering of HIV-1 sequences were consistent with some of the deduced patterns. Using a combination of patterns inferred from subtypes B, C, and all subtypes chimeric with CRF01_AE worldwide, we found that signature patterns of subtype C were extremely common in some sampled countries (for example, Zambia in southern Africa), which may hint at the origin of this HIV-1 subtype and the need to pay special attention to this area of Africa. Signature patterns of subtype B sequences were associated with different countries. Even more, there are distinct patterns at single position 21 with glycine, leucine and isoleucine corresponding to subtype C, B and all possible recombination forms chimeric with CRF01_AE, which also indicate distinct geographic features. Our method widens the scope of inference of signature from geographic, genetic, and genomic viewpoints. 
These findings may provide a valuable reference for epidemiological research or vaccine design.
|
47
|
Truong C, Divakar PK, Yahr R, Crespo A, Clerc P. Testing the use of ITS rDNA and protein-coding genes in the generic and species delimitation of the lichen genus Usnea (Parmeliaceae, Ascomycota). Mol Phylogenet Evol 2013; 68:357-72. [DOI: 10.1016/j.ympev.2013.04.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 06/14/2012] [Revised: 02/01/2013] [Accepted: 04/02/2013] [Indexed: 12/31/2022]
|
48
|
Elucidating the origin of the ExbBD components of the TonB system through Bayesian inference and maximum-likelihood phylogenies. Mol Phylogenet Evol 2013; 69:674-86. [PMID: 23891663 DOI: 10.1016/j.ympev.2013.07.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/25/2012] [Revised: 06/28/2013] [Accepted: 07/12/2013] [Indexed: 01/03/2023]
Abstract
Uptake of ferric siderophores, vitamin B12, and other molecules in gram-negative bacteria is mediated by a multi-protein complex known as the TonB system. The ExbB and ExbD protein components of the TonB system play key energizing roles and are homologous with the flagellar motor proteins MotA and MotB. Here, the phylogenetic relationships of ExbBD and MotAB were investigated using Bayesian inference and the maximum-likelihood method. Phylogenetic trees of these proteins suggest that they are separated into distinct monophyletic groups and have originated from a common ancestral system. Several horizontal gene transfer events for ExbB-ExbD are also inferred, and a model for the evolution of the TonB system is proposed.
|
49
|
Lasek-Nesselquist E, Gogarten JP. The effects of model choice and mitigating bias on the ribosomal tree of life. Mol Phylogenet Evol 2013; 69:17-38. [PMID: 23707703 DOI: 10.1016/j.ympev.2013.05.006] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 01/25/2013] [Revised: 04/26/2013] [Accepted: 05/08/2013] [Indexed: 01/03/2023]
Abstract
Deep-level relationships within Bacteria, Archaea, and Eukarya as well as the relationships of these three domains to each other require resolution. The ribosomal machinery, universal to all cellular life, represents a protein repertoire resistant to horizontal gene transfer, which provides a largely congruent signal necessary for reconstructing a tree suitable as a backbone for life's reticulate history. Here, we generate a ribosomal tree of life from a robust taxonomic sampling of Bacteria, Archaea, and Eukarya to elucidate deep-level intra-domain and inter-domain relationships. Lack of phylogenetic information and systematic errors caused by inadequate models (that cannot account for substitution rate or compositional heterogeneities) or improper model selection compound conflicting phylogenetic signals from HGT and/or paralogy. Thus, we tested several models of varying sophistication on three different datasets, performed removal of fast-evolving or long-branched Archaea and Eukarya, and employed three different strategies to remove compositional heterogeneity to examine their effects on the topological outcome. Our results support a two-domain topology for the tree of life, where Eukarya emerges from within Archaea as sister to a Korarchaeota/Thaumarchaeota (KT) or Crenarchaeota/KT clade for all models under all or at least one of the strategies employed. Taxonomic manipulation allows single-matrix and certain mixture models to vacillate between two-domain and three-domain phylogenies. We find that models vary in their ability to resolve different areas of the tree of life, which does not necessarily correlate with model complexity. For example, both single-matrix and some mixture models recover monophyletic Crenarchaeota and Euryarchaeota archaeal phyla. In contrast, the most sophisticated model recovers a paraphyletic Euryarchaeota but detects two large clades that comprise the Bacteria, which were recovered separately but never together in the other models. Overall, models recovered consistent topologies despite dataset modifications due to the removal of compositional bias, which reflects either ineffective bias reduction or robust datasets that allow models to overcome reconstruction artifacts. We recommend a comparative approach for evolutionary models to identify model weaknesses as well as consensus relationships.
|
50
|
Leavitt SD, Esslinger TL, Spribille T, Divakar PK, Thorsten Lumbsch H. Multilocus phylogeny of the lichen-forming fungal genus Melanohalea (Parmeliaceae, Ascomycota): Insights on diversity, distributions, and a comparison of species tree and concatenated topologies. Mol Phylogenet Evol 2013; 66:138-52. [DOI: 10.1016/j.ympev.2012.09.013] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 02/21/2012] [Revised: 08/31/2012] [Accepted: 09/16/2012] [Indexed: 10/27/2022]
|