1
|
Phylogenetic Relationships of the Strongyloid Nematodes of Australasian Marsupials Based on Mitochondrial Protein Sequences. Animals (Basel) 2022; 12:ani12212900. [PMID: 36359023 PMCID: PMC9655308 DOI: 10.3390/ani12212900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 10/03/2022] [Accepted: 10/20/2022] [Indexed: 12/04/2022] Open
Abstract
Simple Summary: Parasitic strongyloid nematodes endemic to the gastrointestinal tracts of Australasian marsupials are one of the most diverse groups of mammalian parasites. These nematodes are currently placed in the family Chabertiidae and comprise two subfamilies, namely the Cloacininae and Phascolostrongylinae. Their current classification relies primarily on morphological features and has not been validated using molecular data. This study aimed to determine the phylogenetic relationships of the Cloacininae and Phascolostrongylinae within the family Chabertiidae and their relationship with other groups of strongyloid nematodes from non-marsupial hosts, using mitochondrial protein sequence datasets. The findings supported the recognition of the family Cloacinidae, containing the Cloacininae and Phascolostrongylinae, as a monophyletic group within the Strongyloidea. However, the subfamily Phascolostrongylinae was paraphyletic, and the relationships of individual genera corresponded with their host families. Genera of the Cloacininae and Phascolostrongylinae occurring in macropod hosts were more closely related to each other than to genera of the Phascolostrongylinae occurring in wombats. This study suggests an alternative hypothesis for the origin of marsupial strongyloid nematodes in vombatid hosts that should be explored further using molecular approaches and more widespread sampling.
Abstract: Australasian marsupials harbour a diverse group of gastrointestinal strongyloid nematodes. These nematodes are currently grouped into two subfamilies, namely the Cloacininae and Phascolostrongylinae. Based on morphological criteria, the Cloacininae and Phascolostrongylinae were defined as monophyletic and placed in the family Cloacinidae, but this has not been supported by molecular data and they are currently placed in the Chabertiidae. Although molecular data (internal transcribed spacers of the nuclear ribosomal RNA genes or mitochondrial protein-coding genes) have been used to verify morphological classifications within the Cloacininae and Phascolostrongylinae, the phylogenetic relationships between the subfamilies have not been rigorously tested. This study determined the phylogenetic relationships of the subfamilies Cloacininae and Phascolostrongylinae using amino acid sequences conceptually translated from the twelve concatenated mitochondrial protein-coding genes. The findings demonstrated that the Cloacininae and Phascolostrongylinae formed a well-supported monophyletic assemblage, consistent with their morphological classification as an independent family, Cloacinidae. Unexpectedly, however, the subfamily Phascolostrongylinae was split into two groups comprising the genera from macropodid hosts (kangaroos and wallabies) and those from vombatid hosts (wombats). Genera of the Cloacininae and Phascolostrongylinae occurring in macropodid hosts were more closely related to each other than to the genera of the Phascolostrongylinae occurring in wombats, which formed a sister relationship with the remaining genera from macropods. These findings provide molecular evidence supporting the monophyly of the family Cloacinidae and an alternative hypothesis for the origin of marsupial strongyloid nematodes in vombatid hosts that requires further exploration using molecular approaches and additional samples.
|
2
|
Pereira EA, Ceron K, da Silva HR, Santana DJ. The dispersal between Amazonia and Atlantic Forest during the Early Neogene revealed by the biogeography of the treefrog tribe Sphaenorhynchini (Anura, Hylidae). Ecol Evol 2022; 12:e8754. [PMID: 35386873 PMCID: PMC8975791 DOI: 10.1002/ece3.8754] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/07/2021] [Revised: 01/18/2022] [Accepted: 03/04/2022] [Indexed: 11/11/2022] Open
Abstract
Amazonia and the Atlantic Forest, separated by the diagonal of open formations, are two ecoregions that comprise the most diverse tropical forests in the world. The Sphaenorhynchini tribe is among the few tribes of anurans that occur in both rainforests, and its historical biogeography has never been proposed. In this study, we inferred a dated phylogeny for the species of the Sphaenorhynchini and reconstructed their biogeographic history, describing the diversification chronology and possible patterns of dispersal and vicariance, and providing information about how orogeny, forest dynamics and allopatric speciation affected their evolution in South America. We provide a dated phylogeny and biogeographic study for the Sphaenorhynchini tribe using mitochondrial and nuclear genes. We analyzed 41 samples to estimate the ancestral areas using biogeographical analysis based on the estimated divergence times and the current geographical ranges of the species of Sphaenorhynchini. We recovered three characteristic clades that we recognize as groups of species (S. lacteus, S. planicola, and S. platycephalus groups), with S. carneus and G. pauloalvini being the sister taxa of all other species from the tribe. We found that the diversification of the tribe lineages coincided with the main climatic and geological factors that shaped the Neotropical landscape during the Cenozoic. The most recent common ancestor of the Sphaenorhynchini species emerged in the north of the Atlantic Forest and migrated to Amazonia in different dispersal events that occurred during the connections between these ecoregions. This is the first large-scale study to include an almost complete calibrated phylogeny of Sphaenorhynchini, presenting important information about the evolution and diversification of the tribe. Overall, we suggest that the biogeographic history of Sphaenorhynchini has resulted from a combination of repeated range expansion and contraction cycles concurrent with climate fluctuations and dispersal events between the Atlantic Forest and Amazonia.
Affiliation(s)
- Elvis Almeida Pereira
  - Laboratório de Herpetologia, Departamento de Biologia Animal, Universidade Federal Rural do Rio de Janeiro, Rio de Janeiro, Brazil
  - Mapinguari ‐ Laboratório de Biogeografia e Sistemática de Anfíbios e Répteis, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
  - Laboratório de Genética e Biodiversidade, Universidade Federal do Oeste do Pará, Santarém, Brazil
- Karoline Ceron
  - Mapinguari ‐ Laboratório de Biogeografia e Sistemática de Anfíbios e Répteis, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
  - Departamento de Biologia Animal, Universidade Estadual de Campinas (UNICAMP), São Paulo, Brazil
- Hélio Ricardo da Silva
  - Laboratório de Herpetologia, Departamento de Biologia Animal, Universidade Federal Rural do Rio de Janeiro, Rio de Janeiro, Brazil
- Diego José Santana
  - Mapinguari ‐ Laboratório de Biogeografia e Sistemática de Anfíbios e Répteis, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil
|
3
|
Vu HT, Vu QL, Nguyen TD, Tran N, Nguyen TC, Luu PN, Tran DD, Nguyen TK, Le L. Genetic Diversity and Identification of Vietnamese Paphiopedilum Species Using DNA Sequences. Biology 2019; 9:E9. [PMID: 31906128 PMCID: PMC7168009 DOI: 10.3390/biology9010009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/14/2019] [Revised: 12/20/2019] [Accepted: 12/20/2019] [Indexed: 12/23/2022]
Abstract
Paphiopedilum is among the most popular ornamental orchid genera due to its unique slipper flowers and attractive leaf coloration. Most of the Paphiopedilum species are in critical danger due to over-exploitation. They are listed in Appendix I of the Convention on International Trade in Endangered Species of Wild Fauna and Flora, which prevents their being traded across borders. While most Paphiopedilum species are distinctive owing to their respective flowers, their vegetative features are similar and hard to distinguish. Hence, the conservation of these species is challenging, as most traded specimens are immature and non-flowering. An urgent need exists for effective identification methods to prevent further illegal trading of Paphiopedilum species. DNA barcoding is a rapid and sensitive method for species identification, at any developmental stage, using short DNA sequences. In this study, eight loci, i.e., ITS, LEAFY, ACO, matK, trnL, rpoB, rpoC1, and trnH-psbA, were screened for potential barcode sequences on the Vietnamese Paphiopedilum species. In total, 17 out of 22 Paphiopedilum species were well identified. The studied DNA sequences were deposited in GenBank, in which Paphiopedilum dalatense accessions were introduced for the first time. ACO, LEAFY, and trnH-psbA were limited in amplification rate for Paphiopedilum. ITS was the best single barcode. Single ITS could be used along with nucleotide polymorphism characteristics for species discrimination. The combination of ITS + matK was the most efficient identification barcode for Vietnamese Paphiopedilum species. This barcode also succeeded in recognizing misidentified or wrongly named traded samples. Different bioinformatics programs and algorithms for establishing phylogenetic trees were also compared in the study to propose quick, simple, and effective tools for practical use. Both the Bayesian Inference method in the MRBAYES program and the neighbor-joining method in the MEGA software were found to meet these criteria. Our study provides a barcoding database of Vietnamese Paphiopedilum which may significantly contribute to the control and conservation of these valuable species.
Affiliation(s)
- Huyen-Trang Vu
  - Faculty of Biotechnology, Nguyen-Tat-Thanh University, 298A-300A Nguyen-Tat-Thanh Street, District 04, Hochiminh City 700000, Vietnam
  - Faculty of Biotechnology, International University—Vietnam National University, Linh Trung Ward, Thu Duc District, Hochiminh City 700000, Vietnam
- Quoc-Luan Vu
  - Tay Nguyen Institute for Scientific Research, Vietnam Academy of Science and Technology, 116 Xo Viet Nghe Tinh, Ward 7, Da Lat City, Lam Dong province 66000, Vietnam
- Thanh-Diem Nguyen
  - Faculty of Biotechnology, Nguyen-Tat-Thanh University, 298A-300A Nguyen-Tat-Thanh Street, District 04, Hochiminh City 700000, Vietnam
- Ngan Tran
  - Faculty of Biotechnology, International University—Vietnam National University, Linh Trung Ward, Thu Duc District, Hochiminh City 700000, Vietnam
- Thanh-Cong Nguyen
  - Faculty of Biotechnology, Nguyen-Tat-Thanh University, 298A-300A Nguyen-Tat-Thanh Street, District 04, Hochiminh City 700000, Vietnam
- Phuong-Nam Luu
  - Faculty of Biotechnology, Nguyen-Tat-Thanh University, 298A-300A Nguyen-Tat-Thanh Street, District 04, Hochiminh City 700000, Vietnam
- Duy-Duong Tran
  - Agricultural Genetics Institute, Pham Van Dong Street, Hanoi 100000, Vietnam
- Truong-Khoa Nguyen
  - Agricultural Genetics Institute, Pham Van Dong Street, Hanoi 100000, Vietnam
- Ly Le
  - Faculty of Biotechnology, International University—Vietnam National University, Linh Trung Ward, Thu Duc District, Hochiminh City 700000, Vietnam
|
4
|
Exploring the sequence, function, and evolutionary space of protein superfamilies using sequence similarity networks and phylogenetic reconstructions. Methods Enzymol 2019; 620:315-347. [PMID: 31072492 DOI: 10.1016/bs.mie.2019.03.015] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022]
Abstract
Integrative computational methods can facilitate the discovery of new protein functions and enzymatic reactions by enabling the observation and investigation of complex sequence-structure-function and evolutionary relationships within protein superfamilies. Here, we highlight the use of sequence similarity networks (SSNs) and phylogenetic reconstructions to map the functional divergence and evolutionary history of protein superfamilies. We exemplify this approach using the nitroreductase (NTR) flavoenzyme superfamily, demonstrating that SSN investigations can provide a rapid and effective means to classify groups of proteins, expose sequence similarity relationships across the global scale of a protein superfamily, and efficiently support detailed phylogenetic analyses. Integration of such approaches with systematic experimental characterization will expand our understanding of the functional diversity of enzymes, their evolution, and their associated physiological roles.
|
5
|
High-Throughput Reconstruction of Ancestral Protein Sequence, Structure, and Molecular Function. Methods Mol Biol 2019; 1851:135-170. [PMID: 30298396 DOI: 10.1007/978-1-4939-8736-8_8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Ancestral protein sequence reconstruction is a powerful technique for explicitly testing hypotheses about the evolution of molecular function, allowing researchers to meticulously dissect how historical changes in protein sequence impacted functional repertoire by altering the protein's 3D structure. These techniques have provided concrete, experimentally validated insights into ancient evolutionary processes and help illuminate the complex relationship between protein sequence, structure, and function. Inferring the protein family phylogenies on which ancestral sequence reconstruction depends and reconstructing the sequences themselves are amenable to high-throughput computational analysis. However, determining the structures of ancestral-reconstructed proteins and characterizing their functions typically rely on time-consuming and expensive laboratory analyses, limiting most current studies to examining a relatively small number of specific hypotheses. For this reason, we have little detailed, unbiased information about how molecular function evolves across large protein family phylogenies. Here we describe a generalized protocol that integrates ancestral sequence reconstruction with structural homology modeling and structure-based molecular affinity prediction to characterize historical changes in protein function across families with thousands of individual sequences. We highlight key steps in the analysis protocol requiring particularly careful attention to avoid introducing potential errors as well as steps for which computationally efficient subroutines can be substituted for more intensive approaches, allowing researchers to scale the analysis up or down, depending on available resources and requirements for reproducibility and scientific rigor. In our view, this approach provides a compelling complement to more laboratory-intensive procedures, generating important contextual information that can help guide detailed experiments.
|
6
|
Schrago CG, Aguiar BO, Mello B. Comparative evaluation of maximum parsimony and Bayesian phylogenetic reconstruction using empirical morphological data. J Evol Biol 2018; 31:1477-1484. [DOI: 10.1111/jeb.13344] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 02/27/2018] [Revised: 06/13/2018] [Accepted: 06/27/2018] [Indexed: 11/27/2022]
Affiliation(s)
- Carlos G. Schrago
  - Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Barbara O. Aguiar
  - Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Beatriz Mello
  - Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
|
7
|
Pan Z, Baerson SR, Wang M, Bajsa‐Hirschel J, Rimando AM, Wang X, Nanayakkara NPD, Noonan BP, Fromm ME, Dayan FE, Khan IA, Duke SO. A cytochrome P450 CYP71 enzyme expressed in Sorghum bicolor root hair cells participates in the biosynthesis of the benzoquinone allelochemical sorgoleone. The New Phytologist 2018; 218:616-629. [PMID: 29461628 PMCID: PMC5887931 DOI: 10.1111/nph.15037] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/03/2017] [Accepted: 01/08/2018] [Indexed: 05/24/2023]
Abstract
Sorgoleone, a major component of the hydrophobic root exudates of Sorghum spp., is probably responsible for many of the allelopathic properties attributed to members of this genus. Much of the biosynthetic pathway for this compound has been elucidated, with the exception of the enzyme responsible for the catalysis of the addition of two hydroxyl groups to the resorcinol ring. A library prepared from isolated Sorghum bicolor root hair cells was first mined for P450-like sequences, which were then analyzed by quantitative reverse transcription-polymerase chain reaction (RT-qPCR) to identify those preferentially expressed in root hairs. Full-length open reading frames for each candidate were generated, and then analyzed biochemically using both a yeast expression system and transient expression in Nicotiana benthamiana leaves. RNA interference (RNAi)-mediated repression in transgenic S. bicolor was used to confirm the roles of these candidates in the biosynthesis of sorgoleone in planta. A P450 enzyme, designated CYP71AM1, was found to be capable of catalyzing the formation of dihydrosorgoleone using 5-pentadecatrienyl resorcinol-3-methyl ether as substrate, as determined by gas chromatography-mass spectroscopy (GC-MS). RNAi-mediated repression of CYP71AM1 in S. bicolor resulted in decreased sorgoleone contents in multiple independent transformant events. Our results strongly suggest that CYP71AM1 participates in the biosynthetic pathway of the allelochemical sorgoleone.
Affiliation(s)
- Zhiqiang Pan
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
- Scott R. Baerson
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
- Mei Wang
  - National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, MS 38677, USA
- Joanna Bajsa‐Hirschel
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
- Agnes M. Rimando
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
- Xiaoqiang Wang
  - Department of Biological Sciences, University of North Texas, Denton, TX 76203, USA
- N. P. Dhammika Nanayakkara
  - National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, MS 38677, USA
- Brice P. Noonan
  - Department of Biology, University of Mississippi, University, MS 38677, USA
- Michael E. Fromm
  - Epicrop Technologies Inc., 5701 N. 58th Street, Suite 1, Lincoln, NE 68507, USA
- Franck E. Dayan
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
- Ikhlas A. Khan
  - National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, MS 38677, USA
- Stephen O. Duke
  - US Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, MS 38677, USA
|
8
|
Jayaswal PK, Dogra V, Shanker A, Sharma TR, Singh NK. A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species. PLoS One 2017; 12:e0184276. [PMID: 28922368 PMCID: PMC5603157 DOI: 10.1371/journal.pone.0184276] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/01/2017] [Accepted: 08/21/2017] [Indexed: 01/07/2023] Open
Abstract
Rapid advances in DNA sequencing technologies have resulted in the accumulation of large data sets in the public domain, facilitating comparative studies to provide novel insights into the evolution of life. Phylogenetic studies across the eukaryotic taxa have been reported but on the basis of a limited number of genes. Here we present a genome-wide analysis across different plant, fungal, protist, and animal species, with reference to the 36,002 expressed genes of the rice genome. Our analysis revealed 9831 genes unique to rice and 98 genes conserved across all 49 eukaryotic species analysed. The 98 genes conserved across diverse eukaryotes mostly exhibited binding and catalytic activities and shared common sequence motifs, and hence appeared to have a common origin. The 98 conserved genes belonged to 22 functional gene families including 26S protease, actin, ADP-ribosylation factor, ATP synthase, casein kinase, DEAD-box protein, DnaK, elongation factor 2, glyceraldehyde 3-phosphate, phosphatase 2A, ras-related protein, Ser/Thr protein phosphatase family protein, tubulin, ubiquitin and others. The consensus Bayesian eukaryotic tree of life developed in this study demonstrated widely separated clades of plants, fungi, and animals. Musa acuminata provided an evolutionary link between monocotyledons and dicotyledons, and Salpingoeca rosetta provided an evolutionary link between fungi and animals, indicating that protozoan species are close relatives of fungi and animals. The divergence times for 1176 species pairs were estimated accurately by integrating fossil information with synonymous substitution rates in the comprehensive set of 98 genes. The present study provides valuable insight into the evolution of eukaryotes.
Affiliation(s)
- Pawan Kumar Jayaswal
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
  - Banasthali University, Banasthali, Rajasthan, India
- Vivek Dogra
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
- Asheesh Shanker
  - Bioinformatics Programme, Centre for Biological Sciences, Central University of South Bihar, Patna, Bihar, India
- Tilak Raj Sharma
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
- Nagendra Kumar Singh
  - National Research Centre on Plant Biotechnology, IARI, Pusa, New Delhi, India
|
9
|
Dabert M, Proctor H, Dabert J. Higher-level molecular phylogeny of the water mites (Acariformes: Prostigmata: Parasitengonina: Hydrachnidiae). Mol Phylogenet Evol 2016; 101:75-90. [DOI: 10.1016/j.ympev.2016.05.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/02/2015] [Revised: 04/09/2016] [Accepted: 05/01/2016] [Indexed: 10/21/2022]
|
10
|
Su Z, Townsend JP. Utility of characters evolving at diverse rates of evolution to resolve quartet trees with unequal branch lengths: analytical predictions of long-branch effects. BMC Evol Biol 2015; 15:86. [PMID: 25968460 PMCID: PMC4429678 DOI: 10.1186/s12862-015-0364-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/15/2015] [Accepted: 04/29/2015] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The detection and avoidance of "long-branch effects" in phylogenetic inference represents a longstanding challenge for molecular phylogenetic investigations. A consequence of parallelism and convergence, long-branch effects arise in phylogenetic inference when there is unequal molecular divergence among lineages, and they can positively mislead inference based on parsimony especially, but also inference based on maximum likelihood and Bayesian approaches. Long-branch effects have been exhaustively examined by simulation studies that have compared the performance of different inference methods in specific model trees and branch length spaces. RESULTS In this paper, by generalizing the phylogenetic signal and noise analysis to quartets with uneven subtending branches, we quantify the utility of molecular characters for resolution of quartet phylogenies via parsimony. Our quantification incorporates contributions toward the correct tree from either signal or homoplasy (i.e. "the right result for either the right reason or the wrong reason"). We also characterize a highly conservative lower bound of utility that incorporates contributions to the correct tree only when they correspond to true, unobscured parsimony-informative sites (i.e. "the right result for the right reason"). We apply the generalized signal and noise analysis to classic quartet phylogenies in which long-branch effects can arise due to unequal rates of evolution or an asymmetrical topology. Application of the analysis leads to identification of branch length conditions in which inference will be inconsistent and reveals insights regarding how to improve sampling of molecular loci and taxa in order to correctly resolve phylogenies in which long-branch effects are hypothesized to exist. CONCLUSIONS The generalized signal and noise analysis provides analytical prediction of utility of characters evolving at diverse rates of evolution to resolve quartet phylogenies with unequal branch lengths. 
The analysis can be applied to identifying characters evolving at appropriate rates to resolve phylogenies in which long-branch effects are hypothesized to occur.
Affiliation(s)
- Zhuo Su
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520, USA.
- Jeffrey P Townsend
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520, USA.
  - Department of Biostatistics, Yale University, New Haven, CT, 06520, USA.
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.
  - Department of Biostatistics, Yale School of Public Health, 135 College St #222., New Haven, CT, 06511, United States of America.
|
11
|
Williams BL, Akazome Y, Oka Y, Eisthen HL. Dynamic evolution of the GnRH receptor gene family in vertebrates. BMC Evol Biol 2014; 14:215. [PMID: 25344287 PMCID: PMC4232701 DOI: 10.1186/s12862-014-0215-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/08/2014] [Accepted: 09/25/2014] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Elucidating the mechanisms underlying coevolution of ligands and receptors is an important challenge in molecular evolutionary biology. Peptide hormones and their receptors are excellent models for such efforts, given the relative ease of examining evolutionary changes in genes encoding for both molecules. Most vertebrates possess multiple genes for both the decapeptide gonadotropin releasing hormone (GnRH) and for the GnRH receptor. The evolutionary history of the receptor family, including ancestral copy number and timing of duplications and deletions, has been the subject of controversy. RESULTS We report here for the first time sequences of three distinct GnRH receptor genes in salamanders (axolotls, Ambystoma mexicanum), which are orthologous to three GnRH receptors from ranid frogs. To understand the origin of these genes within the larger evolutionary context of the gene family, we performed phylogenetic analyses and probabilistic protein homology searches of GnRH receptor genes in vertebrates and their near relatives. Our analyses revealed four points that alter previous views about the evolution of the GnRH receptor gene family. First, the "mammalian" pituitary type GnRH receptor, which is the sole GnRH receptor in humans and previously presumed to be highly derived because it lacks the cytoplasmic C-terminal domain typical of most G-protein coupled receptors, is actually an ancient gene that originated in the common ancestor of jawed vertebrates (Gnathostomata). Second, unlike previous studies, we classify vertebrate GnRH receptors into five subfamilies. Third, the order of subfamily origins is the inverse of previous proposed models. Fourth, the number of GnRH receptor genes has been dynamic in vertebrates and their ancestors, with multiple duplications and losses. 
CONCLUSION Our results provide a novel evolutionary framework for generating hypotheses concerning the functional importance of structural characteristics of vertebrate GnRH receptors. We show that five subfamilies of vertebrate GnRH receptors evolved early in the vertebrate phylogeny, followed by several independent instances of gene loss. Chief among cases of gene loss are humans, best described as degenerate with respect to GnRH receptors because we retain only a single, ancient gene.
|
12
|
Ragan MA, Chan CX. Biological Intuition in Alignment-Free Methods: Response to Posada. J Mol Evol 2013; 77:1-2. [DOI: 10.1007/s00239-013-9573-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/18/2013] [Accepted: 07/04/2013] [Indexed: 10/26/2022]
|
13
|
Budd A, Devos DP. Evaluating the Evolutionary Origins of Unexpected Character Distributions within the Bacterial Planctomycetes-Verrucomicrobia-Chlamydiae Superphylum. Front Microbiol 2012; 3:401. [PMID: 23189077 PMCID: PMC3505017 DOI: 10.3389/fmicb.2012.00401] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/15/2012] [Accepted: 10/31/2012] [Indexed: 12/26/2022] Open
Abstract
Recently, several characters that are absent from most bacteria, but which are found in many eukaryotes or archaea, have been identified within the bacterial Planctomycetes-Verrucomicrobia-Chlamydiae (PVC) superphylum. Hypotheses of the evolutionary history of such characters are commonly based on the inference of phylogenies of gene or protein families associated with the traits, estimated from multiple sequence alignments (MSAs). So far, studies of this kind have focused on the distribution of (i) two genes involved in the synthesis of sterol, (ii) tubulin genes, and (iii) c1 transfer genes. In many cases, these analyses have concluded that horizontal gene transfer (HGT) is likely to have played a role in shaping the taxonomic distribution of these gene families. In this article, we describe several issues with the inference of HGT from such analyses, in particular concerning the considerable uncertainty associated with our estimation of both gene family phylogenies (especially those containing ancient lineage divergences) and the Tree of Life (ToL), and the need for wider use and further development of explicit probabilistic models to compare hypotheses of vertical and horizontal genetic transmission. We suggest that data which is often taken as evidence for the occurrence of ancient HGT events may not be as convincing as is commonly described, and consideration of alternative theories is recommended. While focusing on analyses including PVCs, this discussion is also relevant for inferences of HGT involving other groups of organisms.
Affiliation(s)
- A. Budd
- European Molecular Biology Laboratory, Heidelberg, Germany
- D. P. Devos
- European Molecular Biology Laboratory, Heidelberg, Germany
14
Kumar S, Filipski AJ, Battistuzzi FU, Kosakovsky Pond SL, Tamura K. Statistics and truth in phylogenomics. Mol Biol Evol 2011; 29:457-72. [PMID: 21873298 DOI: 10.1093/molbev/msr202] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 01/23/2023] Open
Abstract
Phylogenomics refers to the inference of historical relationships among species using genome-scale sequence data and to the use of phylogenetic analysis to infer protein function in multigene families. With rapidly decreasing sequencing costs, phylogenomics is becoming synonymous with evolutionary analysis of genome-scale and taxonomically densely sampled data sets. In phylogenetic inference applications, this translates into very large data sets that yield evolutionary and functional inferences with extremely small variances and high statistical confidence (P value). However, reports of highly significant P values are increasing even for contrasting phylogenetic hypotheses depending on the evolutionary model and inference method used, making it difficult to establish true relationships. We argue that the assessment of the robustness of results to biological factors, that may systematically mislead (bias) the outcomes of statistical estimation, will be a key to avoiding incorrect phylogenomic inferences. In fact, there is a need for increased emphasis on the magnitude of differences (effect sizes) in addition to the P values of the statistical test of the null hypothesis. On the other hand, the amount of sequence data available will likely always remain inadequate for some phylogenomic applications, for example, those involving episodic positive selection at individual codon positions and in specific lineages. Again, a focus on effect size and biological relevance, rather than the P value, may be warranted. Here, we present a theoretical overview and discuss practical aspects of the interplay between effect sizes, bias, and P values as it relates to the statistical inference of evolutionary truth in phylogenomics.
Affiliation(s)
- Sudhir Kumar
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, Arizona, USA.
15
Brandão MM, Silva-Filho MC. Evolutionary history of Arabidopsis thaliana aminoacyl-tRNA synthetase dual-targeted proteins. Mol Biol Evol 2010; 28:79-85. [PMID: 20624849 DOI: 10.1093/molbev/msq176] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
Aminoacyl-transfer RNA (tRNA) synthetases (aaRS) are key players in translation and act early in protein synthesis by mediating the attachment of amino acids to their cognate tRNA molecules. In plants, protein synthesis may occur in three subcellular compartments (cytosol, mitochondria, and chloroplasts), which requires multiple versions of the protein to be correctly delivered to its proper destination. The organellar aaRS are nuclear encoded and equipped with targeting information at the N-terminal sequence, which enables them to be specifically translocated to their final location. Most of the aaRS families present organellar proteins that are dual targeted to mitochondria and chloroplasts. Here, we examine the dual targeting behavior of aaRS from an evolutionary perspective. Our results show that Arabidopsis thaliana aaRS sequences are a result of a horizontal gene transfer event from bacteria. However, there is no evident bias indicating one single ancestor (Cyanobacteria or Proteobacteria). The dual-targeted aaRS phylogenetic relationship was characterized into two different categories (paralogs and homologs) depending on the state recovered for both dual-targeted and cytosolic proteins. Taken together, our results suggest that the dual-targeted condition is a gain-of-function derived from gene duplication. Selection may have maintained the original function in at least one of the copies as the additional copies diverged.
Affiliation(s)
- Marcelo M Brandão
- Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, SP, Brazil
16
Dabert M, Witalinski W, Kazmierski A, Olszanowski Z, Dabert J. Molecular phylogeny of acariform mites (Acari, Arachnida): Strong conflict between phylogenetic signal and long-branch attraction artifacts. Mol Phylogenet Evol 2010; 56:222-41. [DOI: 10.1016/j.ympev.2009.12.020] [Citation(s) in RCA: 203] [Impact Index Per Article: 14.5] [Received: 10/10/2009] [Revised: 12/19/2009] [Accepted: 12/21/2009] [Indexed: 11/25/2022]
17
Cook D, Rimando AM, Clemente TE, Schröder J, Dayan FE, Nanayakkara ND, Pan Z, Noonan BP, Fishbein M, Abe I, Duke SO, Baerson SR. Alkylresorcinol synthases expressed in Sorghum bicolor root hairs play an essential role in the biosynthesis of the allelopathic benzoquinone sorgoleone. Plant Cell 2010; 22:867-87. [PMID: 20348430 PMCID: PMC2861460 DOI: 10.1105/tpc.109.072397] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 05/05/2023]
Abstract
Sorghum bicolor is considered to be an allelopathic crop species, producing phytotoxins such as the lipid benzoquinone sorgoleone, which likely accounts for many of the allelopathic properties of Sorghum spp. Current evidence suggests that sorgoleone biosynthesis occurs exclusively in root hair cells and involves the production of an alkylresorcinolic intermediate (5-[(Z,Z)-8',11',14'-pentadecatrienyl]resorcinol) derived from an unusual 16:3Δ9,12,15 fatty acyl-CoA starter unit. This led to the suggestion of the involvement of one or more alkylresorcinol synthases (ARSs), type III polyketide synthases (PKSs) that produce 5-alkylresorcinols using medium to long-chain fatty acyl-CoA starter units via iterative condensations with malonyl-CoA. In an effort to characterize the enzymes responsible for the biosynthesis of the pentadecyl resorcinol intermediate, a previously described expressed sequence tag database prepared from isolated S. bicolor (genotype BTx623) root hairs was first mined for all PKS-like sequences. Quantitative real-time RT-PCR analyses revealed that three of these sequences were preferentially expressed in root hairs, two of which (designated ARS1 and ARS2) were found to encode ARS enzymes capable of accepting a variety of fatty acyl-CoA starter units in recombinant enzyme studies. Furthermore, RNA interference experiments directed against ARS1 and ARS2 resulted in the generation of multiple independent transformant events exhibiting dramatically reduced sorgoleone levels. Thus, both ARS1 and ARS2 are likely to participate in the biosynthesis of sorgoleone in planta. The sequences of ARS1 and ARS2 were also used to identify several rice (Oryza sativa) genes encoding ARSs, which are likely involved in the production of defense-related alkylresorcinols.
Affiliation(s)
- Daniel Cook
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Agnes M. Rimando
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Thomas E. Clemente
- Center for Biotechnology, University of Nebraska, Lincoln, Nebraska 68588
- Joachim Schröder
- Universität Freiburg, Institut für Biologie II, D-79104 Freiburg, Germany
- Franck E. Dayan
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- N.P. Dhammika Nanayakkara
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, Mississippi 38677
- Zhiqiang Pan
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Brice P. Noonan
- Department of Biology, University of Mississippi, University, Mississippi 38677
- Mark Fishbein
- Department of Botany, Oklahoma State University, Stillwater, Oklahoma 74078
- Ikuro Abe
- Graduate School of Pharmaceutical Sciences, University of Tokyo, Tokyo 113-0033, Japan
- Stephen O. Duke
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
- Scott R. Baerson
- U.S. Department of Agriculture, Agricultural Research Service, Natural Products Utilization Research Unit, University, Mississippi 38677
18
Schwartz RS, Mueller RL. Branch length estimation and divergence dating: estimates of error in Bayesian and maximum likelihood frameworks. BMC Evol Biol 2010; 10:5. [PMID: 20064267 PMCID: PMC2827399 DOI: 10.1186/1471-2148-10-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 06/12/2009] [Accepted: 01/11/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Estimates of divergence dates between species improve our understanding of processes ranging from nucleotide substitution to speciation. Such estimates are frequently based on molecular genetic differences between species; therefore, they rely on accurate estimates of the number of such differences (i.e. substitutions per site, measured as branch length on phylogenies). We used simulations to determine the effects of dataset size, branch length heterogeneity, branch depth, and analytical framework on branch length estimation across a range of branch lengths. We then reanalyzed an empirical dataset for plethodontid salamanders to determine how inaccurate branch length estimation can affect estimates of divergence dates. RESULTS The accuracy of branch length estimation varied with branch length, dataset size (both number of taxa and sites), branch length heterogeneity, branch depth, dataset complexity, and analytical framework. For simple phylogenies analyzed in a Bayesian framework, branches were increasingly underestimated as branch length increased; in a maximum likelihood framework, longer branch lengths were somewhat overestimated. Longer datasets improved estimates in both frameworks; however, when the number of taxa was increased, estimation accuracy for deeper branches was less than for tip branches. Increasing the complexity of the dataset produced more misestimated branches in a Bayesian framework; however, in an ML framework, more branches were estimated more accurately. Using ML branch length estimates to re-estimate plethodontid salamander divergence dates generally resulted in an increase in the estimated age of older nodes and a decrease in the estimated age of younger nodes. CONCLUSIONS Branch lengths are misestimated in both statistical frameworks for simulations of simple datasets. However, for complex datasets, length estimates are quite accurate in ML (even for short datasets), whereas few branches are estimated accurately in a Bayesian framework. Our reanalysis of empirical data demonstrates the magnitude of effects of Bayesian branch length misestimation on divergence date estimates. Because the length of branches for empirical datasets can be estimated most reliably in an ML framework when branches are <1 substitution/site and datasets are ≥1 kb, we suggest that divergence date estimates using datasets, branch lengths, and/or analytical techniques that fall outside of these parameters should be interpreted with caution.
Affiliation(s)
- Rachel S Schwartz
- Department of Biology, Colorado State University, Fort Collins, CO 80523-1878, USA.
19
Berry IM, Athreya G, Kothari M, Daniels M, Bruno WJ, Korber B, Kuiken C, Ribeiro RM, Leitner T. The evolutionary rate dynamically tracks changes in HIV-1 epidemics: application of a simple method for optimizing the evolutionary rate in phylogenetic trees with longitudinal data. Epidemics 2009; 1:230-9. [PMID: 21352769 PMCID: PMC3053002 DOI: 10.1016/j.epidem.2009.10.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 07/10/2009] [Revised: 10/06/2009] [Accepted: 10/30/2009] [Indexed: 12/24/2022] Open
Abstract
Large-sequence datasets provide an opportunity to investigate the dynamics of pathogen epidemics. Thus, a fast method to estimate the evolutionary rate from large and numerous phylogenetic trees becomes necessary. Based on minimizing tip height variances, we optimize the root in a given phylogenetic tree to estimate the most homogenous evolutionary rate between samples from at least two different time points. Simulations showed that the method had no bias in the estimation of evolutionary rates and that it was robust to tree rooting and topological errors. We show that the evolutionary rates of HIV-1 subtype B and C epidemics have changed over time, with the rate of evolution inversely correlated to the rate of virus spread. For subtype B, the evolutionary rate slowed down and tracked the start of the HAART era in 1996. Subtype C in Ethiopia showed an increase in the evolutionary rate when the prevalence increase markedly slowed down in 1995. Thus, we show that the evolutionary rate of HIV-1 on the population level dynamically tracks epidemic events.
Affiliation(s)
- Irina Maljkovic Berry
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Center for Nonlinear Studies (CNLS), Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Department of Virology, Swedish Institute for Infectious Disease Control, SE-171 82 Solna, & Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, SE-171 77 Stockholm, Sweden
- Gayathri Athreya
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Moulik Kothari
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Marcus Daniels
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- William J. Bruno
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Bette Korber
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Carla Kuiken
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Ruy M. Ribeiro
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
- Thomas Leitner
- Theoretical Biology & Biophysics, MS K710, Los Alamos National Laboratory, Los Alamos, NM 87545, U.S.A
20
Gazave E, Lapébie P, Richards GS, Brunet F, Ereskovsky AV, Degnan BM, Borchiellini C, Vervoort M, Renard E. Origin and evolution of the Notch signalling pathway: an overview from eukaryotic genomes. BMC Evol Biol 2009; 9:249. [PMID: 19825158 PMCID: PMC2770060 DOI: 10.1186/1471-2148-9-249] [Citation(s) in RCA: 171] [Impact Index Per Article: 11.4] [Received: 07/01/2009] [Accepted: 10/13/2009] [Indexed: 12/20/2022] Open
Abstract
Background Of the 20 or so signal transduction pathways that orchestrate cell-cell interactions in metazoans, seven are involved during development. One of these is the Notch signalling pathway which regulates cellular identity, proliferation, differentiation and apoptosis via the developmental processes of lateral inhibition and boundary induction. In light of this essential role played in metazoan development, we surveyed a wide range of eukaryotic genomes to determine the origin and evolution of the components and auxiliary factors that compose and modulate this pathway. Results We searched for 22 components of the Notch pathway in 35 different species that represent 8 major clades of eukaryotes, performed phylogenetic analyses and compared the domain compositions of the two fundamental molecules: the receptor Notch and its ligands Delta/Jagged. We confirm that a Notch pathway, with true receptors and ligands is specific to the Metazoa. This study also sheds light on the deep ancestry of a number of genes involved in this pathway, while other members are revealed to have a more recent origin. The origin of several components can be accounted for by the shuffling of pre-existing protein domains, or via lateral gene transfer. In addition, certain domains have appeared de novo more recently, and can be considered metazoan synapomorphies. Conclusion The Notch signalling pathway emerged in Metazoa via a diversity of molecular mechanisms, incorporating both novel and ancient protein domains during eukaryote evolution. Thus, a functional Notch signalling pathway was probably present in Urmetazoa.
Affiliation(s)
- Eve Gazave
- Aix-Marseille Universités, Centre d'Océanologie de Marseille, Station marine d'Endoume - CNRS UMR 6540-DIMAR, rue de Batterie des Lions, 13007 Marseille, France.
21
Wertheim JO, Sanderson MJ, Worobey M, Bjork A. Relaxed molecular clocks, the bias-variance trade-off, and the quality of phylogenetic inference. Syst Biol 2009; 59:1-8. [PMID: 20525616 DOI: 10.1093/sysbio/syp072] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Indexed: 11/13/2022] Open
Abstract
Because a constant rate of DNA sequence evolution cannot be assumed to be ubiquitous, relaxed molecular clock inference models have proven useful when estimating rates and divergence dates. Furthermore, it has been recently suggested that using relaxed molecular clocks may provide superior accuracy and precision in phylogenetic inference compared with traditional time-free methods that do not incorporate a molecular clock. We perform a simulation study to determine if assuming a relaxed molecular clock does indeed improve the quality of phylogenetic inference. We analyze sequence data simulated under various rate distributions using relaxed-clocks, strict-clocks, and time-free Bayesian phylogenetic inference models. Our results indicate that no difference exists in the quality of phylogenetic inference between assuming a relaxed molecular clock and making no assumption about the clock-likeness of sequence evolution. This pattern is likely due to the bias-variance trade-off inherent in this type of phylogenetic inference. We also compared the quality of inference between Bayesian and maximum likelihood time-free inference models and found them to be qualitatively similar.
Affiliation(s)
- Joel O Wertheim
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA.
22
Maraun M, Erdmann G, Schulz G, Norton RA, Scheu S, Domes K. Multiple convergent evolution of arboreal life in oribatid mites indicates the primacy of ecology. Proc Biol Sci 2009; 276:3219-27. [PMID: 19535377 PMCID: PMC2817162 DOI: 10.1098/rspb.2009.0425] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 03/13/2009] [Accepted: 05/27/2009] [Indexed: 11/12/2022] Open
Abstract
Frequent convergent evolution in phylogenetically unrelated taxa points to the importance of ecological factors during evolution, whereas convergent evolution in closely related taxa indicates the importance of favourable pre-existing characters (pre-adaptations). We investigated the transitions to arboreal life in oribatid mites (Oribatida, Acari), a group of mostly soil-living arthropods. We evaluated which general force (ecological factors, historical constraints or chance) was dominant in the evolution of arboreal life in oribatid mites. A phylogenetic study of 51 oribatid mite species and four outgroup taxa, using the ribosomal 18S rDNA region, indicates that arboreal life evolved at least 15 times independently. Arboreal oribatid mite species are not randomly distributed in the phylogenetic tree, but are concentrated among strongly sclerotized, sexual and evolutionarily younger taxa. They convergently evolved a capitate sensillus, an anemoreceptor that either precludes overstimulation in the exposed bark habitat or functions as a gravity receptor. Sexual reproduction and strong sclerotization were important pre-adaptations for colonizing the bark of trees that facilitated the exploitation of living resources (e.g. lichens) and served as predator defence, respectively. Overall, our results indicate that ecological factors are most important for the observed pattern of convergent evolution of arboreal life in oribatid mites, supporting an adaptationist view of evolution.
Affiliation(s)
- Mark Maraun
- Universität Göttingen, Institut für Zoologie und Anthropologie, 37073 Göttingen, Germany.
23
Liu Y, Leigh JW, Brinkmann H, Cushion MT, Rodriguez-Ezpeleta N, Philippe H, Lang BF. Phylogenomic analyses support the monophyly of Taphrinomycotina, including Schizosaccharomyces fission yeasts. Mol Biol Evol 2008; 26:27-34. [PMID: 18922765 DOI: 10.1093/molbev/msn221] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.4] [Indexed: 11/14/2022] Open
Abstract
Several morphologically dissimilar ascomycete fungi including Schizosaccharomyces, Taphrina, Saitoella, Pneumocystis, and Neolecta have been grouped into the taxon Taphrinomycotina (Archiascomycota or Archiascomycotina), originally based on rRNA phylogeny. These analyses lack statistically significant support for the monophyly of this grouping, and although confirmed by more recent multigene analyses, this topology is contradicted by mitochondrial phylogenies. To resolve this inconsistency, we have assembled phylogenomic mitochondrial and nuclear data sets from four distantly related taphrinomycotina taxa: Schizosaccharomyces pombe, Pneumocystis carinii, Saitoella complicata, and Taphrina deformans. Our phylogenomic analyses based on nuclear data (113 proteins) conclusively support the monophyly of Taphrinomycotina, diverging as a sister group to Saccharomycotina + Pezizomycotina. However, despite the improved taxon sampling, Taphrinomycotina continue to be paraphyletic with the mitochondrial data set (13 proteins): Schizosaccharomyces species associate with budding yeasts (Saccharomycotina) and the other Taphrinomycotina group as a sister group to Saccharomycotina + Pezizomycotina. Yet, as Schizosaccharomyces and Saccharomycotina species are fast evolving, the mitochondrial phylogeny may be influenced by a long-branch attraction (LBA) artifact. After removal of fast-evolving sequence positions from the mitochondrial data set, we recover the monophyly of Taphrinomycotina. Our combined results suggest that Taphrinomycotina is a legitimate taxon, that this group of species diverges as a sister group to Saccharomycotina + Pezizomycotina, and that phylogenetic positioning of yeasts and fission yeasts with mitochondrial data is plagued by a strong LBA artifact.
Affiliation(s)
- Yu Liu
- Robert Cedergren Centre, Département de Biochimie, Université de Montréal, Montréal, Québec, Canada
24
Khan HA, Arif IA, Bahkali AH, Al Farhan AH, Al Homaidan AA. Bayesian, maximum parsimony and UPGMA models for inferring the phylogenies of antelopes using mitochondrial markers. Evol Bioinform Online 2008; 4:263-70. [PMID: 19204824 PMCID: PMC2614192 DOI: 10.4137/ebo.s934] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023] Open
Abstract
This investigation aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models: Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of the respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). Across the four taxa, there were few transitions (2.35%) and no transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions). All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to Bayesian trees except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Affiliation(s)
- Haseeb A Khan
- Molecular Fingerprinting and Biodiversity Unit, Prince Sultan Research Chair Program in Environment and Wildlife, College of Science, King Saud University, Riyadh, Saudi Arabia.
25
Wróbel B. Statistical measures of uncertainty for branches in phylogenetic trees inferred from molecular sequences by using model-based methods. J Appl Genet 2008; 49:49-67. [DOI: 10.1007/bf03195249] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
26
Abstract
There are many examples of groups (such as birds, bees, mammals, multicellular animals, and flowering plants) that have undergone a rapid radiation. In such cases, where there is a combination of short internal and long external branches, correctly estimating and rooting phylogenetic trees is known to be a difficult problem. In this simulation study, we tested the performances of different phylogenetic methods at estimating a tree that models a rapid radiation. We found that maximum likelihood, corrected and uncorrected neighbor-joining, and corrected and uncorrected parsimony, all suffer from biases toward specific tree topologies. In addition, we found that using a single-taxon outgroup to root a tree frequently disrupts an otherwise correct ingroup phylogeny. Moreover, for uncorrected parsimony, we found cases where several individual trees (in which the outgroup was placed incorrectly) were selected more frequently than the correct tree. Even for parameter settings where the correct tree was selected most frequently when using extremely long sequences, for sequences of up to 60,000 nucleotides the incorrectly rooted trees were each selected more frequently than the correct tree. For all the cases tested here, tree estimation using a two taxon outgroup was more accurate than when using a single-taxon outgroup. However, the ingroup was most accurately recovered when no outgroup was used.
Affiliation(s)
- Liat Shavit
- The Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.
27
Kolaczkowski B, Thornton JW. Effects of branch length uncertainty on Bayesian posterior probabilities for phylogenetic hypotheses. Mol Biol Evol 2007; 24:2108-18. [PMID: 17636043 DOI: 10.1093/molbev/msm141] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022] Open
Abstract
In Bayesian phylogenetics, confidence in evolutionary relationships is expressed as posterior probability--the probability that a tree or clade is true given the data, evolutionary model, and prior assumptions about model parameters. Model parameters, such as branch lengths, are never known in advance; Bayesian methods incorporate this uncertainty by integrating over a range of plausible values given an assumed prior probability distribution for each parameter. Little is known about the effects of integrating over branch length uncertainty on posterior probabilities when different priors are assumed. Here, we show that integrating over uncertainty using a wide range of typical prior assumptions strongly affects posterior probabilities, causing them to deviate from those that would be inferred if branch lengths were known in advance; only when there is no uncertainty to integrate over does the average posterior probability of a group of trees accurately predict the proportion of correct trees in the group. The pattern of branch lengths on the true tree determines whether integrating over uncertainty pushes posterior probabilities upward or downward. The magnitude of the effect depends on the specific prior distributions used and the length of the sequences analyzed. Under realistic conditions, however, even extraordinarily long sequences are not enough to prevent frequent inference of incorrect clades with strong support. We found that across a range of conditions, diffuse priors--either flat or exponential distributions with moderate to large means--provide more reliable inferences than small-mean exponential priors. An empirical Bayes approach that fixes branch lengths at their maximum likelihood estimates yields posterior probabilities that more closely match those that would be inferred if the true branch lengths were known in advance and reduces the rate of strongly supported false inferences compared with fully Bayesian integration.
28
Molecular phylogeny of the Drosophila obscura species group, with emphasis on the Old World species. BMC Evol Biol 2007; 7:87. [PMID: 17555574 PMCID: PMC1904182 DOI: 10.1186/1471-2148-7-87] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Received: 12/03/2006] [Accepted: 06/07/2007] [Indexed: 11/10/2022] Open
Abstract
Background Species of the Drosophila obscura species group (e.g., D. pseudoobscura, D. subobscura) have served as favorable models in evolutionary studies since the 1930s. Despite numerous studies conducted with varied types of data, the basal phylogeny in this group is still controversial, presumably owing not only to the hypothetical 'rapid radiation' history of this group, but also to limited taxon sampling from the Old World (esp. the Oriental and Afrotropical regions). Here we reconstruct the phylogeny of this group by using sequence data from 6 loci of 21 species (including 16 Old World ones) covering all 6 subgroups of this group, estimate the divergence times among lineages, and statistically test the 'rapid radiation' hypothesis. Results Phylogenetic analyses indicate that each of the subobscura, sinobscura, affinis, and pseudoobscura subgroups is monophyletic. The subobscura and microlabis subgroups form the basal clade in the obscura group. Some species of the obscura subgroup (the D. ambigua/D. obscura/D. tristis triad plus the D. subsilvestris/D. dianensis pair) form a monophyletic group which appears to be most closely related to the sinobscura subgroup. The remaining basal relationships in the obscura group are not resolved by the present study. Divergence times on a ML tree based on mtDNA data are estimated with a calibration of 30–35 Mya for the divergence between the obscura and melanogaster groups. The result suggests that at least half of the current major lineages of the obscura group originated by the mid-Miocene (~15 Mya), the time of the last development and fragmentation of the temperate forest in the Northern Hemisphere. Conclusion The obscura group began to diversify rapidly before invading the New World. The subobscura and microlabis subgroups form the basal clade in this group. The obscura subgroup is paraphyletic. Some members of this subgroup (D. ambigua, D. obscura, D. tristis, D. subsilvestris, and D. dianensis) form a monophyletic group which appears to be most closely related to the sinobscura subgroup.
|
29
|
Castoe TA, Stephens T, Noonan BP, Calestani C. A novel group of type I polyketide synthases (PKS) in animals and the complex phylogenomics of PKSs. Gene 2007; 392:47-58. [PMID: 17207587 DOI: 10.1016/j.gene.2006.11.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Received: 06/30/2006] [Revised: 11/03/2006] [Accepted: 11/10/2006] [Indexed: 11/18/2022]
Abstract
Type I polyketide synthases (PKSs), and related fatty acid synthases (FASs), represent a large group of proteins encoded by a diverse gene family that occurs in eubacteria and eukaryotes (mainly in fungi). Collectively, enzymes encoded by this gene family produce a wide array of polyketide compounds that encompass a broad spectrum of biological activity including antibiotic, antitumor, antifungal, immunosuppressive, and predator defense functional roles. We employed a phylogenomics approach to estimate relationships among members of this gene family from eubacterial and eukaryotic genomes. Our results suggest that some animal genomes (sea urchins, birds, and fish) possess a previously unidentified group of pks genes, in addition to possessing fas genes used in fatty acid metabolism. These pks genes in the chicken, fish, and sea urchin genomes do not appear to be closely related to any other animal or fungal genes, and instead are closely related to pks genes from the slime mold Dictyostelium and eubacteria. Continued accumulation of genome sequence data from diverse animal lineages is required to clarify whether the presence of these (non-fas) pks genes in animal genomes owes its origin to horizontal gene transfer (from eubacterial or Dictyostelium genomes) or to more conventional patterns of vertical inheritance coupled with massive gene loss in several animal lineages. Additionally, results of our broad-scale phylogenetic analyses bolster the support for previous hypotheses of horizontal gene transfer of pks genes from bacterial to fungal and protozoan lineages.
Affiliation(s)
- Todd A Castoe
- Department of Biology, University of Central Florida, 4000 Central Florida Blvd., Orlando, FL 32816-2368, USA
|
30
|
Abstract
Phylogenetic analysis has changed greatly in the last decade, and the most important themes in that change are reviewed here. Sequence data have become the most common source of phylogenetic information. This means that explicit models for evolutionary processes have been developed in a likelihood context, which allow more realistic data analyses. These models are becoming increasingly complex, both for nucleotides and for amino acid sequences, and so all such models need to be quantitatively assessed for each data set, to find the most appropriate one for use in any particular tree-building analysis. Bayesian analysis has been developed for tree-building and is greatly increasing in popularity. This is because a good heuristic strategy exists, which allows large data sets to be analyzed with complex evolutionary models in a practical time. Perhaps the most disappointing aspect of tree interpretation is the ongoing confusion between rooted and unrooted trees, while the effect of taxon and character sampling is often overlooked when constructing a phylogeny (especially in parasitology). The review finishes with a detailed consideration of the analysis of a multi-gene data set for several dozen taxa of Cryptosporidium (Apicomplexa), illustrating many of the theoretical and practical points highlighted in the review.
Affiliation(s)
- David A Morrison
- Department of Parasitology (SWEPAR), National Veterinary Institute and Swedish University of Agricultural Sciences, 751 89 Uppsala, Sweden
|
31
|
Beiko RG, Keith JM, Harlow TJ, Ragan MA. Searching for convergence in phylogenetic Markov chain Monte Carlo. Syst Biol 2006; 55:553-65. [PMID: 16857650 DOI: 10.1080/10635150600812544] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 10/24/2022] Open
Abstract
Markov chain Monte Carlo (MCMC) is a methodology that is gaining widespread use in the phylogenetics community and is central to phylogenetic software packages such as MrBayes. An important issue for users of MCMC methods is how to select appropriate values for adjustable parameters such as the length of the Markov chain or chains, the sampling density, the proposal mechanism, and, if Metropolis-coupled MCMC is being used, the number of heated chains and their temperatures. Although some parameter settings have been examined in detail in the literature, others are frequently chosen with more regard to computational time or personal experience with other data sets. Such choices may lead to inadequate sampling of tree space or an inefficient use of computational resources. We performed a detailed study of convergence and mixing for 70 randomly selected, putatively orthologous protein sets with different sizes and taxonomic compositions. Replicated runs from multiple random starting points permit a more rigorous assessment of convergence, and we developed two novel statistics, delta and epsilon, for this purpose. Although likelihood values invariably stabilized quickly, adequate sampling of the posterior distribution of tree topologies took considerably longer. Our results suggest that multimodality is common for data sets with 30 or more taxa and that this results in slow convergence and mixing. However, we also found that the pragmatic approach of combining data from several short, replicated runs into a "metachain" to estimate bipartition posterior probabilities provided good approximations, and that such estimates were no worse in approximating a reference posterior distribution than those obtained using a single long run of the same length as the metachain. Precision appears to be best when heated Markov chains have low temperatures, whereas chains with high temperatures appear to sample trees with high posterior probabilities only rarely.
Affiliation(s)
- Robert G Beiko
- ARC Centre in Bioinformatics and Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia.
|
32
|
Affiliation(s)
- Ross H Crozier
- School of Marine and Tropical Biology, James Cook University, Townsville, Queensland 4811, Australia.
|
33
|
Mar JC, Harlow TJ, Ragan MA. Bayesian and maximum likelihood phylogenetic analyses of protein sequence data under relative branch-length differences and model violation. BMC Evol Biol 2005; 5:8. [PMID: 15676079 PMCID: PMC549035 DOI: 10.1186/1471-2148-5-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Received: 07/23/2004] [Accepted: 01/28/2005] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Bayesian phylogenetic inference holds promise as an alternative to maximum likelihood, particularly for large molecular-sequence data sets. We have investigated the performance of Bayesian inference with empirical and simulated protein-sequence data under conditions of relative branch-length differences and model violation. RESULTS With empirical protein-sequence data, Bayesian posterior probabilities provide more-generous estimates of subtree reliability than does the nonparametric bootstrap combined with maximum likelihood inference, reaching 100% posterior probability at bootstrap proportions around 80%. With simulated 7-taxon protein-sequence datasets, Bayesian posterior probabilities are somewhat more generous than bootstrap proportions, but do not saturate. Compared with likelihood, Bayesian phylogenetic inference can be as or more robust to relative branch-length differences for datasets of this size, particularly when among-sites rate variation is modeled using a gamma distribution. When the (known) correct model was used to infer trees, Bayesian inference recovered the (known) correct tree in 100% of instances in which one or two branches were up to 20-fold longer than the others. At ratios more extreme than 20-fold, topological accuracy of reconstruction degraded only slowly when only one branch was of relatively greater length, but more rapidly when there were two such branches. Under an incorrect model of sequence change, inaccurate trees were sometimes observed at less extreme branch-length ratios, and (particularly for trees with single long branches) such trees tended to be more inaccurate. The effect of model violation on accuracy of reconstruction for trees with two long branches was more variable, but gamma-corrected Bayesian inference nonetheless yielded more-accurate trees than did either maximum likelihood or uncorrected Bayesian inference across the range of conditions we examined. 
Assuming an exponential Bayesian prior on branch lengths did not improve, and under certain extreme conditions significantly diminished, performance. The two topology-comparison metrics we employed, edit distance and Robinson-Foulds symmetric distance, yielded different but highly complementary measures of performance. CONCLUSIONS Our results demonstrate that Bayesian inference can be relatively robust against biologically reasonable levels of relative branch-length differences and model violation, and thus may provide a promising alternative to maximum likelihood for inference of phylogenetic trees from protein-sequence data.
Affiliation(s)
- Jessica C Mar
- Department of Mathematics, The University of Queensland, Brisbane, Qld 4072, Australia
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115 USA
- Timothy J Harlow
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Qld 4072, Australia
- Australian Research Council (ARC) Centre in Bioinformatics, Australia
- Mark A Ragan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Qld 4072, Australia
- Australian Research Council (ARC) Centre in Bioinformatics, Australia
- Program in Evolutionary Biology, Canadian Institute for Advanced Research, Canada
|