1
Kulikov N, Derakhshandeh F, Mayer C. Machine learning can be as good as maximum likelihood when reconstructing phylogenetic trees and determining the best evolutionary model on four taxon alignments. Mol Phylogenet Evol 2024; 200:108181. [PMID: 39209046 DOI: 10.1016/j.ympev.2024.108181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 07/15/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
Phylogenetic tree reconstruction with molecular data is important in many fields of life science research. The gold standard in this discipline is the phylogenetic tree reconstruction based on the Maximum Likelihood method. In this study, we present neural networks to predict the best model of sequence evolution and the correct topology for four-sequence alignments of nucleotide or amino acid sequence data. We trained neural networks with different architectures using simulated alignments for a wide range of evolutionary models, model parameters and branch lengths. By comparing the accuracy of model and topology prediction of the trained neural networks with Maximum Likelihood and Neighbour Joining methods, we show that for quartet trees, the neural network classifier outperforms the Neighbour Joining method and is in most cases as good as the Maximum Likelihood method at inferring the best model of sequence evolution and the best tree topology. These results are consistent for nucleotide and amino acid sequence data. We also show that our method is superior for model selection to previously published methods based on convolutional networks. Furthermore, we found that neural network classifiers are much faster than the IQ-TREE implementation of the Maximum Likelihood method. Our results show that neural networks could become a true competitor for the Maximum Likelihood method in phylogenetic reconstructions.
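For readers unfamiliar with the quartet setting, the sketch below illustrates why topology prediction on four taxa is a three-way classification: there are exactly three unrooted topologies, and each is supported by one kind of informative site pattern. This is a simple parsimony-style count, not the neural network classifier described in the abstract; the function names and the toy alignment are hypothetical.

```python
from collections import Counter

# The three unrooted topologies for taxa A, B, C, D, keyed by the
# site pattern that supports each one under parsimony.
TOPOLOGIES = {
    "xxyy": "((A,B),(C,D))",
    "xyxy": "((A,C),(B,D))",
    "xyyx": "((A,D),(B,C))",
}


def site_pattern(column):
    """Return 'xxyy', 'xyxy' or 'xyyx' for an informative column, else None."""
    a, b, c, d = column
    if a == b and c == d and a != c:
        return "xxyy"
    if a == c and b == d and a != b:
        return "xyxy"
    if a == d and b == c and a != b:
        return "xyyx"
    return None


def quartet_support(seqs):
    """Count the informative columns that support each quartet topology."""
    counts = Counter({tree: 0 for tree in TOPOLOGIES.values()})
    for column in zip(*seqs):
        pattern = site_pattern(column)
        if pattern is not None:
            counts[TOPOLOGIES[pattern]] += 1
    return counts


# Toy alignment (hypothetical data): rows are taxa A, B, C, D.
alignment = ["AACGT", "AACGA", "GGCTT", "GGCTA"]
support = quartet_support(alignment)
best = max(support, key=support.get)
print(support, best)
```

A classifier trained on simulated alignments, as in the study, would replace the hand-written counting rule with learned features, but its output space is the same three topologies.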
Affiliation(s)
- Nikita Kulikov
  - Molecular Evolutionary Biology, Department of Biology, Hamburg University, Germany; Leibniz Institute for the Analysis of Biodiversity Change (LIB), Germany
- Fatemeh Derakhshandeh
  - Leibniz Institute for the Analysis of Biodiversity Change (LIB), Germany; Medical Faculty, Heidelberg University, Germany
- Christoph Mayer
  - Leibniz Institute for the Analysis of Biodiversity Change (LIB), Germany
2
Rolemberg Santana Travaglini Berti de Correia C, Torres C, Gomes E, Maffei Rodriguez G, Klaysson Pereira Regatieri W, Takamiya NT, Aparecida Rogerio L, Malavazi I, Damário Gomes M, Dener Damasceno J, Luiz da Silva V, Antonio Fernandes de Oliveira M, Santos da Silva M, Silva Nascimento A, Cappellazzo Coelho A, Regina Maruyama S, Teixeira FR. Functional characterization of Cullin-1-RING ubiquitin ligase (CRL1) complex in Leishmania infantum. PLoS Pathog 2024; 20:e1012336. [PMID: 39018347 DOI: 10.1371/journal.ppat.1012336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 07/29/2024] [Accepted: 06/10/2024] [Indexed: 07/19/2024] Open
Abstract
Cullin-1-RING ubiquitin ligases (CRL1) or SCF1 (SKP1-CUL1-RBX1) E3 ubiquitin ligases are the largest and most extensively investigated class of E3 ligases in mammals that regulate fundamental processes, such as the cell cycle and proliferation. These enzymes are multiprotein complexes comprising SKP1, CUL1, RBX1, and an F-box protein that acts as a specificity factor by interacting with SKP1 through its F-box domain and recruiting substrates via other domains. E3 ligases are important players in the ubiquitination process, recognizing and transferring ubiquitin to substrates destined for degradation by proteasomes or processing by deubiquitinating enzymes. The ubiquitin-proteasome system (UPS) is the main regulator of intracellular proteolysis in eukaryotes and is required for parasites to alternate hosts in their life cycles, resulting in successful parasitism. Leishmania UPS is poorly investigated, and CRL1 in L. infantum, the causative agent of visceral leishmaniasis in Latin America, is yet to be described. Here, we show that the L. infantum genes LINF_110018100 (SKP1-like protein), LINF_240029100 (cullin-like protein-like protein), and LINF_210005300 (ring-box protein 1 -putative) form a LinfCRL1 complex structurally similar to the H. sapiens CRL1. Mass spectrometry analysis of the LinfSkp1 and LinfCul1 interactomes revealed proteins involved in several intracellular processes, including six F-box proteins known as F-box-like proteins (Flp) (data are available via ProteomeXchange with identifier PXD051961). The interaction of LinfFlp 1-6 with LinfSkp1 was confirmed, and using in vitro ubiquitination assays, we demonstrated the function of the LinfCRL1(Flp1) complex to transfer ubiquitin. We also found that LinfSKP1 and LinfRBX1 knockouts resulted in nonviable L. infantum lineages, whereas LinfCUL1 was involved in parasite growth and rosette formation. 
Finally, our results suggest that LinfCul1 regulates the S phase progression and possibly the transition between the late S to G2 phase in L. infantum. Thus, a new class of E3 ubiquitin ligases has been described in L. infantum with functions related to various parasitic processes that may serve as prospective targets for leishmaniasis treatment.
Affiliation(s)
- Camila Rolemberg Santana Travaglini Berti de Correia
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
  - Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Caroline Torres
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
- Ellen Gomes
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
- Nayore Tamie Takamiya
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
- Iran Malavazi
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
- Marcelo Damário Gomes
  - Department of Biochemistry and Immunology, Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil
- Jeziel Dener Damasceno
  - Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, United Kingdom
- Vitor Luiz da Silva
  - Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil
  - Department of Chemical and Biological Sciences, Biosciences Institute, São Paulo State University (UNESP), Botucatu, Brazil
- Marcelo Santos da Silva
  - Department of Biochemistry, Institute of Chemistry, University of São Paulo, São Paulo, Brazil
- Sandra Regina Maruyama
  - Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, Brazil
3
Radaszkiewicz KA, Sulcova M, Kohoutkova E, Harnos J. The role of prickle proteins in vertebrate development and pathology. Mol Cell Biochem 2024; 479:1199-1221. [PMID: 37358815 PMCID: PMC11116189 DOI: 10.1007/s11010-023-04787-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/23/2023] [Accepted: 06/09/2023] [Indexed: 06/27/2023]
Abstract
Prickle is an evolutionarily conserved family of proteins exclusively associated with planar cell polarity (PCP) signalling. This signalling pathway provides directional and positional cues to eukaryotic cells along the plane of an epithelial sheet, orthogonal to both apicobasal and left-right axes. Through studies in the fruit fly Drosophila, we have learned that PCP signalling is manifested by the spatial segregation of two protein complexes, namely Prickle/Vangl and Frizzled/Dishevelled. While Vangl, Frizzled, and Dishevelled proteins have been extensively studied, Prickle has been largely neglected. This is likely because its role in vertebrate development and pathologies is still being explored and is not yet fully understood. The current review aims to address this gap by summarizing our current knowledge on vertebrate Prickle proteins and to cover their broad versatility. Accumulating evidence suggests that Prickle is involved in many developmental events, contributes to homeostasis, and can cause diseases when its expression and signalling properties are deregulated. This review highlights the importance of Prickle in vertebrate development, discusses the implications of Prickle-dependent signalling in pathology, and points out the blind spots or potential links regarding Prickle, which could be studied further.
Affiliation(s)
- K A Radaszkiewicz
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czechia
- M Sulcova
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czechia
- E Kohoutkova
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czechia
- J Harnos
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, 62500, Czechia
4
Atallah OO, Yassin SM, Verchot J. New Insights into Hop Latent Viroid Detection, Infectivity, Host Range, and Transmission. Viruses 2023; 16:30. [PMID: 38257731 PMCID: PMC10819085 DOI: 10.3390/v16010030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/21/2023] [Accepted: 12/21/2023] [Indexed: 01/24/2024] Open
Abstract
Hop latent viroid (HLVd), a subviral pathogen from the family Pospiviroidae, is a major threat to the global cannabis industry and is the causative agent for "dudding disease". Infected plants can often be asymptomatic for a period of growth and then develop symptoms such as malformed and yellowing leaves, as well as stunted growth. During flowering, HLVd-infected plants show reduced levels of valuable metabolites. This study was undertaken to expand our basic knowledge of HLVd infectivity, transmission, and host range. HLVd-specific primers were used for RT-PCR detection in plant samples and were able to detect HLVd in as little as 5 picograms of total RNA. A survey of hemp samples obtained from a diseased production system showed infection by HLVd alone (72%), with no co-infection by hop stunt viroid. HLVd was infectious through successive passage assays using a crude sap or total RNA extract derived from infected hemp. HLVd was also highly transmissible through hemp seeds at rates of 58 to 80%. Host range assays revealed new hosts for HLVd: tomato, cucumber, chrysanthemum, Nicotiana benthamiana, and Arabidopsis thaliana (Col-0). Sequence analysis of 77 isolates revealed only 3 parsimony-informative sites, while 10 sites were detected among all HLVd isolates available in GenBank. The phylogenetic relationship among HLVd isolates allowed for inferring two major clades based on the genetic distance. Our findings facilitate further studies on host-viroid interaction and viroid management.
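The abstract reports counts of parsimony-informative sites. As a reminder of the standard definition, a site is parsimony-informative when at least two different character states each occur in at least two sequences. The following is a minimal sketch of that count; the function names, the choice of ignored symbols, and the toy alignment are assumptions for illustration and do not reproduce the authors' analysis.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-N?"):
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(ch for ch in column.upper() if ch not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(seqs):
    """Return the 0-based indices of parsimony-informative columns."""
    return [
        i
        for i, col in enumerate(zip(*seqs))
        if is_parsimony_informative("".join(col))
    ]


# Toy alignment (hypothetical data), one aligned sequence per row.
seqs = ["ACGTA", "ACGTC", "ATGAA", "ATGAA", "ACG-A"]
sites = informative_sites(seqs)
print(sites)
```

In the toy data, the last column is variable but not informative because the state C occurs only once, which is why variable-site and informative-site counts can differ.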
Affiliation(s)
- Jeanmarie Verchot
  - Department of Plant Pathology & Microbiology, Texas A&M University, College Station, TX 77843, USA; (O.O.A.); (S.M.Y.)
5
Burgstaller-Muehlbacher S, Crotty SM, Schmidt HA, Reden F, Drucks T, von Haeseler A. ModelRevelator: Fast phylogenetic model estimation via deep learning. Mol Phylogenet Evol 2023; 188:107905. [PMID: 37595933 DOI: 10.1016/j.ympev.2023.107905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 08/12/2023] [Indexed: 08/20/2023]
Abstract
Selecting the best model of sequence evolution for a multiple sequence alignment (MSA) constitutes the first step of phylogenetic tree reconstruction. Common approaches for inferring nucleotide models typically apply maximum likelihood (ML) methods, with discrimination between models determined by one of several information criteria. This requires tree reconstruction and optimisation which can be computationally expensive. We demonstrate that neural networks can be used to perform model selection, without the need to reconstruct trees, optimise parameters, or calculate likelihoods. We introduce ModelRevelator, a model selection tool underpinned by two deep neural networks. The first neural network, NNmodelfind, recommends one of six commonly used models of sequence evolution, ranging in complexity from Jukes and Cantor to General Time Reversible. The second, NNalphafind, recommends whether or not a Γ-distributed rate heterogeneous model should be incorporated, and if so, provides an estimate of the shape parameter, α. Users can simply input an MSA into ModelRevelator, and swiftly receive output recommending the evolutionary model, inclusive of the presence or absence of rate heterogeneity, and an estimate of α. We show that ModelRevelator performs comparably with likelihood-based methods and the recently published machine learning method ModelTeller over a wide range of parameter settings, with significant potential savings in computational effort. Further, we show that this performance is not restricted to the alignments on which the networks were trained, but is maintained even on unseen empirical data. We expect that ModelRevelator will provide a valuable alternative for phylogeneticists, especially where traditional methods of model selection are computationally prohibitive.
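The likelihood-based baseline mentioned in the abstract ranks candidate models by an information criterion. The sketch below shows the two most common ones, AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL, where k is the number of free parameters and n the number of alignment sites. It is not ModelRevelator, which avoids computing likelihoods altogether; the log-likelihood values are invented for illustration, and only substitution-model parameters are counted on the assumption that branch lengths contribute the same number of parameters to every candidate.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n_sites):
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better)."""
    return k * math.log(n_sites) - 2 * log_likelihood


# Hypothetical fits on one alignment: (model, lnL, free model parameters).
# JC has no free parameters, HKY has 4 (kappa + 3 frequencies),
# GTR has 8 (5 exchange rates + 3 frequencies).
N_SITES = 1000
CANDIDATES = [("JC", -5230.4, 0), ("HKY", -5101.7, 4), ("GTR", -5098.9, 8)]

best_aic = min(CANDIDATES, key=lambda m: aic(m[1], m[2]))[0]
best_bic = min(CANDIDATES, key=lambda m: bic(m[1], m[2], N_SITES))[0]
print(best_aic, best_bic)
```

Obtaining each lnL requires optimising a tree and model parameters per candidate, which is the computational cost the abstract says a neural network can sidestep.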
Affiliation(s)
- Sebastian Burgstaller-Muehlbacher
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna BioCenter (VBC) 5, 1030 Vienna, Austria
- Stephen M Crotty
  - School of Mathematical Sciences, University of Adelaide, Adelaide, SA 5005, Australia; ARC Centre of Excellence for Mathematical and Statistical Frontiers, University of Adelaide, Adelaide, SA 5005, Australia
- Heiko A Schmidt
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna BioCenter (VBC) 5, 1030 Vienna, Austria
- Franziska Reden
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna BioCenter (VBC) 5, 1030 Vienna, Austria
- Tamara Drucks
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna BioCenter (VBC) 5, 1030 Vienna, Austria; Research Unit Machine Learning, TU Wien, 1040 Vienna, Austria
- Arndt von Haeseler
  - Center for Integrative Bioinformatics Vienna, Max Perutz Labs, University of Vienna and Medical University of Vienna, Vienna BioCenter (VBC) 5, 1030 Vienna, Austria; Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Währinger Straße 29, 1090 Vienna, Austria
6
Chen Q, Song Y, Liu K, Su C, Yu R, Li Y, Yang Y, Zhou B, Wang J, Hu G. Genome-Wide Identification and Functional Characterization of FAR1-RELATED SEQUENCE (FRS) Family Members in Potato (Solanum tuberosum). Plants (Basel, Switzerland) 2023; 12:2575. [PMID: 37447143 DOI: 10.3390/plants12132575] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/24/2023] [Revised: 07/01/2023] [Accepted: 07/05/2023] [Indexed: 07/15/2023]
Abstract
FAR1-RELATED SEQUENCE (FRS) transcription factors are generated by transposases and play vital roles in plant growth and development, light signaling transduction, phytohormone response, and stress resistance. FRSs have been described in various plant species. However, FRS family members and their functions remain poorly understood in vegetative crops such as potato (Solanum tuberosum, St). In the present study, 20 putative StFRS proteins were identified in potato via genome-wide analysis. They were non-randomly localized to eight chromosomes and phylogenetic analysis classified them into six subgroups along with FRS proteins from Arabidopsis and tomato. Conserved protein motif, protein domain, and gene structure analyses supported the evolutionary relationships among the FRS proteins. Analysis of the cis-acting elements in the promoters and the expression profiles of StFRSs in various plant tissues and under different stress treatments revealed the spatiotemporal expression patterns and the potential roles of StFRSs in phytohormonal and stress responses. StFRSs were differentially expressed in the cultivar "Xisen 6", which is exposed to a variety of stresses. Hence, these genes may be critical in regulating abiotic stress. Elucidating the StFRS functions will lay theoretical and empirical foundations for the molecular breeding of potato varieties with high light use efficiency and stress resistance.
Affiliation(s)
- Qingshuai Chen
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Yang Song
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
  - College of Life Science, Dezhou University, Dezhou 253023, China
- Kui Liu
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Chen Su
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
  - College of Life Science, Dezhou University, Dezhou 253023, China
- Ru Yu
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Ying Li
  - College of Life Science, Dezhou University, Dezhou 253023, China
- Yi Yang
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Bailing Zhou
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Jihua Wang
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Guodong Hu
  - Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
7
Pomarici ND, Cacciato R, Kokot J, Fernández-Quintero ML, Liedl KR. Evolution of the Immunoglobulin Isotypes-Variations of Biophysical Properties among Animal Classes. Biomolecules 2023; 13:801. [PMID: 37238671 PMCID: PMC10216798 DOI: 10.3390/biom13050801] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 05/03/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023] Open
Abstract
The adaptive immune system arose around 500 million years ago in jawed fish, and, since then, it has mediated the immune defense against pathogens in all vertebrates. Antibodies play a central role in the immune reaction, recognizing and attacking external invaders. During the evolutionary process, several immunoglobulin isotypes emerged, each having a characteristic structural organization and dedicated function. In this work, we investigate the evolution of the immunoglobulin isotypes, in order to highlight the relevant features that were preserved over time and the parts that, instead, mutated. The residues that are coupled in the evolution process are often involved in intra- or interdomain interactions, meaning that they are fundamental to maintaining the immunoglobulin fold and to ensuring interactions with other domains. The explosive growth of available sequences allows us to point out the evolutionary conserved residues and compare the biophysical properties among different animal classes and isotypes. Our study offers a general overview of the evolution of immunoglobulin isotypes and advances the knowledge of their characteristic biophysical properties, as a first step in guiding protein design from evolution.
Affiliation(s)
- Monica L. Fernández-Quintero
  - Department of General, Inorganic and Theoretical Chemistry, Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Innrain 80-82, A-6020 Innsbruck, Austria
- Klaus R. Liedl
  - Department of General, Inorganic and Theoretical Chemistry, Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Innrain 80-82, A-6020 Innsbruck, Austria
8
Manna M, Rengasamy B, Ambasht NK, Sinha AK. Characterization and expression profiling of PIN auxin efflux transporters reveal their role in developmental and abiotic stress conditions in rice. Frontiers in Plant Science 2022; 13:1059559. [PMID: 36531415 PMCID: PMC9751476 DOI: 10.3389/fpls.2022.1059559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]
Abstract
The auxin efflux transporter proteins called PINs ferry auxin from its source to sinks in particular directions depending on their polar localizations in the plasma membrane, thus facilitating the development of the entire plant architecture. The rice genome has 12 PIN genes distributed over eight chromosomes. To study their roles in plant development, abiotic stress responsiveness, and shaping an auxin-dependent root architecture, a genome-wide analysis was carried out. Based on phylogeny, cellular localization, and hydrophilic loop domain size, the PINs were categorized into canonical and noncanonical PINs. PINs were expressed in all plant organs, which underlines their indispensable role throughout the plant's life cycle. We discovered that PIN5C and PIN9 were upregulated during salt and drought stress. We also found that regardless of its cellular level, auxin functioned as a molecular switch to turn on auxin biosynthesis genes. In contrast, although PIN expression was upregulated upon initial treatment with auxin, prolonged auxin treatment not only led to their downregulation but also led to the development of auxin-dependent altered root formation in rice. Our study paves the way for developing stress-tolerant rice and plants with a desirable root architecture by genetic engineering.
Affiliation(s)
- Mrinalini Manna
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
- Alok Krishna Sinha
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
9
Mendoza-Hoffmann F, Zarco-Zavala M, Ortega R, Celis-Sandoval H, Torres-Larios A, García-Trejo JJ. Evolution of the Inhibitory and Non-Inhibitory ε, ζ, and IF1 Subunits of the F1FO-ATPase as Related to the Endosymbiotic Origin of Mitochondria. Microorganisms 2022; 10:1372. [PMID: 35889091 PMCID: PMC9317440 DOI: 10.3390/microorganisms10071372] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/28/2022] [Revised: 07/03/2022] [Accepted: 07/03/2022] [Indexed: 12/10/2022] Open
Abstract
The F1FO-ATP synthase nanomotor synthesizes >90% of the cellular ATP of almost all living beings by rotating in the “forward” direction, but it can also consume the same ATP pools by rotating in “reverse.” To prevent futile F1FO-ATPase activity, several different inhibitory proteins or domains in bacteria (ε and ζ subunits), mitochondria (IF1), and chloroplasts (ε and γ disulfide) emerged to block the F1FO-ATPase activity selectively. In this study, we analyze how these F1FO-ATPase inhibitory proteins have evolved. The phylogeny of the α-proteobacterial ε showed that it diverged in its C-terminal side, thus losing both the inhibitory function and the ATP-binding/sensor motif that controls this inhibition. The losses of inhibitory function and the ATP-binding site correlate with an evolutionary divergence of non-inhibitory α-proteobacterial ε and mitochondrial δ subunits from inhibitory bacterial and chloroplastidic ε subunits. Here, we confirm the lack of inhibitory function of wild-type and C-terminal truncated ε subunits of P. denitrificans. Taken together, the data show that ζ evolved to replace ε as the primary inhibitor of the F1FO-ATPase of free-living α-proteobacteria. However, the ζ inhibitory function was also partially lost in some symbiotic α-proteobacteria and totally lost in some strictly parasitic α-proteobacteria such as the Rickettsiales order. Finally, we found that ζ and IF1 likely evolved independently via convergent evolution before and after the endosymbiotic origin of mitochondria, respectively. This led us to propose the ε and ζ subunits as tracer genes of the pre-endosymbiont that evolved into the actual mitochondria.
Affiliation(s)
- Francisco Mendoza-Hoffmann
  - Facultad de Ciencias Químicas e Ingeniería, Universidad Autónoma de Baja California (UABC), Campus Tijuana, Tijuana C.P. 22390, Baja California, Mexico
  - Correspondence: (F.M.-H.); (J.J.G.-T.)
- Mariel Zarco-Zavala
  - Departamento de Biología, Facultad de Química, Ciudad Universitaria, Universidad Nacional Autónoma de México (U.N.A.M.), Ciudad de Mexico C.P. 04510, Coyoacan, Mexico
- Raquel Ortega
  - Departamento de Biología, Facultad de Química, Ciudad Universitaria, Universidad Nacional Autónoma de México (U.N.A.M.), Ciudad de Mexico C.P. 04510, Coyoacan, Mexico
- Heliodoro Celis-Sandoval
  - Instituto de Fisiología Celular (IFC), Ciudad Universitaria, Universidad Nacional Autónoma de México (U.N.A.M.), Ciudad de Mexico C.P. 04510, Coyoacan, Mexico
- Alfredo Torres-Larios
  - Instituto de Fisiología Celular (IFC), Ciudad Universitaria, Universidad Nacional Autónoma de México (U.N.A.M.), Ciudad de Mexico C.P. 04510, Coyoacan, Mexico
- José J. García-Trejo
  - Departamento de Biología, Facultad de Química, Ciudad Universitaria, Universidad Nacional Autónoma de México (U.N.A.M.), Ciudad de Mexico C.P. 04510, Coyoacan, Mexico
  - Correspondence: (F.M.-H.); (J.J.G.-T.)
10
Goremykin V. Assessment of Absolute Substitution Model Fit Accommodating Time-Reversible and Non-Time-Reversible Evolutionary Processes. Syst Biol 2022:6632685. [PMID: 35792853 DOI: 10.1093/sysbio/syac046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 06/16/2022] [Accepted: 06/24/2022] [Indexed: 11/13/2022] Open
Abstract
The loss of information accompanying assessment of absolute fit of substitution models to phylogenetic data negatively affects the discriminatory power of previous methods and can make them insensitive to lineage-specific changes in the substitution process. As an alternative, I propose evaluating absolute fit of substitution models based on a novel statistic which describes the observed data without information loss and which is unlikely to become zero-inflated with increasing numbers of taxa. This method can accommodate gaps and is sensitive to lineage-specific shifts in the substitution process. In simulation experiments, it exhibits greater discriminatory power than previous methods. The method can be implemented in both Bayesian and Maximum Likelihood phylogenetic analyses, and used to screen any set of models. Recently, it has been suggested that model selection may be an unnecessary step in phylogenetic inference. However, results presented here emphasize the importance of model fit assessment for reliable phylogenetic inference.
Affiliation(s)
- Vadim Goremykin
  - Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all'Adige (TN), Italy
11
Genome-Wide Identification and Expression Analysis of Homeodomain Leucine Zipper Subfamily IV (HD-ZIP IV) Gene Family in Cannabis sativa L. Plants 2022; 11:1307. [PMID: 35631732 PMCID: PMC9144208 DOI: 10.3390/plants11101307] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/15/2022] [Revised: 05/06/2022] [Accepted: 05/10/2022] [Indexed: 12/19/2022]
Abstract
The plant-specific homeodomain zipper family (HD-ZIP) of transcription factors plays central roles in regulating plant development and environmental resistance. HD-ZIP transcription factors IV (HDZ IV) have been involved primarily in the regulation of epidermal structure development, such as stomata and trichomes. In our study, we identified nine HDZ IV-encoding genes in Cannabis sativa L. by conducting a computational analysis of cannabis genome resources. Our analysis suggests that these genes putatively encode proteins that have all the conserved domains of HDZ IV transcription factors. The phylogenetic analysis of HDZ IV gene family members of cannabis, rice (Oryza sativa), and Arabidopsis further implies that they might have followed distinct evolutionary paths after divergence from a common ancestor. All the identified cannabis HDZ IV gene promoter sequences have multiple regulation motifs, such as light- and hormone-responsive elements. Furthermore, experimental evidence shows that different HDZ IV genes have different expression patterns in root, stem, leaf, and flower tissues. Four genes were primarily expressed in flowers, and the expression of CsHDG5 (XP_030501222.1) was also correlated with flower maturity. Fifty-nine genes were predicted as targets of HDZ IV transcription factors. Some of these genes play central roles in pathogen response, flower development, and brassinosteroid signaling. A subcellular localization assay indicated that one gene of this family is localized in the Arabidopsis protoplast nucleus. Taken together, our work lays fundamental groundwork to illuminate the function of cannabis HDZ IV genes and their possible future uses in increasing cannabis trichome morphogenesis and secondary metabolite production.
12
SufB intein splicing in Mycobacterium tuberculosis is influenced by two remote conserved N-extein histidines. Biosci Rep 2022; 42:230724. [PMID: 35234249 PMCID: PMC8891592 DOI: 10.1042/bsr20212207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Revised: 01/11/2022] [Accepted: 01/27/2022] [Indexed: 11/24/2022] Open
Abstract
Inteins are auto-processing domains that implement a multistep biochemical reaction termed protein splicing, marked by cleavage and formation of peptide bonds. They excise from a precursor protein, generating a functional protein via covalent bonding of flanking exteins. We report the kinetic study of splicing and cleavage reactions in [Fe–S] cluster assembly protein SufB from Mycobacterium tuberculosis (Mtu). Although it follows a canonical intein splicing pathway, distinct features are added by extein residues present in the active site. Sequence analysis identified two conserved histidines in the N-extein region: His-5 and His-38. Kinetic analyses of His-5Ala and His-38Ala SufB mutants exhibited significant reductions in splicing and cleavage rates relative to the SufB wildtype (WT) precursor protein. Structural analysis and molecular dynamics (MD) simulations suggested that Mtu SufB displays a unique mechanism where two remote histidines work concurrently to facilitate N-terminal cleavage reaction. His-38 is stabilized by the solvent-exposed His-5, and can impact N–S acyl shift by direct interaction with the catalytic Cys1. Development of inteins as biotechnological tools or as pathogen-specific novel antimicrobial targets requires a more complete understanding of such unexpected roles of conserved extein residues in protein splicing.
|
13
|
Nasir A, Mughal F, Caetano-Anollés G. The tree of life describes a tripartite cellular world. Bioessays 2021; 43:e2000343. [PMID: 33837594 DOI: 10.1002/bies.202000343] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/25/2020] [Revised: 03/11/2021] [Accepted: 03/15/2021] [Indexed: 12/28/2022]
Abstract
The canonical view of a 3-domain (3D) tree of life was recently challenged by the discovery of Asgardarchaeota encoding eukaryote signature proteins (ESPs), which were treated as missing links of a 2-domain (2D) tree. Here we revisit the debate. We discuss methodological limitations of building trees with alignment-dependent approaches, which often fail to satisfactorily address the problem of "gaps." In addition, most phylogenies are reconstructed unrooted, neglecting the power of direct rooting methods. Alignment-free methodologies lift most difficulties but require employing realistic evolutionary models. We argue that the discoveries of Asgards and ESPs, by themselves, do not rule out the 3D tree, which is strongly supported by comparative and evolutionary genomic analyses and vast genomic and biochemical superkingdom distinctions. Given uncertainties of retrodiction and interpretation difficulties, we conclude that the 3D view has not been falsified but instead has been strengthened by genomic analyses. In turn, the objections to the 2D model have not been lifted. The debate remains open. Also see the video abstract here: https://youtu.be/-6TBN0bubI8.
Affiliation(s)
- Arshan Nasir
- Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, New Mexico, USA
| | - Fizza Mughal
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
| | - Gustavo Caetano-Anollés
- Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
| |
|
14
|
Smirnov V, Warnow T. Phylogeny Estimation Given Sequence Length Heterogeneity. Syst Biol 2020; 70:268-282. [PMID: 32692823 PMCID: PMC7875441 DOI: 10.1093/sysbio/syaa058] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/10/2019] [Revised: 07/14/2020] [Accepted: 07/15/2020] [Indexed: 12/21/2022] Open
Abstract
Phylogeny estimation is a major step in many biological studies, and has many well known challenges. With the dropping cost of sequencing technologies, biologists now have increasingly large datasets available for use in phylogeny estimation. Here we address the challenge of estimating a tree given large datasets with a combination of full-length sequences and fragmentary sequences, which can arise due to a variety of reasons, including sample collection, sequencing technologies, and analytical pipelines. We compare two basic approaches: (1) computing an alignment on the full dataset and then computing a maximum likelihood tree on the alignment, or (2) constructing an alignment and tree on the full length sequences and then using phylogenetic placement to add the remaining sequences (which will generally be fragmentary) into the tree. We explore these two approaches on a range of simulated datasets, each with 1000 sequences and varying in rates of evolution, and two biological datasets. Our study shows some striking performance differences between methods, especially when there is substantial sequence length heterogeneity and high rates of evolution. We find in particular that using UPP to align sequences and RAxML to compute a tree on the alignment provides the best accuracy, substantially outperforming trees computed using phylogenetic placement methods. We also find that FastTree has poor accuracy on alignments containing fragmentary sequences. Overall, our study provides insights into the literature comparing different methods and pipelines for phylogenetic estimation, and suggests directions for future method development. [Phylogeny estimation, sequence length heterogeneity, phylogenetic placement.]
Affiliation(s)
- Vladimir Smirnov
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
| | - Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
| |
|
15
|
Machine learning-based analyses support the existence of species complexes for Strongyloides fuelleborni and Strongyloides stercoralis. Parasitology 2020; 147:1184-1195. [PMID: 32539880 PMCID: PMC7443747 DOI: 10.1017/s0031182020000979] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 11/06/2022]
Abstract
Human strongyloidiasis is a serious disease mostly attributable to Strongyloides stercoralis and to a lesser extent Strongyloides fuelleborni, a parasite mainly of non-human primates. The role of animals as reservoirs of human-infecting Strongyloides is ill-defined, and whether dogs are a source of human infection is debated. Published multi-locus sequence typing (MLST) studies attempt to elucidate relationships between Strongyloides genotypes, hosts, and distributions, but typically examine relatively few worms, making it difficult to identify population-level trends. Combining MLST data from multiple studies is often impractical because they examine different combinations of loci, eliminating phylogeny as a means of examining these data collectively unless hundreds of specimens are excluded. A recently-described machine learning approach that facilitates clustering of MLST data may offer a solution, even for datasets that include specimens sequenced at different combinations of loci. By clustering various MLST datasets as one using this procedure, we sought to uncover associations among genotype, geography, and hosts that remained elusive when examining datasets individually. Multiple datasets comprising hundreds of S. stercoralis and S. fuelleborni individuals were combined and clustered. Our results suggest that the commonly proposed 'two lineage' population structure of S. stercoralis (where lineage A infects humans and dogs, lineage B only dogs) is an over-simplification. Instead, S. stercoralis seemingly represents a species complex, including two distinct populations over-represented in dogs, and other populations vastly more common in humans. A distinction between African and Asian S. fuelleborni is also supported here, emphasizing the need for further resolving these taxonomic relationships through modern investigations.
|
16
|
Wascher M, Kubatko L. Consistency of SVDQuartets and Maximum Likelihood for Coalescent-Based Species Tree Estimation. Syst Biol 2020; 70:33-48. [PMID: 32415974 DOI: 10.1093/sysbio/syaa039] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/21/2018] [Revised: 05/06/2020] [Accepted: 05/07/2020] [Indexed: 11/14/2022] Open
Abstract
Numerous methods for inferring species-level phylogenies under the coalescent model have been proposed within the last 20 years, and debates continue about the relative strengths and weaknesses of these methods. One desirable property of a phylogenetic estimator is that of statistical consistency, which means intuitively that as more data are collected, the probability that the estimated tree has the same topology as the true tree goes to 1. To date, consistency results for species tree inference under the multispecies coalescent (MSC) have been derived only for summary statistics methods, such as ASTRAL and MP-EST. These methods have been found to be consistent given true gene trees but may be inconsistent when gene trees are estimated from data for loci of finite length. Here, we consider the question of statistical consistency for four taxa for SVDQuartets for general data types, as well as for the maximum likelihood (ML) method in the case in which the data are a collection of sites generated under the MSC model such that the sites are conditionally independent given the species tree (we call these data coalescent independent sites [CIS] data). We show that SVDQuartets is statistically consistent for all data types (i.e., for both CIS data and for multilocus data), and we derive its rate of convergence. We additionally show that ML is consistent for CIS data under the JC69 model and discuss why a proof for the more general multilocus case is difficult. Finally, we compare the performance of ML and SVDQuartets using simulation for both data types. [Consistency; gene tree; maximum likelihood; multilocus data; phylogenetic inference; species tree; SVDQuartets.].
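The consistency property described in this abstract can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the paper: \hat{T}_n denotes the topology estimated from n sites (or loci), and T denotes the true species tree topology.

```latex
\lim_{n \to \infty} \Pr\!\left( \hat{T}_n = T \right) = 1
```

An estimator that fails this condition can converge on a wrong topology even with unlimited data, which is why the property is treated as desirable.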
Affiliation(s)
- Matthew Wascher
- Department of Statistics, The Ohio State University, Columbus, OH, USA
| | - Laura Kubatko
- Department of Statistics, The Ohio State University, Columbus, OH, USA.,Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, USA
| |
|
17
|
Suvorov A, Hochuli J, Schrider DR. Accurate Inference of Tree Topologies from Multiple Sequence Alignments Using Deep Learning. Syst Biol 2020; 69:221-233. [PMID: 31504938 PMCID: PMC8204903 DOI: 10.1093/sysbio/syz060] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/07/2019] [Accepted: 08/28/2019] [Indexed: 11/13/2022] Open
Abstract
Reconstructing the phylogenetic relationships between species is one of the most formidable tasks in evolutionary biology. Multiple methods exist to reconstruct phylogenetic trees, each with their own strengths and weaknesses. Both simulation and empirical studies have identified several "zones" of parameter space where accuracy of some methods can plummet, even for four-taxon trees. Further, some methods can have undesirable statistical properties such as statistical inconsistency and/or the tendency to be positively misleading (i.e. assert strong support for the incorrect tree topology). Recently, deep learning techniques have made inroads on a number of both new and longstanding problems in biological research. In this study, we designed a deep convolutional neural network (CNN) to infer quartet topologies from multiple sequence alignments. This CNN can readily be trained to make inferences using both gapped and ungapped data. We show that our approach is highly accurate on simulated data, often outperforming traditional methods, and is remarkably robust to bias-inducing regions of parameter space such as the Felsenstein zone and the Farris zone. We also demonstrate that the confidence scores produced by our CNN can more accurately assess support for the chosen topology than bootstrap and posterior probability scores from traditional methods. Although numerous practical challenges remain, these findings suggest that the deep learning approaches such as ours have the potential to produce more accurate phylogenetic inferences.
Affiliation(s)
- Anton Suvorov
- Department of Genetics, University of North Carolina at Chapel Hill, 120 Mason Farm Road, UNC-Chapel Hill, Chapel Hill, NC 27599-7264, USA
| | - Joshua Hochuli
- Biological and Biomedical Sciences Program, University of North Carolina at Chapel Hill, 130 Mason Farm Road, UNC-Chapel Hill Chapel Hill, NC 27599-7264, USA
| | - Daniel R Schrider
- Biological and Biomedical Sciences Program, University of North Carolina at Chapel Hill, 130 Mason Farm Road, UNC-Chapel Hill Chapel Hill, NC 27599-7264, USA
| |
|
18
|
Perez‐Lamarque B, Morlon H. Characterizing symbiont inheritance during host–microbiota evolution: Application to the great apes gut microbiota. Mol Ecol Resour 2019; 19:1659-1671. [DOI: 10.1111/1755-0998.13063] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/25/2019] [Revised: 06/26/2019] [Accepted: 06/28/2019] [Indexed: 01/19/2023]
Affiliation(s)
- Benoît Perez‐Lamarque
- Institut de Biologie de l'ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM PSL University Paris France
- Muséum national d'Histoire naturelle, UMR 7205 CNRS‐MNHN‐UPMC‐EPHE “Institut de Systématique, Evolution, Biodiversité – ISYEB” Herbier National 16 rue Buffon Paris France
| | - Hélène Morlon
- Institut de Biologie de l'ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM PSL University Paris France
| |
|
19
|
Lemoine F, Domelevo Entfellner JB, Wilkinson E, Correia D, Dávila Felipe M, De Oliveira T, Gascuel O. Renewing Felsenstein's phylogenetic bootstrap in the era of big data. Nature 2018; 556:452-456. [PMID: 29670290 PMCID: PMC6030568 DOI: 10.1038/s41586-018-0043-0] [Citation(s) in RCA: 349] [Impact Index Per Article: 58.2] [Received: 06/23/2017] [Accepted: 03/01/2018] [Indexed: 12/29/2022]
Abstract
Felsenstein's application of the bootstrap method to evolutionary trees is one of the most cited scientific papers of all time. The bootstrap method, which is based on resampling and replications, is used extensively to assess the robustness of phylogenetic inferences. However, increasing numbers of sequences are now available for a wide variety of species, and phylogenies based on hundreds or thousands of taxa are becoming routine. With phylogenies of this size Felsenstein's bootstrap tends to yield very low supports, especially on deep branches. Here we propose a new version of the phylogenetic bootstrap in which the presence of inferred branches in replications is measured using a gradual 'transfer' distance rather than the binary presence or absence index used in Felsenstein's original version. The resulting supports are higher and do not induce falsely supported branches. The application of our method to large mammal, HIV and simulated datasets reveals their phylogenetic signals, whereas Felsenstein's bootstrap fails to do so.
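The contrast this abstract draws between a binary presence/absence index and a gradual transfer distance can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each branch is represented by the set of taxa on one side of its bipartition, and all names (`transfer_distance`, `transfer_support`, `felsenstein_support`) are hypothetical. The normalisation by p - 1, where p is the size of the smaller side of the reference branch, follows the usual description of the transfer bootstrap and should be checked against the paper before any real use.

```python
from typing import FrozenSet, List

Bipartition = FrozenSet[str]  # taxa on one side of a branch (illustrative encoding)


def transfer_distance(b: Bipartition, other: Bipartition, taxa: FrozenSet[str]) -> int:
    """Number of taxa that must change sides for the two bipartitions to match."""
    mismatches = len(b ^ other)  # taxa placed differently under the direct labelling
    # A bipartition has no orientation, so also consider the flipped labelling.
    return min(mismatches, len(taxa) - mismatches)


def felsenstein_support(b: Bipartition, replicates: List[List[Bipartition]],
                        taxa: FrozenSet[str]) -> float:
    """Binary index: fraction of replicate trees containing exactly this branch."""
    complement = taxa - b
    hits = sum(1 for rep in replicates if b in rep or complement in rep)
    return hits / len(replicates)


def transfer_support(b: Bipartition, replicates: List[List[Bipartition]],
                     taxa: FrozenSet[str]) -> float:
    """Gradual index: 1 minus the mean closest-branch distance, scaled by p - 1."""
    p = min(len(b), len(taxa) - len(b))
    if p < 2:
        raise ValueError("support is only defined for internal branches (p >= 2)")
    # For each replicate tree, take the distance to its closest branch.
    # In a complete tree the terminal branches keep this value at or below p - 1;
    # this sketch assumes the caller supplies them if that bound matters.
    indices = [min(transfer_distance(b, other, taxa) for other in rep)
               for rep in replicates]
    return 1.0 - (sum(indices) / len(indices)) / (p - 1)
```

With a reference branch {A, B, C} on six taxa, one replicate that contains the branch exactly and one whose closest branch is {A, B}, the binary index is 0.5 while the transfer-based index is 0.75, so a near-miss replicate still contributes partial support.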
Affiliation(s)
- F Lemoine
- Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
- Hub Bioinformatique et Biostatistique, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
| | - J-B Domelevo Entfellner
- Department of Computer Science, University of the Western Cape, Cape Town, South Africa
- South African MRC Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Cape Town, South Africa
| | - E Wilkinson
- KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
| | - D Correia
- Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
| | - M Dávila Felipe
- Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France
| | - T De Oliveira
- KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
| | - O Gascuel
- Unité Bioinformatique Evolutive, C3BI USR 3756, Institut Pasteur & CNRS, Paris, France.
- Méthodes et Algorithmes pour la Bioinformatique, LIRMM UMR 5506, Université de Montpellier & CNRS, Montpellier, France.
| |
|
20
|
Saarela JM, Burke SV, Wysocki WP, Barrett MD, Clark LG, Craine JM, Peterson PM, Soreng RJ, Vorontsova MS, Duvall MR. A 250 plastome phylogeny of the grass family (Poaceae): topological support under different data partitions. PeerJ 2018; 6:e4299. [PMID: 29416954 PMCID: PMC5798404 DOI: 10.7717/peerj.4299] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Received: 08/08/2017] [Accepted: 01/08/2018] [Indexed: 12/23/2022] Open
Abstract
The systematics of grasses has advanced through applications of plastome phylogenomics, although studies have been largely limited to subfamilies or other subgroups of Poaceae. Here we present a plastome phylogenomic analysis of 250 complete plastomes (179 genera) sampled from 44 of the 52 tribes of Poaceae. Plastome sequences were determined from high throughput sequencing libraries and the assemblies represent over 28.7 Mbases of sequence data. Phylogenetic signal was characterized in 14 partitions, including (1) complete plastomes; (2) protein coding regions; (3) noncoding regions; and (4) three loci commonly used in single and multi-gene studies of grasses. Each of the four main partitions was further refined, alternatively including or excluding positively selected codons and also the gaps introduced by the alignment. All 76 protein coding plastome loci were found to be predominantly under purifying selection, but specific codons were found to be under positive selection in 65 loci. The loci that have been widely used in multi-gene phylogenetic studies had among the highest proportions of positively selected codons, suggesting caution in the interpretation of these earlier results. Plastome phylogenomic analyses confirmed the backbone topology for Poaceae with maximum bootstrap support (BP). Among the 14 analyses, 82 clades out of 309 resolved were maximally supported in all trees. Analyses of newly sequenced plastomes were in agreement with current classifications. Five of seven partitions in which alignment gaps were removed retrieved Panicoideae as sister to the remaining PACMAD subfamilies. Alternative topologies were recovered in trees from partitions that included alignment gaps. This suggests that ambiguities in aligning these uncertain regions might introduce a false signal. 
Resolution of these and other critical branch points in the phylogeny of Poaceae will help to better understand the selective forces that drove the radiation of the BOP and PACMAD clades comprising more than 99.9% of grass diversity.
Affiliation(s)
- Jeffery M. Saarela
- Beaty Centre for Species Discovery and Botany Section, Canadian Museum of Nature, Ottawa, ON, Canada
| | - Sean V. Burke
- Plant Molecular and Bioinformatics Center, Biological Sciences, Northern Illinois University, DeKalb, IL, USA
| | - William P. Wysocki
- Center for Data Intensive Sciences, University of Chicago, Chicago, IL, USA
| | - Matthew D. Barrett
- Botanic Gardens and Parks Authority, Kings Park and Botanic Garden, West Perth, WA, Australia
- School of Biological Sciences, The University of Western Australia, Crawley, WA, Australia
| | - Lynn G. Clark
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, USA
| | | | - Paul M. Peterson
- Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - Robert J. Soreng
- Department of Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - Maria S. Vorontsova
- Comparative Plant & Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, UK
| | - Melvin R. Duvall
- Plant Molecular and Bioinformatics Center, Biological Sciences, Northern Illinois University, DeKalb, IL, USA
| |
|
21
|
Zhai Y, Alexandre BC. A Poissonian Model of Indel Rate Variation for Phylogenetic Tree Inference. Syst Biol 2018; 66:698-714. [PMID: 28204784 DOI: 10.1093/sysbio/syx033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/04/2015] [Accepted: 01/27/2017] [Indexed: 01/22/2023] Open
Abstract
While indel rate variation has been observed and analyzed in detail, it is not taken into account by current indel-aware phylogenetic reconstruction methods. In this work, we introduce a continuous time stochastic process, the geometric Poisson indel process, that generalizes the Poisson indel process by allowing insertion and deletion rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling than methods that either ignore indels or indel rate variation. [evolutionary stochastic process; indel rate variation; Poisson indel process; TKF91.].
Affiliation(s)
- Yongliang Zhai
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
| | - Alexandre Bouchard-Côté
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
| |
|
22
|
Goloboff PA, Torres A, Arias JS. Weighted parsimony outperforms other methods of phylogenetic inference under models appropriate for morphology. Cladistics 2017; 34:407-437. [DOI: 10.1111/cla.12205] [Citation(s) in RCA: 205] [Impact Index Per Article: 29.3] [Accepted: 03/22/2017] [Indexed: 11/28/2022] Open
Affiliation(s)
- Pablo A. Goloboff
- Unidad Ejecutora Lillo; Fundación Miguel Lillo; CONICET; Miguel Lillo 251 4000 San Miguel de Tucumán Argentina
| | - Ambrosio Torres
- Unidad Ejecutora Lillo; Fundación Miguel Lillo; CONICET; Miguel Lillo 251 4000 San Miguel de Tucumán Argentina
| | - J. Salvador Arias
- Unidad Ejecutora Lillo; Fundación Miguel Lillo; CONICET; Miguel Lillo 251 4000 San Miguel de Tucumán Argentina
| |
|