1
De Laet J, Goloboff PA. Nothing to it: a reply to Wheeler's "much ado about nothing". Cladistics 2024; 40:456-467. [PMID: 38345481 DOI: 10.1111/cla.12571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 12/31/2023] [Accepted: 01/09/2024] [Indexed: 07/15/2024]
Abstract
Wheeler (Cladistics 2023, 39, 475) recently suggested that the issues with inapplicable characters in phylogenetic analysis can be dealt with directly by treating observed absences of a feature not in a separate absence/presence character but as insertion/deletion events in a complex character that describes the feature in all its variation; and that this dynamic homology view can be achieved by imposing a sequence or linear order on a set of characters and by analysing the resulting sequence character using custom alphabet tree alignment algorithms. As Wheeler observed, this approach can lead to considering inappropriate character states (such as a head state and a foot state) homologous. We show that it is also sensitive to the specific ordering assumption used and that such different character orders can lead to a preference for different trees. We present a simple four-taxon dataset with observations of absence, but no inapplicable characters or other kinds of character dependence, for which the dynamic homology framework gives different results to classic algorithms for independent characters, including an optimal tree with biologically impossible reconstructions at inner nodes (every terminal has a head but the inner nodes are headless). We show how these issues can be solved by removing the character ordering assumption that the approach requires. Doing so, the dynamic homology framework reduces in general to Maddison's (Syst. Biol. 1993, 42, 576) well-known proposal to deal with inapplicability using step matrix analysis of complex characters. If in addition costs are interpreted in terms of homology, it reduces to Goloboff et al.'s (Cladistics 2021, 37, 596) step matrix implementation for maximization of homology as applied to inapplicable characters. 
However, if used with homogeneous costs, as Wheeler suggested, it reduces to unordered analysis of such complex characters, which is known to treat tails that may share many observed features as irrelevant for establishing kinship when they differ in just one feature, e.g. colour.
Affiliation(s)
- Jan De Laet
- Meise Botanic Garden, Nieuwelaan 38, Meise, Belgium
- Pablo A Goloboff
- Unidad Ejecutora Lillo, UEL (CONICET - Fundación Miguel Lillo), Miguel Lillo 251, 4000, San Miguel de Tucumán, Argentina
2
Goloboff PA, De Laet J. Farewell to the requirement for character independence: phylogenetic methods to incorporate different types of dependence between characters. Cladistics 2024; 40:209-241. [PMID: 38014464 DOI: 10.1111/cla.12564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 10/15/2023] [Accepted: 10/18/2023] [Indexed: 11/29/2023]
Abstract
This paper discusses methods to take into account interactions between characters, in the context of parsimony analysis. These interactions can be in the form of some characters becoming inapplicable given certain states of other, primary characters; in the form of only certain states being allowed in some characters when a given state or set of states occurs for other characters; or in the form of transformation costs in some character being higher or lower when other characters have certain states or transformations between states. Character-state reconstructions and evaluation of trees under the assumption of independence may easily lead to ancestral assignments that violate elementary rules of biomechanics, well-established theories relating form and function or ideas about character co-variation. An obvious example is reconstructing an ancestral bird as wingless and flying at the same time; another is reconstructing a protein-coding gene as having a stop codon in some ancestors. If the characters are optimized independently, such chimeric ancestral reconstructions can occur even when no terminal displays the impossible combination of states. A set of conventions (implemented via new TNT commands and options) allows the definition of complex rules of interaction. By recoding groups of characters with proper step-matrix costs (and excluding impossible combinations from the set of permissible states), it is possible to find the ancestral reconstructions that maximize homology (and thus the degree to which similarities can be explained by common ancestry), within the constraints imposed by the rules specified by the user. We expect that considerations of biomechanics, functional morphology and natural history will be a source of many theories on possible character dependences, and that the present implementation will encourage users to take the possibility of character dependences into account in their phylogenetic analyses.
Affiliation(s)
- Pablo A Goloboff
- Unidad Ejecutora Lillo, UEL (CONICET-Fundación Miguel Lillo), Miguel Lillo 251, 4000, S.M. de Tucumán, Argentina
- Jan De Laet
- Meise Botanic Garden, Nieuwelaan 38, Meise, Belgium
3
Grams M, Richter S. On the four complementary aspects of hierarchical character relationships and their bearing on scoring constraints, expressed in a new syntax for character dependencies. Cladistics 2023; 39:437-455. [PMID: 37428134 DOI: 10.1111/cla.12550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2022] [Revised: 06/02/2023] [Accepted: 06/03/2023] [Indexed: 07/11/2023]
Abstract
Morphological matrices, including the conceptualization of characters and character states and scoring thereof, still are a valuable and necessary tool for phylogenetic analyses. Although they are often seen only as numerically simplified summaries of observations for the purpose of cladistic analyses, they also hold value as collections of ideas, concepts and the current state of knowledge, conveying various hypotheses on character state identity, homology and evolutionary transformations. A common and persistent issue in scoring and analysing morphological matrices is the phenomenon of inapplicable characters ("inapplicables"). Inapplicables result from the ontological dependency (based on hierarchical relationships) between characters. Traditionally handled the same as "missing data", inapplicables were shown to be problematic in holding the potential to result in unreasonable algorithmic preference for certain cladograms over others. Recently, though, this problem has been solved by approaching parsimony as a maximization of homology rather than a minimization of transformational steps. We herein aim to further improve our theoretical understanding of the underlying hierarchical nature of morphological characters, which causes the phenomenon of ontological dependencies and, thereby, inapplicables. As a result, we present a discussion of various character-dependency scenarios and a new concept of hierarchical character relationships as being composed of four complementary sub-aspects. Building on this, a new syntax for the designation of character dependencies as part of the character statement is proposed, to help identify and apply scoring constraints for manual and automated scoring of morphological character matrices and their cladistic analysis.
Affiliation(s)
- Markus Grams
- Universität Rostock Institut für Biowissenschaften, Allgemeine & Spezielle Zoologie, Rostock, Germany
- Stefan Richter
- Universität Rostock Institut für Biowissenschaften, Allgemeine & Spezielle Zoologie, Rostock, Germany
4
Pohle A, Kröger B, Warnock RCM, King AH, Evans DH, Aubrechtová M, Cichowolski M, Fang X, Klug C. Early cephalopod evolution clarified through Bayesian phylogenetic inference. BMC Biol 2022; 20:88. [PMID: 35421982 PMCID: PMC9008929 DOI: 10.1186/s12915-022-01284-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/07/2021] [Accepted: 03/22/2022] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Despite the excellent fossil record of cephalopods, their early evolution is poorly understood. Different, partly incompatible phylogenetic hypotheses have been proposed in the past, which reflected individual authors' opinions on the importance of certain characters but were not based on thorough cladistic analyses. At the same time, methods of phylogenetic inference have undergone substantial improvements. For fossil datasets, which typically only include morphological data, Bayesian inference and in particular the introduction of the fossilized birth-death model have opened new possibilities. Nevertheless, many tree topologies recovered from these new methods reflect large uncertainties, which have led to discussions on how to best summarize the information contained in the posterior set of trees.
RESULTS: We present a large, newly compiled morphological character matrix of Cambrian and Ordovician cephalopods to conduct a comprehensive phylogenetic analysis and resolve existing controversies. Our results recover three major monophyletic groups, which correspond to the previously recognized Endoceratoidea, Multiceratoidea, and Orthoceratoidea, though comprising slightly different taxa. In addition, many Cambrian and Early Ordovician representatives of the Ellesmerocerida and Plectronocerida were recovered near the root. The Ellesmerocerida is para- and polyphyletic, with some of its members recovered among the Multiceratoidea and early Endoceratoidea. These relationships are robust against modifications of the dataset. While our trees initially seem to reflect large uncertainties, these are mainly a consequence of the way clade support is measured. We show that clade posterior probabilities and tree similarity metrics often underestimate congruence between trees, especially if wildcard taxa are involved.
CONCLUSIONS: Our results provide important insights into the earliest evolution of cephalopods and clarify evolutionary pathways. We provide a classification scheme that is based on a robust phylogenetic analysis. Moreover, we provide some general insights on the application of Bayesian phylogenetic inference on morphological datasets. We support earlier findings that quartet similarity metrics should be preferred over the Robinson-Foulds distance when higher-level phylogenetic relationships are of interest and propose that using a posteriori pruned maximum clade credibility trees helps in assessing support for phylogenetic relationships among a set of relevant taxa, because they provide clade support values that better reflect the phylogenetic signal.
Affiliation(s)
- Alexander Pohle
- Paläontologisches Institut und Museum, Universität Zürich, Karl-Schmid-Strasse 4, CH-8006, Zürich, Switzerland
- Björn Kröger
- Finnish Museum of Natural History, University of Helsinki, P.O. Box 44, Jyrängöntie 2, FI-00014, Helsinki, Finland
- Rachel C M Warnock
- GeoZentrum Nordbayern, Friedrich-Alexander Universität Erlangen-Nürnberg, Loewenichstrasse 28, 91054, Erlangen, Germany
- Andy H King
- Geckoella Ltd, Suite 323, 7 Bridge Street, Taunton, TA1 1TG, UK
- David H Evans
- Natural England, Rivers House, East Quay, Bridgwater, TA6 4YS, UK
- Martina Aubrechtová
- Institute of Geology and Palaeontology, Faculty of Science, Charles University, Albertov 6, 12843, Prague, Czech Republic
- Institute of Geology, Czech Academy of Sciences, Rozvojová 269, 16500, Prague, Czech Republic
- Marcela Cichowolski
- Instituto de Estudios Andinos "Don Pablo Groeber", CONICET and Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pab. 2, C1428EGA, Buenos Aires, Argentina
- Xiang Fang
- State Key Laboratory of Palaeobiology and Stratigraphy, Nanjing Institute of Geology and Palaeontology and Center for Excellence in Life and Paleoenvironment, Chinese Academy of Sciences, 39 East Beijing Road, Nanjing, 210008, China
- Christian Klug
- Paläontologisches Institut und Museum, Universität Zürich, Karl-Schmid-Strasse 4, CH-8006, Zürich, Switzerland
5
Lehtonen S. Phenotypic characters of static homology increase phylogenetic stability under direct optimization of otherwise dynamic homology characters. Cladistics 2021; 36:617-626. [PMID: 34618977 DOI: 10.1111/cla.12438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/20/2020] [Indexed: 11/29/2022]
Abstract
Direct optimization of unaligned sequence characters provides a natural framework to explore the sensitivity of phylogenetic hypotheses to variation in analytical parameters. Phenotypic data, when combined into such analyses, are typically analyzed with static homology correspondences unlike the dynamic homology sequence data. Static homology characters may be expected to constrain the direct optimization and thus, potentially increase the similarity of phylogenetic hypotheses under different cost sets. However, whether a total-evidence approach increases the phylogenetic stability or not remains empirically largely unexplored. Here, I studied the impact of static homology data on sensitivity using six empirical data sets composed of several molecular markers and phenotypic data. The inclusion of static homology phenotypic data increased the average stability of phylogenetic hypotheses in five out of the six data sets. To investigate if any static homology characters would have a similar effect, the analyses were repeated with randomized phenotypic data, and with one of the molecular markers fixed as static homology characters. These analyses had, on average, almost no effect on the phylogenetic stability, although the randomized phenotypic data sometimes resulted in even higher stability than empirical phenotypic data. The impact was related to the strength of the phylogenetic signal in the phenotypic data: higher average jackknife support of the phenotypic tree correlated with a stronger stabilizing effect in the total-evidence analysis. Phenotypic data with a strong signal made the total-evidence trees topologically more similar to the phenotypic trees, thus, they constrained the dynamic homology correspondences of the sequence data. Characters that increase phylogenetic stability are particularly valuable for phylogenetic inference. These results indicate an important role and additive value of phenotypic data in increasing the stability of phylogenetic hypotheses in total-evidence analyses.
Affiliation(s)
- Samuli Lehtonen
- Biodiversity Unit, University of Turku, Turku, FI-20014, Finland
6
Goloboff PA, De Laet J, Ríos-Tamayo D, Szumik CA. A reconsideration of inapplicable characters, and an approximation with step-matrix recoding. Cladistics 2021; 37:596-629. [PMID: 34570932 DOI: 10.1111/cla.12456] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Accepted: 03/01/2021] [Indexed: 11/28/2022]
Abstract
Evidence for phylogenetic analysis comes in the form of observed similarities, and trees are selected to minimize the number of similarities that cannot be accounted for by homology (homoplasies). Thus, the classical argument for parsimony directly links homoplasy with explanatory power. When characters are hierarchically related, a first character may represent a primary structure such as tail absence/presence and a secondary (subordinate) character may describe tail colour; this makes tail colour inapplicable when tail is absent. It has been proposed that such character hierarchies should be evaluated on the same logical basis as standard characters, maximizing the number of similarities accounted for by secondary homology, i.e. common ancestry. Previous evaluations of the homology of a given ancestral reconstruction contain the unintuitive quantity "subcharacters" (number of regions of applicability). Rather than counting subcharacters, this paper proposes an equivalent but more intuitive formulation, based on counting the number of changes into each separate state. In this formulation, x-transformations, the homoplasy for the reconstruction is simply the number of changes into the state beyond the first, summed over all states. There is thus no direct connection between homoplasy and number of steps, only between homoplasy and extra steps. The link between the two formulations is that, for any region of applicability of any character, a subcharacter can be interpreted as the change into the state that is plesiomorphic in that region. Although some authors have claimed that the equivalence between maximizing explanatory power and minimizing independent originations of similar features (i.e. the standard justification of parsimony) does not hold for inapplicable characters, evaluating homoplasy with x-transformations clearly connects the two sides of that equation. 
Furthermore, as the evaluation with x-transformations provides a direct count and a straightforward interpretation of homoplasy, it extends naturally into implied weighting, and sheds light on problems with additive, step-matrix or continuous characters. It also allows deriving transformation costs for recoding hierarchies as step-matrix characters (where recoded states correspond to permissible combinations of states in primary and secondary characters), so that homology of the original observations is properly measured. Those transformation costs set the cost of gaining the primary structure to the maximum difference between "present" states plus cost of loss, and difference between "present" states to the sum of user-defined transformation costs between secondary features. With such recoding, invoking multiple independent derivations of the structure and similar features will cost as many extra "steps" as the instances of similarities (in both original characters) that are not being homologized. The step-matrix recoding also can take into account nested dependences. We present a simple convention for naming characters, which TNT can use to automatically convert the original data into a step-matrix form and set the proper transformation costs. Finally, the basic elements for handling inapplicable characters in the context of maximum-likelihood inference are outlined, and some quantitative comparisons between different approaches to inapplicables are provided.
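The x-transformation count described in this abstract (changes into each state beyond the first, summed over all states) can be illustrated with a minimal Python sketch. It covers a single ordinary character on a fully reconstructed tree and does not handle inapplicable hierarchies or step-matrix costs. The function name, the child-to-parent dictionary layout and the example tree are hypothetical, and counting the root state as the first origination of that state is an assumption made here so that the count agrees with extra steps for standard characters.

```python
from collections import Counter


def x_transformation_homoplasy(parent, state, root):
    """Count homoplasy as changes into each state beyond the first.

    parent: dict mapping each non-root node to its parent node
    state:  dict mapping every node to its reconstructed state
    root:   name of the root node
    """
    originations = Counter()
    # Assumption: the root state counts as the first origination of that state.
    originations[state[root]] += 1
    for child, par in parent.items():
        if state[child] != state[par]:
            originations[state[child]] += 1
    # Every origination of a state after its first one is homoplasy.
    return sum(n - 1 for n in originations.values())


# Hypothetical reconstruction: root(0) -> n1(1) -> A(1), B(0); root -> n2(0) -> C(1), D(0)
parent = {"n1": "root", "A": "n1", "B": "n1", "n2": "root", "C": "n2", "D": "n2"}
state = {"root": 0, "n1": 1, "A": 1, "B": 0, "n2": 0, "C": 1, "D": 0}
homoplasy = x_transformation_homoplasy(parent, state, "root")
```

In this example the reconstruction has three changes for a binary character whose minimum is one step, so the two extra steps match the two x-transformations beyond the first (one into state 0, one into state 1).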
Affiliation(s)
- Pablo A Goloboff
- Unidad Ejecutora Lillo, Consejo Nacional de Investigaciones Científicas y Técnicas, Fundación Miguel Lillo, Miguel Lillo 251, San Miguel de Tucumán, 4000, Argentina; American Museum of Natural History, New York, NY, USA
- Jan De Laet
- Göteborgs Botaniska Trädgård, Göteborg, Sweden
- Duniesky Ríos-Tamayo
- Unidad Ejecutora Lillo, Consejo Nacional de Investigaciones Científicas y Técnicas, Fundación Miguel Lillo, Miguel Lillo 251, San Miguel de Tucumán, 4000, Argentina
- Claudia A Szumik
- Unidad Ejecutora Lillo, Consejo Nacional de Investigaciones Científicas y Técnicas, Fundación Miguel Lillo, Miguel Lillo 251, San Miguel de Tucumán, 4000, Argentina
7
Hopkins MJ, St John K. Incorporating Hierarchical Characters into Phylogenetic Analysis. Syst Biol 2021; 70:1163-1180. [PMID: 33560427 DOI: 10.1093/sysbio/syab005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/25/2020] [Revised: 12/28/2020] [Accepted: 02/01/2021] [Indexed: 11/13/2022]
Abstract
Popular optimality criteria for phylogenetic trees focus on sequences of characters that are applicable to all the taxa. As studies grow in breadth, it can be the case that some characters are applicable for a portion of the taxa and inapplicable for others. Past work has explored the limitations of treating inapplicable characters as missing data, noting that this strategy may favor trees where internal nodes are assigned impossible states, where the arrangement of taxa within subclades is unduly influenced by variation in distant parts of the tree, and/or where taxa that otherwise share most primary characters are grouped distantly. Approaches that avoid the first two problems have recently been proposed. Here, we propose an alternative approach which avoids all three problems. We focus on data matrices that use reductive coding of traits, that is, explicitly incorporate the innate hierarchy induced by inapplicability, and as such our approach extends to hierarchical characters, in general. In the spirit of maximum parsimony, the proposed criterion seeks the phylogenetic tree with the minimal changes across any tree branch, but where changes are defined in terms of dissimilarity metrics that weigh the effects of inapplicable characters. The approach can accommodate binary, multistate, ordered, unordered, and polymorphic characters. We give a polynomial-time algorithm, inspired by Fitch's algorithm, to score trees under a family of dissimilarity metrics, and prove its correctness. We show that the resulting optimality criterion is computationally hard, by reduction to the NP-hardness of the maximum parsimony optimality criterion. We demonstrate our approach using synthetic and empirical data sets and compare the results with other recently proposed methods for choosing optimal phylogenetic trees when the data includes hierarchical characters.
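For orientation, the classic Fitch down-pass that this abstract names as its inspiration can be sketched in a few lines of Python. This is only the standard algorithm for one unordered character on a rooted binary tree, not the hierarchical-character method of Hopkins and St John. The nested-tuple tree encoding, the function name and the four-taxon data are hypothetical choices made for illustration.

```python
def fitch_score(tree, tip_states):
    """Minimum number of changes for one unordered character (Fitch down-pass).

    tree:       nested 2-tuples of tip names, e.g. (("A", "B"), ("C", "D"))
    tip_states: dict mapping each tip name to a set of observed states
    """
    def down(node):
        if isinstance(node, str):  # a tip: its state set, no changes yet
            return set(tip_states[node]), 0
        left, right = node
        left_set, left_cost = down(left)
        right_set, right_cost = down(right)
        common = left_set & right_set
        if common:  # children agree on at least one state: no extra change
            return common, left_cost + right_cost
        # children disagree: take the union and count one change
        return left_set | right_set, left_cost + right_cost + 1

    return down(tree)[1]


# Hypothetical four-taxon character
tips = {"A": {0}, "B": {0}, "C": {1}, "D": {1}}
score_congruent = fitch_score((("A", "B"), ("C", "D")), tips)
score_conflicting = fitch_score((("A", "C"), ("B", "D")), tips)
```

The first tree groups taxa that share a state and needs one change; the second separates them and needs two, so parsimony prefers the first.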
Affiliation(s)
- Melanie J Hopkins
- Division of Paleontology (Invertebrates), American Museum of Natural History, New York, NY, USA
- Katherine St John
- Department of Computer Science, Hunter College, City University of New York, New York, NY, USA; Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, USA
8
Brazeau MD, Guillerme T, Smith MR. An algorithm for Morphological Phylogenetic Analysis with Inapplicable Data. Syst Biol 2019; 68:619-631. [PMID: 30535172 PMCID: PMC6568014 DOI: 10.1093/sysbio/syy083] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 10/25/2017] [Revised: 11/27/2018] [Accepted: 11/03/2018] [Indexed: 11/13/2022]
Abstract
Morphological data play a key role in the inference of biological relationships and evolutionary history and are essential for the interpretation of the fossil record. The hierarchical interdependence of many morphological characters, however, complicates phylogenetic analysis. In particular, many characters only apply to a subset of terminal taxa. The widely used "reductive coding" approach treats taxa in which a character is inapplicable as though the character's state is simply missing (unknown). This approach has long been known to create spurious tree length estimates on certain topologies, potentially leading to erroneous results in phylogenetic searches, but practical solutions have yet to be proposed and implemented. Here, we present a single-character algorithm for reconstructing ancestral states in reductively coded data sets, following the theoretical guideline of minimizing homoplasy over all characters. Our algorithm uses up to three traversals to score a tree, and a fourth to fully resolve final states at each node within the tree. We use explicit criteria to resolve ambiguity in applicable/inapplicable dichotomies, and to optimize missing data. So that it can be applied to single characters, the algorithm employs local optimization; as such, the method provides a fast but approximate inference of ancestral states and tree score. The application of our method to published morphological data sets indicates that, compared to traditional methods, it identifies different trees as "optimal." As such, the use of our algorithm to handle inapplicable data may significantly alter the outcome of tree searches, modifying the inferred placement of living and fossil taxa and potentially leading to major differences in reconstructions of evolutionary history.
Affiliation(s)
- Martin D Brazeau
- Department of Life Sciences, Imperial College London, Silwood Park Campus, Buckhurst Road, Ascot SL5 7PY, UK
- Department of Earth Sciences, Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Thomas Guillerme
- Department of Life Sciences, Imperial College London, Silwood Park Campus, Buckhurst Road, Ascot SL5 7PY, UK
- School of Biological Sciences, The University of Queensland, St. Lucia 4067, Queensland, Australia
- Martin R Smith
- Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EQ, UK
- Department of Earth Sciences, Mountjoy Site, Durham University, South Road, Durham DH1 3LE, UK
9
Ospina-Sarria JJ, Cabra-García J. Parsimony analysis of unaligned sequence data: some clarifications. Cladistics 2018; 34:574-577. [PMID: 34706480 DOI: 10.1111/cla.12229] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 10/10/2017] [Indexed: 11/29/2022]
Abstract
De Laet (2015) claimed that minimization of ad hoc hypotheses of homoplasy does not lead to a preference for trivial optimizations when analysing unaligned sequence data, as claimed by Wheeler (2012; see also Kluge and Grant, 2006). In addition, De Laet asserted that Kluge and Grant's (2006) parsimony rationale is internally inconsistent in terms of Baker's (2003) theoretical framework. We argue that De Laet used extraneous presuppositions to critique Wheeler's position and, as such, his criticism should be considered cautiously in terms of its scope. Finally, we demonstrate that considering Kluge and Grant's parsimony rationale as inconsistent rests on De Laet's misunderstanding of the ideographic character concept and the consequences of relating it to Baker's rationale.
Affiliation(s)
- Jhon Jairo Ospina-Sarria
- Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP 05508-090, Brazil
- Jimmy Cabra-García
- Departamento de Zoologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP 05508-090, Brazil; Departamento de Biología, Universidad del Valle, Cali, AA 25360, Colombia
10
Zeeshan N, Naz S, Naz S, Afroz A, Zahur M, Zia S. Heterologous expression and enhanced production of β-1,4-glucanase of Bacillus halodurans C-125 in Escherichia coli. ELECTRON J BIOTECHN 2018. [DOI: 10.1016/j.ejbt.2018.05.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/16/2022]