1
|
Forest-Based Networks. Bull Math Biol 2022; 84:119. [PMID: 36107279 PMCID: PMC9477957 DOI: 10.1007/s11538-022-01081-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Accepted: 09/05/2022] [Indexed: 11/04/2022]
Abstract
In evolutionary studies, it is common to use phylogenetic trees to represent the evolutionary history of a set of species. However, in case the transfer of genes or other genetic information between the species or their ancestors has occurred in the past, a tree may not provide a complete picture of their history. In such cases, tree-based phylogenetic networks can provide a useful, more refined representation of the species’ evolution. Such a network is essentially a phylogenetic tree with some arcs added between the tree’s edges so as to represent reticulate events such as gene transfer, hybridization and recombination. Even so, this model does not permit the direct representation of evolutionary scenarios where reticulate events have taken place between different subfamilies or lineages of species. To represent such scenarios, in this paper we introduce the notion of a forest-based network, that is, a collection of leaf-disjoint phylogenetic trees on a set of species with arcs added between the edges of distinct trees within the collection. Forest-based networks include the recently introduced class of overlaid species forests which can be used to model introgression. As we shall see, even though the definition of forest-based networks is closely related to that of tree-based networks, they lead to new mathematical theory which complements that of tree-based networks. As well as studying the relationship of forest-based networks with other classes of phylogenetic networks, such as tree-child networks and universal tree-based networks, we present some characterizations of some special classes of forest-based networks. We expect that our results will be useful for developing new models and algorithms to understand reticulate evolution, such as introgression and gene transfer between species.
Collapse
|
2
|
Davidov N, Hernandez A, Jian J, McKenna P, Medlin KA, Mojumder R, Owen M, Quijano A, Rodriguez A, St John K, Thai K, Uraga M. Maximum Covering Subtrees for Phylogenetic Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:2823-2827. [PMID: 33242309 DOI: 10.1109/tcbb.2020.3040910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Tree-based phylogenetic networks, which may be roughly defined as leaf-labeled networks built by adding arcs only between the original tree edges, have elegant properties for modeling evolutionary histories. We answer an open question of Francis, Semple, and Steel about the complexity of determining how far a phylogenetic network is from being tree-based, including non-binary phylogenetic networks. We show that finding a phylogenetic tree covering the maximum number of nodes in a phylogenetic network can be computed in polynomial time via an encoding into a minimum-cost flow problem.
Collapse
|
3
|
Fischer M, Francis A. The Space of Tree-Based Phylogenetic Networks. Bull Math Biol 2020; 82:70. [PMID: 32500263 DOI: 10.1007/s11538-020-00744-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2019] [Accepted: 05/05/2020] [Indexed: 10/24/2022]
Abstract
Phylogenetic networks are generalizations of phylogenetic trees that allow the representation of reticulation events such as horizontal gene transfer or hybridization, and can also represent uncertainty in inference. A subclass of these, tree-based phylogenetic networks, have been introduced to capture the extent to which reticulate evolution nevertheless broadly follows tree-like patterns. Several important operations that change a general phylogenetic network have been developed in recent years and are important for allowing algorithms to move around spaces of networks; a vital ingredient in finding an optimal network given some biological data. A key such operation is the nearest neighbour interchange, or NNI. While it is already known that the space of unrooted phylogenetic networks is connected under NNI, it has been unclear whether this also holds for the subspace of tree-based networks. In this paper, we show that the space of unrooted tree-based phylogenetic networks is indeed connected under the NNI operation. We do so by explicitly showing how to get from one such network to another one without losing tree-basedness along the way. Moreover, we introduce some new concepts, for instance "shoat networks", and derive some interesting aspects concerning tree-basedness. Last, we use our results to derive an upper bound on the size of the space of tree-based networks.
Collapse
Affiliation(s)
- Mareike Fischer
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Andrew Francis
- Centre for Research in Mathematics and Data Science, Western Sydney University, Sydney, Australia.
| |
Collapse
|
4
|
Abstract
Recently, so-called tree-based phylogenetic networks have attracted considerable attention. These networks can be constructed from a phylogenetic tree, called the base tree, by adding additional edges. The primary aim of this study is to provide sufficient criteria for tree-basedness by reducing phylogenetic networks to related graph structures. Even though it is generally known that determining whether a network is tree-based is an NP-complete problem, one of these criteria, namely edge-basedness, can be verified in linear time. Surprisingly, the class of edge-based networks is closely related to a well-known family of graphs, namely, the class of generalized series-parallel graphs, and we explore this relationship in full detail. Additionally, we introduce further classes of tree-based networks and analyze their relationships.
Collapse
|
5
|
Pons JC, Scornavacca C, Cardona G. Generation of Level- k LGT Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:158-164. [PMID: 30703035 DOI: 10.1109/tcbb.2019.2895344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Phylogenetic networks provide a mathematical model to represent the evolution of a set of species where, apart from speciation, reticulate evolutionary events have to be taken into account. Among these events, lateral gene transfers need special consideration due to the asymmetry in the roles of the species involved in such an event. To take into account this asymmetry, LGT networks were introduced. Contrarily to the case of phylogenetic trees, the combinatorial structure of phylogenetic networks is much less known and difficult to describe. One of the approaches in the literature is to classify them according to their level and find generators of the given level that can be used to recursively generate all networks. In this paper, we adapt the concept of generators to the case of LGT networks. We show how these generators, classified by their level, give rise to simple LGT networks of the specified level, and how any LGT network can be obtained from these simple networks, that act as building blocks of the generic structure. The stochastic models of evolution of phylogenetic networks are also much less studied than those for phylogenetic trees. In this setting, we introduce a novel two-parameter model that generates LGT networks. Finally, we present some computer simulations using this model in order to investigate the complexity of the generated networks, depending on the parameters of the model.
Collapse
|
6
|
Hong Y, Wang J. Frin: An Efficient Method for Representing Genome Evolutionary History. Front Genet 2019; 10:1261. [PMID: 31867045 PMCID: PMC6909884 DOI: 10.3389/fgene.2019.01261] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2019] [Accepted: 11/14/2019] [Indexed: 11/13/2022] Open
Abstract
Phylogenetic analysis is important in understanding the process of biological evolution, and phylogenetic trees are used to represent the evolutionary history. Each taxon in a phylogenetic tree has not more than one parent, so phylogenetic trees cannot express the complex evolutionary information implicit in phylogeny. Phylogenetic networks can be used to express genome evolutionary histories. Therefore, it is great significance to research the construction of phylogenetic networks. Cass algorithm is an efficient method for constructing phylogenetic networks because it can construct a much simpler network. However, Cass relies heavily on the order of input data, i.e. different networks can be constructed for the same dataset with different input orders. Based on the frequency and incompatibility degree of taxa, we propose an efficiently improved algorithm of Cass, called as Frin. The experimental results show that the networks constructed by Frin are not only simpler than those constructed by other methods, but Frin can also construct more consistent phylogenetic networks when the treated data have different input orders. Furthermore, the phylogenetic network constructed by Frin is closer to the original information described by phylogenetic trees. Frin has been built as a Java software package and is freely available at https://github.com/wangjuanimu/Frin.
Collapse
Affiliation(s)
- Yan Hong
- School of Computer Science, Inner Mongolia University, Hohhot, China
| | - Juan Wang
- School of Computer Science, Inner Mongolia University, Hohhot, China
| |
Collapse
|
7
|
Hendriksen M, Francis A. Tree-metrizable HGT networks. Math Biosci 2019; 318:108283. [PMID: 31711966 DOI: 10.1016/j.mbs.2019.108283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2019] [Revised: 11/03/2019] [Accepted: 11/03/2019] [Indexed: 11/17/2022]
Abstract
Phylogenetic trees are often constructed by using a metric on the set of taxa that label the leaves of the tree. While there are a number of methods for constructing a tree using a given metric, such trees will only display the metric if it satisfies the so-called "four point condition", established by Buneman in 1971. While this condition guarantees that a unique tree will display the metric, meaning that the distance between any two leaves can be found by adding the distances on arcs in the path between the leaves, it doesn't exclude the possibility that a phylogenetic network might also display the metric. This possibility was recently pointed out and "tree-metrized" networks - that display a tree metric - with a single reticulation were characterized. In this paper, we show that in the case of HGT (horizontal gene transfer) networks, in fact there are tree-metrized networks containing many reticulations.
Collapse
Affiliation(s)
- Michael Hendriksen
- Centre for Research in Mathematics, Western Sydney University, Australia.
| | - Andrew Francis
- Centre for Research in Mathematics, Western Sydney University, Australia.
| |
Collapse
|
8
|
Francis A, Huber KT, Moulton V. Correction to: Tree-Based Unrooted Phylogenetic Networks. Bull Math Biol 2018; 81:936-937. [PMID: 30446916 DOI: 10.1007/s11538-018-0530-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2018] [Accepted: 10/30/2018] [Indexed: 11/24/2022]
Abstract
The level-5 example of a network presented in Fig. 4 of Francis et al. (2018) is tree-based even though it states in the caption and in the text that this is not the case.
Collapse
Affiliation(s)
- A Francis
- Centre for Research in Mathematics, Western Sydney University, Sydney, Australia
| | - K T Huber
- School of Computing Sciences, University of East Anglia, Norwich, UK.
| | - V Moulton
- School of Computing Sciences, University of East Anglia, Norwich, UK
| |
Collapse
|
9
|
Tree-based networks: characterisations, metrics, and support trees. J Math Biol 2018; 78:899-918. [PMID: 30283985 DOI: 10.1007/s00285-018-1296-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2017] [Revised: 08/25/2018] [Indexed: 10/28/2022]
Abstract
Phylogenetic networks generalise phylogenetic trees and allow for the accurate representation of the evolutionary history of a set of present-day species whose past includes reticulate events such as hybridisation and lateral gene transfer. One way to obtain such a network is by starting with a (rooted) phylogenetic tree T, called a base tree, and adding arcs between arcs of T. The class of phylogenetic networks that can be obtained in this way is called tree-based networks and includes the prominent classes of tree-child and reticulation-visible networks. Initially defined for binary phylogenetic networks, tree-based networks naturally extend to arbitrary phylogenetic networks. In this paper, we generalise recent tree-based characterisations and associated proximity measures for binary phylogenetic networks to arbitrary phylogenetic networks. These characterisations are in terms of matchings in bipartite graphs, path partitions, and antichains. Some of the generalisations are straightforward to establish using the original approach, while others require a very different approach. Furthermore, for an arbitrary tree-based network N, we characterise the support trees of N, that is, the tree-based embeddings of N. We use this characterisation to give an explicit formula for the number of support trees of N when N is binary. This formula is written in terms of the components of a bipartite graph.
Collapse
|
10
|
Hendriksen M. Tree-based unrooted nonbinary phylogenetic networks. Math Biosci 2018; 302:131-138. [PMID: 29932953 DOI: 10.1016/j.mbs.2018.06.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2018] [Revised: 06/18/2018] [Accepted: 06/18/2018] [Indexed: 11/16/2022]
Abstract
Phylogenetic networks are a generalisation of phylogenetic trees that allow for more complex evolutionary histories that include hybridisation-like processes. It is of considerable interest whether a network can be considered 'tree-like' or not, which leads to the introduction of tree-based networks in the rooted, binary context. Tree-based networks are those networks which can be constructed by adding additional edges into a phylogenetic tree, called the base tree. Previous extensions have considered extending to the binary, unrooted case and the nonbinary, rooted case. In this paper, we extend tree-based networks to the context of unrooted, nonbinary networks in three ways, depending on the types of additional edges that are permitted. Further, we study fully tree-based networks which are phylogenetic networks in which every embedded tree is a base tree. We also extend this concept to unrooted, nonbinary, phylogenetic networks and classify the resulting networks. Finally, we derive some results on the colourability of tree-based networks, which can be useful to determine whether a network is tree-based.
Collapse
Affiliation(s)
- Michael Hendriksen
- Centre for Research in Mathematics, Western Sydney University, NSW, Australia.
| |
Collapse
|