1
Murray CS, Karram M, Bass DJ, Doceti M, Becker D, Nunez JCB, Ratan A, Bergland AO. Balancing selection and the functional effects of shared polymorphism in cryptic Daphnia species. bioRxiv 2024:2024.04.16.589693. [PMID: 38659826 PMCID: PMC11042267 DOI: 10.1101/2024.04.16.589693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
The patterns of genetic variation within and between related taxa represent the genetic history of a species. Shared polymorphisms, loci with identical alleles across species, are of unique interest as they may represent cases of ancient selection maintaining functional variation post-speciation. In this study, we investigate the abundance of shared polymorphism in the Daphnia pulex species complex. We test whether shared mutations are consistent with the action of balancing selection or alternative hypotheses such as hybridization, incomplete lineage sorting, or convergent evolution. We analyzed over 2,000 genomes from North American and European D. pulex and several outgroup species to examine the prevalence and distribution of shared alleles between the focal species pair, North American and European D. pulex. We show that while North American and European D. pulex diverged over ten million years ago, they retained tens of thousands of shared alleles. We found that the number of shared polymorphisms between North American and European D. pulex cannot be explained by hybridization or incomplete lineage sorting alone. Instead, we show that most shared polymorphisms could be the product of convergent evolution, that a limited number appear to be old trans-specific polymorphisms, and that balancing selection is affecting young and ancient mutations alike. Finally, we provide evidence that a blue wavelength opsin gene with trans-specific polymorphisms has functional effects on behavior and fitness in the wild. Ultimately, our findings provide insights into the genetic basis of adaptation and the maintenance of genetic diversity between species.
Affiliation(s)
- Connor S. Murray
- Department of Biology, University of Virginia, Charlottesville, VA, USA
- Madison Karram
- Department of Biology, University of Virginia, Charlottesville, VA, USA
- David J. Bass
- Department of Biology, University of Virginia, Charlottesville, VA, USA
- Madison Doceti
- Department of Biology, University of Virginia, Charlottesville, VA, USA
- Dörthe Becker
- Department of Biology, University of Virginia, Charlottesville, VA, USA
- School of Biosciences, Ecology and Evolutionary Biology, University of Sheffield, Sheffield, UK
- Aakrosh Ratan
- Center of Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Alan O. Bergland
- Department of Biology, University of Virginia, Charlottesville, VA, USA
2
Shaukat MA, Nguyen TT, Hsu EB, Yang S, Bhatti A. Comparative study of encoded and alignment-based methods for virus taxonomy classification. Sci Rep 2023; 13:18662. [PMID: 37907535 PMCID: PMC10618506 DOI: 10.1038/s41598-023-45461-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/19/2023] [Indexed: 11/02/2023]
Abstract
The emergence of viruses and their variants has made virus taxonomy more important than ever before in controlling the spread of diseases. The creation of efficient treatments and cures that target particular virus properties can be aided by understanding virus taxonomy. Alignment-based methods are commonly used for this task, but are computationally expensive and time-consuming, especially when dealing with large datasets or when detecting new virus variants is time sensitive. An alternative approach, the encoded method, has been developed that does not require prior sequence alignment and provides faster results. However, each encoded method has its own claimed accuracy. Therefore, careful evaluation and comparison of the performance of different encoded methods are essential to identify the most accurate and reliable approach for virus taxonomy classification. This study aims to address this issue by providing a comprehensive and comparative analysis of the potential of encoded methods for virus classification and phylogenetics. We compared the vectors generated for each encoded method using distance metrics to determine their similarity to alignment-based methods. The results and their validation show that the K-merNV encoded method, followed by CgrDft, performs similarly to state-of-the-art multi-sequence alignment methods. This is the first study to incorporate and compare encoded methods, and it will facilitate future research by supporting more informed decisions regarding the selection of a suitable method for virus taxonomy.
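As an illustration of what an alignment-free ("encoded") comparison can look like, the sketch below turns each sequence into a fixed-length numeric vector and compares vectors with Euclidean distance. It uses a single-nucleotide, natural-vector style encoding (count, mean position, and a scaled second central moment per base). This is an assumed, simplified stand-in and not the K-merNV or CgrDft implementation evaluated in the study; the function names and toy sequences are hypothetical.

```python
import math

ALPHABET = "ACGT"


def natural_vector(seq):
    """Encode a DNA sequence as 12 numbers: per base, its count,
    mean 1-based position, and scaled second central moment.
    Assumption: bases that never occur contribute zeros."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in ALPHABET:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(positions) / n_k
        # Second central moment, scaled by the base count and sequence length.
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy sequences: `near` differs from `ref` by one base.
ref = "ACGTACGTAC"
near = "ACGTACGTAA"
far = "TTTTGGGGCC"
d_near = euclidean(natural_vector(ref), natural_vector(near))
d_far = euclidean(natural_vector(ref), natural_vector(far))
```

In this toy case `d_near` is smaller than `d_far`, which is the kind of ordering one would then check against an alignment-based distance.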
Affiliation(s)
- Muhammad Arslan Shaukat
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Victoria, Australia
- Thanh Thi Nguyen
- Faculty of Information Technology, Monash University, Victoria, Australia
- Edbert B Hsu
- Department of Emergency Medicine, Johns Hopkins University, Maryland, USA
- Samuel Yang
- Department of Emergency Medicine, Stanford University, California, USA
- Asim Bhatti
- Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Victoria, Australia
3
Yao TH, Wu Z, Bharath K, Li J, Baladandayuthapani V. Probabilistic learning of treatment trees in cancer. Ann Appl Stat 2023; 17:1884-1908. [PMID: 37711665 PMCID: PMC10501503 DOI: 10.1214/22-aoas1696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/16/2023]
Abstract
Accurate identification of synergistic treatment combinations and their underlying biological mechanisms is critical across many disease domains, especially cancer. In translational oncology research, preclinical systems such as patient-derived xenografts (PDX) have emerged as a unique study design evaluating multiple treatments administered to samples from the same human tumor implanted into genetically identical mice. In this paper, we propose a novel Bayesian probabilistic tree-based framework for PDX data to investigate the hierarchical relationships between treatments by inferring treatment cluster trees, referred to as treatment trees (Rx-tree). The framework motivates a new metric of mechanistic similarity between two or more treatments accounting for inherent uncertainty in tree estimation; treatments with a high estimated similarity have potentially high mechanistic synergy. Building upon Dirichlet Diffusion Trees, we derive a closed-form marginal likelihood encoding the tree structure, which facilitates computationally efficient posterior inference via a new two-stage algorithm. Simulation studies demonstrate superior performance of the proposed method in recovering the tree structure and treatment similarities. Our analyses of a recently collated PDX dataset produce treatment similarity estimates that show a high degree of concordance with known biological mechanisms across treatments in five different cancers. More importantly, we uncover new and potentially effective combination therapies that confer synergistic regulation of specific downstream biological pathways for future clinical investigations. Our accompanying code, data, and shiny application for visualization of results are available at: https://github.com/bayesrx/RxTree.
Affiliation(s)
- Tsung-Hung Yao
- Department of Biostatistics, University of Michigan at Ann Arbor
- Zhenke Wu
- Department of Biostatistics, University of Michigan at Ann Arbor
- Jinju Li
- Department of Biostatistics, University of Michigan at Ann Arbor
4
Yan L, Masood TB, Rasheed F, Hotz I, Wang B. Geometry-Aware Merge Tree Comparisons for Time-Varying Data With Interleaving Distances. IEEE Transactions on Visualization and Computer Graphics 2023; 29:3489-3506. [PMID: 35349444 DOI: 10.1109/tvcg.2022.3163349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Merge trees, a type of topological descriptor, identify and summarize the topological characteristics of scalar fields. They have great potential for analyzing and visualizing time-varying data. First, they give compressed and topology-preserving representations of data instances. Second, their comparisons provide a basis for studying the relations among data instances, such as their distributions, clusters, outliers, and periodicities. A number of comparative measures have been developed for merge trees. However, these measures are often computationally expensive since they implicitly consider all possible correspondences between critical points of the merge trees. In this paper, we perform geometry-aware comparisons of merge trees using labeled interleaving distances. The main idea is to decouple the computation of a comparative measure into two steps: a labeling step that generates a correspondence between the critical points of two merge trees, and a comparison step that computes distances between a pair of labeled merge trees by encoding them as matrices. We show that our approach is general, computationally efficient, and practically useful. Our framework makes it possible to integrate geometric information of the data domain in the labeling process. At the same time, the framework reduces the computational complexity since not all possible correspondences have to be considered. We demonstrate via experiments that such geometry-aware merge tree comparisons help to detect transitions, clusters, and periodicities of time-varying datasets, as well as to diagnose and highlight the topological changes between adjacent data instances.
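The comparison step described above can be sketched as follows. The sketch assumes a common matrix encoding for labeled merge trees, in which entry (i, j) stores the function value at the lowest common ancestor of the nodes labeled i and j, and it takes the largest entrywise difference between two such matrices as the distance. The tree representation (parent pointers plus node values), the helper names, and the toy trees are hypothetical, and the labeling step is assumed to have been done already.

```python
def ancestors(node, parent):
    """Return the path from `node` up to the root, starting with `node`."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def induced_matrix(parent, value, labels):
    """matrix[i][j] = function value at the lowest common ancestor of
    labels[i] and labels[j]; the diagonal holds each labeled node's own value."""
    n = len(labels)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        up_i = ancestors(labels[i], parent)
        for j in range(n):
            seen = set(ancestors(labels[j], parent))
            lca = next(a for a in up_i if a in seen)
            matrix[i][j] = value[lca]
    return matrix


def max_abs_difference(m1, m2):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


# Tree 1: minima a and b merge at saddle s (3.0); c joins at the root r (5.0).
parent1 = {"a": "s", "b": "s", "s": "r", "c": "r", "r": None}
value1 = {"a": 0.0, "b": 1.0, "c": 2.0, "s": 3.0, "r": 5.0}
# Tree 2: same labels, but b and c merge first at s (4.0).
parent2 = {"b": "s", "c": "s", "s": "r", "a": "r", "r": None}
value2 = {"a": 0.5, "b": 1.0, "c": 2.0, "s": 4.0, "r": 5.0}

labels = ["a", "b", "c"]
m1 = induced_matrix(parent1, value1, labels)
m2 = induced_matrix(parent2, value2, labels)
distance = max_abs_difference(m1, m2)
```

Because the labels fix the correspondence, the comparison reduces to a single pass over two small matrices instead of a search over all possible matchings of critical points.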
5
Sainbhi AS, Vakitbilir N, Gomez A, Stein KY, Froese L, Zeiler FA. Non-Invasive Mapping of Cerebral Autoregulation Using Near-Infrared Spectroscopy: A Study Protocol. Methods Protoc 2023; 6:58. [PMID: 37368002 DOI: 10.3390/mps6030058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 05/18/2023] [Accepted: 06/06/2023] [Indexed: 06/28/2023]
Abstract
The ability of cerebral vessels to maintain a fairly constant cerebral blood flow is referred to as cerebral autoregulation (CA). Using near-infrared spectroscopy (NIRS) paired with arterial blood pressure (ABP) monitoring, continuous CA can be assessed non-invasively. Recent advances in NIRS technology can help improve the understanding of continuously assessed CA in humans with high spatial and temporal resolutions. We describe a study protocol for creating a new wearable and portable imaging system that derives CA maps of the entire brain with high sampling rates at each point. The first objective is to evaluate the CA mapping system's performance during various perturbations using a block-trial design in 50 healthy volunteers. The second objective is to explore the impact of age and sex on regional disparities in CA using static recording and perturbation testing in 200 healthy volunteers. Using entirely non-invasive NIRS and ABP systems, we hope to prove the feasibility of deriving CA maps of the entire brain with high spatial and temporal resolutions. The development of this imaging system could potentially revolutionize the way we monitor brain physiology in humans since it would allow for an entirely non-invasive continuous assessment of regional differences in CA and improve our understanding of the impact of the aging process on cerebral vessel function.
Affiliation(s)
- Amanjyot Singh Sainbhi
- Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
- Nuray Vakitbilir
- Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
- Alwyn Gomez
- Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3A 1R9, Canada
- Department of Human Anatomy and Cell Science, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3E 0J9, Canada
- Kevin Y Stein
- Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
- Logan Froese
- Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
- Frederick A Zeiler
- Department of Biomedical Engineering, Price Faculty of Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
- Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3A 1R9, Canada
- Department of Human Anatomy and Cell Science, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3E 0J9, Canada
- Centre on Aging, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
- Division of Anaesthesia, Department of Medicine, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, UK
- Department of Clinical Neuroscience, Karolinska Institutet, 171 77 Stockholm, Sweden
6
Bogdanowicz D, Giaro K. Generalization of Phylogenetic Matching Metrics with Experimental Tests of Practical Advantages. J Comput Biol 2023; 30:261-276. [PMID: 36576792 DOI: 10.1089/cmb.2022.0090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2022]
Abstract
The ability to quantify the dissimilarity of different phylogenetic trees is required in various types of phylogenetic studies; for example, such metrics are used to assess the quality of phylogeny construction methods and to define optimization criteria in supertree building algorithms. In this article, starting from the already described concept of matching metrics, we define three new metrics for rooted phylogenetic trees. One of them, Matching Pair Jaccard (MPJ) distance, is still purely topological, but we now utilize the Jaccard index set dissimilarity measure in its construction. This modification substantially changes the structural features of the metric space. In particular, we investigate the properties of the previously known Matching Cluster Jaccard (MCJ) and the new MPJ metrics, such as the asymptotic behavior of their expected distance between two random trees, the space diameter, and the change of a distance after a single leaf relocation. The other two metrics, Matching Cluster Weight-aware (MCW) and Matching Cluster Jaccard Weight-aware (MCJW) distances, are the first proposed generalizations of matching metrics designed for rooted phylogenies with branch lengths. The experimental tests of the practical utility of the phylogenetic metrics show the superiority of MCJ and MPJ over the previous best tree comparison method. To define the MCW and MCJW metrics, we introduce a general method for constructing matching metrics for weighted rooted phylogenetic trees.
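A minimal sketch of the matching idea behind a Jaccard-based cluster metric, under stated assumptions: each rooted tree is reduced to its set of non-trivial clusters (the leaf sets below internal nodes), the smaller set is padded with empty clusters, and the distance is the minimum total Jaccard dissimilarity over all one-to-one pairings. The brute-force search over permutations is for illustration only and is practical only for tiny trees. The function names and example trees are hypothetical, and the exact definitions in the article may differ.

```python
from itertools import permutations


def jaccard_dissimilarity(a, b):
    """1 - |a & b| / |a | b|, with two empty sets treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def matching_cluster_jaccard(clusters1, clusters2):
    """Minimum total Jaccard dissimilarity over all pairings of clusters.
    Assumption: unmatched clusters are paired with the empty set."""
    c1, c2 = list(clusters1), list(clusters2)
    size = max(len(c1), len(c2))
    c1 += [frozenset()] * (size - len(c1))
    c2 += [frozenset()] * (size - len(c2))
    return min(
        sum(jaccard_dissimilarity(a, b) for a, b in zip(c1, perm))
        for perm in permutations(c2)
    )


# Non-trivial clusters of three rooted trees on leaves A, B, C, D:
tree1 = [frozenset("AB"), frozenset("CD")]   # ((A,B),(C,D))
tree2 = [frozenset("AB"), frozenset("ABC")]  # (((A,B),C),D)
tree3 = [frozenset("AC"), frozenset("BD")]   # ((A,C),(B,D))

d12 = matching_cluster_jaccard(tree1, tree2)
d13 = matching_cluster_jaccard(tree1, tree3)
```

For realistic tree sizes, the permutation search would be replaced by a polynomial-time minimum-weight bipartite matching algorithm.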
Affiliation(s)
- Damian Bogdanowicz
- Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland
- Krzysztof Giaro
- Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdansk, Poland
7
Zhai H, Fukuyama J. A convenient correspondence between k-mer-based metagenomic distances and phylogenetically-informed β-diversity measures. PLoS Comput Biol 2023; 19:e1010821. [PMID: 36608056 PMCID: PMC9879504 DOI: 10.1371/journal.pcbi.1010821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 01/26/2023] [Accepted: 12/16/2022] [Indexed: 01/07/2023]
Abstract
k-mer-based distances are often used to describe the differences between communities in metagenome sequencing studies because of their computational convenience and history of effectiveness. Although k-mer-based distances do not use information about taxon abundances, we show that one class of k-mer distances between metagenomes (the Euclidean distance between k-mer spectra, or EKS distances) are very closely related to a class of phylogenetically-informed β-diversity measures that do explicitly use both the taxon abundances and information about the phylogenetic relationships among the taxa. Furthermore, we show that both of these distances can be interpreted as using certain features of the taxon abundances that are related to the phylogenetic tree. Our results allow practitioners to perform phylogenetically-informed analyses when they only have k-mer data available and provide a theoretical basis for using k-mer spectra with relatively small values of k (on the order of 4-5). They are also useful for analysts who wish to know more of the properties of any method based on k-mer spectra and provide insight into one class of phylogenetically-informed β-diversity measures.
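For concreteness, a Euclidean distance between k-mer spectra can be sketched as below. The sketch assumes that a spectrum is the vector of relative k-mer frequencies pooled over all reads of a sample, and it uses k = 4, in the small-k range mentioned above. The normalization choice, function names, and toy reads are assumptions for illustration rather than the article's exact definition.

```python
import math
from collections import Counter
from itertools import product


def kmer_spectrum(reads, k=4):
    """Relative frequency of every possible DNA k-mer, pooled over reads.
    Windows containing characters other than A, C, G, T are skipped."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[km] / total if total else 0.0 for km in kmers]


def eks_distance(reads_a, reads_b, k=4):
    """Euclidean distance between the k-mer spectra of two samples."""
    sa, sb = kmer_spectrum(reads_a, k), kmer_spectrum(reads_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))


# Hypothetical toy samples: a and b share their k-mers, c shares none with a.
sample_a = ["ACGTACGT", "ACGTAC"]
sample_b = ["ACGTACGT"]
sample_c = ["TTTTTTTT"]
d_ab = eks_distance(sample_a, sample_b)
d_ac = eks_distance(sample_a, sample_c)
```

No taxon assignment or tree is needed to compute this distance, which is what makes its close relationship to phylogenetically-informed measures notable.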
Affiliation(s)
- Hongxuan Zhai
- Department of Statistics, Indiana University Bloomington, Bloomington, Indiana, United States of America
- Julia Fukuyama
- Department of Statistics, Indiana University Bloomington, Bloomington, Indiana, United States of America
8
Yan L, Masood TB, Sridharamurthy R, Rasheed F, Natarajan V, Hotz I, Wang B. Scalar Field Comparison with Topological Descriptors: Properties and Applications for Scientific Visualization. Computer Graphics Forum 2021; 40:599-633. [DOI: 10.1111/cgf.14331] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 09/01/2023]
Abstract
In topological data analysis and visualization, topological descriptors such as persistence diagrams, merge trees, contour trees, Reeb graphs, and Morse–Smale complexes play an essential role in capturing the shape of scalar field data. We present a state-of-the-art report on scalar field comparison using topological descriptors. We provide a taxonomy of existing approaches based on visualization tasks associated with three categories of data: single fields, time-varying fields, and ensembles. These tasks include symmetry detection, periodicity detection, key event/feature detection, feature tracking, clustering, and structure statistics. Our main contributions include the formulation of a set of desirable mathematical and computational properties of comparative measures, and the classification of visualization tasks and applications that are enabled by these measures.
Affiliation(s)
- Lin Yan
- Scientific Computing and Imaging Institute, University of Utah, USA
- Talha Bin Masood
- Department of Science and Technology (ITN), Linköping University, Norrköping, Sweden
- Farhan Rasheed
- Department of Science and Technology (ITN), Linköping University, Norrköping, Sweden
- Vijay Natarajan
- Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
- Ingrid Hotz
- Department of Science and Technology (ITN), Linköping University, Norrköping, Sweden
- Bei Wang
- Scientific Computing and Imaging Institute, University of Utah, USA
9
Gomez A, Dian J, Froese L, Zeiler FA. Near-Infrared Cerebrovascular Reactivity for Monitoring Cerebral Autoregulation and Predicting Outcomes in Moderate to Severe Traumatic Brain Injury: Proposal for a Pilot Observational Study. JMIR Res Protoc 2020; 9:e18740. [PMID: 32415822 PMCID: PMC7450363 DOI: 10.2196/18740] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 03/16/2020] [Revised: 04/12/2020] [Accepted: 05/15/2020] [Indexed: 01/16/2023]
Abstract
BACKGROUND: Impaired cerebrovascular reactivity after traumatic brain injury (TBI) in adults is emerging as an important prognostic factor, with strong independent association with 6-month outcomes. To date, it is unknown if impaired cerebrovascular reactivity during the acute phase is associated with ongoing impaired continuously measured cerebrovascular reactivity in the long-term, and if such measures are associated with clinical phenotype at those points in time.
OBJECTIVE: We describe a prospective pilot study to assess the use of near-infrared spectroscopy (NIRS) to derive continuous measures of cerebrovascular reactivity during the acute and long-term phases of TBI in adults.
METHODS: Over 2 years, we will recruit up to 80 adults with moderate/severe TBI admitted to the intensive care unit (ICU) with invasive intracranial pressure (ICP) monitoring. These patients will undergo high-frequency data capture of ICP, arterial blood pressure (ABP), and NIRS for the first 5 days of care. Patients will then have 30 minutes of noninvasive NIRS and ABP monitoring in the clinic at 3, 6, and 12 months post-injury. Outcomes will be assessed via the Glasgow Outcome Scale and Short Form-12 questionnaires. Various relationships between NIRS and ICP-derived cerebrovascular reactivity metrics and associated outcomes will be assessed using biomedical signal processing techniques and both multivariate and time-series statistical methodologies.
RESULTS: Study recruitment began at the end of February 2020, with data collection ongoing and three patients enrolled at the time of writing. The expected duration of data collection will be from February 2020 to January 2022, as per our local research ethics board approval (B2018:103). Support for this work has been obtained through the National Institutes of Health (NIH) through the National Institute of Neurological Disorders and Stroke (NINDS) (R03NS114335), funded in January 2020.
CONCLUSIONS: With the application of NIRS technology for monitoring of patients with TBI, we expect to be able to outline core relationships between noninvasively measured aspects of cerebral physiology and invasive measures, as well as patient outcomes. Documenting these relationships carries the potential to revolutionize the way we monitor patients with TBI, moving to more noninvasive techniques.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/18740.
Affiliation(s)
- Alwyn Gomez
- Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Joshua Dian
- Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Logan Froese
- Biomedical Engineering, Faculty of Engineering, University of Manitoba, Winnipeg, MB, Canada
- Frederick Adam Zeiler
- Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Biomedical Engineering, Faculty of Engineering, University of Manitoba, Winnipeg, MB, Canada
- Department of Anatomy and Cell Science, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Centre on Aging, University of Manitoba, Winnipeg, MB, Canada
- Division of Anaesthesia, Department of Medicine, University of Cambridge, Cambridge, United Kingdom
10
Flament-Simon SC, de Toro M, Chuprikova L, Blanco M, Moreno-González J, Salas M, Blanco J, Redrejo-Rodríguez M. High diversity and variability of pipolins among a wide range of pathogenic Escherichia coli strains. Sci Rep 2020; 10:12452. [PMID: 32719405 PMCID: PMC7385651 DOI: 10.1038/s41598-020-69356-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/26/2020] [Accepted: 07/01/2020] [Indexed: 12/24/2022]
Abstract
Self-synthesizing transposons are integrative mobile genetic elements (MGEs) that encode their own B-family DNA polymerase (PolB). Discovered a few years ago, they are proposed as key players in the evolution of several groups of DNA viruses and virus–host interaction machinery. Pipolins are the most recent addition to the group, are integrated in the genomes of bacteria from diverse phyla and also present as circular plasmids in mitochondria. Remarkably, pipolins-encoded PolBs are proficient DNA polymerases endowed with DNA priming capacity, hence the name, primer-independent PolB (piPolB). We have now surveyed the presence of pipolins in a collection of 2,238 human and animal pathogenic Escherichia coli strains and found that, although detected in only 25 positive isolates (1.1%), they are present in E. coli strains from a wide variety of pathotypes, serotypes, phylogenetic groups and sequence types. Overall, the pangenome of strains carrying pipolins is highly diverse, despite the fact that a considerable number of strains belong to only three clonal complexes (CC10, CC23 and CC32). Comparative analysis with a set of 67 additional pipolin-harboring genomes from GenBank database spanning strains from diverse origin, further confirmed these results. The genetic structure of pipolins shows great flexibility and variability, with the piPolB gene and the attachment sites being the only common features. Most pipolins contain one or more recombinases that would be involved in excision/integration of the element in the same conserved tRNA gene. This mobilization mechanism might explain the apparent incompatibility of pipolins with other integrative MGEs such as integrons. In addition, analysis of cophylogeny between pipolins and pipolin-harboring strains showed a lack of congruence between several pipolins and their host strains, in agreement with horizontal transfer between hosts. 
Overall, these results indicate that pipolins can serve as a vehicle for genetic transfer among circulating E. coli and possibly also among other pathogenic bacteria.
Affiliation(s)
- Saskia-Camille Flament-Simon
- Laboratorio de Referencia de E. Coli (LREC), Departamento de Microbiología y Parasitología, Facultad de Veterinaria, Universidad de Santiago de Compostela (USC), 27002, Lugo, Spain
- María de Toro
- Plataforma de Genómica y Bioinformática, CIBIR (Centro de Investigación Biomédica de La Rioja), La Rioja, 26006, Logroño, Spain
- Liubov Chuprikova
- Departamento de Bioquímica & Instituto de Investigaciones Biomédicas "Alberto Sols" CSIC-UAM, Universidad Autónoma de Madrid (UAM), 28029, Madrid, Spain
- Miguel Blanco
- Laboratorio de Referencia de E. Coli (LREC), Departamento de Microbiología y Parasitología, Facultad de Veterinaria, Universidad de Santiago de Compostela (USC), 27002, Lugo, Spain
- Juan Moreno-González
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas-Universidad Autónoma de Madrid, 28049, Madrid, Spain
- Margarita Salas
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas-Universidad Autónoma de Madrid, 28049, Madrid, Spain
- Jorge Blanco
- Laboratorio de Referencia de E. Coli (LREC), Departamento de Microbiología y Parasitología, Facultad de Veterinaria, Universidad de Santiago de Compostela (USC), 27002, Lugo, Spain
- Modesto Redrejo-Rodríguez
- Centro de Biología Molecular Severo Ochoa, Consejo Superior de Investigaciones Científicas-Universidad Autónoma de Madrid, 28049, Madrid, Spain
- Departamento de Bioquímica & Instituto de Investigaciones Biomédicas "Alberto Sols" CSIC-UAM, Universidad Autónoma de Madrid (UAM), 28029, Madrid, Spain
11
Merda D, Felten A, Vingadassalon N, Denayer S, Titouche Y, Decastelli L, Hickey B, Kourtis C, Daskalov H, Mistou MY, Hennekinne JA. NAuRA: Genomic Tool to Identify Staphylococcal Enterotoxins in Staphylococcus aureus Strains Responsible for FoodBorne Outbreaks. Front Microbiol 2020; 11:1483. [PMID: 32714310 PMCID: PMC7344154 DOI: 10.3389/fmicb.2020.01483] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/27/2020] [Accepted: 06/08/2020] [Indexed: 11/13/2022]
Abstract
Food contamination by staphylococcal enterotoxins (SEs) is responsible for many food poisoning outbreaks (FPOs) each year, and they represent the third leading cause of FPOs in Europe. SEs constitute a protein family with 27 proteins. However, enzyme immunoassays can directly detect only the five classical SEs (SEA-SEE) in food. Thus, molecular characterization methods of strains found in food are now used for FPO investigations. Here, we describe the development and implementation of a genomic analysis tool called NAuRA (Nice automatic Research of alleles) that can detect the presence of 27 SE genes in a single analysis and create a database of allelic data and protein variants for harmonizing analyses. This tool uses genome assembly data and the 27 protein sequences of SEs. To include the different divergence levels between SE-coding genes, parameters of coverage and identity were generated from 10,000 simulations and a dataset of 244 assembled genomes from strains responsible for outbreaks in Europe as well as the RefSeq reference database. Based on phylogenetic inference performed using maximum-likelihood on the core genomes of the strains in this collection, we demonstrated that strains responsible for FPOs are distributed throughout the phylogenetic tree. Moreover, 71 toxin profiles were obtained using the NAuRA pipeline and these profiles do not follow the evolutionary history of strains. This study presents a pioneering method to investigate strains isolated from food at the genomic level and to analyze the diversity of all 27 SE-coding genes together.
Affiliation(s)
- Déborah Merda: French Agency for Food, Environmental and Occupational Health & Safety (ANSES), University of Paris-Est, Maisons-Alfort, France
- Arnaud Felten: French Agency for Food, Environmental and Occupational Health & Safety (ANSES), University of Paris-Est, Maisons-Alfort, France
- Noémie Vingadassalon: French Agency for Food, Environmental and Occupational Health & Safety (ANSES), University of Paris-Est, Maisons-Alfort, France
- Sarah Denayer: Scientific Service of FoodBorne Pathogens, Sciensano, Brussels, Belgium
- Yacine Titouche: Laboratory of Analytical Biochemistry and Biotechnology, University of Mouloud Mammeri, Tizi Ouzou, Algeria
- Lucia Decastelli: National Reference Laboratory for Coagulase-Positive Including Staphylococcus aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Turin, Italy
- Christos Kourtis: State General Laboratory, Food Microbiology Laboratory, Nicosia, Cyprus
- Hristo Daskalov: National Center of Food Safety, NDRVI, BFSA, Sofia, Bulgaria
- Michel-Yves Mistou: French Agency for Food, Environmental and Occupational Health & Safety (ANSES), University of Paris-Est, Maisons-Alfort, France
- Jacques-Antoine Hennekinne: French Agency for Food, Environmental and Occupational Health & Safety (ANSES), University of Paris-Est, Maisons-Alfort, France
12
Goluch T, Bogdanowicz D, Giaro K. Visual TreeCmp: Comprehensive Comparison of Phylogenetic Trees on the Web. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13358] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/25/2022]
Affiliation(s)
- Tomasz Goluch: Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
- Damian Bogdanowicz: Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
- Krzysztof Giaro: Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland
13
Yan L, Wang Y, Munch E, Gasparovic E, Wang B. A Structural Average of Labeled Merge Trees for Uncertainty Visualization. IEEE Transactions on Visualization and Computer Graphics 2020; 26:832-842. [PMID: 31403426 PMCID: PMC7752151 DOI: 10.1109/tvcg.2019.2934242] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Physical phenomena in science and engineering are frequently modeled using scalar fields. In scalar field topology, graph-based topological descriptors such as merge trees, contour trees, and Reeb graphs are commonly used to characterize topological changes in the (sub)level sets of scalar fields. One of the biggest challenges and opportunities to advance topology-based visualization is to understand and incorporate uncertainty into such topological descriptors to effectively reason about their underlying data. In this paper, we study a structural average of a set of labeled merge trees and use it to encode uncertainty in data. Specifically, we compute a 1-center tree that minimizes its maximum distance to any other tree in the set under a well-defined metric called the interleaving distance. We provide heuristic strategies that compute structural averages of merge trees whose labels do not fully agree. We further provide an interactive visualization system that resembles a numerical calculator that takes as input a set of merge trees and outputs a tree as their structural average. We also highlight structural similarities between the input and the average and incorporate uncertainty information for visual exploration. We develop a novel measure of uncertainty, referred to as consistency, via a metric-space view of the input trees. Finally, we demonstrate an application of our framework through merge trees that arise from ensembles of scalar fields. Our work is the first to employ interleaving distances and consistency to study a global, mathematically rigorous, structural average of merge trees in the context of uncertainty visualization.
14
Markin A, Eulenstein O. Cophenetic Median Trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1459-1470. [PMID: 30222583 DOI: 10.1109/tcbb.2018.2870173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Median tree inference under path-difference metrics has shown great promise for large-scale phylogeny estimation. Similar to these metrics is the family of cophenetic metrics that originates from a classic dendrogram comparison method introduced more than 50 years ago. Despite the appeal of this family of metrics, the problem of computing median trees under cophenetic metrics has not been analyzed. Like other standard median tree problems relevant in practice, as we show here, this problem is also NP-hard. NP-hard median tree problems have been successfully addressed by local search heuristics that are solving thousands of instances of a corresponding (local neighborhood) search problem. For the local neighborhood search problem under a cophenetic metric, the best known (naïve) algorithm has a time complexity that is typically prohibitive for effective heuristic searches. Building on the pioneering work on path-difference median trees, we develop efficient algorithms for Manhattan and Euclidean cophenetic search problems that improve on the naïve solution by a linear and a quadratic factor, respectively. We demonstrate the performance and effectiveness of the resulting heuristic methods in a comparative study using benchmark empirical datasets.
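As an illustration of the Manhattan and Euclidean cophenetic distances named in the abstract above, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: trees are rooted, edges have unit length, the cophenetic value of a leaf pair is the depth of their lowest common ancestor, and each leaf's own depth is included in the vector. The nested-tuple tree encoding and the names `cophenetic_vector` and `cophenetic_distance` are hypothetical. The article's median-tree search algorithms are not reproduced here.

```python
import math
from itertools import combinations


def cophenetic_vector(tree):
    """Map each leaf to its depth and each leaf pair to the depth of its LCA.

    `tree` is a nested tuple whose leaves are strings (an encoding assumed
    for this sketch); every edge is assumed to have length 1.
    """
    values = {}

    def walk(node, depth):
        if isinstance(node, str):
            values[(node, node)] = depth
            return [node]
        groups = [walk(child, depth + 1) for child in node]
        # Leaves from different child subtrees have this node as their LCA.
        for g1, g2 in combinations(groups, 2):
            for a in g1:
                for b in g2:
                    values[tuple(sorted((a, b)))] = depth
        return [leaf for group in groups for leaf in group]

    walk(tree, 0)
    return values


def cophenetic_distance(t1, t2, p=1):
    """L1 (p=1) or L2 (p=2) norm of the difference of cophenetic vectors.

    Both trees must have the same leaf set, otherwise a KeyError is raised.
    """
    v1, v2 = cophenetic_vector(t1), cophenetic_vector(t2)
    diffs = [abs(v1[key] - v2[key]) for key in v1]
    if p == 1:
        return sum(diffs)
    return math.sqrt(sum(d * d for d in diffs))


t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(cophenetic_distance(t1, t2, p=1))  # Manhattan variant
print(cophenetic_distance(t1, t2, p=2))  # Euclidean variant
```

In this example the two trees differ only in which of B and C is sister to A, so four vector entries change by one unit each.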
15
Avino M, Ng GT, He Y, Renaud MS, Jones BR, Poon AFY. Tree shape-based approaches for the comparative study of cophylogeny. Ecol Evol 2019; 9:6756-6771. [PMID: 31312429 PMCID: PMC6618157 DOI: 10.1002/ece3.5185] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/23/2019] [Revised: 02/21/2019] [Accepted: 03/29/2019] [Indexed: 12/17/2022]
Abstract
Cophylogeny is the congruence of phylogenetic relationships between two different groups of organisms due to their long-term interaction. We investigated the use of tree shape distance measures to quantify the degree of cophylogeny. We implemented a reverse-time simulation model of pathogen phylogenies within a fixed host tree, given cospeciation probability, host switching, and pathogen speciation rates. We used this model to evaluate 18 distance measures between host and pathogen trees, including two kernel distances that we developed for labeled and unlabeled trees, which use branch lengths and accommodate trees of different sizes. Finally, we used these measures to revisit published cophylogenetic studies, where authors described the observed associations as representing a high or low degree of cophylogeny. Our simulations demonstrated that some measures are more informative than others with respect to specific coevolution parameters, especially when these did not assume extreme values. For real datasets, a projection of the tree associations revealed clustering of high-concordance studies, suggesting that investigators describe cophylogeny in a consistent way. Our results support the hypothesis that these measures can be useful for quantifying cophylogeny. This motivates their usage in the field of coevolution and supports the development of simulation-based methods, e.g., approximate Bayesian computation, to estimate the underlying coevolutionary parameters.
Affiliation(s)
- Mariano Avino: Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Garway T Ng: Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Yiying He: Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Mathias S Renaud: Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada
- Bradley R Jones: BC Centre for Excellence in HIV/AIDS, Vancouver, British Columbia, Canada
- Art F Y Poon: Department of Pathology and Laboratory Medicine, Western University, London, Ontario, Canada; Department of Applied Mathematics, Western University, London, Ontario, Canada
16
Zeiler FA, Donnelly J, Cardim D, Menon DK, Smielewski P, Czosnyka M. ICP Versus Laser Doppler Cerebrovascular Reactivity Indices to Assess Brain Autoregulatory Capacity. Neurocrit Care 2018; 28:194-202. [PMID: 29043544 PMCID: PMC5948245 DOI: 10.1007/s12028-017-0472-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/02/2022]
Abstract
BACKGROUND To explore the relationship between various autoregulatory indices in order to determine which most accurately approximates small vessel/microvascular (MV) autoregulatory capacity. METHODS Utilizing a retrospective cohort of traumatic brain injury patients (N = 41) with transcranial Doppler (TCD), intracranial pressure (ICP), and cortical laser Doppler flowmetry (LDF), we calculated various continuous indices of autoregulation and cerebrovascular responsiveness: (A) ICP derived [pressure reactivity index (PRx): correlation between ICP and mean arterial pressure (MAP); PAx: correlation between pulse amplitude of ICP (AMP) and MAP; RAC: correlation between AMP and cerebral perfusion pressure (CPP)]; (B) TCD derived [Mx: correlation between mean flow velocity (FVm) and CPP; Mx_a: correlation between FVm and MAP; Sx: correlation between systolic flow velocity (FVs) and CPP; Sx_a: correlation between FVs and MAP; Dx: correlation between diastolic flow index (FVd) and CPP; Dx_a: correlation between FVd and MAP]; and (C) LDF derived [Lx: correlation between LDF cerebral blood flow (CBF) and CPP; Lx_a: correlation between LDF-CBF and MAP]. We assessed the relationship between these indices via Pearson correlation, Friedman test, principal component analysis (PCA), agglomerative hierarchical clustering (AHC), and k-means cluster analysis (KMCA). RESULTS The LDF-based autoregulatory index (Lx) was most associated with TCD-based Mx/Mx_a and Dx/Dx_a across Pearson correlation, PCA, AHC, and KMCA. Lx was only remotely associated with ICP-based indices (PRx, PAx, RAC). TCD-based Sx/Sx_a was more closely associated with ICP-derived PRx, PAx, and RAC. This indicates that vascular-derived indices of autoregulatory capacity (i.e., TCD and LDF based) covary, with Sx/Sx_a being the exception, whereas indices of cerebrovascular reactivity derived from pulsatile CBV (i.e., ICP indices) appear not to be closely related to those of vascular origin.
CONCLUSIONS Transcranial Doppler Mx is the most closely associated with LDF-based Lx/Lx_a. Both Sx/Sx_a and the ICP-derived indices appear to be dissociated from LDF-based cerebrovascular reactivity, leaving Mx/Mx_a as a better surrogate for the assessment of cortical small vessel/MV cerebrovascular reactivity. Sx/Sx_a cocluster/covary with ICP-derived indices, as seen in our previous work.
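The indices listed in the abstract above are all correlations between two monitored signals. A minimal Python sketch of how such a continuous index could be computed as a moving Pearson correlation follows. The window length of 30 samples, the one-sample step, the synthetic data, and the function names `pearson` and `moving_correlation` are assumptions made for illustration. The abstract does not state the windowing or preprocessing the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def moving_correlation(driver, response, window=30):
    """Correlation of `response` with `driver` over each sliding window.

    With driver = MAP and response = ICP this gives a PRx-like series;
    with driver = CPP and response = LDF-CBF it gives an Lx-like series.
    """
    last_start = len(driver) - window
    return [
        pearson(driver[i:i + window], response[i:i + window])
        for i in range(last_start + 1)
    ]


# Synthetic example: the response tracks the driver linearly.
map_signal = [80.0 + 0.5 * i for i in range(40)]
icp_signal = [10.0 + 0.2 * m for m in map_signal]
index_series = moving_correlation(map_signal, icp_signal, window=30)
print(len(index_series), round(index_series[0], 6))
```

Values near +1 mean the response passively follows the driver within that window, and values near zero or below mean it does not.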
Affiliation(s)
- F. A. Zeiler: Division of Anaesthesia, Addenbrooke’s Hospital, University of Cambridge, Cambridge, UK; Section of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada; Clinician Investigator Program, Rady Faculty of Health Science, University of Manitoba, Winnipeg, MB, Canada
- J. Donnelly: Section of Brain Physics, Division of Neurosurgery, Department of Clinical Neurosciences, Addenbrooke’s Hospital, University of Cambridge, Cambridge, CB2 0QQ, UK
- D. Cardim: Section of Brain Physics, Division of Neurosurgery, Department of Clinical Neurosciences, Addenbrooke’s Hospital, University of Cambridge, Cambridge, CB2 0QQ, UK
- D. K. Menon: Division of Anaesthesia, Addenbrooke’s Hospital, University of Cambridge, Cambridge, UK; Neurosciences Critical Care Unit, Addenbrooke’s Hospital, Cambridge, UK; Queens’ College, University of Cambridge, Cambridge, UK; National Institute for Health Research, Cambridge, UK
- P. Smielewski: Section of Brain Physics, Division of Neurosurgery, Department of Clinical Neurosciences, Addenbrooke’s Hospital, University of Cambridge, Cambridge, CB2 0QQ, UK
- M. Czosnyka: Section of Brain Physics, Division of Neurosurgery, Department of Clinical Neurosciences, Addenbrooke’s Hospital, University of Cambridge, Cambridge, CB2 0QQ, UK; Institute of Electronic Systems, Warsaw University of Technology, Warszawa, Poland
17
Kendall M, Ayabina D, Xu Y, Stimson J, Colijn C. Estimating Transmission from Genetic and Epidemiological Data: A Metric to Compare Transmission Trees. Stat Sci 2018. [DOI: 10.1214/17-sts637] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
18
Henri C, Leekitcharoenphon P, Carleton HA, Radomski N, Kaas RS, Mariet JF, Felten A, Aarestrup FM, Gerner Smidt P, Roussel S, Guillier L, Mistou MY, Hendriksen RS. An Assessment of Different Genomic Approaches for Inferring Phylogeny of Listeria monocytogenes. Front Microbiol 2017; 8:2351. [PMID: 29238330 PMCID: PMC5712588 DOI: 10.3389/fmicb.2017.02351] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 09/07/2017] [Accepted: 11/15/2017] [Indexed: 11/13/2022]
Abstract
Background/objectives: Whole genome sequencing (WGS) has proven to be a powerful subtyping tool for foodborne pathogenic bacteria like L. monocytogenes. The value of genome-scale analysis for national surveillance, outbreak detection, or source tracking has been widely documented. The genomic data, however, can be exploited with many different bioinformatics methods such as single nucleotide polymorphism (SNP) analysis, core-genome multi locus sequence typing (cgMLST), whole-genome multi locus sequence typing (wgMLST), or multi locus predicted protein sequence typing (MLPPST) on either the core genome (cgMLPPST) or the pan-genome (wgMLPPST). Currently, there are few comparative studies of these different analytical approaches. Our objective was to assess and compare different genomic methods that can be implemented to cluster isolates of L. monocytogenes. Methods: The clustering methods were evaluated on a collection of 207 L. monocytogenes genomes of food origin representative of the genetic diversity of the Anses collection. The trees were then compared using robust statistical analyses. Results: The backward comparability between conventional typing methods and genomic methods revealed a near-perfect concordance. The importance of selecting a proper reference when calling SNPs was highlighted, although distances between strains remained identical. The analysis also revealed that the topologies of the phylogenetic trees from wgMLST and cgMLST were remarkably similar. The comparison between SNP and cgMLST or SNP and wgMLST approaches showed that the topologies of the phylogenetic trees were statistically similar, with an almost equivalent clustering. Conclusion: Our study revealed high concordance between wgMLST, cgMLST, and SNP approaches, which are all suitable for typing of L. monocytogenes. The comparable clustering is an important observation considering that the two approaches have been variously implemented among reference laboratories.
Affiliation(s)
- Clémentine Henri: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Pimlapas Leekitcharoenphon: European Union Reference Laboratory for Antimicrobial Resistance, National Food Institute, WHO Collaborating Center for Antimicrobial Resistance in Food Borne Pathogens and Genomics, Technical University of Denmark, Kongens Lyngby, Denmark
- Heather A Carleton: National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, United States
- Nicolas Radomski: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Rolf S Kaas: European Union Reference Laboratory for Antimicrobial Resistance, National Food Institute, WHO Collaborating Center for Antimicrobial Resistance in Food Borne Pathogens and Genomics, Technical University of Denmark, Kongens Lyngby, Denmark
- Jean-François Mariet: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Arnaud Felten: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Frank M Aarestrup: European Union Reference Laboratory for Antimicrobial Resistance, National Food Institute, WHO Collaborating Center for Antimicrobial Resistance in Food Borne Pathogens and Genomics, Technical University of Denmark, Kongens Lyngby, Denmark
- Peter Gerner Smidt: National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, United States
- Sophie Roussel: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Laurent Guillier: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- Michel-Yves Mistou: Agence Nationale de Sécurité Sanitaire de l'Alimentation, Maisons-Alfort Laboratory for Food Safety, University Paris-Est, Maisons-Alfort, France
- René S Hendriksen: European Union Reference Laboratory for Antimicrobial Resistance, National Food Institute, WHO Collaborating Center for Antimicrobial Resistance in Food Borne Pathogens and Genomics, Technical University of Denmark, Kongens Lyngby, Denmark
19
Bárcenas-Reyes I, Loza-Rubio E, Cantó-Alarcón GJ, Luna-Cozar J, Enríquez-Vázquez A, Barrón-Rodríguez RJ, Milián-Suazo F. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle. Res Vet Sci 2017; 113:21-24. [PMID: 28818750 DOI: 10.1016/j.rvsc.2017.08.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/28/2016] [Revised: 05/25/2017] [Accepted: 08/03/2017] [Indexed: 12/25/2022]
Abstract
Phylogenetic analysis of the rabies virus in molecular epidemiology has traditionally been performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about its discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European, and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus.
Affiliation(s)
- I Bárcenas-Reyes: Facultad de Ciencias Naturales, Universidad Autónoma de Querétaro, Av. de las Ciencias S/N Juriquilla, Delegación Santa Rosa Jáuregui, C. P. 76230 Querétaro, Mexico
- E Loza-Rubio: CENID-M-INIFAP, Centro Nacional de Investigación Disciplinaria en Microbiología, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Carretera México-Toluca, km 15.5, C.P. 05110 Mexico D.F., Mexico
- G J Cantó-Alarcón: Facultad de Ciencias Naturales, Universidad Autónoma de Querétaro, Av. de las Ciencias S/N Juriquilla, Delegación Santa Rosa Jáuregui, C. P. 76230 Querétaro, Mexico
- J Luna-Cozar: Facultad de Ciencias Naturales, Universidad Autónoma de Querétaro, Av. de las Ciencias S/N Juriquilla, Delegación Santa Rosa Jáuregui, C. P. 76230 Querétaro, Mexico
- A Enríquez-Vázquez: LPAC - Laboratorio de Patología Animal Calamanda, el Marques, Querétaro, Mexico
- R J Barrón-Rodríguez: CENID-M-INIFAP, Centro Nacional de Investigación Disciplinaria en Microbiología, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Carretera México-Toluca, km 15.5, C.P. 05110 Mexico D.F., Mexico
- F Milián-Suazo: Facultad de Ciencias Naturales, Universidad Autónoma de Querétaro, Av. de las Ciencias S/N Juriquilla, Delegación Santa Rosa Jáuregui, C. P. 76230 Querétaro, Mexico
20
Bogdanowicz D, Giaro K. Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions. J Comput Biol 2017; 24:422-435. [PMID: 28177699 PMCID: PMC5421509 DOI: 10.1089/cmb.2016.0204] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
Abstract
The ability to quantify the dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the metrics described so far in the literature, the most commonly used seems to be the Robinson-Foulds distance. In this article, we define a new metric for rooted trees: the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding HGT events.
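The Robinson-Foulds distance mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the Matching Pair metric defined in the article. It only shows the commonly used baseline, here in its rooted, cluster-based form, where the distance is the number of clusters present in one tree but not the other. The nested-tuple tree encoding and the names `clusters` and `rf_distance` are assumptions made for this illustration.

```python
def clusters(tree):
    """Return the non-trivial clusters of a rooted tree.

    `tree` is a nested tuple whose leaves are strings (an encoding assumed
    for this sketch). A cluster is the set of leaves below an internal node.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves_below(tree)
    # The root cluster is shared by any two trees on the same taxa.
    found.discard(all_leaves)
    return found


def rf_distance(t1, t2):
    """Size of the symmetric difference of the two cluster sets."""
    return len(clusters(t1) ^ clusters(t2))


t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

The two example trees share the cluster {A, B, C} and differ in {A, B} versus {A, C}. A cluster either matches exactly or counts fully toward the distance, with no credit for partial overlap.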
Affiliation(s)
- Damian Bogdanowicz: Department of Algorithms and System Modeling, Gdansk University of Technology, Gdansk, Poland
- Krzysztof Giaro: Department of Algorithms and System Modeling, Gdansk University of Technology, Gdansk, Poland
21
Abstract
Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. KEY WORDS phylogenetics, evolution, tree metrics, genetics, sequencing.
Affiliation(s)
- Michelle Kendall: Department of Mathematics, Imperial College London, London, United Kingdom
- Caroline Colijn: Department of Mathematics, Imperial College London, London, United Kingdom
22
Frasca M, Bertoni A, Valentini G. UNIPred: Unbalance-Aware Network Integration and Prediction of Protein Functions. J Comput Biol 2015; 22:1057-74. [PMID: 26402488 DOI: 10.1089/cmb.2014.0110] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 12/29/2022]
Abstract
The proper integration of multiple sources of data and the unbalance between annotated and unannotated proteins represent two of the main issues of the automated function prediction (AFP) problem. Most supervised and semisupervised learning algorithms for AFP proposed in the literature do not jointly consider these issues, with a negative impact on both sensitivity and precision, due to the unbalance between annotated and unannotated proteins that characterizes the majority of functional classes and to the specific and complementary information content embedded in each available source of data. We propose UNIPred (unbalance-aware network integration and prediction of protein functions), an algorithm that properly combines different biomolecular networks and predicts protein functions using parametric semisupervised neural models. The algorithm explicitly takes into account the unbalance between unannotated and annotated proteins both to construct the integrated network and to predict protein annotations for each functional class. Full-genome and ontology-wide experiments with three eukaryotic model organisms show that the proposed method compares favorably with state-of-the-art learning algorithms for AFP.
Affiliation(s)
- Marco Frasca: DI - Department of Computer Science, University of Milan, Milan, Italy
- Alberto Bertoni: DI - Department of Computer Science, University of Milan, Milan, Italy
- Giorgio Valentini: DI - Department of Computer Science, University of Milan, Milan, Italy